LinguaLibre

Difference between revisions of "Chat room"

Welcome to the Chat room! This is the place to discuss any and all aspects of Lingua Libre: the project itself, its operations, policy and proposals, technical issues, etc. Other forums exist for code-oriented issues. Feel free to participate in any language you want.

(→‎Fixed: stats not updated)
(498 intermediate revisions by 61 users not shown)
Line 2: Line 2:
 
{{Lang-CR}}
 
 
<indicator name="talk"></indicator>
 
 +
{{LL:Chat room/FAQ}}
 
__TOC__
 
 +
<!-- ****      DO NOT EDIT CONTENT ABOVE    **** -->
  
== Chatroom FAQ ==
+
== Results of Coverage Test of French Lemma and Non-Lemma forms in English Wiktionary ==
* '''How to download all audios of one language ? By speaker ?'''
 
** Datasets per language are available at [https://lingualibre.fr/datasets/ https://lingualibre.fr/datasets/]. A short server-side script runs automatically every 2 days, itself using [https://github.com/lingua-libre/CommonsDownloadTool lingua-libre/CommonsDownloadTool]. For more, see [[Help:Download from LinguaLibre]].
 
  
* '''How to add missing languages ?'''
+
While playing around with generating lists for pronunciation from Wiktionary, I decided to run a few tests on the current coverage of French lemma and non-lemma forms in English Wiktionary. I chose French because it is the largest dataset in LL.
** Administrators can add new languages, usually within a few days. Users, please provide your language's [[:wikipedia:iso-639-3|ISO 639-3]] code and a link to the corresponding en.wikipedia.org article. Optional information: the common English name and the Wikidata ID. For more, see [[Help:Add a new language]].
 
  
* '''How to keep my wikimedia project up to date ?'''
+
Current Coverage of French in Lingua Libre
** Contact [[User talk:0x010C|User:0x010C]], the botmaster of Lingua Libre Bot. For more, see [[Help:Bots]].
+
* Total French Entries in Lingua Libre by a native speaker: 233 982
 +
* Unique French Entries in Lingua Libre by a native speaker: 154 358
 +
* Percentage of overlap: 34%
 +
* Term with the greatest number of pronunciations: "blanc" with 40
  
* '''What IRL event.s are coming ? When ? Where ?'''
+
Current Coverage of [https://en.wiktionary.org/wiki/Category:French_lemmas Category:French lemmas]
** Nothing coming. For more, see [[LinguaLibre:Events]].
+
* Total entries in Category:French lemmas: 84 482
 +
* Pronounced entries: 50 917
 +
* Entries without pronunciation: 33 565
 +
* Coverage Percentage: 60.27%
  
* '''How to translate LinguaLibre User Interface into a new language ?'''
+
Current Coverage of [https://en.wiktionary.org/wiki/Category:French_non-lemma_forms Category:French non-lemma forms]
** Go to [https://translatewiki.net/w/i.php?title=Special:Translate&group=mwgithub-recordwizard&language=fr&filter=%21translated&action=translate translatewiki.net], change the url part <code>fr</code> into your language's [[:en:List_of_ISO_639-2_codes|ISO 639-2 code]]. For more, see [[Help:Translate]].
+
* Total entries in Category:French non-lemma forms: 291 225
 +
* Pronounced entries: 26 791
 +
* Entries without pronunciation: 264 434
 +
* Coverage Percentage: 9.20%
  
* '''How to archive sections which have been answered ?'''
+
For me, there are several lessons to be drawn.
** After reviewing the section, add '<code><nowiki>{{done}} -- can be closed ~~~~</nowiki></code>' to the top of the section. After a few days to 2 weeks, move the section's code to <code><nowiki>[[LinguaLibre:Chat_room/Archives/year]]</nowiki></code>.
+
# First, there has been amazing growth on LL. Covering 60.27% of French lemmas is a real achievement.
=== Archives ===
+
# The overlap percentage is quite small overall.
<!-- {{Colapse|1=Archives|2= Archives by year:}}
+
# There needs to be a clearer sense of when LL should stop requesting pronunciations for a certain term because 40 pronunciations of "blanc" seems a bit excessive.
<br/> -->
+
#  A need exists to continue pro-actively targeting entries in Wiktionary that are not in Lingua Libre. Currently, 297 999 French lemma and non-lemma forms  require pronunciations.
* [[/Archives/2021|2021]]
+
# Generating lists from Wiktionary and checking coverage is not as hard as I thought.
* [[/Archives/2020|2020]]
+
# Lingua Libre has almost caught up with Forvo in the number of French pronunciations (233 982 vs 254 703). Overall, Lingua Libre has shown amazing and healthy progress in a very short period of time. I'm excited about these results. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 03:07, 1 June 2022 (UTC)
* [[/Archives/2019|2019]]
+
:{{Ping|Languageseeker}} This investigation is pretty cool. (I'm not sure I understand all your numbers yet, but I will read again when back on my PC). It's quite nice to see we are reaching Forvo's level for our lead language. It's possible we have more unique words than Forvo since we have [[user:Olafbot]] actively guiding and pushing us on that path.
* [[/Archives/2018|2018]]
+
:On Lili we have chosen to be a learning AND linguistic diversity audio database. When you account for gender, regional accents, age, voice type, having 40 French audios for a word is still 400+ voices short.
 +
:Also, not all contributors are able to contribute perfect audio files due to various shortcomings (hardware, no recording room, no noise-cancelling system, etc.). We lack a proper rating and review system. It's on our [slow] roadmap though. 😉
 +
:PS: Should I answer you in French? I get the feeling you are French or learning it. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:07, 1 June 2022 (UTC)
 +
:: {{Ping|YUG}} Hi, Yug. Yes, I am learning French. As we discussed during our meeting, it is difficult to define the limits of a language. As I see it, lemma forms are not enough. Right now, I am building an Olafbot on steroids for French. My plan is to write a Python program that can parse the templates used on Wiktionary. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 15:48, 7 June 2022 (UTC)
 +
:Hi {{ping|Languageseeker}}. I'm sorry I have not visited the Chat Room in a long time, and missed your report. Very interesting, good job! I remember a request I made to [[User:Olaf|Olaf]] some time ago: it would be interesting to have a list similar to the one Olafbot is updating, but containing only lemmas of the target language (to quickly have nearly all lemmas of a dictionary illustrated with an audio pronunciation). Also, I suggest you use the categories of the French version of Wiktionary when you plan to work on French (and some other languages that are more extensively described there). As you can see [[:fr:wikt:Catégorie:Lemmes_en_français|here]], the category gathering French lemmas is more than 3 times more complete on the fr. version than on the en. version of Wiktionary. As you mentioned, these numbers are exciting, let's keep up the good work! All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 15:47, 26 November 2022 (UTC)
 +
::: {{Ping|WikiLucas00}} Sorry, I totally forgot about your request. The list is now ready for French: [[List:Fra/Filtered-lemmas-without-audio-sorted-by-number-of-wiktionaries]]. It's produced like the other lists, but it's limited to words from Catégorie:Lemmes_en_français. The list will be refreshed together with the rest. [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 16:54, 14 May 2023 (UTC)
 +
:::: Hello {{ping|Olaf}}! Thank you so much for this list, it's going to be very useful for sure! Let's cover 100% of Lemmas 😎 I'll tell the French contributors on Discord about it 😉 All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:18, 20 May 2023 (UTC)
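
The coverage figures in the report that opens this thread can be re-derived from its raw counts. The following is a minimal Python sketch, not the script the author actually used. The helper name <code>coverage</code> is hypothetical, and it assumes the non-lemma total is 291 225 and that the second count in each list is the number of entries still missing a pronunciation.

```python
def coverage(total, pronounced):
    """Return (entries still missing audio, coverage percentage)."""
    missing = total - pronounced
    percent = round(100 * pronounced / total, 2)
    return missing, percent


# Counts quoted in the report (English Wiktionary categories, June 2022).
lemmas = coverage(84_482, 50_917)
non_lemmas = coverage(291_225, 26_791)  # assumed reading of the total

# Share of native-speaker recordings that repeat an already recorded term.
total_recordings, unique_recordings = 233_982, 154_358
overlap = round(100 * (total_recordings - unique_recordings) / total_recordings)

# Lemma and non-lemma forms that still require a pronunciation.
still_needed = lemmas[0] + non_lemmas[0]

print(lemmas, non_lemmas, overlap, still_needed)
```

Under these assumptions the two missing-entry counts add up to the 297 999 forms mentioned in the report, which suggests the counts labelled "with pronunciation" are in fact the entries without one.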
  
== Datasets out of date ==
+
== How to create user page ==
Hello. It seems that the datasets page, although it claims to run every 2 days, is completely out of date: all the available zips are from April 2020 or November 2019 (and the full zip from May 2019). Is this a known problem? Is there a plan to address it? [[User:Julien Baley|Julien Baley]] ([[User talk:Julien Baley|talk]]) 23:17, 27 August 2020 (UTC)
 
:Indeed, there seems to be an issue with the dataset updating. I opened a [[phab:T261519|Phabricator ticket]] about this issue. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:24, 28 August 2020 (UTC)
 
  
== About the exclusion of already recorded words ==
+
Hello, my user name is Ngangaesther, from Kenya. I am still stuck on how I am supposed to create my user page. Kindly help.
Hi, I think the option to exclude words that I have already recorded is broken. This morning, I started a recording session and LL proposed words that I recorded two days ago. For example, I already recorded [https://commons.wikimedia.org/wiki/File:LL-Q143_(epo)-Lepticed7-Belorusino.wav Belorusino] two days ago, but it does not disappear when I click "exclude words already recorded". Note the two versions of the file, since I have now re-recorded it. Can someone fix this? [[User:Lepticed7|Lepticed7]] ([[User talk:Lepticed7|talk]]) 10:07, 15 November 2020 (UTC)
+
Regards, Esther
:I have opened a [[phab:T267876|Phabricator ticket]]. It may be fixed in the coming months but not sure. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:05, 15 November 2020 (UTC)
 
  
== Issue with the Main page ==
+
== Odia language missing from [[LinguaLibre:Stats/Languages|Stats/Languages]] ==
{{Move|LinguaLibre:Technical board|type=section}}
 
  
{{done}}
+
Hi there, for some reason, the Odia-language stats are missing from the [[LinguaLibre:Stats/Languages|Stats/Languages]] page. Also, the "The most prolific speakers for the current month" section on the [[LinguaLibre:Stats/Speakers|Stats/Speakers]] page has not been loading at all since I last checked (about 10 days ago). I have tried on Chromium and Firefox and the result is the same even after clearing the cache. --[[User:Psubhashish|Subhashish]] ([[User talk:Psubhashish|talk]]) 19:40, 28 July 2022 (UTC)
 +
:Hello [[User:Psubhashish|Subhashish]], it should be back online. We had a hackathon to put it back. We are calling for devs to push forwards. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 11:07, 10 August 2022 (UTC)
 +
:: Thank you for the update, [[User:Yug|Yug]]. --[[User:Psubhashish|Subhashish]] ([[User talk:Psubhashish|talk]]) 14:00, 10 August 2022 (UTC)
  
Hi, the main page here uses a solution that requires [[MediaWiki:Lang]] and its subpages to be populated, which they aren't. As a result, the main page does not switch languages even if a translation is available in that language and the language has been set. Could someone look into whether it's possible to rework the structure or maybe somehow import the [[MediaWiki:Lang]] subpages? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 17:40, 16 January 2021 (UTC)
+
== Manually-coded languages ==
:Hello {{ping|Sabelöga}} thank you very much for this remark! I just imported the MediaWiki:Lang subpages from Meta, and it seems to be working as of now 🙂 All the best. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 00:09, 17 January 2021 (UTC)
 
::That's excellent, thank you so much for amending this, regards --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 21:59, 18 January 2021 (UTC)
 
:::Thanks a lot! The main page looks fine --[[User:Higa4|Higa4]] ([[User talk:Higa4|talk]]) 04:34, 19 January 2021 (UTC)
 
  
== Images missing ==
+
I came across [[:meta:Lingua Libre/SignIt]] recently (via betawiki) and was wondering if manually-coded languages would be appropriate for this as well? These are languages in sign modality, but strongly tied to a spoken/written language; they usually adopt the grammar of the nonmanual language, choosing instead to simply transpose the vocabulary. This means they are most often used in application-specific and pidgin contexts (Pidgin Sign for English and diver's signs are examples). In particular, I am interested in ''toki pona luka'', a manual form of {{q|338540}}. Since the vocab is the same as spoken/written toki pona, there are a minimal number of lexemes overall, so having a complete set of signs is easily achievable. Manually-coded languages including ''toki pona luka'' are generally not given a separate ISO 639 code since they are in effect equivalent to scripts. Would this cause a problem for the infrastructure as currently designed? [[User:Arlo Barnes|Arlo Barnes]] ([[User talk:Arlo Barnes|talk]]) 05:56, 17 August 2022 (UTC)
{{done}}<br/>
 
{{Move|LinguaLibre:Technical board|type=section}}
 
Images on [[Help:Add_a_new_language]] show up as missing on my end. Have they moved or is this some error? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 21:08, 19 January 2021 (UTC)
 
:Hello {{ping|Sabelöga}} they are not actually missing (for example, I can see them on the page you are talking about), but I also have experienced similar issues on the website. Some images seem to randomly disappear for no reason, and to come back after a while, without any modification on the page. We will talk about it during the next team meeting. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 21:19, 19 January 2021 (UTC)
 
:::Actually, some images are really missing in the section "I know what I'm doing". This comes from the fact that all images were lost when the website migrated to the new design. I opened a [[phab:T264332|ticket]] about that, but I think we will never recover them. So we should create new screenshots whenever we discover such missing images. I will try to do so for the page you've mentioned. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 23:06, 19 January 2021 (UTC)
 
::::Yes, those were the ones I was talking about. --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 16:56, 23 January 2021 (UTC)
 
  
== Translation error? ==
+
----
The translation units on [[Help:Configure_your_microphone]] do not align properly with each other. What I mean is that the translation units include several sections when the software should pick just one for each unit. I removed the <code><nowiki>__TOC__</nowiki></code> from the page since the TOC will appear anyway. Could a translation administrator mark the page for translation again, so we can see if that solves the issue? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 16:56, 23 January 2021 (UTC)
 
:{{done}} — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:21, 23 January 2021 (UTC)
 
  
== RecordWizard drops syllable ==
+
Hello [[User:Arlo Barnes|Arlo Barnes]],
:{{done}} -- can be archived. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:03, 27 January 2021 (UTC)
 
Prior to several days ago I was recording word pronunciations without problems, but now I can't. Several days ago RecordWizard started removing certain sounds from my voice, most frequently syllables "s" and "f". If I spell a word, for example "syllable", it records it as "yllable", just like if I never spelled "s". If I spell "sophicated", it records "soicated".
 
  
This sort of behaviour is not present at the test sound stage (the first thing RecordWizard asks the user to do), which captures my speech perfectly. However, the very same issue is present on different devices with different operating systems, different browsers and different microphones.
+
I understand "manually coded languages" as synonymous to "signed languages", am I correct?<br />If there is no distinct ISO for the signed language, we could still:
 +
* Create a new wikidata item without ISO, which will be used as identifier by LinguaLibre infrastructure
 +
* Use the spoken/written language's ISO code, and create lists of words all suffixed by <kbd>(signed)</kbd>.
 +
Either of those solutions could work.
  
My guess is that maybe some changes to noise recognition were deployed several days ago, and it now misinterprets those syllables as background noise. Anyway, I will be grateful for suggestions on how to fix this issue. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 07:43, 26 January 2021 (UTC)
+
If you have some knowledge of signed ''toki pona luka'' please let me know. We are adding features on Lingualibre and SignIt in order to be able to record video of signed words by late 2022. We are almost there. If you would like to record some basic signed words to share with the world, then let me know. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:58, 17 August 2022 (UTC)
  
:Hi {{u|Tohaomg}}, it is not easy to say what happens here. I am pretty sure nothing has changed at the backend for several months. With your examples "syllable" and "sophicated", does it happen every time you try to pronounce these words or does it happen randomly? In the first case, can other contributors try to record these words and see if the problem occurs for them as well? Myself, I just tried and I did not see this problem. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:39, 26 January 2021 (UTC)
+
: Signed languages and manually-coded languages share similarities (the manual modality) and differences (since sign languages are 'native' to the signed modality, they use it more fully, having complete deixis and time-reference systems, use of handshape classifiers, etc.) -- 'luka' means 'hand'/'five', so that's the part of the name that indicates the manual modality, but otherwise it's just garden-variety toki pona. I am interested in using SignIt to record this vocab, yes. The '(signed)' suffix seems like a good way to do it. [[User:Arlo Barnes|Arlo Barnes]] ([[User talk:Arlo Barnes|talk]]) 13:16, 19 August 2022 (UTC)
 +
::[[User:Arlo Barnes|Arlo Barnes]]: We increasingly have [[:commons:Commons:Bots/Requests/Dragons_Bot_(2)|tools]] to update and correct sign language recordings, so if the suffix <code>(signed)</code>, or whichever solution we choose, turns out to be incorrect, we can still correct it later using that bot.
 +
::I would encourage you to first train yourself and learn that manually-coded language over the coming months. Indeed, we still have one last bug in our video recording chain, which makes valid videos appear as audio on Commons. We expect to solve this last issue this fall (September or October?). So for now, I encourage you to rest well and recharge, to get ready to record later this year. Maybe identify a suitable place near you with an elegant monochrome wall to film against, or consider building yourself a low-cost recording studio. We can discuss how to keep it low-cost and effective if you are interested, as I'm also looking for such walls and/or considering building one for myself.
 +
::See also : [https://github.com/lingua-libre/SignIt/issues/18 Minimal Sign Language Studio guideline]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 22:30, 19 August 2022 (UTC)
  
::I am not trying to record '''exactly''' those words, they are just examples to show you what I mean. I am actually trying to record words in the Ukrainian language. When I try to record words, in some 3 cases out of 4 syllables are dropped, and in 1 out of 4 they are not, so I need on average 4 attempts to record a word. And this problem appeared abruptly several days ago, everything worked fine before. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 08:09, 27 January 2021 (UTC)
+
== Update my username ==
:::Could it come from your microphone? Did you try to record with other hardware for a test? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 10:36, 27 January 2021 (UTC)
 
::::Yes, it happens on different devices. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 11:59, 27 January 2021 (UTC)
 
  
Solved it. Turns out, this effect is present only when loading lists longer than several hundred words. My next theory is that it was due to some sort of RAM shortage. Thank you for your time. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 13:06, 27 January 2021 (UTC)
+
I have changed my Wikimedia username but the previous name still appears in Lingua Libre. I know it's not included in unified logins. Anyway, please update my username to Aishik Rehman. [[User:Hirok Raja|Hirok Raja]] ([[User talk:Hirok Raja|talk]]) 15:14, 1 September 2022 (UTC)
:This bug stays weird...
+
: Hi Hirok Raja, would you have an example of what you would like to see changed? I think you are talking about the filename, but I am not sure, so an example would make it clearer. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]])
:Anyway, thanks Tohaomg for your audios <3 [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:03, 27 January 2021 (UTC)
+
::{{ping|Pamputt}} <br/> 1. The top menu bar of lingualibre.org shows 'Hirok Raja' as my profile name. <br/> 2. After uploading, when I try to check my uploads in Commons, it takes me to the https://commons.m.wikimedia.org/wiki/Special:ListFiles/Hirok_Raja page. <br/> 3. 'Hirok Raja' is used as the default recorder in the file names and descriptions. <br/> 4. Changing the speaker name to 'Aishik Rehman' every time I record is quite annoying. <br/> 5. Even here 'Hirok Raja' is showing as my signature by default ): [[User:Hirok Raja|Hirok Raja]] ([[User talk:Hirok Raja|talk]]) 19:16, 2 September 2022 (UTC)
::Happy to see that you've found a workaround. Indeed, your guess could be the reason for the problem because the server is currently not very robust. It should evolve in the coming weeks/months. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:04, 27 January 2021 (UTC)
+
:::I suspect this is due to long-term cookies. It would be worth clearing your Lingualibre login cookies: this will log you out; then log in again and come back here. [https://support.mozilla.org/en-US/kb/storage?as=u&utm_source=inproduct&redirectslug=permission-store-data&redirectlocale=en-US On Firefox].
 +
:::Open <code>about:preferences#privacy</code> > Go to "Cookies and Site Data"> Click "Manage Data" > Search "Lingualibre" > Remove selected. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:10, 2 September 2022 (UTC)
  
== Technical > Github Winter 20-21 review ==
+
== Siège communautaire de Wikimédia France – ouverture du vote / Community representative to Wikimédia France’s board - votes are opened ==
Following 0x010C's departure in October 2020, we reviewed the human resources needed to maintain the various technical subprojects. Thanks to 4 months of community effort, things are in a better position now:
 
* '''Definitions:''' All repositories are now well defined via a clean, one sentence descriptor. It maps sub-projects, so new volunteers know quickly what repository does what. See [https://github.com/lingua-libre/ github.com/lingua-libre].
 
* '''Mentors:''' 2/3 of repositories now have a volunteer referee-mentor with a "correct" understanding, able to discuss the repository and guide newcomers.
 
* '''Documentations:''' Most repositories are "correctly" documented via an existing readme.md. Improvement always welcome.
 
* '''Web servers:''' Wikimedia France hired a new sysop, who guides and teams up with volunteers on server issues. Welcome to WMFR's MickeyBarber/Michael.
 
* '''Maintenance:''' Wikimedia France is reviewing freelance candidates for deeper MediaWiki and RecordWizard coding support. Thanks to Adelaide & WMFR's team.
 
* '''Globalization:''' Wikimedia France plans to expand volunteering toward India. Thanks to Adelaide & WMFR's team.
 
All pretty positive. Pamputt, Jitrixis, Poslovitch, Adelaide, Mikey and myself pushed forward on these fronts.
 
  
'''Still ! The following repositories are currently leaderless and contributorless:'''
+
(English version below. Do not hesitate to correct my English translation.)
* [https://github.com/lingua-libre/LinguaRecorder LinguaRecorder]: Powerful JS library to manage audio recording: intelligent cutting with regular padding, saturation control, various export options,...
 
* [https://github.com/lingua-libre/RecordWizard RecordWizard]: MediaWiki extension allowing mass recording of clean, well cut, well named pronunciation files.
 
* [https://github.com/lingua-libre/QueryViz QueryViz]: MediaWiki extension adding a <query> tag to display sparql queries results inside wiki pages
 
  
'''LinguaLibre Bot''' is under review by [[user:Poslovitch|Poslovitch]] but could use some more love. LinguaLibre Bot is '''the most impactful yet underused piece''' of our sub-projects, since it needs to be ''authorized'' per target language (e.g. to add audios to Tamil Wikipedia articles) and is only authorized for a few languages and wikis:
+
(Message copié depuis le bistro du jour par [[User:Lepticed7|Lepticed7]] ([[User talk:Lepticed7|talk]]))
* [https://github.com/lingua-libre/Lingua-Libre-Bot Lingua-Libre-Bot]: MediaWiki bot facilitating the reuse of Lingua Libre's audio records on many wikis, including Wikipedias and Wiktionaries.
 
  
Satellite linguistic project:
+
Bonjour,
* [https://github.com/lingua-libre/SignIt SignIt]: ''LinguaLibre SignIt'' is a web-browser extension which translates a word in Sign Language, in order to learn sign language while reading online.
 
  
The end ! Thanks to all those who helped and are joining in :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 13:11, 1 February 2021 (UTC)
+
En tant que président de la commission électorale pour [[:m:Wikimédia France/Gouvernance/Siège communautaire|l'élection du siège communautaire au conseil d'administration de Wikimédia France]], je vous annonce que le vote ouvre aujourd'hui (13 septembre) à 0h CEST. Il se terminera le 26 septembre à 23h59 CEST.
  
== LinguaLibre International call (France-India-others) ==
+
Comme il y a trois ans, le scrutin est public sur Meta. Les pages de votes sont disponibles dans [[:m:Category:Wikimédia France/Gouvernance/Siège communautaire/2022/Votes|la catégorie correspondante]] ou en lien sur la page principale. C'est un scrutin par approbation, le candidat qui aura le plus grand nombre de voix sera donc déclaré élu. Vous pouvez voter pour autant de candidats que vous le souhaitez.
:{{Done}} -- Please refer to [[LinguaLibre:Events#2021_International_call]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 11:46, 12 February 2021 (UTC)
 
Namaskara/Hello,<br>Earlier we noted that we started getting more participation from India (I am from India as well). In October last year  when I had around 15,000 uploads, I then contacted Wikimedia France with the following idea, what I wrote in the email then:
 
<blockquote>I believe in India, as we have many languages and dialects, the tool is specially relevant. I wanted to have a discussion with the people working on the project, or can help with this idea.</blockquote>
 
This email was followed by a call with Adelaide and most possibly Lyokoi joined. Adelaide kindly invited me to attend another call, where I could briefly meet many of you.<br>
 
Now, I know, there is more interest from India (different languages). You might have seen some work in Marathi very recently. Just two days ago I attended a brief India (Maharashtra) LinguaLibre meeting. I did not expect to see so many participants, but it looks like around 10 or so people are interested to record Marathi pronunciation.
 
  
So, is it possible to have a France-India call? It is absolutely OK for me to make it an "international call", so that everyone can join. Here we can have some of the people from India (any country) who can tell their plans, ask questions, get to know from you, or share experience. There might be ideas and questions related to setting up the project page etc. Around 5 or more people from India will be interested to join, I think.
+
Si vous avez des questions, vous pouvez les poser sur la page de discussion ou par courriel à election@wikimedia.fr.
  
PS: Most possibly there are LinguaLibre calls arranged. Adelaide kindly invited me to two such calls. Otherwise, I do not get to know about these calls. If these calls are open and anyone can join, possibly can we announce the call dates and time on this Project Chat, so that anyone interested/eligible can join?
+
Pour la commission électorale, Mathis B, le 12 septembre 2022 à 22:00 (CEST)
  
Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 20:03, 1 February 2021 (UTC)
+
----
:Hello Titodutta,
 
:Nice to hear the news of a 10-person Lingualibre workshop in India's Marathi community. This is wonderful.
 
:Adelaide is definitely the person to contact for institutional relationships and workshops. She is coordinating, piloting and animating this project for Wikimedia France and knows who is who, where the human resources are, what is on our wish-list and what our next moves are.
 
:I am interested in this call as well. Santosh, an Indian Wikimedian, also contacted us (via email) with a similar need.
 
:I will send you an email to group us all. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 09:41, 2 February 2021 (UTC)
 
::Yes, it would be good to have documentation of the project page creation process on this site. Other than Marathi, briefly, a few Kannada students from a south Indian college started working on Lingua Libre (you may see an event page [[:m:Alva's Wikipedia Student's Association/Events/Lingua Libre training session]]). Similarly, you might have seen some involvement from the Punjabi community where [[User:Nitesh Gill]] and a couple of other Punjabi community members are working. <br>Other than Indian languages, there is a good response from Japanese, Ukrainian, and a few other languages. From all these the thought of the "International call" came to my mind. <br>Other than small projects we can also think of "small events" in the future, such as a LinguaLibre-a-thon or Libre-a-thon (similar to an edit-a-thon; for example, on World Environment Day we can get together and record pronunciations of environment-related words that are not yet on Commons. This can be a small event where we record our own respective languages). Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 23:51, 2 February 2021 (UTC)
 
:::{{ping|titodutta|सुबोध कुलकर्णी}} From what I see now with France and India, It seems the best seeds are with already very active wikipedians with interest in languages.
 
:::We also have a group of successful seedings due to already active Wikimedians who have some institutional roles (Lyokoi, WikiLucas, Titodutta, सुबोध कुलकर्णी_Subodh). Basically, according to the data, one out of 10 speakers who try Lingualibre really sticks with it. So you need someone really active and outreaching, training 20 to 30 people, to initiate a local community.
 
:::Note: I created [[LinguaLibre:Events#2021_International_call]], please fill in information as needed. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:07, 8 February 2021 (UTC)
 
  
==Marathi language stats==
+
(Message copied from the French Wikipedia Bistro by [[User:Lepticed7|Lepticed7]] ([[User talk:Lepticed7|talk]]))
{{Move|LinguaLibre:Technical board|type=section}}
 
:{{done}} [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:09, 8 February 2021 (UTC)
 
Marathi (mar) has about 2,600 records on [[:Commons:Category:Lingua_Libre_pronunciation-mar|Wikimedia Commons]], but this is not reflected in the LL stats (records per language), which show just 163. Could anyone please look into this and resolve it? [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:08, 3 February 2021 (UTC)
 
:Hi {{u|सुबोध कुलकर्णी}}. Lingua Libre has been suffering from a bug since the end of 2020. New developers are looking into this issue. Let us hope it will be fixed in the coming weeks. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:05, 3 February 2021 (UTC)
 
::{{Ping|सुबोध कुलकर्णी|सुबोध कुलकर्णी}} it's fixed ! Data are back online thanks to the devs hired by Wikimedia France and Adelaide. You can also use the {{tl|User records-mar}} template on userpages to tag speakers/uploaders. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:17, 11 February 2021 (UTC)
 
  
== Stats : toward records and beyond... ==
+
Hello,
Folks, given the stats page is broken [paid devs will fix it in coming weeks thanks to Wikimedia France !], I jumped with some regex to do the maths:
 
* [[:Commons:Category:Lingua_Libre_pronunciation]] > 101 languages, 385,929 audios as of RIGHT-NOW-NOW.
 
We will likely reach <big>400,000 this very month</big>. This feat is largely due to the recent rise of Indic languages. We must also note that most languages only have from 3 to 50 words, from people trying things out. Best results are achieved if we get users to commit a bit; then things truly take off. Another thing: our 7 most active users provided 200,000 of our audios. 20 users contributed more than 3000 audios, and 20 others between 1000 and 3000 audios, so about 10% of speakers really hit it off. Quite interesting ! In my opinion we still have bottlenecks in:
 
* reaching out to diverse & minority languages ;
 
* getting contributors to contribute consistently ;
 
* and creating words lists for our users.
 
Inventing and exploring new methods for each of these bottlenecks is always welcome. The recent success with Marathi ([[:Commons:Category:Lingua Libre pronunciation-mar]]: 15 C, 3,011 F) is a great example of reaching outside our usual pool; we can surely learn from this initiative.<br/>
 
I will hide the per-language stats below, in TSV format, in case you want to check them.
 
<!--
 
ISO3 QUANTITY
 
others 404
 
afr‎ 2,541
 
amh‎ 209
 
ara‎ 4,683
 
arq‎ 241
 
ary‎ 1,286
 
atj‎ 492
 
aze‎ 325
 
bam‎ 44
 
bas‎ 105
 
bbj‎ 186
 
bci‎ 85
 
bcl‎ 244
 
bdu‎ 108
 
ben‎ 49,734
 
bik‎ 112
 
bre‎ 1
 
bse‎ 1
 
bum‎ 113
 
bzm‎ 109
 
cat‎ 954
 
ces‎ 10
 
cmn‎ 23
 
cym‎ 705
 
deu‎ 12,952
 
dua‎ 94
 
duf‎ 3
 
dyu‎ 47
 
ell‎ 18
 
eng‎ 12,511
 
epo‎ 28,823
 
eus‎ 3,050
 
fas‎ 817
 
fin‎ 63
 
fon‎ 93
 
fra‎ 178,271
 
fsl‎ 2
 
gaa‎ 209
 
gcf‎ 456
 
glg‎ 32
 
gsc‎ 4,879
 
hat‎ 15
 
hav‎ 129
 
heb‎ 73
 
hin‎ 432
 
hye‎ 28
 
ind‎ 5
 
ita‎ 2,357
 
jpn‎ 13
 
kor‎ 56
 
kab‎ 18
 
kan‎ 487
 
ken‎ 35
 
kik‎ 54
 
krc‎ 1
 
kur‎ 546
 
dag‎ 14
 
hau‎ 8
 
pan‎ 2,178
 
lnc‎ 5,316
 
ltz‎ 86
 
mal‎ 9
 
mar‎ 3,011
 
mcn‎ 24
 
mhk‎ 77
 
mkd‎ 12
 
mlg‎ 419
 
mos‎ 175
 
mua‎ 34
 
myv‎ 15
 
nld‎ 1,226
 
nor‎ 29
 
nso‎ 45
 
oci‎ 13,955
 
ory, ori‎ 3,728
 
pcd‎ 4
 
pol‎ 9,621
 
por‎ 2,586
 
que‎ 25
 
rcf‎ 17
 
ron‎ 2,065
 
rus‎ 3,558
 
spa‎ 2,559
 
sat‎ 434
 
shi‎ 35
 
shy‎ 1,210
 
srr‎ 1
 
swe‎ 504
 
tam‎ 104
 
tat‎ 3
 
tay‎ 4
 
tel‎ 5
 
tgl‎ 7
 
tha‎ 208
 
ukr‎ 16,216
 
vie‎ 1,234
 
wls‎ 3
 
ybb‎ 21
 
yue‎ 5,670
 
zho‎ 190
 
  
-->
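For anyone who wants to re-check the totals from the hidden table above, here is a minimal Python sketch. It assumes each data row is a label followed by a count with comma thousand separators, and that invisible left-to-right marks may trail the codes; the function name <code>parse_counts</code> and the short sample are illustrative only, not the regex actually used for the post.

```python
import re

def parse_counts(text):
    """Parse rows such as 'fra 178,271' into a {label: count} dict."""
    counts = {}
    for line in text.splitlines():
        # Drop invisible left-to-right marks that may follow the ISO codes.
        line = line.replace("\u200e", "").strip()
        # Label, whitespace, then a number with optional thousand separators.
        match = re.match(r"^(.+?)\s+([\d,]+)$", line)
        if match:
            counts[match.group(1)] = int(match.group(2).replace(",", ""))
    return counts

# A few rows copied from the table above (header row is skipped automatically).
sample = """ISO3 QUANTITY
fra 178,271
ben 49,734
epo 28,823
ory, ori 3,728
bre 1
"""

counts = parse_counts(sample)
total = sum(counts.values())
top3 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(total, top3)
```

Running the same function on the full table gives the per-language counts and their sum in one pass.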
+
as the chairman of the electoral commission for [[:m:Wikimédia France/Gouvernance/Siège communautaire|the election of the community representative to Wikimédia France’s board]], I announce that votes open today (13th september) at 0:00 CEST. They will be closed on 26th september at 23:59 CEST.
[[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:34, 6 February 2021 (UTC)
 
:10% of speakers committing to 1000 recordings or more is very interesting. It suggests that, if we give a workshop to 10 people, one committed speaker should emerge, thereby kick-starting this language. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 01:20, 8 February 2021 (UTC)
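As a rough sanity check of this workshop estimate, here is a small Python sketch. It assumes, purely for illustration, that each attendee independently has a 10% chance of becoming a committed speaker; the function name is hypothetical and real retention will of course vary with the community.

```python
def chance_of_at_least_one(p, n):
    """Probability that at least one of n attendees commits,
    assuming each commits independently with probability p."""
    return 1 - (1 - p) ** n

for attendees in (10, 20, 30):
    print(attendees, round(chance_of_at_least_one(0.10, attendees), 2))
```

Under that assumption, ten attendees yield one committed speaker on average, but training 20 to 30 people makes it much more likely that at least one stays.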
 
  
== Reminder : Grants ==
+
Like it was the case three years ago, voting is on Meta. Voting pages are available in [[:m:Category:Wikimédia France/Gouvernance/Siège communautaire/2022/Votes|the corresponding category]] or as links in the main page. The elected candidate will be the one with the most approbation votes. You can vote for as many candidates as you wish.
Hello all, I'm monitoring grants these days and there is a summary table available here: [[LinguaLibre:Grants]]
 
  
I think both rapid grant mechanisms could help us now to reach out to local communities via small-scale events, training, hardware, food, transportation costs, flyer designs, etc. For example, [[:meta:Wikimedia_France/Micro-financement/Demande/µFi-2020-10-421441|this WM-France micro-fi request]] organizes 4 evenings of contribution, getting 100€ for each evening. The same user is welcome to make several grant requests.<br> Heavier, the R&D Grant could surely be used for something. I have an idea on this, but we can trust Indian contributors to come up with relevant technical ideas and teams as well. {{ping|Titodutta}} [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 01:20, 8 February 2021 (UTC)
+
If you have any questions, you can ask them on the Talk page on Meta, or by email at election@wikimedia.fr.
  
== LinguaLibre Bot and Wikidata ==
+
For the electoral commission, Mathis B, 22:00, 12 septembre 2022 (CEST)
{{Move|LinguaLibre:Technical board|type=section}}
 
I have not checked the bot's contributions on Wikidata for quite some time. Yesterday I uploaded ~100 Bangla film names from Bangla Wikipedia. It looks like [https://www.wikidata.org/w/index.php?target=Lingua+Libre+Bot&namespace=all&tagfilter=&start=&end=&limit=50&title=Special%3AContributions the bot is not] active, unless I am missing something. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 18:10, 13 February 2021 (UTC)
 
  
== Update and technical improvements ==
+
== Is there a way to exclude username from Wikimedia Commons upload file name? ==
 +
:''See also [[Help:Renaming]].''
 +
This seems redundant and takes up a lot of space --[[User:Middle river exports|Middle river exports]] ([[User talk:Middle river exports|talk]]) 20:22, 9 October 2022 (UTC)
 +
:{{ping|Middle river exports}} Welcome MRE,
 +
:You could name your speaker with a single character I guess.
 +
:But keeping the name is intentional. Each speaker has his/her own voice, which we want to document. If, outside of Wikimedia, you want to remove part of the filename, we have a technical tutorial to do so. See [[Help:Download datasets]] and [[Help:Renaming]]. Ping us back if your dataset is not up to date. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 13:16, 10 October 2022 (UTC)
 +
::I have solved this now by just changing my username to something shorter. This way I can upload English as Usmaan (عثمان) for example where instead of just repeating the username it shows two scripts which is more useful. (Apparently few enough people have Arabic script usernames that short common words are mostly available.) --[[User:Middle river exports|عثمان]] ([[User talk:Middle river exports|talk]]) 20:23, 10 October 2022 (UTC)
 +
:::All Unicode characters should be ok, in words and usernames ;) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:46, 11 October 2022 (UTC)
  
Hi all,
+
== Username update request ==
  
Full information and full disclosure, I'm working now with WikiValley and Wikimédia France in a paid capacity to help improve Lingua Libre technical structure (see [https://www.wikimedia.fr/emploi-wikimedia-france/nos-appels-doffres/appel-doffres-developpement-et-amelioration-de-loutil-web-lingua-libre/ this] - in French - for the scope of our intervention).
+
I realised my username on Mediawiki didn't carry over here when I changed it. On this site, could I please have it changed to: عُثمان
 +
--[[User:Middle river exports|عثمان]] ([[User talk:Middle river exports|talk]]) 08:45, 10 November 2022 (UTC)
  
One of our first actions last Thursday was to restart the Blazegraph updater. A lot of tools depend on this "fundamental brick", including but not limited to the SPARQL endpoint (and pages using it) and the bots. Now you can see that pages like [[Special:MyLanguage/LinguaLibre:Stats]] are up to date again, and the bots should also restart soon (you can find more technical information on [[LinguaLibre:Technical board]]).
+
== Data on LinguaLibre:Stats isn't consistent with Wikimedia Commons's Category ==
  
The next big step will be to update this Mediawiki from 1.31 to 1.35 and move it to a new server.
+
On the Stats page, French has 254,387 records
  
If you see something or anything wrong or strange, don't hesitate to let me know. I'm also available for any question.
+
https://lingualibre.org/wiki/LinguaLibre:Stats/Languages
  
Cheers, [[User:VIGNERON|VIGNERON]] ([[User talk:VIGNERON|talk]]) 08:56, 15 February 2021 (UTC)
+
Meanwhile, the Category on commons.wikimedia.org has 253,464 records
:Nice ! Happy to see you folks jumping in. Thank you for the Stats ! We can witness our passage over 400,000 audios shortly. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:27, 15 February 2021 (UTC)
 
  
== 400,000 ==
+
https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-fra
  
The total amount of recordings on Lingua Libre reached '''400,000''' a few hours ago. February is already the second most fruitful month since the beginning of the project, even though we are only halfway through. LiLi is growing faster and faster, and this is only the beginning!<br/>Congratulations and thanks to everyone who gives some time to record voices and to spread the project around the world.<br/>
+
The stats display more records. This data inconsistency is strange. -- [[User:Shenlebantongying]], 10:36, 23 december 2022.
All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 18:10, 16 February 2021 (UTC)
+
:This means some item pages exist here, but the corresponding audio files are not on Commons.
:And another milestone broken ! Big thanks to the [[user:Titodutta|Titodutta]] and Marathi effects, too ! [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:24, 16 February 2021 (UTC)
+
:Item creation here and upload are done at step 5 of the recording, nearly simultaneously.
::[[User:Yug|Yug]], [[User:WikiLucas00|WikiLucas]] and [[user:Titodutta|Titodutta]]- thanks for the support! Marathi community had decided to gift minimum 5000 records on the occasion of [[:en:Marathi Language Day|Marathi Language Day]] to be celebrated on 27 February. We have crossed 6000 records as of now. All credit goes to community members. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:22, 26 February 2021 (UTC)
+
:So I don't know what is going on. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 17:41, 26 December 2022 (UTC)
::See also [[:Commons:Category:Lingua_Libre_pronunciation-mar]]
 
:::Congratulations to the Marathi community ! It's nice to see you contribute this way :) [[User:Yug|Yug]] ([[User talk:Yug|talk]])
 
  
== Chat room in your language ==
+
== [[c:Category:Lingua Libre pronunciation-bxg]] ==
  
Hi all. I've created [[Template:Lang-CR]] in order to list all the chat rooms. I think it would be interesting for people to discuss in their native language. The main discussion should remain on this chat room in English in order to be understood by most of the contributors. So feel free to create a village pump/chat room in your mother tongue. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:21, 16 February 2021 (UTC)
+
All files in this category are tagged with the wrong language. I have requested moves for the files in the category, but what more needs to be done?--[[User:GZWDer|GZWDer]] ([[User talk:GZWDer|talk]]) 13:05, 12 January 2023 (UTC)
:It is a welcome move. We need to discuss many local issues, policies, approaches, ideas etc. in our own language. I have created the Mar page [[LinguaLibre:संवाद-चर्चा दालन|संवाद-चर्चा दालन]]. Let me know whether the process is right. I will start engaging speakers here. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:36, 26 February 2021 (UTC)
+
: Thanks for reporting. Actually all these items are erroneous (see [[Special:WhatLinksHere/Q590228]]):
::{{ping|सुबोध कुलकर्णी}} that's perfect. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:40, 26 February 2021 (UTC)
+
:* {{Q|798236}} (wrong language code)
 +
:* {{Q|802994}} (wrong language code)
 +
:* {{Q|802995}} (useless)
 +
:* {{Q|802996}} (useless)
 +
:* {{Q|802998}} (useless)
 +
:* {{Q|802999}} (useless)
 +
:* {{Q|803000}} (useless)
 +
:* {{Q|803001}} (useless)
 +
:* {{Q|803002}} (useless)
 +
:I have not checked yet if corresponding recordings are still on Commons. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:11, 13 January 2023 (UTC)
  
== New batch of lists available ! (1,000 languages) ==
+
== I can not publish my records recorded via Lingua Libre. ==
:''Please, remember to tag the list_talk's page with {{tl|UNILEX license}}.''
 
Greetings!<br>Thanks to [[:commons:user:Tshrinivasan|Tshrinivasan]], with whom we discussed recent Indic (Marathi!) activity and the lack of lists, I bumped again into UNILEX (GNU-like license), a Google-led Unicode Consortium project listing vocabulary for 999 languages. The data seems clean as far as I can tell. The two main maintainers are Google folks, so I suspect UNILEX uses Google's best scrapers and NLP cleaners. Within this data are tab-separated frequency lists formatted as <code>{item}  {number_of_occurences}</code>. I forked their GitHub repository and made a script to convert their format into Lili's <code>List:*</code> format, i.e. <code># {item}</code>. See:
 
* [https://github.com/lingua-libre/unilex/ github.com/lingua-libre/unilex]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency data/frequency]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency/ig.txt ig.txt] – frequency
 
* [https://github.com/lingua-libre/unilex/ github.com/lingua-libre/unilex]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-count data/frequency-sorted-count]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-count/ig.txt ig.txt] – sorted
 
* [https://github.com/lingua-libre/unilex/ github.com/lingua-libre/unilex]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash data/frequency-sorted-hash]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash/ig.txt ig.txt] – Lili's List format
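The conversion described above can be sketched in a few lines of Python. This is not the actual script from the fork, only a minimal illustration; it assumes that comment lines start with <code>#</code>, that data rows are <code>{item}</code>, a tab, then an integer count, and that anything else (such as a column header) can be skipped. The function name <code>unilex_to_lili</code> is hypothetical.

```python
def unilex_to_lili(lines):
    """Turn '{item}<TAB>{count}' rows into '# {item}' rows, most frequent first."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments carry no data
        item, _, count = line.partition("\t")
        if not count.strip().isdigit():
            continue  # skips column headers and malformed rows
        rows.append((item, int(count)))
    # Highest frequency first; equal counts keep their input order.
    rows.sort(key=lambda row: row[1], reverse=True)
    return [f"# {item}" for item, _ in rows]

example = ["# some licence comment\n", "Form\tFrequency\n", "na\t120\n", "ya\t300\n", "\n"]
print("\n".join(unilex_to_lili(example)))
```

The output can be pasted as-is into a <code>List:</code> page.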
 
You can check whether your own language is among the 999 available. For Marathi, replace <code>ig</code> with <code>mr</code>. I therefore created 2 local lists to test this approach :
 
* [[List:Mar/words-by-frequency-00001-to-01000]] – starts soft
 
* [[List:Mar/words-by-frequency-01001-to-05000]] – then it jumps to multiples of 5,000 : 01001-05000, 05001-10000, 10001-15000, etc.
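The page-splitting scheme above (a soft first page of 1,000 words, then pages ending on multiples of 5,000) can be sketched as follows. The helper names <code>list_ranges</code> and <code>list_title</code> are hypothetical and only mirror the two page titles shown above; the real bot may slice differently.

```python
def list_ranges(total, first=1000, step=5000):
    """Return (start, end) rank ranges: 1-1000, 1001-5000, 5001-10000, ..."""
    ranges = []
    start = 1
    end = min(first, total)
    while start <= total:
        ranges.append((start, end))
        start = end + 1
        # After the first page, each page ends on the next multiple of `step`.
        end = min(((start - 1) // step + 1) * step, total)
    return ranges

def list_title(iso, start, end):
    """Build a title such as List:Mar/words-by-frequency-00001-to-01000."""
    return f"List:{iso.capitalize()}/words-by-frequency-{start:05d}-to-{end:05d}"

for start, end in list_ranges(12000):
    print(list_title("mar", start, end))
```

The last page is simply cut at the end of the available vocabulary, so a 12,000-word list ends with a 10001-to-12000 page.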
 
'''<span style="color:green">Right now, 1000 lists are already formatted in Lili's syntax within the [https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash /data/frequency-sorted-hash] directory.</span>''' If any community lacks wordlists on Lili, there you have them : copy, paste, done, situation unlocked ! [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:40, 24 February 2021 (UTC)
 
:{{ping|Titodutta}} hi! This may interest your community. There are dozens of Indic languages :) It could also help you: you have already recorded most of those words for your language (ben), but together with the "ignore already recorded words" function, these lists can fill some gaps :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:48, 24 February 2021 (UTC)
 
::* I love this. I'll inform the Marathi folks. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 17:16, 24 February 2021 (UTC)
 
::* This is just amazing. You don't know how delighted I am feeling at this moment. I checked the Bengali list; a very few random words have typos, but that should not be more than 1% I guess. Overall this will be an extremely helpful resource for the communities. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 17:24, 24 February 2021 (UTC)
 
:::* I share your enthusiasm ! I'm pretty sure it's bot-created, and the clean-up is likely just statistical. Now that those lists are technically available, the ideal next step would be human review by local communities. Maybe groups of 2~3 users for copyedit sprints ? :D But this is optional IMHO. Also, since the corpora come from online documents, IRL objects like `chair`, `car`, `walk` may be further down these lists. But they must be there in the first 20,000 items. The best part is the linguistic diversity of this set. Amazing. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:10, 24 February 2021 (UTC)
 
::::*It's a good resource indeed. Thanks! The Marathi words in the list are grammatically correct also, with nearly no typos. We have started discussion about this in our community. Currently, we have started working on Lexemes first, the recordings of the lists thus created will be done simultaneously. The community thinks this approach is more useful in long run. The separate group of speakers may adopt these lists. But then we have to devise way to avoid repetitions. We will definitely discuss more on this resource utilisation and let you know.[[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:14, 26 February 2021 (UTC)
 
[[:commons:user:Tshrinivasan|Tshrinivasan]], [[User:Yug|Yug]] - The Marathi community plans to work on these lists, but [https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash] is giving a 404 error. Please help. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:54, 5 March 2021 (UTC)
 
:[[:commons:user:Tshrinivasan|Tshrinivasan]], [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] : It's under active development these days, so I made a few changes.
 
:* Currently at: [https://github.com/hugolpz/unilex-extended/tree/master/frequency-sorted-hash /hugolpz/unilex-extended/frequency-sorted-hash] which uses UNILEX as a git submodule to respect each project's scope.
 
:* I just ran the script for Marathi, so the lists are now local. When picking a list, type <code>List:Mar/M</code>:
 
:See also the section below. My apologies for the changes. I hope they didn't affect you too much. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 07:47, 5 March 2021 (UTC)
 
=== Pause before running ===
 
[[File:Long_tail.svg|thumb|Long-tail curves likely apply to languages ranked by number of speakers. Since macro-languages such as Mandarin, English, Spanish, Hindi, etc. are certain to be audio-documented soon by the sheer force of demography, our effort and strategy should progressively shift toward the right, to increasingly rare languages. The rarer the languages and speakers, the more attentive we should become and the more custom assistance we will have to provide.]]
 
[[User:Dragons Bot|Dragons Bot]] has been created, coded, tested, and is ready to import UNILEX's lists into LinguaLibre's <code>List:{iso}/{title}</code> namespace. Given that 1,000 pages and their associated talk pages will be created, I would like to pause for a few days to consider this large list import and the reasons for it.
 
* Lili > Languages > existing breadth: We have reached 110 languages on LinguaLibre so far.
 
* Lili > Lists > non-sorted by usefulness : SPARQL queries provide lists for all languages, but without prioritization by the words' usefulness.
 
* Lili > Lists > sorted by usefulness :
 
** Hand picked frequency lists are present for about 7 languages : eng, mar, por, pol, tam, ron, kur. With optimal relevance for teaching/learning.
 
** {{u|Olafbot}}'s <code>List:*/Lemmas-without-audio-sorted-by-number-of-wiktionaries</code> for 72 languages, updated daily, with optimal relevance for wiktionaries.
 
** UNILEX can provide frequency lists for 1,000 languages, about 10 times our current language coverage. UNILEX plugs into [https://github.com/Google/Corpuscrawler Github.com/Google/Corpuscrawler], an open-source project which plans to support more languages. I dived into this chain and it is an 'easy' NLP pipeline to contribute to. The Wikimedia community can use it and expand it.
 
'''Core issue:''' as users arrive online, the core issue is to increase the retention of minority and semi-rare languages by smoothing their speakers' work. For example, a speaker of the [[:en:Wayuu language|Wayuu language]] arrived today. No local (frequency) list was available today. But UNILEX + Dragons Bot can provide a local Wayuu frequency list of 8000 items, ready to record.<br>
 
Since we don't know which semi-rare languages will come next, having 1,000 languages ready is a safe yet not so excessive bet. Assuming a [[:en:Zipf's law]]/[[:en:Long tail]] curve for languages and their speakers, we can still predict that at least one out of 10~20 speakers of a new language will lack a local wordlist. But together with OlafBot's lists, we move from 6% toward 90% of our languages having a solid, '''usefulness-based roadmap''' to walk forward. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:21, 3 March 2021 (UTC)
 
: Well, I believe the idea to import Unilex lists is very good. One of the things a new user needs most is an idea of what to record. The Unilex lists suit this function, especially in the case of new languages, where there is no other list available, and no words have been already recorded. The only question I see is how to import the Unilex lists. Perhaps the best idea is to import 1000 most frequent words from each list. It would be even better if the recorded words were automatically removed from the lists and replaced by new ones (like in the case of Olafbot-managed lists), but even a static list is good as bait if the goal is just to attract more speakers of rare languages.
 
: One remark: you should translate the file names from Unilex to match LiLi's language codes (or perhaps you did it, I don't know, I didn't examine the code). It's not always the same, for example, Polish is "pl" in Unilex, and "Pol" in Lili. If you leave the old codes, the list won't be automatically found when a new user presses the "Local List" button. Anyway, the newbies are likely not to notice the lists at all regardless of all our efforts. [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 00:55, 4 March 2021 (UTC)
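The code translation mentioned in this remark could look like the Python sketch below. The two mappings shown (<code>pl</code> to <code>Pol</code>, <code>mr</code> to <code>Mar</code>) are the only ones taken from this discussion; a real import would need a complete ISO 639-1 to ISO 639-3 table, and the names <code>ISO1_TO_ISO3</code> and <code>lili_list_prefix</code> are hypothetical.

```python
# Illustrative subset only; a full table is needed for a real import.
ISO1_TO_ISO3 = {"pl": "pol", "mr": "mar"}

def lili_list_prefix(unilex_filename):
    """Map a UNILEX file name such as 'pl.txt' to a Lili prefix such as 'List:Pol/'.

    Returns None for unknown codes instead of guessing, so such lists
    can be reviewed by hand rather than created under a wrong code.
    """
    code = unilex_filename.rsplit(".", 1)[0]
    iso3 = ISO1_TO_ISO3.get(code)
    if iso3 is None:
        return None
    return f"List:{iso3.capitalize()}/"

print(lili_list_prefix("pl.txt"), lili_list_prefix("mr.txt"), lili_list_prefix("xx.txt"))
```

Returning <code>None</code> for unmapped codes keeps a list from landing where the "Local List" button would never find it.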
 
  
== jQuery.Deferred exception: this.pastRecords is undefined ==
+
Dear Colleagues,  
:''This discussion may be moved to [[LinguaLibre:Technical board]].''
 
Hello, there.
 
  
When I try to load a list of words to record from the FR wiktionary, the modal does not disappear when I click "Done" and seems blocked trying to load the words. During this time, the JS console complains that "jQuery.Deferred exception: this.pastRecords is undefined", and the last resource loaded is, in cURL format:
+
It records, but when I press the button to publish on Wikimedia Commons, it does not work. It returns "Retry failed upload". Any idea? Thank you. [[User:Key Mîrza|Key Mîrza]] ([[User talk:Key Mîrza|talk]]) 05:09, 28 January 2023 (UTC)
curl 'https://fr.wiktionary.org/w/api.php?action=query&format=json&origin=*&formatversion=2&prop=pageterms&wbptterms=label&generator=categorymembers&gcmnamespace=0&gcmtitle=%3ACat%C3%A9gorie%3ALocutions%20verbales%20en%20fran%C3%A7ais&gcmtype=page&gcmlimit=max' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:85.0) Gecko/20100101 Firefox/85.0' -H 'Accept: application/json, text/javascript, */*; q=0.01' -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3' --compressed -H 'Origin: https://lingualibre.org' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Referer: https://lingualibre.org/' -H 'TE: Trailers'
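For readers who find the raw cURL line hard to parse, here is a small Python sketch that rebuilds the same query URL from its parameters. The parameter names and values are copied from the command above; the function name <code>category_members_url</code> is hypothetical, and the sketch only builds the URL (it sends no request and does not reproduce the JS error).

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def category_members_url(category, host="fr.wiktionary.org"):
    """Build the MediaWiki API URL listing the pages of a category."""
    params = {
        "action": "query",
        "format": "json",
        "origin": "*",
        "formatversion": "2",
        "prop": "pageterms",
        "wbptterms": "label",
        "generator": "categorymembers",
        "gcmnamespace": "0",      # main namespace only
        "gcmtitle": category,     # category whose members are listed
        "gcmtype": "page",
        "gcmlimit": "max",
    }
    return f"https://{host}/w/api.php?" + urlencode(params)

url = category_members_url(":Catégorie:Locutions verbales en français")
decoded = parse_qs(urlsplit(url).query)
print(url)
```

<code>urlencode</code> writes spaces as <code>+</code> rather than <code>%20</code>, which the API decodes to the same category title.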
+
:Is it happening for all your recordings or only some of them? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 08:49, 28 January 2023 (UTC)
 +
:: It was all good until a month ago. These days I am on vacation in another city and trying to log in to my account and make some more records. I can log in and I can create records, but I cannot publish them. I am stuck at the publishing stage. Nothing gets published, none of my records. I even tried to record via my cell phone; even there nothing gets published. By the way, I just saw your previous message welcoming me. Thank you for your kind wishes. Best wishes... [[User:Key Mîrza|Key Mîrza]] ([[User talk:Key Mîrza|talk]]) 09:57, 28 January 2023 (UTC)
 +
:::Hmmm, I do not know what to say. Sometimes some recordings do not upload but the others do. When no recording uploads at all, I do not know what the cause could be. Could you try with another web browser (Firefox or Chrome)? To go further, I think we would need a JavaScript expert who could give some hints. {{ping|Poslovitch|Lepticed7}} maybe? Another question: how many words are you trying to record? If it is a lot, could you try with only a few (fewer than 10, for example)? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 15:42, 28 January 2023 (UTC)
 +
:::: I tried 11 words together, then even 1 word only for testing purposes. Nothing worked. You said Java. Do I need Java to be able to work with the application? If so, then I need to install Java, because I formatted my PC and it may not be installed. Thank you. [[User:Key Mîrza|Key Mîrza]] ([[User talk:Key Mîrza|talk]]) 17:06, 28 January 2023 (UTC)
 +
:::::Java is different from JavaScript. JavaScript is a language supported by the web browser, so you do not need to install anything other than a web browser to record pronunciations on Lingua Libre. Unfortunately, I cannot dig further in this direction because I know almost nothing about JavaScript. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:18, 28 January 2023 (UTC)
 +
:::::: Thank you, anyway. [[User:Key Mîrza|Key Mîrza]] ([[User talk:Key Mîrza|talk]]) 22:38, 28 January 2023 (UTC)
 +
:[[User:Key Mîrza|Key Mîrza]], thanks a lot for your voice, it lets us discover new languages. Please be aware that Lili works best on solid desktop computers. Also, you likely have a limit of 380 record uploads per 72 minutes, so you may need to leave your tab open and click "retry" after that. You can expand those rights by making a request on Commons. See [[LinguaLibre:User rights]]. Contact us if you think that may be the cause. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:07, 5 February 2023 (UTC)
 +
::It's [https://commons.m.wikimedia.org/w/index.php?title=Special%3AUserRights&user=Key+M%C3%AErza confirmed]: like all new contributors, you are limited to 380 uploads per 72 minutes. You can get more user rights by requesting them on Commons. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:15, 5 February 2023 (UTC)
  
Looks like there is a bug…
+
== Late 2022-2023 Winter report ==
 +
Hello all, allow me to share some general news from the various recent, ongoing, or near-future efforts.
 +
* 🤖 User:Pamputt has taken over Lingualibre Bot and added support for the Kurdish wiktionary. See github.
 +
* 🌏 Melody (WMFr intern) and myself made a mini-editathon on writing template emails for outreach. See Lingualibre:Events.
 +
* ⚡ User:Elfix and myself are collaborating on SPARQL requests (me) and their optimization (Elfix). We aim to create a languages gallery this spring.
 +
* 🔴 Wikimedia France's freelance on the record wizard is back on track, delivery of fixes should occur around May-June.
 +
* 🙋‍♀️ Adelaide (WMFr) mentioned the wish of a second intern on Lingualibre outreach this summer, to reuse Melody's assets, expand actions and geographic diversity.
 +
* 🫱🏼‍🫲🏽 Wikimedia France yearly strategic meetup is this week, and is expected to strengthen its (linguistic) diversity and metrics axes, for which Lingualibre is one of their champions.
 +
* 🧓 Eve and myself (likely) will be present at Toulouse's ''Forom des Langues'', in May, where ~60+ languages associations are present.
  
Regards.
+
For specific deadlines and events coming soon, please also check [[Lingualibre:Events/Program]]. We always welcome contributors. When necessary, WMFr may refund transportation costs. Worth a try ! [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:07, 5 February 2023 (UTC)
[[User:LoquaxFR|LoquaxFR]] ([[User talk:LoquaxFR|talk]]) 17:21, 24 February 2021 (UTC)
 
:Hi {{u|LoquaxFR}}, can you describe precisely what you do when you write "when I try to load a list of words to record from the FR wiktionary"? How do you load the word list: by using the "Wikimedia category" option on the right, or by creating the list yourself, word by word? If you use "Wikimedia category", can you tell us which category you want to use? Can you reproduce the problem whatever category you work with? Thanks in advance for this information, which I hope will help pin down the problem as precisely as possible. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 17:58, 24 February 2021 (UTC)
 
:: In French, it will indeed be simpler. The problem occurs systematically when I try to use a Wikimedia category (the French Wiktionary's, in this case); I only use this option to load words, and the problem appears for every category I try, whether I have already recorded almost all of its words or only a small part of its thousands of terms. The problem also occurs in private browsing, so it does not seem to be the cache or cookies. If you need more information, don't hesitate to ask. [[User:LoquaxFR|LoquaxFR]] ([[User talk:LoquaxFR|talk]]) 18:08, 24 February 2021 (UTC)
 
:::Thanks for the additional information. I have just tested with Firefox 78.7 and I do not encounter this problem. Can you try with another browser (Chromium or another) to see whether the problem is specific to your Firefox (including in private browsing)? It could, for example, come from a gadget you have installed. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:40, 24 February 2021 (UTC)
 
::::A Firefox add-on breaking the JS? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:57, 24 February 2021 (UTC)
 
::::: Chrome and Safari give me the same result; I also tried from another machine and another OS, with no improvement: the JS error still shows up and nothing happens when the modal is validated. Could it be that I have recorded too many words, making the JS fail when it tries to remove those already recorded? Since only a few of us have recorded that many, it could be. I had already noticed that loading lists from the Wiktionary was taking longer and longer for me (relatively speaking: a few seconds of waiting at most). Is this another problem linked to my account? [[User:LoquaxFR|LoquaxFR]] ([[User talk:LoquaxFR|talk]]) 06:30, 25 February 2021 (UTC)
 
::::::Thanks for the additional info. I have opened [[phab:T275734|T275734]]. We should check with {{u|Lepticed7}} and {{u|WikiLucas00}}, who have roughly the same number of recordings as you, to test whether they encounter the same problem. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:54, 25 February 2021 (UTC)
 
::::::: Hi, personally, I don't know if it's related, but there are some recordings that the Record Wizard does not remove when I ask to remove already recorded words. As evidenced by [https://commons.wikimedia.org/w/index.php?title=File%3ALL-Q143_(epo)-Lepticed7-aprilo.wav this file], which I recorded three times. [[User:Lepticed7|Lepticed7]] ([[User talk:Lepticed7|talk]]) 10:45, 28 February 2021 (UTC)
 
  
== 50,000 ==
+
== Edit your nickname ==
February 2021. This month, we have seen 50,000 pronunciations recorded (see [[LinguaLibre:Statistics]]). This is the first time we have seen 50,000 entries in a month. This is great. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 08:51, 28 February 2021 (UTC)
 
:That's really amazing. The same month we passed 400k recordings! AND the shortest month in the year! I'm going to prepare a small News to be published every month (inspired by what you did in September if I remember correctly), I think February is a very good month to start with! I'll publish it on your talk page if you'd like 🙂 All the best ! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 16:11, 28 February 2021 (UTC)
 
:* We can actually officially start a bi-monthly [[LinguaLibre:Newsletter]] to be published on 1 March, 1 May, 1 July and so on. What do you think? I am also requesting [[User:Pamputt]], [[User:Yug]], [[User:Lyokoï]], [[User:Lepticed7]] to comment. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 17:40, 28 February 2021 (UTC)
 
:::I would say, why not, but I cannot lead such a project, so if you are motivated to write and lead such a newsletter, go ahead. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:39, 28 February 2021 (UTC)
 
::::On the [[LinguaLibre:Technical board/intro]] Poslovitch has started a [[LinguaLibre:Technical board/News|/News]] section which keeps a log of important milestones. It's an interesting idea because it's minimalist and therefore low maintenance.
 
::::I am also interested in a Newsletter for both external and internal purposes. I would help out, yes. The editorial line would benefit from being clarified: who the expected readers are, writing style, overall length, major sections, section lengths, etc. But this can emerge with the first few issues :) Please keep a balance so the writing workload stays modest. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:57, 28 February 2021 (UTC)
 
:::::The /News of the technical board is mostly about technical news. '''I fully agree with the idea of a Newsletter, but quarterly'''. We could grab some ideas from the French Wiktionary's ''[https://fr.wiktionary.org/wiki/Wiktionnaire:Actualit%C3%A9s Actualités]''. --[[User:Poslovitch|Poslovitch]] ([[User talk:Poslovitch|talk]]) 20:33, 28 February 2021 (UTC)
 
:::::* Salut, let's start with the newsletter for March. I'll add the stories I know, such as 400,000 audios, 50,000 this month, the Wikimedia Wikimeet India, the upcoming France-India call, the French Wiktionary missed recording work, etc. I'll start the draft tomorrow and ping you here.<br>In the future we will need [[:mw:Extension:MassMessage]] to send the newsletter to subscribers' talk pages. This requires a system admin with access to the server and to files such as LocalSettings.php. I understand this will take time, so it can wait. Kind regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 21:24, 28 February 2021 (UTC)
 
:{{Ping|Titodutta}} hi, we are having another discussion on the mailing list about networking, cooperation and outward communication. I think the [[LinguaLibre:Newsletter]] page can be modeled on the Technical board and [[LinguaLibre:Bot]]: a kind of hub for a subgroup of active users dedicated to a common goal, in this case <u>Communication</u>. The bimonthly Newsletter could be a core, founding element, but other discussions about outreach could take place there too. We have so much to push in this direction: academic outreach, rare languages and under-represented countries, partner institutions, calling for new Wikimedians, reminding far-away Wikimedia chapters of Lingualibre, etc. Having a hub dedicated to writing elegant co-edited texts, defining targets and leading the call for communication campaigns would be a strong plus. I am still focused on code but I could help in a few weeks. You seem to love it as well. Do we have other users interested in joining such efforts? It would be good to have a few more folks. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:39, 2 March 2021 (UTC)
 
  
=== Newsletter : March 2021 review ? ===
+
Good evening, I would like to change my nickname because it did not update when I was renamed Manjiro91 then Manjiro5 instead of GamissimoYT on Wikimedia projects. Thanks in advance Regards '''[[User:GamissimoYT|<span style="color:#fc3">manȷıro</span>]]<sup><small>[[User talk:GamissimoYT|<span style="font-variant:small-caps; color:#000">💬</span>]]</small></sup>''' 22:53, 23 February 2023 (UTC)
:''You can co-edit this text. PS Titodutta: a rough summary of past months and emerging directions based on a message to an ex-contributor.''
 
In January and February, the « Lili » community took back control of the technical stack (access to servers, GitHub code, bots, etc.) and made a call for more diverse speakers. The Indian community started to show up, with the key Indic languages being Bengali (50,000) and Marathi (~10,000). Romanian, Polish and Ukrainian are also on the rise, at around 20,000 audios each. We continue to have a dozen or so smaller languages showing up, but no powerful push yet.
 
  
Right now, an external software company is upgrading our MediaWiki and its modules thanks to Wikimedia France's funding. The volunteer dev team is also strong and internal organization is increasing. We now have [[LinguaLibre:Technical board]] as a tech hub, [[LinguaLibre:Bot]] as a bot hub, [[LinguaLibre:Events]] as an IRL/Online event hub.
+
== Tool to prepare words for Lingua Libre ==
When the main software upgrade settles down in a month, we plan a [yet to be created] [[LinguaLibre:Newsletter|LinguaLibre:Newsletter/room]] as an inward and outward communication hub.
 
  
In that last dimension, we could reach out to « relay users » on other wikis, who can share our news about LinguaLibre with the communities of Wiktionaries, Wikisources, Wikipedias and Wikidata. We are also considering formally reaching out to non-Wikimedia groups such as Common Voice, Unicode, governmental and NGO agencies, and research centers, possibly in the form of group work and/or an online editathon where we gather to spread the news. This hub, summarizing the community's discussions, will therefore also clarify goals and strategies. We are looking for help with this matter.
+
Preparing words to be used in Lingua Libre has always been challenging, and I think this is a shared challenge. Crawling text from different sources and creating a clean list of words is very important. I've used [[User:Titodutta/Bengali_words_from_pages|Tito's]] instructions in the past, but using multiple tabs and multiple tools is not the best user experience. So, I thought I'd create something that is functional for me and simple enough to be tweaked. Introducing [[User:Psubhashish/tools/Prepare words for Lingua Libre|"Prepare words for Lingua Libre"]]. The tool is currently set up for Odia but can easily be tweaked for other languages using non-Latin scripts. I'd request that the Lingua Libre core team incorporate the tool into Lingua Libre so that users can use the platform to create a wordlist. Extracting words from any random text is always hard, especially for new contributors. --[[User:Psubhashish|Subhashish]] ([[User talk:Psubhashish|talk]]) 03:44, 14 March 2023 (UTC)
 +
:Hi [[User:Psubhashish|Psubhashish]]. This is really nice. Do you think it would be easy to adapt it to create a [[Help:Create_a_new_generator|new generator]]? Generators can be used by anyone after they import them in their common.js. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:44, 14 March 2023 (UTC)
 +
:: Thanks [[User:Pamputt]]. That would be fantastic, but I probably don't have the right know-how for doing that. I did take ChatGPT's help to create a [[User:Psubhashish/common.js|.js version]] from the [[User:Psubhashish/tools/Prepare words for Lingua Libre|HTML code]] I had shared earlier, but would appreciate any help. I think having a tool inside Lingua Libre would be great, so I really liked the idea of new generators. Common users would like things well packaged rather than jumping from one platform to another. --[[User:Psubhashish|Subhashish]] ([[User talk:Psubhashish|talk]]) 13:09, 14 March 2023 (UTC)
  
This current forward dynamic is thanks to the efforts of early autumn 2020. We weren't able to immediately convert those into actions, but they still injected energy and vision into LinguaLibre, which helped snowball the current dynamic. Also, many thanks to all those who got involved in this journey! [[User:Yug|Yug]] [[User talk:Yug|<small><font style="color:green;">(talk)</font></small>]] 07:20, 3 March 2021 (UTC)
+
== Problem publishing recordings  ==
:Also, I just found out Commons grows at a speed of [https://stats.wikimedia.org/#/commons.wikimedia.org/content/pages-to-date/normal|line|2-year|page_type~content*non-content|monthly about 1 million files per month]. So with 50,000 audios last month, Lili makes up about 5% of Commons' new files. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:57, 3 March 2021 (UTC)
 
* Made a minor change, I'll get back to this. Sorry for the delay, something kept me really busy for the last two days. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 20:20, 3 March 2021 (UTC)
 
  
== Marathi women speakers celebrate 'Women's Day' & 'Women History Month' on Lingua Libre ==
+
Hello, a few years ago I renamed my account GamissimoYT to Manjiro91. Later, I renamed it to Manjiro5. The problem is that the renaming of my global Wikimedia account was not carried over to Lingua Libre. As a result, I cannot publish the audio I record on LinguaLibre, and the recordings do not appear on Commons either. Could you help me? '''[[User:GamissimoYT|<span style="color:#fc3">manȷıro</span>]]<sup><small>[[User talk:GamissimoYT|<span style="font-variant:small-caps; color:#000">💬</span>]]</small></sup>''' 08:41, 26 April 2023 (UTC)
  
Greetings for the coming World Women's Day!<br>
+
== Rename a dialect to a language ==
Glad to share this news. The Marathi language community in Maharashtra State of India has taken the initiative to record their language over the last 2 months. Out of a total of 26 speakers, 24 are women from 4 different places in the state. The group has decided to reach the 10,000-recording mark to celebrate 'Women's Day' and the 15,000 mark in March. As of now, 8600+ recordings have been uploaded. A small group of women have also started working on lexicographical data, the recordings of which would be done simultaneously. The activity is being coordinated by institutional partner Jnana Prabodhini, Pune and facilitated by CIS-A2K, affiliate of WMF in India. The community needs support from all of you. Thanks, [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 06:28, 5 March 2021 (UTC)
 
  
:Greeting  [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]], nice to witness this enthusiasm.
+
Hello,
:I imported UNILEX lists for Marathi. In [[Special:RecordWizard|RecordWizard]]'s Step 3, when you pick a list, go for <code>Local list</code>, then <code>mar/M</code>, and you will see lists of the most used words. I propose a '''gentle ramp approach''': the first list has just 200 words, see [[List:Mar/Most_used_words,_UNILEX_1:_words_00001_to_00200]]. In my experience this allows for better on-the-ground sessions with new users. 200 is gently ambitious: it lets the speaker get past the ''uncanny valley of the first 20 words'' and move into the ''joyful Lingualibre flow of rapid recording''. Perfect for demos and on-boarding. :)
 
:The following lists are for motivated users who choose to return. To consolidate skills, list 2 has 800 words while list 3 has 1000. At this stage the speaker has recorded a nice 2,000 audio files, and these words likely make up 90% of daily conversations.
 
:It then moves on to committed users. List 4 has 3,000 words, the following ones 5,000 words each. These lists are not expected to be done in one go but over several sessions of one hour or less, during a dedicated day or over a week or so.
 
:I hope these may help your language community to better on-board interested contributors :)
 
:We also encourage development of women speakers networks, so thanks a lot for your lead. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 08:57, 5 March 2021 (UTC)
 
:Added Marathi lists :
 
:* [[List:Mar/Most_used_words,_UNILEX_1:_words_00001_to_00200]]
 
:* [[List:Mar/Most_used_words,_UNILEX_2:_words_00201_to_01000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_3:_words_01001_to_02000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_4:_words_02001_to_05000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_5:_words_05001_to_10000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_6:_words_10001_to_15000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_7:_words_15001_to_20000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_8:_words_20001_to_25000]]
 
:* [[List:Mar/Most_used_words,_UNILEX_9:_words_25001_to_30000]]
 
:[[User:Yug|Yug]] ([[User talk:Yug|talk]]) 09:01, 5 March 2021 (UTC)
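
The ramp used for the lists above (200 words, then 800, 1,000, 3,000 and 5,000 per list) can be sketched in shell. This is only a minimal illustration, assuming a plain-text file with one word per line, already sorted by decreasing frequency. The function name <code>split_ramp</code> and the output file names <code>unilex_N.txt</code> are made up for this example and are not how the lists were actually imported.

```shell
#!/bin/sh
# Hypothetical helper: cut a frequency-sorted word list into the ramp of
# ranges used for the Marathi UNILEX lists (1-200, 201-1000, 1001-2000, ...).
# $1 = input file (one word per line, most frequent first)
# $2 = output directory (created if missing)
split_ramp() {
  src="$1"
  outdir="$2"
  mkdir -p "$outdir"
  n=1
  for range in 1,200 201,1000 1001,2000 2001,5000 5001,10000 \
               10001,15000 15001,20000 20001,25000 25001,30000; do
    # sed -n 'A,Bp' prints only lines A through B of the input.
    sed -n "${range}p" "$src" > "$outdir/unilex_${n}.txt"
    n=$((n + 1))
  done
}
```

For instance, <code>split_ramp mr_by_frequency.txt lists</code> (file name assumed) would write nine files into <code>lists/</code>, the first holding the 200 most frequent words.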
 
  
::Many thanks [[User:Yug|Yug]] for the detailed explanation. These are useful to start with. Our group has now taken a lexicographical approach to developing lists, so we need alphabetical lists to get the forms of words. For example, we create a list like this: शरीर, शरीरभर, शरीराकडून, शरीराकडे, शरीराचं, शरीराचा, शरीराची, शरीराचे, शरीराच्या, शरीरात...etc. The members distribute the work according to letters. Therefore it would be good if we could get modified lists. - [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 11:22, 5 March 2021 (UTC)
+
I requested the addition of "Teochew dialect" a few years ago, during my first attempts. However, it seems more relevant to simply use "Teochew" on its own, without the word "dialect". Would it be possible to make this change?
:::I see. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]], you could use [https://github.com/hugolpz/unilex-extended/tree/main/frequency-sorted-count/mr.txt frequency-sorted-count/mr.txt], keep the 30,000 most frequent, then sort alphabetically and split by hand on each letter. See [[Help:How_to_create_a_frequency_list%3F#UNILEX.27s_lists]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 11:53, 5 March 2021 (UTC)
+
 
:::I tried to push it forward but it's a bit more complex than I anticipated. Ideally, you would 1) add a prefix so <code>औ.txt</code> becomes <code>/Marathi_words_starting_with_औ.txt</code>, 2) merge the rarest letters together. I must refocus on non-wiki projects; can you call for help from local wiki-developers?
+
[[User:Assassas77|Assassas77]] ([[User talk:Assassas77|talk]]) 19:41, 7 May 2023 (UTC)
<pre>
+
:{{Done}} Solved [https://lingualibre.org/index.php?title=Q4465&type=revision&diff=912499&oldid=477865 here] by [[User:Assassas77]] ! It's a wiki :) [[User:Yug|Yug]] ([[User talk:Yug|talk]])
# Define language
+
 
iso=mr
+
== MediaWiki:Lang/* ==
# get file, cut out meta, sort by 2nd column (frequency), keep 50000, keep only word, sort by 1st column, alphabetically, save to .txt file
+
 
curl https://raw.githubusercontent.com/unicode-org/unilex/master/data/frequency/${iso}.txt | tail -n +6 | sort -k 2,2 -n -r | head -n 50000 | cut -d$'\t' -f1 | sort -k 1,1 > ${iso}.txt
+
What are the MediaWiki:Lang/* messages for? For example, [[MediaWiki:Lang/awa]]? It looks like they mostly just repeat the language code in the content. --[[User:Amire80|Amir E. Aharoni]] ([[User talk:Amire80|talk]]) 07:21, 24 May 2023 (UTC)
# get mr.txt content, for all line starting with alpha-num, convert first letter to lowercase, then print in files depending on first symbol
+
 
awk '{file = (/^[[:alnum:]]/ ? tolower(substr($0,1,1)) : "symbol") ".txt"; print >> file; close(file)}' "${iso}.txt"
+
== Where are the Greek recordings? ==
# Remove a to z files
+
According to the statistics page there are 130 recordings of the Greek language (Q205, ISO: gre). However there is no category [[commons:category:Lingua Libre pronunciation-gre]] defined or any recordings added to this category. There is a category [[commons:category:Lingua Libre pronunciation-ell]], but it is empty. What happened to the 130 Greek recordings? [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 20:16, 9 June 2023 (UTC)
find . -regex './[a-z].txt' -delete
+
:Hi {{u|Olaf}}, for some unclear reason (probably historical), it seems that all Greek recordings are categorized in [[c:Category:Lingua Libre pronunciation-other (Q9129)|Category:Lingua Libre pronunciation-other]]. We have to move all these recordings to the [[c:category:Lingua Libre pronunciation-gre|correct category]] (I do not know whether Commons has an automatic tool for such a job), and also redirect [[commons:category:Lingua Libre pronunciation-ell]] to [[c:category:Lingua Libre pronunciation-gre]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:24, 10 June 2023 (UTC)
# Convert to wiki list format: `# {item}`
+
::Hi {{u|Pamputt}}. This happened because in [[wikidata:Q9129#P220]] both ISO 639-3 codes are deprecated, and [https://doc.wikimedia.org/Wikibase/master/php/docs_topics_lua.html#mw_wikibase_entity_getBestStatements entity:getBestStatements] function, used in [[commons:Module:Lingua Libre record#L-46]], doesn't accept deprecated entries, so the module can't get the language code and falls back to "other" category. We could change the Wikidata entry and the files would be moved automatically. However code "gre" must stay deprecated, because it is unclear if it refers to ancient or modern Greek. It would be better to promote "ell" to normal entry. Then changes in [[Q205]] would be also needed. It looks like bulk moving Lingua Libre recordings around doesn't require admin rights, so I can fix this issue if you agree to change the Greek language code to "ell" instead of "gre". [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 08:46, 10 June 2023 (UTC)
sed -i -E 's/^/# /g' `find . -type f -name "?.txt"`
+
:::Hi {{u|Olaf}} thank you for your investigation. So, I have modified {{Q|205}} to fix the issue on the Lingua Libre side. For Wikimedia Commons, you can go ahead. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 08:11, 18 June 2023 (UTC)
# See line counts, in descending numerical order
+
::::Thanks, {{u|Pamputt}}. It's not as easy as I thought. Changing the Greek ISO 639-3 code from obsolete to normal creates a constraint violation with Modern Greek, which has the same code. In fact, LinguaLibre shouldn't record Greek words as Greek ([[Wikidata:Q9129|Q9129]]) but rather as Modern Greek ([[Wikidata:Q36510|Q36510]]). Modern Greek is also already defined in LinguaLibre: [[Q279]]. [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 13:26, 18 June 2023 (UTC)
wc -l * | sort -n -r
+
:::::If I understand correctly, the easiest way to manage this case would be to delete {{Q|205}}, so that no one can record in "this language" and can only select {{Q|279}}. If so, it would require replacing {{Q|205}} with {{Q|279}} in all Lingua Libre statements that use it. There are currently [https://lingualibre.org/index.php?title=Special:WhatLinksHere/Q205&namespace=0&limit=500 137 items] that use {{Q|205}}, so I think it is manageable by hand. {{u|Olaf}}, what do you think about this "workaround"? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:48, 18 June 2023 (UTC)
# See lines count, if n<200 then print filename, add file to merged.txt
+
::::::This would be perfect, it also requires renaming the 137 recordings in Commons, but it can be done. What about the [https://lingualibre.org/datasets/ datasets] to be downloaded from LinguaLibre, will they change automatically? [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 21:08, 18 June 2023 (UTC)
wc -l * | awk '$1 < 200 {print $2}' | xargs cat >> merged.txt
+
:[[User:Olaf|Olaf]], [[User:Pamputt|Pamputt]], I had a nearly identical case with the Chinese ISO codes zho vs cmn. I have about 186 zho items (see [[Help:SPARQL_for_maintenance#.E2.9C.85_Recordings_.E2.86.92_With_ISO-639-3_.60zho.60_to_change_to_.60cmn.60|Help:SPARQL for maintenance]]) which have the wrong ISO code. My plan is:
</pre>
+
:* to delete those audios, very simply, on both Lingualibre and Commons. The alternative would be to edit them all on both sites.
:::This already provides the lists by letters. It should put you solidly on the way. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 12:52, 5 March 2021 (UTC)
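
Step 1) mentioned above (adding a descriptive prefix to each per-letter file) could look roughly like the following. This is a sketch under assumptions, not part of the script actually run: the function name <code>add_prefix</code> and its three arguments are hypothetical, and it simply skips the full list (e.g. <code>mr.txt</code>) and <code>merged.txt</code>.

```shell
#!/bin/sh
# Hypothetical helper: give each per-letter file a descriptive prefix,
# e.g. X.txt -> Marathi_words_starting_with_X.txt.
# $1 = directory holding the per-letter files
# $2 = prefix to add
# $3 = name of the full word list to leave untouched (e.g. mr.txt)
add_prefix() {
  dir="$1"
  prefix="$2"
  keep="$3"
  for f in "$dir"/*.txt; do
    [ -e "$f" ] || continue            # directory contains no .txt file
    base=$(basename "$f")
    case "$base" in
      "$keep"|merged.txt) continue ;;  # skip the full list and the merged file
      "$prefix"*) continue ;;          # already renamed on a previous run
    esac
    mv -- "$f" "$dir/$prefix$base"
  done
}
```

Calling <code>add_prefix . Marathi_words_starting_with_ mr.txt</code> in the working directory would rename the letter files in place; running it a second time changes nothing, since already-prefixed files are skipped.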
+
:* to [https://lingualibre.org/index.php?title=Q130&type=revision&diff=691521&oldid=444378 discourage recording] or delete that Lili Qid.
{| class="wikitable"
+
:so I may work on those audio, some day... [[User:Hugo en résidence|Hugo en résidence]] ([[User talk:Hugo en résidence|talk]]) 17:36, 18 June 2023 (UTC)
! Without merge (50 files) || With merging (32 files)
+
::I don't like deleting good recordings as a way of dealing with wrong categorization. Moreover some of them are probably in use, because Olafbot might have added them to Polish Wiktionary. If there is no other option, just leave them where they are in Commons, and remove Greek from Lingua Libre alone in favor of Modern Greek. But I think Pamputt's solution is better. [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 21:08, 18 June 2023 (UTC)
|- valign="top"
+
:::[[USer:Olaf]], I don't like it either. But 186 recordings is about 8 minutes of work, and it has been confusing us for 3 years. Do point to that. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:35, 20 June 2023 (UTC)
|
+
::::Deleting 186 recordings is about the same amount of time as modifying the language statement. This is manageable by hand and I would prefer not to delete them. I do not have time for now but I will try to do it before the end of the month. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 11:47, 21 June 2023 (UTC)
<pre>
+
 
  99860 total
+
== Any Recording limitation in Lingua Libre ==
  50000 mr.txt
+
 
  4976 स.txt
+
Hello, I would like to know whether there are any recording limitations in Lingua Libre, because I am planning a screencast in the Tamil language. If anyone knows, please reply. Thank you [[User:Sriveenkat|Sriveenkat (🎤) ]] ([[User talk:Sriveenkat|talk]]) 11:11, 1 August 2023 (UTC)
  4462 प.txt
+
:If you are not an [[c:Commons:Patrol#Autopatrol|autopatrolled user]] on Wikimedia Commons, then you cannot upload more than 380 audios per 72 minutes. If you want to record more words within this timeslot, then you should request [[c:Commons:Requests_for_rights#Autopatrol|this right]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 14:15, 1 August 2023 (UTC)
  3745 म.txt
+
::Hi, {{ping|Pamputt}}, I won't be recording 380 audios within 72 minutes. I am planning to create a screencast tutorial video in the Tamil language, which is why I asked this question. Thank you for your reply [[User:Sriveenkat|Sriveenkat (🎤) ]] ([[User talk:Sriveenkat|talk]]) 14:35, 1 August 2023 (UTC)
  3545 क.txt
+
 
  3195 व.txt
+
== Exclusion list for generators? ==
  2201 न.txt
+
 
  2183 ब.txt
+
Hello, if there isn't a feature like this somewhere already, I propose a per-user blacklist of sorts, which would allow users to select words to be excluded when they choose one of the generator options to generate words. I'm currently going through a list of words in a Wiktionary category, and I'm confronted with a growing list of words that I can't deal with because they aren't suitable for pronunciation (e.g. particles that surround other arbitrary words), or they're just homophones of something I've already recorded, etc. What would be necessary, technically, in order to make this happen? [[User:Kiril kovachev|Kiril kovachev]] ([[User talk:Kiril kovachev|talk]]) 12:39, 10 August 2023 (UTC)
  2134 अ.txt
+
:Hi {{u|Kiril kovachev}}, I have opened a [[phab:T344221|Phabricator ticket]] for this request. If you know JavaScript, you may have a look at the [https://github.com/lingua-libre/RecordWizard code] to propose a patch. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 05:52, 15 August 2023 (UTC)
  1789 र.txt
 
  1666 द.txt
 
  1623 आ.txt
 
  1568 ग.txt
 
  1524 ज.txt
 
  1507 त.txt
 
  1376 श.txt
 
  1132 ल.txt
 
  1102 ह.txt
 
  1089 च.txt
 
  1076 उ.txt
 
  1025 भ.txt
 
    809 य.txt
 
    791 फ.txt
 
    766 ख.txt
 
    652 ट.txt
 
    645 घ.txt
 
    480 ए.txt
 
    456 इ.txt
 
    446 ध.txt
 
    420 ड.txt
 
    318 ठ.txt
 
    273 झ.txt
 
    182 थ.txt
 
    163 ओ.txt
 
    118 छ.txt
 
    115 ऑ.txt
 
    64 ऐ.txt
 
    55 ढ.txt
 
    44 औ.txt
 
    29 २.txt
 
    26 ई.txt
 
    20 ष.txt
 
    20 ऊ.txt
 
    20 १.txt
 
    14 ऋ.txt
 
      6 ऱ.txt
 
      4 ३.txt
 
      2 ९.txt
 
      2 ८.txt
 
      1 ॐ.txt
 
      1 ४.txt
 
</pre>
 
|
 
<pre>
 
  4976 स.txt
 
  4462 प.txt
 
  3745 म.txt
 
  3545 क.txt
 
  3195 व.txt
 
  2201 न.txt
 
  2183 ब.txt
 
  2134 अ.txt
 
  1789 र.txt
 
  1666 द.txt
 
  1623 आ.txt
 
  1568 ग.txt
 
  1524 ज.txt
 
  1507 त.txt
 
  1376 श.txt
 
  1132 ल.txt
 
  1102 ह.txt
 
  1089 च.txt
 
  1076 उ.txt
 
  1025 भ.txt
 
    886 merged.txt
 
    809 य.txt
 
    791 फ.txt
 
    766 ख.txt
 
    652 ट.txt
 
    645 घ.txt
 
    480 ए.txt
 
    456 इ.txt
 
    446 ध.txt
 
    420 ड.txt
 
    318 ठ.txt
 
    273 झ.txt
 
</pre>
 
|}
 
: There is also a list [[List:Mar/Lemmas-without-audio-sorted-by-number-of-wiktionaries]] which is updated every day by a bot, so it should be always fresh. The list consists of words that are present in one or more Wiktionaries, but have no recording in Commons. At the top of the list, there are words with the largest number of Wiktionaries. You could probably give it a try too, [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]]. [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 16:34, 5 March 2021 (UTC)
 
  
== Automatically updated lists of unrecorded audio ==
+
== Barnstar Award Template ==
  
Not everybody here is probably aware that there are lists of unrecorded words available for 72 languages. The lists are sorted by the number of the language versions of Wiktionary where a corresponding word is described, with the most popular words at the top, so the lists should maximize in a way the usefulness of the recording. Words with audio recordings present in Commons are removed automatically from the lists every night. In this way, the lists should be always fresh. The lists have always a title in the form of <code><language code>/Lemmas-without-audio-sorted-by-number-of-wiktionaries</code>: {{olafbot-wikt}}. [[User:Olaf|Olaf]] ([[User talk:Olaf|talk]]) 16:51, 5 March 2021 (UTC)
+
Is there any Barnstar Award template for Lingua Libre? [[User:Sriveenkat|Sriveenkat (🎤) ]] ([[User talk:Sriveenkat|talk]]) 07:06, 13 September 2023 (UTC)
:This is a game changer. Welcoming new contributors in 72 languages will no longer be a tricky question of providing relevant lists. More lists are coming. We can refocus on outreach and on calling for new contributors to audio-document their voices, their languages, their cultures. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:15, 5 March 2021 (UTC)
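
As a small illustration of the title pattern described in this thread, the page title for a given language code can be built as follows. This is a sketch only: the function name <code>list_title</code> is made up, and it assumes the pages live in the <code>List:</code> namespace with the first letter of the code capitalised, as in the <code>List:Mar/...</code> page linked earlier on this page.

```shell
#!/bin/sh
# Hypothetical helper: build the list page title for a language code,
# following the pattern
#   <language code>/Lemmas-without-audio-sorted-by-number-of-wiktionaries
# Assumption: the page sits in the List: namespace and the first letter of
# the code is capitalised (as in List:Mar/...).
list_title() {
  code="$1"
  first=$(printf '%s' "$code" | cut -c1 | tr '[:lower:]' '[:upper:]')
  rest=$(printf '%s' "$code" | cut -c2-)
  printf 'List:%s%s/Lemmas-without-audio-sorted-by-number-of-wiktionaries\n' "$first" "$rest"
}
```

For example, <code>list_title mar</code> prints <code>List:Mar/Lemmas-without-audio-sorted-by-number-of-wiktionaries</code>.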
+
:There are [[Template:50k barnstar]] and [[Template:Speaker of the month]], and maybe others. [[User:WikiLucas00|WikiLucas00]] may know other barnstars. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:11, 13 September 2023 (UTC)
 +
::{{ping|Pamputt|WikiLucas00}} OK Pamputt, I want to give a barnstar award to some beginner speakers. It will be motivating for them. Am I right? [[User:Sriveenkat|Sriveenkat (🎤) ]] ([[User talk:Sriveenkat|talk]]) 11:46, 14 September 2023 (UTC)
 +
:::Hello {{ping|Pamputt|Sriveenkat}}! Indeed, it would be a nice idea to offer awards for beginners, such as a barnstar for passing 1000 recordings for example. All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 16:08, 16 September 2023 (UTC)
  
== Outreach ==
+
==1,000,000th ==
[[File:Catalan_dialects-en.png|thumb|Dialects of Catalan.]]
+
* N  ! 08:38 కంటగిల్లు (Q1094614)‎ diffhist +3,648‎ V Bhavya talk contribs block ‎Created a new Item
I used the opportunity of bumping into a currently inactive user to go to his Wikipedia (Catalan), ask him where I could announce that we now have a {{wikt-list|cat}} list, and make [https://ca.wikipedia.org/wiki/Tema:W4oh9xw4ndkefh91 a gentle announcement]. I don't expect it to pay off soon, but after several pings, we should have some folks landing back here on Lingualibre. I didn't contact the ca:wikt community, but you see the idea: leaving many small announcements here and there so people know our name. Smaller pings are OK. ''"Sorry all, I've been busy on the LinguaLibre project these days"'' would be helpful too. I tried to emphasize what service Lili provides to them (not sure I was good at that, but it's just a ping :) ). Please, when you have the opportunity, reach out to local communities, especially those not currently active. We have nice lists in 72+ languages now. Let the wiki folks know and record more. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 08:24, 7 March 2021 (UTC)
+
* N  ! 08:38 కంటగించు (Q1094613)‎ diffhist +3,636‎ V Bhavya talk contribs block ‎Created a new Item
:{{Ping|Pamputt|}} hi, they started a light conversation-description of Catalan about cat valencia, cat central, cat balearic and cat Western (? not sure it was 3 or 4 different) pronunciations. Do you have any understanding on this Catalan issue ? Is this like Marseille French VS Paris French accents or something else ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:25, 7 March 2021 (UTC)
+
* '''N  ! 08:38 కంటకితము (Q1094612)‎ diffhist +3,636‎ V Bhavya talk contribs block ‎Created a new Item'''
::I do not know precisely how different these Catalan varieties are, but they differ more than French from Paris and French from Marseille do, because these varieties are considered different dialects. So it is something like {{Q|930}} and {{Q|1186}} for the Occitan language. We could therefore start importing these dialects into Lingua Libre to be able to record in them. At the least, we should import the main dialects here, namely Northwestern Catalan, Valencian, Central Catalan, Balearic, Rossellonese and Alguerese. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:58, 7 March 2021 (UTC)
+
* N  ! 08:38 కంటకుడు (Q1094611)‎ diffhist +3,624‎ V Bhavya talk contribs block ‎Created a new Item
:::It seems to be the wish expressed by [[:ca:User:Vriullop|User:Vriullop]] too, and in another discussion I had. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:22, 7 March 2021 (UTC)
+
* N  ! 08:38 కంటక (Q1094610)‎ diffhist +3,588‎ V Bhavya talk contribs block ‎Created a new Item
::::{{Q|518078}}, {{Q|518079}}, {{Q|518087}}, {{Q|518106}}, {{Q|518118}}, {{Q|518128}} are now available, so we can record right now words in these dialects. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:09, 7 March 2021 (UTC)
+
* N  ! 08:38 కంటబడు (Q1094609)‎ diffhist +3,612‎ V Bhavya talk contribs block ‎Created a new Item
 +
[[User:Yug|Yug]] ([[User talk:Yug|talk]])
  
== License ? ==
+
== Why Lingua Libre Bot isn't running Wikidata? ==
{{done}}<br/>
 
I bumped again into [[User:Evist0/RecordWizard.json|cc-by-sa]] license for contributions. Aren't we supposed to contribute it all under CC-0 so it's Wikidata compatible ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:39, 8 March 2021 (UTC)
 
:The licence is up to the user's choice. --[[User:Poslovitch|Poslovitch]] ([[User talk:Poslovitch|talk]]) 21:54, 8 March 2021 (UTC)
 
::Then what do we do on Wikidata? Ooohhh... It's just a link to Commons, not a copy of the audio file.... [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 22:53, 8 March 2021 (UTC)
 
  
== Metrics > [https://lingualibre.org/wiki/Special:ListUsers?username=&group=&creationSort=1&desc=1&wpsubmit=&wpFormIdentifier=mw-listusers-form&limit=500 Accounts creations] ==
+
{{ping|Poslovitch|Pamputt|WikiLucas00}} Why isn't Lingua Libre Bot running on Wikidata? {{u|Darafsh}} asked about it in the Wikidata Lexicographical data Telegram group. What is the problem? Please kindly explain the issue. Thanks-[[User:Sriveenkat|Sriveenkat]] () ([[User talk:Sriveenkat|talk]])  16:12, 6 October 2023 (UTC)
Hi everyone !<br>
+
:{{ping|Sriveenkat}} could you point to a Lingua Libre item and a Wikidata item or lexeme that has not received the pronunciation? This will help to test and find out what is wrong. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 19:22, 6 October 2023 (UTC)
We got about 5 times more account creations this January 2021 (~60) compared to January 2020 (~12).<br>
+
::Hi {{ping|Pamputt}}, recorded audio files are not being added to Wikidata items or Wikidata lexemes. The user {{u|Darafsh}} has recorded many words for the Wikidata Lexeme Project, but the audio files were never added to the Wikidata lexemes. You can see at [[wikidata:Special:Contributions/Lingua Libre Bot]] that the last contribution was made at 23:49, 9 September 2023. So I am just asking for Lingua Libre Bot to be run on Wikidata. I also recorded some words for the Wikidata Lexeme Project and waited for some days, but my audio files were never added to the Wikidata lexemes, so I ran QuickStatements to add them. User Darafsh is now also running QuickStatements to add his audio files. I think many users rely on Lingua Libre to automatically add audio to Wikidata and to some Wiktionaries. I hope you understand. Thank you. Regards [[User:Sriveenkat|Sriveenkat]] () ([[User talk:Sriveenkat|talk]])  05:38, 7 October 2023 (UTC)
Welcoming is largely done by hand these days. Having a bot for that may help.<br>
+
:Thanks to {{ping|Sriveenkat}} for starting the discussion. If you need some examples, you may see Mazanin's contributions on [https://commons.wikimedia.org/wiki/Special:Contributions/Mazanin Commons]. This is the recorded audio: [https://commons.wikimedia.org/wiki/File:LL-Q9168_(fas)-Mazanin_(%D9%85%D8%A7%D8%B2%D9%86%DB%8C%D9%86)-%D9%87%D9%85%D8%A8%D8%A7%D8%B4%DB%8C.wav] and this is the lexeme entry on Wikidata: [https://www.wikidata.org/wiki/Lexeme:L1010467], but they are not connected yet. [[User:Darafsh|Darafsh]] ([[User talk:Darafsh|talk]]) 12:07, 7 October 2023 (UTC)
And, given that we are all overloaded, maybe would be wise to outreach for help. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 23:19, 8 March 2021 (UTC)
 
  
== Help - to delete word  ==
+
== SiteNotice ==
 +
Hi,<br />Translations are not working for Sitenotice. Install CentralNotice? ―[[User:Eihel|Eihel]] ([[User talk:Eihel|talk]]) 14:31, 7 October 2023 (UTC)
  
Hi, please guide me on how I can delete a recorded word from Lili. It was already uploaded to Wikimedia Commons by mistake. The recorded Marathi word that I want to delete is 'कालका'. Thanks in advance.
+
== Global bot status ==
 +
Lingualibre Bot has been [https://meta.wikimedia.org/w/index.php?title=Steward_requests/Bot_status&diff=prev&oldid=25702991 approved]. cc {{ping|Pamputt|Poslovitch|WikiLucas00}}. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 12:31, 10 October 2023 (UTC)
 +
:Thank you for the request and congrats on the approval! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 12:40, 16 October 2023 (UTC)
  
:Hi {{u|Aparna Gondhalekar}}, there are two options depending on whether "कालका" exists. If "कालका" exists but you recorded it badly, then you just need to record it again and the new recording will replace the previous recording. If "कालका" does not exist, we need to delete the file directly on [[c:File:LL-Q1571 (mar)-Aparna Gondhalekar-कालका.wav|Wikimedia Commons]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:18, 9 March 2021 (UTC)
+
== ExternalTools - Wikidata Query Service - Recording Indian Actor and Actress Names in Tamil ==
  
== Wikimania 2021 ==
+
{{ping|Yug|Pamputt|WikiLucas00}} I am now interested in recording Indian actor and actress names in Tamil. I wrote a [https://w.wiki/8G6T query] and entered its URL in ExternalTools, but I get the error "Result must contain both "id" and "label" field." I think something needs to be modified in this query. Could anyone help with this? Thanks [[User:Sriveenkat|Sriveenkat]] ([[User talk:Sriveenkat|talk]])  19:58, 24 November 2023 (UTC)
It's not a big surprise, but it has been confirmed: [[:meta:Wikimania 2021|Wikimania_2021]] will be online only. It will limit our outreach. We used to go there and record 10~20 languages, give 5-minute demos to 30 people, and run workshops for 40+ others. We also had plenty of small chats (100+) raising awareness about Lili and connecting with devs for fast discussions. We will need to find another way this year too. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:34, 9 March 2021 (UTC)
+
:{{ping|Sriveenkat}}, [https://w.wiki/8Gev this] works. Please note there are 6982 items if we remove the LIMIT, and I don't know how the system copes with such a large list. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 23:13, 25 November 2023 (UTC)
 +
::{{ping|Yug}} Thanks for your reply. The query doesn't work for me :( The error in ExternalTools is "undefine". [[User:Sriveenkat|Sriveenkat]] ([[User talk:Sriveenkat|talk]])  06:03, 26 November 2023 (UTC)
 +
:::{{ping|Sriveenkat}}, in Wikidata QS you have to run the query to check that it works and returns data. If so, go to the URL bar and copy that long URL. Come back to Lingualibre Step 3, external tool, and paste that long URL. It worked for me. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 06:00, 27 November 2023 (UTC)
 +
::::{{ping|Sriveenkat}} Sorry, I missed something. On the Query Service bottom right, click "Link" > then on "SPARQL endpoint" : copy this url. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 08:25, 27 November 2023 (UTC)
 +
:::::{{ping|Yug}} Works with copying SPARQL endpoint link. Thank you much. I'm planning to record more proverbs, usage examples, places, persons, Lingualibre is really more comfortable to record it. Thanks Again [[User:Sriveenkat|Sriveenkat]] ([[User talk:Sriveenkat|talk]])  22:54, 27 November 2023 (UTC)
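:A minimal Python sketch of what the thread above settled on: ExternalTools wants a SPARQL endpoint URL (not the short link), and the result must expose an <code>id</code> and a <code>label</code> column. The query text below is a hypothetical stand-in, not the query behind the short links above, and the Wikidata identifiers used in it (P106 occupation, Q33999 actor, P27 country of citizenship, Q668 India) are assumptions chosen for illustration. The code only builds the URL; it does not contact the server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical query: the variables are named ?id and ?label because the
# ExternalTools error message asks for both an "id" and a "label" field.
QUERY = """SELECT ?id ?label WHERE {
  ?id wdt:P106 wd:Q33999 ;
      wdt:P27 wd:Q668 ;
      rdfs:label ?label .
  FILTER(LANG(?label) = "ta")
}
LIMIT 50"""


def sparql_endpoint_url(query: str) -> str:
    """Return the endpoint URL with the query percent-encoded in ?query=."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': query})}"


url = sparql_endpoint_url(QUERY)
# Decode the URL again to check that the query survives the round trip.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

:The URL printed here has the same shape as the one the Query Service gives under "Link" > "SPARQL endpoint". Keeping a LIMIT in the query is a cautious choice, since it is not known how the Record Wizard copes with several thousand items.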
  
== Return with Return ==
+
== Logo redesign propositions ==
  
So, we are back. Almost after 50 days, we are back to work. Thanks to [[User:VIGNERON]], [[User:Yug]], [[User:Pamputt]] etc who were around. Let's make some noise.  
+
I had a bit of fun yesterday contributing to one of my favourite projects in a slightly different way. I've kept the ideas (microphone, wings) and colours of the current logo but made it a bit more polished. I've already taken a few opinions on Discord but I wanted to get a more general opinion. What do you think?
  
'''Idea:''' I have an idea, can you record the word "Return" or "Come back" (or something similar) in your language and put it in the gallery below? Please mention the language name, and meaning in the caption. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 02:09, 23 April 2021 (UTC)
+
Just so you know, I won't be at all offended if the community prefers to keep the current logo, because there are some very good reasons for keeping it (I'm thinking in particular of all the printed materials, the fact that it's simple (easy to draw by hand if we don't have a printer and maybe more "readable" if very small), its declination for sign languages, etc.).
:"Return/Come back" as in "LinguaLibre is back", [https://en.m.wikipedia.org/wiki/The_Lord_of_the_Rings:_The_Return_of_the_King#/languages en:The Lord of the Rings: The Return of the King] (70 languages) or [https://en.m.wikipedia.org/wiki/Return_of_the_Jedi#/languages en:Return of the Jedi] (63), right? [[User:Titodutta|Titodutta]], please provide some examples / context. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 04:58, 23 April 2021 (UTC)
 
* Yes you are right. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 19:30, 23 April 2021 (UTC)
 
  
=== Return Gallery ===
+
<gallery style="text-align:center;"  heights="200px"  widths="200px">
<gallery>
+
File:Proposition refonte logo Lingua Libre (1).svg|Proposition 1
File:LL-Q9610 (ben)-Titodutta-প্রত্যাবর্তন.wav|প্রত্যাবর্তন (''Protyaborton'' in Bangla, means "Return")
+
File:Proposition refonte logo Lingua Libre (2).svg|Proposition 2
File:LL-Q150_(fra)-Yug-retour.wav|Retour (French)
 
 
</gallery>
 
</gallery>
 +
[[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 08:59, 3 December 2023 (UTC)
 +
:{{Ping|DSwissK}} hello,
 +
:We can add your proposition to the set of logo ideas in the Wikimedia Commons [[:commons:Category:Proposed Lingua Libre logo|Category:Proposed Lingua Libre logo]], for later reference. But to be honest, good logo design requires design experience, artistic intuition, and brand and public awareness, which are harder to gather than it seems. It also must fit a project's phase and branding strategy: the project has to need a new logo, and project members have to be willing to shift from the current high-visibility logo to a new one. All together, changing a logo is not something easy to push for. I gave a similar answer [https://github.com/lingua-libre/SignIt/pull/41 here] a few months ago about Lingua Libre SignIt. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 12:23, 4 December 2023 (UTC)
 +
:: {{Ping|Yug}} hi,
 +
:: Thank you for your input. I appreciate you explaining the complexities - you raise great context I had not fully considered. [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 09:05, 6 December 2023 (UTC)
  
== Translate doesn't seem to work ==
+
== Hebrew diacritics (Niqqud) ==
  
I can't seem to be able to translate pages. Is this an error on my part, or is there something wrong with the servers? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 17:01, 23 April 2021 (UTC)
+
In Hebrew we use diacritics (Niqqud) to determine how to pronounce the words.
:Indeed, something is broken. There is a [[phab:T280972|Phabricator ticket]] to track this issue. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:30, 23 April 2021 (UTC)
 
::Okay, thank you. --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 22:01, 23 April 2021 (UTC)
 
:::Hello {{u|Pamputt}}, I tried to translate several pages from the Wiki directly, to test, taking inspiration from the T:xx translation markers (example: https://lingualibre.org/wiki/Translations:Help:Main/14/fr). An error occurs, always the same. I added a line in your [[phab:T280972|task]], notifying Tgr who may be interested. He may add the tag of the "OAuthAuthentication" project. Cordially. —[[User:Eihel|Eihel]] ([[User talk:Eihel|talk]]) 14:31, 25 April 2021 (UTC)
 
[[File:Translation error in Lingua Libre.png|600px|translation error]]
 
:Translations are back. Thanks. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:54, 27 April 2021 (UTC)
 
::I still can't seem to be able to translate :( {{ping|Pamputt|Eihel}}
 
  
== HIGH PRIORITY: Audio recordings have dust and clicks ==
+
Niqqud is usually common in the following cases:
:''Under investigation: Some users experience parasitic saturation (“Pock!”) or dust while others don't. This irregular occurrence is reminiscent of the earlier, unsolved “speed up bug”.''
+
# Young kids or people learning the language.
=== Discussion ===
+
# Formal use.
I've had friends record German and Romanian lists. They're using separate hardware, and have recorded thousands of words before, so I know their hardware is fine. The recordings they've done today suffer from loud clicks on half the recordings, so there seems to be a problem with the recording studio. I clearly have no idea what the problem is or how to fix it, but I hope someone else will!
+
# To distinguish between meanings when the base form is ambiguous.
  
Here are examples:
+
This is a short example:
* [[File:LL-Q188_(deu)-Natschoba-der_Wunsch.wav]] — LL-Q188_(deu)-Natschoba-der_Wunsch.wav
+
* Base form: גזר (GZR)
* [[File:LL-Q7913_(ron)-Andreea_Teodoraa-muscă.wav]] — LL-Q7913_(ron)-Andreea_Teodoraa-muscă.wav
+
* Carrot: גֶּזֶר (Gezer)
* [[File:LL-Q150 (fra)-Hélène (Hsarrazin)-corné.wav]] — LL-Q150 (fra)-Hélène (Hsarrazin)-corné.wav
+
* Masculine cut: גָּזַר (Gazar)
[[User:Julien Baley|Julien Baley]] ([[User talk:Julien Baley|talk]]) 16:24, 24 April 2021 (UTC)
+
* Piece: גֶּזֶר (Gezer)
 +
This is the corresponding Wiktionary article: https://he.wiktionary.org/wiki/גזר
  
: I have the same problem. [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 17:49, 24 April 2021 (UTC)
+
When fetching words from Wiktionary, it is better to use the first headers instead of the item names, because in many cases the term is ambiguous and the item name is the base form without any pronunciation guidance.
::Hmm, very annoying. I've opened a [[phab:T281041|Phabricator ticket]]. I hope the issue will be fixed soon. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:38, 24 April 2021 (UTC)
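To make the ambiguity concrete, here is a small Python sketch that strips the Niqqud from pointed forms and groups them by their bare base form. It assumes that removing every Unicode nonspacing mark (category <code>Mn</code>) is an acceptable way to drop the points, which also removes cantillation marks. The function names are made up for this illustration and are not part of any Lingua Libre tool.

```python
import unicodedata
from collections import defaultdict


def strip_niqqud(word: str) -> str:
    """Return the unpointed base form of a Hebrew word.

    Assumption: every nonspacing mark (Unicode category Mn) is a diacritic
    that can be dropped. NFD normalisation first splits precomposed
    presentation forms into a letter plus marks.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def group_by_base_form(pointed_words):
    """Map each bare base form to the list of pointed forms sharing it."""
    groups = defaultdict(list)
    for word in pointed_words:
        groups[strip_niqqud(word)].append(word)
    return dict(groups)


# The two pointed forms from the example above, written as escapes so the
# code points are unambiguous: "Gezer" (carrot) and "Gazar" (cut).
carrot = "\u05d2\u05b6\u05bc\u05d6\u05b6\u05e8"
cut = "\u05d2\u05b8\u05bc\u05d6\u05b7\u05e8"
base = "\u05d2\u05d6\u05e8"  # GZR without any points

groups = group_by_base_form([carrot, cut])
print(groups)
```

Both pointed forms end up under the same unpointed key, which is why a word list built from page titles alone cannot tell the speaker which pronunciation is wanted.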
 
::HIGH priority. No idea who can fix it. Can someone refine the diagnosis ? Can more people test with their configuration and report here ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:33, 25 April 2021 (UTC)
 
:::I notified Mr. Vion, the original coder of the JS recorder. He may have some insights. I suspect it's a bug with either :
 
:::* [https://github.com/lingua-libre/RecordWizard RecordWizard (studio)], the mw extension interfacing the user speaking and the audio processing layers. It got recent changes due to migration to mw 1.35.
 
:::* [https://github.com/lingua-libre/LinguaRecorder LinguaRecorder JS], the core JS library processing audio signal. No changes in past week.
 
:::Recent changes may have affected how the audio cuts are done. Either mw extension or the JS could need a fix.
 
:::This is a core bug preventing LinguaLibre core mission. Any insight is welcome. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:43, 25 April 2021 (UTC)
 
::::So {{Q|522922}} (deu:der_Wunsch), {{Q|522753}} (ron:muscă) and {{Q|523386}} (fra:corné). —[[User:Eihel|Eihel]] ([[User talk:Eihel|talk]]) 17:26, 25 April 2021 (UTC)
 
:::::{{ping|Eihel}} the 1st and 3rd ones sounds good to me. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:38, 25 April 2021 (UTC)
 
::::::{{ping|Yug}} the 1st and 3rd ones do not sound good to me, there's a clear click on the "der" and "cor". If you have populated the table below, perhaps your numbers are too optimistic (if we have a different judgement on these three). [[User:Julien Baley|Julien Baley]] ([[User talk:Julien Baley|talk]]) 12:56, 26 April 2021 (UTC)
 
:::::{{ping|Julien Baley|DSwissK|Eihel}}
 
:::::I reviewed recent recordings of 4 users.
 
:::::* Two contributors have perfect audios (100% good on 8 audios checked for each user).
 
:::::* Two new users have the bug (30% of audios with saturation).
 
:::::I first thought it could be new users not using their hardware properly: the microphone must not be overly sensitive, we should not let it vibrate, etc. It's know-how we pass on when doing IRL workshops and that tech-friendly people pick up quickly. Self-taught users have not been warned of this.
 
:::::But it does not explain why experienced users such as DSwissK and Julien's friend have such noise. So I am confused.
 
:::::DSwissK, did you try alternative microphone settings, with a lower volume? Are you sure that you have not recently been speaking louder, or that there was not a change you did not notice? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 22:02, 25 April 2021 (UTC)
 
:Hello {{u|Yug}}, I concede that the difference may be minimal on some records. You have to listen carefully, it's like "a diamond on a vinyl which jumps on a dust". Some files are more affected than others (depending on the vocal intonation), but all of the ones I have cited are problematic. To fully understand, you can try recording with Schtooka (former LiLi), then immediately redo the same recording on LiLi. As I said to Hélène, you can also compare with an existing recording {{Q|499309}}. Cordially. —[[User:Eihel|Eihel]] ([[User talk:Eihel|talk]]) 15:12, 26 April 2021 (UTC)
 
::{{ping|Eihel|Julien Baley}} I am officially deaf in one ear, so I am not the best judge of audios. I pushed the review as far as I can, but could other users help review more audios so that Mr. Vion can start this investigation with clean clues and ratios? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:15, 26 April 2021 (UTC)
 
:::{{ping|Yug}} I'm very happy to help review some recordings, if you want; could you suggest a list of users? (I don't know how to find users that have recently recorded). [[User:Julien Baley|Julien Baley]] ([[User talk:Julien Baley|talk]]) 17:41, 26 April 2021 (UTC)
 
::::{{Ping|Julien Baley}} process added below. Thank you! Note: the users I reviewed (all those below) may have a higher noise ratio than I reported, since I don't have a musical ear. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:56, 26 April 2021 (UTC)
 
:::::{{Ping|Yug}}; I've checked the entire table and added a few people (Hsarazin has only 1 recent recording, so I've amended the "14" that was shown). Some people have 0% problem, some close to 100%... the problems are very characteristic. [[User:Julien Baley|Julien Baley]] ([[User talk:Julien Baley|talk]]) 19:25, 26 April 2021 (UTC)
 
::{{Ping|Pamputt|DSwissK}} & others, I really need help on this one. We need to review and report 10+ recording for each user uploading audios to Commons and likely to send a custom message to each affected user, on their talk page and on their Commons' talk page (ex [[User_talk:Andreea_Teodoraa|msg]], ex [https://commons.wikimedia.org/w/index.php?title=User_talk%3AAndreea_Teodoraa&type=revision&diff=555617601&oldid=468099121 ping]). [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:36, 26 April 2021 (UTC)
 
:::{{ping|Yug}} not fully helpful but I added a section on [[LinguaLibre:Stats#The most prolific speakers for the current month]], it may help to narrow down to who did recent recordings. Cheers, [[User:VIGNERON|VIGNERON]] ([[User talk:VIGNERON|talk]]) 07:20, 27 April 2021 (UTC)
 
'''/!\''' The dust bug is confirmed as a core issue and is relatively widespread. I sent an email this morning to Wikimedia France (Adelaide, Remy, Michael) with suggested solutions: immediately, restoring a sitenotice ribbon to inform our users; in the short term, hiring Vion for analysis and possibly a fix. We should not be claiming to be back online and on our feet when we aren't. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:09, 27 April 2021 (UTC)
 
:Good. The CSS fixes have been deployed. → Sitenotice is back. → Indentation is back. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:11, 27 April 2021 (UTC)
 
:{{Ping|WikiLucas00|DSwissK}} hi,
 
:Given that you are the two active users having this issue, we need you most.
 
:Could you record 15~30 more audios with another web browser, such as Firefox, and then report the result here?
 
:If you have any other hypothesis to test, I am interested (changing microphones, etc.). [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:23, 27 April 2021 (UTC)
 
::I had the impression (and DSwissK confirmed on Discord) that using Firefox slightly reduces the amount of problems encountered. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:53, 27 April 2021 (UTC)
 
:::Yup, I installed Firefox and could finally send some more audios (me and my daughter), with internal microphone on my laptop. Please review. [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 00:45, 28 April 2021 (UTC)
 
  
==== Limiting the number of words to record ====
+
As for Wikipedia etc. sometimes there's a word with the Niqqud inside the article but it will be a bit complicated to parse so we can skip that for now.
{{Ping|Yug|DSwissK|VIGNERON|Seb35|Pamputt|Titodutta}} I think that one important cause of the bugs is related to the RAM. Thus, loading a long list into the Record Wizard results in a maximum amount of bugs in the recordings (the length of this list -- its weight -- may vary, depending on the user's hardware and software).
 
  
I think we should try limiting (to 100 or 200 maximum) the possible number of words to be put into the Record Wizard, at least temporarily. There is no point in loading into the RW lists that are 1000-words long; taking a little break during the recording is never wrong, and it could help reducing the amount of bugs for the moment, while we try to find the source of the issue.<br/>Best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:53, 27 April 2021 (UTC)
+
== Lights on userrights ==
:We have to test this hypothesis. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:35, 27 April 2021 (UTC)
+
Hello all,<br>
::Tested and reporting : I used very small lists (less than 10 words) and still have the same issue. I encounter that bug on my smartphone, both my computers (desktop and laptop) under Chrome (latest version). Using internal or external microphone doesn't change anything. [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 00:42, 28 April 2021 (UTC)
+
I bumped again into [[LinguaLibre:User_rights]] and {{tl|Autopatrolled}}. To the best of my knowledge we have no solution to this and no active user is monitoring this bottleneck. Is this assessment correct? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:03, 28 December 2023 (UTC)
:::{{ping|DSwissK}} thank you. This is helpful. Seems clearly software issue. I contacted Wikimedia France and Vion requesting them to jump in.
 
:::We need people with audio software skills to inspect those audios and people with JS+audio skills to review the audio input chains. Mr. Vion has both skills. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:52, 28 April 2021 (UTC)
 
:::I do not think it's RAM related.
 
:::Even with 1000 words we are dealing with 1000 words x 7KB per file = 7 MB.
 
:::Let's assume the browser stores the words in a very, very detail-rich way, so the files are 1000 times heavier. We are still at 7GB.
 
:::Most computers have 8~16GB of RAM by now.
 
:::I also recorded a small list and apparently had the issue.
 
:::Most (all?) affected users had recorded a few dozen words. Worst affected users: Natschoba → 149, Andreea Teodoraa → 247, WikiLucas00 → 64.
 
:::All but 3 users [[LinguaLibre:Stats#The most prolific speakers for the current month|this month]] have recorded less than 300 words. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 11:02, 28 April 2021 (UTC)
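:::The back-of-the-envelope estimate above can be written out as a short Python sketch. The 7 KB per file and the 1000-fold overhead are the rough assumptions from this comment, not measured values, and decimal units (1 MB = 1000 KB) are assumed.

```python
# Rough memory estimate for a long word list in the Record Wizard.
# All inputs are assumptions taken from the comment above, not measurements.
WORDS = 1000                 # length of a long word list
BYTES_PER_FILE = 7 * 1000    # about 7 KB per recording
OVERHEAD_FACTOR = 1000       # deliberately pessimistic in-browser overhead

raw_bytes = WORDS * BYTES_PER_FILE
pessimistic_bytes = raw_bytes * OVERHEAD_FACTOR

raw_mb = raw_bytes / 1000**2
pessimistic_gb = pessimistic_bytes / 1000**3

print(f"raw audio: {raw_mb:.0f} MB")
print(f"with overhead: {pessimistic_gb:.0f} GB")
```

:::Even the pessimistic figure only reaches the low end of the 8~16GB range, and the bug also shows up on lists of fewer than 10 words, which is why memory pressure looks like an unlikely cause.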
 
  
=== Review process ===
+
== A mobile app ==
'''To review recordings by another user :'''
 
# Go to [[Special:RecentChanges]] > Find recent recordings > Pick an user which is not already in the table below
 
# Open 10~20 of this user's recent recordings > Listen each > Count how many have unusual audio artifacts
 
# Add this user to the table below with its associated results and your comment
 
# If you feel necessary, please notify the user on Lili (ex [[User_talk:Andreea_Teodoraa|msg]]) and ping the user on Commons (ex [https://commons.wikimedia.org/w/index.php?title=User_talk%3AAndreea_Teodoraa&type=revision&diff=555617601&oldid=468099121 ping])
 
  
'''To be reviewed :'''
+
I personally think that contributing through a browser is quite risky. Firefox on mobile, for example, has a very strict page unloading policy, which can close the tab during an upload and lose the data that wasn't uploaded yet (I found a workaround, but it's not perfect). Are there any thoughts about this? (Maybe even expanding the current [https://www.saveriomorelli.com/commonvoice/ CV Project] app by Saverio Morelli?)
# With your usual web browser, go to [[Special:RecordWizard|Record Wizard (studio)]] > Step 3, enter your web browser name then 15 words in your language > Record, publish.
 
# Come on [[LinguaLibre:Chat room#Reviews-ready]] > Post a message with your web browser, its version [optional], and your OS.
 
  
'''To be reviewed, recording with another browser or device :'''
+
== Is the Record Wizard not working for anyone else? ==
# With another web browser or device, go to [[Special:RecordWizard|Record Wizard (studio)]] > Step 3, enter your web browser name then 15 words in your language > Record, publish.
 
# Come on [[LinguaLibre:Chat room#Reviews-ready]] > Post a message with your web browser, its version [optional], and your OS.
 
# Add some information so we know which of your recording are associated with this alternative browser or device.
 
  
=== Review-ready ===
+
My mic works with [https://mictests.com/ mictests.com], but [https://lingualibre.org/wiki/Special:RecordWizard the RecordWizard] doesn't pick anything up at the "check your microphone" stage. I've tried on both my phone and my laptop, and I can record sound in both cases, and I have the appropriate permissions enabled, but this particular website isn't detecting sounds. Is anyone else having this kind of problem? [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 23:43, 24 February 2024 (UTC)
* I recorded 10+ audios with Chrome 89.0.4389.114 (Official Build) (64-bit) : <s>all good for me, no review needed</s>. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:35, 27 April 2021 (UTC)
+
:Hello [[User:Grendelkhan]],
::{{ping|Yug}} Could you try 20 more with an up-to-date version of Chrome? — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 18:38, 27 April 2021 (UTC)
+
:I just received a second such report. The user also checked [https://mictests.com/ mictests.com] successfully.
:::{{ping|WikiLucas00}} Done. I'am not sure, but I may have the bug as well. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:42, 27 April 2021 (UTC)
+
:On Firefox, Lingua Libre recording studio step 4, the microphone is allowed (we see the red microphone image on the left of the URL address). But after clicking the record button, no recording occurs.
::::{{ping|Yug}} The majority of [https://commons.wikimedia.org/w/index.php?title=Special:ListFiles/Yug&ilshowall=1 your last recordings] contain at least a click. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:56, 27 April 2021 (UTC)
+
:* Mictests on other site : successful.
 +
:*Device: Notebook
 +
:*OS: ?
 +
:*Browser: Firefox, Chrome.
 +
:*User: [[User:Akamycoco]].
 +
:*Languages affected: all.
 +
:*Dates : Worked on February 28. Stopped working on February 29.
 +
:Let's start an investigation. Could you let me know your OS and precise web browser version? (Help > About Chrome or similar)
 +
:Let me know as well whether you have the basic developer skills to right-click on the stalled page > Inspect > Console: are there any error messages? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 07:55, 1 March 2024 (UTC)
  
=== Samples ===
+
::My laptop is using Google Chrome <tt>122.0.6261.94 (Official Build) (64-bit)</tt> on Linux (Debian Testing). No error messages in the console when I attempt the recording. My phone is using Chrome <tt>122.0.6261.90</tt> on Android 14 on a Pixel 5a. It ''does'' seem to work on Firefox <tt>115.7.0esr (64-bit)</tt> on my laptop. (I really should have checked that before.) So maybe this is solely a Chrome problem? [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 16:30, 2 March 2024 (UTC)
:''Under investigation: Some contributors experience parasitic saturation (“Pock!”) or dust while others don't.''
+
 
:''Please review your recent recordings and help expand table below so we can identify a recurring pattern among affected contributors vs non-affected ones.''
+
== Automatic categorization isn't documented. ==
{| class="wikitable"
+
 
! ||Username || # reviewed || % affected || Example file || Web Browser + version || Comment
+
So far as I can tell, this isn't documented: if, for user Foo, category <tt>Lingua Libre pronunciation by Foo</tt> exists on Commons, then all uploads will be categorized into that category. This is helpful! It's also easy to backfill after the fact using [[:commons:Help:Gadget-Cat-a-lot]]. I'm not sure where to document this, but it seems reasonable to do so ''somewhere''. [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 16:26, 3 March 2024 (UTC)
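As a hedged illustration of the behaviour described above, this Python sketch builds the per-speaker category title and a MediaWiki Action API URL that asks Commons whether that page exists. It only constructs the URL and interprets a response; it makes no network call. The helper names are invented for this example, the sample responses are hand-written, and the category naming rule is taken from the observation above rather than from the uploader's source code.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def speaker_category(username: str) -> str:
    """Category title that uploads seem to be sorted into, if it exists."""
    return f"Category:Lingua Libre pronunciation by {username}"


def existence_query_url(username: str) -> str:
    """Action API URL (action=query) asking about the category page."""
    params = {
        "action": "query",
        "titles": speaker_category(username),
        "format": "json",
        "formatversion": "2",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


def category_exists(api_response: dict) -> bool:
    """Interpret a formatversion=2 response: missing pages carry 'missing'."""
    pages = api_response.get("query", {}).get("pages", [])
    return bool(pages) and not pages[0].get("missing", False)


# Hand-written sample responses, shaped like formatversion=2 output.
missing_page = {"query": {"pages": [{"title": speaker_category("Foo"), "missing": True}]}}
existing_page = {"query": {"pages": [{"pageid": 1, "title": speaker_category("Foo")}]}}

print(existence_query_url("Foo"))
print(category_exists(missing_page), category_exists(existing_page))
```

A new contributor could run this kind of check before the first upload: if the category is missing, creating it first would let later uploads be categorized without a Cat-a-lot backfill.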
 +
 
 +
== Understanding lingua-libre ==
 +
 
 +
Hi, I am creating this discussion to understand lingua-libre better
 +
 
 +
== Uploads are failing ==
 +
:''TLDR: A large number of users are reporting upload failures at step 5: [[User:Grendelkhan|Grendelkhan]], [[User:Culex|Culex]], [[User:XANA000|XANA000]], [[User:Ardzun|Ardzun]] (Indonesian languages), [[User:Penn Zero MSSJ|Penn Zero MSSJ]], [[User:Univòc64]] (Whistled Occitan) and [[User:Akamycoco]] (Taiwanese languages). This is likely only the tip of the iceberg. Only a few users were able to [https://lingualibre.org/index.php?hidebots=1&translations=filter&hidepageedits=1&hideWikibase=1&hidelog=1&namespace=0&limit=1000&days=14&enhanced=1&title=Special:RecentChanges&urlversion=2 record in May], with an atypically low number of recordings. The Indonesia workshop with ~15 participants is critically affected. Investigation ongoing. [[User:Hugo en résidence|Hugo en résidence]] ([[User talk:Hugo en résidence|talk]]) 14:20, 13 May 2024 (UTC)''
 +
 
 +
I can record words, but uploading them to Commons fails. The JavaScript console has the following message:
 +
 
 +
: <tt>'''Your IP address is in a range that has been [[m:Special:MyLanguage/Global blocks|blocked on all Wikimedia Foundation wikis]].''' The block was made by [[User:EPIC|‪EPIC‬]]. The reason given is ''[[m:Special:MyLanguage/NOP|Open proxy/Webhost]]: See the [[m:WM:OP/H|help page]] if you are affected''. * Start of block: 10:09, 1 May 2024 * Expiry of block: 10:09, 1 May 2027 Your current IP address is 2001:41d0:304:100::4790. The blocked range is ‪2001:41D0:0:0:0:0:0:0/33‬. Please include all above details in any queries you make. If you believe you were blocked by mistake, you can find additional information and instructions in the [[m:Special:MyLanguage/No open proxies|No open proxies]] global policy. Otherwise, to discuss the block please [[m:Steward requests/Global|post a request for review on Meta-Wiki]]. You could also send an email to the [[m:Special:MyLanguage/Stewards|stewards]] [[m:Special:MyLanguage/VRT|VRT]] queue at "stewards@wikimedia.org" including all above details.`, blockinfo: {…}, "*": "See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/&gt; for notice of API deprecations and breaking changes."
 +
 
 +
This is not my IP address shown in the error message, and whatismyip confirms that I'm not behind a proxy. The Global block request [https://meta.wikimedia.org/wiki/Steward_requests/Global/2024-w18#Global_block_for_Special:Contributions/2001:41D0:0:0:0:0:0:0/33 is here]. Is this affecting anyone else? I lost a heap of recordings. [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 22:26, 4 May 2024 (UTC)
 +
:Uploads are failing for me today too, even though I am recording with my account. [[User:Culex|Culex]] ([[User talk:Culex|talk]]) 15:04, 8 May 2024 (UTC)
 +
:: Idem--[[User:XANA000|XANA000]] ([[User talk:XANA000|talk]]) 16:49, 9 May 2024 (UTC)
 +
::: I can record, but I have not been able to upload until today. I was able to upload once yesterday, but after that I couldn't upload any more. [[User:Ardzun|Ardzun]] ([[User talk:Ardzun|talk]]) 06:04, 11 May 2024 (UTC)
 +
:I guess I'm not the only one who's been trying for weeks but could not publish audio after 1 May. Hope someone can fix it. [[User:Penn Zero MSSJ|Penn Zero MSSJ]] ([[User talk:Penn Zero MSSJ|talk]]) 20:54, 13 May 2024 (UTC)
 +
::[[User:Univòc64]] (Whistled occitan) and [[User:Akamycoco]] (Taiwanese languages) also reported issues.
 +
::It seems time to add a sitenotice warning. [[User:Hugo en résidence|Hugo en résidence]] ([[User talk:Hugo en résidence|talk]]) 14:07, 13 May 2024 (UTC)
 +
::In May we mostly have: 556 recordings by 7 users on May 1st, 174 recordings on May 11th ([[Special:Contributions/Austin Zhang|Austin Zhang]]), then nothing.
 +
::If we compare with [https://public-paws.wmcloud.org/User:Yug/QueryLingualibre-monthly.ipynb known monthly recordings], our recent average month was about 30k audios and the lowest months about 5k audios. May 2024 is heading toward 1200 audios, which is about 4% of the average month and about 24% of the lowest months. Something weird is going on indeed.
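::For reference, the comparison can be checked with a few lines of Python. The three inputs are the round figures quoted in this comment (about 30k for a recent average month, about 5k for the lowest months, about 1200 projected for May 2024); they are estimates, not values recomputed from the monthly table.

```python
# Round figures quoted in the comment above (estimates, not recomputed).
AVERAGE_MONTH = 30_000    # recent average monthly recordings
LOWEST_MONTH = 5_000      # typical lowest recent months
MAY_2024_PROJECTED = 1_200


def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


vs_average = share_percent(MAY_2024_PROJECTED, AVERAGE_MONTH)
vs_lowest = share_percent(MAY_2024_PROJECTED, LOWEST_MONTH)

print(f"vs average month: {vs_average:.0f}%")
print(f"vs lowest months: {vs_lowest:.0f}%")
```

::With these inputs the projected month comes out at 4% of an average month and 24% of a low month, so the drop is far outside normal month-to-month variation.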
 +
{| class=wikitable
 +
! Most prolific speakers for the current month || Months since 2022
 
|-
 
|-
| [[Special:Contributions/DSwissK|c]] || [[User:DSwissK]] || 15 || 33% (5) || [[File:LL-Q150 (fra)-DSwissK-gratter.wav]] ||  ||
+
|
 +
<query _pagination="10" locutor="<translate><!--T:7--> Item (locutor Qid)</translate>" locutorLabel="<translate><!--T:8--> Speakers of the Month</translate>" nb="<translate><!--T:9--> Number of records</translate>">
 +
SELECT ?locutor ?locutorLabel ?nb WHERE {
 +
  {
 +
    SELECT ?locutor (COUNT(?record) as ?nb)
 +
    WHERE {
 +
        ?record prop:P2 entity:Q2 .        # Q2: record, P2: instance of.
 +
        ?record prop:P5 ?locutor .          # Property:P5: speaker
 +
        ?record prop:P6 ?date .
 +
      FILTER ( YEAR(?date) = YEAR(NOW()) && MONTH(?date) = MONTH(NOW()) )
 +
    }
 +
    GROUP BY ?locutor ?locutorLabel
 +
    ORDER BY DESC(?nb)
 +
    LIMIT 50
 +
  }
 +
  SERVICE wikibase:label {
 +
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
 +
    ?locutor rdfs:label ?locutorLabel .
 +
  }
 +
}
 +
ORDER BY DESC(?nb)
 +
</query>
 +
|
 +
<pre>
{ date:2022-01, records: 21290, speakers: 46, languages: 28 },
{ date:2022-02, records: 3894, speakers: 40, languages: 17 },
{ date:2022-03, records: 8357, speakers: 61, languages: 21 },
{ date:2022-04, records: 5454, speakers: 34, languages: 18 },
{ date:2022-05, records: 4702, speakers: 59, languages: 30 },
{ date:2022-06, records: 7675, speakers: 41, languages: 18 },
{ date:2022-07, records: 4364, speakers: 37, languages: 22 },
{ date:2022-08, records: 9544, speakers: 45, languages: 23 },
{ date:2022-09, records: 5802, speakers: 113, languages: 30 },
{ date:2022-10, records: 6931, speakers: 74, languages: 32 },
{ date:2022-11, records: 8461, speakers: 54, languages: 34 },
{ date:2022-12, records: 11882, speakers: 54, languages: 23 },
{ date:2023-01, records: 18150, speakers: 48, languages: 29 },
{ date:2023-02, records: 32441, speakers: 65, languages: 29 },
{ date:2023-03, records: 11527, speakers: 61, languages: 30 },
{ date:2023-04, records: 8451, speakers: 58, languages: 35 },
{ date:2023-05, records: 21282, speakers: 97, languages: 49 },
{ date:2023-06, records: 17940, speakers: 56, languages: 35 },
{ date:2023-07, records: 75825, speakers: 74, languages: 38 },
{ date:2023-08, records: 32681, speakers: 54, languages: 30 },
{ date:2023-09, records: 28813, speakers: 114, languages: 30 },
{ date:2023-10, records: 60317, speakers: 167, languages: 47 },
{ date:2023-11, records: 49704, speakers: 140, languages: 55 },
{ date:2023-12, records: 42383, speakers: 114, languages: 41 },
{ date:2024-01, records: 40572, speakers: 112, languages: 40 },
{ date:2024-02, records: 22385, speakers: 197, languages: 57 },
{ date:2024-03, records: 16997, speakers: 173, languages: 48 },
{ date:2024-04, records: 8733, speakers: 117, languages: 42 },
{ date:2024-05, records: 556, speakers: 7, languages: 7 }
</pre>
 
|-
 
|-
| [[Special:Contributions/Natschoba|c]] || [[User:Natschoba]] || 20 || 95% (19) || [[File:LL-Q188_(deu)-Natschoba-der_Wunsch.wav]]<br>[[File:LL-Q188 (deu)-Natschoba-Anspruch erheben.wav]]<br>[[File:LL-Q188 (deu)-Natschoba-der Unfall.wav]]<br>[[File:LL-Q188 (deu)-Natschoba-der Teil.wav]] || ||  Several thousands of recordings before. No hardware change.
+
! Daily recordings over April and May 2024 ||   
 
|-
 
|-
| [[Special:Contributions/Andreea Teodoraa|c]] || [[User:Andreea Teodoraa]] || 11 || 75% (8) || [[File:LL-Q7913 (ron)-Andreea Teodoraa-muscă.wav]]<br>[[File:LL-Q7913 (ron)-Andreea Teodoraa-otravă.wav]]<br>[[File:LL-Q7913 (ron)-Andreea Teodoraa-ofițer.wav]] || || Several thousands of recordings before. Tried different mics and platforms, same behaviour.
+
|
 +
<query _pagination="40">
SELECT
  ?yearmonthday
  (COUNT(DISTINCT ?record) AS ?records)
  (COUNT(DISTINCT ?speaker) AS ?speakers)
  (COUNT(DISTINCT ?language) AS ?languages)
WHERE {
  ?record prop:P5 ?speaker .
  ?record prop:P4 ?language .
  ?record prop:P6 ?date .
  BIND( SUBSTR(str(?date), 0, 11) AS ?yearmonthday )
  {
    SELECT ?record
    WHERE {
      ?record prop:P2 entity:Q2 .
      ?record prop:P6 ?date .
      FILTER(?date >= "2024-04-01T00:00:00Z"^^xsd:dateTime)
      FILTER(?date < "2024-05-30T00:00:00Z"^^xsd:dateTime)
    }
  }
}
GROUP BY ?yearmonthday
ORDER BY (?yearmonthday)
</query>
 +
| <= stops on 2024.05.01<br>Note: [[Special:Contributions/Austin Zhang|Austin Zhang]] recorded 174 audios on 05.11
 +
|}
 +
[[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:39, 14 May 2024 (UTC)
 +
 
 +
=== Fixed ===
 +
Both IP ranges 2001:41D0:0:0:0:0:0:0/32 and 2001:41D0:0:0:0:0:0:0/33 were subject to a global Wikimedia block at one point (see [https://meta.wikimedia.org/w/index.php?title=Steward_requests/Global&oldid=26774369#Unregistered_users_only_block_for_the_range_2001:41D0:0:0:0:0:0:0/32 Global ban range_2001:41D0:0:0:0:0:0:0/32]). Following our request, the block has been reconfigured and uploads from LinguaLibre are possible again. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:38, 14 May 2024 (UTC)
 +
:I have been able to record and upload with my account since yesterday, so that seems fixed. But it seems the stats are still not updated. [[User:Culex|Culex]] ([[User talk:Culex|talk]]) 12:08, 15 May 2024 (UTC)
 +
 
 +
=== Logs ===
 +
For reference, I investigated the relevant block logs and upload logs for May 2024.<br>Conclusion: the collapse in uploads is consistent with the IP ban. Still, given bug reports from Akamycoco in *March* and 咽頭べさ [[:c:File:Lingua_Libre_error_2024.webm|on step 4]], I suspect other bugs are lingering around.
 +
{| class=wikitable
 +
!width=50%| Global IP bans
 +
! Lingualibre uploads logs
 
|-
 
|-
| [[Special:Contributions/GeoMechain|c]] || [[User:GeoMechain]] || 15 || 0% (0) ||  || ||  
+
|
|-
+
* [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F32&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 18:46, 13 May 2024] EPIC talk contribs changed global block settings for 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (anonymous users only) (No open proxies <!-- SCLT ID: Possible VPN or Colocation -->)
| [[Special:Contributions/ClasseNoes|c]] || [[User:ClasseNoes]] || 15 || 0% (0) ||  || || 
+
[https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F32&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 00:51, 10 May 2024] AmandaNP talk contribs globally blocked 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (No open proxies <!-- SCLT ID: Possible VPN or Colocation -->)   
|-
+
* [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F33&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 17:02, 9 May 2024] EPIC talk contribs changed global block settings for 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (anonymous users only) (Open proxy/Webhost: See the help page if you are affected)
| [[Special:Contributions/Hsarrazin|c]] || [[User:Hsarrazin]] || 14 || 30% (4) || [[File:LL-Q150 (fra)-Hélène (Hsarrazin)-corné.wav]]<br>[[File:LL-Q150 (fra)-Hélène (Hsarrazin)-Bellevigne-les-Châteaux.wav]]<br>[[File:LL-Q150 (fra)-Hélène (Hsarrazin)-Saint-Sylvain-d’Anjou.wav]] ||  ||
+
* [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F33&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 17:09, 1 May 2024] EPIC talk contribs blocked 2001:41d0::/33 talk with an expiration time of 2 years, 364 days, 12 hours, 21 minutes and 36 seconds (anonymous users only, account creation disabled) (Open proxy/Webhost: See the help page if you are affected)  
|-
+
* [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F33&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 17:09, 1 May 2024] EPIC talk contribs globally blocked 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (Open proxy/Webhost: See the help page if you are affected)  
| [[Special:Contributions/ᱥᱟᱹᱜᱩᱱ ᱗|c]] || [[User:ᱥᱟᱹᱜᱩᱱ ᱗]] || 2 || 100% (2)|| [[File:LL-Q33965 (sat)-ᱵᱳᱫᱤ ᱵᱟᱥᱠᱤ (ᱥᱟᱹᱜᱩᱱ ᱗)-ᱢᱟᱨᱥᱟᱞ.wav]]<br>[[File:LL-Q33965_(sat)-ᱵᱳᱫᱤ_ᱵᱟᱥᱠᱤ_(ᱥᱟᱹᱜᱩᱱ_᱗)-ᱠᱟᱹᱢᱤ.wav]] ||  || Only 2 audios.
+
|
|-
+
* : [https://commons.wikimedia.org/wiki/Special:RecentChanges?hidebots=1&translations=filter&hidecategorization=1&hideWikibase=1&tagfilter=OAuth+CID%3A+1735&limit=500&days=30&urlversion=2 Uploads via Lingualibre resumed].
| [[Special:Contributions/Zoyahssn|c]] || [[User:Zoyahssn]] || 2 || 100% (2) || [[File:LL-Q1860 (eng)-Md Anan Islam (Zoyahssn)-Md Anan Islam.wav]] || ||  Suspects: Hardware & sound setting issue
+
 
|-
+
13 May 2024
| [[Special:Contributions/Olaf|c]] || [[User:Olaf]] || 15 || 0% (0) || — ||  || All recent recordings ok.
+
* [... Many more uploads]
 +
* Upload log 23:39 Elwinlhq talk contribs uploaded File:LL-Q5218 (que)-Elwinlhq-apaqay.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 19:05 Assassas77 talk contribs uploaded a new version of File:LL-Q9192 (cmn)-Assassas77-八角.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 19:05 Assassas77 talk contribs uploaded File:LL-Q9192 (cmn)-Assassas77-八角.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 16:38 Oh! Tea<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Oh!_Tea Commons > User:Oh!_Tea :  « nothing » on Commons]</ref> talk contribs uploaded File:LL-Q36759-Austin Zhang-sih8 buh8 sah8 nah4.wav ‎ Tag: Lingua Libre [2.2]
 +
11 May 2024
 +
* Upload log 20:21 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-buah8.wav ‎ Tag: Lingua Libre [2.2]
 +
* [... +172 recordings by User:Oh! Tea]
 +
* Upload log 18:56 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-a2.wav ‎ Tag: Lingua Libre [2.2]
 +
10 May 2024
 +
* Upload log 06:08 CapitainAfrika<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:CapitainAfrika Commons > User:CapitainAfrika :  « IP block exempt » on Commons]</ref> talk contribs uploaded File:LL-Q36217 (lin)-CapitainAfrika-Wiki na monɔkɔ mua bísó.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 00:14 Ardzun<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Ardzun Commons > User:Ardzun :  « nothing »]</ref> talk contribs uploaded File:LL-Q13324 (min)-Ardzun-mada.wav ‎ Tag: Lingua Libre [2.2]
 +
9 May 2024
 +
* Upload log 17:08 Àncilu<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Àncilu Commons > User:Àncilu :  « Autopatroller » on Commons]</ref> talk contribs uploaded File:LL-Q652 (ita)-XANA000-orsù.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 17:05 Àncilu talk contribs uploaded File:LL-Q652 (ita)-XANA000-frac.wav ‎ Tag: Lingua Libre [2.2]
 +
5 May 2024
 +
* Upload log 21:15 Benoît Prieur<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Benoît_Prieur Commons > User:Benoît_Prieur :  « Administrator » on Commons]</ref> talk contribs uploaded File:LL-Q8785 (hye)-Benoît Prieur-Artsakh.wav ‎ Tag: Lingua Libre [2.2]
 +
1 May 2024
 +
* Upload log 16:09 Penn Zero MSSJ<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Penn_Zero_MSSJ Commons > User:Penn Zero MSSJ :  « nothing » on Commons]</ref> talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hệ số.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hỗn số.wav ‎ Tag: Lingua Libre [2.2]
 +
* Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hằng đẳng thức.wav ‎ Tag: Lingua Libre [2.2]
 +
* [... Many more uploads]
 
|-
 
|-
| [[Special:Contributions/WikiLucas00|c]] || [[User:WikiLucas00]] || 60 || 75% (45) || [[File:LL-Q150 (fra)-WikiLucas00-chapeaux.wav]]<br/>[[File:LL-Q150 (fra)-WikiLucas00-apologétique.wav]]<br/>[[File:LL-Q150 (fra)-WikiLucas00-sacerdotal.wav]]<br/>[[File:LL-Q150 (fra)-WikiLucas00-érythrocyte.wav]] || Brave 1.23.73 (Chromium: 90.0.4430.85)  || See [https://commons.wikimedia.org/wiki/Special:ListFiles/WikiLucas00 my 2021-04-26 10pm series]
+
|colspan=2| <small><references /></small>
 
|}
 
|}
 +
[[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:38, 14 May 2024 (UTC)

Revision as of 12:08, 15 May 2024

Chat rooms in various languages:
English · 🌐

Chatroom FAQ

How to download all audios of one language? By speaker?

Datasets are available here. A script updates the datasets every 2 days, using CommonsDownloadTool. For more, see Help:Download datasets.

How to add missing languages?

Administrators can add new languages on demand; they do so within a few days. Please provide your language's ISO 639-3 code and/or its Wikidata ID. For more, see Help:Add a new language.

How to keep my wikimedia project up to date?

Contact Poslovitch, the master of Lingua Libre Bot. For more info, check out Help:Bots and LinguaLibre:Bot.

What IRL events are coming? When? Where?

Please see LinguaLibre:Events.

How to translate LinguaLibre User Interface into a new language?

Go to translatewiki.net. For more, see Help:Translate.

How to archive sections which have been answered?

After reviewing the section, add {{done}} ~~~~ to the top of the section. After a few days to 2 weeks, move the section's code to [[LinguaLibre:Chat_room/Archives/year]].

Archives
2023 · 2022 · 2021 · 2020 · 2019 · 2018

Results of Coverage Test of French Lemma and Non-Lemma forms in English Wiktionary

While playing around with generating lists for pronunciation from Wiktionary, I decided to run a few tests on the current coverage of French lemma and non-lemma forms in English Wiktionary. I chose French because it is the largest dataset in LL.

Current Coverage of French in Lingua Libre

  • Total French Entries in Lingua Libre by a native speaker: 233 982
  • Unique French Entries in Lingua Libre by a native speaker: 154 358
  • Percentage of overlap: 34%
  • Term with the greatest number of pronunciations: "blanc" with 40

Current Coverage of Category:French lemmas

  • Total entries in Category:French lemmas: 84 482
  • Pronounced entries: 50 917
  • Entries without pronunciation: 33 565
  • Coverage Percentage: 60.27%

Current Coverage of Category:French non-lemma forms

  • Total entries in Category:French non-lemma forms: 291 225
  • Pronounced entries: 26 791
  • Entries without pronunciation: 264 434
  • Coverage Percentage: 9.20%
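
The percentages above can be re-derived from the raw counts. The following is a minimal Python sketch using only the figures quoted in this post, reading the non-lemma total as 291 225; the function name `coverage` is made up for illustration and is not part of any Lingua Libre tool.

```python
def coverage(pronounced: int, total: int) -> float:
    """Share of entries that have at least one recording, in percent."""
    return round(100 * pronounced / total, 2)


# Figures quoted in the post above.
lemmas_total, lemmas_pronounced = 84_482, 50_917
forms_total, forms_pronounced = 291_225, 26_791

lemma_coverage = coverage(lemmas_pronounced, lemmas_total)
form_coverage = coverage(forms_pronounced, forms_total)

# Entries still lacking a recording, lemmas and non-lemma forms combined.
missing = (lemmas_total - lemmas_pronounced) + (forms_total - forms_pronounced)

# Overlap: share of recordings that repeat an already-recorded term.
total_recordings, unique_terms = 233_982, 154_358
overlap = round(100 * (total_recordings - unique_terms) / total_recordings, 2)

print(lemma_coverage, form_coverage, missing, overlap)
```

The combined number of missing entries matches the 297 999 figure given in the lessons below, which suggests the third bullet of each coverage list counts entries that still lack a pronunciation.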

For me, there are several lessons to be drawn.

  1. First, there has been amazing growth on LL. Covering 60.27% of French lemmas is a real achievement.
  2. The overlap percentage is quite small overall.
  3. There needs to be a clearer sense of when LL should stop requesting pronunciations for a certain term because 40 pronunciations of "blanc" seems a bit excessive.
  4. A need exists to continue pro-actively targeting entries in Wiktionary that are not in Lingua Libre. Currently, 297 999 French lemma and non-lemma forms require pronunciations.
  5. Generating lists from Wiktionary and checking coverage is not as hard as I thought.
  6. Lingua Libre has almost caught up with Forvo in the number of French pronunciations (233 982 vs 254 703). Overall, Lingua Libre has shown amazing and healthy progress in a very short period of time. I'm excited about these results. Languageseeker (talk) 03:07, 1 June 2022 (UTC)
@Languageseeker This investigation is pretty cool. (I'm not sure I understand all your numbers yet, but I will read again when back on my PC.) It's quite nice to see we are reaching Forvo's level for our lead language. It's possible we have more unique words than Forvo since we have user:Olafbot actively guiding and pushing us on that path.
On Lili we have chosen to be a learning AND linguistic diversity audio database. When you account for gender, regional accents, age, and voice type, having 40 French audios for a word is still 400+ voices short.
Also, not all contributors are able to contribute perfect audio files, due to various shortcomings (hardware, no recording room, no noise-cancelling system, etc.). We lack a proper rating and review system. It's on our [slow] roadmap tho. 😉
PS: Should I answer you in French? I get a feeling you are French or learning it. Yug (talk) 15:07, 1 June 2022 (UTC)
@YUG Hi, Yug. Yes, I am learning French. As we discussed during our meeting, it is difficult to define the limits of a language. As I see it, lemma forms are not enough. Right now, I am building an Olafbot on steroids for French. My plan is to write a Python program that can analyze the templates used on Wiktionary. Languageseeker (talk) 15:48, 7 June 2022 (UTC)
Hi @Languageseeker . I'm sorry I did not visit the Chat Room in a long time, and missed your report. Very interesting, good job! I remember a request I made to Olaf some time ago: it would be interesting to have a list similar to the one Olafbot is updating, but containing only lemmas of the target language (to quickly have nearly all lemmas of a dictionary illustrated with an audio pron). Also, I suggest you to use the categories of the French version of Wiktionary when you plan to work on French (and some other languages, that are more extensively described there). As you can see here, the category gathering French lemmas is more than 3 times more complete on the fr. version than on the en. version of Wiktionary. As you mentioned, these numbers are exciting, let's keep up the good work! All the best — WikiLucas (🖋️) 15:47, 26 November 2022 (UTC)
@WikiLucas00 Sorry, I totally forgot about your request. The list is now ready for French: List:Fra/Filtered-lemmas-without-audio-sorted-by-number-of-wiktionaries. It's produced like the other lists, but it's limited to words from Catégorie:Lemmes_en_français. The list will be refreshed together with the rest. Olaf (talk) 16:54, 14 May 2023 (UTC)
Hello @Olaf ! Thank you so much for this list, it's going to be very useful for sure! Let's cover 100% of Lemmas 😎 I'll tell the French contributors on Discord about it 😉 All the best — WikiLucas (🖋️) 22:18, 20 May 2023 (UTC)

How to create user page

Hello, my user name is Ngangaesther, from Kenya. I am still stuck on how I am supposed to create my user page. Kindly help. Regards, Esther

Odia language missing from Stats/Languages

Hi there, for some reason, the Odia-language stats are missing from the Stats/Languages page. Also, "The most prolific speakers for the current month " section in the Stats/Speakers page is not loading at all since the time I checked last (about 10 days). I have tried on Chromium and Firefox and the result is the same even after clearing cache. --Subhashish (talk) 19:40, 28 July 2022 (UTC)

Hello Subhashish, it should be back online. We had a hackathon to put it back. We are calling for devs to push forwards. Yug (talk) 11:07, 10 August 2022 (UTC)
Thank you for the update, Yug. --Subhashish (talk) 14:00, 10 August 2022 (UTC)

Manually-coded languages

I came across meta:Lingua Libre/SignIt recently (via betawiki) and was wondering if manually-coded languages would be appropriate for this as well? These are languages in sign modality, but strongly tied to a spoken/written language; they usually adopt the grammar of the nonmanual language, choosing instead to simply transpose the vocabulary. This means they are most often used in application-specific and pidgin contexts (Pidgin Sign for English and diver's signs are examples). In particular, I am interested in toki pona luka, a manual form of toki pona (Q338540). Since the vocab is the same as spoken/written toki pona, there are a minimal number of lexemes overall, so having a complete set of signs is easily achievable. Manually-coded languages including toki pona luka are generally not given a separate ISO 639 code since they are in effect equivalent to scripts. Would this cause a problem for the infrastructure as currently designed? Arlo Barnes (talk) 05:56, 17 August 2022 (UTC)


Hello Arlo Barnes,

I understand "manually coded languages" as synonymous to "signed languages", am I correct?
If there is no distinct ISO for the signed language, we could still:

  • Create a new wikidata item without ISO, which will be used as identifier by LinguaLibre infrastructure
  • Use the spoken/written language's ISO code, and create lists of words all suffixed by (signed).

Either of those solutions could work.

If you have some knowledge of signed toki pona luka please let me know. We are adding features on Lingualibre and SignIt in order to be able to record video of signed words by late 2022. We are almost there. If you would like to record some basic signed words to share with the world, then let me know. Yug (talk) 20:58, 17 August 2022 (UTC)

Signed languages and manually-coded languages share similarities (the manual modality) and differences (since sign languages are 'native' to the signed modality, they use it more fully, having complete deixis and time-reference systems, use of handshape classifiers, etc.) -- 'luka' means 'hand'/'five', so that's the part of the name that indicates the manual modality, but otherwise it's just garden-variety toki pona. I am interested in using SignIt to record this vocab, yes. The '(signed)' suffix seems like a good way to do it. Arlo Barnes (talk) 13:16, 19 August 2022 (UTC)
Arlo Barnes: We increasingly have tools to update and correct sign language recordings, so if the (signed) suffix, or whichever solution we choose, turns out to be incorrect, we can still correct it later using those tools.
I would encourage you to first train yourself and learn that manually-coded language over the coming months. Indeed, we still have one last bug in our video recording chain, which makes valid videos appear as audio on Commons. We expect to solve this last issue this fall (September or October?). So for now, I encourage you to rest well and recharge, to be ready to record later this year. Maybe identify a suitable place near you with an elegant monochrome wall to film against, or consider building yourself a low-cost recording studio, etc. We can discuss how to keep it low-cost and effective if you are interested, as I'm also looking for such walls and/or considering building one for myself.
See also : Minimal Sign Language Studio guideline. Yug (talk) 22:30, 19 August 2022 (UTC)

Update my username

I have changed my Wikimedia username but the previous name still appears in Lingua Libre. I know it's not included in unified logins. Anyway, please update my username to Aishik Rehman. Hirok Raja (talk) 15:14, 1 September 2022 (UTC)

Hi Hirok Raja, would you have an example of what you would like to see changed? I think you are talking about the filename but I am not sure, so with one example, it would be clearer. Pamputt (talk)
@Pamputt
1. Top menubar of lingualibre.org showing 'Hirok Raja' as my profile name.
2. After uploading when I try to check my uploads in Commons, it takes me to https://commons.m.wikimedia.org/wiki/Special:ListFiles/Hirok_Raja page.
3. 'Hirok Raja' being used as Default recorder in the file names and description
4. Change speaker name to 'Aishik Rehman' every time while recording is quite annoying to me.
5. Even here 'Hirok Raja' is showing as my signature by default ): Hirok Raja (talk) 19:16, 2 September 2022 (UTC)
I suspect this is due to long-term cookies. It would be worth clearing your login cookies for Lingualibre: this will log you out, after which you can log in again and come back here. On Firefox:
Open about:preferences#privacy > Go to "Cookies and Site Data" > Click "Manage Data" > Search "Lingualibre" > Remove selected. Yug (talk) 21:10, 2 September 2022 (UTC)

Siège communautaire de Wikimédia France – ouverture du vote / Community representative to Wikimédia France’s board - votes are opened

(English version below. Do not hesitate to correct my English translation.)

(Message copié depuis le bistro du jour par Lepticed7 (talk))

Bonjour,

En tant que président de la commission électorale pour l'élection du siège communautaire au conseil d'administration de Wikimédia France, je vous annonce que le vote ouvre aujourd'hui (13 septembre) à 0h CEST. Il se terminera le 26 septembre à 23h59 CEST.

Comme il y a trois ans, le scrutin est public sur Meta. Les pages de votes sont disponibles dans la catégorie correspondante ou en lien sur la page principale. C'est un scrutin par approbation, le candidat qui aura le plus grand nombre de voix sera donc déclaré élu. Vous pouvez voter pour autant de candidats que vous le souhaitez.

Si vous avez des questions, vous pouvez les poser sur la page de discussion ou par courriel à election@wikimedia.fr.

Pour la commission électorale, Mathis B, le 12 septembre 2022 à 22:00 (CEST)


(Message copied from the French Wikipedia Bistro by Lepticed7 (talk))

Hello,

As the chairman of the electoral commission for the election of the community representative to Wikimédia France’s board, I announce that voting opens today (13 September) at 0:00 CEST. It will close on 26 September at 23:59 CEST.

As was the case three years ago, voting is public on Meta. Voting pages are available in the corresponding category or as links on the main page. This is an approval vote: the candidate with the most votes will be declared elected. You can vote for as many candidates as you wish.

If you have any questions, you can ask them on the Talk page on Meta, or by email at election@wikimedia.fr.

For the electoral commission, Mathis B, 22:00, 12 September 2022 (CEST)

Is there a way to exclude username from Wikimedia Commons upload file name?

See also Help:Renaming.

This seems redundant and takes up a lot of space --Middle river exports (talk) 20:22, 9 October 2022 (UTC)

@Middle river exports Welcome MRE,
You could name your speaker with a single character I guess.
But keeping the name is intentional. Each speaker has his/her own voice, which we want to document. If, outside of Wikimedia, you want to remove part of the filename, we have a technical tutorial to do so. See Help:Download datasets and Help:Renaming. Ping us back if your dataset is not up to date. Yug (talk) 13:16, 10 October 2022 (UTC)
I have solved this now by just changing my username to something shorter. This way I can upload English as Usmaan (عثمان) for example where instead of just repeating the username it shows two scripts which is more useful. (Apparently few enough people have Arabic script usernames that short common words are mostly available.) --عثمان (talk) 20:23, 10 October 2022 (UTC)
All Unicode characters should be ok, in words and usernames ;) Yug (talk) 19:46, 11 October 2022 (UTC)

Username update request

I realised my username on Mediawiki didn't carry over here when I changed it. On this site could I please have it changed to: عُثمان --عثمان (talk) 08:45, 10 November 2022 (UTC)

Data on LinguaLibre:Stats isn't consistent with Wikimedia Commons' category

On the Stats page, French has 254,387 records

https://lingualibre.org/wiki/LinguaLibre:Stats/Languages

Meanwhile, the Category on commons.wikimedia.org has 253,464 records

https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-fra

The stats display more records. This data inconsistency is strange. -- User:Shenlebantongying, 10:36, 23 december 2022.

This means some item pages exist here, but the corresponding audio files are not on Commons.
Item creation here and the upload to Commons are both done at step 5 of the recording, nearly simultaneously.
So I don't know what is going on. Yug (talk) 17:41, 26 December 2022 (UTC)
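
One way to investigate such a gap would be to compare the file names referenced by Lingua Libre items with the file names actually present in the Commons category. The sketch below only illustrates the comparison step; the item IDs and file names are hypothetical sample data, and fetching the two real lists (for example through the SPARQL endpoint and the Commons category) is left out.

```python
def items_without_file(item_files: dict[str, str], commons_files: set[str]) -> list[str]:
    """Return the IDs of items whose audio file is missing from Commons."""
    return sorted(qid for qid, name in item_files.items() if name not in commons_files)


# The gap reported above: Stats page count minus Commons category count.
gap = 254_387 - 253_464

# Hypothetical sample data: item ID -> file name recorded on the item.
item_files = {
    "Q1": "LL-Q150 (fra)-Example-un.wav",
    "Q2": "LL-Q150 (fra)-Example-deux.wav",
    "Q3": "LL-Q150 (fra)-Example-trois.wav",
}
# Hypothetical sample data: file names found in the Commons category.
commons_files = {
    "LL-Q150 (fra)-Example-un.wav",
    "LL-Q150 (fra)-Example-deux.wav",
}

orphans = items_without_file(item_files, commons_files)
print(gap, orphans)
```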

c:Category:Lingua Libre pronunciation-bxg

All files in this category are tagged with wrong language. I have requested moves for files in the category, but what's more to be done?--GZWDer (talk) 13:05, 12 January 2023 (UTC)

Thanks for reporting. Actually all these items are erroneous (see Special:WhatLinksHere/Q590228):
I have not checked yet if corresponding recordings are still on Commons. Pamputt (talk) 16:11, 13 January 2023 (UTC)

I can not publish my records recorded via Lingua Libre.

Dear Colleagues,

It records, but when I press the button to publish on Wikimedia Commons, it does not work. It returns "Retry failed upload". Any idea? Thank you. Key Mîrza (talk) 05:09, 28 January 2023 (UTC)

Is it happening for all your recordings or only some of them? Pamputt (talk) 08:49, 28 January 2023 (UTC)
It was all good until a month ago. I am currently on vacation in another city and trying to log in to my account and make some more records. I can log in and I can create records, but I cannot publish them. I am stuck at the publishing stage: none of my records get published. I even tried to record via my cell phone, but nothing gets published there either. By the way, I just saw your previous message welcoming me. Thank you for your kind wishes. Best wishes... Key Mîrza (talk) 09:57, 28 January 2023 (UTC)
Hmmm, I do not know what to say. Sometimes some recordings do not upload but others do. When no recording uploads at all, I do not know what the cause could be. Could you try with another web browser (Firefox or Chrome)? To go further, I think we would need a JavaScript expert who could have some hints. @Poslovitch & Lepticed7 maybe? Another question: how many words do you try to record? If it is a lot, could you try with only a few (fewer than 10, for example)? Pamputt (talk) 15:42, 28 January 2023 (UTC)
I tried 11 words together, then just 1 word for testing purposes. Nothing worked. You said Java. Do I need Java to be able to work with the application? If so, then I need to install Java, because I formatted my PC and it may not be installed. Thank you. Key Mîrza (talk) 17:06, 28 January 2023 (UTC)
Java is different from JavaScript. JavaScript is a language supported by the web browser, so you do not need to install anything other than a web browser to record pronunciations on Lingua Libre. Unfortunately, I cannot dig further in this direction because I know almost nothing about JavaScript. Pamputt (talk) 21:18, 28 January 2023 (UTC)
Thank you, anyway. Key Mîrza (talk) 22:38, 28 January 2023 (UTC)
Key Mîrza, thank you very much for your voice; it makes us discover new languages. Please be aware that Lili works best on solid desktop computers. Also, you likely have a limit of 380 record uploads per 72 minutes, so you may need to leave your tab open and click "retry" after that. You can extend those rights by making a request on Commons. See LinguaLibre:User rights. Contact us if you think that may be the cause. Yug (talk) 15:07, 5 February 2023 (UTC)
It's confirmed: like all new contributors, you are limited to 380 uploads per 72h. You can get more user rights by requesting them on Commons. Yug (talk) 15:15, 5 February 2023 (UTC)
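
To plan a large recording session under such a limit, it can help to estimate how many rate-limit windows a list will need. This is a rough Python sketch, not part of the Record Wizard: the helper name `upload_windows` is invented here, the default of 380 comes from the messages above, and the window length is deliberately left out because the two messages quote it differently.

```python
import math


def upload_windows(n_recordings: int, limit: int = 380) -> int:
    """Number of rate-limit windows needed to upload n_recordings files."""
    if n_recordings < 0 or limit <= 0:
        raise ValueError("n_recordings must be >= 0 and limit must be > 0")
    return math.ceil(n_recordings / limit)


# Example: a 1000-word list needs three windows (380 + 380 + 240).
print(upload_windows(1000))
```

In practice this means recording at most `limit` words per session, or leaving the tab open and retrying the failed uploads once the window has elapsed.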

Late 2022-2023 Winter report

Hello all, allow me to share a few general updates on the various recent, ongoing, or near-future efforts.

  • 🤖 User:Pamputt has taken over Lingualibre Bot and added support for the Kurdish wiktionary. See github.
  • 🌏 Melody (WMFr intern) and myself held a mini-editathon on writing template emails for outreach. See Lingualibre:Events.
  • ⚡ User:Elfix and myself are collaborating on SPARQL requests (me) and their optimization (Elfix). We aim to create a languages gallery this spring.
  • 🔴 Wikimedia France's freelance work on the record wizard is back on track; delivery of fixes should occur around May-June.
  • 🙋‍♀️ Adelaide (WMFr) mentioned the wish for a second intern on Lingualibre outreach this summer, to reuse Melody's assets and expand actions and geographic diversity.
  • 🫱🏼‍🫲🏽 Wikimedia France's yearly strategic meetup is this week, and is expected to strengthen its (linguistic) diversity and metrics axes, for which Lingualibre is one of their champions.
  • 🧓 Eve and myself (likely) will be present at Toulouse's Forom des Langues, in May, where ~60+ language associations are present.

For specific deadlines and events coming soon, please also check Lingualibre:Events/Program. We always welcome contributors. When necessary, WMFr may refund transportation costs. Worth a try ! Yug (talk) 15:07, 5 February 2023 (UTC)

Edit your nickname

Good evening, I would like to change my nickname because it did not update when I was renamed Manjiro91 then Manjiro5 instead of GamissimoYT on Wikimedia projects. Thanks in advance Regards manȷıro💬 22:53, 23 February 2023 (UTC)

Tool to prepare words for Lingua Libre

Preparing words to be used in Lingua Libre has always been challenging. But I think this is a shared challenge. Crawling text from different sources and creating a clean list of words is very important. I've used Tito's instructions in the past, but using multiple tabs and multiple tools is not the best user experience. So, I thought I'd create something that is functional for me and simple enough to be tweaked. Introducing "Prepare words for Lingua Libre". The tool is currently set for Odia but can be easily tweaked for other languages using non-Latin scripts. I'd request the Lingua Libre core team to incorporate the tool into Lingua Libre so that users can use the platform to create a wordlist. Extracting words from any random text is always hard, especially for new contributors. --Subhashish (talk) 03:44, 14 March 2023 (UTC)
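
For readers curious about what such a word-list preparation step can look like, here is a minimal Python sketch. It is not the code of the tool mentioned above, only an illustration under stated assumptions: a "word" is taken to be a run of Unicode letters and combining marks, so that vowel signs in scripts such as Odia stay attached to their base letters, and the function name `extract_words` is invented for this example.

```python
import unicodedata


def extract_words(text: str) -> list[str]:
    """Return the unique words of text, in order of first appearance.

    A word is a run of characters whose Unicode category is a letter (L*)
    or a combining mark (M*); everything else acts as a separator.
    """
    words, current = [], []
    for ch in text:
        if unicodedata.category(ch)[0] in ("L", "M"):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))

    seen, unique = set(), []
    for word in words:
        if word not in seen:
            seen.add(word)
            unique.append(word)
    return unique


print(extract_words("Hello, world! Hello again."))
print(extract_words("ଓଡ଼ିଆ ଭାଷା"))
```

Format characters such as the zero-width joiner are treated as separators in this sketch, so a real tool for a given script may need a more careful rule.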

Hi Psubhashish. This is really nice. Do you think it would be easy to adapt it to create a new generator? Generators can be used by anyone after they import them in their common.js. Pamputt (talk) 06:44, 14 March 2023 (UTC)
Thanks User:Pamputt. That would be fantastic, but I probably don't have the right knowhow for doing that. I did take ChatGPT's help to create a .js version from the HTML code I had shared earlier but would appreciate any help. I think having a tool inside Lingua Libre would be great so really liked the idea of new generators. Common users would like things well packaged rather than jumping from one platform to another. --Subhashish (talk) 13:09, 14 March 2023 (UTC)

Problem publishing recordings

Hello, a few years ago I renamed my account GamissimoYT to Manjiro91. Later, I renamed it Manjiro5. The problem is that the renaming of my global Wikimedia account was not carried over to Lingua Libre. As a result, I cannot publish the audio I record on LinguaLibre, and the recordings do not appear on Commons either. Could you help me? manȷıro💬 08:41, 26 April 2023 (UTC)

Rename a dialect to a language

Hello,

A few years ago, during my first attempts, I requested the addition of "Teochew dialect". However, it seems more appropriate to keep just "teochew", without the word "dialect". Would it be possible to make this change?

Assassas77 (talk) 19:41, 7 May 2023 (UTC)

Done. Solved here by User:Assassas77! It's a wiki :) Yug (talk)

MediaWiki:Lang/*

What are the MediaWiki:Lang/* messages for? For example, MediaWiki:Lang/awa? It looks like they mostly just repeat the language code in the content. --Amir E. Aharoni (talk) 07:21, 24 May 2023 (UTC)

Where are the Greek recordings?

According to the statistics page there are 130 recordings of the Greek language (Q205, ISO: gre). However there is no category commons:category:Lingua Libre pronunciation-gre defined or any recordings added to this category. There is a category commons:category:Lingua Libre pronunciation-ell, but it is empty. What happened to the 130 Greek recordings? Olaf (talk) 20:16, 9 June 2023 (UTC)

Hi Olaf, for some unclear (probably historical) reason, it seems that all Greek recordings are categorized in Category:Lingua Libre pronunciation-other. We have to move all these recordings to the right category (I do not know whether Commons has an automatic tool for such a job), and also redirect commons:category:Lingua Libre pronunciation-ell to c:category:Lingua Libre pronunciation-gre. Pamputt (talk) 07:24, 10 June 2023 (UTC)
Hi Pamputt. This happened because in wikidata:Q9129#P220 both ISO 639-3 codes are deprecated, and the entity:getBestStatements function, used in commons:Module:Lingua Libre record#L-46, doesn't return deprecated entries, so the module can't get the language code and falls back to the "other" category. We could change the Wikidata entry and the files would be moved automatically. However, the code "gre" must stay deprecated, because it is unclear whether it refers to ancient or modern Greek. It would be better to promote "ell" to a normal entry. Changes in Greek (Q205) would then also be needed. It looks like bulk moving Lingua Libre recordings around doesn't require admin rights, so I can fix this issue if you agree to change the Greek language code to "ell" instead of "gre". Olaf (talk) 08:46, 10 June 2023 (UTC)
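The fallback described in this reply can be illustrated with a small sketch. It mimics the documented rule of Wikibase's getBestStatements (preferred-rank statements if any exist, otherwise normal-rank ones, never deprecated ones). The data layout and the `commons_category_code` helper are hypothetical simplifications, not the actual code of Module:Lingua Libre record.

```python
def best_statements(statements):
    """Mimic Wikibase's rule: preferred if any, else normal, never deprecated."""
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


def commons_category_code(statements, fallback="other"):
    """Hypothetical helper: pick the language code used in the category name."""
    best = best_statements(statements)
    return best[0]["value"] if best else fallback


# Situation described above: both codes on the Greek item are deprecated.
greek_now = [
    {"value": "ell", "rank": "deprecated"},
    {"value": "gre", "rank": "deprecated"},
]
# Proposed fix: promote "ell" to normal rank and keep "gre" deprecated.
greek_fixed = [
    {"value": "ell", "rank": "normal"},
    {"value": "gre", "rank": "deprecated"},
]

print(commons_category_code(greek_now))    # other
print(commons_category_code(greek_fixed))  # ell
```

With every statement deprecated, nothing survives the filter, which would explain why the recordings land in the "other" category.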
Hi Olaf thank you for your investigation. So, I have modified Greek (Q205) to fix the issue on the Lingua Libre side. For Wikimedia Commons, you can go ahead. Pamputt (talk) 08:11, 18 June 2023 (UTC)
Thanks, Pamputt. It's not as easy as I thought. Changing the Greek ISO 639-3 code from deprecated to normal creates a constraint violation with Modern Greek, which has the same code. In fact, LinguaLibre shouldn't record Greek words as Greek (Q9129) but rather as Modern Greek (Q36510), and Modern Greek is also defined in LinguaLibre: Modern Greek (Q279). Olaf (talk) 13:26, 18 June 2023 (UTC)
If I understand correctly, the easiest way to manage this case would be to delete Greek (Q205), so that no one can record in "this language" and everyone selects Modern Greek (Q279) instead. If so, it would require replacing all Lingua Libre statements that use Greek (Q205) with Modern Greek (Q279). There are currently 137 items that use Greek (Q205), so I think it is manageable by hand. Olaf, what do you think about this "workaround"? Pamputt (talk) 16:48, 18 June 2023 (UTC)
This would be perfect, it also requires renaming the 137 recordings in Commons, but it can be done. What about the datasets to be downloaded from LinguaLibre, will they change automatically? Olaf (talk) 21:08, 18 June 2023 (UTC)
Olaf, Pamputt, I had a nearly similar case with the Chinese ISO codes zho vs cmn. I have about 186 zho items (see Help:SPARQL for maintenance) which have the wrong ISO code. My plan is:
  • to delete those audios, very simply, on both Lingualibre and Commons. The alternative would be to edit them all on both sites.
  • to discourage recording or delete that Lili Qid.
so I may work on those audio, some day... Hugo en résidence (talk) 17:36, 18 June 2023 (UTC)
I don't like deleting good recordings as a way of dealing with wrong categorization. Moreover some of them are probably in use, because Olafbot might have added them to Polish Wiktionary. If there is no other option, just leave them where they are in Commons, and remove Greek from Lingua Libre alone in favor of Modern Greek. But I think Pamputt's solution is better. Olaf (talk) 21:08, 18 June 2023 (UTC)
User:Olaf, I don't like it either. But 186 recordings is about 8 minutes of work, and it has been confusing us for 3 years. Do point to that. Yug (talk) 19:35, 20 June 2023 (UTC)
Deleting 186 recordings is about the same amount of time as modifying the language statement. This is manageable by hand and I would prefer not to delete them. I do not have time for now but I will try to do it before the end of the month. Pamputt (talk) 11:47, 21 June 2023 (UTC)

Any Recording limitation in Lingua Libre

Hello, I would like to know whether there is any recording limit in Lingua Libre, because I'm planning a screencast in the Tamil language. If anyone knows, please reply. Thank you Sriveenkat (🎤) (talk) 11:11, 1 August 2023 (UTC)

If you are not an autopatrolled user on Wikimedia Commons, then you cannot upload more than 380 audios per 72 minutes. If you want to record more words within this timeslot, then you should request this right. Pamputt (talk) 14:15, 1 August 2023 (UTC)
Hi, @Pamputt , I won't record 380 audios within 72 minutes. I'm planning to create a screencast tutorial video in the Tamil language, which is why I asked this question. Thank you for your reply Sriveenkat (🎤) (talk) 14:35, 1 August 2023 (UTC)

Exclusion list for generators?

Hello, if there isn't a feature like this somewhere already, I propose a per-user blacklist of sorts, which would allow users to select words to be excluded whenever they use one of the generator options to generate words. I'm currently going through a list of words in a Wiktionary category, and I'm confronted with a growing list of words that I can't deal with because they aren't suitable for pronunciation (e.g. particles that surround other arbitrary words), or they're just homophones of something I've already recorded, etc. What would be necessary, technically, in order to make this happen? Kiril kovachev (talk) 12:39, 10 August 2023 (UTC)

Hi Kiril kovachev, I have opened a Phabricator ticket for this request. If you know Javascript, you may have a look at the code to propose a patch. Pamputt (talk) 05:52, 15 August 2023 (UTC)
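As a rough illustration of the request in this thread, the filtering step itself is small. The sketch below is written in Python for readability (an actual patch would be in the Record Wizard's JavaScript), and the function name, the exact-match rule and the sample words are assumptions, not an existing Lingua Libre feature.

```python
def apply_exclusions(generated_words, excluded_words):
    """Drop every generated word that appears in the user's exclusion list.

    Assumption: matching is exact (same spelling, same case); the order of
    the remaining words is preserved.
    """
    excluded = set(excluded_words)
    return [word for word in generated_words if word not in excluded]


if __name__ == "__main__":
    from_generator = ["apple", "the", "pear", "of", "plum"]
    my_exclusions = ["the", "of"]  # hypothetical per-user list
    print(apply_exclusions(from_generator, my_exclusions))
```

The harder parts are likely elsewhere: where the per-user list is stored, and how the user adds words to it from the recording interface.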

Barnstar Award Template

Is there any Barnstar Award Template for Lingua Libre? Sriveenkat (🎤) (talk) 07:06, 13 September 2023 (UTC)

There are Template:50k barnstar and Template:Speaker of the month, and maybe others. WikiLucas00 may know other barnstars. Pamputt (talk) 21:11, 13 September 2023 (UTC)
@Pamputt & WikiLucas00 OK Pamputt, I want to give barnstar awards to some beginner speakers. It would be motivating for them. Am I right? Sriveenkat (🎤) (talk) 11:46, 14 September 2023 (UTC)
Hello @Pamputt & Sriveenkat ! Indeed, it would be a nice idea to offer awards for beginners, such as a barnstar for passing 1000 recordings for example. All the best — WikiLucas (🖋️) 16:08, 16 September 2023 (UTC)

1,000,000th

  • N ! 08:38 కంటగిల్లు (Q1094614)‎ diffhist +3,648‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటగించు (Q1094613)‎ diffhist +3,636‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటకితము (Q1094612)‎ diffhist +3,636‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటకుడు (Q1094611)‎ diffhist +3,624‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటక (Q1094610)‎ diffhist +3,588‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటబడు (Q1094609)‎ diffhist +3,612‎ V Bhavya talk contribs block ‎Created a new Item

Yug (talk)

Why isn't Lingua Libre Bot running on Wikidata?

@Poslovitch, Pamputt, & WikiLucas00 Why isn't Lingua Libre Bot running on Wikidata? Darafsh asked about it in the Wikidata Lexicographical data Telegram group. What's the problem? Please let us know what the issue is. Thanks-Sriveenkat () (talk) 16:12, 6 October 2023 (UTC)

@Sriveenkat could you point to a Lingua Libre item and a Wikidata item or lexeme that has not received the pronunciation? This will help to test and find what is wrong. Pamputt (talk) 19:22, 6 October 2023 (UTC)
Hi @Pamputt, recorded audios are not being added to Wikidata items and Wikidata lexemes. The user Darafsh has recorded many words for the Wikidata Lexeme project, but the audios were never added to the lexemes. You can see at wikidata:Special:Contributions/Lingua Libre Bot that the last contribution was at 23:49, 9 September 2023. So I am just asking for Lingua Libre Bot to be run on Wikidata. I have also recorded some words for the Wikidata Lexeme project. I waited for some days, but my audios were never added to the lexemes, so I ran QuickStatements to add them. User Darafsh is now also running QuickStatements to add his audios. I think many users rely on Lingua Libre to automatically add audios to Wikidata and some Wiktionaries. I hope you understand. Thank you. Regards, Sriveenkat () (talk) 05:38, 7 October 2023 (UTC)
Thanks to @Sriveenkat for starting the discussion. If you need some examples, you may see Mazanin's contributions on Commons. This is the recorded audio: [1] and this is the lexeme entry on Wikidata: [2] but they are not connected yet. Darafsh (talk) 12:07, 7 October 2023 (UTC)

SiteNotice

Hi,
Translations are not working for Sitenotice. Install CentralNotice? ―Eihel (talk) 14:31, 7 October 2023 (UTC)

Global bot status

Lingualibre Bot has been approved. cc @Pamputt, Poslovitch, & WikiLucas00 . Yug (talk) 12:31, 10 October 2023 (UTC)

Thank you for the request and congrats on the approval! — WikiLucas (🖋️) 12:40, 16 October 2023 (UTC)

ExternalTools - Wikidata Query Service - Recording Indian Actor and Actress Names in Tamil

@Yug, Pamputt, & WikiLucas00 I am now interested in recording Indian actor and actress names in Tamil. So I made a query and entered its URL in ExternalTools. An error appears: "Result must contain both "id" and "label" field." I think something needs to be modified in this query. Could anyone please help with this? Thanks Sriveenkat (talk) 19:58, 24 November 2023 (UTC)

@Sriveenkat , this works. Please note there are 6982 items if we remove the LIMIT, and I don't know how the system works with such a large list. Yug (talk) 23:13, 25 November 2023 (UTC)
@Yug Thanks for your reply. The query doesn't work for me :( Error in ExternalTools "undefine" Sriveenkat (talk) 06:03, 26 November 2023 (UTC)
@Sriveenkat , in Wikidata QS you have to run the query to check that it works and returns data. If so, go to the URL bar and copy that long URL. Come back to Lingualibre Step 3, external tool, and paste that long URL. It worked for me. Yug (talk) 06:00, 27 November 2023 (UTC)
@Sriveenkat Sorry, I missed something. On the Query Service bottom right, click "Link" > then on "SPARQL endpoint" : copy this url. Yug (talk) 08:25, 27 November 2023 (UTC)
@Yug It works when copying the SPARQL endpoint link. Thank you very much. I'm planning to record more proverbs, usage examples, places and persons; Lingualibre is really comfortable for recording them. Thanks again Sriveenkat (talk) 22:54, 27 November 2023 (UTC)
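For readers hitting the same error, the two points from this thread (the result needs an "id" and a "label" column, and the URL to paste is the SPARQL endpoint link) can be sketched as follows. The query is a hypothetical reconstruction, not the one Sriveenkat or Yug actually used; it assumes Wikidata properties P106 (occupation), P27 (country of citizenship) and the items Q33999 (actor) and Q668 (India). Whether ExternalTools expects the full entity URI or a bare QID in "id" is not confirmed here.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical query: the variables are named ?id and ?label on purpose,
# because the error message asks for exactly those two fields.
QUERY = """
SELECT ?id ?label WHERE {
  ?id wdt:P106 wd:Q33999 ;     # occupation: actor
      wdt:P27 wd:Q668 ;        # country of citizenship: India
      rdfs:label ?label .
  FILTER(LANG(?label) = "ta")  # keep Tamil labels only
}
LIMIT 100
"""


def sparql_endpoint_url(query: str) -> str:
    """Build the endpoint-style URL (the "Link" > "SPARQL endpoint" form)."""
    return ENDPOINT + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    print(sparql_endpoint_url(QUERY))
```

This only builds the URL and makes no network request. The address shown in the browser bar of the query editor is a different link (it points at the editor page, not at the result data), which may be why pasting it did not work.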

Logo redesign propositions

I had a bit of fun yesterday contributing to one of my favourite projects in a slightly different way. I've kept the ideas (microphone, wings) and colours of the current logo but made it a bit more polished. I've already taken a few opinions on Discord but I wanted to get a more general opinion. What do you think?

Just so you know, I won't be at all offended if the community prefers to keep the current logo, because there are some very good reasons for keeping it (I'm thinking in particular of all the printed materials, the fact that it's simple (easy to draw by hand if we don't have a printer and maybe more "readable" if very small), its declination for sign languages, etc.).

DSwissK (talk) 08:59, 3 December 2023 (UTC)

@DSwissK hello,
We can add your proposal to the set of logo ideas in the Wikimedia Commons Category:Proposed Lingua Libre logo, for later reference. But to be honest, good logo design requires design experience, artistic intuition, and brand and public awareness, which are harder to gather than it seems. It also must fit a project's phase and branding strategy: the project has to need a new logo, and project members have to be willing to shift from the current high-visibility logo to a new one. All together, changing a logo is not something easy to push for. I gave a similar answer here a few months ago about Lingua Libre SignIt. Yug (talk) 12:23, 4 December 2023 (UTC)
@Yug hi,
Thank you for your input. I appreciate you explaining the complexities - you raise great context I had not fully considered. DSwissK (talk) 09:05, 6 December 2023 (UTC)

Hebrew diacritics (Niqqud)

In Hebrew we use diacritics (Niqqud) to determine how to pronounce the words.

Niqqud is usually common in the following cases:

  1. Young kids or people learning the language.
  2. Formal use.
  3. To distinguish between meanings when the base form is ambiguous.

This is a short example:

  • Base form: גזר (GZR)
  • Carrot: גֶּזֶר (Gezer)
  • Masculine cut: גָּזַר (Gazar)
  • Piece: גֶּזֶר (Gezer)

This is the corresponding Wiktionary article: https://he.wiktionary.org/wiki/גזר

When fetching words from Wiktionary, it's better to use the first headers instead of the item names, because in many cases the term is ambiguous and the item's name is the base form without any pronunciation guidance.

As for Wikipedia etc. sometimes there's a word with the Niqqud inside the article but it will be a bit complicated to parse so we can skip that for now.
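The ambiguity described above can be shown with a short sketch: once the Niqqud is stripped, the vocalized forms from the example collapse into the same base form, so the base form alone cannot tell a speaker which pronunciation to record. The helper names are hypothetical, and the sketch assumes that all Niqqud marks are Unicode nonspacing marks (category "Mn").

```python
import unicodedata
from collections import defaultdict


def strip_niqqud(word: str) -> str:
    """Remove Hebrew points (Niqqud, dagesh, etc.), keeping base letters."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def group_by_base_form(vocalized_words):
    """Map each unpointed base form to the vocalized forms that share it."""
    groups = defaultdict(list)
    for word in vocalized_words:
        groups[strip_niqqud(word)].append(word)
    return dict(groups)


if __name__ == "__main__":
    # Vocalized forms taken from the example in this post.
    forms = ["גֶּזֶר", "גָּזַר"]
    for base, variants in group_by_base_form(forms).items():
        print(base, "<-", ", ".join(variants))
```

A word list built from the vocalized headers keeps these forms as separate entries, while a list built from page titles would merge them into a single ambiguous one.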

Lights on userrights

Hello all,
I bumped again into LinguaLibre:User_rights and {{Autopatrolled}}. To the best of my knowledge we have no solution to this and no active user is monitoring this bottleneck. Is this assessment correct? Yug (talk) 21:03, 28 December 2023 (UTC)

A mobile app

I personally think that contributing through a browser is quite risky. Firefox on mobile, for example, has a very strict page unloading policy, which can close the tab during an upload, so the data not yet uploaded is lost (I found a workaround, but it's not perfect). Are there any thoughts about this? (Maybe even expanding the current CV Project app by Saverio Morelli?)

Is the Record Wizard not working for anyone else?

My mic works with mictests.com, but the RecordWizard doesn't pick anything up at the "check your microphone" stage. I've tried on both my phone and my laptop, and I can record sound in both cases, and I have the appropriate permissions enabled, but this particular website isn't detecting sounds. Is anyone else having this kind of problem? Grendelkhan (talk) 23:43, 24 February 2024 (UTC)

Hello User:Grendelkhan,
I just received a second such report. That user also checked mictests.com successfully.
On Firefox, Lingua Libre recording studio step 4, the microphone is allowed (we see the red microphone image on the left of the URL address). But after clicking the record button, no recording occurs.
  • Mictests on other site : successful.
  • Device: Notebook
  • OS: ?
  • Browser: Firefox, Chrome.
  • User: User:Akamycoco.
  • Languages affected: all.
  • Dates : Worked on February 28. Stopped working on February 29.
Let's start an investigation. Could you let me know your OS and precise web browser version? (Help > About Chrome or similar)
Let me know as well if you have the basic developer skills to right-click on the stalled page > Inspect > Console: are there any error messages? Yug (talk) 07:55, 1 March 2024 (UTC)
My laptop is using Google Chrome 122.0.6261.94 (Official Build) (64-bit) on Linux (Debian Testing). No error messages in the console when I attempt the recording. My phone is using Chrome 122.0.6261.90 on Android 14 on a Pixel 5a. It does seem to work on Firefox 115.7.0esr (64-bit) on my laptop. (I really should have checked that before.) So maybe this is solely a Chrome problem? Grendelkhan (talk) 16:30, 2 March 2024 (UTC)

Automatic categorization isn't documented.

So far as I can tell, this isn't documented: if, for user Foo, category Lingua Libre pronunciation by Foo exists on Commons, then all uploads will be categorized into that category. This is helpful! It's also easy to backfill after the fact using commons:Help:Gadget-Cat-a-lot. I'm not sure where to document this, but it seems reasonable to do so somewhere. Grendelkhan (talk) 16:26, 3 March 2024 (UTC)

Understanding lingua-libre

Hi, I am creating this discussion to understand lingua-libre better

Uploads are failing

TLDR: A large number of users report upload failures at step 5: Grendelkhan, Culex, XANA000, Ardzun (Indonesian languages), Penn Zero MSSJ, User:Univòc64 (Whistled Occitan) and User:Akamycoco (Taiwanese languages). This is likely only the tip of the iceberg. Only a few users were able to record in May, with an atypically low number of recordings. An Indonesia workshop with ~15 participants was critically affected. Investigation ongoing. Hugo en résidence (talk) 14:20, 13 May 2024 (UTC)

I can record words, but uploading them to Commons fails. The JavaScript console has the following message:

Your IP address is in a range that has been blocked on all Wikimedia Foundation wikis. The block was made by ‪EPIC‬. The reason given is Open proxy/Webhost: See the help page if you are affected. * Start of block: 10:09, 1 May 2024 * Expiry of block: 10:09, 1 May 2027 Your current IP address is 2001:41d0:304:100::4790. The blocked range is ‪2001:41D0:0:0:0:0:0:0/33‬. Please include all above details in any queries you make. If you believe you were blocked by mistake, you can find additional information and instructions in the No open proxies global policy. Otherwise, to discuss the block please post a request for review on Meta-Wiki. You could also send an email to the stewards VRT queue at "stewards@wikimedia.org" including all above details.`, blockinfo: {…}, "*": "See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."

This is not my IP address shown in the error message, and whatismyip confirms that I'm not behind a proxy. The Global block request is here. Is this affecting anyone else? I lost a heap of recordings. Grendelkhan (talk) 22:26, 4 May 2024 (UTC)
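The block message can be checked against the reported address with Python's standard `ipaddress` module. This is a minimal sketch: the address and range are copied from the error message above, and the helper name `is_in_blocked_range` is made up for the illustration.

```python
import ipaddress


def is_in_blocked_range(address: str, blocked_ranges) -> bool:
    """Return True if the address falls inside any of the given CIDR ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(cidr) for cidr in blocked_ranges)


# Values copied from the error message in the JavaScript console.
REPORTED_ADDRESS = "2001:41d0:304:100::4790"
BLOCKED_RANGE = "2001:41D0:0:0:0:0:0:0/33"

if __name__ == "__main__":
    print(is_in_blocked_range(REPORTED_ADDRESS, [BLOCKED_RANGE]))  # True
```

The address in the message does fall inside the blocked range. Since it is not the user's own address, it presumably belongs to whichever server sends the upload request to Commons.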

Uploads are failing for me today too, even though I am recording with my account. Culex (talk) 15:04, 8 May 2024 (UTC)
Same here. --XANA000 (talk) 16:49, 9 May 2024 (UTC)
I can record, but I still can't upload as of today. I was able to upload once yesterday, but after that I couldn't upload any more. Ardzun (talk) 06:04, 11 May 2024 (UTC)
I guess I'm not the only one who's been trying for weeks but could not publish audio after 1 May. Hope someone can fix it. Penn Zero MSSJ (talk) 20:54, 13 May 2024 (UTC)
User:Univòc64 (Whistled occitan) and User:Akamycoco (Taiwanese languages) also reported issues.
It seems time to add a sitenotice warning. Hugo en résidence (talk) 14:07, 13 May 2024 (UTC)
In May we have mostly: 556 recordings by 7 users on May 1st, 174 recordings on May 11th (Austin Zhang), then nothing.
If we compare with known monthly recordings, our recent average month was 30k audios and the lowest ones were 5k audios. May 2024 is heading toward 1200 audios, about 4% of the average month and about 24% of the lowest months. Something weird is going on indeed.
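The comparison can be reproduced with a few lines of arithmetic. The three figures (a projected 1200 audios for May 2024, a recent average of about 30k per month, and low months of about 5k) are the rounded values quoted in this post, not numbers recomputed from the monthly data.

```python
def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)


projected_may_2024 = 1200    # rough projection quoted in the post
average_month = 30_000       # recent average month, as quoted
lowest_month = 5_000         # typical low month, as quoted

print(share_percent(projected_may_2024, average_month))  # 4.0
print(share_percent(projected_may_2024, lowest_month))   # 24.0
```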
Months since 2022:
{ date:2022-01, records: 21290, speakers: 46, languages: 28 },
{ date:2022-02, records: 3894, speakers: 40, languages: 17 },
{ date:2022-03, records: 8357, speakers: 61, languages: 21 },
{ date:2022-04, records: 5454, speakers: 34, languages: 18 },
{ date:2022-05, records: 4702, speakers: 59, languages: 30 },
{ date:2022-06, records: 7675, speakers: 41, languages: 18 },
{ date:2022-07, records: 4364, speakers: 37, languages: 22 },
{ date:2022-08, records: 9544, speakers: 45, languages: 23 },
{ date:2022-09, records: 5802, speakers: 113, languages: 30 },
{ date:2022-10, records: 6931, speakers: 74, languages: 32 },
{ date:2022-11, records: 8461, speakers: 54, languages: 34 },
{ date:2022-12, records: 11882, speakers: 54, languages: 23 },
{ date:2023-01, records: 18150, speakers: 48, languages: 29 },
{ date:2023-02, records: 32441, speakers: 65, languages: 29 },
{ date:2023-03, records: 11527, speakers: 61, languages: 30 },
{ date:2023-04, records: 8451, speakers: 58, languages: 35 },
{ date:2023-05, records: 21282, speakers: 97, languages: 49 },
{ date:2023-06, records: 17940, speakers: 56, languages: 35 },
{ date:2023-07, records: 75825, speakers: 74, languages: 38 },
{ date:2023-08, records: 32681, speakers: 54, languages: 30 },
{ date:2023-09, records: 28813, speakers: 114, languages: 30 },
{ date:2023-10, records: 60317, speakers: 167, languages: 47 },
{ date:2023-11, records: 49704, speakers: 140, languages: 55 },
{ date:2023-12, records: 42383, speakers: 114, languages: 41 },
{ date:2024-01, records: 40572, speakers: 112, languages: 40 },
{ date:2024-02, records: 22385, speakers: 197, languages: 57 },
{ date:2024-03, records: 16997, speakers: 173, languages: 48 },
{ date:2024-04, records: 8733, speakers: 117, languages: 42 },
{ date:2024-05, records: 556, speakers: 7, languages: 7 }
Daily recordings over April and May 2024: the series stops on 2024.05.01.
Note: Austin Zhang recorded 174 audios on 05.11

Yug (talk) 10:39, 14 May 2024 (UTC)

Fixed

Both IP ranges 2001:41D0:0:0:0:0:0:0/32 and 2001:41D0:0:0:0:0:0:0/33 were subject to a global Wikimedia block at one point (see Global ban range_2001:41D0:0:0:0:0:0:0/32). Following our request, the ban has been reconfigured and uploads from LinguaLibre are possible again. Yug (talk) 10:38, 14 May 2024 (UTC)

I can record and upload since yesterday with my account, so that seems fixed. But it seems the stats are still not updated. Culex (talk) 12:08, 15 May 2024 (UTC)

Logs

For reference, I investigated the relevant block logs and upload logs for May 2024.
Conclusion: the collapse in uploads is consistent with the IP ban. Still, given the bug reports from Akamycoco in *March* and 咽頭べさ on step 4, I suspect other bugs are lingering around.

Global IP bans

  • 18:46, 13 May 2024 EPIC talk contribs changed global block settings for 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (anonymous users only) (No open proxies )
  • 00:51, 10 May 2024 AmandaNP talk contribs globally blocked 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (No open proxies )
  • 17:02, 9 May 2024 EPIC talk contribs changed global block settings for 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (anonymous users only) (Open proxy/Webhost: See the help page if you are affected)
  • 17:09, 1 May 2024 EPIC talk contribs blocked 2001:41d0::/33 talk with an expiration time of 2 years, 364 days, 12 hours, 21 minutes and 36 seconds (anonymous users only, account creation disabled) (Open proxy/Webhost: See the help page if you are affected)
  • 17:09, 1 May 2024 EPIC talk contribs globally blocked 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (Open proxy/Webhost: See the help page if you are affected)

Lingualibre uploads logs

13 May 2024

  • [... Many more uploads]
  • Upload log 23:39 Elwinlhq talk contribs uploaded File:LL-Q5218 (que)-Elwinlhq-apaqay.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 19:05 Assassas77 talk contribs uploaded a new version of File:LL-Q9192 (cmn)-Assassas77-八角.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 19:05 Assassas77 talk contribs uploaded File:LL-Q9192 (cmn)-Assassas77-八角.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 16:38 Oh! Tea[1] talk contribs uploaded File:LL-Q36759-Austin Zhang-sih8 buh8 sah8 nah4.wav ‎ Tag: Lingua Libre [2.2]

11 May 2024

  • Upload log 20:21 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-buah8.wav ‎ Tag: Lingua Libre [2.2]
  • [... +172 recordings by User:Oh! Tea]
  • Upload log 18:56 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-a2.wav ‎ Tag: Lingua Libre [2.2]

10 May 2024

  • Upload log 06:08 CapitainAfrika[2] talk contribs uploaded File:LL-Q36217 (lin)-CapitainAfrika-Wiki na monɔkɔ mua bísó.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 00:14 Ardzun[3] talk contribs uploaded File:LL-Q13324 (min)-Ardzun-mada.wav ‎ Tag: Lingua Libre [2.2]

9 May 2024

  • Upload log 17:08 Àncilu[4] talk contribs uploaded File:LL-Q652 (ita)-XANA000-orsù.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 17:05 Àncilu talk contribs uploaded File:LL-Q652 (ita)-XANA000-frac.wav ‎ Tag: Lingua Libre [2.2]

5 May 2024

  • Upload log 21:15 Benoît Prieur[5] talk contribs uploaded File:LL-Q8785 (hye)-Benoît Prieur-Artsakh.wav ‎ Tag: Lingua Libre [2.2]

1 May 2024

  • Upload log 16:09 Penn Zero MSSJ[6] talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hệ số.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hỗn số.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hằng đẳng thức.wav ‎ Tag: Lingua Libre [2.2]
  • [... Many more uploads]

Yug (talk) 10:38, 14 May 2024 (UTC)
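For anyone who wants to analyse such upload logs, the file names above follow a visible pattern: "LL-", the language Qid, an optional ISO code in parentheses, the speaker name, then the transcription. The sketch below parses that pattern. The regular expression is inferred only from the file names listed in this thread, so it is an assumption rather than an official specification; speaker names containing a hyphen, for example, would be split incorrectly.

```python
import re
from typing import NamedTuple, Optional


class Recording(NamedTuple):
    language_qid: str
    iso_code: Optional[str]  # absent in some file names
    speaker: str
    transcription: str


# Pattern inferred from the log entries above (an assumption, not a spec).
FILE_NAME = re.compile(
    r"^(?:File:)?LL-(Q\d+)(?: \(([a-z]+)\))?-([^-]+)-(.+)\.wav$"
)


def parse_file_name(name: str) -> Optional[Recording]:
    """Split a Lingua Libre file name into its parts, or return None."""
    match = FILE_NAME.match(name.strip())
    if match is None:
        return None
    return Recording(*match.groups())


if __name__ == "__main__":
    print(parse_file_name("File:LL-Q5218 (que)-Elwinlhq-apaqay.wav"))
    print(parse_file_name("File:LL-Q36759-Austin Zhang-buah8.wav"))
```

Note that the speaker in the file name is not always the uploading account: in the entries above, the account Oh! Tea uploads recordings whose speaker is Austin Zhang.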