LinguaLibre

Chat room

Revision as of 10:38, 14 May 2024 by Yug (talk | contribs) (→‎Logs)

Welcome to the Chat room! This is the place to discuss any and all aspects of Lingua Libre: the project itself, operations, policy and proposals, technical issues, etc. Other forums include for code-oriented issues, . Feel free to participate in any language you want to.

Chat rooms in various languages:
English · 🌐

Chatroom FAQ

How to download all audios of one language? By speaker?

Datasets are available here. A script updates the datasets every 2 days, using CommonsDownloadTool. For more, see Help:Download datasets.

How to add missing languages?

Administrators can add new languages on request, usually within a few days. Please provide your language's ISO 639-3 code and/or its Wikidata ID. For more, see Help:Add a new language.

How to keep my wikimedia project up to date?

Contact Poslovitch, the master of Lingua Libre Bot. For more info, check out Help:Bots and LinguaLibre:Bot.

What IRL events are coming? When? Where?

Please see LinguaLibre:Events.

How to translate LinguaLibre User Interface into a new language?

Go to translatewiki.net. For more, see Help:Translate.

How to archive sections which have been answered?

After reviewing the section, add {{done}} ~~~~ to the top of the section. After a few days to 2 weeks, move the section's code to [[LinguaLibre:Chat_room/Archives/year]].

Archives
2023 · 2022 · 2021 · 2020 · 2019 · 2018

Results of Coverage Test of French Lemma and Non-Lemma forms in English Wiktionary

While playing around with generating lists for pronunciation from Wiktionary, I decided to run a few tests on the current coverage of French lemma and non-lemma forms in English Wiktionary. I chose French because it is the largest dataset in LL.

Current Coverage of French in Lingua Libre

  • Total French Entries in Lingua Libre by a native speaker: 233 982
  • Unique French Entries in Lingua Libre by a native speaker: 154 358
  • Percentage of overlap: 34%
  • Term with the greatest number of pronunciations: "blanc" with 40

Current Coverage of Category:French lemmas

  • Total entries in Category:French lemmas: 84 482
  • Entries with pronunciation: 50 917
  • Entries without pronunciation: 33 565
  • Coverage Percentage: 60.27%

Current Coverage of Category:French non-lemma forms

  • Total entries in Category:French non-lemma forms: 291 225
  • Entries with pronunciation: 26 791
  • Entries without pronunciation: 264 434
  • Coverage Percentage: 9.20%

For me, there are several lessons to be drawn.

  1. First, there has been amazing growth on LL. Covering 60.27% of French lemmas is a real achievement.
  2. The overlap percentage is quite small overall.
  3. There needs to be a clearer sense of when LL should stop requesting pronunciations for a certain term because 40 pronunciations of "blanc" seems a bit excessive.
  4. A need exists to continue pro-actively targeting entries in Wiktionary that are not in Lingua Libre. Currently, 297 999 French lemma and non-lemma forms require pronunciations.
  5. Generating lists from Wiktionary and checking coverage is not as hard as I thought.
  6. Lingua Libre has almost caught up with Forvo in the number of French pronunciations (233 982 vs 254 703). Overall, Lingua Libre has shown amazing and healthy progress in a very short period of time. I'm excited about these results. Languageseeker (talk) 03:07, 1 June 2022 (UTC)
@Languageseeker This investigation is pretty cool. (I'm not sure I understand all your numbers yet, but I will read again when back on my PC.) It's quite nice to see we are reaching Forvo's level for our lead language. It's possible we have more unique words than Forvo, since we have user:Olafbot actively guiding and pushing us on that path.
On Lili we have chosen to be a learning AND linguistic diversity audio database. When you account for gender, regional accents, age and voice type, having 40 French audios for a word is still 400+ voices short.
Also, not all contributors are able to contribute perfect audio files, due to various shortcomings (hardware, no recording room, no noise-cancelling system, etc). We lack a proper rating and review system. It's on our [slow] roadmap tho. 😉
PS: Should I answer you in French? I get the feeling you are French or learning it. Yug (talk) 15:07, 1 June 2022 (UTC)
@YUG Hi, Yug. Yes, I am learning French. As we discussed during our meeting, it is hard to define the limits of a language. As I see it, lemma forms are not enough. Right now I am building an Olafbot on steroids for French. My plan is to write a Python program that can parse the templates used on Wiktionary. Languageseeker (talk) 15:48, 7 June 2022 (UTC)
Hi @Languageseeker . I'm sorry I did not visit the Chat Room in a long time, and missed your report. Very interesting, good job! I remember a request I made to Olaf some time ago: it would be interesting to have a list similar to the one Olafbot is updating, but containing only lemmas of the target language (to quickly have nearly all lemmas of a dictionary illustrated with an audio pron). Also, I suggest you to use the categories of the French version of Wiktionary when you plan to work on French (and some other languages, that are more extensively described there). As you can see here, the category gathering French lemmas is more than 3 times more complete on the fr. version than on the en. version of Wiktionary. As you mentioned, these numbers are exciting, let's keep up the good work! All the best — WikiLucas (🖋️) 15:47, 26 November 2022 (UTC)
@WikiLucas00 Sorry, I totally forgot about your request. The list is now ready for French: List:Fra/Filtered-lemmas-without-audio-sorted-by-number-of-wiktionaries. It's produced like the other lists, but it's limited to words from Catégorie:Lemmes_en_français. The list will be refreshed together with the rest. Olaf (talk) 16:54, 14 May 2023 (UTC)
Hello @Olaf ! Thank you so much for this list, it's going to be very useful for sure! Let's cover 100% of Lemmas 😎 I'll tell the French contributors on Discord about it 😉 All the best — WikiLucas (🖋️) 22:18, 20 May 2023 (UTC)
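For anyone who wants to re-check the percentages reported at the top of this thread, here is a minimal Python sketch. It only redoes the arithmetic on the counts quoted above, reading the non-lemma total as 291 225 (the only value consistent with the other counts). The variable names are illustrative and nothing is fetched from Wiktionary or Lingua Libre.

```python
# Counts as reported in this thread (1 June 2022); nothing here queries a live service.
total_recordings = 233_982      # French recordings by native speakers
unique_terms = 154_358          # distinct French terms among them

lemmas_total, lemmas_pronounced = 84_482, 50_917
non_lemmas_total, non_lemmas_pronounced = 291_225, 26_791


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of recordings that repeat an already recorded term.
overlap = pct(total_recordings - unique_terms, total_recordings)

lemma_coverage = pct(lemmas_pronounced, lemmas_total)
non_lemma_coverage = pct(non_lemmas_pronounced, non_lemmas_total)

# Entries still lacking a pronunciation, lemmas and non-lemma forms combined.
missing = (lemmas_total - lemmas_pronounced) + (non_lemmas_total - non_lemmas_pronounced)

print(overlap, lemma_coverage, non_lemma_coverage, missing)
```

The same two-line pattern (count with audio divided by category size) should work for any other language once the category sizes are known.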

How to create user page

Hello, my user name is Ngangaesther, from Kenya. I am still stuck on how I am supposed to create my user page. Kindly help. Regards, Esther

Odia language missing from Stats/Languages

Hi there, for some reason, the Odia-language stats are missing from the Stats/Languages page. Also, "The most prolific speakers for the current month " section in the Stats/Speakers page is not loading at all since the time I checked last (about 10 days). I have tried on Chromium and Firefox and the result is the same even after clearing cache. --Subhashish (talk) 19:40, 28 July 2022 (UTC)

Hello Subhashish, it should be back online. We had a hackathon to put it back. We are calling for devs to push forwards. Yug (talk) 11:07, 10 August 2022 (UTC)
Thank you for the update, Yug. --Subhashish (talk) 14:00, 10 August 2022 (UTC)

Manually-coded languages

I came across meta:Lingua Libre/SignIt recently (via betawiki) and was wondering if manually-coded languages would be appropriate for this as well? These are languages in sign modality, but strongly tied to a spoken/written language; they usually adopt the grammar of the nonmanual language, choosing instead to simply transpose the vocabulary. This means they are most often used in application-specific and pidgin contexts (Pidgin Sign for English and diver's signs are examples). In particular, I am interested in toki pona luka, a manual form of toki pona (Q338540). Since the vocab is the same as spoken/written toki pona, there are a minimal number of lexemes overall, so having a complete set of signs is easily achievable. Manually-coded languages including toki pona luka are generally not given a separate ISO 639 code since they are in effect equivalent to scripts. Would this cause a problem for the infrastructure as currently designed? Arlo Barnes (talk) 05:56, 17 August 2022 (UTC)


Hello Arlo Barnes,

I understand "manually coded languages" as synonymous to "signed languages", am I correct?
If there is no distinct ISO for the signed language, we could still:

  • Create a new wikidata item without ISO, which will be used as identifier by LinguaLibre infrastructure
  • Use the spoken/written language's ISO code, and create lists of words all suffixed by (signed).

Either of those solutions could work.

If you have some knowledge of signed toki pona luka please let me know. We are adding features on Lingualibre and SignIt in order to be able to record video of signed words by late 2022. We are almost there. If you would like to record some basic signed words to share with the world, then let me know. Yug (talk) 20:58, 17 August 2022 (UTC)

Signed languages and manually-coded languages share similarities (the manual modality) and differences (since sign languages are 'native' to the signed modality, they use it more fully, having complete deixis and time-reference systems, use of handshape classifiers, etc.) -- 'luka' means 'hand'/'five', so that's the part of the name that indicates the manual modality, but otherwise it's just garden-variety toki pona. I am interested in using SignIt to record this vocab, yes. The '(signed)' suffix seems like a good way to do it. Arlo Barnes (talk) 13:16, 19 August 2022 (UTC)
Arlo Barnes: We increasingly have tools to update and correct sign language recordings, so if the suffix (signed), or whichever solution we choose, turns out to be incorrect, we can still correct it later with those tools.
I would encourage you to first train yourself and learn that manually-coded language over the coming months. Indeed, we still have one last bug in our video recording chain, which makes valid videos appear as audio on Commons. We expect to solve this last issue this fall (September or October?). So for now, I encourage you to rest well and recharge, to be ready to record later this year. Maybe identify a suitable place near you with a clean monochrome wall to film against, or consider building yourself a low-cost recording studio. We can discuss how to keep it low-cost and effective if you are interested, as I'm also looking for such walls and/or considering building one for myself.
See also : Minimal Sign Language Studio guideline. Yug (talk) 22:30, 19 August 2022 (UTC)

Update my username

I have changed my Wikimedia username but the previous name still appears in Lingua Libre. I know it's not included in unified logins. Anyway, please update my username to Aishik Rehman. Hirok Raja (talk) 15:14, 1 September 2022 (UTC)

Hi Hirok Raja, would you have an example of what you would like to see changed? I think you are talking about the filename but I am not sure, so with one example it would be clearer. Pamputt (talk)
@Pamputt
1. Top menubar of lingualibre.org showing 'Hirok Raja' as my profile name.
2. After uploading when I try to check my uploads in Commons, it takes me to https://commons.m.wikimedia.org/wiki/Special:ListFiles/Hirok_Raja page.
3. 'Hirok Raja' being used as Default recorder in the file names and description
4. Changing the speaker name to 'Aishik Rehman' every time I record is quite annoying.
5. Even here 'Hirok Raja' is showing as my signature by default ): Hirok Raja (talk) 19:16, 2 September 2022 (UTC)
I suspect this is due to long-lived cookies. It would be worth clearing your Lingualibre cookies: this will log you out, after which you can log back in here. On Firefox:
Open about:preferences#privacy > Go to "Cookies and Site Data" > Click "Manage Data" > Search "Lingualibre" > Remove selected. Yug (talk) 21:10, 2 September 2022 (UTC)

Siège communautaire de Wikimédia France – ouverture du vote / Community representative to Wikimédia France’s board - votes are opened

(English version below. Do not hesitate to correct my English translation.)

(Message copié depuis le bistro du jour par Lepticed7 (talk))

Bonjour,

En tant que président de la commission électorale pour l'élection du siège communautaire au conseil d'administration de Wikimédia France, je vous annonce que le vote ouvre aujourd'hui (13 septembre) à 0h CEST. Il se terminera le 26 septembre à 23h59 CEST.

Comme il y a trois ans, le scrutin est public sur Meta. Les pages de votes sont disponibles dans la catégorie correspondante ou en lien sur la page principale. C'est un scrutin par approbation, le candidat qui aura le plus grand nombre de voix sera donc déclaré élu. Vous pouvez voter pour autant de candidats que vous le souhaitez.

Si vous avez des questions, vous pouvez les poser sur la page de discussion ou par courriel à election@wikimedia.fr.

Pour la commission électorale, Mathis B, le 12 septembre 2022 à 22:00 (CEST)


(Message copied from the French Wikipedia Bistro by Lepticed7 (talk))

Hello,

As chair of the electoral commission for the election of the community representative to Wikimédia France's board, I announce that voting opens today (13 September) at 0:00 CEST. It will close on 26 September at 23:59 CEST.

As was the case three years ago, voting is public on Meta. Voting pages are available in the corresponding category or as links on the main page. This is approval voting, so the candidate with the most votes will be declared elected. You can vote for as many candidates as you wish.

If you have any questions, you can ask them on the Talk page on Meta, or by email at election@wikimedia.fr.

For the electoral commission, Mathis B, 22:00, 12 September 2022 (CEST)

Is there a way to exclude username from Wikimedia Commons upload file name?

See also Help:Renaming.

This seems redundant and takes up a lot of space --Middle river exports (talk) 20:22, 9 October 2022 (UTC)

@Middle river exports Welcome MRE,
You could name your speaker with a single character I guess.
But keeping the name is intentional. Each speaker has his/her own voice, which we want to document. If, outside of Wikimedia, you want to remove part of the filename, we have a technical tutorial to do so. See Help:Download datasets and Help:Renaming. Ping us back if your dataset is not up to date. Yug (talk) 13:16, 10 October 2022 (UTC)
I have solved this now by just changing my username to something shorter. This way I can upload English as Usmaan (عثمان) for example where instead of just repeating the username it shows two scripts which is more useful. (Apparently few enough people have Arabic script usernames that short common words are mostly available.) --عثمان (talk) 20:23, 10 October 2022 (UTC)
All Unicode characters should be ok, in words and usernames ;) Yug (talk) 19:46, 11 October 2022 (UTC)
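For renaming downloaded files outside of Wikimedia, here is a minimal Python sketch. It assumes a filename shaped like "LL-Q150 (fra)-Speaker-word.wav", which is an assumption of this example and is not stated in this thread; Help:Renaming remains the reference. The function name strip_speaker is hypothetical. The speaker name is passed in explicitly because both speaker names and words may themselves contain hyphens.

```python
def strip_speaker(filename, speaker):
    """Remove the '-<speaker>-' segment from a downloaded filename.

    Assumes (hypothetically) the pattern 'LL-Q<id> (<iso>)-<speaker>-<word>.<ext>'.
    The filename is returned unchanged when the segment is not found.
    """
    marker = "-" + speaker + "-"
    head, sep, tail = filename.partition(marker)
    if not sep:
        return filename          # speaker segment absent: leave the name alone
    return head + "-" + tail     # keep a single hyphen between prefix and word


print(strip_speaker("LL-Q150 (fra)-Yug-bonjour.wav", "Yug"))
```

This only builds the new name; actually renaming files on disk would be a separate step on a local copy of the dataset.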

Username update request

I realised my username on Mediawiki didn't carry over here when I changed it. On this site could I please have it changed to: عُثمان --عثمان (talk) 08:45, 10 November 2022 (UTC)

Data on LinguaLibre:Stats isn't consistent with Wikimedia Commons's Category

On the Stats page, French has 254,387 records

https://lingualibre.org/wiki/LinguaLibre:Stats/Languages

Meanwhile, the Category on commons.wikimedia.org has 253,464 records

https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-fra

The stats display more records. This data inconsistency is strange. -- User:Shenlebantongying, 10:36, 23 December 2022.

This means some item pages exist here with no corresponding audio on Commons.
Item creation here and upload are done at step 5 of the recording, nearly simultaneously.
So I don't know what is going on. Yug (talk) 17:41, 26 December 2022 (UTC)

c:Category:Lingua Libre pronunciation-bxg

All files in this category are tagged with the wrong language. I have requested moves for files in the category, but what more needs to be done?--GZWDer (talk) 13:05, 12 January 2023 (UTC)

Thanks for reporting. Actually all these items are erroneous (see Special:WhatLinksHere/Q590228).
I have not yet checked whether the corresponding recordings are still on Commons. Pamputt (talk) 16:11, 13 January 2023 (UTC)

I cannot publish my recordings made via Lingua Libre

Dear Colleagues,

It records, but when I press the button to publish on Wikimedia Commons, it does not work. It returns "Retry failed upload". Any idea? Thank you. Key Mîrza (talk) 05:09, 28 January 2023 (UTC)

Is it happening for all your recordings or only some of them? Pamputt (talk) 08:49, 28 January 2023 (UTC)
It was all good until a month ago. I am currently on vacation in another city and trying to log in to my account and make some more recordings. I can log in and I can create recordings, but I cannot publish them. I am stuck at the publishing stage: none of my recordings get published. I even tried recording from my cell phone, and nothing is published there either. By the way, I just saw your earlier message welcoming me. Thank you for your kind wishes. Best wishes... Key Mîrza (talk) 09:57, 28 January 2023 (UTC)
Hmmm, I do not know what to say. Sometimes some recordings fail to upload while others succeed. When no recording uploads at all, I do not know what the cause could be. Could you try with another web browser (Firefox or Chrome)? To go further, I think we would need a JavaScript expert who might have some hints. @Poslovitch & Lepticed7 maybe? Another question: how many words are you trying to record? If it is a lot, could you try with only a few (fewer than 10, for example)? Pamputt (talk) 15:42, 28 January 2023 (UTC)
I tried 11 words together, then just 1 word for testing purposes. Nothing worked. You said Java. Do I need Java to be able to work with the application? If so, I need to install it: I formatted my PC, so it may not be installed. Thank you. Key Mîrza (talk) 17:06, 28 January 2023 (UTC)
Java is different from JavaScript. JavaScript is a language supported by the web browser, so you do not need to install anything other than a web browser to record pronunciations on Lingua Libre. Unfortunately, I cannot dig further in this direction because I know almost nothing about JavaScript. Pamputt (talk) 21:18, 28 January 2023 (UTC)
Thank you, anyway. Key Mîrza (talk) 22:38, 28 January 2023 (UTC)
Key Mîrza, thank you a lot for your voice, it lets us discover new languages. Please be aware that Lili works best on solid desktop computers. Also, you likely have a limit of 380 uploads per 72 minutes, so you may need to leave your tab open and click "retry" after that. You can raise that limit by making a request on Commons. See LinguaLibre:User rights. Contact us if you think that may be the cause. Yug (talk) 15:07, 5 February 2023 (UTC)
It's confirmed: like all new contributors, you are limited to 380 uploads per 72 minutes. You can get more user rights by requesting them on Commons. Yug (talk) 15:15, 5 February 2023 (UTC)
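To plan a large recording session around that limit, here is a rough Python sketch. It takes the "380 uploads per 72 minutes" figure quoted in this thread and assumes, as a simplification, that a fresh quota becomes available at the start of each 72-minute window; the real limiter on Commons may count differently. The function name is hypothetical.

```python
import math

UPLOAD_LIMIT = 380     # uploads allowed per window (figure quoted in this thread)
WINDOW_MINUTES = 72    # window length in minutes (figure quoted in this thread)


def minimum_wait_minutes(n_recordings, limit=UPLOAD_LIMIT, window=WINDOW_MINUTES):
    """Lower bound on the waiting time needed to upload n_recordings.

    Assumption: the quota resets in fixed windows, so each full batch of
    `limit` uploads has to wait for the next window to open.
    """
    if n_recordings <= 0:
        return 0
    windows_needed = math.ceil(n_recordings / limit)
    # The last batch goes through as soon as its window opens, so only the
    # windows before it add waiting time.
    return (windows_needed - 1) * window


for n in (100, 380, 381, 1000):
    print(n, minimum_wait_minutes(n))
```

Under this model, a session of up to 380 words needs no waiting, while 1000 words needs at least two extra windows with the tab left open and "retry" clicked after each.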

Late 2022-2023 Winter report

Hello all, allow me to share some general news from the various recent, ongoing, or near-future efforts.

  • 🤖 User:Pamputt has taken over Lingualibre Bot and added support for the Kurdish wiktionary. See github.
  • 🌏 Melody (WMFr intern) and myself made a mini-editathon on writing template emails for outreach. See Lingualibre:Events.
  • ⚡ User:Elfix and I are collaborating on SPARQL requests (me) and their optimization (Elfix). We aim to create a languages gallery this spring.
  • 🔴 Wikimedia France's freelancer on the record wizard is back on track; delivery of fixes should occur around May-June.
  • 🙋‍♀️ Adelaide (WMFr) mentioned the wish of a second intern on Lingualibre outreach this summer, to reuse Melody's assets, expand actions and geographic diversity.
  • 🫱🏼‍🫲🏽 Wikimedia France yearly strategic meetup is this week, and is expected to strengthen its (linguistic) diversity and metrics axes, for which Lingualibre is one of their champions.
  • 🧓 Eve and I will (likely) be present at Toulouse's Forom des Langues, in May, where ~60+ language associations are present.

For specific deadlines and events coming soon, please also check Lingualibre:Events/Program. We always welcome contributors. When necessary, WMFr may refund transportation costs. Worth a try ! Yug (talk) 15:07, 5 February 2023 (UTC)

Edit your nickname

Good evening, I would like to change my nickname because it did not update when I was renamed Manjiro91 then Manjiro5 instead of GamissimoYT on Wikimedia projects. Thanks in advance Regards manȷıro💬 22:53, 23 February 2023 (UTC)

Tool to prepare words for Lingua Libre

Preparing words to be used in Lingua Libre has always been challenging. But I think this is a shared challenge. Crawling text from different sources and creating a clean list of words is very important. I've used Tito's instructions in the past, but using multiple tabs and multiple tools is not the best user experience. So, I thought I'd create something that is functional for me and simple enough to be tweaked. Introducing "Prepare words for Lingua Libre". The tool is currently set for Odia but can be easily tweaked for other languages using non-Latin scripts. I'd request the Lingua Libre core team to incorporate the tool into Lingua Libre so that users can use the platform to create a wordlist. Extracting words from any random text is always hard, especially for new contributors. --Subhashish (talk) 03:44, 14 March 2023 (UTC)

Hi Psubhashish. This is really nice. Do you think it would be easy to adapt it to create a new generator? Generators can be used by anyone after they import them in their common.js. Pamputt (talk) 06:44, 14 March 2023 (UTC)
Thanks User:Pamputt. That would be fantastic, but I probably don't have the right knowhow for doing that. I did take ChatGPT's help to create a .js version from the HTML code I had shared earlier but would appreciate any help. I think having a tool inside Lingua Libre would be great so really liked the idea of new generators. Common users would like things well packaged rather than jumping from one platform to another. --Subhashish (talk) 13:09, 14 March 2023 (UTC)
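As an illustration of the kind of clean-up discussed in this thread, here is a minimal Python sketch that turns raw text into a deduplicated word list. It is not the code of the "Prepare words for Lingua Libre" tool, which is not shown here; the function name and the min_length parameter are hypothetical. Unicode categories are used instead of the regex class \w because, in Python, \w does not match the combining vowel signs used by Odia and other Indic scripts.

```python
import unicodedata


def extract_words(text, min_length=2):
    """Return the unique words of `text`, in order of first appearance.

    A word is a run of letters (Unicode category L*) and combining marks (M*).
    Everything else (digits, punctuation, danda, whitespace) is a separator.
    Limitation: zero-width joiners (category Cf) are also treated as separators.
    """
    text = unicodedata.normalize("NFC", text)
    cleaned = "".join(
        ch if unicodedata.category(ch)[0] in ("L", "M") else " "
        for ch in text
    )
    seen = set()
    words = []
    for word in cleaned.split():
        if len(word) >= min_length and word not in seen:
            seen.add(word)
            words.append(word)
    return words


print(extract_words("Hello, world! Hello again (2023)."))
print(extract_words("ଓଡ଼ିଆ ଭାଷା। ଓଡ଼ିଆ"))
```

The resulting list can be pasted one word per line into a Lingua Libre list page or the Record Wizard.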

Problem publishing recordings

Hello, a few years ago I renamed my account GamissimoYT to Manjiro91. Later, I renamed it to Manjiro5. The problem is that the renaming of my global Wikimedia account was not carried over to Lingua Libre. As a result, I cannot publish the audio I record on LinguaLibre, and it does not appear on Commons either. Could you help me? manȷıro💬 08:41, 26 April 2023 (UTC)

Rename a dialect as a language

Hello,

I requested the addition of "Teochew dialect" a few years ago during my first attempts. However, it seems more appropriate to keep just "teochew", without the word dialect. Would it be possible to make this change?

Assassas77 (talk) 19:41, 7 May 2023 (UTC)

Done. Solved here by User:Assassas77! It's a wiki :) Yug (talk)

MediaWiki:Lang/*

What are the MediaWiki:Lang/* messages for? For example, MediaWiki:Lang/awa? It looks like they mostly just repeat the language code in the content. --Amir E. Aharoni (talk) 07:21, 24 May 2023 (UTC)

Where are the Greek recordings?

According to the statistics page there are 130 recordings of the Greek language (Q205, ISO: gre). However there is no category commons:category:Lingua Libre pronunciation-gre defined or any recordings added to this category. There is a category commons:category:Lingua Libre pronunciation-ell, but it is empty. What happened to the 130 Greek recordings? Olaf (talk) 20:16, 9 June 2023 (UTC)

Hi Olaf, for some unclear reason (probably historical), it seems that all Greek recordings are categorized in Category:Lingua Libre pronunciation-other. We have to move all these recordings to the right category (I do not know whether Commons has an automatic tool for such a job), and also redirect commons:category:Lingua Libre pronunciation-ell to c:category:Lingua Libre pronunciation-gre. Pamputt (talk) 07:24, 10 June 2023 (UTC)
Hi Pamputt. This happened because in wikidata:Q9129#P220 both ISO 639-3 codes are deprecated, and entity:getBestStatements function, used in commons:Module:Lingua Libre record#L-46, doesn't accept deprecated entries, so the module can't get the language code and falls back to "other" category. We could change the Wikidata entry and the files would be moved automatically. However code "gre" must stay deprecated, because it is unclear if it refers to ancient or modern Greek. It would be better to promote "ell" to normal entry. Then changes in Q205 would be also needed. It looks like bulk moving Lingua Libre recordings around doesn't require admin rights, so I can fix this issue if you agree to change the Greek language code to "ell" instead of "gre". Olaf (talk) 08:46, 10 June 2023 (UTC)
Hi Olaf thank you for your investigation. So, I have modified Greek (Q205) to fix the issue on the Lingua Libre side. For Wikimedia Commons, you can go ahead. Pamputt (talk) 08:11, 18 June 2023 (UTC)
Thanks, Pamputt. It's not as easy as I thought. Changing the Greek ISO 639-3 code from deprecated to normal creates a constraint violation with Modern Greek, which has the same code. In fact, LinguaLibre shouldn't record Greek words as Greek (Q9129) but rather as Modern Greek (Q36510). Modern Greek is also already defined in LinguaLibre: Q279. Olaf (talk) 13:26, 18 June 2023 (UTC)
If I understand correctly, the easiest way to manage this case would be to delete Greek (Q205), so that no one can record in "this language" and contributors can select only Modern Greek (Q279). If so, it would require replacing all Lingua Libre statements that use Greek (Q205) with Modern Greek (Q279). There are currently 137 items that use Greek (Q205), so I think it is manageable by hand. Olaf, what do you think about this "workaround"? Pamputt (talk) 16:48, 18 June 2023 (UTC)
This would be perfect, it also requires renaming the 137 recordings in Commons, but it can be done. What about the datasets to be downloaded from LinguaLibre, will they change automatically? Olaf (talk) 21:08, 18 June 2023 (UTC)
Olaf, Pamputt, I had a nearly identical case with the Chinese ISO codes zho vs cmn. I have about 186 zho items (see Help:SPARQL for maintenance) which have the wrong ISO code. My plan is:
  • to delete those audios, very simply, on both Lingualibre and Commons. The alternative would be to edit them all on both sites.
  • to discourage recording or delete that Lili Qid.
so I may work on those audio, some day... Hugo en résidence (talk) 17:36, 18 June 2023 (UTC)
I don't like deleting good recordings as a way of dealing with wrong categorization. Moreover some of them are probably in use, because Olafbot might have added them to Polish Wiktionary. If there is no other option, just leave them where they are in Commons, and remove Greek from Lingua Libre alone in favor of Modern Greek. But I think Pamputt's solution is better. Olaf (talk) 21:08, 18 June 2023 (UTC)
User:Olaf, I don't like it either. But 186 recordings is about 8 minutes of work, and it has been confusing us for 3 years. Do point to that. Yug (talk) 19:35, 20 June 2023 (UTC)
Deleting 186 recordings is about the same amount of time as modifying the language statement. This is manageable by hand and I would prefer not to delete them. I do not have time for now but I will try to do it before the end of the month. Pamputt (talk) 11:47, 21 June 2023 (UTC)
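To make the fallback described earlier in this thread easier to follow, here is a small Python model of it. It is only a sketch: the real logic lives in the Lua module on Commons and is not reproduced here, and the data structures and function names below are hypothetical. It encodes the Wikibase rule that "best" statements are the preferred-rank ones if any exist, otherwise the normal-rank ones, and never the deprecated ones.

```python
def best_statements(statements):
    """Preferred-rank statements if there are any, otherwise normal-rank ones.

    Deprecated statements are never returned.
    """
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


def category_suffix(iso_statements):
    """Language code used in the category name, or 'other' when none is usable."""
    best = best_statements(iso_statements)
    return best[0]["value"] if best else "other"


# Situation described in the thread: both codes are deprecated.
greek_before = [
    {"value": "gre", "rank": "deprecated"},
    {"value": "ell", "rank": "deprecated"},
]
# Hypothetical fix: "ell" promoted to normal rank, "gre" left deprecated.
greek_after = [
    {"value": "gre", "rank": "deprecated"},
    {"value": "ell", "rank": "normal"},
]

print(category_suffix(greek_before))
print(category_suffix(greek_after))
```

With both codes deprecated there is no best statement, which matches the recordings landing in the "other" category.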

Any Recording limitation in Lingua Libre

Hello, I would like to know whether there are any recording limitations in Lingua Libre, because I am planning a screencast in Tamil. If anyone knows, please reply. Thank you Sriveenkat (🎤) (talk) 11:11, 1 August 2023 (UTC)

If you are not an autopatrolled user on Wikimedia Commons, then you cannot upload more than 380 audios per 72 minutes. If you want to record more words within this time slot, then you should request this right. Pamputt (talk) 14:15, 1 August 2023 (UTC)
Hi, @Pamputt , I won't be recording 380 audios within 72 minutes. I'm planning to create a screencast tutorial video in Tamil, which is why I asked. Thank you for your reply Sriveenkat (🎤) (talk) 14:35, 1 August 2023 (UTC)

Exclusion list for generators?

Hello, if there isn't a feature like this somewhere already, I propose a per-user blacklist of sorts, which would allow users to select words which would be excluded when you choose one of the generator options to generate words. I'm currently going through a list of words in a Wiktionary category, and I'm confronted with a growing list of words that I can't deal with because they aren't suitable for pronunciation (e.g. particles that surround other arbitrary words), or they're just homophones of something I've already recorded, etc. What would be necessary, technically, in order to make this happen? Kiril kovachev (talk) 12:39, 10 August 2023 (UTC)

Hi Kiril kovachev, I have opened a Phabricator ticket for this request. If you know JavaScript, you may have a look at the code to propose a patch. Pamputt (talk) 05:52, 15 August 2023 (UTC)
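To illustrate the idea, here is a minimal sketch of the filtering step such an exclusion list would need. It is written in Python for readability, whereas an actual patch would have to be in JavaScript inside the Record Wizard; the function name, the case-insensitive matching, and the way the per-user list would be stored are all assumptions, not existing Lingua Libre features.

```python
def apply_exclusions(generated_words, excluded_words):
    """Drop every generated word that appears in the user's exclusion list.

    Matching is case-insensitive (via casefold) and keeps the generator's order.
    """
    excluded = {word.casefold() for word in excluded_words}
    return [word for word in generated_words if word.casefold() not in excluded]


# Hypothetical example: output of a generator and a user's saved exclusion list.
from_generator = ["house", "The", "tree", "of", "river"]
my_exclusions = ["the", "of"]

print(apply_exclusions(from_generator, my_exclusions))
```

The open design question is mostly where the list lives (for example a user subpage) rather than the filtering itself.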

Barnstar Award Template

Is there any Barnstar Award Template for Lingua Libre? Sriveenkat (🎤) (talk) 07:06, 13 September 2023 (UTC)

There are Template:50k barnstar and Template:Speaker of the month and maybe others. WikiLucas00 may know other barnstars. Pamputt (talk) 21:11, 13 September 2023 (UTC)
@Pamputt & WikiLucas00 Ok Pamputt, I want to give barnstar awards to some beginner speakers. It will be motivating for them. Am I right? Sriveenkat (🎤) (talk) 11:46, 14 September 2023 (UTC)
Hello @Pamputt & Sriveenkat ! Indeed, it would be a nice idea to offer awards for beginners, such as a barnstar for passing 1000 recordings for example. All the best — WikiLucas (🖋️) 16:08, 16 September 2023 (UTC)

1,000,000th

  • N ! 08:38 కంటగిల్లు (Q1094614)‎ diffhist +3,648‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటగించు (Q1094613)‎ diffhist +3,636‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటకితము (Q1094612)‎ diffhist +3,636‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటకుడు (Q1094611)‎ diffhist +3,624‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటక (Q1094610)‎ diffhist +3,588‎ V Bhavya talk contribs block ‎Created a new Item
  • N ! 08:38 కంటబడు (Q1094609)‎ diffhist +3,612‎ V Bhavya talk contribs block ‎Created a new Item

Yug (talk)

Why isn't Lingua Libre Bot running on Wikidata?

@Poslovitch, Pamputt, & WikiLucas00 Why isn't Lingua Libre Bot running on Wikidata? Darafsh asked about it in the Wikidata Lexicographical data Telegram group. What's the problem? Please kindly explain the issue. Thanks-Sriveenkat () (talk) 16:12, 6 October 2023 (UTC)

@Sriveenkat could you point to a Lingua Libre item and a Wikidata item or lexeme that has not received the pronunciation? This will help to test and find what is wrong. Pamputt (talk) 19:22, 6 October 2023 (UTC)
Hi @Pamputt, recorded audio is not being added to Wikidata items and lexemes. User Darafsh has recorded many words for the Wikidata Lexeme project, but the audio was never added to the lexemes. You can see at wikidata:Special:Contributions/Lingua Libre Bot that the last contribution was at 23:49, 9 September 2023. So I am simply asking for Lingua Libre Bot to be run on Wikidata. I have also recorded some words for the Wikidata Lexeme project and waited a few days, but my audio was never added to the lexemes, so I ran QuickStatements to add it myself. Darafsh is now also running QuickStatements to add his audio. I think many users rely on Lingua Libre to add audio automatically to Wikidata and some Wiktionaries. I hope this makes things clear. Thank you. Regards, Sriveenkat () (talk) 05:38, 7 October 2023 (UTC)
Thanks to @Sriveenkat for starting the discussion. If you need some examples, you may see Mazanin's contributions on Commons. This is the recorded audio: [1] and this is the lexeme entry on Wikidata: [2] but they are not connected yet. Darafsh (talk) 12:07, 7 October 2023 (UTC)

SiteNotice

Hi,
Translations are not working for Sitenotice. Install CentralNotice? ―Eihel (talk) 14:31, 7 October 2023 (UTC)

Global bot status

Lingualibre Bot has been approved. cc @Pamputt, Poslovitch, & WikiLucas00 . Yug (talk) 12:31, 10 October 2023 (UTC)

Thank you for the request and congrats on the approval! — WikiLucas (🖋️) 12:40, 16 October 2023 (UTC)

ExternalTools - Wikidata Query Service - Recording Indian Actor and Actress Names in Tamil

@Yug, Pamputt, & WikiLucas00 I am now interested in recording Indian actor and actress names in Tamil. So I made a query and entered its URL in ExternalTools. An error comes up: "Result must contain both "id" and "label" field." I think something needs to be modified in this query. Could anyone please help with this? Thanks Sriveenkat (talk) 19:58, 24 November 2023 (UTC)

@Sriveenkat , this works. Please note there are 6982 items if we remove the LIMIT, and I don't know how the system works with such a large list. Yug (talk) 23:13, 25 November 2023 (UTC)
@Yug Thanks for your reply. The query doesn't work for me :( Error in ExternalTools: "undefine" Sriveenkat (talk) 06:03, 26 November 2023 (UTC)
@Sriveenkat , in Wikidata QS you have to run the query to check that it works and returns data; if so, go to the URL bar and copy that long URL. Come back to Lingua Libre, step 3, external tool, and paste that long URL. It worked for me. Yug (talk) 06:00, 27 November 2023 (UTC)
@Sriveenkat Sorry, I missed something. At the bottom right of the Query Service, click "Link", then "SPARQL endpoint", and copy this URL. Yug (talk) 08:25, 27 November 2023 (UTC)
@Yug It works when copying the SPARQL endpoint link. Thank you very much. I'm planning to record more proverbs, usage examples, places and persons; Lingua Libre is really comfortable for recording them. Thanks again Sriveenkat (talk) 22:54, 27 November 2023 (UTC)
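For reference, here is a minimal sketch of the two things this thread turned on: the shape of a "SPARQL endpoint" link, and the "id"/"label" requirement quoted in the error message. The query itself is only an illustration (it is not the query used above), and the helper names are hypothetical. How ExternalTools actually validates the result is an assumption; only the error text comes from the discussion.

```python
from urllib.parse import quote, urlencode

# Public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query only: items with occupation "actor" and a Tamil label.
# The result variables are deliberately named ?id and ?label.
EXAMPLE_QUERY = (
    "SELECT ?id ?label WHERE { "
    "?id wdt:P106 wd:Q33999 ; rdfs:label ?label . "
    'FILTER(LANG(?label) = "ta") '
    "} LIMIT 10"
)


def build_endpoint_url(sparql: str) -> str:
    """Return an endpoint URL of the kind copied via Link > SPARQL endpoint."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': sparql}, quote_via=quote)}"


def missing_fields(result: dict) -> list:
    """List which of the required fields are absent from a SPARQL JSON result."""
    returned = set(result.get("head", {}).get("vars", []))
    return sorted({"id", "label"} - returned)


if __name__ == "__main__":
    print(build_endpoint_url(EXAMPLE_QUERY))
    # A result using the default ?item / ?itemLabel names would be rejected.
    print(missing_fields({"head": {"vars": ["item", "itemLabel"]}}))
```

If a query uses other variable names, renaming them to `?id` and `?label` in the SELECT clause is presumably enough to satisfy the check.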

Logo redesign propositions

I had a bit of fun yesterday contributing to one of my favourite projects in a slightly different way. I've kept the ideas (microphone, wings) and colours of the current logo but made it a bit more polished. I've already taken a few opinions on Discord but I wanted to get a more general opinion. What do you think?

Just so you know, I won't be at all offended if the community prefers to keep the current logo, because there are some very good reasons for keeping it (I'm thinking in particular of all the printed materials, the fact that it's simple (easy to draw by hand if we don't have a printer and maybe more "readable" if very small), its declination for sign languages, etc.).

DSwissK (talk) 08:59, 3 December 2023 (UTC)

@DSwissK hello,
We can add your proposition to the set of logo ideas in the Wikimedia Commons Category:Proposed Lingua Libre logo, for later reference. But to be honest, good logo design requires design experience, artistic intuition, and brand and public awareness, which are harder to gather than it seems. It must also fit the project's phase and branding strategy: the project needs to want a new logo, and project members must be willing to shift from the current, highly visible logo to a new one. Altogether, changing a logo is not easy to push for. I gave a similar answer here a few months ago about Lingua Libre SignIt. Yug (talk) 12:23, 4 December 2023 (UTC)
@Yug hi,
Thank you for your input. I appreciate you explaining the complexities - you raise great context I had not fully considered. DSwissK (talk) 09:05, 6 December 2023 (UTC)

Hebrew diacritics (Niqqud)

In Hebrew we use diacritics (Niqqud) to determine how to pronounce the words.

Niqqud is typically used in the following cases:

  1. Young kids or people learning the language.
  2. Formal use.
  3. To distinguish between meanings when the base form is ambiguous.

This is a short example:

  • Base form: גזר (GZR)
  • Carrot: גֶּזֶר (Gezer)
  • Masculine cut: גָּזַר (Gazar)
  • Piece: גֶּזֶר (Gezer)

This is the corresponding Wiktionary article: https://he.wiktionary.org/wiki/גזר

When fetching words from Wiktionary, it's better to use the first headers instead of the item names, because in many cases the term is ambiguous and the item's name is the base form without any pronunciation guidance.

As for Wikipedia etc. sometimes there's a word with the Niqqud inside the article but it will be a bit complicated to parse so we can skip that for now.
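To illustrate why the item name alone is ambiguous, here is a minimal sketch that strips the Niqqud from the pointed forms listed above and shows that they all collapse to the same base form. The function name is hypothetical, and treating every nonspacing mark in the Hebrew block as Niqqud (which also removes cantillation marks) is a simplifying assumption.

```python
import unicodedata


def strip_niqqud(text: str) -> str:
    """Remove Hebrew points and accents, leaving only the base letters."""
    # NFD splits precomposed presentation forms into letter + mark.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch
        for ch in decomposed
        # Hebrew marks sit in U+0591..U+05C7; keep punctuation such as maqaf.
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    )


if __name__ == "__main__":
    # Pointed forms taken from the example list above.
    for pointed in ["גֶּזֶר", "גָּזַר"]:
        print(pointed, "->", strip_niqqud(pointed))
```

Going the other way (from base form to a pointed form) is not possible without context, which is the reason for preferring the pointed headers when building word lists.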

Lights on userrights

Hello all,
I bumped again into LinguaLibre:User_rights and {{Autopatrolled}}. To the best of my knowledge, we have no solution to this and no active user is monitoring this bottleneck. Is this assessment correct? Yug (talk) 21:03, 28 December 2023 (UTC)

A mobile app

I personally think that contributing through a browser is quite risky. Firefox on mobile, for example, has a very strict page unloading policy, which can close the tab during an upload, so the data not yet uploaded is lost (I found a workaround but it's not perfect). Are there any thoughts about this? (Maybe even expanding the current CV Project app by Saverio Morelli?)

Is the Record Wizard not working for anyone else?

My mic works with mictests.com, but the RecordWizard doesn't pick anything up at the "check your microphone" stage. I've tried on both my phone and my laptop, and I can record sound in both cases, and I have the appropriate permissions enabled, but this particular website isn't detecting sounds. Is anyone else having this kind of problem? Grendelkhan (talk) 23:43, 24 February 2024 (UTC)

Hello User:Grendelkhan,
I just received a second such report. That user also checked mictests.com successfully.
On Firefox, at step 4 of the Lingua Libre recording studio, the microphone is allowed (we see the red microphone icon to the left of the URL address). But after clicking the record button, no recording occurs.
  • Mictests on other site : successful.
  • Device: Notebook
  • OS: ?
  • Browser: Firefox, Chrome.
  • User: User:Akamycoco.
  • Languages affected: all.
  • Dates : Worked on February 28. Stopped working on February 29.
Let's start an investigation. Could you let me know your OS and precise web browser version? (Help > About Chrome or similar)
Let me know as well if you have the basic developer skills to right-click on the stalled page > Inspect > Console: are there any error messages? Yug (talk) 07:55, 1 March 2024 (UTC)
My laptop is using Google Chrome 122.0.6261.94 (Official Build) (64-bit) on Linux (Debian Testing). No error messages in the console when I attempt the recording. My phone is using Chrome 122.0.6261.90 on Android 14 on a Pixel 5a. It does seem to work on Firefox 115.7.0esr (64-bit) on my laptop. (I really should have checked that before.) So maybe this is solely a Chrome problem? Grendelkhan (talk) 16:30, 2 March 2024 (UTC)

Automatic categorization isn't documented.

So far as I can tell, this isn't documented: if, for user Foo, category Lingua Libre pronunciation by Foo exists on Commons, then all uploads will be categorized into that category. This is helpful! It's also easy to backfill after the fact using commons:Help:Gadget-Cat-a-lot. I'm not sure where to document this, but it seems reasonable to do so somewhere. Grendelkhan (talk) 16:26, 3 March 2024 (UTC)

Understanding lingua-libre

Hi, I am creating this discussion to understand lingua-libre better

Uploads are failing

TLDR: A large number of users report upload failures at step 5: Grendelkhan, Culex, XANA000, Ardzun (Indonesian languages), Penn Zero MSSJ, User:Univòc64 (Whistled Occitan) and User:Akamycoco (Taiwanese languages). This is likely only the tip of the iceberg. Only a few users were able to record in May, with an atypically low number of recordings. The Indonesia workshop with ~15 participants was critically affected. Investigation ongoing. Hugo en résidence (talk) 14:20, 13 May 2024 (UTC)

I can record words, but uploading them to Commons fails. The JavaScript console has the following message:

Your IP address is in a range that has been blocked on all Wikimedia Foundation wikis. The block was made by ‪EPIC‬. The reason given is Open proxy/Webhost: See the help page if you are affected. * Start of block: 10:09, 1 May 2024 * Expiry of block: 10:09, 1 May 2027 Your current IP address is 2001:41d0:304:100::4790. The blocked range is ‪2001:41D0:0:0:0:0:0:0/33‬. Please include all above details in any queries you make. If you believe you were blocked by mistake, you can find additional information and instructions in the No open proxies global policy. Otherwise, to discuss the block please post a request for review on Meta-Wiki. You could also send an email to the stewards VRT queue at "stewards@wikimedia.org" including all above details.`, blockinfo: {…}, "*": "See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."

This is not my IP address shown in the error message, and whatismyip confirms that I'm not behind a proxy. The Global block request is here. Is this affecting anyone else? I lost a heap of recordings. Grendelkhan (talk) 22:26, 4 May 2024 (UTC)

Uploads are failing for me today too, even though I am recording with my account. Culex (talk) 15:04, 8 May 2024 (UTC)
Same here. --XANA000 (talk) 16:49, 9 May 2024 (UTC)
I can record, but I haven't been able to upload until today. I was able to upload once yesterday, but after that I couldn't upload any more. Ardzun (talk) 06:04, 11 May 2024 (UTC)
I guess I'm not the only one who's been trying for weeks but could not publish audio after 1 May. Hope someone can fix it. Penn Zero MSSJ (talk) 20:54, 13 May 2024 (UTC)
User:Univòc64 (Whistled Occitan) and User:Akamycoco (Taiwanese languages) also reported issues.
It seems time to add a sitenotice warning. Hugo en résidence (talk) 14:07, 13 May 2024 (UTC)
In May so far we mostly have: 556 recordings by 7 users on May 1st, 174 recordings on May 11th (Austin Zhang), then nothing.
Compared with known monthly recordings, our recent average month was about 30k audios and the lowest months about 5k audios. May 2024 is heading toward 1200 audios, or about 4% of the average month and about 24% of the lowest months. Something weird is going on indeed.
Months since 2022:
{ date:2022-01, records: 21290, speakers: 46, languages: 28 },
{ date:2022-02, records: 3894, speakers: 40, languages: 17 },
{ date:2022-03, records: 8357, speakers: 61, languages: 21 },
{ date:2022-04, records: 5454, speakers: 34, languages: 18 },
{ date:2022-05, records: 4702, speakers: 59, languages: 30 },
{ date:2022-06, records: 7675, speakers: 41, languages: 18 },
{ date:2022-07, records: 4364, speakers: 37, languages: 22 },
{ date:2022-08, records: 9544, speakers: 45, languages: 23 },
{ date:2022-09, records: 5802, speakers: 113, languages: 30 },
{ date:2022-10, records: 6931, speakers: 74, languages: 32 },
{ date:2022-11, records: 8461, speakers: 54, languages: 34 },
{ date:2022-12, records: 11882, speakers: 54, languages: 23 },
{ date:2023-01, records: 18150, speakers: 48, languages: 29 },
{ date:2023-02, records: 32441, speakers: 65, languages: 29 },
{ date:2023-03, records: 11527, speakers: 61, languages: 30 },
{ date:2023-04, records: 8451, speakers: 58, languages: 35 },
{ date:2023-05, records: 21282, speakers: 97, languages: 49 },
{ date:2023-06, records: 17940, speakers: 56, languages: 35 },
{ date:2023-07, records: 75825, speakers: 74, languages: 38 },
{ date:2023-08, records: 32681, speakers: 54, languages: 30 },
{ date:2023-09, records: 28813, speakers: 114, languages: 30 },
{ date:2023-10, records: 60317, speakers: 167, languages: 47 },
{ date:2023-11, records: 49704, speakers: 140, languages: 55 },
{ date:2023-12, records: 42383, speakers: 114, languages: 41 },
{ date:2024-01, records: 40572, speakers: 112, languages: 40 },
{ date:2024-02, records: 22385, speakers: 197, languages: 57 },
{ date:2024-03, records: 16997, speakers: 173, languages: 48 },
{ date:2024-04, records: 8733, speakers: 117, languages: 42 },
{ date:2024-05, records: 556, speakers: 7, languages: 7 }
Daily recordings over April and May 2024 (chart not loaded): recordings stop on 2024-05-01.
Note: Austin Zhang recorded 174 audios on 05.11
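As a quick check of the proportions discussed above, here is a small sketch using only the rounded figures quoted in this thread (about 30k for a recent average month, about 5k for the lowest months, about 1200 projected for May 2024). The constant and function names are illustrative, and the projection is taken as stated, not recomputed.

```python
# Rounded figures quoted in the discussion above.
AVERAGE_MONTH = 30_000       # recent average month
LOWEST_MONTH = 5_000         # lowest months
PROJECTED_MAY_2024 = 1_200   # projection stated in the thread
UPLOADED_SO_FAR = 556 + 174  # 1 May uploads plus the 11 May batch


def percent_of(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("projection vs average month:", percent_of(PROJECTED_MAY_2024, AVERAGE_MONTH), "%")
    print("projection vs lowest months:", percent_of(PROJECTED_MAY_2024, LOWEST_MONTH), "%")
    print("uploaded so far vs average :", percent_of(UPLOADED_SO_FAR, AVERAGE_MONTH), "%")
```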

Fixed

Both IP ranges 2001:41D0:0:0:0:0:0:0/32 and 2001:41D0:0:0:0:0:0:0/33 were subject to a global block at one point. See also Global ban range_2001:41D0:0:0:0:0:0:0/32. The bans have been fixed; uploads are possible again.

Yug (talk) 10:38, 14 May 2024 (UTC)
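For anyone wanting to verify such a report in the future, here is a minimal sketch using Python's standard `ipaddress` module to check whether the address quoted in the error message falls inside the two blocked ranges. The addresses and ranges are copied from this thread; the function name is hypothetical.

```python
import ipaddress

# Address quoted in the JavaScript console error message.
reported = ipaddress.ip_address("2001:41d0:304:100::4790")

# The two ranges that were globally blocked at one point.
blocked_ranges = [
    ipaddress.ip_network("2001:41d0::/33"),
    ipaddress.ip_network("2001:41d0::/32"),
]


def matching_ranges(address, ranges):
    """Return the ranges (as strings) that contain the given address."""
    return [str(network) for network in ranges if address in network]


if __name__ == "__main__":
    print(matching_ranges(reported, blocked_ranges))
```

The reported address is covered by both ranges, so it was affected by either block.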

Logs

For reference, I investigated the relevant block logs and upload logs for May 2024.
Conclusion: the collapse in uploads is consistent with the IP ban. Still, given the bug reports from Akamycoco in *March* and from 咽頭べさ on step 4, I suspect other bugs are lingering.

Global IP bans

  • 18:46, 13 May 2024 EPIC talk contribs changed global block settings for 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (anonymous users only) (No open proxies)
  • 00:51, 10 May 2024 AmandaNP talk contribs globally blocked 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (No open proxies)
  • 17:02, 9 May 2024 EPIC talk contribs changed global block settings for 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (anonymous users only) (Open proxy/Webhost: See the help page if you are affected)
  • 17:09, 1 May 2024 EPIC talk contribs blocked 2001:41d0::/33 talk with an expiration time of 2 years, 364 days, 12 hours, 21 minutes and 36 seconds (anonymous users only, account creation disabled) (Open proxy/Webhost: See the help page if you are affected)
  • 17:09, 1 May 2024 EPIC talk contribs globally blocked 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (Open proxy/Webhost: See the help page if you are affected)

Lingualibre uploads logs

13 May 2024

  • [... Many more uploads]
  • Upload log 23:39 Elwinlhq talk contribs uploaded File:LL-Q5218 (que)-Elwinlhq-apaqay.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 19:05 Assassas77 talk contribs uploaded a new version of File:LL-Q9192 (cmn)-Assassas77-八角.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 19:05 Assassas77 talk contribs uploaded File:LL-Q9192 (cmn)-Assassas77-八角.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 16:38 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-sih8 buh8 sah8 nah4.wav ‎ Tag: Lingua Libre [2.2]

11 May 2024

  • Upload log 20:21 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-buah8.wav ‎ Tag: Lingua Libre [2.2]
  • [... +172 recordings by User:Oh! Tea]
  • Upload log 18:56 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-a2.wav ‎ Tag: Lingua Libre [2.2]

10 May 2024

  • Upload log 06:08 CapitainAfrika talk contribs uploaded File:LL-Q36217 (lin)-CapitainAfrika-Wiki na monɔkɔ mua bísó.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 00:14 Ardzun talk contribs uploaded File:LL-Q13324 (min)-Ardzun-mada.wav ‎ Tag: Lingua Libre [2.2]

9 May 2024

  • Upload log 17:08 Àncilu talk contribs uploaded File:LL-Q652 (ita)-XANA000-orsù.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 17:05 Àncilu talk contribs uploaded File:LL-Q652 (ita)-XANA000-frac.wav ‎ Tag: Lingua Libre [2.2]

5 May 2024

  • Upload log 21:15 Benoît Prieur talk contribs uploaded File:LL-Q8785 (hye)-Benoît Prieur-Artsakh.wav ‎ Tag: Lingua Libre [2.2]

1 May 2024

  • Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hệ số.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hỗn số.wav ‎ Tag: Lingua Libre [2.2]
  • Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hằng đẳng thức.wav ‎ Tag: Lingua Libre [2.2]
  • [... Many more uploads]

Yug (talk) 10:38, 14 May 2024 (UTC)
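If someone wants to tally these logs by language or speaker, here is a minimal sketch that parses the file names listed above. The pattern `LL-<language QID> (<ISO code>)-<speaker>-<word>.wav` is inferred from the entries in this log only, not from a specification; it assumes the speaker name contains no hyphen, and the function name is hypothetical.

```python
import re

# Pattern inferred from the log entries above; the ISO code part is optional.
FILENAME_RE = re.compile(
    r"^LL-(?P<lang_qid>Q\d+)"       # language item, e.g. Q5218
    r"(?: \((?P<iso>[a-z]{3})\))?"  # optional ISO 639-3 code, e.g. (que)
    r"-(?P<speaker>[^-]+)"          # speaker name (assumed hyphen-free)
    r"-(?P<word>.+)\.wav$"          # recorded word or phrase
)


def parse_upload(title: str):
    """Split a Lingua Libre file title into its parts, or return None."""
    match = FILENAME_RE.match(title.strip().removeprefix("File:"))
    return match.groupdict() if match else None


if __name__ == "__main__":
    for title in [
        "File:LL-Q5218 (que)-Elwinlhq-apaqay.wav",
        "File:LL-Q36759-Austin Zhang-sih8 buh8 sah8 nah4.wav",
    ]:
        print(parse_upload(title))
```

Note that the speaker in the file name can differ from the uploading account, as with the recordings of Austin Zhang uploaded by Oh! Tea, so counts per speaker and per uploader will not match.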