|
|
(691 intermediate revisions by 67 users not shown) |
Line 2: |
Line 2: |
| {{Lang-CR}} | | {{Lang-CR}} |
| <indicator name="talk"></indicator> | | <indicator name="talk"></indicator> |
| + | {{LL:Chat room/FAQ}} |
| __TOC__ | | __TOC__ |
| + | <!-- **** DO NOT EDIT CONTENT ABOVE **** --> |
| | | |
− | == Chatroom FAQ ==
| |
− | * '''How to download all audios of one language ? By speaker ?'''
| |
− | ** Per-language datasets are available at [https://lingualibre.fr/datasets/ https://lingualibre.fr/datasets/]. A short server-side script runs automatically every 2 days, itself using [https://github.com/lingua-libre/CommonsDownloadTool lingua-libre/CommonsDownloadTool]. For more, see [[Help:Download from LinguaLibre]].
| |
| | | |
− | * '''How to add missing languages ?'''
| + | == Is the Record Wizard not working for anyone else? == |
− | ** Administrators can add new languages, usually within a few days. Users should provide the language's [[:wikipedia:iso-639-3|iso-639-3]] code and a link to the en.wikipedia.org article. Optional information includes the common English name and the Wikidata QID. For more, see [[Help:Add a new language]].
| |
| | | |
− | * '''How to keep my wikimedia project up to date ?'''
| + | My mic works with [https://mictests.com/ mictests.com], but [https://lingualibre.org/wiki/Special:RecordWizard the RecordWizard] doesn't pick anything up at the "check your microphone" stage. I've tried on both my phone and my laptop, and I can record sound in both cases, and I have the appropriate permissions enabled, but this particular website isn't detecting sounds. Is anyone else having this kind of problem? [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 23:43, 24 February 2024 (UTC) |
− | ** Contact [[User talk:0x010C|User:0x010C]], the botmaster of Lingua Libre Bot. For more, see [[Help:Bots]].
| + | :Hello [[User:Grendelkhan]], |
| + | :I just received a second such report. The user also checked [https://mictests.com/ mictests.com] successfully. |
| + | :On Firefox, Lingua Libre recording studio step 4, the microphone is allowed (we see the red microphone image on the left of the URL address). But after clicking the record button, no recording occurs. |
| + | :* Mictests on other site : successful. |
| + | :*Device: Notebook |
| + | :*OS: ? |
| + | :*Browser: Firefox, Chrome. |
| + | :*User: [[User:Akamycoco]]. |
| + | :*Languages affected: all. |
| + | :*Dates : Worked on February 28. Stopped working on February 29. |
| + | :Let's start an investigation. Could you let me know your OS and precise web browser version? (Help > About Chrome or similar) |
| + | :Also let me know, if you have basic developer skills, whether any error messages appear when you right-click on the stalled page > Inspect > Console. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 07:55, 1 March 2024 (UTC) |
| | | |
− | * '''What IRL events are coming? When? Where?'''
| + | ::My laptop is using Google Chrome <tt>122.0.6261.94 (Official Build) (64-bit)</tt> on Linux (Debian Testing). No error messages in the console when I attempt the recording. My phone is using Chrome <tt>122.0.6261.90</tt> on Android 14 on a Pixel 5a. It ''does'' seem to work on Firefox <tt>115.7.0esr (64-bit)</tt> on my laptop. (I really should have checked that before.) So maybe this is solely a Chrome problem? [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 16:30, 2 March 2024 (UTC) |
− | ** Nothing is planned at the moment. For more, see [[LinguaLibre:Events]].
| |
| | | |
− | * '''How to translate LinguaLibre User Interface into a new language ?'''
| + | == Automatic categorization isn't documented. == |
− | ** Go to [https://translatewiki.net/w/i.php?title=Special:Translate&group=mwgithub-recordwizard&language=fr&filter=%21translated&action=translate translatewiki.net], change the url part <code>fr</code> into your language's [[:en:List_of_ISO_639-2_codes|ISO 639-2 code]]. For more, see [[Help:Translate]].
| |
| | | |
− | * '''How to archive sections which have been answered ?'''
| + | So far as I can tell, this isn't documented: if, for user Foo, category <tt>Lingua Libre pronunciation by Foo</tt> exists on Commons, then all uploads will be categorized into that category. This is helpful! It's also easy to backfill after the fact using [[:commons:Help:Gadget-Cat-a-lot]]. I'm not sure where to document this, but it seems reasonable to do so ''somewhere''. [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 16:26, 3 March 2024 (UTC) |
− | ** After reviewing the section, add '<code><nowiki>{{done}} -- can be closed ~~~~</nowiki></code>' to the top of the section. After a few days to 2 weeks, move the section's code to <code><nowiki>[[LinguaLibre:Chat_room/Archives/year]]</nowiki></code>.
| |
− | === Archives ===
| |
− | <!-- {{Colapse|1=Archives|2= Archives by year:}}
| |
− | <br/> -->
| |
− | * [[/Archives/2020|2020]]
| |
− | * [[/Archives/2019|2019]]
| |
− | * [[/Archives/2018|2018]]
| |
| | | |
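For whoever ends up documenting the automatic categorization, here is a minimal Python sketch of how one could check whether a user's category exists on Commons. It assumes the standard MediaWiki `action=query` API with `formatversion=2`, where a page that does not exist carries a `missing` flag. The helper names `category_title`, `query_params` and `category_exists` are hypothetical and are not part of Lingua Libre; only the category naming pattern comes from the comment above, and whether Lingua Libre itself performs the check this way is not known.

```python
def category_title(username):
    """Build the Commons category title described in the comment above."""
    return f"Category:Lingua Libre pronunciation by {username}"


def query_params(username):
    """Parameters for a MediaWiki action=query request about that category.

    These would be sent to https://commons.wikimedia.org/w/api.php
    (sending the request is left out of this sketch).
    """
    return {
        "action": "query",
        "titles": category_title(username),
        "format": "json",
        "formatversion": "2",
    }


def category_exists(api_response):
    """Return True if the parsed JSON response lists a page that is not missing."""
    pages = api_response.get("query", {}).get("pages", [])
    return any(not page.get("missing", False) for page in pages)


if __name__ == "__main__":
    # Hand-written sample responses for illustration, not real API output.
    found = {"query": {"pages": [{"pageid": 1, "ns": 14, "title": category_title("Foo")}]}}
    absent = {"query": {"pages": [{"ns": 14, "title": category_title("Foo"), "missing": True}]}}
    print(category_exists(found), category_exists(absent))
```

Keeping the request building and the response parsing as pure functions makes the check easy to test without touching the network.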
− | == speedy and/or delete == | + | == Understanding lingua-libre == |
− | Hello,<br />
| |
− | It can happen that a Q item is no longer needed (improper recording, different WM page, incorrect title, etc.). It can be deleted from Commons, but it remains here. To start, I propose creating a page dedicated to deletion, with one or more speedy and/or delete templates.<br />
| |
− | One of my creations did not suit me, so I deleted the file on Commons to replace it with another one, using my own tools to put everything back in order on LL. In short, by the time the new one was created, Q309179 had disappeared. What do you think about speedy and delete? Any comments? —[[User:Eihel|Eihel]] ([[User talk:Eihel|talk]]) 17:35, 29 May 2020 (UTC) <small>PS: I have already added a template. See [[LinguaLibre:Administrators' noticeboard]]</small>
| |
− | :Hi Eihel, yes, why not. Note, however, that if a pronunciation is incorrect, re-recording the word will upload the new pronunciation to Commons in place of the old recording. Also, since file names are generated automatically by Lingua Libre, the cases to handle should be relatively rare. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 12:46, 31 May 2020 (UTC)
| |
− | ::In general, we have a weak point in the dynamic management of audio files: pleasant browsing, renaming, deletion, etc. This (pleasant browsing) is mentioned above in the comparison with Shtooka. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:24, 3 June 2020 (UTC)
| |
| | | |
− | ==='''Discussion'''===
| + | Hi, I am creating this discussion to understand lingua-libre better |
− | Hi Yug, I guess it would be better to open tickets on [[phab:project/view/3393/|Phabricator]] to keep track of all these issues and to be able to discuss each one more easily (in a structured way). [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 12:51, 31 May 2020 (UTC) | |
− | :Thanks Pamputt :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 07:51, 1 June 2020 (UTC)
| |
− | ::+1 to pamputt, phabricator is more appropriate for that for advanced users. — [[User:0x010C|'''0'''x'''010<span style="color: #00C41C;">C</span>''']] <sup>[[User_talk:0x010C|~talk~]]</sup> 14:36, 1 June 2020 (UTC)
| |
| | | |
− | == Bugs == | + | == Uploads are failing == |
− | === Sped-up recordings ===
| + | :''TLDR: A large number of users report failing uploads at step 5: [[User:Grendelkhan|Grendelkhan]], [[User:Culex|Culex]], [[User:XANA000|XANA000]], [[User:Ardzun|Ardzun]] (Indonesian languages), [[User:Penn Zero MSSJ|Penn Zero MSSJ]], [[User:Univòc64]] (Whistled Occitan) and [[User:Akamycoco]] (Taiwanese languages). This is likely only the tip of the iceberg. Only a few users were able to [https://lingualibre.org/index.php?hidebots=1&translations=filter&hidepageedits=1&hideWikibase=1&hidelog=1&namespace=0&limit=1000&days=14&enhanced=1&title=Special:RecentChanges&urlversion=2 record in May], with an atypically low number of recordings. An Indonesia workshop with ~15 participants was critically affected. Investigation ongoing. [[User:Hugo en résidence|Hugo en résidence]] ([[User talk:Hugo en résidence|talk]]) 14:20, 13 May 2024 (UTC)'' |
− | Hello,
| |
− | Today's recordings of mine have been sped up. Fortunately, I noticed quickly. A few examples: [[Q332977]] [[Q332978]] [[Q332979]] [[Q332980]] [[Q332981]] [[Q332982]].
| |
| | | |
− | PS: The "Start a new discussion" link above does not seem to work.
| + | I can record words, but uploading them to Commons fails. The JavaScript console has the following message: |
| | | |
− | [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 08:36, 28 June 2020 (UTC) | + | : <tt>'''Your IP address is in a range that has been [[m:Special:MyLanguage/Global blocks|blocked on all Wikimedia Foundation wikis]].''' The block was made by [[User:EPIC|EPIC]]. The reason given is ''[[m:Special:MyLanguage/NOP|Open proxy/Webhost]]: See the [[m:WM:OP/H|help page]] if you are affected''. * Start of block: 10:09, 1 May 2024 * Expiry of block: 10:09, 1 May 2027 Your current IP address is 2001:41d0:304:100::4790. The blocked range is 2001:41D0:0:0:0:0:0:0/33. Please include all above details in any queries you make. If you believe you were blocked by mistake, you can find additional information and instructions in the [[m:Special:MyLanguage/No open proxies|No open proxies]] global policy. Otherwise, to discuss the block please [[m:Steward requests/Global|post a request for review on Meta-Wiki]]. You could also send an email to the [[m:Special:MyLanguage/Stewards|stewards]] [[m:Special:MyLanguage/VRT|VRT]] queue at "stewards@wikimedia.org" including all above details.`, blockinfo: {…}, "*": "See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes." |
− | :Hi [[User:DSwissK|DSwissK]], strange problem. I opened a [[phab:T256663|ticket on Phabricator]] about it. I also opened [[phab:T256665|another one]] about the "start a new discussion" link, because I could not find how to fix it myself. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 17:40, 29 June 2020 (UTC)
| |
− | ::{{ping|DSwissK|Pamputt}} I got the same feedback about sped-up audios from [[User:Luilui6666|Luilui6666]] for Cantonese, today. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:24, 16 July 2020 (UTC)
| |
− | ::[https://lingualibre.org/index.php?title=Special:Contributions/Luilui6666&dir=prev&offset=20200709043912&limit=500&target=Luilui6666 Contributions] > Example (corrupted): [https://lingualibre.org/wiki/Q338365 Q338365] [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:42, 16 July 2020 (UTC)
| |
− | ::Should we review and remove all the bad audios, so that it becomes easier to re-record? And where should we remove them, here or on Commons? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 17:44, 16 July 2020 (UTC)
| |
− | :::{{ping|Yug}} We can list such items [[LinguaLibre:Misleading_items|here]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:44, 18 July 2020 (UTC)
| |
| | | |
− | === Unable to access the Wizard with the Occitan interface ===
| + | This is not my IP address shown in the error message, and whatismyip confirms that I'm not behind a proxy. The Global block request [https://meta.wikimedia.org/wiki/Steward_requests/Global/2024-w18#Global_block_for_Special:Contributions/2001:41D0:0:0:0:0:0:0/33 is here]. Is this affecting anyone else? I lost a heap of recordings. [[User:Grendelkhan|Grendelkhan]] ([[User talk:Grendelkhan|talk]]) 22:26, 4 May 2024 (UTC) |
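One thing that can be verified from the error message alone is whether the reported address really falls inside the blocked range. A minimal Python sketch using the standard `ipaddress` module, with the address and range copied from the console message above; the helper name `is_blocked` is hypothetical, and the sketch says nothing about whose address this is or why the upload request comes from it.

```python
import ipaddress

# Values copied verbatim from the console message quoted above.
BLOCKED_RANGE = ipaddress.ip_network("2001:41D0:0:0:0:0:0:0/33")
REPORTED_ADDRESS = "2001:41d0:304:100::4790"


def is_blocked(address, blocked=BLOCKED_RANGE):
    """Return True if the given address string lies inside the blocked network."""
    return ipaddress.ip_address(address) in blocked


if __name__ == "__main__":
    print(is_blocked(REPORTED_ADDRESS))
```

The same check can be pointed at your own public address (as reported by a "what is my IP" service) to confirm that it is not the one being blocked.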
− | Hello,
| + | :Uploads are failing for me today too, even though I am recording with my account. [[User:Culex|Culex]] ([[User talk:Culex|talk]]) 15:04, 8 May 2024 (UTC) |
− | I still have a problem recording when the interface is in Occitan. I have to run a public workshop this summer and I am forced to switch to French.
| + | :: Idem--[[User:XANA000|XANA000]] ([[User talk:XANA000|talk]]) 16:49, 9 May 2024 (UTC) |
− | * If I click the record button at the top of the page, I get the following error:
| + | ::: I can record, but I still couldn't upload as of today. I was able to upload once yesterday, but after that I couldn't upload any more. [[User:Ardzun|Ardzun]] ([[User talk:Ardzun|talk]]) 06:04, 11 May 2024 (UTC) |
− | Fatal error: Maximum execution time of 30 seconds exceeded in /home/www/lingualibre.org/includes/cache/MessageCache.php on line 812
| + | :I guess I'm not the only one who's been trying for weeks but could not publish audio after 1 May. Hope someone can fix it. [[User:Penn Zero MSSJ|Penn Zero MSSJ]] ([[User talk:Penn Zero MSSJ|talk]]) 20:54, 13 May 2024 (UTC) |
− | * If I click the record button at the bottom of the home page, I get the following error:
| + | ::[[User:Univòc64]] (Whistled Occitan) and [[User:Akamycoco]] (Taiwanese languages) also reported issues. |
− | Fatal error: Maximum execution time of 30 seconds exceeded in /home/www/lingualibre.org/languages/Language.php on line 198
| + | ::It seems time to add a sitenotice warning. [[User:Hugo en résidence|Hugo en résidence]] ([[User talk:Hugo en résidence|talk]]) 14:07, 13 May 2024 (UTC) |
− | [[User:Guilhelma|Guilhelma]]
| + | ::In May we mostly have: 556 recordings by 7 users on May 1st, 174 recordings on May 11th ([[Special:Contributions/Austin Zhang|Austin Zhang]]), then nothing. |
− | :I added the new error messages to the [[phab:T210477|Phabricator ticket]] that covers the problems with the Occitan version. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 08:55, 19 July 2020 (UTC) | + | ::If we compare with [https://public-paws.wmcloud.org/User:Yug/QueryLingualibre-monthly.ipynb known monthly recordings], our recent monthly average was about 30k audios and the lowest months were about 5k audios; May 2024 is heading toward 1200 audios, or about 4% of the average month and 24% of the lowest months. Something weird is going on indeed. |
− | ::{{ping|Guilhelma}}, is this bug confirmed and recurring? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:21, 22 September 2020 (UTC) | + | {| class=wikitable |
− | ::{{ping|Guilhelma}}, is this bug confirmed, and is it still bothering you? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:21, 22 September 2020 (UTC) | + | ! Most prolific speakers for the current month || Months since 2022 |
− | Yes, the bug is confirmed
| |
− | Fatal error: Maximum execution time of 30 seconds exceeded in /home/www/lingualibre.org/languages/Language.php on line 4422[[User:Guilhelma|Guilhelma]]
| |
− | | |
− | === Adding list from Wikidata ===
| |
− | Hello. It seems the interface has changed since I last used it and I cannot see how to create a word list from a Wikidata query. Could someone tell me the best way of doing this? Thanks [[User:Jason.nlw|Jason.nlw]] ([[User talk:Jason.nlw|talk]]) 08:49, 17 August 2020 (UTC)
| |
− | :Hi [[User:Jason.nlw|Jason.nlw]], as far as I remember it has never been possible to generate such a list, but I may be wrong. I opened a [[phab:T260650|feature request]] on Phabricator. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:26, 18 August 2020 (UTC)
| |
− | * The only workaround for now is: run a query --> download the label list as CSV --> copy the column --> create a local list on LiLi. This won't remember and link the Wikidata items though, and the bot won't work either. You can only record the words. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 00:42, 19 September 2020 (UTC)
| |
− | | |
− | == Datasets out of date ==
| |
− | Hello. It seems that the datasets page, although it claims to run every 2 days, is completely out of date: all the available zips are from April 2020 or November 2019 (and the full zip from May 2019). Is this a known problem? Is there a plan to address it? [[User:Julien Baley|Julien Baley]] ([[User talk:Julien Baley|talk]]) 23:17, 27 August 2020 (UTC)
| |
− | :Indeed, it seems to have an issue with the dataset updating. I opened a [[phab:T261519|Phabricator ticket]] about this issue. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:24, 28 August 2020 (UTC)
| |
− | | |
− | == Pages translation ==
| |
− | I would like to be able to mark pages for translation, but I don't have the user rights (pagetranslation) to do so. These rights are restricted to sysops (see [[Special:ListGroupRights]]). Should we create a translation administrator user group? Are there plans for creating a page like [[LinguaLibre:Requests for rights|this]] in the future? — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 03:08, 13 September 2020 (UTC)
| |
− | :{{ping|WikiLucas00}} indeed, currently there are not a lot of different user rights available here (bot, admin, bureaucrat). If you think we should have more, please feel free to open a ticket asking for that on [[phab:project/view/3393/|Phabricator]]. About, [[LinguaLibre:Requests for rights]], the same, feel free to create and initialize this page :D [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 17:06, 13 September 2020 (UTC)
| |
− | ::{{ping|Pamputt}} I created [[phab:T262855|a task on Phabricator]]. Let's first see how it evolves before creating a Request page. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:42, 14 September 2020 (UTC) | |
− | ::* Greetings, not sure specifically about this right, but most of the rights are managed at Localsettings.php ([[:mw:Manual:User rights]]). Good wishes. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 00:35, 19 September 2020 (UTC)
| |
− | ==== New admins ? ====
| |
− | ''See also [[Special:ListUsers/sysop]]''
| |
− | | |
− | {{ping|Pamputt|WikiLucas00|Titodutta|Lyokoï}} I think it would be nice to make WikiLucas an admin. We are a micro-wiki, WikiLucas has proven to be active and knowledgeable, all lights are green to make him a sysop. I would also encourage having one or two Indian admins. Indian users are the second largest community here, they bring new insights to the project, let's empower them properly. Any idea who in this latter community would need the admin tools? (page translate, page deletion, language import) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 05:09, 23 September 2020 (UTC)
| |
− | | |
− | Checking over [[Special:ListUsers/sysop]] I also notice :
| |
− | * Bureaucrats: 0x010C is taking a year off; Xenophon is a WMfr staff member with bureaucrat rights for security reasons but is barely active here; GrandCelinien... I have barely crossed paths with him; that leaves Pamputt as the single active bureaucrat. [[:en:Bus factor|Not enough]]. We need at least 3 '''active''' bureaucrats. I propose promoting Lyokoï to bureaucrat if he is OK with it. He is a regular contributor and a solid bet. We will also need someone on the Indian/Asian side soon. Bureaucrats mainly can give users more rights, such as admin status. It's not much, but when we need it we need it, and relying on one single bureaucrat is not good practice.
| |
− | *: I’m OK to be a bureaucrat. If you want it, I see no problem. [[User:Lyokoï|Lyokoï]] ([[User talk:Lyokoï|talk]]) 16:36, 6 October 2020 (UTC)
| |
− | * Admins: WikiLucas is an obvious candidate, he bumped into limitations (page translation right above). I see about 3 Indian contributors quite engaged here, could we promote one ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 05:24, 23 September 2020 (UTC)
| |
− | ::{{ping|Yug}} I am clearly not opposed to having more bureaucrats or admins, nor to having some Indian contributors among them. That being said, I don't think we're in a hurry (the Lingua Libre community is not very active at the moment). I prefer to take some time to give the rights to people involved in Lingua Libre, so that we can be sure that they will use their rights for at least a few months. Yet, if someone requests admin or bureaucrat rights, just ask (there is no bureaucratic procedure here yet). [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:25, 23 September 2020 (UTC) | |
− | :::{{ping|Yug|Pamputt|Lyokoï|Titodutta}} I agree with Yug, I would be more valuable to the project as an admin. As Pamputt has pointed out, the project is not very active for the moment, but in the light of future events -- for instance the training course I will be giving this month with Emma Vadillo to the alumni of the ''INaLCO'' in Paris <small>(being able to quickly delete the potential mistakes of the learners would be worthy)</small>, or the possible nomination of the project to the [[m:Coolest_Tool_Award|Coolest Tool Awards]]), its outreach will grow, so will the community. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 17:12, 3 October 2020 (UTC) | |
− | :{{Ping|Pamputt}} per the request just above let's '''grant WikiLucas00 adminship''', he is one of the most active here anyway, and admin is just an active user with a toolbox to add languages, block users, add translatable pages.
| |
− | :As '''for my general argument''', I consider that, being a small wiki with most admins/bureaucrats rarely passing by or contributing via occasional sprints (my case), we therefore need a high ratio of admins/bureaucrats so there is always one around checking [[Special:RecentChanges]]. | |
− | :Last, '''as for Bureaucrats''', we are failing the [[:en:Bus factor|Bus factor/Bus test]] : it's a organizational risk we should not fail, ever. I recommend adding one Bureaucrat for sure. Keeping Xenophon as far away backup. I would also recommend to keep the door open for one more, preferably from the East-Asian community (different timezone, human network, strategic opportunity, etc.).
| |
− | :We also need to '''recruit an admin on Commons''' able to do mass delete when we provide a list of files. Do we have this already ? {{ping|VIGNERON}} ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:23, 3 October 2020 (UTC)
| |
− | ::OK, I granted [[User:WikiLucas00|WikiLucas00]] admin rights. For Commons, it would indeed be useful to have such a profile. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:04, 4 October 2020 (UTC)
| |
− | ::: {{ping|Yug|Pamputt}} Thank you for your trust. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 21:25, 4 October 2020 (UTC)
| |
− | ::OK for me to give admin status to WikiLucas. [[User:Lyokoï|Lyokoï]] ([[User talk:Lyokoï|talk]]) 16:26, 6 October 2020 (UTC)
| |
− | | |
− | {{ping|Yug}} I'm a bit late but yes, obviously, if you need me as a Commons admin, you're welcome. Cheers, [[User:VIGNERON|VIGNERON]] ([[User talk:VIGNERON|talk]]) 17:29, 4 December 2020 (UTC)
| |
− | | |
− | == 0x010C year offgrid : preparations ==
| |
− | Hello folks, [[User:0x010C|0x010C]] announced by email his upcoming departure from the project for a year or more off grid (he will tell more here if he wishes to ;) ). We can't fully replace our [[:en:Benevolent_dictator_for_life#;)|benevolent lead developer]]. But could we brainstorm to see where he was active, and how best to fill the gap? I am kick-starting this table, but I have a bias since I don't know every task 0x010C was taking on, nor do I know all active users on the project and your full skill sets. Please help us fill in the gaps. 0x010C will be available between '''Oct. 15th and October 30th to pass on some know-how''' to whoever wishes. Let's prepare our questions properly for this transition. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:01, 21 September 2020 (UTC)
| |
− | | |
− | {| class="wikitable sortable" | |
− | ! Criticality || Task / Aspect || Requirements || Who else knows?<br>Who wishes to learn? || Requirements<br>satisfied (%) | |
− | |-style="background:#FFAA0066;border-color:#FFAA0099"
| |
− | | high || Server maintenance || 1. Has back-end sysop knowledge<br>2. Has access rights to WMFr server (see WMFr sysop).<br>3. Knows how to maintain/restarts scripts and processes.<br>4. Knows how to restart NGINX server || Mickey Barber (WMFr) || 0% → 100%
| |
| |- | | |- |
− | | high || Edit recording wizard JS library || 1. Has advanced javascript know how.<br>2. Knows where js code is {link to js repository}<br>3. Edit and test js code locally .<br>4. Has access rights to push. || None || 0% → 0% | + | | |
| + | <query _pagination="10" locutor="<translate><!--T:7--> Item (locutor Qid)</translate>" locutorLabel="<translate><!--T:8--> Speakers of the Month</translate>" nb="<translate><!--T:9--> Number of records</translate>"> |
| + | SELECT ?locutor ?locutorLabel ?nb WHERE { |
| + | { |
| + | SELECT ?locutor (COUNT(?record) as ?nb) |
| + | WHERE { |
| + | ?record prop:P2 entity:Q2 . # Q2: record, P2: instance of. |
| + | ?record prop:P5 ?locutor . # Property:P5: speaker |
| + | ?record prop:P6 ?date . |
| + | FILTER ( YEAR(?date) = YEAR(NOW()) && MONTH(?date) = MONTH(NOW()) ) |
| + | } |
| + | GROUP BY ?locutor ?locutorLabel |
| + | ORDER BY DESC(?nb) |
| + | LIMIT 50 |
| + | } |
| + | SERVICE wikibase:label { |
| + | bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . |
| + | ?locutor rdfs:label ?locutorLabel . |
| + | } |
| + | } |
| + | ORDER BY DESC(?nb) |
| + | </query> |
| + | | |
| + | <pre> |
| + | { date:2022-01, records: 21290, speakers: 46, languages: 28 }, |
| + | { date:2022-02, records: 3894, speakers: 40, languages: 17 }, |
| + | { date:2022-03, records: 8357, speakers: 61, languages: 21 }, |
| + | { date:2022-04, records: 5454, speakers: 34, languages: 18 }, |
| + | { date:2022-05, records: 4702, speakers: 59, languages: 30 }, |
| + | { date:2022-06, records: 7675, speakers: 41, languages: 18 }, |
| + | { date:2022-07, records: 4364, speakers: 37, languages: 22 }, |
| + | { date:2022-08, records: 9544, speakers: 45, languages: 23 }, |
| + | { date:2022-09, records: 5802, speakers: 113, languages: 30 }, |
| + | { date:2022-10, records: 6931, speakers: 74, languages: 32 }, |
| + | { date:2022-11, records: 8461, speakers: 54, languages: 34 }, |
| + | { date:2022-12, records: 11882, speakers: 54, languages: 23 }, |
| + | { date:2023-01, records: 18150, speakers: 48, languages: 29 }, |
| + | { date:2023-02, records: 32441, speakers: 65, languages: 29 }, |
| + | { date:2023-03, records: 11527, speakers: 61, languages: 30 }, |
| + | { date:2023-04, records: 8451, speakers: 58, languages: 35 }, |
| + | { date:2023-05, records: 21282, speakers: 97, languages: 49 }, |
| + | { date:2023-06, records: 17940, speakers: 56, languages: 35 }, |
| + | { date:2023-07, records: 75825, speakers: 74, languages: 38 }, |
| + | { date:2023-08, records: 32681, speakers: 54, languages: 30 }, |
| + | { date:2023-09, records: 28813, speakers: 114, languages: 30 }, |
| + | { date:2023-10, records: 60317, speakers: 167, languages: 47 }, |
| + | { date:2023-11, records: 49704, speakers: 140, languages: 55 }, |
| + | { date:2023-12, records: 42383, speakers: 114, languages: 41 }, |
| + | { date:2024-01, records: 40572, speakers: 112, languages: 40 }, |
| + | { date:2024-02, records: 22385, speakers: 197, languages: 57 }, |
| + | { date:2024-03, records: 16997, speakers: 173, languages: 48 }, |
| + | { date:2024-04, records: 8733, speakers: 117, languages: 42 }, |
| + | { date:2024-05, records: 556, speakers: 7, languages: 7 } |
| + | </pre> |
| |- | | |- |
− | | high || Deploying fix into production || 1. Has back-end sysop knowledge<br>2. Has access rights to server. <br>3. Has access rights to pull corrected code.<br>4. Knows how to rebuild/deploy. || Mickey Barber (WMFr) || 0% → 80% | + | ! Daily recordings over April and May 2024 || |
| |- | | |- |
− | | high || Add new language to LinguaLibre || 1. Has <code>administrator</code> user rights<br>2. Can read tutorial {add tutorial link here} || Has done it: Pamputt, Lyokoï, Yug, ... || 90% → 90% | + | | |
− | |-
| + | <query _pagination="40"> |
− | | high || Read Phabricator task, fix code || 1. Has background knowledge to understand bug description.<br>2. Edit code, test locally.<br>3. Has access rights to push. || No replacement for real code, code deployment.<br>Replacements available for CSS, wiki content fixes. || 30% → 60%
| + | SELECT |
− | |-
| + | ?yearmonthday |
− | | medium || Assign user rights || 1. Has bureaucrats status<br>2. Know how to assign new user rights. || [[Special:ListUsers/sysop]]: 0x010C, GrandCelinien, Pamputt, Xenophôn.<br>Few more wouldn't hurt to counter unequal activity levels. || 100% → 100%
| + | (COUNT(DISTINCT ?record) AS ?records) |
− | |-
| + | (COUNT(DISTINCT ?speaker) AS ?speakers) |
− | | medium || Github repository manager || 1. Have access to repository.<br>2. Has <code>owner</code> status.<br>3. Can manage userrights || Has understanding: Yug, Poslovitch || 70% → 100%
| + | (COUNT(DISTINCT ?language) AS ?languages) |
− | |-
| + | WHERE { |
− | | medium || Create tasks on Phabricator || 1. Have account on phabricator.<br>2. Has background knowledge to write sharp bug / task description.<br>3. Know to manage Phabricator tasks || Pamputt, WikiLucas, Poslovitch, Yug, ... || 70% → 100%
| + | ?record prop:P5 ?speaker . |
− | |-
| + | ?record prop:P4 ?language . |
− | | low - assumed very stable || LinguaLibre -> Wikimedia Commons API communication || 1. Knows NodeJS (?) scripts.<br>Well documented on [[:mw:API]].<br>2. Knows where to edit existing nodes scripts.<br>3. Can test locally.<br>4. Has access rights to push. || None || 0% → 0%
| + | ?record prop:P6 ?date . |
− | |-
| + | BIND( SUBSTR(str(?date), 0, 11) as ?yearmonthday ) |
− | | low || Update site's CSS || 1. Edit [[MediaWiki:Common.css]] (hack)<br>or<br>1. Edit {git repository page}. || Has basic understanding of the hack way: Yug, Poslovitch, WikiLucas. || 30% → 100%
| + | { SELECT ?record |
− | |- | + | WHERE { |
− | | colspan=5| <small>Please help describe where we need help to take on 0x010C's skills. This year-long departure is an opportunity for us to increase our know-how in these various fields. See also: [[:Commons:Category:Lingua_Libre]].</small>
| + | ?record prop:P2 entity:Q2 . |
− | |-
| + | ?record prop:P6 ?date . |
− | | colspan=5| <center>[[File:2018-12_Lingua_Libre_webrequest_flow.png|center|700px]]</center>
| + | FILTER(?date >= "2024-04-01T00:00:00Z"^^xsd:dateTime) |
| + | FILTER(?date < "2024-05-30T00:00:00Z"^^xsd:dateTime) |
| + | } |
| + | } |
| + | } |
| + | GROUP BY ?yearmonthday |
| + | ORDER BY (?yearmonthday) |
| + | </query> |
| + | | <= stops on 2024.05.01<br>Note: [[Special:Contributions/Austin Zhang|Austin Zhang]] recorded 174 audios on 05.11 |
| |} | | |} |
− | :Thanks for that overview. For now, the worst part is that there is no developer at all. Without one, I think we can only list all the issues we encounter on Phabricator while waiting for one to be hired by WMFr or someone else. Considering all the tickets opened on Phabricator, a new developer could be busy for several months, especially if he/she does not know the project as well as 0x010C does. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 15:56, 21 September 2020 (UTC)
| + | [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:39, 14 May 2024 (UTC) |
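To make the size of the drop easier to check, here is a minimal Python sketch of the comparison. The three round figures (30k average, 5k lowest, 1200 projected) are the approximate ones quoted in the comment above, and the 2024 rows are copied from the monthly table; the helper name `percent_of` is hypothetical. It is only a back-of-the-envelope check, not part of any Lingua Libre tooling.

```python
from statistics import mean

# Approximate figures quoted in the comment above.
RECENT_MONTHLY_AVERAGE = 30_000
LOWEST_MONTHS = 5_000
MAY_2024_PROJECTION = 1_200

# 2024 rows copied from the monthly table above (records per month).
MONTHLY_RECORDS_2024 = {
    "2024-01": 40572,
    "2024-02": 22385,
    "2024-03": 16997,
    "2024-04": 8733,
    "2024-05": 556,  # partial month at the time of the comment
}


def percent_of(value, reference):
    """Express value as a percentage of reference."""
    return 100.0 * value / reference


if __name__ == "__main__":
    print(percent_of(MAY_2024_PROJECTION, RECENT_MONTHLY_AVERAGE))
    print(percent_of(MAY_2024_PROJECTION, LOWEST_MONTHS))

    # Same comparison using only the complete 2024 months from the table.
    complete = [n for month, n in MONTHLY_RECORDS_2024.items() if month != "2024-05"]
    print(round(percent_of(MONTHLY_RECORDS_2024["2024-05"], mean(complete)), 1))
```

Whichever baseline is used, the May 2024 count is a small fraction of a normal month, which is consistent with uploads being blocked rather than with an ordinary dip in activity.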
− | ::Yes. I am quite worried about the sped-up-audios and add-language bugs. The first literally throws hours of effort in the trash and '''pollutes''' existing audio datasets, which is really bad. The second prevents diversity growth. It would be good to set up an emergency budget to pay 0x010C to fix these 2 critical Phabricator issues before he leaves. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:42, 21 September 2020 (UTC)
| |
− | ==== Feedbacks from 0x010C ====
| |
− | Ok! So, I just got a correct phone-call with 0x010C! 0x has great projects ahead it's awesome, so happy.<br>
| |
− | Also, he pointed out the following points:
| |
− | * 0x010C will pass by LL's discussion page to '''add pointers''' to the table above and answer questions.
| |
− | ** This will occurs in late October.
| |
− | * 0x010C will ''not'' be able to '''inspect the speeding-up bug''' : we need to find an alternative to fix this.
| |
− | * 0x010C underlined the most critical need: '''a server sysop''', able to do server maintenance and restart processes. Some peripheral routines such as the SPARQL counter occasionally fail and need to be restarted manually.
| |
− | | |
− | First, '''on the speeding-up bug''', we therefore have 2 ways to push forward:
| |
− | # Corner the bug. We currently suspect it to be linked to Chrome. Test more. With various browsers. Record 30 words, listen to them, then report results (see section below)
| |
− | # Hire a freelancer asap. The bug is suspected to be within https://github.com/lingua-libre/LinguaRecorder . There, the last 2 modifications and prime suspects are:
| |
− | #* 2020-05-09 [https://github.com/lingua-libre/LinguaRecorder/commit/102aa5041cbe24255fdb522bb045f693e9ca05fd#diff-e3f94ea1709f1bc0a8f6d9b4d22192f2 src/AudioRecord.js]
| |
− | #* 2020-04-28: [https://github.com/lingua-libre/LinguaRecorder/commit/102aa5041cbe24255fdb522bb045f693e9ca05fd#diff-e3f94ea1709f1bc0a8f6d9b4d22192f2 src/LinguaRecorder.js]
| |
− | | |
− | Secondly, on the '''critical server's maintenance, Wikimedia France's server sysop is also leaving soon''' and the next one is not yet identified. So we may need to send a far-reaching call for a server sysop's help, either a volunteer or the sysop of some friendly chapter (UK? DE? IT?). It could equally be the opportunity to open up deeply to non-French members. LL is 3 years old, yet most of LL's admins are French. Not smart. I would especially encourage opening up toward the Indian community, Odia and Tamil, who have been quite active and have given high-quality feedback. They themselves developed a shell-based audio recording tool a few years back, so there is a culture there which values orality and acts to protect it. The call shouldn't be limited to this community. The MediaWiki community (techs), Commons and Wikidata could have some relevant volunteers with the needed skill sets. It would be good if we could write together a call to find a volunteer server sysop, together with a call for more diverse contributions to LinguaLibre.
| |
− | | |
− | While we can co-write this call asap, I think we should first corner the speeding-up bug before asking people to join in and contribute. So as long as the speeding-up bug is unsolved, we can only call for a server sysop, not for more contributors.
| |
| | | |
− | Meanwhile, please update the table above as you see fit. I am sure I forgot a lot of things. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 17:52, 22 September 2020 (UTC)
| + | === Fixed === |
| + | Both IP ranges 2001:41D0:0:0:0:0:0:0/32 and 2001:41D0:0:0:0:0:0:0/33 were subject to a global Wikimedia block at one point (see [https://meta.wikimedia.org/w/index.php?title=Steward_requests/Global&oldid=26774369#Unregistered_users_only_block_for_the_range_2001:41D0:0:0:0:0:0:0/32 Global ban range_2001:41D0:0:0:0:0:0:0/32]). Following our request, the block has been reconfigured and uploads from LinguaLibre are possible again. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:38, 14 May 2024 (UTC) |
| + | :I have been able to record and upload with my account since yesterday, so that seems fixed. But it seems the stats are still not updated. [[User:Culex|Culex]] ([[User talk:Culex|talk]]) 12:08, 15 May 2024 (UTC) |
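To check whether a given address falls under the two ranges named above, a minimal Python sketch using the standard `ipaddress` module could look like this. The helper name `matching_ranges` and the sample addresses are hypothetical illustrations, not the actual Lingua Libre server address.

```python
import ipaddress

# The two ranges named in the global block discussion.
BLOCKED_RANGES = [
    ipaddress.ip_network("2001:41d0::/32"),
    ipaddress.ip_network("2001:41d0::/33"),
]


def matching_ranges(address):
    """Return the listed ranges (as strings) that contain the given address."""
    ip = ipaddress.ip_address(address)
    return [str(net) for net in BLOCKED_RANGES if ip in net]


# Hypothetical addresses, for illustration only.
print(matching_ranges("2001:41d0:1:2::1"))   # lower half of the /32: matches both ranges
print(matching_ranges("2001:41d0:8000::1"))  # upper half of the /32: matches the /32 only
print(matching_ranges("2001:db8::1"))        # outside both ranges

# The /33 is the lower half of the /32, so it is fully contained in it.
print(BLOCKED_RANGES[1].subnet_of(BLOCKED_RANGES[0]))
```

Because the /33 sits inside the /32, an address in the lower half of the range is covered by both blocks, which is why both log entries matter when diagnosing an upload failure.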
| | | |
− | == Speeding-up bug : call for testers == | + | === Logs === |
− | Please, we need testers to corner that nasty bug ! Could you test recording with various browsers ? Go to [[Special:RecordWizard]], pick a random language, record 30 words, listen to them, [DO NOT UPLOAD], then report here the resulting pentad ;)
| + | For reference, I investigated the relevant block logs and upload logs for May 2024.<br>Conclusion: the collapse in uploads is consistent with the IP ban. Still, given bug reports from Akamycoco in *March* and 咽頭べさ [[:c:File:Lingua_Libre_error_2024.webm|on step 4]], I suspect other bugs are still lingering. |
− | * Test list (suggested) : <code>List:Kur/Test</code> (10 words)
| + | {| class=wikitable |
− | * Username : <code>yourusername</code>
| + | !width=50%| Global IP bans |
− | * Speeding bug : <code>true|false</code>
| + | ! Lingualibre upload logs |
− | * Web browser : <code>name-version</code> | |
− | * OS : <code>name-version</code> | |
− | * Microphone : <code>internal|external</code>
| |
− | [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:00, 22 September 2020 (UTC) PS: {{ping|DSwissK}} | |
− | {|class="wikitable sortable" style="width:100%;" | |
− | ! Tester username || Success rate || Web browser + version || OS + version || Microphone type || Comments | |
− | |-
| |
− | | DSwissK || 0% (speeding bug occurs at every recording) || Google Chrome 84.0.4183.127 || Android 9 || internal || List:Fra/Dico des Ados (3k+ words)
| |
− | |-
| |
− | | ClasseNoes || 0% (speeding bug occurs at every recording) || Google Chrome || ChromeOS || internal || List:Fra/Dico des Ados (3k+ words)
| |
− | |-
| |
− | | Luilui6666 || 0% (speeding bug occurs<br>on one session) || Google Chrome || MacOS || || 126 (100%) audios of [https://lingualibre.org/index.php?title=Special:Contributions/Luilui6666&offset=20200717000000&limit=128&target=Luilui6666 the 04:5*am upload batch] are corrupted. This session contains longer than average phrases. Session before and after are ok.
| |
− | |-
| |
− | | Yug || 100% (no bug) || Google Chrome 85.0.4183.121 (64-bit) || Ubuntu 20.04 || internal || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Google Chrome 85.0.4183.121 (64-bit) || Ubuntu 20.04 || external || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Chromium 85.0.4183.121 (64-bit) || Ubuntu 20.04 || internal || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Chromium 85.0.4183.121 (64-bit) || Ubuntu 20.04 || external || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Kiwi 77.0.3865.92 (2020-08-15) || Android 9 || external || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Kiwi 77.0.3865.92 (2020-08-15) || Android 9 || internal || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Chrome 80.0.3987.99 || Android 9 || external || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Yug || 100% (no bug) || Chrome 80.0.3987.99 || Android 9 || internal || Observed at step 5. Review before Publish. List:Kur/Test (10 words)
| |
− | |-
| |
− | | Pamputt || 100% (no bug) || Firefox 78.3.0 esr || Mageia Linux 7 || internal ||
| |
− | |-
| |
− | | DSwissK || 100% (no bug) || Google Chrome 86.0.4240.75 || Android 10 || internal || List:Fra/Dico des Ados (3k+ words)
| |
− | |-
| |
− | | DSwissK || 100% (no bug) || Google Chrome 86.0.4240.99 || Android 10 || external|| List:Fra/Dico des Ados/alea (54 words)
| |
− | |-
| |
− | | DSwissK || 0% (all bugged) || Google Chrome 86.0.4240.198 || Android 10 || external || List:Fra/Dico des Ados (3k+ words)
| |
| |- | | |- |
− | | DSwissK || 100% (no bug) || Google Chrome 86.0.4240.198 || Android 10 || external || List:Fra/Dico des Ados/alea (30 words) | + | | |
− | |-
| + | * [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F32&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 18:46, 13 May 2024] EPIC talk contribs changed global block settings for 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (anonymous users only) (No open proxies <!-- SCLT ID: Possible VPN or Colocation -->) |
− | | <add yourself> || || || || ||
| + | * [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F32&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 00:51, 10 May 2024] AmandaNP talk contribs globally blocked 2001:41d0::/32 talk with an expiration time of 00:51, 10 May 2026 (No open proxies <!-- SCLT ID: Possible VPN or Colocation -->) |
− | |}
| + | * [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F33&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 17:02, 9 May 2024] EPIC talk contribs changed global block settings for 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (anonymous users only) (Open proxy/Webhost: See the help page if you are affected) |
− | {{Ping|ClasseNoes|DSwissK}} It works fine with me on both Chrome and Chromium. Did you and could you try with other OSes ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:59, 6 October 2020 (UTC)
| + | * [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F33&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 17:09, 1 May 2024] EPIC talk contribs blocked 2001:41d0::/33 talk with an expiration time of 2 years, 364 days, 12 hours, 21 minutes and 36 seconds (anonymous users only, account creation disabled) (Open proxy/Webhost: See the help page if you are affected) |
− | :{{Ping|Titodutta|Lyokoï|Pamputt}}, we need help to corner the speed up bug by doing more devices testings. Could you help ? INALCO workshop is on Oct. 17th. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 17:32, 6 October 2020 (UTC)
| + | * [https://meta.wikimedia.org/wiki/Special:Log?type=&user=&page=User%3A2001%3A41D0%3A%3A%2F33&wpdate=&tagfilter=&wpFormIdentifier=logeventslist 17:09, 1 May 2024] EPIC talk contribs globally blocked 2001:41d0::/33 talk with an expiration time of 17:09, 1 May 2027 (Open proxy/Webhost: See the help page if you are affected) |
− | ::I've added the information relative to my configuration. Another explanation may be the internet quality (bandwidth, latency, etc). In the case of micro-cuts, some software accelerates the voice to make up for the delay. So maybe it could happen more often with a 3G connection than with optical fiber. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 19:50, 6 October 2020 (UTC)
| + | | |
− | :::0x010C was suggesting a purely client-side issue. The recording into audio data are done client-side. I don't see clear pattern emerge so far. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:52, 6 October 2020 (UTC)
| + | * : [https://commons.wikimedia.org/wiki/Special:RecentChanges?hidebots=1&translations=filter&hidecategorization=1&hideWikibase=1&tagfilter=OAuth+CID%3A+1735&limit=500&days=30&urlversion=2 Uploads via Lingualibre resumed]. |
− | ::::It's definitely NOT internet quality for ClasseNoes and myself. We had a good connection (not over mobile) and the bug occurred several days apart. [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 06:19, 8 October 2020 (UTC)
| |
− | :::::'''Current assessment:''' 3 out of 3 of our users with speeding bug used Google Chrome (v84?), on Android 9, ChromeOS, MacOS. The best lead we have so far is a Google Chrome recording API implementation-related, either due to a recent Google Chrome update, or to recent LinguaLibre JS's update done by 0x010C around May. (See above to link to suspected JS code). I took a quick (3mins) look at "[https://www.google.com/search?q=Chrome+audio+recording+speed+bug&oq=Chrome+audio+recording+speed+bug&uact=5 Google Chrome + audio recording + speed bug]" but nothing conclusive.
| |
− | :::::'''More tests?:''' Could you [[User:DSwissK|DSwissK]] & [[User:ClasseNoes|ClasseNoes]]* test again on the same devices (hardware, OS) but with different web browser.
| |
− | :::::@ClasseNoes: could you check your exact Google Chrome's version ?
| |
− | :::::@DSwissK, after retesting on Android 9 Chrome v.84, could you update and test Android 9 with Chrome v.85? Android 9 with Chrome v.85 works for me on small test lists (~10 items). Could you also comment more: does this speeding bug show up ''EACH'' time you use this Android 9 Chrome v.84 pair? Is there a saturation effect with longer lists?
| |
− | :::::<nowiki>*: Luilui6666 is a student who did a paid recording sprint on Cantonese and moved on, he voluntarily helped a lot already with the `ratelimit` bug, I can't request further free help from this side. PS: 270€ for 9h and 5000 audios despite the ratelimit bug. Really fruitful experiment! Worth it to boost a language.</nowiki> [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 12:18, 8 October 2020 (UTC) | |
− | :::::: I was thinking maybe it was because of the huge list I'm using ([[List:Fra/Dico des Ados]]) but no, it works fine (see last row) on last Chrome version (and Android 10 that I flashed this week-end on the same smartphone). [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 07:23, 11 October 2020 (UTC)
| |
− | :::::::{{ping|DSwissK}} I suspect Google Chrome v84 to be the issue. But we have not enough details to be conclusive. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 08:49, 15 October 2020 (UTC)
| |
− | :::::::After verification, Luilui6666 had both a corrupted batch and non-corrupted batches of audios about 30 minutes apart. Recordings were ok before and after [https://lingualibre.org/index.php?title=Special:Contributions/Luilui6666&offset=20200717000000&limit=128&target=Luilui6666 the corrupted session]. My Chrome-centered hypothesis is challenged. Must be something else. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 09:45, 17 October 2020 (UTC)
| |
− | ::::::::Yug, how did it go on Oct 17 ? Did you encounter that problem with some users ? [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 11:10, 20 October 2020 (UTC)
| |
− | :::::::::{{ping|Eavq|wikiLucas00|DSwissK|Nicolas_Lopez_de_Silanes_WMFr}} Hello DSwissK. I was not there, I am too far away (near Spain, 800km). We need to ask WikiLucas, Eavq and Nicolas. See also [[LinguaLibre:Formations_CCWL]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:16, 22 October 2020 (UTC) | |
− | | |
− | == Lingua Libre Story for September 2020 ==
| |
− | | |
− | :''This is not an official story or newsletter. This is an attempt by the project user(s) to share some updates about the program. There might be more stories which we have missed.''<br>
| |
− | September 2020 was an eventful month and we have seen a lot of activities of uploading new content and also around project-related discussion. Here are some of the best stories from September 2020.
| |
− | * '''300,000 files:''' On 10 September 2020 we completed 300,000 pronunciation uploads. After the launch in August 2018, the first 100,000 files were uploaded in April 2019, and the milestone of 200,000 files was reached on January 2020. As of 30 September 2020 there are 366 speakers at this project working in 92 languages.
| |
− | * '''Maximum number of pronunciations in a month:''' In September 2020, 23,209 files were uploaded. This is the maximum number of files uploaded ever in a particular calendar month (earlier it was 22,963 files in June 2020, and 22,293 files in May 2019).
| |
− | * '''Indian language in top 3 list:''' This month Bengali language came into the top three languages by the number of files uploaded using Lingua Libre. This is possibly the first time a non-European/Indian language came into the top three most-uploaded languages on the project. As of 30 September there were 26,757 files in Bengali (the top two languages by file count were French: 164,626 files and Esperanto: 28,100)
| |
− | * '''Project chat:''' Several discussion started on the Chat room, such as [[LinguaLibre:Chat_room#Speeding-up_bug_:_call_for_testers|Bug testing]] (you may help), [[LinguaLibre:Chat_room#0x010C_year_offgrid_:_preparations|Technical preparations]] etc.
| |
− | * '''Coming:''' 1) Oct. 17th's workshop at [[:en:INALCO|INALCO University]], Paris. This university teaches about 105 languages. 2) In late October, [[User:0x010C]] is willing to share server know-how before his year-long departure off-grid. | |
− | That's it. Have a good time. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 16:30, 1 October 2020 (UTC)
| |
− | :Thank you [[User:Titodutta]], it's an interesting format. We can also think of it as collaborative news-letter, edited here, then shareable to our networks. I added a "Coming" section. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:31, 4 October 2020 (UTC)
| |
− | | |
− | == English label and non-English label == | |
− | Most probably I did not notice this earlier [[Q389651]]. Label: this is not English, this is Bengali. My language is set to Bengali as well. Sad thing is: this affects many Bengali files, if not all. I also saw one of the recent uploads in other language: Esperanto: [[Q389566]]. For Bengali, and several other languages the script is completely different. (a whole lot of bot work I think). Opinion? --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 22:04, 4 October 2020 (UTC)
| |
− | :{{ping|Titodutta}} not sure if this is really a bug. This behaviour comes from the fact Lingua Libre uses Wikibase to handle its own items. And Wikibase allows as many labels as there are languages but actually we do not need any label on Lingua Libre. So, by default it is always English. That's said, I understand it can be weird for some people, so I think the label should be the word that has been recording in English and in the language of the word so that it can be displayed as it when we you use Lingua Libre in your mother tongue. Or maybe it could be the same label for all languages. Anyway, except it is a bit strange, it is not a big deal because these labels are not used by any one. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 19:41, 6 October 2020 (UTC)
| |
− | | |
− | == Other bugs from India ==
| |
− | Hi [[User:Titodutta|কথা]], I was happy to meet you tonight even if it was short. About the bugs you discussed I have created [[phab:T264790|T264790]]. There is also the problem with the labels discussed above. You talked about a problem with duplicates in the word list but I am not sure I have understood correctly because I was not able to reproduce it. So could you open a bug report on Phabricator to describe what is wrong? If you are uncomfortable with Phabricator, you can describe the problem here and I will open the ticket over there. And there was also another point but I have forgotten it so could you kindly remind me? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 19:45, 6 October 2020 (UTC)
| |
− | * The 4 bugs we discussed today:
| |
− | :a) Special pages showing errors, you have explained it above.
| |
− | :: {{done}} [[phab:T264790|T264790]].
| |
− | | |
− | : b)Post file move error on Wikimedia Commons: it has 2 types of problems: on Commons, after moved the files are not displayed, b) "Remove words already recorded" don't count those words, so if LL-...Hello.wav is moved to LL-...Bonjour.wav, LinguaLibre does not understand Bonjour is already recorded, and asks to record again.
| |
− | :: This is currently an issue. Ideally, recordings should not be renamed manually, because what you described then happens. The problem comes from the fact that the Commons database and the Lingua Libre Wikibase are not connected. The workaround is to manually modify the Lingua Libre item corresponding to the recording once the renaming has been done on Commons.
| |
− | :: If Wikimedia France finds some money, a tool allowing to rename and to apply the needed changes in Lingua Libre could be developed. See [[phab:T264789|T264789]] for a brief overview (you should develop further what we would like this tool be able to do).
| |
− | | |
− | : c) some words are being eliminated: I'll try to explain this: a particular word, such as "Paris", when I try to generate from a category from Wikipedia, I get this word and record it. Then I try another option: "Nearby" to generate words, and there also I get the same word. Now, ''possibly'' LiLi ''sometimes'' fails to understand the word is already recorded. This is not applicable for all files, I have seen this in 3-4 files. <br>Let me give a clear example, see [https://lingualibre.org/index.php?title=Q381622&oldid=330125#P19 this edit]. This is pretty clear I got the word from Wikipedia. The immediate [https://lingualibre.org/index.php?title=Q381622&type=revision&diff=339010&oldid=330125 next edit] I used "Nearby" to get a list.<br>I have checked it for words. I once I can find more samples, I'll report a bug.
| |
− | :: Hard to debug. I tried several times with several words and I was not able to reproduce this bug. Are you sure you clicked on "Remove words already recorded"? So I am not sure opening a bug report is very useful until you have a word that triggers this bug. Did you try to rerecord [[Q381622]] to see whether this bug occurs again?
| |
− | | |
− | : d) There are actually more bugs, which we did not discuss, that may need quick fix: For example if you use Vector Skin ([https://lingualibre.org/wiki/LinguaLibre:Main_Page?useskin=vector click here for preview]) at the top of the page it says "A maintenance operation is planned for today. ..." I am seeing this message for 2 months now. It might be a minor fix, perhaps we forgot to remove this notice. b) vector skin main page may need more work, as the main page is designed for LiLi skin, the recent files and other nice designs are not working in vector at all. | |
− | :: Lingua Libre supports officially only one skin (BlueLLs). I guess Vector has not been disabled when we moved to the new version of the website. I will open a ticket to ask to remove Vector so that there is only one skin to support. See [[phab:T265079|T265079]].
| |
− | | |
− | : e) Coming to internationalization (which is not a bug): some important pages need to be in English also, as of now, such as [[LinguaLibre:Privacy_policy]]. Of course the page can be marked for translation, however until the page is in English also, this might a bit difficult to translate directly from French.
| |
− | :: This is known. A working group is planned to work on the documentation page later this month. We will move all pages to English and make them translatable before the end of the year.
| |
− | :: {{done}}
| |
− | | |
− | :(Fun fact: You saw I used the word "LiLi". Sometime ago I posted on your talk page about the pronunciation of "LinguaLibre". I am aware in Indian community we often use the short form LiLi/Lili in our discussion, which is a [https://www.sheknows.com/baby-names/name/lili/ female name] in multiple language including a few Indian languages.)
| |
− | : This was indeed good to talk you. This was very kind you switched to English briefly. As gradually we are seeing more contribution from India and other countries, possibly we can have a global meet/France-India meet in future. | |
− | Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 20:47, 6 October 2020 (UTC)
| |
− | :: {{support}} I approve using the nickname "'''LiLi'''"! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 11:05, 7 October 2020 (UTC)
| |
− | ::Yes, it's an elegant nickname. Thanks for the suggestion :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:58, 7 October 2020 (UTC) (The written "LL" doesn't translate well when we say it aloud in French.)
| |
− | {{ping|Titodutta}} for '''e)''', I've marked the page for translation, and translated it into English :). Please do not hesitate if you see other pages in the same situation. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 13:08, 8 October 2020 (UTC)
| |
− | :{{ping|Titodutta}} I answered point by point in your text to make the discussion more understandable. Feel free to reply below. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 19:08, 8 October 2020 (UTC)
| |
− | | |
− | == Add "Recent changes (non-audio)" to "Tools" menu == | |
− | :1. {{Done}} -- this allow quick access to list of recent changes, without the MASSIVE overload of audio recordings. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:21, 8 October 2020 (UTC) | |
− | I just found out my [[User:Yug/common.js]] doesn't work, doesn't even run a simple <code>console.log("Hello world!")</code>. Any idea why ?
| |
− | | |
− | [[MediaWiki:common.js]] does work as expected. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:52, 8 October 2020 (UTC)
| |
− | :[[:mw:Manual:Interface/JavaScript]]: ''"If $wgAllowUserJs is set to true, users can customize the interface for only themselves by creating and importing personal scripts in certain user subpages."'' [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:59, 8 October 2020 (UTC)
| |
− | :Damn. I wanted to test on myself before to move to [[MediaWiki:common.js]] [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:00, 8 October 2020 (UTC)
| |
− | ::{{ping|Yug}} I reverted what you did. I think it is not a good idea to enable it for everyone because it loads more javascript for something that almost no one uses; it is possible to get the same results in a few clicks. So please, add this code only in your [[Special:MyPage/common.js|common.js]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 19:20, 8 October 2020 (UTC) | |
− | :::Hello Pamputt. As explained above, [[Special:MyPage/common.js]] is not activated on LL so I couldn't test it there first and we aren't able to do so. So I went ahead and tested this non-breaking change on the site-wide Common.js.
| |
− | :::Ok for the revert. Review and community discussion and approval was required, your input and revert are meaningful parts of this required discussion.
| |
− | :::'''As for the whole rational...''' ''Recent change'' is an access point which mainly allow active users do to patrolling activity.
| |
− | :::The current Recent changes ([[Special:RecentChanges]]) access point just displays the last 50 changes, while LiLi records between 300 and 700 audios per day. The stream of recent changes is therefore flooded by a large number of audio files which no one but the speaker will actually create, edit or review. Does anyone browse those 300~700 audio changes daily? Listen to them? Can you or I review the correctness of Bengali recordings' file names? Or review the correctness of their content? Unlikely. As far as I can see, it's a stream of "Done" things: there is no practical patrolling to do on this flow of audio files, nor is there any need to patrol them. This situation is specific to LinguaLibre. Most wikis are text-based. [[:Commons:Special:RecentChanges]] is a hybrid, with uploads but also a lot of file renaming, editing, discussions and project pages, so the stream is a mix. LinguaLibre file work is close to 100% uploads by the speaker. Then nothing, the stream is dispatched to Commons.
| |
− | :::On the other hand, the meaningful changes done on textual pages which require active monitoring is made harder since these textual changes are buried down among the number of audio files. It seems to me that patrolling edited text-pages (documentation, discussion, user pages) seems as or more relevant, but is currently made inaccessible or accessible via a more complex access point.
| |
− | :::But definitive adoption would needs consensus, which we haven't. So no quick adoption needed. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 08:57, 9 October 2020 (UTC)
| |
− | :* Suggestions: a) enable user common.js and common.css pages (I was not aware that these pages are disabled!), AND/OR b) put this as an opt-in gadget at [[Special:Preferences#mw-prefsection-gadgets|Special:Preferences/Gadgets]]. The "Gadgets" page is empty now, and gradually gadgets can be added, mostly to be opted in (or manually enabled) by interested users. Kind regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 23:20, 9 October 2020 (UTC)
| |
− | ::I do not know why user common.js is disabled. Anyway, I added a new gadget for this setting. So feel free to enable it. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 09:51, 10 October 2020 (UTC)
| |
− | :::I think the default php setting is without personal js. The developer has to set <code>$wgAllowUserJs</code> to true, which 0x010C apparently never did because the need never arose before. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 13:35, 12 October 2020 (UTC)
| |
− | | |
− | == Translation == | |
− | I translated into English several pages ([[LinguaLibre:Privacy policy]], [[Help:Your first record]] and [[LinguaLibre:About]]) that were originally written in French, and marked the new versions for translation (I also marked for translation the latest version of [[LinguaLibre:Stats]], which includes the latest crossed thresholds in the description paragraph, and a new row in a table). I think the translation of the pages will be easier for non-French-speakers as from now. Though, '''''every translation of these pages - '''except in French''' - is now outdated (for all or part of it)'''''.
| |
− | | |
− | While translating (or patrolling other people translating), please be careful with the code (one should not change the code on translation pages, only on the main page (/en)). — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 11:50, 10 October 2020 (UTC)<br/>
| |
− | <small>PS: for the Stats page, I changed the translation areas, to limit the amount of code in the translation, in order to limit the risks of translators breaking the code — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:13, 10 October 2020 (UTC)</small>
| |
| | | |
− | :Perfect. Meanwhile, I finish to translate the main page and I added a button at the bottom to be able to translate this page. So, go ahead :) [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 15:25, 10 October 2020 (UTC)
| + | 13 May 2024 |
− | ::{{ping|Pamputt}} Great! All of this is a big step forward {{smile}}. I think we could also take some time to design improvements for the chat room. Like most "Village pumps", we could divide it in two parts: permanent content in one hand, such as the FAQs, but in the other hand, some content such as this topic, archived after a while (for instance every 3-6 months, since it is not too active), in order to make the page lighter and easier to read. Also, the link "Start a new discussion" does not seem to be working. When I look over the text of the Header it's clickable, but not on the actual page... — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 21:25, 10 October 2020 (UTC)
| + | * [... Many more uploads] |
− | :::{{ping|WikiLucas00}} Good idea to create a FAQ page in parallel of the chat room; feel free to start one. If you are interested in documentation, there is a meeting end of October (https://framadate.org/1C4aA6vVYWz2izgp). About archiving the chat room, this is done once a year (manually) ; see [[LinguaLibre:Chat_room/Archives/2018]] and [[LinguaLibre:Chat_room/Archives/2019]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 08:25, 11 October 2020 (UTC) | + | * Upload log 23:39 Elwinlhq talk contribs uploaded File:LL-Q5218 (que)-Elwinlhq-apaqay.wav Tag: Lingua Libre [2.2] |
− | ::::Hi {{ping|Pamputt}}
| + | * Upload log 19:05 Assassas77 talk contribs uploaded a new version of File:LL-Q9192 (cmn)-Assassas77-八角.wav Tag: Lingua Libre [2.2] |
− | ::::I [[LinguaLibre:Chat_room/Archives/2020|archived]] some topics that were solved since more than a month. Although, I still have a problem regarding the site's blue headers. I can't click on the links they contain (for example, the '''Start a new discussion''' button in the current page's header, or the link to access the main page when visiting a subpage such as [[LinguaLibre:Chat_room/Archives/2020|this one]]). Is it the same for you? Do you know what could cause this? — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:23, 10 November 2020 (UTC)
| + | * Upload log 19:05 Assassas77 talk contribs uploaded File:LL-Q9192 (cmn)-Assassas77-八角.wav Tag: Lingua Libre [2.2] |
− | :::::{{ping|WikiLucas00}} yes this is the same here. About the cause [[User:Lyokoï|Lyokoï]] told me that this is because there is some transparent image in front of the text (I did not check) so in principle it should be quite easy to fix (this is probably a CSS problem). [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 22:32, 10 November 2020 (UTC)
| + | * Upload log 16:38 Oh! Tea<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Oh!_Tea Commons > User:Oh!_Tea : « nothing » on Commons]</ref> talk contribs uploaded File:LL-Q36759-Austin Zhang-sih8 buh8 sah8 nah4.wav Tag: Lingua Libre [2.2] |
− | | + | 11 May 2024 |
− | == Stats page ==
| + | * Upload log 20:21 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-buah8.wav Tag: Lingua Libre [2.2] |
− | Is the Stats page loading now? <br>Some parts of the source page should not be taken by Fuzzybot to other language pages as it is. For example, the Statistics (En) page had language labels in "Fr" (which should have been in En, as the page was in En), while translating I fixed it. Now it is again broken, and I can not edit the bn page, other than translating. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 22:12, 10 October 2020 (UTC)
| + | * [... +172 recordings by User:Oh! Tea] |
− | :{{ping|Titodutta}} You are not able to change the labels while translating? for instance in this section, you could replace the words that are within quotation marks by words in Bengali: <nowiki><query yearmonth="Date" records="New records" speakers="Active speakers" languages="Active languages"></nowiki> — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:16, 10 October 2020 (UTC) | + | * Upload log 18:56 Oh! Tea talk contribs uploaded File:LL-Q36759-Austin Zhang-a2.wav Tag: Lingua Libre [2.2] |
− | ::Thanks, yes, [https://drive.google.com/file/d/1gUo138CfGh3Y6noWwQ144d7tMuCO2v01/view?usp=sharing translating] seems to be the only option. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 22:38, 10 October 2020 (UTC)
| + | 10 May 2024 |
− | ::* The stat page seems to be much slower now to load. Can anyone else check please? Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 00:02, 11 October 2020 (UTC)
| + | * Upload log 06:08 CapitainAfrika<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:CapitainAfrika Commons > User:CapitainAfrika : « IP block exempt » on Commons]</ref> talk contribs uploaded File:LL-Q36217 (lin)-CapitainAfrika-Wiki na monɔkɔ mua bísó.wav Tag: Lingua Libre [2.2] |
− | : {{ping|Titodutta}} I just understood you were talking about the name of each language in the table, and not the title of the columns. Sorry for this. I added a new section to the translation, it's mostly some code, and in theory the translator only has to insert the language code instead of "en" in the section. The thing is, only "fr" and "en" seem to be working... I set the English stats page and every translated stats page to "en", except for French (it made more sense like this). I don't know where to find the "languageLabel" in order to translate them into other languages... <br/> I tried many changes to the requests and really felt the slowness of the current system while waiting everytime for the tables to load. {{ping|Pamputt}}, do you know if we could be able to add a cache to this pages (to be purged on a regular basis), to avoid having to load the whole request everytime?(for translated stats pages, the waiting time is so long that I don't think many people wait until the end) — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 00:54, 11 October 2020 (UTC)
| + | * Upload log 00:14 Ardzun<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Ardzun Commons > User:Ardzun : « nothing »]</ref> talk contribs uploaded File:LL-Q13324 (min)-Ardzun-mada.wav Tag: Lingua Libre [2.2] |
− | :* Bn is working fine at [[User:Titodutta/প্রশ্ন]], other languages should work fine as well. Each language is an item such as [[Q126]], which needs labels in different languages. Thanks. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 01:00, 11 October 2020 (UTC)
| + | 9 May 2024 |
− | ::I am not really an expert of the SPARQL system. [[User:VIGNERON|VIGNERON]] knows much more about that. I only know there is a [[phab:T212079|bug report]] about the performance issues. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 08:11, 11 October 2020 (UTC) | + | * Upload log 17:08 Àncilu<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Àncilu Commons > User:Àncilu : « Autopatroller » on Commons]</ref> talk contribs uploaded File:LL-Q652 (ita)-XANA000-orsù.wav Tag: Lingua Libre [2.2] |
− | | + | * Upload log 17:05 Àncilu talk contribs uploaded File:LL-Q652 (ita)-XANA000-frac.wav Tag: Lingua Libre [2.2] |
− | | + | 5 May 2024 |
− | ==Priorities of Lingua Libre==
| + | * Upload log 21:15 Benoît Prieur<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Benoît_Prieur Commons > User:Benoît_Prieur : « Administrator » on Commons]</ref> talk contribs uploaded File:LL-Q8785 (hye)-Benoît Prieur-Artsakh.wav Tag: Lingua Libre [2.2] |
− | I've looked around and I can't seem to find any priorities of this project. It seems that the overall goal is to record pronunciation, but how this will be done is less clear. Based on my experience with Forvo, I think that this will help the project.
| + | 1 May 2024 |
− | | + | * Upload log 16:09 Penn Zero MSSJ<ref>[https://commons.wikimedia.org/wiki/Special:Log?page=User:Penn_Zero_MSSJ Commons > User:Penn Zero MSSJ : « nothing » on Commons]</ref> talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hệ số.wav Tag: Lingua Libre [2.2] |
− | === Words priority ===
| + | * Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hỗn số.wav Tag: Lingua Libre [2.2] |
− | Focus on pronouncing headwords first. Forvo is flooded with overly specific phrases that only a few users will use. It would be helpful to scrape a large authoritative dictionary such as the OED, Duden, or TLFI to get a list of words. I don't think that words are under copyright.
| + | * Upload log 16:09 Penn Zero MSSJ talk contribs uploaded File:LL-Q9199 (vie)-Penn Zero MSSJ-hằng đẳng thức.wav Tag: Lingua Libre [2.2] |
− | : We recommend frequency lists and authoritative lists, but the copyright status of those is ambiguous. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC)
| + | * [... Many more uploads] |
− | :: Could we use wiktionary to help create this official list? [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 22:45, 21 October 2020 (UTC)
| |
− | :::{{ping|Languageseeker}} see [[User:Titodutta#কোয়েরি]]'s wiki query. I think we can use wiktionary, but I'm unclear how. Maybe it's even available in the Wizard as a built-in feature when you choose the list. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:41, 22 October 2020 (UTC)
| |
− | | |
− | === Words variations ===
| |
− | For each headword, pronounce it with the definite, indefinite, and solo; e.g. "the dog", "a dog", "dog". Also pronounce the declined forms in languages such as Latin or German. Group them all on one page under the headword. For phrases, there's no need to inflect or decline them.
| |
− | : Word variations and verbs are typical of English and Western languages. I'm not sure how each language handles those questions, but I think we have no recommendation in place. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC) | |
− | :: This is why I believe it is important to put recommendations in place before the site gets too large. We don't want to have to manually deal with these issues later. I know that the editors of Forvo are struggling with precisely this issue, especially in English. The best thing to do would be to create a bot to tag alternative spellings, generate files for them, and automatically generate the pages for the alternative spelling. For example, in French, you have électroménager and électro-ménager. If a user pronounces either one of these orthographic variations, the bot should generate files and pages for both variations. Otherwise, we'll effectively be asking users to pronounce the same word multiple times. It would also probably be helpful to create a bot to scrape alternative spellings from wiktionary. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 22:44, 21 October 2020 (UTC)
| |
− | :::Maybe we should consider "Portals" per language.... with the specific tips, recommended list, active/reference users. Seems a good idea (IF someone ready to attack it XD) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:18, 22 October 2020 (UTC)
| |
− | :::: {{ping|Yug}} You inspired me to create a [[phab:T266306|phabricator ticket]] on how to do this. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 02:00, 23 October 2020 (UTC) | |
− | | |
− | === For verbs ===
| |
− | It's best to focus on the irregular and model ones first. It also makes sense to pronounce them in all the possible permutations. For example, in French, il/elle/on est should have the following entries: "il est", "elle est", "on est", "il/elle/on est", and "est"
| |
− | : As for the previous point, each user is free to record whatever he/she wants. So the question becomes how to manage/organise/browse all the recordings. On that point, everything remains to be done, so your point of view is more than welcome. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | | |
− | === User supplied lists ===
| |
− | They are great, but quickly turn into a headache. They require lots of proofreading that can overwhelm editors. Only after we finish pronouncing all the headwords and verbs should we open this to general suggestions.
| |
− | :For "user supplied lists" and "site supplied lists" there are some ongoing efforts this side. An user can create a list which becomes a site supplied list for later users. [EDIT]: We have help pages recommending and demonstrating how to create frequency lists for better impact. See [[Help:Main]]> search "frequency". [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC) | |
− | :: If I understand correctly, you suggest to have "official" lists that are proofread so that we can propose these lists in priority to the users. I think it is a good idea because some users do not always know which words to record. We should open a Phrabicator ticket to keep track of that feature request. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | ::: Precisely, I'm not against allowing speakers to pronounce whatever they want, but we should have some official list of what we want pronounced. This is why I also suggested the different forms that we should target, especially if we begin to suggest words for users to pronounce. Otherwise, people will begin adding lists with misspelled words or phrases that have no widespread usage. This will create unnecessary work for editors to correct and delete. Do we really want discussions about whether or not "The Pink Adrietic restaurant will closed today at 9:30 due to an alien invasjon" should be on the official list at this stage? What will we do if a user adds 60,000 of these? [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 22:56, 21 October 2020 (UTC)
| |
− | ::::(Note: I edited my paragraph above.) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:43, 22 October 2020 (UTC)
| |
− | | |
− | === Site supplied lists ===
| |
− | Such lists of words that users can pronounce are better from a project management standpoint. This list should be randomized at each refresh in case the user is not interested in the selection. Let users see the list first and then make them log in.
| |
− | : I am not sure to understand exactly what you mean. Is it related to the previous item? Could you give an example? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC) | |
− | :: It's related to the previous item. If we create official lists, we should not adopt the last-in/first-pronounced model of Forvo. On Forvo, the last word added is the first word in the list for users to pronounce. Speakers have no option to change the way that the list is generated. I believe that we should have more flexibility. Instead of displaying the last words first, I'm proposing that LL randomly sorts the list by default. We could also add a drop-down menu with: Random, Newest, and Oldest. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 23:00, 21 October 2020 (UTC) Edit: Official lists can also help prevent unnecessary duplication of effort. Look at the entry for "arbre" on French Wiktionary. Do we really need 32 pronunciations of "arbre"? Does it make sense to add 32 different sound files to one wiktionary page? [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 16:06, 22 October 2020 (UTC)
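The three orderings proposed above (Random, Newest, Oldest) can be sketched in a few lines. This is only an illustration, not how the Record Wizard works: the function name <code>order_entries</code> and the <code>(word, added_at)</code> entry shape are assumptions made up for this example.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical entry shape: (word, added_at), where added_at is any
# sortable value such as a Unix timestamp.
Entry = Tuple[str, int]


def order_entries(entries: List[Entry], mode: str = "random",
                  rng: Optional[random.Random] = None) -> List[str]:
    """Return the words ordered by the chosen mode: random, newest or oldest."""
    if mode == "newest":
        ordered = sorted(entries, key=lambda e: e[1], reverse=True)
    elif mode == "oldest":
        ordered = sorted(entries, key=lambda e: e[1])
    elif mode == "random":
        ordered = list(entries)  # copy so the caller's list is left untouched
        (rng or random).shuffle(ordered)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [word for word, _ in ordered]
```

Making "random" the default matches the proposal; passing a seeded <code>random.Random</code> is only there so the shuffle can be reproduced when testing.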
| |
− | :::Note: There are various schools of thought here on LinguaLibre:
| |
− | :::* the '''linguists''', who would be happy to have 200,000 versions of "arbre" so one could study the variability.
| |
− | :::* the '''language teachers/learners/learning app developers''', who want one clear and standard speaker for the 10~30,000 most frequent words, just once each and with no gaps in the dataset.
| |
− | :::[[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:45, 22 October 2020 (UTC)
| |
− | :::: I'm not for limiting the maximum number of pronunciations for an item, but I do not want a situation where "arbre" has 200,000 pronunciations and "cigale de mer" has zero. A site supplied list will make it more likely that rarer words will receive at least one pronunciation. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 02:07, 23 October 2020 (UTC)
| |
− | :::::+1. Maybe a specific label within the list's pagename. Ex: List:CMN/HSK-0001-to-8868_(RECOMMENDED) ?
| |
− | :::::This decision could be made within a language community via its Portal. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:11, 23 October 2020 (UTC)
| |
− | ::::::+1. I like the idea of adding labels. I think this should also help avoid any copyright claims. We can add tags such as "HSK 1.1" (HSK Revision 1 Level 1) "HSK 2.1" (HSK Revision 2 Level 1) and "HSK 3.1" (HSK Revision 3 Level 1) for the various iterations of HSK and that way a user can easily see all the words pronounced for a given language list. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 00:48, 24 October 2020 (UTC)
| |
− | :I also support this idea. Currently, all the lists of a given language are proposed to the contributor. For the French language, it starts to become a bit messy to find what we are interested in. So I think the system of local list of the Record Wizard should be improved to highlight some specific lists. I will open a Phabricator ticket to keep track of this idea. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 13:46, 1 November 2020 (UTC)
| |
− | | |
− | === Accents matter ===
| |
− | :''See also: [[Help:Renaming]] (using metadata tags). | |
− | They should be tagged as part of the filename. For example, <code>LL-Q1860_(eng)-Commander_Keane-phonate.wav</code> contains no accent information. <code>LL-Q1860_(eng_Au)-Commander_Keane-phonate.wav</code> would be better. Also, allow users to filter by accent codes. We would need to think about and propose a list of accents for each language.
| |
− | : The choice has been done to tag the location in the wiki metadata. Indeed, people are not always aware that they have an accent and an accent can highly vary in a given country or region. So it becomes difficult to find the good granularity. So that, saving the location allow to write some query to get exactly the recordings we are interested in. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | :: But geographic location is not an accurate predictor for an accent because people move around and some have speech impediments. For example, you can have an American living in Paris or a Parisian living in America. Who will speak the words more accurately? A person living in London could speak the Queen's English or have a Cockney accent. Especially for language learning, accents are important. We're not judging accents, but merely tagging their existence per speaker. If a user has an incorrect accent listed, I believe that mods should be able to change it and Lingua Libre will automatically retag all their pronunciations. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 23:06, 21 October 2020 (UTC)
| |
− | :::{{ping|Languageseeker}}: I believe information on accent is preserved thanks to the file '''metadata'''. To make it more visible, see [[Help:Renaming]] and fetch the metadata tag's value to push it into a new filename of your desired shape. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:48, 22 October 2020 (UTC) | |
− | :::: {{ping|Yug}} I don't see accents on either user profiles or on Wiki Commons. See: speaker [[Q141723]] for instance. Accents should be easily visible and filterable. For Example, on commons there should be a category for Lingua Libre pronunciation in French (Parisian Accent). On the description page for a file on Commons, there should be "AccendId" under "languageId" [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 16:00, 22 October 2020 (UTC)
| |
− | :::::The profile contains the location of where you learnt a language. City and country, if I remember well. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:59, 22 October 2020 (UTC)
| |
− | ::::: {{ping|Yug}} Correct, but this does not necessarily translate into accent. We're making an assumption that an accent can be geographically located and that all individuals that inhabit that geographic location will have a particular accent. Also, an accent can span across geographic region as well. What's wrong with adding a third parameter called accent? Is there any reason that you feel that it would be detrimental? [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 20:10, 22 October 2020 (UTC) | |
− | ::::::The filenames are already 3 times too long in my opinion.........
| |
− | ::::::I don't remember clearly. But maybe we then assumed the *speaker* to be the data marking the accent. {{reply to|Lyokoï|p=}} may remember. --[[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:16, 22 October 2020 (UTC)
| |
− | :::::::Could we set an advanced option in the user profile, where users can decide how they wish the language files to be named based on metadata? That way users can fine-tune the way they see the files.
| |
− | :::::::My point is that this is an invalid assumption. People speak with dialects and may have a speech disorder that Lingua Libre should have a metadata tag for. It's easier to add at first than to have to manually add later. It shouldn't be that hard to implement. Furthermore, it will also make it easier for users to filter pronunciations. Take, for instance, Russian, which has three major accents with twenty-four sub-divisions in Russia. Each accent occurs in hundreds of cities and villages. If we don't have dialect metadata, then we need to create a list with all of these geographical locations to group the pronunciations in the same dialect. If we have dialect metadata, then I can just filter by "Central Russian" or "Chukhloma enclave." All we would need to do is add the following three options to a person's user page: "Dialect" (required); "Dialect Sub-group" (optional); "Speech Disorder" (optional) and then have that propagate automatically to all their pronunciations. To help users and promote standardization, we can make these drop-down menus with an option for custom. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 01:12, 24 October 2020 (UTC)
| |
− | :I really think that accents are subjective. In France, the "standard" accent is the one used in Paris, so people from Paris think they have no accent. Someone living in the North or the East of France may think he/she has no accent, but people from other regions think they have one. So we are not always aware that we speak with an accent, and if we are aware, we do not know how to name it. That is why the information of the location, even if not perfect, is not so bad.
| |
− | : That's said, to fit your need, there is [[phab:T201135|this ticket]] that asked for custom categories. I think it can be used for this purpose. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 13:43, 1 November 2020 (UTC) | |
− | | |
− | === Authoritative lists (& copyrights) ===
| |
− | Lists such as HSK or JLPT should be an high priority. This would help language learners the most.
| |
− | : No problem to import such lists if they are not copyrighted. No idea on that. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | ::{{ping|Languageseeker|Pamputt}} Actually, we are not Wikipedia nor Commons. So we could set up our own copyright rule in accordance with the server's geographic localisation and associated laws. We can consider, like Anki and Skritter.com do, that we are just '''hosting content uploaded by the user, who is the legally responsible party''', and as European and French law requires, we will take down any content '''following a formal complaint'''. This is the true legal requirement we have in France. Our (LinguaLibre) rules don't have to be as pro-active as Wikipedia or Commons, which decided to go beyond what is legally required. Where we put the cursor is really up to us. It's a matter of internal policies. Some tolerance such as the one I cited above could be greatly advantageous to LinguaLibre's objectives. Indeed, it's the strategy that Anki, Skritter, Memzine, Duolingo, and many other actors of online language learning took. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:55, 22 October 2020 (UTC) | |
− | ::: {{ping|Yug|Pamputt}} Agreed, we shouldn't worry too much about copyrights until someone submits a claim and then remove the entries. I think it'd be very difficult to file a copyright claim for "chien." We could also automatically reorder list to prevent an argument that the specific order makes them somehow under copyright. We could also rely on corpuses in the PD at first. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 16:04, 22 October 2020 (UTC)
| |
− | ::::'''Copyrights violations''' are not possible for stand alone lexemes such "狗" (gǒu: dog) from HSK.
| |
− | ::::List are different : lists are specific creation of one's mind so its author can claim copyrights on a given list.
| |
− | ::::Still, the Chinese Ministry of Education, which authored the HSK '''''LIST OF SELECTED WORDS''''' and could legally file a copyright complaint, has never filed copyright claims against any for-profit companies to remove their online HSK lists. Then why would the MoE do so for a non-profit? And if they do, we can simply delete the said list(s).
| |
− | ::::'''Shuffling''' is not enough to claim '''difference and originality''', especially when your page is named "List:cmn/HSK1" ^^ | |
− | ::::To claim originality the minimum would be to substantially edit the list. In order not to lose data, this leads to adding words. The HSK 1 to 6's 8800+ words could indeed be extended to 11000 via a merge with a relevant frequency list; around this threshold of difference we could start to claim originality. The algorithm could be slightly more complex, with ranking... You see the idea. But I think the fair-use option is more practical and relevant for us (see below).
| |
− | ::::'''Sum up:''' as I shared above, I think we could advise and state that :
| |
− | ::::* Our policy should be based on the law of the land (Europe & France's laws) and current observed online practices.
| |
− | ::::* Our users make fair-use judgement and uploads the lists | |
− | ::::* Our admins and/or bureaucrats (?) receives the copyrights claims, one of them do a rapid review process on a case by case basis, then remove the list if the complain has merit.
| |
− | ::::This seems a good balance satisfying both relevant laws together with our project's objectives and interests. Can we push this way ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:40, 22 October 2020 (UTC)
| |
− | ::::: {{ping|Yug}} This is an awesome idea. I think that it would make a lot of sense to combine a set of frequency lists with official lists, then deduplicate them. This should prevent any copyright claims. We can even think about whether it makes sense to create frequency lists based on data in Project Gutenberg as part of this process. Then, we should also write a script to automatically add inflections, conjugations, articles, etc. based on wiktionary data. These lists could be the first set of official lists for LL. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 22:06, 22 October 2020 (UTC) | |
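The "combine, then deduplicate" step mentioned above could look roughly like the sketch below. It is an assumption-laden illustration: <code>merge_word_lists</code> is a made-up name, and treating words that differ only by case or Unicode composition as duplicates is a design choice that will be wrong for some languages (for example where capitalisation distinguishes two words).

```python
import unicodedata
from typing import Iterable, List


def merge_word_lists(*lists: Iterable[str]) -> List[str]:
    """Merge several word lists, keeping the first occurrence of each word.

    Earlier lists take priority, so pass the list whose order matters most first.
    """
    seen = set()
    merged: List[str] = []
    for words in lists:
        for word in words:
            # Normalise whitespace and Unicode composition before comparing.
            cleaned = unicodedata.normalize("NFC", word.strip())
            key = cleaned.casefold()  # assumption: case-insensitive duplicates
            if cleaned and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged
```

For instance, <code>merge_word_lists(hsk_words, frequency_words)</code> would keep the HSK order and append only those frequency-list words that are not already present (both variable names are hypothetical).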
− | ::::::We don't have the human resources to find and merge those list as I wished above. It need knowledge of the target language, of the available resources (only major, official languages such as EN/FR/ES/DE/JA/CN/KO have HSK/JLPT-like lists), and programming skills. Then add free time and willingness. Having them all is very rare. Better to redefine our copyright rule toward more tolerance so we accept any list there is. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:09, 23 October 2020 (UTC)
| |
− | ::::::: Agreed. I do think that this might take too much time and dev energy for a temporary and potentially problematic task. I'm against user supplied lists at this stage for the same reason. I don't think that we have the resources to proofread and merge multiple lists. I'd propose focusing on creating an official list from the various languages of Wiktionary because that will have no copyright claims and will, eventually, contain all the words and phrases in a language. If we create a list from that, it will probably contain around 500,000 to 600,000 items per language, which will be enough for an initial set, and then we can open it up to user suggestions afterwards. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 00:39, 24 October 2020 (UTC)
| |
− | :{{ping|Languageseeker}} hi there, ex-phd in Chinese Language learning with a focus on vocabulary and elearning. There is a common agreement in the foreign language acquisition academic literature that ''"a fluent adult master about 20k words"'' (concepts), whatever the language we talk about. When learning a new language, 3000 words (most common concepts) are enough to kick start autonomous learning : with this 3000 one student can ask about and learn the other needed words. 20k is native-like mastery. 50k is paper dictionaries. +500k would be only relevant as a multi-year target for radical audio lexicography (wiktionary !). Such massive scale would surely require automated script and can't be done by hand. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:03, 11 December 2020 (UTC)
| |
− | | |
− | === Recording quality and post recording clean ups matters ===
| |
− | Words pronounced with lots of static or background hum should be deleted unless it's a truly rare language.
| |
− | : We have no or weak process for that. There are documented methods to denoise audios, see : [[Help:SoX]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC)
| |
− | :: Yes, some tools have to be developed to control that. Ideas are welcome. About denoising, a [[phab:T251638|ticket]] already exists to add it by default in the Record Wizard. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | ::: Most of this has to do with recording equipment. On Forvo, recording quality is usually pretty fairly consistent for users. We may have to flag users with recording quality issues and delete their files. It could be a manually review process for a few files or batch deletion. The same would be true for users with terrible or fake accents. (Think Dick Van Dyke in Mary Poppins) [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 23:11, 21 October 2020 (UTC)
| |
− | ::::Personally, I think we should argue more for good microphones and silent rooms. We frequently chase after users and get slightly noisy audios which are not satisfying. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:03, 22 October 2020 (UTC)
| |
− | :::: {{ping|Yug}} It seems that we need a statement on our expectations for recording quality. In the end, we only need one good pronunciation entry per language + accent. Terrible quality recording help nobody unless they are the only one that we have. It might be worth having a voting system such as on Forvo to help flag good/bad speakers. We can even toy with the idea of a speaker of the week or month to reward those who really help us out. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 16:46, 22 October 2020 (UTC)
| |
− | :+1. We need a quality statement somewhere. It's not perfect but it sets the tone a bit so we may require more quality from contributors. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:11, 22 October 2020 (UTC)
| |
− | :: I decided to create a draft of standard for Libre Lingua in [[phab:T266309|phabricator.]] [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 03:36, 23 October 2020 (UTC)
| |
− | :::{{reply to|Languageseeker}} Not dev related, to move back here. Also we need to check we may have something already in place making most of the job. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:03, 23 October 2020 (UTC)
| |
− | ==== Minimal threshold ? ====
| |
− | I also recommend pushing for sets of more than 1000 words. Basically, on the download page, we should compile lists by language, and all datasets (one speaker, one language) of fewer than 1000 audios should be ignored. We assume that sets of 1000+ audios are autoconfirmed, while sets of <1000 audios come from potential beginners and are likely noisy ''play-around'' recordings (as for myself! My audios are just tests and not good!). Ideally we would have a download page such as:
| |
− | {| class="wikitable"
| |
− | |+ Download audios
| |
− | ! Language || All audios || Top 1 speaker || Autoconfirmed speakers (≥1000) || Other speakers (≤999)
| |
− | |-
| |
− | | French || 88,934 audios by 34 speakers || 47,076 audios by speaker Tom Smith|| 76,567 audios by 4 speakers || 12,367 audios by 30 speakers
| |
| |- | | |- |
− | | Gascon || ... || ... || ... || ... | + | |colspan=2| <small><references /></small> |
| |} | | |} |
− | [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:11, 22 October 2020 (UTC) | + | [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:38, 14 May 2024 (UTC) |
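The columns of the proposed download table can be derived from a per-speaker count for one language. The sketch below is only an illustration of that split, assuming a plain mapping from speaker name to number of audios; <code>summarize_language</code> and <code>AUTOCONFIRMED_MIN</code> are hypothetical names, and the threshold of 1000 is the one suggested above.

```python
from typing import Dict

AUTOCONFIRMED_MIN = 1000  # proposed threshold: sets of 1000+ audios are "autoconfirmed"


def summarize_language(counts: Dict[str, int],
                       threshold: int = AUTOCONFIRMED_MIN) -> Dict[str, tuple]:
    """Split one language's per-speaker audio counts into the table's columns.

    Each value is (number of audios, number of speakers), except "top",
    which is (speaker name, number of audios).
    """
    confirmed = {s: n for s, n in counts.items() if n >= threshold}
    others = {s: n for s, n in counts.items() if n < threshold}
    top_speaker = max(counts, key=counts.get) if counts else None
    return {
        "all": (sum(counts.values()), len(counts)),
        "top": (top_speaker, counts.get(top_speaker, 0)),
        "autoconfirmed": (sum(confirmed.values()), len(confirmed)),
        "other": (sum(others.values()), len(others)),
    }
```

By construction the "autoconfirmed" and "other" columns add up to "all", as in the French example row (76,567 + 12,367 = 88,934 audios; 4 + 30 = 34 speakers).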
− | : Disagree, I don't think we should require a minimum number because that would discourage users. This would especially impact language with fewer speaker and can perpetuate oppression. Instead, I think we should not divide audio into datasets by speaker. Moreover, a system to vote on pronunciations and report them can help to flag problems. As a final resort, we can vote on whether or not to batch delete the pronunciation of speakers that are particularly horrible. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 03:36, 23 October 2020 (UTC)
| |
− | ::{{ping|Languageseeker}} datasets are grouped by languages via downloadable zips, then by speakers.
| |
− | ::My proposal is to create variable packagings for one languages : All / Top speaker / Autoconfirmed speakers / Non-autoconfirmed speakers.
| |
− | ::The ranking system / API would be great yes. Right now when I review a list of words I have to copy the filename(-filepath), store it, to then send a message "this audio is to redo". Not right. Maybe a smart template could do as of now. There is also the question of synch between Lili and Commons to keep in mind. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:00, 23 October 2020 (UTC)
| |
− | | |
− | === Volume normalization ===
| |
− | :''{{done}} -- feature request on Phabricator. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:05, 23 October 2020 (UTC)''
| |
− | It should be automatically provided across the entire corpus. We don't want one word at 140 dB and the other at 20 dB.
| |
− | : Volume normalization : we already have some normalization, we reject low db and high db recordings. But I cannot specify the exact mechanism : per file ? per recording set? I think it's the former. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC)
| |
− | :: This has already been proposed and is saved in [[phab:T213535|ticket]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | | |
− | === Anki plugin ===
| |
− | ::''{{done}} a proposal has been documented on Phabricator. To keep in mind and follow through there. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:30, 22 October 2020 (UTC)''
| |
− | It would be great to develop an Anki plugin that would enable users to automatically add audio to flashcards. The biggest downside of Forvo is that it requires users to add one word at a time manually. Providing an Anki plugin will help to popularize this project and attract new users. Since Anki is Python-based, the plugin could build on the French bot. Having a large group of testers can help to identify how the metadata of these files can be improved.
| |
− | : Anki plugin: YES, it's about 1 to 2 days of work. Maybe Anki folks could help.
| |
− | :# Read Anki's documentation for Anki decks folder's syntax
| |
− | :# Download the target [https://lingualibre.org/datasets/ language folder]
| |
− | :# Create a bash script to pick up the filepaths, the words, and create the Anki decks file with the proper syntax.
| |
− | :# Document process to share file with Anki community [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC)
| |
− | :: Please, feel free to open a Phabricator ticket to keep in mind this need. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | :::Appreciated. I'll write up a full proposal in the next few days and open a Phabricator ticket.
| |
− | :::[EDIT]:Ticket created for [[phab:T266209|Anki Plugin]]. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 05:22, 22 October 2020 (UTC)
| |
− | ::::Awesome. Thank you :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 15:06, 22 October 2020 (UTC)
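Step 3 of the list above could start from a sketch like this one. It is not the plugin tracked in the Phabricator ticket. It assumes Anki's plain-text import (tab-separated fields, with audio referenced as <code>[sound:file name]</code> and the audio files copied separately into the collection's media folder). It also assumes file names of the form <code>LL-Q150 (fra)-Username-word.wav</code> with no hyphen in the username. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints one tab-separated note line for Anki text import,
# "word<TAB>[sound:file name]".
# Assumptions: file names follow "LL-Q150 (fra)-Username-word.wav" and the
# username contains no hyphen (the word itself may contain some).
ll_anki_line() {
  file=${1##*/}            # drop any directory part
  stem=${file%.*}          # drop the extension
  rest=${stem#*\)-}        # drop the "LL-Qid (iso)-" prefix
  word=${rest#*-}          # drop the username, keep the word
  printf '%s\t[sound:%s]\n' "$word" "$file"
}

# Prints one note line per .wav file found in the given directory.
ll_anki_deck() {
  for f in "$1"/*.wav; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    ll_anki_line "$f"
  done
}
```

For example, <code>ll_anki_deck ./fra > fra.tsv</code> would write an importable text file, where <code>./fra</code> stands for a downloaded language folder (the folder layout is an assumption).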
| |
− | | |
− | === Flac ===
| |
− | Files should be uploaded and stored as '''FLAC files''' to enable '''tagging''' and reduce file size. All modern browsers support FLAC, and it has emerged as the default lossless audio compression format, widely supported across different devices. Tagging will help keep the metadata with the file and enable easier renaming by end users or bots.
| |
− | : Tagging and FLAC: we do tag files in the files' code; we already had a long discussion about file formats and chose to keep .wav, together with sharing scripts to mass convert to alternative formats. See [[Help:Convert_audios%3F]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC)
| |
− | :: I was not aware of the reason for choosing the WAV format. Concerning FLAC, there is [[phab:T213534|a ticket]] explaining why it is interesting (in addition to the points given here). [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:06, 21 October 2020 (UTC)
| |
− | :::It's a decision by Mr. Vion and {{reply to|Lyokoï|p=}} if I remember correctly. Maybe worth creating a [[Help:Formats]].
| |
− | :::Side note: it also seems to be the time to create categories to class our help pages via several dimensions : scripts, lists, guidelines, recording, ... ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:32, 22 October 2020 (UTC)
| |
− | | |
− | === Discussions ===
| |
− | These are my few thoughts and I'd love to hear any feedback. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 19:30, 20 October 2020 (UTC) Edited on 20 October 2020, 23:35 (UTC)
| |
− | :{{ping|Languageseeker}}Hi there, thank you for this review. I edited your points and added bold so the key topics are more visible and we agree on naming for the discussion to continue. I will try to answer to several of your points [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:47, 21 October 2020 (UTC)
| |
− | ::I reorganised by section and move your answer, Yug, so that it will be easier to follow the different points (I hope so). I will add some answers as well. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 15:49, 21 October 2020 (UTC)
| |
− | ::: [[User:Pamputt|Pamputt]] Thank you!! [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 23:12, 21 October 2020 (UTC)
| |
− | ::::Thanks, good by me. As long as it improves and is in good faith it's a good practice to allow. ;) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:21, 22 October 2020 (UTC)
| |
− | | |
− | == Datasets to download ==
| |
− | Hello team, I just noticed that https://lingualibre.org/datasets/, which is central for external developers such as the Anki community to reuse our audio work, has been lost in the recent UI revamp. Any idea where to put it back so it stays highly visible? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:50, 21 October 2020 (UTC)
| |
− | :The dates visible are also mainly from 2019. Any idea what they are ? First compilation ? Last compilation ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:53, 21 October 2020 (UTC)
| |
− | ::Yea, there's a message redirecting towards the datasets in [[DataViz:Records]] but I agree they would be more visible under "Actions" (upper left) or "Tools" (bottom left) - [[User:Eavq|Eavq]] 18:40, 29 October 2020 (UTC)
| |
− | :::Another idea would be to add a menu besides Record Wizard, Discussion, Statistics, Help and About. You could name it "Dataset". That's said, I do not know whether we can do it by ourselves or whether we need access to the server. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 13:30, 1 November 2020 (UTC)
| |
− | | |
− | == CSS fixes ==
| |
− | :See [[MediaWiki:Common.css]]
| |
− | There is clearly some CSS to update. Most notably for the H2, H3, H4 section titles *within wikipedia pages*. I will test some solutions soon. Feel free to test other CSS aspects in [[MediaWiki:Common.css]] (admin only?) or in [[User:Yourname/common.css]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:55, 21 October 2020 (UTC)
| |
− | :* This is indeed interesting. Spacing and other things need(ed) some attention. A good idea might be to try all codes from [[:en:Help:Cheatsheet]] and see the results. Regards, --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 21:23, 25 October 2020 (UTC)
| |
− | | |
− | == Proposed Community Standards ==
| |
− | Based on a conversation with Yug, I've decided to draft a community compact statement that sets out our standards. Any and all feedback welcome.
| |
− | | |
− | | |
− | Lingua Libre is dedicated to providing free, high-quality recordings of words and phrases in all languages. To achieve this, we ask you to abide by the following community standards:
| |
− | # All people and accents are welcome. However, do not assume an accent that you do not normally use.
| |
− | # Please, fill out your profile information accurately. This enables the correct usage of your pronunciations.
| |
− | # Do not upload pronunciations that you did not create or are posted with a restrictive license elsewhere.
| |
− | # Please, record in a quiet room with no background noise audible when listening with headphones.
| |
− | # Do not include excessive silence before or after your pronunciations.
| |
− | # Record your pronunciations in a relaxed, neutral tone. If you wish to pronounce them fast, slow, or emphatically, please tag your pronunciations appropriately.
| |
− | # Discrimination, cyberbullying, harassment, stalking, or any other form of intimidation on the base of age, accent, class, disability, ethnic identity, gender identity, geographical location, marital status, native language, political beliefs, race, religious identity, religious beliefs, sex, sexual orientation, or any other category will not be tolerated. Any such behavior will result in a permanent ban.
| |
− | # Lingua Libre reserves the right to delete any pronunciations that do not follow these guidelines.
| |
− | | |
− | [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 00:34, 24 October 2020 (UTC)
| |
− | | |
− | *Hello, points are OK, except #8, for various reasons such as a) LL pronunciation are uploaded on Wikimedia Commons and follow Wikimedia Commons guidelines, several of the points mentioned above are not reasons to delete on Wikimedia Commons. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 21:14, 25 October 2020 (UTC)
| |
− | :: I appreciate your feedback. I think that we need to review files before adding them to Wikimedia Commons because otherwise we have the potential to flood Commons with recordings of extremely poor quality. I've heard some files with truly horrific sound quality issues on Forvo and have also seen files in which the speaker did not properly read the text. In any language, pronouncing the headwords and all their variants will amount to somewhere between 200,000 and 1,000,000 unique entries. We don't want that many unusable files on Wikimedia. I'm OK with just flagging the files and then deleting them when better quality versions are available. Of course, for rare languages or places of more limited means, we won't delete files just because of sound quality.
| |
− | :: I mainly want to have a deletion policy for users who pronounce words in a stereotypical manner that perpetuates discrimination. For example, a white person speaking "black," a straight person attempting to sound "gay," etc. These should be collective judgement calls by administrators, with bans and deletions. [[User:Languageseeker|Languageseeker]] ([[User talk:Languageseeker|talk]]) 00:42, 26 October 2020 (UTC)
| |
− | :::Actually the problem right now is that we do not have any sysop tool to check the quality of the files and to manage these files. It may evolve in the future, but currently we have no means to apply the policy we discuss here. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 13:27, 1 November 2020 (UTC)
| |
− | | |
− | == Lingua Libre bot on Wiktionary ==
| |
− | Most probably LinguaLibre bot is working on 2–3 Wiktionary projects now, such as French, Occitan etc. What are the steps to enable the bot for other projects such as [[:bn:wikt:|Bengali Wiktionary]]? --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 15:14, 27 October 2020 (UTC)
| |
− | :{{ping|Titodutta}} Do you have the bot name at hand, could you share it? I think the bot master was 0x010C, who is leaving for a year nearly off grid (scientific mission in an isolated place).
| |
− | :We need to call for new botmasters and bot hacker/coder. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:26, 31 October 2020 (UTC)
| |
− | :: The bot's name is [[m:User:Lingua_Libre_Bot|Lingua Libre Bot]]. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 00:27, 1 November 2020 (UTC)
| |
− | :I do not think Lingua Libre Bot will support new Wiktionaries soon because 0x010C is away for a long time. That said, [https://github.com/lingua-libre/Lingua-Libre-Bot the code is public] and it is possible to use it to run your own bot. You can adapt the code from [https://github.com/lingua-libre/Lingua-Libre-Bot/tree/master/wikis an already supported Wiktionary] to your Wiktionary. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 13:24, 1 November 2020 (UTC)
| |
− | | |
− | == Template:Welcome ==
| |
− | {{done}}<br>
| |
− | It would be good to create a [[Template:Welcome]] (see a similar template on French Wiktionary [[:fr:wikt:Modèle:Bienvenue]] or English Wikipedia [[:en:Template:Welcome]]). We see editors joining this site, we have seen more participation from South-East Asia, and India recently. It would be good to send a welcome message that will have links to important pages, including this Project Chat page. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 14:04, 6 November 2020 (UTC)
| |
− | :I've imported [[Template:Welcome]] from Wikidata (it is an international template). Now, we need to write the welcome text specific to Lingua Libre. It happens [[Template:Welcome/text|here]]. You are more than welcome to make a nice template. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 15:37, 6 November 2020 (UTC)
| |
− | ::{{ping|Pamputt}} I made a [[Template:Welcome/text|very first draft]], which needs styling improvements, in order to make it fit in a smaller space, like a box. I had issues with the translation tags, especially when it came to links (the <tvar> tag for links in translations does not seem to work, do you know why?).
| |
− | ::Do you agree with the content? Do you see important information to add? — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 20:50, 8 November 2020 (UTC)
| |
− | ::* Hello, one stable version can be created on a sandbox, and other versions can be translated from that. Links like [[Special:MyPage]] work for any language. Any link of the main text preferably should have a filter of "Special:MyLanguage". I would like to see 2 points: a) please make sure to listen to all the pronunciations before uploading, b) consider using a microphone, and upload audio without background noise (or in other words, it would be good to add some details about the best practices). Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 21:42, 8 November 2020 (UTC)
| |
− | ::::I changed appearance a little and made [[User:WikiLucas00/Welcome/text/sandbox|this sandbox for the text]], but I think we all agree that it's not nice. We need someone with good skills to design the style of this page, once we have all agreed on its content. {{ping|Titodutta}} we already give advice about best practices in the Help pages, but this of course still needs to be improved. Regards 🙂 — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 02:18, 9 November 2020 (UTC)
| |
− | To {{ping|Pamputt|Titodutta|Yug|Languageseeker|Eavq|Nicolas Lopez de Silanes WMFr}} , ...<br/> Hello everyone. Could we talk about the information we want to display on this Welcome template ? I created [[User:WikiLucas00/Welcome_collab|this collaborative page]] that we could all edit to add (or remove) points that we find interesting. This is only about the content, and the page is currently in English (we will focus on appearance and other languages later). Regarding the appearance, I made [[User:WikiLucas00/Welcome/text/sandbox|this suggestion]] (not definitive). Please tell me your thoughts about it.<br/>Best regards — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 23:30, 15 November 2020 (UTC)
| |
− | : Nice job [[User:WikiLucas00|WikiLucas00]]. I really like [[User:WikiLucas00/Welcome/text/sandbox]]. The current content looks good to me. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:03, 16 November 2020 (UTC)
| |
− | :* Great work indeed WikiLucas00. It looks fantastic. I have added a couple of [https://lingualibre.org/index.php?title=User%3AWikiLucas00%2FWelcome%2Ftext%2Fsandbox&type=revision&diff=357645&oldid=357167 subst options]. Test results can be seen [https://lingualibre.org/index.php?title=User:Titodutta/sandbox1&oldid=357646 here]. Using <nowiki>{{subst:User:WikiLucas00/Welcome/text/sandbox}}</nowiki> will call the BASEPAGENAME, and put the signature I think. I have also linked the logo: as it is under CC-SA (not CC0), it needs a link back for attribution details. Feel free to revert as it is your sandbox. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 10:13, 16 November 2020 (UTC)
| |
− | ::{{ping|WikiLucas00|Titodutta}} I moved what WikiLucas00 has prepared to [[Template:Welcome/text]] so that it is right now usable with [[Template:Welcome]]. So if the content looks good to you, you can translate it to your language. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:37, 17 November 2020 (UTC)
| |
− | ::: Great job! We are now able to greet properly future newcomers 🙂 How can we watch/be alerted when new contributors contribute here for the first time? — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:20, 17 November 2020 (UTC)
| |
− | :::: I do not know a perfect way to be aware of new user here but you can watch the changes in the [https://lingualibre.org/index.php?title=Special:RecentChanges&days=30&from=&namespace=2&limit=800 User namespace]. When you see a line such as "User:XYZ/RecordWizard.json", it means that this user set its personal config. So it is a good start to see whether he/she already received a Welcome message. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 16:43, 18 November 2020 (UTC)
| |
− | ::::* Or [https://lingualibre.org/index.php?title=Special%3ALog&type=newusers&user=&page=&year=&month=-1&tagfilter=&hide_thanks_log=1&hide_patrol_log=1&hide_tag_log=1 User creation log]. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 18:43, 18 November 2020 (UTC)
| |
− | :::::{{ping|Pamputt|Titodutta}} Thank you! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:04, 18 November 2020 (UTC)
| |
− | :::PS: When you want to use the template to welcome a new contributor, '''please use it this way:''' <code><nowiki>{{subst:welcome|--~~~~}}</nowiki></code>. Thank you! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:28, 17 November 2020 (UTC)
| |
− | === Users ===
| |
− | {{done}}<br>
| |
− | ''Feel free to make it a level 3 header, however this idea came to my mind when I was checking text of this template''<br>
| |
− | It would be good to register [[User:Example]] and [[User:Lingua Libre]]. User:Example userpage and its subpages can be used for demo user pages or other demos. Lingua Libre username can be taken as that is the project name. Preferably someone (admin/from Wikimedia Fr) should register the names. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 11:28, 16 November 2020 (UTC)
| |
− | :Because we can log in Lingua Libre only via an account on any Wikimedia website, I checked what is the status of [[c:User:Example|User:Example]] and [[c:User:Lingua Libre|User:Lingua Libre]]. The first is blocked and only used as an example on Wikimedia Commons. So here, I think we can use this page ([[User:Example]]) for our purpose and I will protect it so that only admin can edit. About [[User:Lingua Libre]], there is already a real account. It has been created in 2017 by someone involved (I guess) but I do not know who. So the best is not to use this account because it could be used, in theory, and we work only with [[User:Example]]. Thus, I have copied the content of [[User:WikiLucas00/User_Page_Demo]] to [[User:Example]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 17:07, 16 November 2020 (UTC)
| |
− | | |
− | | |
− | '''Update''': The template is now fully useable and translatable. I also imported [[Template:PedagoWikiCode|this template]] from the French-speaking Wikisource, in order to explain how to use the babel template, on this demo user page: [[User:Example]] (also translatable, but using LangSwitch --you have to create a subpage named with your language code, and list it on the main page). — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 19:15, 18 November 2020 (UTC)
| |
− | | |
− | == About the exclusion of already recorded words ==
| |
− | Hi, I think the option to exclude words that I have already recorded is broken. This morning, I started a recording session and LL offered me words that I recorded two days ago. For example, I already recorded [https://commons.wikimedia.org/wiki/File:LL-Q143_(epo)-Lepticed7-Belorusino.wav Belorusino] two days ago, but it does not disappear when I click "exclude words already recorded". Note the two versions of the file, since I have now re-recorded it. Can someone fix this? [[User:Lepticed7|Lepticed7]] ([[User talk:Lepticed7|talk]]) 10:07, 15 November 2020 (UTC)
| |
− | :I have opened a [[phab:T267876|Phabricator ticket]]. It may be fixed in the coming months but not sure. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:05, 15 November 2020 (UTC)
| |
− | | |
− | | |
− | == A Wikimedian in Residence for Lingua Libre ==
| |
− | Dear all,
| |
− | | |
− | After an epic (but domestic) brainstorming, [[User:Sebleouf|Sebleouf]], [[User:Noé|Noé]] and [[User:WikiLucas00|WikiLucas00]] proudly come to announce the first [https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedian_in_residence residence] dedicated to Lingua Libre!
| |
− | | |
− | To fund this residence, which will take the form of an internship, we take advantage of the on-going project ''Dictionnaire des francophones'' which plans to reuse Lingua Libre recordings.
| |
− | | |
− | [[User:WikiLucas00|WikiLucas00]] will be hosted as an intern at the Lyon 3 University, at the ''Institut international pour la Francophonie'' under the supervision of [[User:Sebleouf|Sebleouf]].
| |
− | | |
− | From May to August 2021, WikiLucas plans to implement suggested tasks, in order to facilitate individual collection and recording-workshops, and will explore solutions for patrolling new recordings.
| |
− | | |
− | The full presentation is [https://fr.wiktionary.org/wiki/Projet:Lingualibriste_en_résidence/en here]
| |
− | | |
− | Best Regards,
| |
− | | |
− | [[User:Sebleouf|Sebleouf]] ([[User talk:Sebleouf|talk]]), [[User:Noé|Noé]] ([[User talk:Noé|talk]]) and [[User:WikiLucas00|WikiLucas00]] ([[User talk:WikiLucas00|talk]]) 14:28, 20 November 2020 (UTC)
| |
− | | |
− | :That's really AWESOME news !!! Congrats. All the best. [[User:DSwissK|DSwissK]] ([[User talk:DSwissK|talk]]) 16:15, 22 November 2020 (UTC)
| |
− | | |
− | == Stats page November 2020 ==
| |
− | {{done}} — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 13:09, 11 February 2021 (UTC)<br/>
| |
− | The stats page has most probably not been working for several days and is showing week-old statistics. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 23:40, 26 November 2020 (UTC)
| |
− | :There is possibly a kind of cache. How do you know that there is a week lag? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:00, 27 November 2020 (UTC)
| |
− | ::... because the stats are not getting updated for close to 10 days now and showing old stat disregarding recent uploads, regardless of device, browser etc. Let's see sometime tomorrow. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 06:07, 30 November 2020 (UTC)
| |
− | :::Same for me, counter blocked at 333606 items for a few days. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 10:23, 1 December 2020 (UTC)
| |
− | :::Can this be fixed [[User:Pamputt]]? --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 23:42, 4 December 2020 (UTC)
| |
− | ::::We had an online meeting today with the WMFrance team and reported the issue, things should be fixed in the next days/weeks. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 02:31, 5 December 2020 (UTC)
| |
− | ::::{{ping|Titodutta}} While the stats are still broken, you can have a rough idea of the total number of recordings with [https://petscan.wmflabs.org/?psid=17970278 this petscan query]. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 17:59, 7 December 2020 (UTC)
| |
− | * [[User:Pamputt]], [[User:WikiLucas00]], is there any update on fixing this bug? [[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 05:42, 31 December 2020 (UTC)
| |
− | ::As far as I know, some people at WMFr started to look at this issue with no conclusion for now. That's said, it is Christmas holiday in France and that work is probably frozen until next Monday. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 11:55, 2 January 2021 (UTC)
| |
− | {{ping|Titodutta|Pamputt|Yug}} Update: the new team of developers hired by WMFr just fixed the BlazeGraph updater, which means that stats and other SPARQL queries should now be up to date! Please report any problem you might encounter with it so that we can tell them if needed. All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 13:09, 11 February 2021 (UTC)
| |
− | | |
− | == Meta Community Wishlist for 2021 ==
| |
− | Hello {{ping|Pamputt|Yug|Titodutta}} and everyone else!<br/>
| |
− | For the [[m:Community_Wishlist_Survey_2021|Community Wishlist Survey 2021 on Meta]], I posted [[m:Community_Wishlist_Survey_2021/Wiktionary/Adopt_Lingua_Libre_Bot_service_as_a_WMF_tool|a request]] that had already been posted last year, and which got 40 support votes but didn't make it to the top 5. It is about adopting Lingua Libre Bot as a WMF Tool. Please tell me your thoughts about it, we still have a short amount of time to modify it before the opening of the votes. Regards — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 17:33, 29 November 2020 (UTC)
| |
− | :Sounds good for me. Thanks for the proposal. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:06, 30 November 2020 (UTC)
| |
− | :* That's pretty timely. It would be good to see this working in Indian language Wiktionary projects as well. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 10:10, 30 November 2020 (UTC)
| |
− | | |
− | == New interwiki link for Lingua Libre ==
| |
− | Good news!<br/>
| |
− | [[m:Talk:Interwiki_map#Lingua_Libre|Pamputt's request on Meta]] in May 2020, related to the creation of an interwiki link for LiLi, has been accepted! We now have to wait (a few days or weeks) for the next cache update in order to be able to use links with the <code><nowiki>[[LinguaLibre:...]]</nowiki></code> prefix on any wiki! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 11:13, 1 December 2020 (UTC)
| |
− | * This is excellent news. Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 06:02, 2 December 2020 (UTC)
| |
− | * Yes indeed, very good news. I hope we will not need to wait too long. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:52, 2 December 2020 (UTC)
| |
− | ::{{ping|Pamputt|Titodutta}} it is finally working! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 18:36, 16 December 2020 (UTC)
| |
− | :::Excellent. Thanks for your work. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 18:47, 16 December 2020 (UTC)
| |
− | | |
− | == Translation update ==
| |
− | {{done}}
| |
− | | |
− | Can someone update the Japanese translation on the pages below, please?
| |
− | *https://lingualibre.org/wiki/LinguaLibre:Main_Page
| |
− | *https://lingualibre.org/wiki/Special:RecordWizard (translation updated almost a year ago on the Translatewiki.net, but not updated)
| |
− | | |
− | Also, is it possible to make the help pages below translatable? I can see the edit button, but can't find the translation button. Thanks in advance.
| |
− | * https://lingualibre.org/wiki/Help:Your_first_record
| |
− | * https://lingualibre.org/wiki/Help:Choosing_a_microphone
| |
− | * https://lingualibre.org/wiki/Help:Configure_your_microphone
| |
− | * https://lingualibre.org/wiki/Help:RecordWizard_manual
| |
− | * https://lingualibre.org/wiki/Help:Sound_library
| |
− | * https://lingualibre.org/wiki/Help:Querying_Lingua_Libre
| |
− | * https://lingualibre.org/wiki/Help:SPARQL
| |
− | -- [[User:Higa4|Higa4]] ([[User talk:Higa4|talk]]) 01:16, 10 December 2020 (UTC)
| |
− | :Hi [[User:Higa4|Higa4]], about [[LinguaLibre:Main Page]] and [[Special:RecordWizard]], could you tell precisely what has to be translated? When I switch to Japanese language, I see that this is translated, so I need to better understand what is wrong to be able to help.
| |
− | :Concerning the help pages, you can translate them from [https://lingualibre.org/index.php?title=Special%3ALanguageStats&x=D&language=ja&suppresscomplete=1 this page] (click on the page title you are interested in). [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:20, 10 December 2020 (UTC)
| |
− | :: Thanks for your comment. The main page looks like this to me. The selected language is Japanese (日本語), but it shows English.
| |
− | * https://i.imgur.com/m3eCzGJ.png
| |
− | Some examples where translated strings are not updated on record wizard (red square)
| |
− | * https://i.imgur.com/EIVcQhy.png
| |
− | * https://i.imgur.com/Dr9YvHr.png
| |
− | * https://i.imgur.com/GGag0y1.png
| |
− | :: About help pages translation, thanks, I see. --[[User:Higa4|Higa4]] ([[User talk:Higa4|talk]]) 13:28, 10 December 2020 (UTC)
| |
− | * Yes, some strings are not translated, such as "Welcome to Lingua Libre, the participative linguistic media library....". Any language code can be tried (this does not change your settings) by adding "?uselang=ja" at the end of the URL (without quotes; here ja stands for Japanese, bn for Bengali etc). --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 15:59, 10 December 2020 (UTC)
| |
− | :: I will open a bug report on Phabricator a bit later to ask updating translations from TranslateWiki.
| |
− | :: Concerning the main page, it is weird. The problem comes from <nowiki>"{{int:lang}}"</nowiki> that returns "{{int:lang}}" (should be "ja" instead of "en") when Japanese is set and then the interface is displayed in English. I have no idea why there is this behaviour. Maybe also create a Phabricator to keep track of that issue.
| |
− | :: PS: [[User:Higa4|Higa4]], next time you need to share pictures, you can directly import the file on Lingua Libre using [[Special:Upload]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:06, 10 December 2020 (UTC)
| |
− | ::: I have tried several language to test the value of <nowiki>"{{int:lang}}"</nowiki> and it seems that only "fr" and "es" are managed. I do not know why it does not work with other languages. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:51, 10 December 2020 (UTC)
| |
− | ::: I have created [[phab:T269885|T269885]] and [[phab:T269887|T269887]]. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:56, 10 December 2020 (UTC)
| |
− | :::: Thanks a lot. I understand current situation and I will use upload form next time. --[[User:Higa4|Higa4]] ([[User talk:Higa4|talk]]) 23:37, 10 December 2020 (UTC)
| |
− | | |
− | == In-file metadata ? ==
| |
− | :{{done}} (see [[phab:T269969]]): the present section highlighted that demographic, license and linguistic metadata are currently not embedded within the individual files' metadata. A Phabricator feature request has been opened. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:13, 14 December 2020 (UTC)
| |
− | Hello folks, has anyone else played with in-file metadata or does anyone know about them? With LinguaLibre's ancestor (SWAC RECORDER), all the metadata about the word, language and speaker were hard-coded within the files, so in 2016 a terminal command and its output looked something like this:
| |
− | {{Colapse|1=Unroll|2=
| |
− | <source lang="bash">
| |
− | $ avconv -i ./cmn-jiāoliú.flac 2>&1 # print out metadata of $file, for some formats only
| |
− | | |
− | ffmpeg version 2.8.14-0ubuntu0.16.04.1 Copyright (c) 2000-2018 the FFmpeg developers
| |
− | built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
| |
− | [...]
| |
− | Input #0, flac, from './cmn-jiāoliú.flac':
| |
− | Metadata:
| |
− | TITLE : 交流
| |
− | LICENSE : Creative Commons BY-SA 3.0 U.S
| |
− | COPYRIGHT : (c) 2009 Yue Tan
| |
− | ARTIST : Tan
| |
− | DATE : 2009-07-08
| |
− | GENRE : Speech
| |
− | SWAC_LANG : cmn
| |
− | SWAC_TEXT : 交流
| |
− | SWAC_ALPHAIDX : jiāoliú
| |
− | SWAC_SPEAK_NAME : Tan
| |
− | SWAC_SPEAK_GENDER: F
| |
− | SWAC_SPEAK_BIRTH_YEAR: 1978
| |
− | SWAC_SPEAK_LANG : zho
| |
− | SWAC_SPEAK_LANG_REGION: Liaoning
| |
− | SWAC_SPEAK_LIV_COUNTRY: FR
| |
− | SWAC_SPEAK_LIV_TOWN: Caen
| |
− | SWAC_PRON_PHON : jiāoliú
| |
− | SWAC_COLL_SECTION: HSK niveau II
| |
− | SWAC_COLL_LICENSE: Creative Commons BY-SA 3.0 U.S
| |
− | SWAC_COLL_COPYRIGHT: (c) 2009 Yue Tan
| |
− | SWAC_TECH_DATE : 2009-07-08
| |
− | SWAC_TECH_SOFT : Shtooka Recorder/1.3
| |
− | Duration: 00:00:01.40, start: 0.000000, bitrate: 447 kb/s
| |
− | Stream #0:0: Audio: flac, 44100 Hz, mono, s16
| |
− | </source>
| |
− | }}
| |
− | NOTE: 2009's Swac Recorder asked a few different questions compared with 2020's Lingua Libre.
| |
− | | |
− | Today I downloaded [[:commons:File:LL-Q150 (fra)-Tsaag Valren-zèbre.wav]] and refreshed my shell commands. I got far less metadata :
| |
− | {{Colapse|1=Unroll|2=
| |
− | <source lang="bash">
| |
− | $ exiftool ./LL-Q150_\(fra\)-Tsaag_Valren-zèbre.wav.mp3
| |
− | | |
− | ExifTool Version Number : 11.88
| |
− | File Name : LL-Q150_(fra)-Tsaag_Valren-zèbre.wav.mp3
| |
− | Directory : .
| |
− | File Size : 22 kB
| |
− | File Modification Date/Time : 2020:12:11 19:29:44+01:00
| |
− | File Access Date/Time : 2020:12:11 19:29:44+01:00
| |
− | File Inode Change Date/Time : 2020:12:11 19:29:44+01:00
| |
− | File Permissions : rw-rw-r--
| |
− | File Type : MP3
| |
− | File Type Extension : mp3
| |
− | MIME Type : audio/mpeg
| |
− | MPEG Audio Version : 1
| |
− | Audio Layer : 3
| |
− | Sample Rate : 44100
| |
− | Channel Mode : Stereo
| |
− | MS Stereo : Off
| |
− | Intensity Stereo : Off
| |
− | Copyright Flag : False
| |
− | Original Media : False
| |
− | Emphasis : None
| |
− | VBR Frames : 48
| |
− | VBR Bytes : 22892
| |
− | VBR Scale : 0
| |
− | ID3 Size : 45
| |
− | Encoder Settings : Lavf57.56.101
| |
− | Audio Bitrate : 146 kbps
| |
− | Duration : 1.25 s (approx)
| |
− | </source>
| |
− | }}
| |
− | | |
− | I'm quite surprised. No open license, no data on the speaker (name, place, gender, accent), language, recording date, ... As far as I can see, there is simply ''ZERO'' linguistic metadata embedded within the audio files themselves. All we have is the filename in the format <code>LL-{Qid_lang}_({iso_lang})-{username}-{word}.ext</code>.
| |
− | | |
− | Am I doing something wrong? Using the wrong shell tool to read the metadata? (avconv has been deprecated.) Did I miss a policy to externalize metadata to the Qid items' pages? If so, it's unfortunate... When the files land on a PC, there is no pointer toward Lingua Libre, nor to the tutorials on how to pull the complementary data.
| |
− | | |
− | EDIT: There is [https://askubuntu.com/questions/226773/how-to-read-mp3-tags-in-shell/657492#comment281174_226778 a discussion] about possible shell tools to get metadata from audio files, via the following tools :
| |
− | <code>sudo apt-get install ffmpeg lltag eyed3 mp3info id3v2 libimage-exiftool-perl libid3-tools id3tool</code>
| |
− | [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:40, 11 December 2020 (UTC)
| |
− | :{{ping|WikiLucas00}} any idea? This section is quite technical, better to archive it quickly. Maybe only 0x010C knows. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:46, 11 December 2020 (UTC)
| |
− | ::As far as I know, it was not planned to include the metadata in the file itself because it was decided to manage metadata on Commons and Lingua Libre wikibase. That said, the one does not prevent the other. So I opened a [[phab:T269969|Phabricator task]] to keep in mind this request. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:56, 11 December 2020 (UTC)
| |
− | :::{{Support}} I agree metadata in the files would be valuable even if we already have them elsewhere (it's always good to have the info directly within the file, for when it is used offline, downloaded, sent etc). You can mark the topic as "done" if you want, so that we archive it in the next batch. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 01:12, 12 December 2020 (UTC)
| |
− | ::::Hi, as a best practice, leave the section up for at least 2 to 4 weeks, so its conclusion can spread a bit among LL's contributors. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 10:13, 14 December 2020 (UTC)
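Until metadata is embedded in the files (see [[phab:T269969]]), the filename pattern quoted above is the only information that travels with a downloaded file. Below is a minimal Python sketch of recovering the fields from it. The function name and the regular expression are illustrative assumptions, not part of any Lingua Libre tool, and a username containing a hyphen would be split incorrectly.

```python
import re
from typing import Optional

# Pattern derived from LL-{Qid_lang}_({iso_lang})-{username}-{word}.ext.
# Commons titles show a space after the Q-id; downloaded files use an underscore.
_LL_NAME = re.compile(
    r"^LL-(?P<qid>Q\d+)[ _]\((?P<iso>[^)]+)\)-"
    r"(?P<username>.+?)-(?P<word>.+)\.(?P<ext>[^.]+)$"
)


def parse_ll_filename(name: str) -> Optional[dict]:
    """Split a Lingua Libre filename into its fields (hypothetical helper).

    Returns None when the name does not follow the pattern. The username is
    taken as the shortest text before a hyphen, so hyphens are assumed to
    occur only in the word, never in the username.
    """
    match = _LL_NAME.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    # Underscores in downloaded filenames stand for spaces in the username.
    fields["username"] = fields["username"].replace("_", " ")
    return fields
```

For a transcoded download such as <code>...-zèbre.wav.mp3</code>, only the last extension is treated as <code>ext</code>, so the word field keeps the inner <code>.wav</code>.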
| |
− | | |
− | == Coolest Tool Award ==
| |
− | Lingua Libre just won the [[m:Coolest Tool Award|Coolest Tool Award]] in the category "Diversity", for tools helping to include a variety of languages, people and cultures. More great news for the project! 🎉
| |
− | | |
− | You can watch the event on [https://www.youtube.com/watch?v=zYM4k_LD_9w&t=2041 Mediawiki's Youtube channel]. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 18:52, 11 December 2020 (UTC)
| |
− | :Muahahahah !! Awesome ! <3 Maybe Contacting some news outlet with a brief would be nice ? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 19:08, 11 December 2020 (UTC)
| |
− | ::Hehe, it's the beginning of glory 🎉🎉🎉 [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 22:01, 11 December 2020 (UTC)
| |
− | ::* Congratulations. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 22:26, 11 December 2020 (UTC)
| |
− | | |
− | == How to find authors in specific language ==
| |
− | Hi! I have a set of words in some language and I would like to find persons who recorded in this language to ask if they could do recordings of my set. How can I find all the persons who in the past did recordings in given language with LL? [[User:KaMan|KaMan]] ([[User talk:KaMan|talk]]) 12:54, 18 December 2020 (UTC)
| |
− | :In principle, it is possible to write a query to get this information. That said, I am not skilled enough with SPARQL to help you further, so you can have a look at [[LinguaLibre:Stats]] for examples. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:42, 18 December 2020 (UTC)
| |
− | :: Please copy the query below, go to the [https://lingualibre.org/bigdata/#query SPARQL endpoint], paste it, and click the "execute" button. You will then get the list of English (Q22) speakers. You can change Q22 to Q21 (French) or Q24 (German).
| |
− | <pre>
| |
− | select ?speaker ?speakerLabel
| |
− | where {
| |
− | ?speaker prop:P2 entity:Q3 .
| |
− | ?speaker prop:P4 entity:Q22 .
| |
− | SERVICE wikibase:label {
| |
− | bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
| |
− | }
| |
− | }
| |
− | </pre>
| |
− | Also, you can get all the language-Qnumber list by executing the query below.
| |
− | <pre>
| |
− | select ?lang ?langLabel
| |
− | where {
| |
− | ?lang prop:P2 entity:Q4 .
| |
− | SERVICE wikibase:label {
| |
− | bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
| |
− | }
| |
− | }
| |
− | </pre>
| |
− | --[[User:Higa4|Higa4]] ([[User talk:Higa4|talk]]) 11:40, 19 December 2020 (UTC)
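For anyone switching languages often, the first query can be generated rather than edited by hand. The Python sketch below only builds the query text to paste into the SPARQL endpoint; it does not contact the server. <code>speakers_query</code> is a hypothetical helper name, and the property and item ids (P2, P4, Q3) are copied from the query above.

```python
import re

# Same query as above, with the language item left as a placeholder.
# Literal braces are doubled because str.format is used to fill in {qid}.
QUERY_TEMPLATE = """select ?speaker ?speakerLabel
where {{
  ?speaker prop:P2 entity:Q3 .
  ?speaker prop:P4 entity:{qid} .
  SERVICE wikibase:label {{
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }}
}}"""


def speakers_query(lang_qid: str) -> str:
    """Return the speakers query for one language item, e.g. "Q21"."""
    # Reject anything that is not a plain Q-id before inserting it.
    if not re.fullmatch(r"Q[1-9]\d*", lang_qid):
        raise ValueError(f"not a Q-id: {lang_qid!r}")
    return QUERY_TEMPLATE.format(qid=lang_qid)
```

Calling <code>print(speakers_query("Q21"))</code> gives the text to paste for French.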
| |
− | | |
− | == Issue with the Main page ==
| |
− | {{Move|LinguaLibre:Technical board|type=section}}
| |
− | | |
− | {{done}}
| |
− | | |
− | Hi, the main page here uses a solution that requires [[MediaWiki:Lang]] and its subpages to be populated. They aren't, so the main page does not switch languages even when a translation is available in that language and the language has been set. Could someone look into whether it's possible to rework the structure or maybe somehow import the [[MediaWiki:Lang]] subpages? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 17:40, 16 January 2021 (UTC)
| |
− | :Hello {{ping|Sabelöga}} thank you very much for this remark! I just imported the MediaWiki:Lang subpages from Meta, and it seems to be working as of now 🙂 All the best. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 00:09, 17 January 2021 (UTC)
| |
− | ::That's excellent, thank you so much for amending this, regards --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 21:59, 18 January 2021 (UTC)
| |
− | :::Thanks a lot! The main page looks fine --[[User:Higa4|Higa4]] ([[User talk:Higa4|talk]]) 04:34, 19 January 2021 (UTC)
| |
− | | |
− | == Images missing ==
| |
− | {{Move|LinguaLibre:Technical board|type=section}}
| |
− | Images on [[Help:Add_a_new_language]] show up as missing on my end. Have they moved or is this some error? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 21:08, 19 January 2021 (UTC)
| |
− | :Hello {{ping|Sabelöga}} they are not actually missing (for example, I can see them on the page you are talking about), but I also have experienced similar issues on the website. Some images seem to randomly disappear for no reason, and to come back after a while, without any modification on the page. We will talk about it during the next team meeting. — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 21:19, 19 January 2021 (UTC)
| |
− | :::Actually, some images are really missing in the section "I know what I'm doing". This comes from the fact that all images were lost when the website migrated to the new design. I had opened a [[phab:T264332|ticket]] about that but I think we will never get them back. So we should create new screenshots whenever we discover such a missing image. I will try to do so for the page you've mentioned. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 23:06, 19 January 2021 (UTC)
| |
− | ::::Yes, those were the ones I was talking about. --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 16:56, 23 January 2021 (UTC)
| |
− | | |
− | == Translation error? ==
| |
− | The translation units on [[Help:Configure_your_microphone]] do not align properly with each other. What I mean is that the translation units include several sections when the software should just pick one for each unit. I removed the <code><nowiki>__TOC__</nowiki></code> from the page since the TOC will appear anyway. Could a translation administrator mark the page for translation again so we can see if that solves the issue? --[[User:Sabelöga|Sabelöga]] ([[User talk:Sabelöga|talk]]) 16:56, 23 January 2021 (UTC)
| |
− | :{{done}} — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 22:21, 23 January 2021 (UTC)
| |
− | | |
− | == RecordWizard drops syllable ==
| |
− | :{{done}} -- can be archived. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:03, 27 January 2021 (UTC)
| |
− | Until several days ago I was recording word pronunciations without problems, but now I can't. Several days ago RecordWizard started removing certain sounds from my voice, most frequently "s" and "f". If I say a word, for example "syllable", it records it as "yllable", as if I had never pronounced the "s". If I say "sophicated", it records "soicated".
| |
− | | |
− | This behaviour is not present at the test sound stage (the first thing RecordWizard asks the user to do), which captures my speech perfectly. However, the very same issue is present on different devices with different operating systems, different browsers and different microphones.
| |
− | | |
− | My guess is that maybe some changes to noise recognition were deployed several days ago, and it now misinterprets those sounds as background noise. Anyway, I will be grateful for suggestions on how to fix this issue. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 07:43, 26 January 2021 (UTC)
| |
− | | |
− | :Hi {{u|Tohaomg}}, it is not easy to say what is happening here. I am pretty sure nothing has changed on the backend for several months. With your examples "syllable" and "sophicated", does it happen every time you try to pronounce these words or does it happen randomly? In the first case, can other contributors try to record these words and see if the problem occurs for them as well? I just tried myself and did not see this problem. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 21:39, 26 January 2021 (UTC)
| |
− | | |
− | ::I am not trying to record '''exactly''' those words, they are just examples to show you what I mean. I am actually trying to record words in Ukrainian. When I try to record words, in some 3 cases out of 4 syllables are dropped, and in 1 out of 4 they are not, so I need on average 4 attempts to record a word. And this problem appeared abruptly several days ago; everything worked fine before. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 08:09, 27 January 2021 (UTC)
| |
− | :::Could it come from your microphone? Did you try to record with other hardware for a test? [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 10:36, 27 January 2021 (UTC)
| |
− | ::::Yes, it happens on different devices. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 11:59, 27 January 2021 (UTC)
| |
− | | |
− | Solved it. Turns out, this effect is present only when loading lists longer than several hundred words. My next theory is that it was due to some sort of RAM shortage. Thank you for your time. --[[User:Tohaomg|Tohaomg]] ([[User talk:Tohaomg|talk]]) 13:06, 27 January 2021 (UTC)
| |
− | :This bug stays weird...
| |
− | :Anyway, thanks Tohaomg for your audios <3 [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:03, 27 January 2021 (UTC)
| |
− | ::Happy to see that you've found a workaround. Indeed, what you guess could be the reason of the problem because the server is currently not very robust. It should evolve in the coming weeks/months. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:04, 27 January 2021 (UTC)
| |
− | | |
− | == Technical > Github Winter 20-21 review ==
| |
− | Following 0x010C's departure in October 2020, we reviewed the human resources needed to maintain the various technical subprojects. Thanks to 4 months of community effort, things are in a better position now:
| |
− | * '''Definitions:''' All repositories are now well defined via a clean, one sentence descriptor. It maps sub-projects, so new volunteers know quickly what repository does what. See [https://github.com/lingua-libre/ github.com/lingua-libre].
| |
− | * '''Mentors:''' 2/3 of repositories now have a volunteer referee-mentor with a "correct" understanding, able to discuss the repository and guide newcomers.
| |
− | * '''Documentations:''' Most repositories are "correctly" documented via an existing readme.md. Improvement always welcome.
| |
− | * '''Web servers:''' Wikimedia France hired a new sysop, who guides and teams up with volunteers on server issues. Welcome to WMFR's MickeyBarber/Michael.
| |
− | * '''Maintenance:''' Wikimedia France is reviewing freelance candidates for deeper MediaWiki and RecordWizard coding support. Thanks to Adelaide & WMFR's team.
| |
− | * '''Globalization:''' Wikimedia France plans to expand volunteering toward India. Thanks to Adelaide & WMFR's team.
| |
− | All pretty positive. Pamputt, Jitrixis, Poslovitch, Adelaide, Mikey and myself pushed forward on these fronts.
| |
− | | |
− | '''Still ! The following repositories are currently leaderless and contributorless:'''
| |
− | * [https://github.com/lingua-libre/LinguaRecorder LinguaRecorder]: Powerful JS library to manage audio recording: intelligent cutting with regular padding, saturation control, various export options,...
| |
− | * [https://github.com/lingua-libre/RecordWizard RecordWizard]: MediaWiki extension allowing mass recording of clean, well cut, well named pronunciation files.
| |
− | * [https://github.com/lingua-libre/QueryViz QueryViz]: MediaWiki extension adding a <query> tag to display sparql queries results inside wiki pages
| |
− | | |
− | '''LinguaLibre Bot''' is under review by [[user:Poslovitch|Poslovitch]] but could use some more love. LinguaLibre Bot is '''the most impactful yet underused piece''' of our sub-projects, since it needs to be ''authorized'' per target language (e.g. to add audios to Tamil Wikipedia articles) and is currently authorized for only a few languages and wikis:
| |
− | * [https://github.com/lingua-libre/Lingua-Libre-Bot Lingua-Libre-Bot]: MediaWiki bot facilitating the reuse of Lingua Libre's audio records on many wikis, including Wikipedias and Wiktionaries.
| |
− | | |
− | Satellite linguistic project:
| |
− | * [https://github.com/lingua-libre/SignIt SignIt]: ''LinguaLibre SignIt'' is a web-browser extension which translates a word in Sign Language, in order to learn sign language while reading online.
| |
− | | |
− | The end ! Thanks to all those who helped and are joining in :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 13:11, 1 February 2021 (UTC)
| |
− | | |
− | == LinguaLibre International call (France-India-others) ==
| |
− | :{{Done}} -- Please refer to [[LinguaLibre:Events#2021_International_call]]. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 11:46, 12 February 2021 (UTC)
| |
− | Namaskara/Hello,<br>Earlier we noted that we started getting more participation from India (I am from India as well). In October last year when I had around 15,000 uploads, I then contacted Wikimedia France with the following idea, what I wrote in the email then:
| |
− | <blockquote>I believe in India, as we have many languages and dialects, the tool is specially relevant. I wanted to have a discussion with the people working on the project, or can help with this idea.</blockquote>
| |
− | This email was followed by a call with Adelaide and most possibly Lyokoi joined. Adelaide kindly invited me to attend another call, where I could briefly meet many of you.<br>
| |
− | Now, I know, there is more interest from India (different languages). You might have seen some work in Marathi very recently. Just two days ago I attended a brief India (Maharashtra) LinguaLibre meeting. I did not expect to see so many participants, but it looks like around 10 or so people are interested to record Marathi pronunciation.
| |
− | | |
− | So, is it possible to have a France-India call? It is absolutely OK for me to make it an "international call", so that everyone can join. Here we can have some of the people from India (any country) who can tell their plans, ask questions, get to know from you, or share experience. There might be ideas and questions related to setting up the project page etc. Around 5 or more people from India will be interested to join, I think.
| |
− | | |
− | PS: Most possibly there are LinguaLibre calls arranged. Adelaide kindly invited me to two such calls. Otherwise, I do not get to know about these calls. If these calls are open and anyone can join, possibly can we announce the call dates and time on this Project Chat, so that anyone interested/eligible can join?
| |
− | | |
− | Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 20:03, 1 February 2021 (UTC)
| |
− | :Hello Titodutta,
| |
− | :Nice to hear the news of a 10-person Lingua Libre workshop in India's Marathi community. This is wonderful.
| |
− | :Adelaide is definitively the person to contact for institutional relationship and workshops. She is coordinating-piloting-animating this project for Wikimedia France and knows who is who, where the human resources are, what is our wish-list and next moves.
| |
− | :I'm interested in this call as well. Santosh, an Indian Wikimedian, also contacted us (via email) about a similar need.
| |
− | :I will send you an email to group us all. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 09:41, 2 February 2021 (UTC)
| |
− | ::Yes, it would be good to have documentation or project page creation process on this site. Other than Marathi, briefly, a few Kannada students from a south Indian college started working on Lingua Libre (you may see an event page [[:m:Alva's Wikipedia Student's Association/Events/Lingua Libre training session]]). Similarly, you might have seen some involvement from the Punjabi community where [[User:Nitesh Gill]] and a couple of other Punjabi community members are working. <br>Other than Indian languages, there is a good response from Japanese, Ukrainian, and a few other languages. From all these the thought of the "International call" came to my mind. <br>Other than small projects we can also think of "small events" in the future, such as a LinguaLibre-a-thon or Libre-a-thon (similar to edit-a-thon, for example on World Environment Day we can get together and records pronunciation which is related to the environment, and not on Commons. This can be a small event where we record our own respective languages). Regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 23:51, 2 February 2021 (UTC)
| |
− | :::{{ping|titodutta|सुबोध कुलकर्णी}} From what I see now with France and India, It seems the best seeds are with already very active wikipedians with interest in languages.
| |
− | :::We also have a group of successful seedings thanks to already active Wikimedians who have some institutional role (Lyokoi, WikiLucas, Titodutta, सुबोध कुलकर्णी_Subodh). Basically, according to the data, one out of 10 speakers who try Lingua Libre really sticks with it. So you need someone really active and outreaching, training 20 to 30 people, to initiate a local community.
| |
− | :::Note: I created [[LinguaLibre:Events#2021_International_call]], please fill in information as needed. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:07, 8 February 2021 (UTC)
| |
− | | |
− | ==Marathi language stats==
| |
− | {{Move|LinguaLibre:Technical board|type=section}}
| |
− | :{{done}} [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:09, 8 February 2021 (UTC)
| |
− | Mar records @2600 on [[:Commons:Category:Lingua_Libre_pronunciation-mar|Wikimedia Commons]], but this is not reflected in the LL stats (records per language), which show just 163. Could anyone please look into this and resolve it? [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:08, 3 February 2021 (UTC)
| |
− | :Hi {{u|सुबोध कुलकर्णी}}. Lingua Libre has been suffering from a bug since the end of 2020. New developers are looking into this issue. Let us hope it will be fixed in the coming weeks. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 07:05, 3 February 2021 (UTC)
| |
− | ::{{Ping|सुबोध कुलकर्णी|सुबोध कुलकर्णी}} it's fixed ! Data are back online thanks to the devs hired by Wikimedia France and Adelaide. You can also use the {{tl|User records-mar}} template on userpages to tag speakers/uploaders. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:17, 11 February 2021 (UTC)
| |
− | | |
− | == Stats : toward records and beyond... ==
| |
− | Folks, given the stats page is broken [paid devs will fix it in coming weeks thanks to Wikimedia France !], I jumped with some regex to do the maths:
| |
− | * [[:Commons:Category:Lingua_Libre_pronunciation]] > 101 languages, 385,929 audios as of RIGHT-NOW-NOW.
| |
− | We will likely reach <big>400,000 this very month</big>. This feat is largely due to the recent rise of Indic languages. We must also note that most languages only have 3 to 50 words, from people trying the tool out. The best results are achieved when users commit a bit; then things truly take off. Another thing: our 7 most active users provided 200,000 of our audios. 20 users contributed more than 3000 audios, and 20 others between 1000 and 3000 audios, so about 10% of speakers really hit it off. Quite interesting! In my opinion we still have bottlenecks on:
| |
− | * reaching out to diverse & minority languages ;
| |
− | * getting contributors to contribute consistently ;
| |
− | * and creating words lists for our users.
| |
− | Inventing and exploring new methods for each of these bottlenecks is always welcome. Recent success with Marathi ([[:Commons:Category:Lingua Libre pronunciation-mar]]: 15 C, 3,011 F) is a great example of reaching outside our usual pool, we surely may learn from this initiative.<br/>
| |
− | I will hide the per-language stats in TSV format in the code below, in case you want to check those.
| |
− | <!--
| |
− | ISO3 QUANTITY
| |
− | others 404
| |
− | afr 2,541
| |
− | amh 209
| |
− | ara 4,683
| |
− | arq 241
| |
− | ary 1,286
| |
− | atj 492
| |
− | aze 325
| |
− | bam 44
| |
− | bas 105
| |
− | bbj 186
| |
− | bci 85
| |
− | bcl 244
| |
− | bdu 108
| |
− | ben 49,734
| |
− | bik 112
| |
− | bre 1
| |
− | bse 1
| |
− | bum 113
| |
− | bzm 109
| |
− | cat 954
| |
− | ces 10
| |
− | cmn 23
| |
− | cym 705
| |
− | deu 12,952
| |
− | dua 94
| |
− | duf 3
| |
− | dyu 47
| |
− | ell 18
| |
− | eng 12,511
| |
− | epo 28,823
| |
− | eus 3,050
| |
− | fas 817
| |
− | fin 63
| |
− | fon 93
| |
− | fra 178,271
| |
− | fsl 2
| |
− | gaa 209
| |
− | gcf 456
| |
− | glg 32
| |
− | gsc 4,879
| |
− | hat 15
| |
− | hav 129
| |
− | heb 73
| |
− | hin 432
| |
− | hye 28
| |
− | ind 5
| |
− | ita 2,357
| |
− | jpn 13
| |
− | kor 56
| |
− | kab 18
| |
− | kan 487
| |
− | ken 35
| |
− | kik 54
| |
− | krc 1
| |
− | kur 546
| |
− | dag 14
| |
− | hau 8
| |
− | pan 2,178
| |
− | lnc 5,316
| |
− | ltz 86
| |
− | mal 9
| |
− | mar 3,011
| |
− | mcn 24
| |
− | mhk 77
| |
− | mkd 12
| |
− | mlg 419
| |
− | mos 175
| |
− | mua 34
| |
− | myv 15
| |
− | nld 1,226
| |
− | nor 29
| |
− | nso 45
| |
− | oci 13,955
| |
− | ory, ori 3,728
| |
− | pcd 4
| |
− | pol 9,621
| |
− | por 2,586
| |
− | que 25
| |
− | rcf 17
| |
− | ron 2,065
| |
− | rus 3,558
| |
− | spa 2,559
| |
− | sat 434
| |
− | shi 35
| |
− | shy 1,210
| |
− | srr 1
| |
− | swe 504
| |
− | tam 104
| |
− | tat 3
| |
− | tay 4
| |
− | tel 5
| |
− | tgl 7
| |
− | tha 208
| |
− | ukr 16,216
| |
− | vie 1,234
| |
− | wls 3
| |
− | ybb 21
| |
− | yue 5,670
| |
− | zho 190
| |
− | | |
− | -->
| |
− | [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:34, 6 February 2021 (UTC)
| |
− | :10% of speakers committing to 1000 recordings or more is very interesting. It suggests that if we give a workshop to 10 people, one committed speaker will emerge, thereby kick-starting this language. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 01:20, 8 February 2021 (UTC)
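The per-language figures hidden in the comment above can be checked with a few lines of code. Here is a small Python sketch, assuming rows of the form <code>ISO3 QUANTITY</code> separated by whitespace, with commas as thousands separators. Only three rows are reproduced, the total is the 385,929 quoted above, and the helper name is made up for illustration.

```python
def parse_counts(text: str) -> dict:
    """Map each language code to its number of audios (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        # Split on the last run of whitespace so codes such as "ory, ori"
        # stay in one piece.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        code, quantity = parts
        try:
            counts[code.strip()] = int(quantity.replace(",", ""))
        except ValueError:
            continue  # header row such as "ISO3 QUANTITY"
    return counts


sample = """ISO3 QUANTITY
fra 178,271
ben 49,734
ory, ori 3,728
"""
TOTAL = 385_929  # total audios quoted in the post above

counts = parse_counts(sample)
# Percentage of all audios contributed by each language, one decimal place.
shares = {code: round(100 * n / TOTAL, 1) for code, n in counts.items()}
```

With these numbers French alone accounts for about 46% of all audios and Bengali for about 13%.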
| |
− | | |
− | == Reminder : Grants ==
| |
− | Hello all, I'm monitoring grants these days and there is a summary table available at [[LinguaLibre:Grants]].
| |
− | | |
− | I think both rapid grant mechanisms could be of help to us now, to reach out to local communities via small-scale events, training, hardware, food, transportation costs, flyer designs, etc. For example, [[:meta:Wikimedia_France/Micro-financement/Demande/µFi-2020-10-421441|this WM-France micro-fi request]] organizes 4 evenings of contribution, getting 100€ for each evening. The same user has been welcome to make several grant requests.<br> On the heavier side, the R&D Grant could surely be used for something. I have an idea on this, but we can trust Indian contributors to come up with relevant technical ideas and teams as well. {{ping|Titodutta}} [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 01:20, 8 February 2021 (UTC)
| |
− | | |
− | == LinguaLibre Bot and Wikidata ==
| |
− | {{Move|LinguaLibre:Technical board|type=section}}
| |
− | I have not checked the bot's contributions on Wikidata for quite some time. Yesterday I uploaded ~100 Bangla film names from Bangla Wikipedia. It looks like [https://www.wikidata.org/w/index.php?target=Lingua+Libre+Bot&namespace=all&tagfilter=&start=&end=&limit=50&title=Special%3AContributions the bot is not] active, unless I am missing something. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 18:10, 13 February 2021 (UTC)
| |
− | | |
− | == Update and technical improvements ==
| |
− | | |
− | Hi all,
| |
− | | |
− | Full information and full disclosure, I'm working now with WikiValley and Wikimédia France in a paid capacity to help improve Lingua Libre technical structure (see [https://www.wikimedia.fr/emploi-wikimedia-france/nos-appels-doffres/appel-doffres-developpement-et-amelioration-de-loutil-web-lingua-libre/ this] - in French - for the scope of our intervention).
| |
− | | |
− | One of our first actions last Thursday was to restart the Blazegraph updater. A lot of tools depend on this "fundamental brick", including but not limited to the SPARQL endpoint (and pages using it) and the bots. Now, you can see that pages like [[Special:MyLanguage/LinguaLibre:Stats]] are up to date again and the bots should also restart soon (you can see more technical info on this on [[LinguaLibre:Technical board]]).
| |
− | | |
− | The next big step will be to update this MediaWiki from 1.31 to 1.35 and move it to a new server.
| |
− | | |
− | If you see something or anything wrong or strange, don't hesitate to let me know. I'm also available for any question.
| |
− | | |
− | Cheers, [[User:VIGNERON|VIGNERON]] ([[User talk:VIGNERON|talk]]) 08:56, 15 February 2021 (UTC)
| |
− | :Nice ! Happy to see you folks jumping in. Thank you for the Stats ! We can witness our passage over 400,000 audios shortly. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:27, 15 February 2021 (UTC)
| |
− | | |
− | == 400,000 ==
| |
− | | |
− | The total amount of recordings on Lingua Libre reached '''400,000''' a few hours ago. February is already the second most fruitful month since the beginning of the project, even though we are only halfway through. LiLi is growing faster and faster, and this is only the beginning!<br/>Congratulations and thanks to everyone who gives some time to record voices and to spread the project around the world.<br/>
| |
− | All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 18:10, 16 February 2021 (UTC)
| |
− | :And another milestone broken ! Big thanks to the [[user:Titodutta|Titodutta]] and Marathi effects, too ! [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 21:24, 16 February 2021 (UTC)
| |
− | ::[[User:Yug|Yug]], [[User:WikiLucas00|WikiLucas]] and [[user:Titodutta|Titodutta]]- thanks for the support! Marathi community had decided to gift minimum 5000 records on the occasion of [[:en:Marathi Language Day|Marathi Language Day]] to be celebrated on 27 February. We have crossed 6000 records as of now. All credit goes to community members. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:22, 26 February 2021 (UTC)
| |
− | ::See also [[:Commons:Category:Lingua_Libre_pronunciation-mar]]
| |
− | :::Congratulation to the Marathi community ! It's nice to see you contributes this way :) [[User:Yug|Yug]] ([[User talk:Yug|talk]])
| |
− | | |
− | == Chat room in your language ==
| |
− | | |
− | Hi all. I've created [[Template:Lang-CR]] in order to list all the chat rooms. I think it would be interesting for people to discuss in their native language. The main discussion should remain on this chat room in English in order to be understood by most of the contributors. So feel free to create a village pump/chat room in your mother tongue. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 20:21, 16 February 2021 (UTC)
| |
− | :It is a welcome move. We need to discuss many local issues, policies, approaches, ideas etc. in our own language. I have created the Mar page [[LinguaLibre:संवाद-चर्चा दालन|संवाद-चर्चा दालन]]. Let me know whether the process is right. I will start engaging speakers here. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:36, 26 February 2021 (UTC)
| |
− | ::{{ping|सुबोध कुलकर्णी}} that's perfect. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:40, 26 February 2021 (UTC)
| |
| | | |
− | == New batch of lists available ! (999 languages) == | + | == Kinyarwanda language representation == |
− | :''Please, remember to tag the list_talk's page with {{tl|UNILEX license}}.''
| |
− | Greetings!<br>Thanks to [[:commons:user:Tshrinivasan|Tshrinivasan]], with whom we discussed recent Indic (Marathi!) activity and the lack of lists, I bumped again into UNILEX (GNU-like license), a Google-led Unicode Consortium project listing vocabulary for 999 languages. The data seems clean as far as I can tell. The two main maintainers are Google folks, so I suspect UNILEX uses Google's best scrapers and NLP cleaners. Within this data are tab-separated frequency lists as <code>{item} {number_of_occurences}</code>. I forked their GitHub repository and made a script to convert their format into Lili's <code>List:*</code> format such as <code># {item}</code>. See:
| |
− | * [https://github.com/lingua-libre/unilex/ github.com/lingua-libre/unilex]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency data/frequency]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency/ig.txt ig.txt] – frequency
| |
− | * [https://github.com/lingua-libre/unilex/ github.com/lingua-libre/unilex]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash data/frequency-sorted-count]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-count/ig.txt ig.txt] – sorted
| |
− | * [https://github.com/lingua-libre/unilex/ github.com/lingua-libre/unilex]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash data/frequency-sorted-hash]/[https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash/ig.txt ig.txt] – Lili's List format
| |
− | You can check whether your own language is among the 999 available. For Marathi, replace <code>ig</code> with <code>mr</code>. I therefore created 2 local lists to test this approach:
| |
− | * [[List:Mar/words-by-frequency-00001-to-01000]] – starts soft
| |
− | * [[List:Mar/words-by-frequency-01001-to-05000]] – then it jumps to multiples of 5,000: 01001-05000, 05001-10000, 10001-15000, etc.
| |
− | <span style="color:green">'''Right now, 1000 lists are already formatted in Lili's syntax within the [https://github.com/lingua-libre/unilex/tree/master/data/frequency-sorted-hash /data/frequency-sorted-hash] directory.'''</span> If any community lacks wordlists on Lili, there you have them: copy, paste, done, situation unlocked! [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:40, 24 February 2021 (UTC)
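The conversion described above (tab-separated <code>{item} {number_of_occurences}</code> lines turned into <code># {item}</code> list pages, split as 00001-01000 and then on multiples of 5,000) can be sketched as follows. This is only an illustrative sketch, not the script from the fork: the function names, the rule of skipping lines that do not look like "item, tab, integer", and naming the last page after its actual final rank are all assumptions.

```python
def parse_frequency(text):
    """Parse 'item<TAB>count' lines; skip blanks, '#' lines and non-integer counts (assumption)."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        item, _, count = line.partition("\t")
        if count.strip().isdigit():
            rows.append((item, int(count)))
    return rows


def chunk_bounds(total, first=1000, step=5000):
    """Yield 1-based (start, end) ranks: 1-1000, 1001-5000, 5001-10000, ..."""
    start, end = 1, min(first, total)
    while start <= total:
        yield start, end
        start = end + 1
        end = min((end // step + 1) * step, total)


def to_list_pages(text, iso="Mar"):
    """Return {page title: page body}, most frequent items first (hypothetical helper)."""
    rows = sorted(parse_frequency(text), key=lambda row: -row[1])
    pages = {}
    for start, end in chunk_bounds(len(rows)):
        title = f"List:{iso}/words-by-frequency-{start:05d}-to-{end:05d}"
        pages[title] = "\n".join(f"# {item}" for item, _ in rows[start - 1:end])
    return pages


if __name__ == "__main__":
    sample = "a\t5\nb\t9\n\nc\t1\n"
    for title, body in to_list_pages(sample).items():
        print(title)
        print(body)
```

Ties in frequency keep their original file order because Python's sort is stable. A real import would also need to add the license tag to each list's talk page, which this sketch leaves out.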
| |
− | :{{ping|Titodutta}} hi! This may interest your community. There are dozen(s) of Indic languages :) It could also help you. You already recorded most of those words for your language (ben); together with the "ignore already recorded words" function, these lists can fill some gaps :) [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:48, 24 February 2021 (UTC)
| |
− | ::* I love this. I'll inform the Marathi folks. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 17:16, 24 February 2021 (UTC)
| |
− | ::* This is just amazing. You don't know how delighted I am feeling at this moment. I checked the Bengali list; a very few random words have typos, but that should not be more than 1%, I guess. Overall this will be an extremely helpful resource for the communities. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 17:24, 24 February 2021 (UTC)
| |
− | :::* I share your enthusiasm! It's bot-created, I'm pretty sure; the clean-up is likely just statistical. Now that those lists are technically available, the ideal next step would be human review by local communities. Maybe groups of 2~3 users for copyedit sprints? :D But this is optional IMHO. Also, since the corpora come from online documents, IRL objects like `chair`, `car`, `walk` may be further down these lists. But they must be there in the first 20,000 items. The best part is the linguistic diversity of this set. Amazing. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:10, 24 February 2021 (UTC)
| |
− | ::::*It's a good resource indeed. Thanks! The Marathi words in the list are also grammatically correct, with nearly no typos. We have started a discussion about this in our community. Currently, we have started working on Lexemes first; the recordings of the lists thus created will be done simultaneously. The community thinks this approach is more useful in the long run. A separate group of speakers may adopt these lists, but then we have to devise a way to avoid repetitions. We will definitely discuss how to use this resource further and let you know. [[User:सुबोध कुलकर्णी|सुबोध कुलकर्णी]] ([[User talk:सुबोध कुलकर्णी|talk]]) 05:14, 26 February 2021 (UTC)
| |
− | === Pause before running ===
| |
− | [[File:Long_tail.svg|thumb|A long-tail curve likely applies to languages ranked by number of speakers. Since macro-languages such as Mandarin, English, Spanish, Hindi, etc. are certain to be audio-documented soon by the sheer force of demography, our effort and strategy should progressively shift toward the right, to increasingly rare languages. The rarer the languages and speakers, the more attentive we should become and the more custom assistance we will have to provide.]]
| |
− | [[User:Dragons Bot|Dragons Bot]] has been created, coded, tested, and is ready to import UNILEX's lists into LinguaLibre's <code>List:{iso}/{title}</code> namespaces. Given that 1,000 pages and their associated talk pages will be created, I would like to pause for a few days to consider this large list import / creation and why we are doing it.
| |
− | * Lili > Languages > existing breadth: We have reached 110 languages on LinguaLibre so far.
| |
− | * Lili > Lists > non-sorted by usefulness: SPARQL queries provide lists for all languages, but without prioritization by the words' usefulness.
| |
− | * Lili > Lists > sorted by usefulness :
| |
− | ** Hand picked frequency lists are present for about 7 languages : eng, mar, por, pol, tam, ron, kur. With optimal relevance for teaching/learning.
| |
− | ** OlafBot's <code>List:*/Lemmas-without-audio-sorted-by-number-of-wiktionaries</code> counts, with optimal relevance for wiktionaries.
| |
− | ** UNILEX can provide frequency lists for 1,000 languages, about 10 times our current language coverage. UNILEX plugs into [https://github.com/Google/Corpuscrawler Github.com/Google/Corpuscrawler], an open source project which plans to support more languages. I dived into this chain and it's an 'easy' NLP pipeline to contribute to. The Wikimedia community can use it and expand it.
| |
− | '''Core issue:''' the core issue with users arriving online is to increase retention of minority and semi-rare languages by smoothing their speakers' work. For example, a user of [[:en:Wayuu language|Wayuu language]] arrived today. No local (frequency) list was available today. But UNILEX + Dragons Bot can provide a local Wayuu frequency list of 8000 items, ready to record.<br>
| |
− | Since we don't know which semi-rare languages will come next, having 1,000 languages ready is a safe yet not so excessive bet. Assuming a [[:en:Zipf's law]]/[[:en:Long tail]] curve for languages and their speakers, we can still predict that at least one out of 10~20 new languages' speakers will lack a local wordlist. But together with OlafBot's lists, we move from 6% toward 90% of our languages having a solid, '''usefulness-based roadmap''' to walk forward. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 14:21, 3 March 2021 (UTC)
| |
| | | |
− | == jQuery.Deferred exception: this.pastRecords is undefined ==
| + | I'm Robert RUGAMBA from Rwanda and I belong to Wikimedia Rwanda as a volunteer and event organizer. |
− | :''This discussion may be moved to [[LinguaLibre:Technical board]].'' | + | I'm excited to explore the Lingua Libre platform and I wish my local languages to be added and represented. The Wikidata label is: https://www.wikidata.org/wiki/Q33573 |
− | Hello, there.
| |
| | | |
− | When I try to load a list of words to record from the FR wiktionary, the modal does not disappear when I click "Done" and seems blocked trying to load the words. During this time, the JS console complains that "jQuery.Deferred exception: this.pastRecords is undefined", and the last resource loaded is, in cURL format:
| + | Thanks. [[User:Annick green|Annick green]] |
− | curl 'https://fr.wiktionary.org/w/api.php?action=query&format=json&origin=*&formatversion=2&prop=pageterms&wbptterms=label&generator=categorymembers&gcmnamespace=0&gcmtitle=%3ACat%C3%A9gorie%3ALocutions%20verbales%20en%20fran%C3%A7ais&gcmtype=page&gcmlimit=max' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:85.0) Gecko/20100101 Firefox/85.0' -H 'Accept: application/json, text/javascript, */*; q=0.01' -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3' --compressed -H 'Origin: https://lingualibre.org' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Referer: https://lingualibre.org/' -H 'TE: Trailers'
| + | :{{Done}} This language was already on Lingualibre as [[Q285]]. If you open [[Special:RecordWizard]], at step 2, add it to your list of known languages. Please type in « Kinyarwanda » or « Ikinyarwanda » and you should find it. Only users who have declared that they know Kinyarwanda can record in Kinyarwanda. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 16:50, 27 June 2024 (UTC) |
| | | |
− | Looks like there is a bug…
| + | == Rename my pseudonym == |
| | | |
− | Regards.
| + | Hello. I've renamed my account on Wikimedia sites but can't log in directly with the new username here. Do I need to do anything? My old username is '''ElsaBester''' and the new one is '''L'embellie'''. Thanks! |
− | [[User:LoquaxFR|LoquaxFR]] ([[User talk:LoquaxFR|talk]]) 17:21, 24 February 2021 (UTC)
| + | :Hello [[User:ElsaBester|L'embellie]], |
− | :Hi {{u|LoquaxFR}}, can you describe precisely what you do when you write "when I try to load a list of words to record from the FR wiktionary"? How do you load the word list: by using the "Wikimedia category" option on the right, or by creating the word list yourself one word at a time? If you use "Wikimedia category", can you tell us which category you want to use? Can you reproduce the problem whatever category you work with? Thanks in advance for these details, which I hope will help pin down the problem as precisely as possible. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 17:58, 24 February 2021 (UTC)
| + | :I may ping [[User:WikiLucas00|WikiLucas00]], but I think we don't currently have a solution for your issue. |
− | :: It will indeed be simpler in French. The problem occurs every time I try to use a Wikimedia category (from the French Wiktionary, in this case); that is the only option I use to load words, and the problem appears for every category I try, both those where I have already recorded almost all the words and those where I have done only a small part of the thousands of terms. The problem also occurs in private browsing, so it does not seem to be the cache or the cookies. If you need more information, don't hesitate to ask. [[User:LoquaxFR|LoquaxFR]] ([[User talk:LoquaxFR|talk]]) 18:08, 24 February 2021 (UTC) | + | :We are phasing out this wiki; we hope to release a new Lingualibre this winter or early 2025. So this issue will be irrelevant by then. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:00, 22 August 2024 (UTC) |
− | :::Thanks for the additional information. I just tested with Firefox 78.7 and I don't run into this problem. Can you try another browser (Chromium or another) to see whether the problem is specific to your Firefox (including in private browsing)? It could, for example, come from a gadget you have installed. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:40, 24 February 2021 (UTC) | + | ::Hey there {{ping|ElsaBester|Yug}}. Sorry I don't have a solution, but I found this in the Chat Room's archives: [[LinguaLibre:Chat_room/Archives/2023#Update_my_username]]. Good luck — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 18:46, 26 August 2024 (UTC) |
− | ::::A Firefox add-on breaking the JS? [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:57, 24 February 2021 (UTC) | + | :::Hello {{ping|ElsaBester}} you may also look at my latest reply on [[User talk:Yug]], it's not a great option but maybe you'll want to try it. All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 12:24, 31 August 2024 (UTC) |
− | ::::: Chrome and Safari give me the same result; I also tried from another machine and another OS, with no improvement: the JS error still shows up and nothing happens when I confirm the modal. Could I have recorded too many words, making the JS break when it tries to remove the ones already recorded? Since only a few of us have recorded that many, it could be. I had already noticed that loading lists from the Wiktionary was taking longer and longer for me (relatively speaking: a few seconds of waiting at most). Is this another problem tied to my account? [[User:LoquaxFR|LoquaxFR]] ([[User talk:LoquaxFR|talk]]) 06:30, 25 February 2021 (UTC)
| |
− | ::::::Thanks for the extra details. I have opened [[phab:T275734|T275734]]. We should check with {{u|Lepticed7}} and {{u|WikiLucas00}}, who have roughly the same number of recordings as you, to test whether they run into the same problem. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 06:54, 25 February 2021 (UTC) | |
− | ::::::: Hi, personally, I don't know whether it's related, but there are some recordings that the Record Wizard does not remove when I want to remove already recorded words. As evidence, see [https://commons.wikimedia.org/w/index.php?title=File%3ALL-Q143_(epo)-Lepticed7-aprilo.wav this file], which I recorded three times. [[User:Lepticed7|Lepticed7]] ([[User talk:Lepticed7|talk]]) 10:45, 28 February 2021 (UTC)
| |
| | | |
− | == 50,000 == | + | == Two French words that are impossible to record == |
− | February 2021. This month, we have seen 50,000 pronunciations in a month (see [[LinguaLibre:Statistics]]). This is the first time we have seen 50,000 entries in a month. This is great. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 08:51, 28 February 2021 (UTC)
| |
− | :That's really amazing. The same month we passed 400k recordings! AND the shortest month in the year! I'm going to prepare a small News to be published every month (inspired by what you did in September if I remember correctly), I think February is a very good month to start with! I'll publish it on your talk page if you'd like 🙂 All the best ! — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 16:11, 28 February 2021 (UTC)
| |
− | :* We can actually officially start a bi-monthly [[LinguaLibre:Newsletter]] to be published on 1 March, 1 May, 1 July and so on. What do you think? I am also requesting [[User:Pamputt]], [[User:Yug]], [[User:Lyokoï]], [[User:Lepticed7]] to comment. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 17:40, 28 February 2021 (UTC)
| |
− | :::I would say, why not but I cannot lead for such project so if you are motivated to write and lead such newsletter, go ahead. [[User:Pamputt|Pamputt]] ([[User talk:Pamputt|talk]]) 18:39, 28 February 2021 (UTC)
| |
− | ::::On the [[LinguaLibre:Technical board/intro]] Poslovitch has started a [[LinguaLibre:Technical board/News|/News]] section which keeps a log of important milestones. It's an interesting idea because it's minimalist, therefore low maintenance.
| |
− | ::::I'm also interested in a Newsletter for both external and internal purposes. I would help around, yes. The editorial line would gain from being clarified: who the expected readers are, writing style, overall length, major sections, section lengths, etc. But this can "appear" with the first few issues :) Please keep a balance so the writing workload stays modest. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 18:57, 28 February 2021 (UTC)
| |
− | :::::The /News of the technical board is mostly about technical news. '''I fully agree to the idea of a Newsletter, yet quarterly'''. We could grab some ideas from the French Wiktionary's ''[https://fr.wiktionary.org/wiki/Wiktionnaire:Actualit%C3%A9s Actualités]''. --[[User:Poslovitch|Poslovitch]] ([[User talk:Poslovitch|talk]]) 20:33, 28 February 2021 (UTC)
| |
− | :::::* Salut, let's start with the newsletter of March. I'll add the stories I know such as 400,000 audios, 50,000 this month, the Wikimedia Wikimeet India, upcoming France-India call, French Wiktionary missed recording work etc. I'll start the draft tomorrow and ping you here.<br>In future we will need [[:mw:Extension:MassMessage]] to send newsletter to subscribers' talk page. A system admin is needed with access to the server and localsettings.php etc pages. I understand this will take time, so it can wait. Kind regards. --[[User:Titodutta|টিটো দত্ত (Titodutta)]] ([[User talk:Titodutta|কথা]]) 21:24, 28 February 2021 (UTC)
| |
− | :{{Ping|Titodutta}} hi, we are having another discussion on the mailing list about networking, cooperation and outward communication. I think the [[LinguaLibre:Newsletter]] page can be modeled upon the Technical board and [[LinguaLibre:Bot]], a kind of hub for a subgroup of active users dedicated to a common goal. In this case <u>Communication</u>. The bimonthly Newsletter could be a core, founding element. But other discussions about outreach could take place there. We have so much to push in this direction: academic outreach, rare languages and under-represented countries, partner institutions, calling for new Wikimedians, reminding far-away Wikimedia chapters of Lingualibre, etc. Having a hub dedicated to writing elegant co-edited texts, defining targets and leading the call for communication campaigns would be a strong plus. I'm still focused on code but I could help in a few weeks. You seem to love it as well. Do we have other users interested in joining such efforts? It would be good to have a few more folks. [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 20:39, 2 March 2021 (UTC)
| |
| | | |
− | === Newsletter : March 2021 review ? ===
| + | Hi, |
− | :''You can co-edit this text. PS Titodutta: a rough summary of past months and emerging directions based on a message to an ex-contributor.''
| |
− | In January and February, the « Lili » community has taken back control of the technical stack (access to servers, GitHub code, bots, etc.) and made a call for more diverse speakers. The Indian community started to show up, with key Indic languages being Bengali (50,000) and Marathi (~10,000). Romanian, Polish and Ukrainian are also on the rise, at around 20,000 audios each. We continue to have a few dozen smaller languages showing up but no powerful push yet.
| |
| | | |
− | Right now, an external software company is upgrading our MediaWiki and its modules thanks to Wikimedia France's funding. The volunteer dev team is also strong and internal organization is increasing. We now have [[LinguaLibre:Technical board]] as a tech hub, [[LinguaLibre:Bot]] as a bot hub, [[LinguaLibre:Events]] as an IRL/Online event hub.
| + | Two words are impossible to record (even before uploading): ''esclavesse'' and ''scribesse'' (all my attempts with other words work). [[User:Avatea|Avatea]] ([[User talk:Avatea|talk]]) 18:49, 30 August 2024 (UTC) |
− | When the main software upgrade settles down in a month we plan a [yet to create] [[LinguaLibre:Newsletter]] as an inward and outward communication hub.
| + | :Hi {{ping|Avatea}}. Sorry for the late reply. I couldn't reproduce the issue on my side, as you can see ({{Q|1385666}}, {{Q|1385667}}) I just recorded a few words ending with -esse, including the two words you mention, without encountering any issue. Did you try again recently? All the best — '''[[User:WikiLucas00|WikiLucas]]''' [[User talk:WikiLucas00|(🖋️)]] 14:42, 21 September 2024 (UTC) |
| + | :: Hi {{ping| WikiLucas00}} |
| + | :: No. I just tried, I was able to record another word, but still not those two. [[User:Avatea|Avatea]] ([[User talk:Avatea|talk]]) 19:06, 21 September 2024 (UTC) |
| + | :: After several dozen new recordings (and I had made hundreds of others before), still unable to record these two words. Tested on macOS and Windows. [[User:Avatea|Avatea]] ([[User talk:Avatea|talk]]) 21:26, 7 October 2024 (UTC) |
| | | |
− | In that last dimension, we could reach out to « relay users » on other wikis, who can share our news about LinguaLibre with communities of wiktionaries, wikisources, wikipedias, wikidata. We are equally considering formally reaching out to non-Wikimedia groups such as Common Voice, Unicode, governmental and NGO agencies, research centers. Possibly in the form of group work and/or an online editathon where we gather to spread the news. This hub, summarizing the community's discussions, will therefore also clarify goals and strategies. We are looking for help with this matter.
| + | == Delete two incorrect recordings == |
| | | |
− | This current forward dynamic is thanks to the early Autumn 2020's efforts. We weren't able to immediately convert those into actions but it still injected energy and vision into LinguaLibre which helped snowball the current dynamic. Also, many thanks to all those who got involved in this journey! [[User:Yug|Yug]] [[User talk:Yug|<small><font style="color:green;">(talk)</font></small>]] 07:20, 3 March 2021 (UTC)
| + | Hello! Because of a typing mistake and because I did it in a hurry, I recorded two terms by mistake: *"[[Q1387394|escaramón]]" and its plural *"[[Q1387395|escaramones]]". Would it be possible to delete these recorded files? I have already made correct recordings of these words, properly spelled and with the correct pronunciation: "[[Q1387396|escamarón]]", "[[Q1387397|escamarones]]". You can check the accuracy of this term [https://diccionariu.alladixital.org/index.php?cod=21008 here]. Sorry for the inconvenience. --[[User:Limotecariu|Limotecariu]] ([[User talk:Limotecariu|talk]]) 20:31, 28 September 2024 (UTC) |
− | :Also, I just found out Commons grows at a speed of [https://stats.wikimedia.org/#/commons.wikimedia.org/content/pages-to-date/normal|line|2-year|page_type~content*non-content|monthly about 1 million files per month]. So with 50,000 audios last month, Lili makes up about 5% of Commons' new files.
| |