LinguaLibre

2022-2023 projection

Revision as of 19:57, 20 March 2022 by 0x010C

This week, Wikimedia France is establishing its budget for the period of July 2022 to June 2023.

Please share here what you think we should get done this year on Lingua Libre. Feel free to add projects of yours that would require funding, as well as bugs and foreseeable technical needs. Please remember to link Phabricator tickets to the bugs and technical issues you raise. A maximum of 10 suggestions per person would be best.

Write your suggestions here
| Username | Project or problem | Reason why this is important to accomplish or solve in 2022-2023 | Approximate human time needed | Estimated budget | Would you like to be involved in this? |
|---|---|---|---|---|---|
| Rdrg109 | Program for extracting sentences from any audio stream for inclusion in Lingua Libre. | Each extracted audio clip would correspond to one sentence. Each sentence could be added to lexemes as a "usage example". Usage examples with pronunciation audio make Wikidata lexicographical data more useful. With SPARQL, we could then answer questions such as: which usage examples have pronunciation audio retrieved from interviews where the participant is a native speaker of that language? More information about this idea on this page. | 3 months | Unknown (I have little experience with MediaWiki development, so it will be more of a learning experience) | Yes |
| Rdrg109 | Interface in Lingua Libre focused on adding missing pronunciation audio (forms and usage examples that have no pronunciation audio) | Much of the Wikidata lexicographical data lacks pronunciation audio. As of 2022/03/18 22:24:21 UTC, only 1 usage example has pronunciation audio. English has 129942 forms, but only 340 have pronunciation audio (i.e. ~0.26% of English forms); the situation is similar for other languages. More statistics on this page. | 2 months | Unknown (I have little experience with MediaWiki development, so it will be more of a learning experience) | Yes |
| marreromarco | Improving the search function to make LinguaLibre useful for language learning | The current user interface makes it impossible to use Lingua Libre for language learning as a competitor to Forvo. Without language learners interested in the project, very few people would be interested in contributing, since the audio would be stored in a database with no practical use. LinguaLibre could be a FOSS alternative to Forvo that lets people listen to recordings easily and quickly. It is important to solve this problem in 2022-2023 to attract more contributors and expand the number of recordings. Otherwise, only a few "Wikimedians" would collaborate, and the database would never grow. | 6 months | 30,000 euros (cost of hiring a full-time, experienced developer) | Yes (with feedback/ideas). I am not a programmer, but I would like to provide as much feedback as possible and report bugs. |
| marreromarco | Public Relations (PR) Campaign | LinguaLibre is essentially unknown among language learners. In its current state, the project has no way to attract learners because it lacks an efficient “Search Functionality”. If LinguaLibre hired a developer to improve the search function, it would then be necessary to promote the website to attract language learners (and new contributors). An efficient way to promote the website is through blog posts, YouTube videos, social media, magazines, newspapers, etc. A PR campaign is necessary in 2022-2023 to increase the number of active contributors and become a viable FOSS alternative to Forvo. | 6 months | 6,000 euros (cost of hiring an intern to work at WikimediaFrance Headquarters) | Yes (with feedback/ideas) |
| marreromarco | Anki Integration with LinguaLibre | An Anki add-on would be helpful for language learners. | 3 months | 15,000 euros (depends on the number of hours that a developer would have to invest) | Yes (with feedback/ideas). |
| marreromarco | Add a function to "Request" a Pronunciation from Native Speakers | It is very useful for language learners to be able to request the specific word or phrase whose pronunciation they are unsure of. Forvo offers such a function, and users make very creative requests. It is also especially helpful for technical terms and proper names. | 3 months | 15,000 euros (depends on the number of hours that a developer would have to invest) | Yes (with feedback/ideas). |
| marreromarco | Establish a “Voice Month” on Wikipedia | Propose to Wikimedia Headquarters the development of a "Voice Month" in which LinguaLibre would be promoted on Wikipedia articles, in the "Languages" section on the left side of the Main Page. The idea was discussed previously: https://lingualibre.org/wiki/LinguaLibre:Events/Winter_2021-2022_Public_Relations_Campaign | 6 months | 6,000 euros (payment of an intern in charge of the PR campaign) | Yes |
| Poslovitch | Improve the Datasets page | The Datasets index is unsightly and, at best, off-putting for people who want to reuse our recordings through the datasets. We could take some inspiration from CommonVoice's, especially regarding statistics for each dataset. | 1 week | < 400 € (whether we rely on a professional or a volunteer developer) | Yes (can actually do this) |
| 0x010C | New tools for our power-users | Lingua Libre lacks a couple of tools to help experienced users perform a number of maintenance tasks: patrolling; batch-editing metadata; batch-importing recordings (like the one we had on LinguaLibre v1); ... These tools could be integrated directly as new special pages in the RecordWizard MediaWiki extension. | 3 months | 16000 € | Yes |
| 0x010C | Allow users to easily explore our fantastic audio database | Since we launched v2 of this website in July 2018, hardly anything has changed, with one major exception: QueryViz, the extension used to display SPARQL queries inside wiki pages. Now that Lingua Libre has almost 700,000 audio recordings in its database, it would be good to take the time to improve this extension so that everyone can explore our dataset in an easy-to-use, responsive and powerful online interface. This will have the side effect of attracting more people to the website, thereby increasing public awareness of the tool and the number of contributors. | 3 months | 16000 € | Yes |
| 0x010C | Global MediaWiki upgrade | Time goes by and MediaWiki versions increase. If the schedule is respected, the next LTS version (1.39) will be released in November 2022. At that point we will have to plan a migration to stay up to date and keep our users safe. This will involve small but numerous adjustments in LinguaLibre-specific extensions. Beyond that, there are still many possible improvements to the user experience on our MediaWiki: the main search bar, the lack of a Visual Editor, special pages and the wikicode-editing UI (Special:Search, Special:Recent changes, ...), etc. | 1.5 months | 9000 € | Yes |
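
The coverage figures in Rdrg109's second suggestion, and the kind of SPARQL query an interface for missing pronunciation audio might start from, can be sketched as follows. This is only an illustrative sketch, not part of any existing Lingua Libre tool: the function names `build_missing_audio_query` and `audio_coverage_percent` are hypothetical, the query assumes the Wikidata identifiers P443 (pronunciation audio) and Q1860 (English), and it assumes the `wd:`, `wdt:`, `dct:` and `ontolex:` prefixes are predefined, as they are on the Wikidata Query Service. The query is only built as a string here, not sent anywhere.

```python
from string import Template

# Hypothetical query template: lists forms of one language that have no
# pronunciation audio (P443). Prefixes are assumed to be predefined by the
# SPARQL endpoint, as on the Wikidata Query Service.
MISSING_AUDIO_QUERY = Template("""\
SELECT ?form ?representation WHERE {
  ?lexeme dct:language wd:$language ;
          ontolex:lexicalForm ?form .
  ?form ontolex:representation ?representation .
  FILTER NOT EXISTS { ?form wdt:P443 ?audio }
}
LIMIT $limit
""")


def build_missing_audio_query(language_qid: str, limit: int = 100) -> str:
    """Return a SPARQL query string for forms lacking pronunciation audio."""
    # Accept only identifiers shaped like "Q1860" to avoid malformed queries.
    if not (language_qid.startswith("Q") and language_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {language_qid!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return MISSING_AUDIO_QUERY.substitute(language=language_qid, limit=limit)


def audio_coverage_percent(with_audio: int, total: int) -> float:
    """Return the share of forms that have pronunciation audio, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * with_audio / total


if __name__ == "__main__":
    # Figures quoted in the table above: 340 of 129942 English forms.
    print(f"{audio_coverage_percent(340, 129942):.2f}% of English forms have audio")
    print(build_missing_audio_query("Q1860", limit=10))
```

A small `LIMIT` keeps the result set manageable; a contribution interface would probably page through results rather than fetch every form at once.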