LinguaLibre

Difference between revisions of "Citations"

The Citations page gathers all citations of Lingua Libre by external sources.

(removed red link)
(articles in "peripharic" do not cite Lingua Libre)
 
== Wikimedia Newsrooms ==
 
== Academic ==
 
=== Lingualibre ===
 
 
[[File:Hutin and Allasonniere-Tang, L'apport des données collaboratives à l'exploration linguistique.pdf|thumb|L'apport des données collaboratives à l'exploration linguistique]]
 
* https://www.researchgate.net/publication/361565674_Crowd-sourcing_for_Less-resourced_Languages_Lingua_Libre_for_Polish
 
 
* https://elex.link/elex2021/wp-content/uploads/2021/08/eLex_2021_38_pp588-597.pdf
 
** Xavier Marjou (2021), GIPFA: Generating IPA Pronunciation from Audio
 
=== Peripharic ===
 
Word lists from Google / Unilex research:
 
* https://research.google/pubs/pub47206/ mining word lists (Unilex-style) for 2,000+ languages
 
** Prasad, Manasa; Breiner, Theresa; Esch, Daan van (2018). "Mining Training Data for Language Modeling across the World's Languages" (PDF). Proceedings of the 6th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2018).
 
* https://research.google/pubs/pub46952/ cleaning up the word lists
 
** Chua, Mason; Esch, Daan van; Coccaro, Noah; Cho, Eunjoon; Bhandari, Sujeet; Jia, Libin (2018). "Text Normalization Infrastructure that Scales to Hundreds of Language Varieties". Proceedings of the 11th edition of the Language Resources and Evaluation Conference.
 
* https://arxiv.org/abs/2103.15845 open-sourced;
 
** Zupon, Andrew; Crew, Evan; Ritchie, Sandy (2021-03-29). "Text Normalization for Low-Resource Languages of Africa". arXiv:2103.15845 [cs].
 
* https://research.google/pubs/pub49814/ using these word lists to find sentences with a web crawler
 
** Caswell, Isaac; Breiner, Theresa; Esch, Daan van; Bapna, Ankur (2020). "Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus".
 
* https://research.google/pubs/pub50211/ cleaning up web-crawled text
 
** Kreutzer, Julia; Caswell, Isaac; Wang, Lisa; Wahab, Ahsan; Esch, Daan van; Ulzii-Orshikh, Nasanbayar; Tapo, Allahsera Auguste; Subramani, Nishant; Sokolov, Artem; Sikasote, Claytone; Setyawan, Monang (2022). "Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets". TACL.
 
* https://arxiv.org/abs/2205.03983 building machine translation systems from them
 
** Bapna, Ankur; Caswell, Isaac; Kreutzer, Julia; Firat, Orhan; van Esch, Daan; Siddhant, Aditya; Niu, Mengmeng; Baljekar, Pallavi; Garcia, Xavier; Macherey, Wolfgang; Breiner, Theresa (2022-05-16). "Building Machine Translation Systems for the Next Thousand Languages". arXiv:2205.03983 [cs].
 
* https://ai.googleblog.com/2022/05/24-new-languages-google-translate.html blog post
 
** "Unlocking Zero-Resource Machine Translation to Support New Languages in Google Translate". Google AI Blog. Retrieved 2022-06-30.
 
  
 
== See also ==
 
* [[Lingualibre:Events]]
 
* [[Lingualibre:Apps]]
 

Revision as of 07:22, 1 July 2022

Press

France

World

Wikimedia Newsrooms

Academic

L'apport des données collaboratives à l'exploration linguistique

See also