LinguaLibre talk
Difference between revisions of "Citations"
Edit by [[User:Yug|Yug]] ([[User talk:Yug|talk]]) 09:29, 11 August 2023 (UTC)
Revision as of 09:29, 11 August 2023
Massively hyper lingual projects
With pleasure! Here are the papers:
- https://research.google/pubs/pub47206/ for mining wordlists (Unilex-style) from 2,000+ languages
- https://research.google/pubs/pub46952/ cleaning them up; open-sourced in https://arxiv.org/abs/2103.15845
- https://research.google/pubs/pub49814/ using these wordlists to find sentences with our web crawler
- https://research.google/pubs/pub50211/ cleaning up web-crawled text
- https://arxiv.org/abs/2205.03983 building machine translation systems from them; blog post https://ai.googleblog.com/2022/05/24-new-languages-google-translate.html
- https://arxiv.org/abs/2305.13516 https://huggingface.co/spaces/mms-meta/MMS