The 2022 Language Resources and Evaluation Conference (LREC 2022) is a major international conference on language resources, attended by both academics and industry professionals. The dominant topics are discourse evaluation and the management of abusive discourse, but the field is working to expand its reach in terms of language diversity. Spoken-language resources remain marginal, which gives Lingualibre both value and opportunities.
- Event: LREC 2022, Language Resources and Evaluation Conference 2022.
- Topic: Language resources and evaluation. Companies looking for projects to work with also attend.
- Place: Marseille, Palais du Pharo
- Date: 2022/06/20-25, 10-19h
- Lead contact: Adélaïde_Calais_WMFr, Yug.
- Objectives: 1) Advocate for "Lingualibre: rapid audio recording tool for lexicons and more". 2) Investigate possible technical sponsorship or collaboration.
- Etherpad: https://etherpad.wikimedia.org/p/LREC
- Report: See Etherpad or meta:User:Yug/Marseille.
- Outcome: ~40 language communities contacted and informed. Contact details collected: see Etherpad. A follow-up email is needed, see Lingualibre:Mailing.
- Lessons: Bring more flyers, name cards (?), and a poster (?). Working as a duo was a good idea. The discussion pitch should be shortened so that more participants can be approached.
- https://research.google/pubs/pub47206/ for mining wordlists (Unilex-style) from 2,000+ languages
- https://research.google/pubs/pub46952/ cleaning them up; open-sourced in https://arxiv.org/abs/2103.15845
- https://research.google/pubs/pub49814/ using these wordlists to find sentences using our web crawler
- https://research.google/pubs/pub50211/ cleaning up web-crawled text
- https://arxiv.org/abs/2205.03983 building machine translation systems from them; blog post https://ai.googleblog.com/2022/05/24-new-languages-google-translate.html