LinguaLibre

Jargon

LinguaLibre's jargon is the set of local and notable concepts that have emerged as relevant to our community's missions.

Draft
This page is a work in progress.

Awareness, Demo, Workshops

See also LinguaLibre:Workshops.

Real-life outreach remains a powerful tool. Common outreach grounds are Wikimedia events and language universities, but any context can work as long as your audience is interested in languages and the conservation of cultural wealth.

The first communication tool is simply to discuss and create interest in our core mission: letting people know that a Wikimedia tool exists that makes rapid audio recording possible, at a rate of 800 recordings per hour.

The second common communication tool is the 4-minute demo. An experienced Lili user shows on their own device how to record a few words from a list, in a demo of under 5 minutes and 30 to 100 words. It requires more visible interest from the audience and a calm room; smartphones and tablets have proved to work well for demoing. The recorded audio files do not have to be uploaded to Commons; it depends on the situation.

The most involved approach requires a dedicated audience ready for a longer exploration of LinguaLibre as an audio tool for rapid recording. In a workshop session, one mentee or a class of learners works under the assistance and leadership of an experienced contributor. The mentees use their Commons.Wikimedia.org account, go to Lingualibre.org, open the RecordWizard, follow the steps, and open a modest list to start recording. Discovery, ambiguity, mistakes and learning will ensue, so that by the end of the session each participant is an aware and trained potential contributor.

In all three scenarios, the newly contacted person can now tell others about LinguaLibre, its core goals, and its ease of use.

Core goals

LinguaLibre's core goal is to record as many words in as many languages as possible. This goal has language conservation, language e-learning, and language diversity implications.

Gentle ramp approach

Gentle ramp in numbers
Start End User type
00001 00200 Discovery user
00201 01000 Returning, motivated
01001 02000 Returning, motivated
02001 05000 Committed
05001 10000 Committed
10001 15000 Berserk
15001 20000 Berserk
20001 25000 Berserk
25001 30000 Dragon
30001 35000 Dragon
35001 40000 Dragon
40001 45000 Dragon
45001 50000 Dragon
50000 + The One

The gentle ramp approach suggests creating lists that provide a progressive practice and learning ground for our audio contributors. This approach is the fruit of real-life experience with Lingualibre's recording outreach, demos and workshops.
The first list has just 200 words. It allows for better on-the-ground first contact and short demo sessions with new users. 200 is gently ambitious: it lets users get past the uncanny valley of the first 20 words and move into the joyful Lingualibre flow of rapid recording. Perfect for demos and onboarding.
The following lists are for motivated users who choose on their own to return. To consolidate skills, list no. 2 has 800 words while list no. 3 has 1,000 items. At this stage the speaker has recorded a decent 2,000 audios. These words likely make up 90 to 95% of daily conversations.
A few users decide to become committed users. List no. 4 has 3,000 items, while all following ones have 5,000 words each. These lists are not expected to be completed in one sitting but over several sessions of one hour or less, during a dedicated day or over a week or so. These lists require some medium-term planning and an understanding that you can pause while a list is incomplete and return to it later on.
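
The list sizes and the table above can be cross-checked with a short Python sketch. This is only an illustration, not part of any LinguaLibre tool: the names `LIST_SIZES`, `ramp_ranges` and `user_type` are hypothetical, and the treatment of exactly 50,000 recordings as still "Dragon" is an assumption, since the table leaves that boundary ambiguous.

```python
from bisect import bisect_left
from itertools import accumulate

# List sizes as described in the text: 200, 800, 1000, 3000, then 5,000 each.
# Nine lists of 5,000 bring the running total to 50,000 (assumed stopping point).
LIST_SIZES = [200, 800, 1000, 3000] + [5000] * 9

# Upper bound of each user type, read from the table (hypothetical encoding).
TIER_ENDS = [200, 2000, 10000, 25000, 50000]
TIER_NAMES = [
    "Discovery user",
    "Returning, motivated",
    "Committed",
    "Berserk",
    "Dragon",
    "The One",
]


def ramp_ranges(sizes):
    """Return (list_number, first_word, last_word) for each list."""
    ends = list(accumulate(sizes))
    starts = [1] + [end + 1 for end in ends[:-1]]
    return [(n, s, e) for n, (s, e) in enumerate(zip(starts, ends), start=1)]


def user_type(recorded):
    """Map a cumulative number of recordings to the table's user type."""
    return TIER_NAMES[bisect_left(TIER_ENDS, recorded)]


if __name__ == "__main__":
    for number, first, last in ramp_ranges(LIST_SIZES):
        print(f"list {number:2d}: {first:05d} to {last:05d}  {user_type(last)}")
```

Running the script prints one line per list, with the start, the end and the user type reached once that list is complete.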

Language codes

See LinguaLibre:Language codes systems used across LinguaLibre.

Long tail, Zipf curve

See Long tail, Zipf's law.

A frequent distribution of data in which a few categories account for most of the occurrences, while a wide range of categories each account for a small number of occurrences. This latter group may, however, together represent a fair share of the whole. For example, about 50 English words make up around 50% of English texts. Similarly, about 6 major languages are spoken by 4 billion people. At the opposite end, a few thousand minority languages are spoken by only a few thousand speakers each. This has important strategic implications:

  • At the level of a single language, it makes more sense to first record frequent words. But it then quickly becomes important to tackle the long tail to document the true diversity of this language.
  • Considering all languages, it makes more sense to first record the most common languages to serve more people. But it then quickly becomes important to tackle the long tail to document the true diversity of human languages.
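
The head-versus-tail trade-off can be illustrated with a minimal Python sketch. It assumes an idealized Zipf distribution in which the word at rank r has a frequency proportional to 1/r; real corpora only approximate this, and the function name `zipf_top_share` and the vocabulary size of 100,000 are hypothetical choices made for the example.

```python
def zipf_top_share(k, vocabulary_size, exponent=1.0):
    """Share of all occurrences covered by the k most frequent items,
    under an idealized Zipf law (frequency proportional to 1 / rank**exponent)."""
    if not 0 <= k <= vocabulary_size:
        raise ValueError("k must be between 0 and vocabulary_size")
    weights = [1.0 / rank**exponent for rank in range(1, vocabulary_size + 1)]
    return sum(weights[:k]) / sum(weights)


if __name__ == "__main__":
    vocabulary = 100_000  # assumed vocabulary size, for illustration only
    for k in (50, 200, 2_000, 20_000):
        head = zipf_top_share(k, vocabulary)
        print(f"top {k:>6} items: {head:.1%} of occurrences, "
              f"long tail: {1 - head:.1%}")
```

Under these assumptions the first few items cover a large share quickly, while each further gain in coverage requires many more items, which is why the long tail needs deliberate attention.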

Lili

Aka LinguaLibre, Lingua Libre, LL, …

Elegant nickname and shortcut for LinguaLibre. While we started out using LL, Indian contributor Titodutta pointed out that Lili (pronounced "Leelee") was a good way to nickname the project orally. It was adopted by other users.

See also