
Difference between revisions of "Homographs"

Homographs (same writing) that are not homophones (not the same pronunciation), also known as heteronyms, require a suffix to differentiate their audio recordings. The suffix should not be pronounced when recording, but it will appear in the filename. For convenience, add this suffix to the List:{ISO}/{list title} you plan to record, before recording it.

 
(29 intermediate revisions by 2 users not shown)
Line 1: Line 1:
'''Homographs''' (same writing) but not homophone (not same pronunciation), aka [[:en:Heteronym_(linguistics)#Examples|Heteronym]] require a suffix to differentiates these audios. The suffix should not be pronounced nor recorded.
+
[[File:Homograph homophone venn diagram.svg|thumb|300px|[[w:Euler diagram|Euler diagram]] showing the relationships between heteronyms and related linguistic concepts.]]
 +
{{#Subtitle:'''Homographs''' (same writing) but not homophone (not same pronunciation), aka [[:en:Heteronym_(linguistics)#Examples|Heteronym]] require a suffix to differentiates these audios. The suffix should not be pronounced when recording, but it will appear in the filename. For more convenience, this suffix should be added to the <code>List:{ISO}/{list title}</code> you plan to record, before recording it.}}
  
== Rule ==
+
== Rules ==
# If one pronunciation is clearly the norm, no suffix is needed
+
# If one pronunciation is clearly the norm, no suffix is needed.
# For equal rank or rare pronunciations, add to that word a suffix within brackets, like so:<br><code># word (suffix)</code>.
+
# For equal rank or rare pronunciations, add to that word a suffix within brackets, example:<br><code># word (suffix)</code>.
# This suffix should hint at the difference between both items.
+
# This suffix should hint at the difference between two homographs or more.
 +
# The suffix must be consistent and stable, ex: if you start with <code>(noun)</code>, <code>(verb)</code>, keep that exact convention <u>for all</u> your recordings. If you start with a transcription, keep on that transcription. Etc.
 +
# The suffix is in the same language as the word, ex : <code>red (noun)</code>, <code>အနီရောင် (နာမ်)</code>.
 +
# Abbreviated suffixes should be avoided. Prefer full suffix <code>adjective</code>, <code>verb</code>, <code>noun</code>, <code>casual</code>, <code>formal</code>, …
  
== Example ==
+
== Homographs homophones ==
In French, the following are homographs non homophones, the part between brackets is not read aloud in LinguaLibre.
+
Given one language and one speaker, one recording for them all. Even if meaning or role (part of speech) diverge.
  
Distinction via the part of speech :
+
== Homographs non-homophones ==
* <code># excellent (v)</code>, pronounced and recorded `excel`
+
The following are homographs non-homophones, the part between brackets is not read aloud in LinguaLibre but is used to distinguish those recordings.
* <code># excellent (adj)</code>, pronounced and recorded `excellant`
+
 
Distinction via semantic synonyms :
+
Distinction via semantic synonyms. In English :
 
* <code># crooked (injured)</code>, pronounced and recorded `crookaid` /ˈkrʊkɪd/
 
* <code># crooked (injured)</code>, pronounced and recorded `crookaid` /ˈkrʊkɪd/
 
* <code># crooked (corrupt)</code>, pronounced and recorded `crookt` /ˈkrʊkt/
 
* <code># crooked (corrupt)</code>, pronounced and recorded `crookt` /ˈkrʊkt/
Distinction via pronunciation in a transcription of your choice, here with [[:en:IPA|IPA]]:
+
 
 +
Distinction via pronunciation. In Mandarin Chinese, using toned [[:en:Hanyu pinyin|Hanyu pinyin]]:
 +
* <code># 雨 (yǚ)</code>, noun, pronounced and recorded `/yː3/`
 +
* <code># 雨 (yù)</code>, verb, pronounced and recorded `/y:4/`
 +
 
 +
Distinction via the part of speech. In French :
 +
* <code># excellent (verb)</code>, pronounced and recorded `excel` /ɛk.sɛl/
 +
* <code># excellent (adjective)</code>, pronounced and recorded `excellant` /ɛk.sɛ.lɑ̃/
 +
 
 +
Distinction via pronunciation. In English, using [[:en:IPA|IPA]]:
 
* <code># crooked (/ˈkrʊkɪd/)</code>, pronounced and recorded `crookaid` /ˈkrʊkɪd/
 
* <code># crooked (/ˈkrʊkɪd/)</code>, pronounced and recorded `crookaid` /ˈkrʊkɪd/
 
* <code># crooked (/ˈkrʊkt/)</code>, pronounced and recorded `crookt` /ˈkrʊkt/
 
* <code># crooked (/ˈkrʊkt/)</code>, pronounced and recorded `crookt` /ˈkrʊkt/
In some language, word can be pronounced and recorded differently if read by a man or woman :
+
 
* <code># vert (masculin)</code>, pronounced and recorded `ver`
+
Distinction via cultural dimension, depending on the public (hierarchy, age, seniority). In Japanese :
* <code># vert (féminin)</code>, pronounced and recorded `verte`
+
* <code># 昨日</code> (default), pronounced and recorded `きのう`
 +
* <code># 昨日 (polite)</code>, pronounced and recorded `さくじつ`
 +
* <code># 明日</code> (default), pronounced and recorded `あした `
 +
* <code># 明日 (polite)</code>, pronounced and recorded `あす`,`みょうにち`
 +
* <code># 私</code> (default), pronounced and recorded `わたし watashi`
 +
* <code># (polite)</code>, pronounced and recorded `わたくし watakushi`
  
 
== In practice ==
 
== In practice ==
Line 27: Line 45:
 
# ကစေံ1
 
# ကစေံ1
 
# ကစေံ2
 
# ကစေံ2
 +
# ကစေံ3
 +
# ကစေံ4
 
</pre>
 
</pre>
 
into
 
into
 
<pre>
 
<pre>
#ကစေံ (read)
+
# ကစေံ (read)
#ကစေံ (speak)
+
# ကစေံ (speak)
 +
# ကစေံ (Tang)
 +
# ကစေံ (Te)
 
</pre>
 
</pre>
 
You can now record your words, without reading the suffix.
 
You can now record your words, without reading the suffix.
  
{{draft}}
+
== Technical details ==
[[Category:Lingua Libre:Help]]
+
The suffix is not part of the word and is stored with the property {{P|18}} in the Wikibase. See {{Q|1686}} and {{Q|1685}} for example. It is then possible to query recordings without mixing words and suffixes.
 +
 
 +
== See also ==
 +
* [[Help:Lists]]
 +
* [[Help:List translation]]
 +
 
 +
{{Helps}}

Latest revision as of 16:44, 4 September 2023

Euler diagram showing the relationships between heteronyms and related linguistic concepts.


Rules

  1. If one pronunciation is clearly the norm, no suffix is needed.
  2. For pronunciations of equal rank, or for rare pronunciations, add a suffix in brackets after the word, for example:
    # word (suffix).
  3. The suffix should hint at the difference between two or more homographs.
  4. The suffix must be consistent and stable. For example, if you start with (noun) and (verb), keep that exact convention for all your recordings. If you start with a transcription, keep using that transcription, and so on.
  5. The suffix is in the same language as the word, for example: red (noun), အနီရောင် (နာမ်).
  6. Avoid abbreviated suffixes. Prefer full suffixes such as adjective, verb, noun, casual, formal, …
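
As an illustration of the `# word (suffix)` convention in the rules above, here is a minimal Python sketch that splits a list line into the word and its optional suffix. The helper name `parse_list_line` and the regular expression are assumptions made for this example; they are not taken from Lingua Libre's own code.

```python
import re

# Hypothetical pattern based on the "# word (suffix)" convention.
# It is an assumption for illustration, not Lingua Libre's actual parser.
LIST_LINE = re.compile(r"^#\s*(?P<word>.+?)(?:\s*\((?P<suffix>[^()]*)\))?\s*$")


def parse_list_line(line):
    """Split a list line into (word, suffix); suffix is None when absent."""
    match = LIST_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a list entry: {line!r}")
    return match.group("word"), match.group("suffix")


if __name__ == "__main__":
    for entry in ["# excellent (verb)", "# excellent (adjective)", "# 昨日"]:
        print(parse_list_line(entry))
```

Only the word part would be read aloud; the suffix is kept apart so that it can label the recording.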

Homographs homophones

For a given language and speaker, make a single recording for all of them, even if the meaning or role (part of speech) differs.

Homographs non-homophones

The following are homographs that are not homophones. The part between brackets is not read aloud in LinguaLibre, but it is used to distinguish the recordings.

Distinction via semantic synonyms. In English:

  • # crooked (injured), pronounced and recorded `crookaid` /ˈkrʊkɪd/
  • # crooked (corrupt), pronounced and recorded `crookt` /ˈkrʊkt/

Distinction via pronunciation. In Mandarin Chinese, using toned Hanyu pinyin:

  • # 雨 (yǔ), noun, pronounced and recorded `/yː3/`
  • # 雨 (yù), verb, pronounced and recorded `/yː4/`

Distinction via the part of speech. In French:

  • # excellent (verb), pronounced and recorded `excel` /ɛk.sɛl/
  • # excellent (adjective), pronounced and recorded `excellant` /ɛk.sɛ.lɑ̃/

Distinction via pronunciation. In English, using IPA:

  • # crooked (/ˈkrʊkɪd/), pronounced and recorded `crookaid` /ˈkrʊkɪd/
  • # crooked (/ˈkrʊkt/), pronounced and recorded `crookt` /ˈkrʊkt/

Distinction via a cultural dimension, depending on the audience (hierarchy, age, seniority). In Japanese:

  • # 昨日 (default), pronounced and recorded `きのう`
  • # 昨日 (polite), pronounced and recorded `さくじつ`
  • # 明日 (default), pronounced and recorded `あした`
  • # 明日 (polite), pronounced and recorded `あす` or `みょうにち`
  • # 私 (default), pronounced and recorded `わたし watashi`
  • # 私 (polite), pronounced and recorded `わたくし watakushi`

In practice

Within your list, such as List:mnw/Commons, transform:

# ကစေံ1
# ကစေံ2
# ကစေံ3
# ကစေံ4

into

# ကစေံ (read)
# ကစေံ (speak)
# ကစေံ (Tang)
# ကစေံ (Te)

You can now record your words, without reading the suffix.
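
The transformation above can also be sketched as a small script. This is a hedged example: the function `add_suffixes` and the mapping it takes are hypothetical, and it assumes the placeholders end in ASCII digits, as in the list shown.

```python
def add_suffixes(entries, suffixes):
    """Replace numbered placeholders such as 'word1' with 'word (suffix)'.

    `suffixes` maps each numbered placeholder to the suffix chosen for it.
    Entries without a mapping are returned unchanged.
    """
    result = []
    for entry in entries:
        word = entry.lstrip("#").strip()
        if word in suffixes:
            # Drop the trailing ASCII digits to recover the bare word.
            base = word.rstrip("0123456789")
            result.append(f"# {base} ({suffixes[word]})")
        else:
            result.append(entry)
    return result


if __name__ == "__main__":
    old_list = ["# ကစေံ1", "# ကစေံ2", "# ကစေံ3", "# ကစေံ4"]
    chosen = {"ကစေံ1": "read", "ကစေံ2": "speak", "ကစေံ3": "Tang", "ကစေံ4": "Te"}
    print("\n".join(add_suffixes(old_list, chosen)))
```

The mapping is written by hand because only the speaker knows which suffix best hints at each pronunciation.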

Technical details

The suffix is not part of the word and is stored with the property qualifier (P18) in the Wikibase. See fils (enfant) (Q1686) and fils (pluriel de fil) (Q1685) for example. It is then possible to query recordings without mixing words and suffixes.
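
To show why storing the suffix separately helps, the following Python sketch groups recordings by word while keeping qualifiers apart. The record layout and field names (`id`, `word`, `qualifier`) are assumptions for illustration and do not reflect the actual Wikibase schema; only the two item labels come from the examples cited above.

```python
from collections import defaultdict

# Illustrative records only: the field names are assumptions,
# not the real Lingua Libre Wikibase data model.
recordings = [
    {"id": "Q1686", "word": "fils", "qualifier": "enfant"},
    {"id": "Q1685", "word": "fils", "qualifier": "pluriel de fil"},
]


def group_by_word(records):
    """Map each word to the list of qualifiers recorded for it."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["word"]].append(record["qualifier"])
    return dict(grouped)


if __name__ == "__main__":
    print(group_by_word(recordings))
```

Because the word field never contains the bracketed suffix, both recordings are found under the same key.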

See also

  • Help:Lists
  • Help:List translation