User

Difference between revisions of "Olafbot"

(Created page with "Bot, created by {{u|Olaf}}, updates various lists of missing audio recordings every night. Much more active in Polish Wiktionary.")
 

Revision as of 01:22, 27 February 2021

Wiktionary Bots.png

This bot, created by Olaf, updates various lists of missing audio recordings every night. It is much more active on Polish Wiktionary.

Lists named "Lemmas-without-audio-sorted-by-number-of-wiktionaries" are created in the following way:

  • For a given language, the bot traverses the categories on all wiktionaries and a few open dictionaries and collects statistics: for each lemma, it counts the dictionaries that describe the word in that language. The bot has been doing this for 11 years to generate various lists for Polish Wiktionary.
  • Titles written in the wrong alphabet for the language are removed.
  • Lemmas that already have an audio recording on Commons are also removed from the set.
  • For a few languages, minor corrections are applied in order to extract the set of dictionary lemmas, where possible without inflected forms.
  • The resulting set is sorted in descending order by the number of wiktionaries and limited to 5000 entries.
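
The steps above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function names, the shape of the input (a mapping from dictionary name to its lemma titles), the script check based on Unicode character names, and the alphabetical tie-break are all assumptions. The language-specific corrections for inflected forms are left out, since the page does not describe them.

```python
import unicodedata
from collections import Counter


def count_dictionaries(lemmas_by_dictionary):
    """For each lemma, count how many dictionaries describe it.

    lemmas_by_dictionary is a hypothetical input: dictionary name -> titles.
    """
    counts = Counter()
    for lemmas in lemmas_by_dictionary.values():
        counts.update(set(lemmas))  # each dictionary counts once per lemma
    return counts


def in_script(title, script_prefix):
    """Assumed alphabet check: every letter's Unicode name starts with
    script_prefix (for example "LATIN" or "CYRILLIC")."""
    letters = [ch for ch in title if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith(script_prefix) for ch in letters
    )


def build_list(lemmas_by_dictionary, script_prefix, has_audio, limit=5000):
    """Return (lemma, dictionary_count) pairs without audio, most common first."""
    counts = count_dictionaries(lemmas_by_dictionary)
    kept = [
        (lemma, n)
        for lemma, n in counts.items()
        if in_script(lemma, script_prefix) and lemma not in has_audio
    ]
    # Descending by count; alphabetical tie-break is an assumption.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept[:limit]


if __name__ == "__main__":
    sample = {
        "en": ["kot", "pies", "кот"],
        "de": ["kot", "pies"],
        "fr": ["kot", "dom"],
    }
    # "pies" is treated as already having a recording on Commons.
    print(build_list(sample, "LATIN", has_audio={"pies"}))
```

In this sketch a lemma listed twice in the same dictionary is counted only once, so the number reflects how many dictionaries cover the word, not how many pages mention it.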