List talk

Fra/Lemmas-without-audio-sorted-by-number-of-wiktionaries

Hi @Olaf, thank you very much for this list. I find it very useful!

I have a few questions (which also apply to the equivalent lists in other languages):

  1. How often exactly does Olafbot check if some words of the list have been recorded?
  2. If I change the name of this list (to translate it into French, for example), do you need to change the bot's code?
  3. I believe you once said that the bot's code is in Java. Do you have a repository containing the code? Is your bot running on a personal server or on ToolForge?

Thank you very much, all the best! — WikiLucas (🖋️) 15:27, 8 May 2021 (UTC)

Hi, @WikiLucas00
Ad 1. Every three hours, around the clock. However, the check is based on a list of words sorted by the number of wiktionaries that contain a French language section for each word. That list is refreshed much less frequently, about once every two weeks, because that is how long it takes to process all 72 languages.
Ad 2. Yes. But you can simply create a redirect to this list from a page with the French name.
Ad 3. The bot is running from Polish Wikimedia's Tool Server. The code does a lot of mundane tasks in Polish Wiktionary; the lists for LiLi are just a side job based on the same data that has been used since 2011 to generate these lists: wikt:pl:Kategoria:Rankingi_brakujących_słów_według_wystąpień_w_innych_wikisłownikach. I'm sorry, I know it's against the Wikimedia custom, but I haven't published the source code. I started this bot 12 years ago, when my programming skills were much lower, and the code is very clumsy and old-fashioned. Now, as a senior developer, I can't afford to publish it because it could hurt my future job prospects if somebody found this code by looking up my name on the Internet. :-) In fact, I should refactor the whole thing, but it's 53000 lines of hopeless Java code, so I'm probably forced to maintain it forever. Perhaps I could extract and publish only the fragment used for the LiLi lists, but as it shares the source data with the lists for Polish Wiktionary, it doesn't make much sense. Olaf (talk) 15:59, 8 May 2021 (UTC)
@Olaf Thank you very much for your answer 🙂 I created a redirect to this list with a French name and it works fine. We might have to do the same (using a user-friendly name) for the 71 other languages in the future, in order to increase the visibility of your lists.
I totally understand your situation regarding your Java code. As part of my Residence as a Lingua Librist at 2IF (see here), we want to organize a small (French-speaking) hackathon in Lyon (France) this summer (to improve lists in French -- and eventually any language), and I might come back to you then to ask for advice 🙂
Thank you for the time you spend on Lingua Libre, it is highly valuable.
All the best — WikiLucas (🖋️) 15:06, 10 May 2021 (UTC)

Unwanted words

Hello @Olaf, is there a way to remove a word from this list without the bot adding it again on its next cycle? Here, the word "abandonar" is a mistake; it does not exist in French.

All the best — WikiLucas (🖋️) 17:44, 18 August 2021 (UTC)

@WikiLucas00 Not at the moment, but I'm working on it. Olaf (talk) 11:58, 26 August 2021 (UTC)
@WikiLucas00 Done. An exclusion list is now implemented for each language. For French, use this link: fra. In particular, "abandonar" no longer appears on the "Lemmas-without-audio" list. Olaf (talk) 09:32, 13 September 2021 (UTC)
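
For readers wondering how a per-language exclusion list can keep a word from reappearing on each bot cycle, here is a minimal Python sketch. It is not Olafbot's code (which is unpublished and written in Java); the function name `apply_exclusions` and the sample data are hypothetical and only illustrate the general idea of filtering the generated list against the exclusions on every run.

```python
def apply_exclusions(ranked_lemmas, excluded):
    """Return ranked_lemmas without the excluded words, preserving order.

    Hypothetical helper: the real bot's data structures are not public.
    """
    # Normalise the exclusion entries and ignore blank lines.
    excluded_set = {word.strip() for word in excluded if word.strip()}
    return [lemma for lemma in ranked_lemmas if lemma not in excluded_set]


# Assumed sample data: a freshly generated list and the French exclusion list.
ranked = ["abandonar", "bonjour", "maison"]
exclusions = ["abandonar", ""]

print(apply_exclusions(ranked, exclusions))
```

Because the filter runs each time the list is regenerated, an excluded word stays off the list even if the underlying ranking data still contains it.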