User:Titodutta/Bengali Lexeme workflow

This page documents the Bengali lexeme workflow used on LinguaLibre. It covers:

  • the teamwork and the process
  • pros and cons
  • upcoming plans

As of 1 November 2020, we have uploaded more than 35,000 words to LinguaLibre. Around 16,000 of these uploads support Wikidata lexicographical data in the Bengali language. One word often has several forms, such as:

  • go → went, gone, going, etc. (verb)
  • good → better, best, etc. (adjective)

Languages differ in both the types and the number of forms. An English verb typically has 5 forms; a Bengali verb may have around 98.

A few Bengali community members are now working on Wikidata to improve lexicographical data. So if you see a batch of words from the same root being uploaded, it is to stay in sync with the Bengali project on Wikidata.
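To illustrate how the forms of one lexeme could become a word list for recording, here is a minimal sketch in Python. It assumes the lexeme is already available as a dictionary shaped like Wikidata's lexeme JSON (a "forms" list whose items carry "representations" keyed by language code). The function name `bengali_forms`, the sample data, the placeholder IDs, and the one-word-per-line output are all illustrative assumptions, not part of the team's actual tooling.

```python
def bengali_forms(lexeme: dict, lang: str = "bn") -> list[str]:
    """Return the unique form spellings of one lexeme, in first-seen order."""
    seen = set()
    words = []
    for form in lexeme.get("forms", []):
        rep = form.get("representations", {}).get(lang)
        if rep is None:
            # This form has no spelling in the requested language.
            continue
        word = rep["value"].strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


# Hypothetical sample: the IDs are placeholders, not real Wikidata lexemes.
sample_lexeme = {
    "id": "L0",
    "forms": [
        {"id": "L0-F1", "representations": {"bn": {"language": "bn", "value": "যাই"}}},
        {"id": "L0-F2", "representations": {"bn": {"language": "bn", "value": "যাও"}}},
        # A repeated spelling is listed only once.
        {"id": "L0-F3", "representations": {"bn": {"language": "bn", "value": "যাই"}}},
        # A form without a Bengali representation is skipped.
        {"id": "L0-F4", "representations": {"en": {"language": "en", "value": "go"}}},
    ],
}

# One word per line (an assumed layout for a locally prepared list).
word_list = "\n".join(bengali_forms(sample_lexeme))
print(word_list)
```

Removing duplicates matters here because a verb with around 98 forms can share the same spelling across several of them, and each spelling only needs to be recorded once.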

Teamwork: Procedure

The team coordinates through a chat group on Facebook, through Telegram, and sometimes through phone calls.

This teamwork may not be visible when you look only at the uploaded words, but it is worth noting the work happening on the other side.

Calendar

  • September–October: Uploads were a bit slow, as the group was focusing on other areas of work. I was mostly uploading non-lexeme pronunciations.
  • November 2020: You'll see an increase in lexeme-related file uploads, as the group is actively creating Bengali lexemes this month and needs the matching Bengali pronunciations.
  • February 2021: Work will be slower until 24 February because of a focus on other work.
  • Last week of February to end of March 2021: The pace should increase because of the planned dedicated activity around Bengali lexemes.

Future plans

  • As of 1 November 2020, around 40% of my total uploads were for Bengali lexemes. Update: as of 1 February 2021, ~62% of the uploads were used in lexeme projects; the rest were mostly Wikipedia article titles or words from a dictionary. I use all the options on LinguaLibre to generate words, including the "nearby" option. I have also tried the PagePile and PetScan tools to generate word lists (used as local lists). In the future, I may write separate stories about that work. However, I am mostly interested in working on the lexeme project on LinguaLibre.
  • Once (when?) I have done some satisfactory work with Bengali on LinguaLibre, if God allows, I wish to move on to "Indian English", or English (In), and record another series of words for it. As of 1 November 2020, English on Lingua Libre is just "English"; I have not seen any other variant or dialect.

See also