LinguaLibre

Difference between revisions of "Wikidata"

Wikidata.org is a Wikimedia project that stores structured data for Wikimedia projects and the world. The data are released under the CC0 license and are freely accessible through several tools: queries, dumps and APIs. Wikidata and LinguaLibre are both based on the same Wikibase software, a technology for storing and collaboratively editing data. Wikidata has a lexeme side that could collaborate more closely with LinguaLibre. This likely requires some knowledge of SPARQL queries, of Wikidata properties and items, and of LinguaLibre properties and items, followed by bot handling and the associated development skills to scale things up. Introductions to these aspects are given below.

 
(3 intermediate revisions by the same user not shown)

Latest revision as of 20:48, 28 December 2023


Wikidata items

See also Help:SPARQL 2#Notable elements.
Draft

Please briefly explain what Wikidata items are, and list some that could be of interest to LinguaLibre.
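
Until this section is written, the following minimal sketch may help readers look up an item themselves. It assumes the standard Wikibase `wbgetentities` API action on wikidata.org and uses Q150 (the Wikidata item for the French language) as an example; the function name `item_url` is hypothetical and only builds the request URL, without sending it.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint (Wikibase action API).
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def item_url(item_id: str, languages=("en",)) -> str:
    """Build a wbgetentities URL for one Wikidata item (hypothetical helper).

    Items are identified by "Q" followed by digits, for example Q150.
    """
    if not (item_id.startswith("Q") and item_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {item_id!r}")
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "labels|descriptions",
        "languages": "|".join(languages),
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


if __name__ == "__main__":
    # Prints a URL that can be opened in a browser or fetched by a script.
    print(item_url("Q150", languages=("en", "fr")))
```

Opening the printed URL returns the labels and descriptions of the item as JSON.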

Wikidata's properties

See also Help:SPARQL (intermediate)#Notable elements.

The Wikidata properties most relevant to LinguaLibre are:

  • list to complete

LinguaLibre's properties

See also Special:ListProperties.

LinguaLibre uses Wikibase, the same database software as Wikidata, to crowdsource the creation of a large database of multilingual audio recordings. Each audio recording is associated with a few properties, which mostly describe:

  1. the speaker
  2. the language used
  3. some system information: the word, the URL on Wikimedia Commons, etc.
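
As a rough illustration of the list above, a recording can be pictured as a small record with one field per kind of property. This is only a sketch: the class name `Recording`, its field names and the sample values are hypothetical and do not reflect LinguaLibre's actual property identifiers, which are listed at Special:ListProperties.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """Hypothetical in-memory view of one LinguaLibre audio recording."""
    speaker: str      # who recorded the audio
    language: str     # language used in the recording
    word: str         # system information: the recorded word
    commons_url: str  # system information: file URL on Wikimedia Commons


def group_by_language(recordings):
    """Return a dict mapping each language to the words recorded in it."""
    words = defaultdict(list)
    for rec in recordings:
        words[rec.language].append(rec.word)
    return dict(words)


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    sample = [
        Recording("ExampleSpeaker", "fra", "chat",
                  "https://commons.wikimedia.org/wiki/File:Example-chat.wav"),
        Recording("ExampleSpeaker", "fra", "chien",
                  "https://commons.wikimedia.org/wiki/File:Example-chien.wav"),
    ]
    print(group_by_language(sample))
```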

Wikidata Lexeme

Draft

This page is a work in progress.

The Lexeme namespace is where lexicographical data are stored in Wikidata. Lexemes are lexical units (words or expressions) that contain senses and forms.

These forms can store recordings such as those from LinguaLibre. As of February 2021, 44,363 forms use a LinguaLibre file ([1]).
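
A count like the one above can be approximated with a SPARQL query against the Wikidata Query Service. The sketch below is not necessarily the query behind ([1]); it assumes that recordings are attached to forms through the Wikidata property P443 (pronunciation audio) and that LinguaLibre files can be recognised by the "LL-Q" prefix in their Commons file names. The function names and the User-Agent string are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_form_count_query(filename_marker: str = "LL-Q") -> str:
    """Build a SPARQL query counting lexeme forms with a matching audio file.

    Assumption: LinguaLibre uploads contain `filename_marker` in their name.
    """
    return f"""
SELECT (COUNT(DISTINCT ?form) AS ?count) WHERE {{
  ?lexeme ontolex:lexicalForm ?form .
  ?form wdt:P443 ?audio .
  FILTER(CONTAINS(STR(?audio), "{filename_marker}"))
}}
"""


def parse_count(result: dict) -> int:
    """Extract the integer count from a SPARQL JSON result."""
    return int(result["results"]["bindings"][0]["count"]["value"])


def fetch_form_count() -> int:
    """Run the query against the live endpoint (needs network access)."""
    params = urlencode({"query": build_form_count_query(), "format": "json"})
    request = Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": "LinguaLibreDocExample/0.1 (hypothetical)"},
    )
    with urlopen(request, timeout=60) as response:
        return parse_count(json.load(response))


if __name__ == "__main__":
    print(fetch_form_count())
```

The query text returned by `build_form_count_query()` can also be pasted directly into the Wikidata Query Service interface. The live result will differ from the February 2021 figure as new recordings are added.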

See also

Lingua Libre technical helps
Template: {{Speakers category}} • {{Recommended lists}} • {{To iso 639-2}} • {{To iso 639-3}} • {{Userbox-records}} • {{Bot steps}}
Audio files: How to create a frequency list? • Convert files formats • Denoise files with SoX • Rename and mass rename
Bots: Help:Bots • LinguaLibre:Bot • Help:Log in to Lingua Libre with Pywikibot • Lingua Libre Bot (gh) • Olafbot • PamputtBot • Dragons Bot (gh)
MediaWiki: MediaWiki: Help:Documentation opérationelle Mediawiki • Help:Database structure • Help:CSS • Help:Rename • Help:OAuth • LinguaLibre:User rights (rate limit) • Module:Lingua Libre record & {{Lingua Libre record}} • JS scripts: MediaWiki:Common.js • LastAudios.js • SoundLibrary.js • ItemsSugar.js • LexemeQueriesGenerator.js (pad) • Sparql2data.js (pad) • LanguagesGallery.js (pad) • Gadgets: Gadget-LinguaImporter.js • Gadget-Demo.js • Gadget-RecentNonAudio.js • LiLiZip.js
Queries: Help:APIs • Help:SPARQL • SPARQL (intermediate) (stub) • SPARQL for lexemes (stub) • SPARQL for maintenance • Lingualibre:Wikidata (stub) • Help:SPARQL (HAL)
Reuses: Help:Download datasets • Help:Embed audio in HTML
Unstable & tests: Help:SPARQL/test
Categories: Category:Technical reports