Help:SPARQL gathers basic SPARQL queries for Lingua Libre, demoed and ready to test, together with beginner-friendly explanations, inline comments, introductions to concepts, code snippets and a few tools. This page lets users unfamiliar with SPARQL quickly learn the basics, query the LinguaLibre database, and download that data or feed it directly to an application. To match the most frequent usage, the examples assume a web developer with basic JavaScript skills.
[[File:Wikidata_Query_-_Query_Helper_-_Build_query_from_scratch.webm|thumb|450px|On Wikidata, the WDQS lets you practice building SPARQL queries in an intuitive way.]]
* [{{SERVER}}/bigdata/#query <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [{{SERVER}}/bigdata/#query LinguaLibre Query Service (LLQS)] – run SPARQL queries on LinguaLibre. Run, test, and download the data as JSON, CSV or TSV.
* [https://query.wikidata.org <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [https://query.wikidata.org Wikidata Query Service (WDQS)] – run SPARQL queries on Wikidata. Run, test, and download the data as JSON, CSV or TSV. Has advanced user-friendly features such as hovering over a term to see its meaning, code optimization, etc.
* [https://sinaahmadi.github.io/posts/sparql-query-generator-for-lexicographical-data.html Wikidata Lexeme Queries generators] ([https://jsfiddle.net/hugolpz/rygo9s5b/ hack me]) by @sina_ahm – helps create queries for Wikidata's Lexemes.
* [[Special:ApiSandbox]] – API query generator for Lingualibre wiki page and Wikibase content.
Revision as of 19:55, 10 December 2021
Draft
2021/12/10: Work in progress. Please do not translate yet, as sections are still changing. You may help by: reading and fixing the page, testing the queries here, replacing Q21 (French) and Q42 (User:0x010C) with smaller non-Western languages and users, harmonising inline comments, and adding the right category to this page. --Yug
LinguaLibre data can be fetched with various programming languages such as Python, JavaScript, R and others, returning JSON or other formats.
For a code snippet in your language: open query.wikidata.org (Wikidata Query Service, aka WDQS), run your SPARQL query, then click "Code". A pop-up window appears with various implementations.
To download data, click "Download".
JavaScript:
At least 3 methods exist (code snippet).
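One possible method, sketched here with the standard `fetch` API, sends the query as a GET request and reads the JSON answer. The endpoint URL, the `format=json` parameter and the helper names `buildSparqlUrl` and `runSparql` are assumptions made for illustration, not taken from this page; check the endpoint against the LLQS before use. If the endpoint does not predeclare the `prop:` and `entity:` prefixes, matching `PREFIX` lines would have to be added to the query.

```javascript
// Assumption: LinguaLibre's SPARQL endpoint URL; verify it before relying on it.
const LLQS_ENDPOINT = 'https://lingualibre.org/bigdata/namespace/wdq/sparql';

// Build a GET URL carrying the SPARQL query and asking for JSON results.
function buildSparqlUrl(endpoint, sparql) {
  return endpoint + '?format=json&query=' + encodeURIComponent(sparql);
}

// Send the query and return the rows found under results.bindings.
async function runSparql(endpoint, sparql) {
  const response = await fetch(buildSparqlUrl(endpoint, sparql), {
    headers: { Accept: 'application/sparql-results+json' }
  });
  if (!response.ok) {
    throw new Error('SPARQL request failed with HTTP status ' + response.status);
  }
  const json = await response.json();
  return json.results.bindings;
}

// Query taken from this page: list the ISO 639-3 codes of all languages.
const isoQuery = 'SELECT * WHERE { ?lang prop:P13 ?code . }';
const isoUrl = buildSparqlUrl(LLQS_ENDPOINT, isoQuery);
console.log(isoUrl);
// Uncomment to actually send the request:
// runSparql(LLQS_ENDPOINT, isoQuery).then(rows => console.log(rows.length));
```

Keeping the URL builder separate from the network call makes it easy to paste the generated URL into a browser when a query misbehaves.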
Advanced SPARQL queries with COUNT() and similar functions are often slow (>3 s, sometimes >100 s). You are encouraged to run several smaller SPARQL queries and then merge their results. For example, the complementary JavaScript snippet below helps web developers do so.
// Data from 3 SPARQL queries.
// Important: one key must be shared by all datasets, here: 'qid'
const langs = [
    { qid: 'Q209', label: 'Breton', iso: 'bre' },
    { qid: 'Q21', label: 'French', iso: 'fra' }
  ],
  speakersFemales = [
    { qid: 'Q209', genderF: 3, recordsF: 60 },
    { qid: 'Q21', genderF: 21, recordsF: 15046 }
  ],
  speakersMales = [
    { qid: 'Q209', genderM: 7, recordsM: 112 },
    { qid: 'Q21', genderM: 85, recordsM: 82964 }
  ];

// Toolbox for merging data by same id
var merge2ArraysBySameId = function (arr1, arr2, id1) {
  return arr1.map(item1 => {
    var identical = arr2.find(obj => obj[id1] === item1[id1]);
    // Merge into a new object: inputs stay untouched and a missing match does not throw
    return Object.assign({}, identical, item1);
  });
};

// Mergings
var step1 = merge2ArraysBySameId(langs, speakersFemales, 'qid');
var step2 = merge2ArraysBySameId(step1, speakersMales, 'qid');
alert(JSON.stringify(step2));
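The merge helper above expects flat objects such as `{qid: 'Q209', iso: 'bre'}`, whereas a SPARQL endpoint answering in JSON wraps every value in a `{type, value}` cell under `results.bindings`. The sketch below shows one way to flatten such a response. The sample data, the entity URI pattern and the helper names `toQid` and `flattenBindings` are illustrative assumptions.

```javascript
// Sample shaped like the standard SPARQL JSON results format; values are illustrative.
const sampleResponse = {
  head: { vars: ['lang', 'code'] },
  results: {
    bindings: [
      { lang: { type: 'uri', value: 'https://lingualibre.org/entity/Q209' },
        code: { type: 'literal', value: 'bre' } },
      { lang: { type: 'uri', value: 'https://lingualibre.org/entity/Q21' },
        code: { type: 'literal', value: 'fra' } }
    ]
  }
};

// Keep only the trailing Qid of an entity URI; leave other values untouched.
function toQid(value) {
  const match = /\/(Q\d+)$/.exec(value);
  return match ? match[1] : value;
}

// Turn each binding into a flat object: { variableName: plainValue, ... }.
function flattenBindings(response) {
  return response.results.bindings.map(binding => {
    const row = {};
    for (const [name, cell] of Object.entries(binding)) {
      row[name] = cell.type === 'uri' ? toQid(cell.value) : cell.value;
    }
    return row;
  });
}

const flatRows = flattenBindings(sampleResponse);
console.log(JSON.stringify(flatRows));
```

Rows flattened this way share a plain key (here `lang`) and can be merged by that key.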
Lingualibre's ground
✅ Is Language (language/dialect(Q4)) → List existing languages with: LL Qid, ISO 639-3, Name
SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q5 # Condition 1, P2 'instance of' is Q5 'language level'.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
✅ Is Sex or Gender (sex or gender(Q7)) → List existing sexes or genders
SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q7 # Condition 1, P2 'instance of' is Q7 'sex or gender'.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
Speaker
✅ Speaker name(s) → Speaker Qid(s)
SELECT ?speakerName ?speakerId
WHERE {
  VALUES ?speakerName { "Yug" "VIGNERON" } # One or multiple values
  BIND(STRLANG(?speakerName, "en") AS ?speakerLabel)
  # P2: instance of; Q3: speaker.
  ?speakerId prop:P2 entity:Q3 ;
             rdfs:label ?speakerLabel .
}
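When the speaker names come from an application rather than being typed by hand, the `VALUES` clause of the query above can be generated. This is a minimal sketch; `toSparqlLiteral` and `buildSpeakerQuery` are hypothetical helper names, and the escaping only covers backslashes and double quotes.

```javascript
// Escape backslashes and double quotes so a name is a valid SPARQL string literal.
function toSparqlLiteral(text) {
  return '"' + text.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Rebuild the "speaker name(s) to speaker Qid(s)" query for any list of names.
function buildSpeakerQuery(names) {
  const values = names.map(toSparqlLiteral).join(' ');
  return [
    'SELECT ?speakerName ?speakerId',
    'WHERE {',
    '  VALUES ?speakerName { ' + values + ' }',
    '  BIND(STRLANG(?speakerName, "en") AS ?speakerLabel)',
    '  ?speakerId prop:P2 entity:Q3 ;',
    '             rdfs:label ?speakerLabel .',
    '}'
  ].join('\n');
}

const speakerQuery = buildSpeakerQuery(['Yug', 'VIGNERON']);
console.log(speakerQuery);
```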
# Get Q42 (User:0x010C)'s data
SELECT ?predicate ?object ?objectLabel
WHERE {
  entity:Q42 ?predicate ?object .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
✅🇶 Speaker Qid (0x010C(Q42)) → Speaker languages (P4)
SELECT ?audio ?audioLabel
WHERE {
  ?audio prop:P5 entity:Q42 . # Filter: P5 Speaker is Q42 User:0x010C
  ?audio prop:P4 entity:Q21 . # Filter: P4 language is Q21 French
  # Labels
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
SELECT ?language (COUNT(?audio) AS ?audios)
WHERE {
  VALUES ?language { entity:Q21 }
  ?audio prop:P2 entity:Q2 . # P2 'instance of' is Q2 'record'
  ?audio prop:P4 ?language . # P4 'language' is Q21 'French'
}
GROUP BY ?language
?✅ Language LL Qid (Q21) → Count unique words
SELECT ?language
  (COUNT(?audio) AS ?audios) # Count and assign value to ?audios
  (COUNT(DISTINCT(?itemLabel)) AS ?words)
  (ROUND(10000*?words/?audios)/100 AS ?percent)
WHERE {
  VALUES ?language { entity:Q21 }
  ?audio prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
  ?audio prop:P4 ?language . # Filter: P4 'language' is Q21 'French'
  ?audio rdfs:label ?itemLabel . # Assign value: label to ?itemLabel
}
GROUP BY ?language
✅ Language LL Qid (Q21) → Count speakers
SELECT ?language (COUNT(?speaker) AS ?speakers)
WHERE {
  VALUES ?language { entity:Q21 }
  ?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language . # P4 'language' is Q21 'French'
}
GROUP BY ?language
✅ Language LL Qid (Q209) → List speakers
SELECT ?language ?speaker ?speakerLabel
WHERE {
  VALUES ?language { entity:Q209 }
  ?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language . # P4 'language' is Q209 'Breton'
  # Labels
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
✅ Language LL Qid (Breton(Q209)) → Language data, all
Case: for language Q209 'Breton', get all its data.
SELECT *
WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
}
✅ Language LL Qid (Breton(Q209)) → Language data, core
SELECT *
WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
  ?predicate rdf:type owl:DatatypeProperty .
}
✅ Language LL Qid (Breton(Q209)) → Property P13 (ISO 639-3)
SELECT *
WHERE {
  entity:Q209 prop:P13 ?iso . # Assign value: Q209 'Breton', P13 'ISO 639-3', value into ?iso
}
✅ Languages → List existing languages' iso-639-3
SELECT *
WHERE {
  ?lang prop:P13 ?code .
}
✅ Language WD Qid → Language data, core
SELECT *
WHERE {
  ?lang prop:P12 "Q12107" . # P12 'Wikidata id' is Wikidata's "Q12107"
  ?lang ?predicate ?object .
  # ?predicate rdf:type owl:DatatypeProperty .
}
Case: search the Breton language for recordings of the word 'ni' by speaker 'ThonyVezbe'.
SELECT ?audio
WHERE {
  ?audio prop:P4 entity:Q209 . # P4 'language' is Q209 'Breton'
  ?audio prop:P5 entity:Q584098 . # P5 'speaker' is Q584098 'ThonyVezbe'
  ?audio rdfs:label ?word . # word
  FILTER(STR(?word) = "ni") # word = 'ni'
}
SELECT ?word ?audio ?urlPointer
  (replace(replace(replace(substr(STR(?urlPointer), 52), "%20", "_"), "%28", "("), "%29", ")") AS ?filename)
WHERE {
  ?audio prop:P4 entity:Q21 . # Filter: P4 'language' is Q21 'French'
  ?audio prop:P5 entity:Q137047 . # Filter: P5 'speaker' is Q137047 'Justforoc'
  ?audio rdfs:label ?word . # Assign value: label to ?word
  # Filter: ?word with 'pomme' in French, non case-sensitive
  FILTER REGEX(?word, "pomme"@fr, "i") .
  ?audio prop:P3 ?urlPointer
}
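The nested `replace(substr(...))` calls in the query above skip the first 51 characters of `?urlPointer` and decode three escape sequences. The same clean-up can be done on the client side, which keeps the query shorter. The sketch below assumes the pointer starts with the 51-character prefix shown in the code, which is what the `52` offset suggests; `filenameFromUrlPointer` is a hypothetical helper name and the sample file name is made up.

```javascript
// Prefix skipped by substr(STR(?urlPointer), 52) in the query: 51 characters long.
// Assumption: P3 values start with exactly this string.
const FILEPATH_PREFIX = 'http://commons.wikimedia.org/wiki/Special:FilePath/';

// Strip the prefix, decode percent-escapes, and use underscores instead of spaces.
function filenameFromUrlPointer(urlPointer) {
  if (!urlPointer.startsWith(FILEPATH_PREFIX)) {
    throw new Error('Unexpected URL prefix: ' + urlPointer);
  }
  const encodedName = urlPointer.slice(FILEPATH_PREFIX.length);
  return decodeURIComponent(encodedName).replace(/ /g, '_');
}

// Hypothetical pointer, shaped like the values returned for P3.
const samplePointer = FILEPATH_PREFIX + 'LL-Q150%20%28fra%29-Justforoc-pomme.wav';
const sampleFilename = filenameFromUrlPointer(samplePointer);
console.log(sampleFilename);
```

Unlike the three hard-coded replacements, `decodeURIComponent` also handles other escaped characters, such as accented letters.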
Heavy queries
The queries below are too large to run on LinguaLibre's wiki pages, or even on the Lingualibre Query Service.
To do: split them into smaller sub-queries, each with a single COUNT() function.
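As a rough illustration of that to-do, the sketch below generates one small query per gender, each with a single COUNT(), whose results could then be merged by `languageQid` on the client side. It is untested against the endpoint and simplifies the heavy query: speakers are counted through their own P4 'language' statement rather than through their records. Q16 'male' and Q17 'female' are taken from the heavy query's comments; `GENDERS` and `buildSpeakerCountQuery` are hypothetical names.

```javascript
// Assumption: Q16 is 'male' and Q17 is 'female', as in the heavy query's comments.
const GENDERS = { speakerM: 'Q16', speakerF: 'Q17' };

// Build one small query: speakers of a given gender, counted per language.
function buildSpeakerCountQuery(genderQid, alias) {
  return [
    'SELECT ?languageQid (COUNT(DISTINCT ?speaker) AS ?' + alias + ')',
    'WHERE {',
    '  ?speaker prop:P2 entity:Q3 . # P2 instance of: Q3 speaker',
    '  ?speaker prop:P8 entity:' + genderQid + ' . # P8 sex or gender',
    '  ?speaker prop:P4 ?languageQid . # P4 language',
    '}',
    'GROUP BY ?languageQid'
  ].join('\n');
}

// One query string per gender, in the order of the GENDERS keys.
const subQueries = Object.entries(GENDERS).map(
  ([alias, genderQid]) => buildSpeakerCountQuery(genderQid, alias)
);
console.log(subQueries.join('\n\n'));
```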
❌ Languages → Name, Wikidata Qid, LLQid, Iso-639-3, and genders
Query
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode
  (COUNT(DISTINCT(?record)) AS ?recordCount)
  (COUNT(DISTINCT(?speakerLangM)) AS ?speakerM)
  (COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
  ?record prop:P2 entity:Q2 . # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?languageQid . # Assign value: P4 'language' into variable ?languageQid
  ?languageQid prop:P12 ?wdQid . # Assign value: P12 'wikidata id' into variable ?wdQid
  ?languageQid prop:P13 ?isoCode . # Assign value: P13 'iso639-3' into ?isoCode
  #?record prop:P5 ?speakerQidM . # Assign value: P5 'speaker' into variable ?speakerQidM
  #?speakerQidM prop:P8 entity:Q16 . # Filter: P8 'sex or gender' is Q16 'male'
  #?speakerQidM prop:P4 ?speakerLangM . # Assign value: P4 'language' into variable ?speakerLangM
  ?record prop:P5 ?speakerQidF . # Assign value: P5 'speaker' into variable ?speakerQidF
  ?speakerQidF prop:P8 entity:Q17 . # Filter: P8 'sex or gender' is Q17 'female'
  ?speakerQidF prop:P4 ?speakerLangF . # Assign value: P4 'language' into variable ?speakerLangF
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
Result
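{| class="wikitable"
! languageQidLabel !! wdQid !! languageQid !! isoCode !! recordCount !! speakerM !! speakerF
|-
| French || Q150 || Q21 || fra || 16761 || 0 || 18
|-
| Marathi || Q1571 || Q34 || mar || 13153 || 0 || 5
|-
| Polish || Q809 || Q298 || pol || 11686 || 0 || 1
|-
| colspan="7" | …
|}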
❌ Is Language (speaker(Q3)) → list all languages with number of unique words and speakers