Help:SPARQL gathers a list of basic SPARQL queries in the context of Lingua Libre, demoed and ready to test, together with beginner-friendly explanations, inline comments, introductions to concepts, code snippets and a few tools. This page allows users unfamiliar with SPARQL to quickly learn the basics, query the Lingua Libre database, and download that data or feed it directly to an application. To fit the most frequent usages, the page leans lightly toward web development and beginner-level JavaScript skills.
Revision as of 15:48, 16 December 2021
Draft
December 2021 rewrite: work in progress, please do not translate yet.
Done: gather SPARQL queries related to core, speakers, languages, audios.
NOW/Opened: General review, volunteers wanted. You may help by: a) reading and copy-editing the page's English, b) testing queries on LLQS, then editing in or discussing improvements, c) harmonising inline comments.
NOW/Opened: De-Westernization, replacing Q21 (French) and Q42 (User:0x010C) with smaller non-Western languages and users.
Endpoint: Wikidata Query Service (WDQS) – run SPARQL queries against Wikidata. Run, test, and download the data as JSON, CSV or TSV. It has advanced user-friendly features such as hovering over a term to see its meaning, code optimization, etc.
Lingua Libre data can be fetched using various programming languages such as Python, JavaScript, R and others, returning JSON or other formats.
For a code snippet in your language: open query.wikidata.org (Wikidata Query Service, aka WDQS), run your SPARQL query, then click "Code": a pop-up window appears with various implementations.
To download the data, click "Download".
JavaScript:
At least 3 methods exist (code snippet).
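One of these methods can be sketched with the standard `fetch` API. This is a minimal sketch rather than the page's official snippet: the endpoint URL and the helper names `buildSparqlUrl`, `flattenBindings` and `runSparql` are assumptions made for illustration, while the `results.bindings` layout follows the standard SPARQL JSON results format.

```javascript
// Assumed endpoint of the Lingua Libre Query Service; adjust if it differs.
const ENDPOINT = 'https://lingualibre.org/bigdata/namespace/wdq/sparql';

// Build the GET url: the query travels url-encoded in the `query` parameter.
function buildSparqlUrl(endpoint, query) {
  return endpoint + '?query=' + encodeURIComponent(query);
}

// Turn SPARQL JSON results into flat objects,
// mapping each variable name to the `value` string of its binding.
function flattenBindings(json) {
  return json.results.bindings.map(row => {
    const flat = {};
    for (const key of Object.keys(row)) flat[key] = row[key].value;
    return flat;
  });
}

// Fetch and flatten. Needs network access, so it is only defined here, not called.
async function runSparql(query, endpoint = ENDPOINT) {
  const response = await fetch(buildSparqlUrl(endpoint, query), {
    headers: { Accept: 'application/sparql-results+json' }
  });
  if (!response.ok) throw new Error('SPARQL request failed: ' + response.status);
  return flattenBindings(await response.json());
}
```

A call such as `runSparql('SELECT * WHERE { ?lang prop:P13 ?code . }')` would then resolve to an array of `{ lang, code }` objects, assuming the service predefines the `prop:` prefix as the queries on this page do.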
Advanced SPARQL queries with COUNT() and other aggregates are often slow (>3 s, sometimes >100 s). You are encouraged to run several smaller SPARQL queries and then merge their results. For example, the complementary JavaScript snippet below helps web developers do so.
// Data from 3 SPARQL queries.
// Important: one key must be shared by all datasets, here: 'qid'
const langs = [
        { qid: 'Q209', label: 'Breton', iso: 'bre' },
        { qid: 'Q21', label: 'French', iso: 'fra' }
      ],
      speakersFemales = [
        { qid: 'Q209', genderF: 3, recordsF: 60 },
        { qid: 'Q21', genderF: 21, recordsF: 15046 }
      ],
      speakersMales = [
        { qid: 'Q209', genderM: 7, recordsM: 112 },
        { qid: 'Q21', genderM: 85, recordsM: 82964 }
      ];

// Toolbox for merging data by same id
var merge2ArraysBySameId = function (arr1, arr2, id1) {
  return arr1.map(item1 => {
    var identical = arr2.find(obj => obj[id1] === item1[id1]);
    // Copy into a new object: the source datasets stay untouched,
    // and an item missing from arr2 no longer throws.
    return Object.assign({}, identical, item1);
  });
};

// Mergings
var step1 = merge2ArraysBySameId(langs, speakersFemales, 'qid');
var step2 = merge2ArraysBySameId(step1, speakersMales, 'qid');
alert(JSON.stringify(step2));
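The "several smaller queries, then merge" approach can also be sketched end to end. The helpers `mergeByKey` and `runAndMerge` are hypothetical names, and `runQuery` stands for any function you supply that takes a SPARQL string and returns a promise of flat rows; none of these come from the Lingua Libre codebase.

```javascript
// Merge any number of datasets on a shared key, without mutating the inputs.
function mergeByKey(datasets, key) {
  const merged = new Map();
  for (const dataset of datasets) {
    for (const row of dataset) {
      // Start from what was already collected for this key, then add the new fields.
      merged.set(row[key], Object.assign({}, merged.get(row[key]), row));
    }
  }
  return Array.from(merged.values());
}

// Run all queries in parallel, then merge their rows on `key`.
// `runQuery` is a hypothetical function: (sparql string) => Promise<array of rows>.
async function runAndMerge(queries, runQuery, key) {
  const datasets = await Promise.all(queries.map(q => runQuery(q)));
  return mergeByKey(datasets, key);
}
```

Running the queries in parallel keeps the total waiting time close to that of the slowest single query, and the `Map` keeps rows whose key appears in only some of the datasets.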
Lingualibre's ground
✅ Is Language (language/dialect(Q4)) → List existing languages with: LL Qid, ISO 639-3, Name
SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q5 . # Condition 1, P2 'instance of' is Q5 'language level'.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
✅ Is Sex or Gender (sex or gender(Q7)) → List existing sexes or genders
SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q7 . # Condition 1, P2 'instance of' is Q7 'sex or gender'.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
Speaker
✅ Speaker name(s) → Speaker Qid(s)
SELECT ?speakerName ?speakerId
WHERE {
  VALUES ?speakerName { "Yug" "VIGNERON" } # One or multiple values
  BIND(STRLANG(?speakerName, "en") AS ?speakerLabel)
  # P2: instance of; Q3: speaker.
  ?speakerId prop:P2 entity:Q3 ;
             rdfs:label ?speakerLabel .
}

# Get Q42 (User:0x010C)'s data
SELECT ?anyProperty ?anyValue ?anyValueLabel
WHERE {
  entity:Q42 ?anyProperty ?anyValue . # Filter: of Q42 '0x010C', get any property and any values
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
✅🇶 Speaker Qid (0x010C(Q42)) → Speaker languages (P4)
SELECT ?audio ?audioLabel
WHERE {
  ?audio prop:P5 entity:Q42 . # Filter: P5 'speaker' is Q42 'User:0x010C'
  ?audio prop:P4 entity:Q21 . # Filter: P4 'language' is Q21 'French'
  # Labels
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}

SELECT ?language (COUNT(?audio) AS ?audios)
WHERE {
  VALUES ?language { entity:Q21 }
  ?audio prop:P2 entity:Q2 .  # P2 'instance of' is Q2 'record'
  ?audio prop:P4 ?language .  # P4 'language' is Q21 'French'
}
GROUP BY ?language
?✅ Language LL Qid (Q21) → Count unique words
SELECT ?language
  (COUNT(?audio) AS ?audios) # Count and assign value to ?audios
  (COUNT(DISTINCT(?itemLabel)) AS ?words)
  (ROUND(10000*?words/?audios)/100 AS ?percent)
WHERE {
  VALUES ?language { entity:Q21 }
  ?audio prop:P2 entity:Q2 .      # Filter: P2 'instance of' is Q2 'record'
  ?audio prop:P4 ?language .      # Filter: P4 'language' is Q21 'French'
  ?audio rdfs:label ?itemLabel .  # Assign value: label to ?itemLabel
}
GROUP BY ?language
✅ Language LL Qid (Q21) → Count speakers
SELECT ?language (COUNT(?speaker) AS ?speakers)
WHERE {
  VALUES ?language { entity:Q21 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q21 'French'
}
GROUP BY ?language
✅ Language LL Qid (Q209) → List speakers
SELECT ?language ?speaker ?speakerLabel
WHERE {
  VALUES ?language { entity:Q209 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q209 'Breton'
  # Labels
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
✅ Language LL Qid (Breton(Q209)) → Language data, all
Case: get all data for language Q209 'Breton'.
SELECT *
WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
}
✅ Language LL Qid (Breton(Q209)) → Language data, core
SELECT *
WHERE {
  # Given Q209 'Breton language', get all properties and values...
  entity:Q209 ?predicate ?object .
  # ...then keep only the predicates typed as datatype properties
  ?predicate rdf:type owl:DatatypeProperty .
}
✅ Language LL Qid (Breton(Q209)) → Property P13 (ISO 639-3)
SELECT *
WHERE {
  entity:Q209 prop:P13 ?iso . # Assign value: Q209 'Breton', P13 'ISO 639-3', value into ?iso
}
✅ Languages → List existing languages' iso-639-3
SELECT *
WHERE {
  ?lang prop:P13 ?code .
}
✅ Language WD Qid → Language data, core
SELECT *
WHERE {
  ?lang prop:P12 "Q12107" . # P12 'Wikidata id' is Wikidata's "Q12107"
  ?lang ?predicate ?object .
  ?predicate rdf:type owl:DatatypeProperty .
}
Case: search in Breton, with speaker 'ThonyVezbe', for the word 'ni'.
SELECT ?audio
WHERE {
  ?audio prop:P4 entity:Q209 .     # P4 'language' is Q209 'Breton'
  ?audio prop:P5 entity:Q584098 .  # P5 'speaker' is Q584098 'ThonyVezbe'
  ?audio rdfs:label ?word .        # word
  FILTER(STR(?word) = "ni")        # word = 'ni'
}
SELECT ?word ?audio ?urlPointer
  (replace(replace(replace(substr(STR(?urlPointer), 52), "%20", "_"), "%28", "("), "%29", ")") AS ?filename)
WHERE {
  ?audio prop:P4 entity:Q21 .      # Filter: P4 'language' is Q21 'French'
  ?audio prop:P5 entity:Q137047 .  # Filter: P5 'speaker' is Q137047 'Justforoc'
  ?audio rdfs:label ?word .        # Assign value: label to ?word
  # Filter: ?word with 'pomme' in French, non case-sensitive
  FILTER REGEX(?word, "pomme"@fr, "i") .
  ?audio prop:P3 ?urlPointer
}
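The nested `replace(... substr(STR(?urlPointer), 52) ...)` expression in the query above can be easier to follow when mirrored in JavaScript. This is a sketch under one assumption: that P3 values start with the 51-character prefix `http://commons.wikimedia.org/wiki/Special:FilePath/`, which is what the offset 52 suggests. The function name `filenameFromUrlPointer` is hypothetical.

```javascript
// Assumed prefix of P3 values. It is 51 characters long,
// hence the 1-based start position 52 in the SPARQL substr() call.
const FILEPATH_PREFIX = 'http://commons.wikimedia.org/wiki/Special:FilePath/';

function filenameFromUrlPointer(urlPointer) {
  if (!urlPointer.startsWith(FILEPATH_PREFIX)) {
    throw new Error('Unexpected url prefix: ' + urlPointer);
  }
  return urlPointer
    .slice(FILEPATH_PREFIX.length) // same as substr(STR(?urlPointer), 52)
    .replace(/%20/g, '_')          // encoded space becomes underscore
    .replace(/%28/g, '(')          // encoded opening parenthesis
    .replace(/%29/g, ')');         // encoded closing parenthesis
}
```

Doing this step client-side instead of in SPARQL keeps the query shorter; the trade-off is that each application consuming the data has to repeat the conversion.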
Heavy queries
The queries below are too large to run on Lingua Libre's wiki pages, or even on the Lingua Libre Query Service.
To do: split them into smaller sub-queries, each with a single COUNT() function.
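As one possible way to approach this to-do, the sketch below generates one small query per gender, each with a single COUNT(). It relies only on identifiers used on this page (P2 'instance of', Q3 'speaker', P4 'language', P8 'sex or gender', Q16 'male', Q17 'female'); the function name `buildSpeakerCountQuery` is hypothetical, the `prop:` and `entity:` prefixes are assumed to be predefined by the query service, and the generated queries have not been verified against the live service.

```javascript
// Build one small query: number of speakers per language, for a single gender.
function buildSpeakerCountQuery(genderQid) {
  // Accept only plain Qids, so nothing else can be injected into the query.
  if (!/^Q[1-9][0-9]*$/.test(genderQid)) {
    throw new Error('Invalid Qid: ' + genderQid);
  }
  return [
    'SELECT ?language (COUNT(DISTINCT ?speaker) AS ?speakers)',
    'WHERE {',
    '  ?speaker prop:P2 entity:Q3 .',                  // P2 'instance of' is Q3 'speaker'
    '  ?speaker prop:P8 entity:' + genderQid + ' .',   // P8 'sex or gender'
    '  ?speaker prop:P4 ?language .',                  // P4 'language'
    '}',
    'GROUP BY ?language'
  ].join('\n');
}

const femaleQuery = buildSpeakerCountQuery('Q17'); // Q17 'female'
const maleQuery = buildSpeakerCountQuery('Q16');   // Q16 'male'
```

Each resulting dataset shares the `?language` key, so the rows can afterwards be merged client-side on that key.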
❌ Languages → Name, Wikidata Qid, LLQid, Iso-639-3, and genders
Query
Result
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode
  (COUNT(DISTINCT(?record)) AS ?recordCount)
  (COUNT(DISTINCT(?speakerLangM)) AS ?speakerM)
  (COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
  ?record prop:P2 entity:Q2 .          # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?languageQid .       # Assign value: P4 'language' into variable ?languageQid
  ?languageQid prop:P12 ?wdQid .       # Assign value: P12 'wikidata id' into variable ?wdQid
  ?languageQid prop:P13 ?isoCode .     # Assign value: P13 'iso639-3' into ?isoCode
  #?record prop:P5 ?speakerQidM .        # Assign value: P5 'speaker' into variable ?speakerQidM
  #?speakerQidM prop:P8 entity:Q16 .     # Filter: P8 'sex or gender' is Q16 'male'
  #?speakerQidM prop:P4 ?speakerLangM .  # Assign value: P4 'language' into variable ?speakerLangM
  ?record prop:P5 ?speakerQidF .       # Assign value: P5 'speaker' into variable ?speakerQidF
  ?speakerQidF prop:P8 entity:Q17 .    # Filter: P8 'sex or gender' is Q17 'female'
  ?speakerQidF prop:P4 ?speakerLangF . # Assign value: P4 'language' into variable ?speakerLangF
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
languageQidLabel wdQid languageQid isoCode recordCount speakerM speakerF
French Q150 Q21 fra 16761 0 18
Marathi Q1571 Q34 mar 13153 0 5
Polish Q809 Q298 pol 11686 0 1
…
❌ Is Language (speaker(Q3)) → list all languages with number of unique words and speakers