Help:SPARQL gathers ready-to-use SPARQL queries for Lingua Libre, along with beginner-friendly inline comments, introductions to concepts, code snippets and a few tools. To dive in, first check the content's structure in the outline. This page lets users who are not familiar with SPARQL query the Lingua Libre knowledge graph, then download the data or feed it directly into an application. To fit the most frequent uses, the page focuses on web development.
December 2021 rewriting: work in progress, please do not translate yet.
NOW/Opened: General content review. You may help by: a) reading and copy-editing the page's English, b) testing queries on LLQS, then editing in or discussing improvements, c) increasing the consistency of comments.
Legend: 🇶 minor aspects to improve, see hidden comment; ❌ query too heavy to run in this page.
Endpoint: Wikidata. Wikidata Query Service (WDQS): run SPARQL queries on Wikidata. Run, test, and download the data as JSON, CSV or TSV. It has advanced user-friendly features such as hovering over a term to see its meaning, code optimization, etc.
Lingua Libre data can be fetched using various programming languages such as Python, JavaScript, R and others, returning JSON or other formats.
For a code snippet in your language: open query.wikidata.org (Wikidata Query Service, aka WDQS), run your SPARQL query, then click "Code": a pop-up window appears with various implementations.
To download data, click "Download".
JavaScript:
At least 3 methods exist (code snippet), example:
Advanced SPARQL queries with COUNT() and similar functions are often slow (more than 3 seconds, sometimes more than 100 seconds). You are encouraged to run several smaller SPARQL queries and then merge the returned data. For example, the complementary JavaScript snippet below helps web developers to do so.
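As an illustration of one possible method, here is a minimal sketch based on the standard `fetch` API. It assumes the public Wikidata endpoint `https://query.wikidata.org/sparql` and the `format=json` parameter; the helper names `buildQueryUrl` and `runQuery` are hypothetical, chosen for this example only.

```javascript
// Assumption: the public Wikidata endpoint. Swap in the endpoint you actually query.
const ENDPOINT = "https://query.wikidata.org/sparql";

// Build a GET URL carrying the SPARQL query and asking for JSON results.
function buildQueryUrl(endpoint, sparql) {
  const params = new URLSearchParams({ query: sparql, format: "json" });
  return `${endpoint}?${params.toString()}`;
}

// Run the query and resolve with the raw result rows (results.bindings).
async function runQuery(sparql, endpoint = ENDPOINT) {
  const response = await fetch(buildQueryUrl(endpoint, sparql), {
    headers: { Accept: "application/sparql-results+json" },
  });
  if (!response.ok) {
    throw new Error(`SPARQL request failed: HTTP ${response.status}`);
  }
  const json = await response.json();
  return json.results.bindings;
}

// Usage (not executed here, since it needs network access):
// runQuery("SELECT ?lang ?iso WHERE { ?lang wdt:P220 ?iso } LIMIT 5")
//   .then(rows => console.log(rows));
```

`URLSearchParams` takes care of escaping the query text, so the SPARQL string can be written as-is.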
// Data from 3 SPARQL queries.
// Important: one key must be shared by all datasets, here: 'qid'
const langs = [
    { qid: 'Q209', label: 'Breton', iso: 'bre' },
    { qid: 'Q34', label: 'Marathi', iso: 'mar' }
  ],
  speakersFemales = [
    { qid: 'Q209', genderF: 3, recordsF: 60 },
    { qid: 'Q34', genderF: 21, recordsF: 5046 }
  ],
  speakersMales = [
    { qid: 'Q209', genderM: 7, recordsM: 218 },
    { qid: 'Q34', genderM: 85, recordsM: 32964 }
  ];

// Toolbox for merging data by same id
var merge2ArraysBySameId = function (arr1, arr2, id1) {
  return arr1.map(item1 => {
    var identical = arr2.find(obj => obj[id1] === item1[id1]);
    // Copy into a new object: no crash when arr2 has no match, and the inputs stay untouched
    return Object.assign({}, identical, item1);
  });
};

// Merging
var step1 = merge2ArraysBySameId(langs, speakersFemales, 'qid');
var step2 = merge2ArraysBySameId(step1, speakersMales, 'qid');
alert(JSON.stringify(step2));
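The datasets merged above are flat objects, whereas a SPARQL endpoint answers with the nested SPARQL JSON results layout (`results.bindings`, one `{ type, value }` cell per variable). The sketch below shows one way to flatten such a response first. The sample response is illustrative and the helper names `shortenEntity` and `flattenBindings` are hypothetical.

```javascript
// Sample response in the standard SPARQL JSON results layout (values are illustrative).
const response = {
  head: { vars: ["lang", "iso", "langLabel"] },
  results: {
    bindings: [
      {
        lang: { type: "uri", value: "http://www.wikidata.org/entity/Q12107" },
        iso: { type: "literal", value: "bre" },
        langLabel: { type: "literal", "xml:lang": "en", value: "Breton" },
      },
    ],
  },
};

// Keep only the trailing "Q…" of an entity URI; return any other value unchanged.
function shortenEntity(value) {
  const match = /\/(Q\d+)$/.exec(value);
  return match ? match[1] : value;
}

// Turn each binding row into a flat { variable: value } object.
function flattenBindings(json) {
  return json.results.bindings.map(row => {
    const flat = {};
    for (const [name, cell] of Object.entries(row)) {
      flat[name] = shortenEntity(cell.value);
    }
    return flat;
  });
}

console.log(flattenBindings(response));
// → [ { lang: 'Q12107', iso: 'bre', langLabel: 'Breton' } ]
```

Variables that are unbound in a row are simply absent from that row's object, so the merge step should not assume every key is present.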
Lingualibre's ground
Is Language (language/dialect(Q4)) → List existing languages with: LL Qid, ISO 639-3, Name
This query has been UPDATED to work on Commons SPARQL Endpoint.
SELECT ?lang ?iso ?langLabel
WHERE {
  # First get the relevant languages by looking up all records
  {
    SELECT DISTINCT ?lang WHERE {
      _:record wdt:P31 wd:Q108167708 ;  # Filter: P31 'instance of' is Q108167708 'pronunciation file'
               wdt:P407 ?lang .         # For each pronunciation file, fetch the language
    }
  }
  # From this point on, ?lang is bound. Get the label and ISO code on Wikidata
  SERVICE <https://query.wikidata.org/sparql> {
    ?lang wdt:P31 wd:Q34770 .  # Filter: P31 'instance of' is Q34770 'language'.
    ?lang wdt:P220 ?iso .      # Assign value: P220 'ISO 639-3' into ?iso.
  }
  # Add a label to each variable used: ?lang now has twin variable ?langLabel
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
This query is going to be DEPRECATED as the queried data will no longer be available.
SELECT ?speaker ?speakerLabel
WHERE {
  ?speaker prop:P2 entity:Q3 .  # Filter: P2 'instance of' is Q3 'speaker'.
  # Add labels to each variable used.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
This query is going to be DEPRECATED as the queried data will no longer be available.
SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q5 .  # Filter: P2 'instance of' is Q5 'language level'.
  # Add labels to each variable used.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
Is Sex or Gender (sex or gender(Q7)) → List existing sexes or genders
This query is going to be DEPRECATED as the queried data will no longer be available.
SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q7 .  # Filter: P2 'instance of' is Q7 'sex or gender'.
  # Add labels to each variable used.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
Speaker
🇶 Speaker name(s) → Speaker Qid(s)
SELECT ?speakerName ?speakerId
WHERE {
  VALUES ?speakerName { "Yug" "VIGNERON" }  # Assign value: one or multiple values
  # BIND: tag the plain name as an English string so that it can match rdfs:label values
  BIND(STRLANG(?speakerName, "en") AS ?speakerLabel)
  # Grammatical note: ';' chains several statements about the same subject
  ?speakerId prop:P2 entity:Q3 ;         # Filter: P2 'instance of' is Q3 'speaker'.
             rdfs:label ?speakerLabel .  # Filter by value: label equals ?speakerLabel's value
}

# Get Q445757 (User:SangeetaRH)'s data
SELECT ?anyProperty ?anyValue ?anyValueLabel
WHERE {
  entity:Q445757 ?anyProperty ?anyValue .  # Filter: of Q445757 'SangeetaRH', get any property and any value
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}

SELECT ?audio ?audioLabel
WHERE {
  ?audio prop:P5 entity:Q445757 .  # Filter: P5 'speaker' is Q445757 'SangeetaRH'
  ?audio prop:P4 entity:Q34 .      # Filter: P4 'language' is Q34 'Marathi'
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}

SELECT ?language ?speakerLabel (COUNT(?audio) AS ?audios)
WHERE {
  VALUES ?language { entity:Q34 }     # Assign value: Q34 'Marathi' into ?language
  VALUES ?speaker { entity:Q445757 }  # Assign value: Q445757 'SangeetaRH' into ?speaker
  ?audio prop:P5 ?speaker .           # Filter: P5 'speaker' is Q445757 'SangeetaRH'
  ?audio prop:P4 ?language .          # Filter: P4 'language' is Q34 'Marathi'
  ?audio prop:P2 entity:Q2 .          # Filter: P2 'instance of' is Q2 'record'
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
GROUP BY ?language ?speakerLabel  # Group results per language and speaker
Is Speaker (speaker(Q3)) → List of accounts and associated speakers
SELECT ?linkedUser ?speakerLabel (SUBSTR(STR(?speaker), 32) AS ?speakerQid)
WHERE {
  ?speaker prop:P2 entity:Q3 .     # Filter: P2 'instance of' is Q3 'speaker'.
  ?speaker prop:P11 ?linkedUser .  # Assign value: P11 'linked users' into ?linkedUser.
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
ORDER BY DESC(?speakerLabel)
Languages
🇶 Language name(s) in English → Language LL Qid(s)
This query has been UPDATED to work on Commons SPARQL Endpoint.
SELECT ?languageId ?languageName
WHERE {
  SERVICE <https://query.wikidata.org/sparql> {
    VALUES ?languageName { "Marathi" "Atikamekw" "Central Bikol" }  # Target values
    BIND(STRLANG(?languageName, "en") AS ?languageLabel)  # Tag each name as an English label
    ?languageId rdfs:label ?languageLabel ;  # Filter: the label equals ?languageLabel AND
                wdt:P31 wd:Q34770 .          # Filter: P31 'instance of' is Q34770 'language'
  }
}
Language ISO-639-3 → Language LL Qid(s), Wikidata Qid, Label
This query has been UPDATED to work on Commons SPARQL Endpoint.
SELECT ?langIso ?langId ?langIdLabel
WHERE {
  SERVICE <https://query.wikidata.org/sparql> {
    VALUES ?langIso { "mar" "bre" "bcl" "atj" "ban" }  # Target ISO values
    ?langId wdt:P220 ?langIso ;  # Filter: P220 'ISO 639-3' is one of the ?langIso values AND
            wdt:P31 wd:Q34770 .  # Filter: P31 'instance of' is Q34770 'language'
  }
  # Labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
Language LL All Languages → Count records
This query has been UPDATED to work on Commons SPARQL Endpoint.
SELECT ?language ?languageLabel ?audios
WHERE {
  {
    SELECT ?language (COUNT(?audio) AS ?audios)
    WHERE {
      # Uncomment the statement below to keep only certain languages (e.g. Wikidata's Q1571 'Marathi')
      # VALUES ?language { wd:Q1571 }
      ?audio wdt:P31 wd:Q108167708 ;  # Filter: P31 'instance of' is Q108167708 'pronunciation file'
             wdt:P407 ?language .     # Assign value: P407 'language' into ?language
    }
    GROUP BY ?language
  }
  SERVICE <https://query.wikidata.org/sparql> {
    SERVICE wikibase:label {
      bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
      ?language rdfs:label ?languageLabel .
    }
  }
}
🇶 Language LL All Languages → Count unique words, audios, ratio
This query has been UPDATED to work on Commons SPARQL Endpoint.
SELECT ?language ?languageLabel ?words ?audios ?percent
WHERE {
  {
    SELECT ?language
           (COUNT(DISTINCT(?itemLabel)) AS ?words)              # Count unique transcriptions into ?words
           (COUNT(?audio) AS ?audios)                           # Count records into ?audios
           (ROUND(10000 * ?words / ?audios) / 100 AS ?percent)  # Unique words per 100 records, 2 decimals
    WHERE {
      # Uncomment the line below to filter languages (e.g. Wikidata's Q1571 'Marathi')
      # VALUES ?language { wd:Q1571 }
      ?audio wdt:P407 ?language ;     # Bind property P407 'language' to ?language
             wdt:P31 wd:Q108167708 ;  # Filter: P31 'instance of' is Q108167708 'pronunciation file'
             wdt:P9533 ?itemLabel .   # Bind P9533 'transcription' to ?itemLabel.
    }
    GROUP BY ?language ?languageLabel
  }
  SERVICE <https://query.wikidata.org/sparql> {
    SERVICE wikibase:label {
      bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
      ?language rdfs:label ?languageLabel .
    }
  }
}
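The `?percent` column of the query above is the share of unique words among all records, rounded to two decimals. If the two counts come from separate, lighter queries, the same ratio can be computed on the client. A minimal sketch follows; the function name `uniqueWordPercent` and the sample counts are hypothetical.

```javascript
// Same arithmetic as ROUND(10000 * ?words / ?audios) / 100 in the SPARQL query.
// Returns null when there are no records, to avoid dividing by zero.
function uniqueWordPercent(words, audios) {
  if (audios === 0) return null;
  return Math.round((10000 * words) / audios) / 100;
}

// Hypothetical counts, for illustration only.
console.log(uniqueWordPercent(50, 200)); // → 25
console.log(uniqueWordPercent(1, 3));    // → 33.33
```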
Language LL Qid (Q34) → Count speakers
SELECT ?language (COUNT(?speaker) AS ?speakers)
WHERE {
  VALUES ?language { entity:Q34 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q34 'Marathi'
}
GROUP BY ?language
Language LL Qid (Q209) → List speakers
SELECT ?language ?speaker ?speakerLabel
WHERE {
  VALUES ?language { entity:Q209 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q209 'Breton'
  # Labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
Language LL Qid (Breton(Q209)) → Language data, all
Case: for language Q209 'Breton', get all its data.
SELECT *
WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
}
Language LL Qid (Breton(Q209)) → Language data, core
SELECT *
WHERE {
  # Given Q209 'Breton language', get its properties and values,
  # keeping only predicates declared as owl:DatatypeProperty
  entity:Q209 ?predicate ?object .
  ?predicate rdf:type owl:DatatypeProperty .
}
Language LL Qid (Breton(Q209)) → Property P13 (ISO 639-3)
SELECT *
WHERE {
  entity:Q209 prop:P13 ?iso .  # Assign value: Q209 'Breton', P13 'ISO 639-3', value into ?iso
}
Languages → List existing languages' iso-639-3
SELECT *
WHERE {
  ?lang prop:P13 ?code .  # Assign value: P13 'ISO 639-3' into ?code
}
🇶 Language WD Qid → Language data, core
SELECT *
WHERE {
  ?lang prop:P12 "Q12107" .   # Filter: P12 'Wikidata id' is Wikidata's "Q12107"
  ?lang ?predicate ?object .
  # ?predicate rdf:type owl:DatatypeProperty .
}
The queries below are too large to run on Lingua Libre's wiki pages, or even on the Lingua Libre Query Service.
To do: split them into smaller sub-queries, each with a single COUNT() function.
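One possible way to follow this advice is to generate one light sub-query per gender, each with a single COUNT(), and merge the results afterwards on the shared language Qid. The sketch below only builds the query text. The function name `buildSpeakerCountQuery` is hypothetical, and the property and item ids (P2, P4, P5, P8, Q2, Q16, Q17) are assumed from the heavy query in this section.

```javascript
// Build a sub-query counting, per language, the distinct speakers of one gender.
// genderQid: Lingua Libre item id, e.g. "Q16" (male) or "Q17" (female).
function buildSpeakerCountQuery(genderQid) {
  if (!/^Q\d+$/.test(genderQid)) {
    throw new Error(`Invalid Qid: ${genderQid}`);
  }
  return [
    "SELECT ?languageQid (COUNT(DISTINCT ?speaker) AS ?speakers)",
    "WHERE {",
    "  ?record prop:P2 entity:Q2 ;     # P2 'instance of' is Q2 'record'",
    "          prop:P4 ?languageQid ;  # P4 'language'",
    "          prop:P5 ?speaker .      # P5 'speaker'",
    `  ?speaker prop:P8 entity:${genderQid} .  # P8 'sex or gender'`,
    "}",
    "GROUP BY ?languageQid",
  ].join("\n");
}

// One query per gender; run them separately, then merge the rows by ?languageQid.
const queries = ["Q16", "Q17"].map(buildSpeakerCountQuery);
console.log(queries[1]);
```

Validating the Qid before inserting it into the query text keeps arbitrary input out of the generated SPARQL.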
❌ Languages → Name, Wikidata Qid, LLQid, Iso-639-3, and genders
Query
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode
       (COUNT(DISTINCT(?record)) AS ?recordCount)
       (COUNT(DISTINCT(?speakerLangM)) AS ?speakerM)
       (COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
  ?record prop:P2 entity:Q2 .            # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?languageQid .         # Assign value: P4 'language' into variable ?languageQid
  ?languageQid prop:P12 ?wdQid .         # Assign value: P12 'wikidata id' into variable ?wdQid
  ?languageQid prop:P13 ?isoCode .       # Assign value: P13 'iso639-3' into ?isoCode
  #?record prop:P5 ?speakerQidM .        # Assign value: P5 'speaker' into variable ?speakerQidM
  #?speakerQidM prop:P8 entity:Q16 .     # Filter: P8 'sex or gender' is Q16 'male'
  #?speakerQidM prop:P4 ?speakerLangM .  # Assign value: P4 'language' into variable ?speakerLangM
  ?record prop:P5 ?speakerQidF .         # Assign value: P5 'speaker' into variable ?speakerQidF
  ?speakerQidF prop:P8 entity:Q17 .      # Filter: P8 'sex or gender' is Q17 'female'
  ?speakerQidF prop:P4 ?speakerLangF .   # Assign value: P4 'language' into variable ?speakerLangF
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
Result
languageQidLabel  wdQid  languageQid  isoCode  recordCount  speakerM  speakerF
French            Q150   Q21          fra      16761        0         18
Marathi           Q1571  Q34          mar      13153        0         5
Polish            Q809   Q298         pol      11686        0         1
…
❌ Is Language (speaker(Q3)) → list all languages with number of unique words and speakers