Help
Difference between revisions of "SPARQL"
Help:SPARQL gathers a list of basic SPARQL queries in the context of Lingua Libre, demoed and ready to test, together with beginner-friendly background, inline comments, introductions to concepts, code snippets and a few tools. This page allows users unfamiliar with SPARQL to quickly learn its basics, query the LinguaLibre database, and download that data or feed it directly to an application. To match the most frequent uses, the page leans slightly toward web development and beginner-level JavaScript skills.
Revision as of 09:35, 30 June 2022
December 2021 rewrite: work in progress, please do not translate yet.
- Done Gather SPARQL queries related to: core, speakers, languages, audios.
- Done NOW/Opened: De-Westernization, replacing Q21 (French) by Q34 (Marathi) and Q42 (User:0x010C) by Q445757 (User:SangeetaRH).
- NOW/Opened: General content review. You may help by: a) reading and copy-editing the page's English, b) testing queries on LLQS, then editing them in or discussing improvements, c) increasing comments' consistency (done).
- Legend: 🇶 minor aspects to improve, see hidden comment; ❌ query too heavy to run in this page.
- Later/not yet: translations.
Help welcome.
Base
Useful elements
|
|
Tools
- LinguaLibre Query Service (LLQS) – run SPARQL Queries upon LinguaLibre. Run, test, download the data as json, csv or tsv.
- Wikidata Query Service (WDQS) – run SPARQL Queries upon Wikidata. Run, test, download the data as json, csv or tsv. Has advanced user-friendly features such as: hovering over a term to see its meaning, code optimization, etc.
- Wikimedia Commons Query Service (WCQS) – run SPARQL Queries upon the Wikimedia Commons wikibase (login required).
- Wikidata Lexeme Queries generators (hack me) by @sina_ahm – helps to create queries for Wikidata's Lexeme.
- Special:ApiSandbox – API queries generator for Lingualibre wikipage and wikibase contents. An alternative to SPARQL queries.
References
Code snippets
Fetch data using SPARQL
LinguaLibre data can be fetched using various programming languages such as Python, JavaScript, R and others, returning JSON or other formats.
- For a code snippet in your language: open query.wikidata.org (Wikidata Query Service, aka WDQS), run your SPARQL query, then click "Code": a pop-up window appears with various implementations.
- For downloading data, click "Download".
Javascript:
At least 3 methods exist (code snippet), example:
Query | Result's basic unit |
---|---|
SPARQL: SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10
|
{ … },
{
"item": {
"type": "uri",
"value": "https://lingualibre.org/entity/Q12"
},
"itemLabel": {
"xml:lang": "en",
"type": "literal",
"value": "beginner"
}
},
{ … }
|
Javascript:
var endpoint = 'https://lingualibre.org/sparql';
var sparql = 'SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10';
$.getJSON(endpoint,
{ query: sparql, format: 'json' },
function(data){ console.log('JQuery: ',data)}
);
|
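The same request can also be made without jQuery. The sketch below is a minimal, hedged alternative: `buildSparqlUrl`, `flattenBindings` and `runQuery` are hypothetical helper names (not part of Lingua Libre), the endpoint and the `format` parameter are the ones used in the jQuery example above, and the response is assumed to follow the standard SPARQL JSON results shape shown in the table.

```javascript
// Hypothetical helpers, not part of any Lingua Libre library.
// Endpoint taken from the jQuery example above.
const endpoint = 'https://lingualibre.org/sparql';

// Build the GET URL for a query; URLSearchParams takes care of the encoding.
function buildSparqlUrl(endpoint, sparql) {
  const params = new URLSearchParams({ query: sparql, format: 'json' });
  return endpoint + '?' + params.toString();
}

// Turn { results: { bindings: [ { item: { type, value }, ... } ] } }
// into plain rows such as { item: 'https://lingualibre.org/entity/Q12' }.
function flattenBindings(data) {
  return data.results.bindings.map(binding => {
    const row = {};
    for (const [name, cell] of Object.entries(binding)) {
      row[name] = cell.value;
    }
    return row;
  });
}

// Needs network access, so it is only defined here, not called.
async function runQuery(sparql) {
  const response = await fetch(buildSparqlUrl(endpoint, sparql));
  return flattenBindings(await response.json());
}
```

Flattening the bindings early keeps the rest of an application free of the nested `type`/`value` structure. Note that every value arrives as a string, counts included.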
Merging data
Advanced SPARQL queries with COUNT() and other aggregates are often slow (>3 s, sometimes >100 s). You are encouraged to run several smaller SPARQL queries and then merge their results. For example, the complementary JavaScript snippet below helps web developers do so.
// Data from 3 sparql queries.
// Important: all datasets must share one key, here: 'qid'
const langs = [{ qid: 'Q209', label: 'Breton', iso:'bre' }, { qid: 'Q34', label: 'Marathi', iso: 'mar' }],
speakersFemales = [{ qid: 'Q209', genderF: 3, recordsF: 60 }, { qid: 'Q34', genderF: 21, recordsF:5046 }],
speakersMales = [{ qid: 'Q209', genderM: 7, recordsM: 218 }, { qid: 'Q34', genderM: 85, recordsM:32964 }];
// Toolbox for merging data by same id
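var merge2ArraysBySameId = function(arr1, arr2, id1){
  return arr1.map( item1 => {
    // Matching object in arr2, or undefined when arr2 has no such id
    var identical = arr2.find(obj => obj[id1] === item1[id1]);
    // Copy into a new object: inputs stay untouched, and a missing match no longer throws
    return Object.assign({}, identical, item1)
  } );
}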
// Mergings
var step1 = merge2ArraysBySameId(langs,speakersFemales,'qid');
var step2 = merge2ArraysBySameId(step1,speakersMales,'qid');
alert(JSON.stringify(step2))
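When more than two result sets need merging, chaining calls two by two becomes tedious. The following is a hedged sketch of a variadic variant: `mergeByKey` is a hypothetical name, and the sample values are illustrative only. It indexes rows in a `Map`, so a language missing from one dataset simply lacks that dataset's fields.

```javascript
// Hypothetical helper: merge any number of arrays of objects sharing one key.
function mergeByKey(key, ...arrays) {
  const byId = new Map();
  for (const rows of arrays) {
    for (const row of rows) {
      // Copy into a fresh object so the input arrays are never modified.
      byId.set(row[key], Object.assign({}, byId.get(row[key]), row));
    }
  }
  return Array.from(byId.values());
}

// Illustrative data, in the same shape as the datasets above.
const languages = [{ qid: 'Q209', label: 'Breton' }, { qid: 'Q34', label: 'Marathi' }];
const females = [{ qid: 'Q34', genderF: 21 }]; // no row for Q209
const males = [{ qid: 'Q209', genderM: 7 }, { qid: 'Q34', genderM: 85 }];

const merged = mergeByKey('qid', languages, females, males);
console.log(JSON.stringify(merged));
```

Later arrays win when two datasets define the same field, so put the most trusted source last.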
Lingualibre's ground
Is Language (language/dialect (Q4)) → List existing languages with: LL Qid, ISO 639-3, Name
SELECT ?lang ?iso ?langLabel
WHERE {
?lang prop:P2 entity:Q4 . # Filter: P2 'instance of' is Q4 'language or dialect'.
?lang prop:P13 ?iso . # Assign value: P13 'ISO-639-3' into ?iso.
# Add label to each variable used.
# ?lang now has twin variable ?langLabel
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Is Speaker (speaker (Q3)) → List existing speakers
SELECT ?speaker ?speakerLabel
WHERE {
?speaker prop:P2 entity:Q3 . # Filter: P2 'instance of' is Q3 'speaker'.
# Add labels to each variable used.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Is Language level (language level (Q5)) → List existing levels
SELECT ?item ?itemLabel
WHERE {
?item prop:P2 entity:Q5 # Filter: P2 'instance of' is Q5 'language level'.
# Add labels to each variable used.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Is Sex or Gender (sex or gender (Q7)) → List existing sexes or genders
SELECT ?item ?itemLabel
WHERE {
?item prop:P2 entity:Q7 # Filter: P2 'instance of' is Q7 'sex or gender'.
# Add labels to each variable used.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Speaker
✅🇶 Speaker name(s) → Speaker Qid(s)
SELECT ?speakerName ?speakerId
WHERE {
VALUES ?speakerName { "Yug" "VIGNERON" } # Assign value: one or multiple values
# BIND: tag each name as an English literal (e.g. "Yug"@en), so it can match an English label
BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )
# Syntax note: ';' chains several predicates on the same subject
?speakerId prop:P2 entity:Q3 ; # Filter: P2 'instance of' is Q3 'speaker'.
rdfs:label ?speakerLabel . # Filter by value: label equals ?speakerLabel's value
}
|
|
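When the names come from user input, the `VALUES` line can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: `valuesClause` is a hypothetical helper, and it only escapes backslashes and double quotes (names containing line breaks are not handled).

```javascript
// Hypothetical helper: build a VALUES clause from a list of plain strings.
function valuesClause(variable, names) {
  const literals = names.map(name =>
    // Escape backslashes first, then double quotes, as SPARQL string literals require.
    '"' + name.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"'
  );
  return 'VALUES ' + variable + ' { ' + literals.join(' ') + ' }';
}

console.log(valuesClause('?speakerName', ['Yug', 'VIGNERON']));
// prints: VALUES ?speakerName { "Yug" "VIGNERON" }
```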
✅🇶 Speaker Qid (SangeetaRH (Q445757)) → Speaker data, all
# Get Q445757 (User:SangeetaRH)'s data
SELECT ?anyProperty ?anyValue ?anyValueLabel
WHERE {
entity:Q445757 ?anyProperty ?anyValue . # Filter: of Q445757 'SangeetaRH', get any property and any values
# Add labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅🇶 Speaker Qid (SangeetaRH (Q445757)) → Speaker languages (P4)
SELECT ?languages ?languagesLabel
WHERE {
entity:Q445757 prop:P4 ?languages . # Assign value: for Q445757 'SangeetaRH', P4 'language' into ?languages
# Add labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Speaker Qid (SangeetaRH (Q445757)) + Language LL Qid (Marathi (Q34)) → List records
SELECT ?audio ?audioLabel
WHERE {
?audio prop:P5 entity:Q445757 . # Filter: P5 Speaker is Q445757 User:SangeetaRH
?audio prop:P4 entity:Q34 . # Filter: P4 language is Q34 Marathi
# Add labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Speaker Qid (SangeetaRH (Q445757)) + Language LL Qid (Marathi (Q34)) → Count records
SELECT ?language ?speakerLabel (COUNT(?audio) AS ?audios)
WHERE {
VALUES ?language { entity:Q34 } # Assign value: Q34 'Marathi' into ?language
VALUES ?speaker { entity:Q445757 } # Assign value: Q445757 'SangeetaRH' into ?speaker
?audio prop:P5 ?speaker . # Filter: P5 'speaker' is Q445757 'SangeetaRH'
?audio prop:P4 ?language . # Filter: P4 'language' is Q34 'Marathi'
?audio prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
# Add labels
SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"}
}
GROUP BY ?language ?speakerLabel # Group the counted records by language and speaker
|
|
Is Speaker (speaker (Q3)) → List of accounts and associated speakers
SELECT ?linkedUser ?speakerLabel (SUBSTR(STR(?speaker),32) AS ?speakerQid)
WHERE {
?speaker prop:P2 entity:Q3 . # Filter: P2 'instance of' is Q3 'speaker'.
?speaker prop:P11 ?linkedUser . # Assign value: P11 'linked users' into ?linkedUser.
# Add labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
} ORDER BY DESC (?speakerLabel)
|
|
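The `SUBSTR(STR(?speaker),32)` call above trims the entity URI down to its Qid inside the query. The same can be done client-side on any returned URI. The sketch below assumes URIs ending in `/entity/Q…`, as in the sample result shown earlier on this page; `qidFromUri` is a hypothetical helper name.

```javascript
// Hypothetical helper: extract the Qid from an entity URI.
// Returns null when the string does not end with /entity/Q<digits>.
function qidFromUri(uri) {
  const match = /\/entity\/(Q\d+)$/.exec(uri);
  return match ? match[1] : null;
}

console.log(qidFromUri('https://lingualibre.org/entity/Q12')); // prints: Q12
```

Matching on the `/entity/` suffix rather than a fixed character offset keeps the helper working if the host part of the URI changes.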
Languages
✅🇶 Language name(s) in English → Language LL Qid(s)
SELECT ?languageId ?languageName
WHERE {
VALUES ?languageName { "Marathi" "Atikamekw" "Central Bikol" } # Target values
?languageId
prop:P2 entity:Q4 ; # Filter: P2 'instance of' is Q4 'language' AND
rdfs:label ?languageLabel . # Assign value label into ?languageLabel
BIND ( STRLANG(?languageName, "en") AS ?languageLabel ) # Bind filter by English
}
|
|
Language ISO-639-3 → Language LL Qid(s), Wikidata Qid, Label
SELECT ?langIso ?langId ?langWDQid ?langIdLabel
WHERE {
VALUES ?langIso { "mar" "bre" "bcl" "atj" "ban" } # Target ISO values
?langId
prop:P2 entity:Q4 ; # Filter: P2 'instance of' is Q4 'language' AND
prop:P13 ?langIso ; # Filter: P13 'ISO 639-3' is one of the ?langIso values AND
prop:P12 ?langWDQid . # Assign value: P12 'Wikidata id' to ?langWDQid
# Labels
SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"}
}
|
|
Language LL Qid (Q34) → Count items
SELECT ?language (COUNT(?item) AS ?items) WHERE {
VALUES ?language { entity:Q34 }
?item prop:P4 ?language .
}
GROUP BY ?language
|
|
Language LL Qid (Q34) → Count records
SELECT ?language (COUNT(?audio) AS ?audios) WHERE {
VALUES ?language { entity:Q34 }
?audio prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
?audio prop:P4 ?language . # Filter: P4 'language' is Q34 'Marathi'
}
GROUP BY ?language
|
|
✅🇶 Language LL Qid (Q34) → Count unique words, audios, ratio
SELECT ?language
(COUNT(DISTINCT(?itemLabel)) AS ?words) # Count distinct labels and assign value to ?words
(COUNT(?audio) as ?audios)
(ROUND(10000*?words/?audios)/100 AS ?percent)
WHERE {
VALUES ?language { entity:Q34 }
?audio prop:P4 ?language . # Filter: P4 'language' is Q34 'Marathi'
?audio prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
?audio rdfs:label ?itemLabel. # Assign value: label to ?itemLabel
}
GROUP BY ?language
|
|
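Since aggregates are costly, the ratio can also be computed client-side from two lighter counts. A minimal sketch mirroring the `ROUND(10000*?words/?audios)/100` expression above; `uniqueWordPercent` is a hypothetical helper. It accepts strings because SPARQL JSON results deliver numbers as text.

```javascript
// Hypothetical helper: percentage of unique words among recordings,
// rounded to two decimals like ROUND(10000*?words/?audios)/100.
function uniqueWordPercent(words, audios) {
  const w = Number(words);
  const a = Number(audios);
  if (a === 0) return null; // avoid a division by zero
  return Math.round(10000 * w / a) / 100;
}

console.log(uniqueWordPercent('50', '200')); // prints: 25
```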
Language LL Qid (Q34) → Count speakers
SELECT ?language (COUNT(?speaker) AS ?speakers) WHERE {
VALUES ?language { entity:Q34 }
?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'
?speaker prop:P4 ?language . # P4 'language' is Q34 'Marathi'
}
GROUP BY ?language
|
|
Language LL Qid (Q209) → List speakers
SELECT ?language ?speaker ?speakerLabel WHERE {
VALUES ?language { entity:Q209 }
?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'
?speaker prop:P4 ?language . # P4 'language' is Q209 'Breton'
# Labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Language LL Qid (Breton (Q209)) → Language data, all
Case: Get all data for language Q209 'Breton'.
SELECT * WHERE {
# Given Q209 'Breton language', get all properties and values
entity:Q209 ?predicate ?object .
}
|
|
Language LL Qid (Breton (Q209)) → Language data, core
SELECT * WHERE {
# Given Q209 'Breton language', get its properties and values, keeping only datatype properties
entity:Q209 ?predicate ?object .
?predicate rdf:type owl:DatatypeProperty .
}
|
|
Language LL Qid (Breton (Q209)) → Property P13 (ISO 639-3)
SELECT * WHERE {
entity:Q209 prop:P13 ?iso . # Assign value : Q209 'Breton', P13 'ISO 639-3', value into ?iso
}
|
|
Languages → List existing languages' iso-639-3
SELECT * WHERE {
?lang prop:P13 ?code .
}
|
|
✅🇶 Language WD Qid → Language data, core
SELECT * WHERE {
?lang prop:P12 "Q12107" . # Filter: P12 'Wikidata id' is Wikidata's "Q12107"
?lang ?predicate ?object . # Get any property and any value of that language
?predicate rdf:type owl:DatatypeProperty .
}
|
|
Records
Record LL Qid (Cometa (Q500)) → Record data, all
SELECT * WHERE {
entity:Q500 ?predicate ?object .
# ?predicate rdf:type owl:DatatypeProperty .
}
|
|
Record LL Qid (Cometa (Q500)) → Record data, core
SELECT * WHERE {
entity:Q500 ?predicate ?object .
?predicate rdf:type owl:DatatypeProperty .
}
|
|
Language (English (Q22)) + String → Record LL Qid(s)
SELECT ?itemLabel ?item
WHERE {
?item prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
?item prop:P4 entity:Q22 . # Filter: P4 'language' is Q22 'English'
?item rdfs:label ?itemLabel. # Assign value: label to ?itemLabel
FILTER(CONTAINS(?itemLabel, "apple"@en)).
} limit 10
|
|
Language (Breton (Q209)) + Speaker (ThonyVezbe (Q584098)) + String (ni) → Record LL Qid
Case: Search in Breton, for speaker 'ThonyVezbe', the record of the word 'ni'.
SELECT ?audio ?urlPointer
WHERE {
?audio prop:P4 entity:Q209 . # P4 'language' is Q209 'Breton'
?audio prop:P5 entity:Q584098 . # P5 'speaker' is Q584098 'ThonyVezbe'
?audio rdfs:label ?word . #word
FILTER ( STR(?word) = "ni" ) # word = 'ni'
?audio prop:P3 ?urlPointer.
}
|
|
Language (French (Q21)) + Speaker (Justforoc (Q137047)) + String → URL pointer, filename
SELECT ?word ?audio ?urlPointer
(replace(replace(replace(substr(STR(?urlPointer),52),"%20","_"),"%28","("),"%29",")") AS ?filename)
WHERE {
?audio prop:P4 entity:Q21 . # Filter: P4 'language' is Q21 'French'
?audio prop:P5 entity:Q137047 . # Filter: P5 'speaker' is Q137047 'Justforoc'
?audio rdfs:label ?word . # Assign value: label to ?word
#Filter: ?word with 'pomme' in French, non case-sensitive
FILTER REGEX(STR(?word), "pomme", "i" ) .
?audio prop:P3 ?urlPointer
}
|
|
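The nested `replace()` calls above only decode three escape sequences. On the client side, `decodeURIComponent` handles all of them. The sketch below assumes the URL pointer contains a `Special:FilePath/` segment followed by the percent-encoded file name; `filenameFromUrlPointer` and the sample URL are hypothetical.

```javascript
// Hypothetical helper: derive the Commons file name from a P3 URL pointer.
// Assumes the pointer contains 'Special:FilePath/' followed by the encoded name.
function filenameFromUrlPointer(urlPointer) {
  const marker = 'Special:FilePath/';
  const start = urlPointer.indexOf(marker);
  if (start === -1) return null;
  const encoded = urlPointer.slice(start + marker.length);
  // Decode %20, %28, %29, etc., then use underscores as in the SPARQL version.
  return decodeURIComponent(encoded).replace(/ /g, '_');
}

// Illustrative URL, not taken from real data.
const sample = 'https://commons.wikimedia.org/wiki/Special:FilePath/LL-Q150%20%28fra%29-Justforoc-pomme.wav';
console.log(filenameFromUrlPointer(sample)); // prints: LL-Q150_(fra)-Justforoc-pomme.wav
```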
Files on Commons about records in Punjabi with transcription and LinguaLibre identifier
SELECT * WHERE {
?m wdt:P31 wd:Q108167708 ; #record
wdt:P407 wd:Q58635 ; #in Punjabi
wdt:P9533 ?transcription ; #with transcription
wdt:P10369 ?idLili . #with LinguaLibre identifier
}
LIMIT 100
Heavy queries
The queries below are too heavy to run on LinguaLibre's wiki pages, or even on the Lingualibre Query Service.
To do: split them into smaller sub-queries, each with a single COUNT() function.
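One possible way to split such a query, sketched under assumptions: generate one small query per gender, each with a single `COUNT()`, run them separately, then merge the results by language on the client. `speakersPerLanguageQuery` is a hypothetical helper. The property and item ids (P2, P4, P8, Q3, Q16, Q17) are the ones used in the queries on this page. The generated query has not been benchmarked.

```javascript
// Hypothetical helper: one light query per gender, with a single COUNT().
function speakersPerLanguageQuery(genderQid) {
  if (!/^Q\d+$/.test(genderQid)) {
    throw new Error('Expected a Qid such as "Q17", got: ' + genderQid);
  }
  return [
    'SELECT ?language (COUNT(DISTINCT ?speaker) AS ?speakers) WHERE {',
    "  ?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'",
    '  ?speaker prop:P8 entity:' + genderQid + " . # P8 'sex or gender'",
    "  ?speaker prop:P4 ?language . # P4 'language'",
    '}',
    'GROUP BY ?language'
  ].join('\n');
}

const femaleQuery = speakersPerLanguageQuery('Q17'); // Q17 'female'
const maleQuery = speakersPerLanguageQuery('Q16');   // Q16 'male'
console.log(femaleQuery);
```

Each result set then carries a `?language` key, which is what the merge snippet in "Merging data" needs.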
❌ Languages → Name, Wikidata Qid, LLQid, Iso-639-3, and genders
Query | Result |
---|---|
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode
(COUNT(DISTINCT(?record)) AS ?recordCount)
(COUNT(DISTINCT(?speakerLangM)) AS ?speakerM)
(COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
?record prop:P2 entity:Q2 . # Filter: items where P2 'instance of' is Q2 'record'
?record prop:P4 ?languageQid . # Assign value: P4 'language' into variable ?languageQid
?languageQid prop:P12 ?wdQid . # Assign value: P12 'wikidata id' into variable ?wdQid
?languageQid prop:P13 ?isoCode. # Assign value: P13 'iso639-3' into ?isoCode
#?record prop:P5 ?speakerQidM . # Assign value: P5 'speaker' into variable ?speakerQidM
#?speakerQidM prop:P8 entity:Q16 . # Filter: P8 'sex or gender' is Q16 'male'
#?speakerQidM prop:P4 ?speakerLangM . # Assign value: P4 'language' into variable ?speakerLangM
?record prop:P5 ?speakerQidF . # Assign value: P5 'speaker' into variable ?speakerQidF
?speakerQidF prop:P8 entity:Q17 . # Filter: P8 'sex or gender' is Q17 'female'
?speakerQidF prop:P4 ?speakerLangF . # Assign value: P4 'language' into variable ?speakerLangF
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
|
languageQidLabel | wdQid | languageQid | isoCode | recordCount | speakerM | speakerF |
---|---|---|---|---|---|---|
French | Q150 | Q21 | fra | 16761 | 0 | 18 |
Marathi | Q1571 | Q34 | mar | 13153 | 0 | 5 |
Polish | Q809 | Q298 | pol | 11686 | 0 | 1 |
… |
❌ Is Language (language/dialect (Q4)) → List all languages with number of records and speakers
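SELECT ?language (COUNT(DISTINCT ?audio) AS ?nbAudio) (COUNT(DISTINCT ?speaker) AS ?nbSpeaker) WHERE {
?language prop:P2 entity:Q4 . # Filter: P2 'instance of' is Q4 'language'
?audio prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
?audio prop:P4 ?language . # Filter: record's P4 'language' is ?language
?speaker prop:P2 entity:Q3 . # Filter: P2 'instance of' is Q3 'speaker'
?speaker prop:P4 ?language . # Filter: speaker's P4 'language' is ?language
}
GROUP BY ?language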
Others
(These old queries are not assessed yet.)
Language (Breton (Q209)) → Record, speaker's language level
select ?record ?recordLabel ?speakerLabel ?languageLabel ?languageLevelLabel
where {
?record prop:P2 entity:Q2 # Filter: P2 'instance of' is Q2 'record'
; prop:P4 entity:Q209 . # AND P4 'language' is Q209 'Breton'
?record prop:P5 ?speaker . # Assign value: record's P5 'speaker' into ?speaker
?record prop:P4 ?language . # Assign value: record's P4 'language' into ?language
?speaker llp:P4 ?languageStatement . # P4 'language'
?languageStatement llv:P4 ?language . # P4 'language'
?languageStatement llq:P16 ?languageLevel . # P16 'language level'
# Adds labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
} ORDER BY ?languageLabel ?languageLevelLabel
|
|
Language (Marathi (Q34)) → Records of Wikidata concepts with WD Qid (P12)
- Those items were proposed to Lingualibre's recorder at step 3 via a SPARQL query upon Wikidata, so those words have WD's Qids.
SELECT ?languageLabel ?recordLabel ?record ?wid
WHERE {
?record prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
?record prop:P4 entity:Q34 . # Filter: P4 'language' is Q34 'Marathi'
?record prop:P4 ?language . # Assign value: record's P4 'language' to variable ?language
?record prop:P12 ?wid . # Assign value: record's P12 'wikidata id' to variable ?wid
# Add labels capability
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
Records → Filter by date: late 2018
SELECT
(COUNT(DISTINCT ?speaker) AS ?speakers)
(COUNT(DISTINCT ?record) AS ?records)
WHERE {
?record prop:P2 entity:Q2 .
?record prop:P6 ?date .
?record prop:P5 ?speaker .
# Filters:
FILTER(?date >= "2018-07-01T00:00:00Z"^^xsd:dateTime)
FILTER(?date < "2019-01-01T00:00:00Z"^^xsd:dateTime)
}
|
|
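The two `FILTER` lines can be generated from JavaScript `Date` objects when the period is chosen by a user. A minimal sketch: `toXsdDateTime` and `dateRangeFilter` are hypothetical helpers, and dates are expressed in UTC to match the `Z` suffix used above.

```javascript
// Hypothetical helper: format a Date as an xsd:dateTime literal (UTC, no milliseconds).
function toXsdDateTime(date) {
  return '"' + date.toISOString().replace(/\.\d{3}Z$/, 'Z') + '"^^xsd:dateTime';
}

// Hypothetical helper: half-open range, start included, end excluded.
function dateRangeFilter(variable, start, end) {
  return 'FILTER(' + variable + ' >= ' + toXsdDateTime(start) + ')\n' +
         'FILTER(' + variable + ' < ' + toXsdDateTime(end) + ')';
}

// Late 2018: from 1 July 2018 up to, but not including, 1 January 2019.
const late2018 = dateRangeFilter('?date', new Date(Date.UTC(2018, 6, 1)), new Date(Date.UTC(2019, 0, 1)));
console.log(late2018);
```

Using a half-open range avoids counting a record twice when consecutive periods are queried one after the other.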
See also
- Help:SPARQL 2 – next tutorial, with a focus on federated queries and Wikidata Lexemes.
- Lexicographical data/Ideas of queries
- LinguaLibre:Wikidata – stub, help write it!
- Help:Querying Lingua Libre – general review, redirecting users to the right place.
- Help:APIs – API queries over Wikimedia Commons or other Wikimedia wikis.
- mw:Manual:Developing extensions – PHP-based modules enhancing wikis, can pull data via SPARQL queries.