Help
Difference between revisions of "SPARQL"
Revision as of 15:53, 8 December 2021
Base
Fetch SPARQL data
Data can be fetched using various programming languages such as Python, JavaScript, R and others. On the Wikidata Query Service page, after running your SPARQL query, click "Code": a pop-up window appears with various implementations.
JavaScript:
At least three methods exist (code snippet), for example:
Query | Result's basic unit |
---|---|
SPARQL: SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10
|
{ … },
{
"item": {
"type": "uri",
"value": "https://lingualibre.org/entity/Q12"
},
"itemLabel": {
"xml:lang": "en",
"type": "literal",
"value": "beginner"
}
},
{ … }
|
Javascript:
var endpoint = 'https://lingualibre.org/sparql';
var sparql = 'SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10';
$.getJSON(endpoint,
{ query: sparql, format: 'json' },
function(data){ console.log('JQuery: ',data)}
);
|
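The same request can be made without jQuery. The sketch below is a minimal illustration, not the site's official client: `buildSparqlUrl` and `simplifyBindings` are hypothetical helper names, and the code assumes the endpoint returns the standard SPARQL JSON results layout (`results.bindings`), of which the "basic unit" shown above is one entry.

```javascript
// Build the GET URL for a SPARQL endpoint; URLSearchParams handles the encoding.
// (Hypothetical helper, not part of any Lingualibre library.)
function buildSparqlUrl(endpoint, sparql) {
  const params = new URLSearchParams({ query: sparql, format: 'json' });
  return `${endpoint}?${params.toString()}`;
}

// Flatten SPARQL JSON results: one plain object per binding,
// keeping only the "value" of each variable.
// Assumes the standard layout { results: { bindings: [ ... ] } }.
function simplifyBindings(data) {
  return data.results.bindings.map((binding) => {
    const row = {};
    for (const [name, cell] of Object.entries(binding)) {
      row[name] = cell.value;
    }
    return row;
  });
}

// Usage with the fetch API (network call, shown for illustration only):
// const sparql = 'SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10';
// fetch(buildSparqlUrl('https://lingualibre.org/sparql', sparql))
//   .then((response) => response.json())
//   .then((data) => console.log(simplifyBindings(data)));
```

Flattening the bindings first keeps later processing (tables, counts) independent of the `type` and `xml:lang` metadata attached to each value.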
✅ Is Language level (language level (Q5)) → list possible values
SELECT ?item ?itemLabel
WHERE {
?item prop:P2 entity:Q5 # Condition 1, P2 'instance of' is Q5 'language level'.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅ Is Sex or Gender (sex or gender (Q7)) → list possible values
SELECT ?item ?itemLabel
WHERE {
?item prop:P2 entity:Q7 # Condition 1, P2 'instance of' is Q7 'sex or gender'.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅🇶 Is Speaker (speaker (Q3)) → list all speakers
SELECT ?speaker ?speakerLabel
WHERE {
?speaker prop:P2 entity:Q3 . # Condition 1, P2 'instance of' is Q3 'speaker'.
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅ Item name → Qid(s)
SELECT ?item ?itemLabel
WHERE {
?item rdfs:label ?itemLabel.
FILTER(CONTAINS(LCASE(?itemLabel), "yug"@en)). # The label is lowercased, so the search term must be lowercase too
} LIMIT 10
|
|
✅ Speaker name(s) → Speaker Qid(s)
SELECT ?speakerName ?speakerId
WHERE {
VALUES ?speakerName { "Yug" "VIGNERON" } # One or multiple values
BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )
# P2: instance of; Q3: speaker.
?speakerId prop:P2 entity:Q3 ; rdfs:label ?speakerLabel .
}
|
|
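When the list of names comes from a script rather than being typed by hand, the `VALUES` clause of the query above can be generated. The following is a sketch under assumptions: `sparqlString` and `speakerLookupQuery` are hypothetical helper names, and the escaping only covers backslashes and double quotes, which is enough for typical user names but not a complete SPARQL literal escaper.

```javascript
// Wrap a string as a double-quoted SPARQL literal.
// Only backslashes and double quotes are escaped (assumption: names contain no line breaks).
function sparqlString(value) {
  return '"' + String(value).replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Build the "speaker name(s) -> speaker Qid(s)" query for any number of names.
// P2 'instance of' and Q3 'speaker' are taken from the query above.
function speakerLookupQuery(names) {
  const values = names.map(sparqlString).join(' ');
  return [
    'SELECT ?speakerName ?speakerId',
    'WHERE {',
    `  VALUES ?speakerName { ${values} }`,
    '  BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )',
    '  ?speakerId prop:P2 entity:Q3 ; rdfs:label ?speakerLabel .',
    '}',
  ].join('\n');
}

// Usage: print the query for the two names used in the example above.
console.log(speakerLookupQuery(['Yug', 'VIGNERON']));
```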
✅🇶 Speaker Qid (0x010C (Q42)) → Speaker data
# Get Q42 (User:0x010C)'s data
SELECT ?predicate ?object ?objectLabel
WHERE {
entity:Q42 ?predicate ?object .
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅🇶 Speaker Qid (0x010C (Q42)) → Speaker data → Speaker languages (P4)
SELECT ?languages ?languagesLabel
WHERE {
entity:Q42 prop:P4 ?languages .
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅ Speaker Qid + language → list of all associated audios
SELECT ?audio ?audioLabel
WHERE {
?audio prop:P5 entity:Q42 . # Condition 1, P5 Speaker is Q42 User:0x010C
?audio prop:P4 entity:Q21 . # Condition 2, P4 language is Q21 French
# Labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
❌ Is Language (language (Q4)) → list all languages with number of unique words and speakers
Too large to run, even on Lingualibre Query.
SELECT ?language (COUNT(?audio) AS ?nbAudio) (COUNT(?speaker) AS ?nbSpeaker) WHERE {
?language prop:P2 entity:Q4 .
?audio prop:P4 ?language .
?speaker prop:P4 ?language .
}
GROUP BY ?language
To do: split this into smaller sub-queries. For now, it works only for one counter and one language at a time:
Sub-queries
✅ Language LL Qid (Q21) → All items
SELECT ?language (COUNT(?audio) AS ?nbAudio) WHERE {
VALUES ?language { entity:Q21 }
?audio prop:P4 ?language .
}
GROUP BY ?language
|
|
✅ Language LL Qid (Q21) → Number of records
SELECT ?language (COUNT(?audio) AS ?nbAudio) WHERE {
VALUES ?language { entity:Q21 }
?audio prop:P2 entity:Q2 . # P2 'instance of' is Q2 'record'
?audio prop:P4 ?language . # P4 'language' is Q21 'French'
}
GROUP BY ?language
|
|
Language LL Qid (Q21) → Number of unique words
✅ Language LL Qid (Q21) → Number of speakers
SELECT ?language (COUNT(?speaker) AS ?nbSpeaker) WHERE {
VALUES ?language { entity:Q21 }
?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'
?speaker prop:P4 ?language . # P4 'language' is Q21 'French'
}
GROUP BY ?language
|
|
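The count sub-queries above differ only in the language Qid and in the 'instance of' class (Q2 'record' or Q3 'speaker'), so they can be generated one language and one counter at a time. This is a minimal sketch: `countQuery`, the `CLASSES` map and the `?item`/`?count` variable names are illustrative choices, not names used elsewhere on this page.

```javascript
// 'instance of' classes taken from the queries above: Q2 'record', Q3 'speaker'.
const CLASSES = new Map([
  ['record', 'Q2'],
  ['speaker', 'Q3'],
]);

// Build one small count query for a single language and a single counter.
// languageQid: a Lingualibre Qid such as 'Q21' (French); kind: 'record' or 'speaker'.
function countQuery(languageQid, kind) {
  if (!/^Q\d+$/.test(languageQid)) {
    throw new Error(`Not a Qid: ${languageQid}`);
  }
  if (!CLASSES.has(kind)) {
    throw new Error(`Unknown counter: ${kind}`);
  }
  return [
    'SELECT ?language (COUNT(?item) AS ?count) WHERE {',
    `  VALUES ?language { entity:${languageQid} }`,
    `  ?item prop:P2 entity:${CLASSES.get(kind)} .  # P2 'instance of'`,
    "  ?item prop:P4 ?language .  # P4 'language'",
    '}',
    'GROUP BY ?language',
  ].join('\n');
}

// Usage: one query per language and counter, to be sent one after the other.
for (const qid of ['Q21', 'Q209']) {
  console.log(countQuery(qid, 'record'));
}
```

Running many small queries in sequence trades one heavy request for several light ones, which is the workaround the to-do note describes.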
✅ Language LL Qid (Q209) → List of speakers
SELECT ?language ?speaker ?speakerLabel WHERE {
VALUES ?language { entity:Q209 }
?speaker prop:P2 entity:Q3 . # P2 'instance of' is Q3 'speaker'
?speaker prop:P4 ?language . # P4 'language' is Q209 'Breton'
# Labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅ Language LL Qid (French (Q21)) + Speaker (0x010C (Q42)) → Number of records
SELECT ?language ?speakerLabel (COUNT(?audio) AS ?nbAudio)
WHERE {
VALUES ?language { entity:Q21 }
VALUES ?speaker { entity:Q42 }
?audio prop:P4 ?language . # P4 'language' is Q21 'French'
?audio prop:P2 entity:Q2 . # P2 'instance of' is Q2 'record'
?audio prop:P5 ?speaker . # P5 'speaker' is Q42 '0x010C'
# Labels
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
GROUP BY ?language ?speakerLabel
|
|
Isolang → Language LL Qid
SELECT * WHERE {
?lang prop:P13 ?code .
}
|
|
✅ Isolang → Language WD Qid
SELECT ?langIso ?langId
WHERE {
VALUES ?langIso { "ban" "bre" } # One or multiple values
# P2 'instance of'; Q4 'language'; P13 'ISO 639-3 code'
?langId prop:P2 entity:Q4 ; prop:P13 ?langIso .
}
|
|
✅ Language WD Qid → Language data
SELECT * WHERE {
?lang prop:P12 "Q12107" . # P12 'Wikidata id' is Wikidata's "Q12107"
?lang ?predicate ?object . # Get all properties and values of that language item
}
|
|
✅ Language LL Qid (Breton (Q209)) → Language data
Case: get all data for language Q209 'Breton'.
SELECT * WHERE {
# Given Q209 'Breton language', get all properties and values
entity:Q209 ?predicate ?object .
}
|
|
✅ Language LL Qid (Breton (Q209)) → core Language data
Case: get all CORE data for language Q209 'Breton'.
SELECT * WHERE {
# Given Q209 'Breton language', get all properties and values
entity:Q209 ?predicate ?object .
?predicate rdf:type owl:DatatypeProperty .
}
|
|
✅ Language (Breton (Q209)) + speaker (ThonyVezbe (Q584098)) + word (ni) → Audio's Qid
Case: search in the Breton language, with speaker 'ThonyVezbe', for the word 'ni'.
SELECT ?audio
WHERE {
?audio prop:P4 entity:Q209 . # P4 'language' is Q209 'Breton'
?audio prop:P5 entity:Q584098 . # P5 'speaker' is Q584098 'ThonyVezbe'
?audio rdfs:label ?word . #word
FILTER ( STR(?word) = "ni" ) # word = 'ni'
}
|
|
Audio Qid → Audio data
✅ Language + speaker + word → Audio's Commons url
Languages → Name, Wikidata Qid, LLQid, Iso-639-3, and genders
Query | Result |
---|---|
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode
(COUNT(DISTINCT(?record)) AS ?recordCount)
(COUNT(DISTINCT(?speakerLangM)) AS ?speakerM)
(COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
?record prop:P2 entity:Q2 . # Filter: items where P2 'instance of' is Q2 'record'
?record prop:P4 ?languageQid . # Assign value: P4 'language' into variable ?languageQid
?languageQid prop:P12 ?wdQid . # Assign value: P12 'wikidata id' into variable ?wdQid
?languageQid prop:P13 ?isoCode . # Assign value: P13 'iso639-3' into variable ?isoCode
#?record prop:P5 ?speakerQidM . # Assign value: P5 'speaker' into variable ?speakerQidM
#?speakerQidM prop:P8 entity:Q16 . # Filter: P8 'sex or gender' is Q16 'male'
#?speakerQidM prop:P4 ?speakerLangM . # Assign value: P4 'language' into variable ?speakerLangM
?record prop:P5 ?speakerQidF . # Assign value: P5 'speaker' into variable ?speakerQidF
?speakerQidF prop:P8 entity:Q17 . # Filter: P8 'sex or gender' is Q17 'female'
?speakerQidF prop:P4 ?speakerLangF . # Assign value: P4 'language' into variable ?speakerLangF
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
|
languageQidLabel wdQid languageQid isoCode recordCount speakerM speakerF
French Q150 Q21 fra 16761 0 18
Marathi Q1571 Q34 mar 13153 0 5
Polish Q809 Q298 pol 11686 0 1
…
|
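To turn the JSON returned for the query above into a table like the one in the Result column, the bindings can be written out as tab-separated lines. This is a sketch under assumptions: `toTsv` and `shortQid` are hypothetical helper names, the response is assumed to follow the standard SPARQL JSON results layout, and entity values are assumed to be URIs of the form `https://lingualibre.org/entity/Q…`, as in the sample result near the top of this page.

```javascript
// Column order of the Result table above.
const COLUMNS = ['languageQidLabel', 'wdQid', 'languageQid', 'isoCode',
  'recordCount', 'speakerM', 'speakerF'];

// Shorten an entity URI to its Qid; any other value is returned unchanged.
function shortQid(value) {
  const match = /\/entity\/(Q\d+)$/.exec(value);
  return match ? match[1] : value;
}

// Convert SPARQL JSON results into tab-separated text, one line per binding.
// Variables missing from a binding become empty cells.
function toTsv(data, columns) {
  const lines = [columns.join('\t')];
  for (const binding of data.results.bindings) {
    const cells = columns.map((name) =>
      (binding[name] ? shortQid(binding[name].value) : ''));
    lines.push(cells.join('\t'));
  }
  return lines.join('\n');
}

// Usage (data is the parsed JSON response of the query above):
// console.log(toTsv(data, COLUMNS));
```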
Tools
- Special:ApiSandbox – API query generator for Lingualibre wiki page and Wikibase contents.