Help
Difference between revisions of "SPARQL (intermediate)"
Help:SPARQL 2 will explore federated queries fetching data from both LinguaguaLibre and Wikidata's endpoints. It allows to augent your data thank to Wikidata-provided languages population, status, countries, as well as speakers' geocoordinates, country of origin, etc.
(55 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | {{#Subtitle:'''Help:SPARQL 2''' will explore federated queries fetching data from both LinguaguaLibre and Wikidata's endpoints, | + | {{#Subtitle:'''Help:SPARQL 2''' will explore federated queries fetching data from both LinguaguaLibre and Wikidata's endpoints. It allows to augent your data thank to Wikidata-provided languages population, status, countries, as well as speakers' geocoordinates, country of origin, etc.}} |
{{draft}} | {{draft}} | ||
== Tools == | == Tools == | ||
− | |||
− | |||
− | |||
=== SPARQL to persitent data === | === SPARQL to persitent data === | ||
''Some SPARQL queries are meaningful but heavy and overly slow. This administrator tool stores or updates the response data on LinguaLibre, within a wikipage. Stored data can then be loaded in <0.1 second. Multiple data can also be merged via a common property if any. | ''Some SPARQL queries are meaningful but heavy and overly slow. This administrator tool stores or updates the response data on LinguaLibre, within a wikipage. Stored data can then be loaded in <0.1 second. Multiple data can also be merged via a common property if any. | ||
Line 13: | Line 10: | ||
* To query Lingualibre from Wikidata, use <code><nowiki>SERVICE <https://lingualibre.org/sparql></nowiki></code>. | * To query Lingualibre from Wikidata, use <code><nowiki>SERVICE <https://lingualibre.org/sparql></nowiki></code>. | ||
* To query Wikidata from LinguaLibre, use <code><nowiki>SERVICE <https://query.wikidata.org/sparql></nowiki></code>. | * To query Wikidata from LinguaLibre, use <code><nowiki>SERVICE <https://query.wikidata.org/sparql></nowiki></code>. | ||
+ | * To query Commons from Lingualibre, use <code><nowiki>SERVICE <https://commons-query.wikimedia.org/sparql></nowiki></code>. | ||
=== Retrieve data of LinguaLibre from Wikidata === | === Retrieve data of LinguaLibre from Wikidata === | ||
To run on WDQS.<ref name="WDQS">[https://query.wikidata.org <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [https://query.wikidata.org Wikidata Query Service (WDQS)] – run SPARQL Queries upon Wikidata. Run, test, download the data as json, csv or tsv. Has advanced user-friendly features such as : word hovering too see a term's meaning, code optimization, etc.</ref> It lists the existing levels in LinguaLibre. | To run on WDQS.<ref name="WDQS">[https://query.wikidata.org <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [https://query.wikidata.org Wikidata Query Service (WDQS)] – run SPARQL Queries upon Wikidata. Run, test, download the data as json, csv or tsv. Has advanced user-friendly features such as : word hovering too see a term's meaning, code optimization, etc.</ref> It lists the existing levels in LinguaLibre. | ||
+ | {| style="width:100%" | ||
+ | |- style="vertical-align:top;" | ||
+ | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
PREFIX prop: <https://lingualibre.org/prop/direct/> | PREFIX prop: <https://lingualibre.org/prop/direct/> | ||
PREFIX entity: <https://lingualibre.org/entity/> | PREFIX entity: <https://lingualibre.org/entity/> | ||
Line 33: | Line 35: | ||
} | } | ||
</syntaxhighlight> | </syntaxhighlight> | ||
+ | || | ||
+ | <query _pagination="10"> | ||
− | === | + | #defaultEndpoint:Wikidata |
− | To run on LLQS.<ref name="LLQS">[{{SERVER}}/bigdata/#query <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [{{SERVER}}/bigdata/#query LinguaLibre Query Service (LLQS)] – run SPARQL Queries upon LinguaLibre. Run, test, download the data as json, csv or tsv.</ref> | + | PREFIX prop: <https://lingualibre.org/prop/direct/> |
+ | PREFIX entity: <https://lingualibre.org/entity/> | ||
+ | |||
+ | SELECT * { | ||
+ | SERVICE <https://lingualibre.org/sparql> { | ||
+ | SELECT | ||
+ | ?item | ||
+ | ?itemLabel | ||
+ | { | ||
+ | ?item prop:P2 entity:Q5. | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </query> | ||
+ | |} | ||
+ | |||
+ | === From LinguaLibre Query service, retrieve Wikidata's data === | ||
+ | :''To run on LLQS.<ref name="LLQS">[{{SERVER}}/bigdata/#query <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [{{SERVER}}/bigdata/#query LinguaLibre Query Service (LLQS)] – run SPARQL Queries upon LinguaLibre. Run, test, download the data as json, csv or tsv.</ref>'' | ||
+ | :''Federate SPARQL query example to create.'' | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
Line 41: | Line 64: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | + | == Notable elements == | |
− | {|class="wikitable" | + | {| class="wikitable" |
− | ! | + | |- style="background:#CCC" |
− | ! | + | ! LinguaLibre endpoint |
− | |- | + | !colspan=2| Wikidata endpoint |
+ | |- style="vertical-align:top" | ||
| | | | ||
* `instance of` [[Property:P2|P2]] : | * `instance of` [[Property:P2|P2]] : | ||
Line 56: | Line 80: | ||
* `wikidata` [[Property:P12|P12]] | * `wikidata` [[Property:P12|P12]] | ||
* `iso` [[Property:P13|P13]] | * `iso` [[Property:P13|P13]] | ||
+ | * `media type` [[Property:P24|P24]] | ||
+ | ** media type [[Q88888]] | ||
+ | ** is `audio` [[Q88889]] | ||
+ | ** is `video` [[Q88890]] | ||
+ | ** is `written` [[Q1087276]] | ||
| | | | ||
− | + | For languages: | |
* `instance of` [[:d:P:P31|P31]]/[[:d:P:P279|P279]]* | * `instance of` [[:d:P:P31|P31]]/[[:d:P:P279|P279]]* | ||
** is `language` [[:d:Q34770]] (ethnic based) , [[:d:Q315]] (capacity) | ** is `language` [[:d:Q34770]] (ethnic based) , [[:d:Q315]] (capacity) | ||
Line 65: | Line 94: | ||
** is `dead language` [[d:Q45762]] (no community) | ** is `dead language` [[d:Q45762]] (no community) | ||
** is `instinct language` [[:d:Q38058796]] (no speaker) | ** is `instinct language` [[:d:Q38058796]] (no speaker) | ||
+ | * `ISO 639-1 code` [[:d:P:P218|P:P218]] | ||
+ | * `ISO 639-2 code` [[:d:P:P219|P:P219]] | ||
+ | * `ISO 639-3 code` [[:d:P:P220|P:P220]] | ||
+ | * `IETF language tag` [[:d:P:P305|P:P305]] | ||
+ | * `geographic coordinate` [[:d:P:P625|P625]] | ||
+ | * `number of speakers` [[:d:P:P1098|P1098]] | ||
+ | * `wikimedia code` [[:d:P:P424|P:P424]] | ||
+ | * `native name` [[:d:P:P1705|P:P1705]] | ||
+ | * `lingualibre ID` [[:d:P:P10369|P:P10369]] | ||
+ | || | ||
+ | For countries: | ||
* `country` [[:d:P:P17|P17]] | * `country` [[:d:P:P17|P17]] | ||
** is `country` [[:d:Q6256]] | ** is `country` [[:d:Q6256]] | ||
** is `sovereign state` [[:d:Q3624078]] | ** is `sovereign state` [[:d:Q3624078]] | ||
+ | * `ISO 3166-1 alpha-3 code` [[:d:P:P298|P:P298]] | ||
* `continent` [[:d:P:P30|P30]] | * `continent` [[:d:P:P30|P30]] | ||
− | * `official language` [[:d:P:P37|P37]] | + | * `official language` [[:d:P:P37|P37]] |
− | * ` | + | For places: |
− | * ` | + | * `located in entity of level n` [[:d:P:P131|P:P131]] |
− | * ` | + | ** `administrative territorial entity of a specific level` [[:d:Q1799794]] |
+ | *** (0th level = country) | ||
+ | *** `1st-level administrative country subdivision [[:d:Q10864048]] | ||
+ | *** `2nd-level administrative country subdivision` [[:d:Q13220204]] | ||
+ | *** `3rd-level administrative country subdivision` [[:d:Q13221722]] | ||
+ | *** `4th-level administrative country subdivision` [[:d:Q14757767]] | ||
+ | *** `5th-level administrative country subdivision` [[:d:Q15640612]] | ||
+ | *** `6th-level administrative country subdivision [[:d:Q22927291]] | ||
+ | |||
|} | |} | ||
== Languages == | == Languages == | ||
− | === ✅ Language () → List of LL languages with wd speaker population === | + | === ✅ Language ([[Q930]] Gascon) → List of records in this language === |
+ | To run on WDQS.<ref name="WDQS" /> | ||
+ | {| style="width:100%" | ||
+ | |- style="vertical-align:top;" | ||
+ | |style="padding: 0 3em;width:60%"| | ||
+ | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | PREFIX prop: <https://lingualibre.org/prop/direct/> | ||
+ | PREFIX entity: <https://lingualibre.org/entity/> | ||
+ | SELECT ?writing WHERE { | ||
+ | SERVICE <https://lingualibre.org/sparql> { | ||
+ | SELECT ?writing WHERE { | ||
+ | ?record prop:P2 entity:Q2; | ||
+ | prop:P4 entity:Q930, ?language; | ||
+ | prop:P3 ?url; | ||
+ | prop:P7 ?writing. | ||
+ | # FILTER(CONTAINS(STR(?audio), "LL-Q35735")) # occitan | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | || | ||
+ | <query _pagination="10"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | PREFIX prop: <https://lingualibre.org/prop/direct/> | ||
+ | PREFIX entity: <https://lingualibre.org/entity/> | ||
+ | SELECT ?writing WHERE { | ||
+ | SERVICE <https://lingualibre.org/sparql> { | ||
+ | SELECT ?writing WHERE { | ||
+ | ?record prop:P2 entity:Q2; | ||
+ | prop:P4 entity:Q930, ?language; | ||
+ | prop:P3 ?url; | ||
+ | prop:P7 ?writing. | ||
+ | # FILTER(CONTAINS(STR(?audio), "LL-Q35735")) # occitan | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </query> | ||
+ | |} | ||
+ | |||
+ | === ✅ Language () → List of WD languages with speaker population >80M === | ||
+ | To run on WDQS.<ref name="WDQS" /> | ||
+ | {| style="width:100%" | ||
+ | |- style="vertical-align:top;" | ||
+ | |style="padding: 0 3em;width:60%"| | ||
+ | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | SELECT DISTINCT ?item ?ISO ?itemLabel WHERE { | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". } | ||
+ | ?item p:P1098 ?statement0. | ||
+ | ?item p:P219 ?isolang. | ||
+ | ?statement0 (psv:P1098/wikibase:quantityAmount) ?numericQuantity. | ||
+ | FILTER(?numericQuantity > "80000000"^^xsd:decimal) | ||
+ | MINUS { | ||
+ | ?item p:P31 ?statement1. | ||
+ | ?statement1 (ps:P31/(wdt:P279*)) wd:Q25295. | ||
+ | } | ||
+ | OPTIONAL { ?item wdt:P218 ?ISO. } | ||
+ | } | ||
+ | ORDER BY ASC (?ISO) | ||
+ | LIMIT 100 | ||
+ | </syntaxhighlight> | ||
+ | || | ||
+ | <query _pagination="10"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | #defaultEndpoint:Wikidata | ||
+ | SELECT DISTINCT ?item ?ISO ?itemLabel WHERE { | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". } | ||
+ | ?item p:P1098 ?statement0. | ||
+ | ?item p:P219 ?isolang. | ||
+ | ?statement0 (psv:P1098/wikibase:quantityAmount) ?numericQuantity. | ||
+ | FILTER(?numericQuantity > "80000000"^^xsd:decimal) | ||
+ | MINUS { | ||
+ | ?item p:P31 ?statement1. | ||
+ | ?statement1 (ps:P31/(wdt:P279*)) wd:Q25295. | ||
+ | } | ||
+ | OPTIONAL { ?item wdt:P218 ?ISO. } | ||
+ | } | ||
+ | ORDER BY ASC (?ISO) | ||
+ | LIMIT 100 | ||
+ | </query> | ||
+ | |} | ||
+ | |||
+ | === Language () → List of LL languages with wd speaker population === | ||
+ | :''Query to create.'' | ||
===✅ Languages → List of Sign languages on wikidata === | ===✅ Languages → List of Sign languages on wikidata === | ||
− | To run on WDQS.<ref name=" | + | To run on WDQS.<ref name="WDQS" /> |
{| style="width:100%" | {| style="width:100%" | ||
|- style="vertical-align:top;" | |- style="vertical-align:top;" | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
SELECT DISTINCT ?item ?itemLabel | SELECT DISTINCT ?item ?itemLabel | ||
WHERE { | WHERE { | ||
Line 93: | Line 229: | ||
|| | || | ||
<query _pagination="10"> | <query _pagination="10"> | ||
+ | #defaultEndpoint:Wikidata | ||
SELECT DISTINCT ?item ?itemLabel | SELECT DISTINCT ?item ?itemLabel | ||
WHERE { | WHERE { | ||
Line 98: | Line 235: | ||
?statement0 (ps:P31/(wdt:P279*)) wd:Q34228. | ?statement0 (ps:P31/(wdt:P279*)) wd:Q34228. | ||
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". } | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". } | ||
+ | } | ||
+ | </query> | ||
+ | |} | ||
+ | |||
+ | ===✅ Languages → List of whistled languages on wikidata === | ||
+ | To run on WDQS.<ref name="WDQS" /> | ||
+ | {| style="width:100%" | ||
+ | |- style="vertical-align:top;" | ||
+ | |style="padding: 0 3em;width:60%"| | ||
+ | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | SELECT ?whistled_language ?whistled_languageLabel | ||
+ | WHERE { | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } | ||
+ | ?whistled_language wdt:P31 wd:Q2737212. # Is whistled language (Q2737212) | ||
+ | } | ||
+ | </syntaxhighlight> | ||
+ | || | ||
+ | <query _pagination="10"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | SELECT ?whistled_language ?whistled_languageLabel | ||
+ | WHERE { | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } | ||
+ | ?whistled_language wdt:P31 wd:Q2737212. # Is whistled language (Q2737212) | ||
} | } | ||
</query> | </query> | ||
Line 108: | Line 269: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT | SELECT | ||
?deadLanguageLinguaLibre | ?deadLanguageLinguaLibre | ||
Line 155: | Line 317: | ||
|| | || | ||
<query _pagination="10"> | <query _pagination="10"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT | SELECT | ||
?deadLanguageLinguaLibre | ?deadLanguageLinguaLibre | ||
Line 208: | Line 371: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT | SELECT | ||
?endangeredLanguageLinguaLibre | ?endangeredLanguageLinguaLibre | ||
Line 255: | Line 419: | ||
|| | || | ||
<query _pagination="10"> | <query _pagination="10"> | ||
− | + | #defaultEndpoint:Lingualibre | |
SELECT | SELECT | ||
?endangeredLanguageLinguaLibre | ?endangeredLanguageLinguaLibre | ||
Line 309: | Line 473: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT | SELECT | ||
?signLanguageLinguaLibre | ?signLanguageLinguaLibre | ||
Line 354: | Line 519: | ||
|| | || | ||
<query _pagination="10"> | <query _pagination="10"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT | SELECT | ||
?signLanguageLinguaLibre | ?signLanguageLinguaLibre | ||
Line 399: | Line 565: | ||
|} | |} | ||
− | === | + | === Language ([[Q34]]) → Item with Wikidata Qid(s), optional Geo-coordinates === |
− | + | To run on LLQS.<ref name="LLQS" /> | |
{| style="width:100%" | {| style="width:100%" | ||
|- style="vertical-align:top;" | |- style="vertical-align:top;" | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
PREFIX wd: <http://www.wikidata.org/entity/> | PREFIX wd: <http://www.wikidata.org/entity/> | ||
PREFIX wdt: <http://www.wikidata.org/prop/direct/> | PREFIX wdt: <http://www.wikidata.org/prop/direct/> | ||
Line 411: | Line 578: | ||
PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/> | PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/> | ||
− | select distinct ?record ?transcription ?languageLabel ?wdQid ? | + | select distinct ?record ?transcription ?languageLabel ?wdQid (?wdLabel as ?wdLabelEN) ?coord |
where { | where { | ||
?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record' | ?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record' | ||
Line 417: | Line 584: | ||
?record llt:P4 ?language . # Assign value: record's P4 'language' to variable ?language | ?record llt:P4 ?language . # Assign value: record's P4 'language' to variable ?language | ||
?record llt:P7 ?transcription . # Assign value: record's P7 'transcription' to variable ?transcription | ?record llt:P7 ?transcription . # Assign value: record's P7 'transcription' to variable ?transcription | ||
− | ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ? | + | ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wdQid |
SERVICE <https://query.wikidata.org/sparql> { | SERVICE <https://query.wikidata.org/sparql> { | ||
Line 426: | Line 593: | ||
} | } | ||
} | } | ||
− | + | SERVICE wikibase:label { | |
− | SERVICE wikibase:label { | ||
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . | bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . | ||
} | } | ||
Line 433: | Line 599: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
|- | |- | ||
− | | | + | | |
− | < | + | <query _pagination="10"> |
− | record transcription languageLabel wdQid | + | #defaultEndpoint:Lingualibre |
− | + | PREFIX wd: <http://www.wikidata.org/entity/> | |
− | + | PREFIX wdt: <http://www.wikidata.org/prop/direct/> | |
− | + | PREFIX ll: <https://lingualibre.org/entity/> | |
− | + | PREFIX llt: <https://lingualibre.org/prop/direct/> | |
− | </ | + | PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/> |
+ | |||
+ | select distinct ?record ?transcription ?languageLabel ?wdQid (?wdLabel as ?wdLabelEN) ?coord | ||
+ | where { | ||
+ | ?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record' | ||
+ | ?record llt:P4 ll:Q34 . # Filter: record's P4 'language' is Q34 'Marathi' | ||
+ | ?record llt:P4 ?language . # Assign value: record's P4 'language' to variable ?language | ||
+ | ?record llt:P7 ?transcription . # Assign value: record's P7 'transcription' to variable ?transcription | ||
+ | ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wdQid | ||
+ | |||
+ | SERVICE <https://query.wikidata.org/sparql> { | ||
+ | OPTIONAL { ?wdQid wdt:P625 ?coord . } # Assign value: wikidata item's wd:P625 'coordinates' to variable ?coord | ||
+ | OPTIONAL { | ||
+ | ?wdQid rdfs:label ?wdLabel . # Assign value: wikidata item's label to variable ?wikidataLabel | ||
+ | FILTER (LANG(?wdLabel) = "en") . # Filter: default language, else English | ||
+ | } | ||
+ | } | ||
+ | SERVICE wikibase:label { | ||
+ | bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . | ||
+ | } | ||
+ | } | ||
+ | </query> | ||
|} | |} | ||
Line 449: | Line 636: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT ?languageLabel ?wikidata ?iso | SELECT ?languageLabel ?wikidata ?iso | ||
?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords | ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords | ||
Line 495: | Line 683: | ||
|| | || | ||
<query _pagination="10"> | <query _pagination="10"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords | SELECT ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords | ||
(ROUND(1000*?femalesRecords/(?femalesRecords+?malesRecords))/10 AS ?percent) | (ROUND(1000*?femalesRecords/(?femalesRecords+?malesRecords))/10 AS ?percent) | ||
Line 540: | Line 729: | ||
=== [HEAVY] Languages → Languages with gender and recordings counts (2) === | === [HEAVY] Languages → Languages with gender and recordings counts (2) === | ||
+ | :''Section needs review.'' | ||
To run on LLQS.<ref name="LLQS" /> | To run on LLQS.<ref name="LLQS" /> | ||
{| style="width:100%" | {| style="width:100%" | ||
Line 545: | Line 735: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
− | SELECT | + | #defaultEndpoint:Lingualibre |
+ | SELECT ?iso | ||
(?genderLabel as ?Gender) | (?genderLabel as ?Gender) | ||
(COUNT(DISTINCT ?speakerQid) as ?Speakers) | (COUNT(DISTINCT ?speakerQid) as ?Speakers) | ||
Line 558: | Line 749: | ||
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" } | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" } | ||
} | } | ||
− | GROUP BY | + | GROUP BY ?iso ?genderLabel |
− | ORDER BY ASC(? | + | ORDER BY ASC(?iso ) |
</syntaxhighlight> | </syntaxhighlight> | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
− | + | <query _pagination="10"> | |
− | + | #defaultEndpoint:Lingualibre | |
− | + | SELECT ?iso | |
− | + | ?genderLabel | |
− | + | (COUNT(DISTINCT ?speakerQid) as ?Speakers) | |
− | + | (COUNT(DISTINCT ?record) as ?Records) | |
− | |||
− | |||
− | |||
− | |||
WHERE { | WHERE { | ||
− | ? | + | ?record prop:P2 entity:Q2 . # Filter: items where P2 'instance of' is Q2 'record' |
− | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" | + | ?record prop:P4 ?language . # Filter: items where P4 'language' is Q34 'Marathi' |
+ | # OPTIONAL { ?language prop:P12 ?wikidata } # Assign value: P12 'wikidata id' into variable ?WD | ||
+ | OPTIONAL { ?language prop:P13 ?iso } # Assign value: P13 'iso639-3' into ?isoCode | ||
+ | ?record prop:P5 ?speakerQid . # Assign value: P5 'speaker' into variable ?speakerQid | ||
+ | ?speakerQid prop:P8 ?gender . # Assign value: P8 'sex or gender' | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" } | ||
} | } | ||
− | + | GROUP BY ?iso ?genderLabel | |
− | ORDER BY ASC(? | + | ORDER BY ASC(?iso) |
− | + | </query> | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
|} | |} | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
== Speakers == | == Speakers == | ||
=== ✅ Speakers → Largest number of languages recorded and known === | === ✅ Speakers → Largest number of languages recorded and known === | ||
+ | To run on LLQS.<ref name="LLQS" /> | ||
{| style="width:100%" | {| style="width:100%" | ||
|- style="vertical-align:top;" | |- style="vertical-align:top;" | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
#Title: Speakers with recordings largest number of languages and known languages | #Title: Speakers with recordings largest number of languages and known languages | ||
SELECT ?speaker ?speakerLabel ?count ?languages | SELECT ?speaker ?speakerLabel ?count ?languages | ||
Line 734: | Line 833: | ||
|| | || | ||
<query _pagination="5"> | <query _pagination="5"> | ||
+ | #defaultEndpoint:Lingualibre | ||
#Title: Speakers with recordings largest number of languages and known languages | #Title: Speakers with recordings largest number of languages and known languages | ||
SELECT ?speaker ?speakerLabel ?count ?languages | SELECT ?speaker ?speakerLabel ?count ?languages | ||
Line 785: | Line 885: | ||
</query> | </query> | ||
|} | |} | ||
− | |||
=== ✅ Speakers → Countries with most speakers === | === ✅ Speakers → Countries with most speakers === | ||
+ | To run on LLQS.<ref name="LLQS" /><br>Note: this queries collects ''declared'' speakers, and do not check for actual recordings. | ||
{| style="width:100%" | {| style="width:100%" | ||
|- style="vertical-align:top;" | |- style="vertical-align:top;" | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count) | SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count) | ||
WITH { | WITH { | ||
Line 832: | Line 933: | ||
|| | || | ||
<query _pagination="20"> | <query _pagination="20"> | ||
+ | #defaultEndpoint:Lingualibre | ||
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count) | SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count) | ||
WITH { | WITH { | ||
Line 878: | Line 980: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
#defaultView:Map | #defaultView:Map | ||
PREFIX ll: <https://lingualibre.org/entity/> | PREFIX ll: <https://lingualibre.org/entity/> | ||
Line 907: | Line 1,010: | ||
|| | || | ||
<query _pagination="20"> | <query _pagination="20"> | ||
+ | #defaultEndpoint:Wikidata | ||
+ | #defaultView:Map | ||
+ | PREFIX ll: <https://lingualibre.org/entity/> | ||
+ | PREFIX llt: <https://lingualibre.org/prop/direct/> | ||
+ | |||
+ | SELECT DISTINCT ?lLabel ?coord WITH { | ||
+ | SELECT ?lLabel ?loc WHERE { | ||
+ | SERVICE <https://lingualibre.org/sparql> { | ||
+ | select DISTINCT ?lLabel ?loc { | ||
+ | SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } | ||
+ | ?l llt:P2 ll:Q3 ; | ||
+ | llt:P14 ?loc . | ||
+ | ?record llt:P5 ?l. | ||
+ | FILTER (regex(?loc, '^Q')) | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | } AS %i | ||
+ | WHERE { | ||
+ | INCLUDE %i | ||
+ | BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?loc)) AS ?locURL) | ||
+ | SERVICE <https://query.wikidata.org/sparql> { | ||
+ | select * { | ||
+ | ?locURL wdt:P625 ?coord . | ||
+ | } | ||
+ | } | ||
+ | } | ||
</query> | </query> | ||
|} | |} | ||
− | |||
=== <!-- ✅--> Speakers → Map of speakers with recordings === | === <!-- ✅--> Speakers → Map of speakers with recordings === | ||
Line 917: | Line 1,046: | ||
|style="padding: 0 3em;width:60%"| | |style="padding: 0 3em;width:60%"| | ||
<syntaxhighlight lang="sparql"> | <syntaxhighlight lang="sparql"> | ||
+ | #defaultEndpoint:Wikidata | ||
#defaultView:Map | #defaultView:Map | ||
# Same can apply to languages : | # Same can apply to languages : | ||
Line 966: | Line 1,096: | ||
ORDER BY DESC (?records) | ORDER BY DESC (?records) | ||
</syntaxhighlight> | </syntaxhighlight> | ||
− | |||
| | | | ||
− | < | + | <query _pagination="10"> |
− | + | #defaultEndpoint:Wikidata | |
− | + | #defaultView:Map | |
− | + | # Same can apply to languages : | |
+ | # Q3→Q4 : items speaker and language | ||
+ | # P14→P12 : items WD location and WD language | ||
+ | # P5→P4 : properties speaker and languages | ||
+ | # P625→P1098 : properties location and population | ||
+ | |||
+ | PREFIX ll: <https://lingualibre.org/entity/> | ||
+ | PREFIX llt: <https://lingualibre.org/prop/direct/> | ||
− | </ | + | SELECT DISTINCT ?itemLabel ?wikidata (?info AS ?coordinates) ?records (SAMPLE(?tag) AS ?layer) |
+ | # On Lingualibre, get item's info | ||
+ | WITH { | ||
+ | SELECT * | ||
+ | WHERE { | ||
+ | SERVICE <https://lingualibre.org/sparql> { # Commented on LLQS only | ||
+ | SELECT DISTINCT ?itemLabel ?wikidata (COUNT(?record) AS ?records) | ||
+ | { | ||
+ | ?item llt:P2 ll:Q3 ; | ||
+ | rdfs:label ?itemLabel; | ||
+ | llt:P14 ?wikidata . | ||
+ | ?record llt:P5 ?item . | ||
+ | FILTER (LANG(?itemLabel)='en') | ||
+ | FILTER (regex(?wikidata, '^Q')) | ||
+ | } GROUP BY ?itemLabel ?wikidata | ||
+ | } # Commented on LLQS only | ||
+ | } | ||
+ | } AS %infoWikidataId | ||
+ | # On Wikidata, get the target info | ||
+ | WHERE { | ||
+ | INCLUDE %infoWikidataId | ||
+ | BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?wikidata)) AS ?infoURL) | ||
+ | SERVICE <https://query.wikidata.org/sparql> { | ||
+ | SELECT * { | ||
+ | ?infoURL wdt:P625 ?info . | ||
+ | } | ||
+ | } | ||
+ | BIND( | ||
+ | IF(?records < 10, "<10", | ||
+ | IF(?records < 1000, "10-1,000", | ||
+ | IF(?records < 5000, "1k-5k", | ||
+ | IF(?records < 25000, "5k-25k", | ||
+ | IF(?records < 50000, "25k-50k", | ||
+ | ">50k"))))) | ||
+ | AS ?tag) . | ||
+ | } | ||
+ | GROUP BY ?itemLabel ?wikidata ?info ?records | ||
+ | ORDER BY DESC (?records) | ||
+ | </query> | ||
|} | |} | ||
Line 981: | Line 1,155: | ||
== See also == | == See also == | ||
* [[Help:SPARQL]] | * [[Help:SPARQL]] | ||
− | {{ | + | * [https://cradle.toolforge.org/#/subject/affiant Craddle.toolforge.org] |
+ | {{Technicals}} | ||
[[Category:Lingua Libre:Help{{#translation:}}]] | [[Category:Lingua Libre:Help{{#translation:}}]] |
Latest revision as of 16:28, 23 April 2024
Tools
SPARQL to persitent data
Some SPARQL queries are meaningful but heavy and overly slow. This administrator tool stores or updates the response data on LinguaLibre, within a wikipage. Stored data can then be loaded in <0.1 second. Multiple data can also be merged via a common property if any.
Federated queries
- To query Lingualibre from Wikidata, use
SERVICE <https://lingualibre.org/sparql>
. - To query Wikidata from LinguaLibre, use
SERVICE <https://query.wikidata.org/sparql>
. - To query Commons from Lingualibre, use
SERVICE <https://commons-query.wikimedia.org/sparql>
.
Retrieve data of LinguaLibre from Wikidata
To run on WDQS.[1] It lists the existing levels in LinguaLibre.
#defaultEndpoint:Wikidata
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>
SELECT * {
SERVICE <https://lingualibre.org/sparql> {
SELECT
?item
?itemLabel
{
?item prop:P2 entity:Q5.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
}
}
|
|
From LinguaLibre Query service, retrieve Wikidata's data
- To run on LLQS.[2]
- Federate SPARQL query example to create.
# [Example to complete]
Notable elements
LinguaLibre endpoint | Wikidata endpoint | |
---|---|---|
|
For languages:
|
For countries:
For places:
|
Languages
✅ Language (Gascon (Q930) Gascon) → List of records in this language
To run on WDQS.[1]
#defaultEndpoint:Wikidata
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>
SELECT ?writing WHERE {
SERVICE <https://lingualibre.org/sparql> {
SELECT ?writing WHERE {
?record prop:P2 entity:Q2;
prop:P4 entity:Q930, ?language;
prop:P3 ?url;
prop:P7 ?writing.
# FILTER(CONTAINS(STR(?audio), "LL-Q35735")) # occitan
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
}
}
|
|
✅ Language () → List of WD languages with speaker population >80M
To run on WDQS.[1]
#defaultEndpoint:Wikidata
SELECT DISTINCT ?item ?ISO ?itemLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
?item p:P1098 ?statement0.
?item p:P219 ?isolang.
?statement0 (psv:P1098/wikibase:quantityAmount) ?numericQuantity.
FILTER(?numericQuantity > "80000000"^^xsd:decimal)
MINUS {
?item p:P31 ?statement1.
?statement1 (ps:P31/(wdt:P279*)) wd:Q25295.
}
OPTIONAL { ?item wdt:P218 ?ISO. }
}
ORDER BY ASC (?ISO)
LIMIT 100
|
|
Language () → List of LL languages with wd speaker population
- Query to create.
✅ Languages → List of Sign languages on wikidata
To run on WDQS.[1]
#defaultEndpoint:Wikidata
SELECT DISTINCT ?item ?itemLabel
WHERE {
?item p:P31 ?statement0.
?statement0 (ps:P31/(wdt:P279*)) wd:Q34228.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
|
|
✅ Languages → List of whistled languages on wikidata
To run on WDQS.[1]
#defaultEndpoint:Wikidata
SELECT ?whistled_language ?whistled_languageLabel
WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?whistled_language wdt:P31 wd:Q2737212. # Is whistled language (Q2737212)
}
|
|
✅ Languages → List of LL languages with wikidata dead or extinct status
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
SELECT
?deadLanguageLinguaLibre
?deadLanguageLinguaLibreLabel
?audios
# List Wikidata dead/extinct languages
WITH {
SELECT DISTINCT ?deadLanguage {
SERVICE <https://query.wikidata.org/sparql> {
{ ?deadLanguage wdt:P31/wdt:P279* wd:Q45762. }
UNION
{ ?deadLanguage wdt:P31/wdt:P279* wd:Q38058796. }
}
}
} AS %deadLanguage
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
WITH {
SELECT ?deadLanguageLinguaLibre {
INCLUDE %deadLanguage.
BIND(REPLACE(STR(?deadLanguage), '.*/', '') AS ?deadLanguageQid)
?deadLanguageLinguaLibre
prop:P2 entity:Q4;
prop:P12 ?deadLanguageQid.
}
} AS %deadLanguageLinguaLibre
# For those Lingualibre languages, count audios into ?audios
WITH {
SELECT
?deadLanguageLinguaLibre
(COUNT(?audio) AS ?audios)
{
INCLUDE %deadLanguageLinguaLibre.
?audio
prop:P2 entity:Q2;
prop:P4 ?deadLanguageLinguaLibre.
}
GROUP BY ?deadLanguageLinguaLibre
} AS %audios
# What is that ? Back to main scope ? :
{
INCLUDE %deadLanguageLinguaLibre.
OPTIONAL{ INCLUDE %audios. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?audios)
|
|
✅ Languages → List of LL languages with wikidata Endangered language status
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
SELECT
?endangeredLanguageLinguaLibre
?endangeredLanguageLinguaLibreLabel
?audios
# List Wikidata endangered languages
WITH {
SELECT DISTINCT ?endangeredLanguage {
SERVICE <https://query.wikidata.org/sparql> {
{ ?endangeredLanguage wdt:P31/wdt:P279* wd:Q335214. }
UNION
{ ?endangeredLanguage wdt:P31/wdt:P279* wd:Q83365366. }
}
}
} AS %endangeredLanguage
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
WITH {
SELECT ?endangeredLanguageLinguaLibre {
INCLUDE %endangeredLanguage.
BIND(REPLACE(STR(?endangeredLanguage), '.*/', '') AS ?endangeredLanguageQid)
?endangeredLanguageLinguaLibre
prop:P2 entity:Q4;
prop:P12 ?endangeredLanguageQid.
}
} AS %endangeredLanguageLinguaLibre
# For those Lingualibre languages, count audios into ?audios
WITH {
SELECT
?endangeredLanguageLinguaLibre
(COUNT(?audio) AS ?audios)
{
INCLUDE %endangeredLanguageLinguaLibre.
?audio
prop:P2 entity:Q2;
prop:P4 ?endangeredLanguageLinguaLibre.
}
GROUP BY ?endangeredLanguageLinguaLibre
} AS %audios
# What is that ? Back to main scope ? :
{
INCLUDE %endangeredLanguageLinguaLibre.
OPTIONAL{ INCLUDE %audios. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?audios)
|
|
✅ Languages → List of LL languages with wikidata Sign language status
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
SELECT
?signLanguageLinguaLibre
?signLanguageLinguaLibreLabel
?audios
# List Wikidata sign languages
WITH {
SELECT DISTINCT ?signLanguage {
SERVICE <https://query.wikidata.org/sparql> {
{ ?signLanguage wdt:P31/wdt:P279* wd:Q34228. }
}
}
} AS %signLanguage
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
WITH {
SELECT ?signLanguageLinguaLibre {
INCLUDE %signLanguage.
BIND(REPLACE(STR(?signLanguage), '.*/', '') AS ?signLanguageQid)
?signLanguageLinguaLibre
prop:P2 entity:Q4;
prop:P12 ?signLanguageQid.
}
} AS %signLanguageLinguaLibre
# For those Lingualibre languages, count audios into ?audios
WITH {
SELECT
?signLanguageLinguaLibre
(COUNT(?audio) AS ?audios)
{
INCLUDE %signLanguageLinguaLibre.
?audio
prop:P2 entity:Q2;
prop:P4 ?signLanguageLinguaLibre.
}
GROUP BY ?signLanguageLinguaLibre
} AS %audios
# What is that ? Back to main scope ? :
{
INCLUDE %signLanguageLinguaLibre.
OPTIONAL{ INCLUDE %audios. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?audios)
|
|
Language (Marathi (Q34)) → Item with Wikidata Qid(s), optional Geo-coordinates
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>
PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/>
select distinct ?record ?transcription ?languageLabel ?wdQid (?wdLabel as ?wdLabelEN) ?coord
where {
?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record'
?record llt:P4 ll:Q34 . # Filter: record's P4 'language' is Q34 'Marathi'
?record llt:P4 ?language . # Assign value: record's P4 'language' to variable ?language
?record llt:P7 ?transcription . # Assign value: record's P7 'transcription' to variable ?transcription
?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wdQid
SERVICE <https://query.wikidata.org/sparql> {
OPTIONAL { ?wdQid wdt:P625 ?coord . } # Assign value: wikidata item's wd:P625 'coordinates' to variable ?coord
OPTIONAL {
?wdQid rdfs:label ?wdLabel . # Assign value: wikidata item's label to variable ?wikidataLabel
FILTER (LANG(?wdLabel) = "en") . # Filter: default language, else English
}
}
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
}
}
|
|
✅ Languages → Languages with gender and recordings counts
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
SELECT ?languageLabel ?wikidata ?iso
?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
(ROUND(1000*?femalesRecords/(?femalesRecords+?malesRecords))/10 AS ?percent)
WITH {
SELECT ?language ?languageLabel ?wikidata ?iso {
?record prop:P2 entity:Q2 . # Filter: P2 'instance of' is Q2 'record'
?record prop:P4 ?language . # Assign value: P4 'language' into ?language
?language prop:P12 ?wikidata . # Assign value: P12 'wikidata id' into ?wikidata
OPTIONAL { ?language prop:P13 ?iso . } # Assign value: P13 'iso639-3' into ?iso
}
GROUP BY ?language ?languageLabel ?wikidata ?iso
} AS %base
WITH {
SELECT ?language ?languageLabel ?iso ?genderLabel
(COUNT(DISTINCT ?females) AS ?femalesSpeakers)
(COUNT(DISTINCT ?record) AS ?femalesRecords) {
INCLUDE %base
?record prop:P4 ?language ; # Filter
prop:P5 ?females . # Assign value: P5 'speaker' into ?females
?females prop:P8 entity:Q17 ; # Filter
prop:P8 ?gender . # Assign value: P8 'gender' into ?gender
} GROUP BY ?language ?languageLabel ?iso ?genderLabel
} AS %females
WITH {
SELECT ?language ?languageLabel ?iso ?genderLabel
(COUNT(DISTINCT ?males) AS ?malesSpeakers)
(COUNT(DISTINCT ?record) AS ?malesRecords) {
INCLUDE %base
?record prop:P4 ?language ;
prop:P5 ?males . # Assign value: P5 'speaker' into variable ?speakerQid
?males prop:P8 entity:Q16 ;
prop:P8 ?gender .
} GROUP BY ?language ?languageLabel ?iso ?genderLabel
} AS %males
{
INCLUDE %base
INCLUDE %females
INCLUDE %males
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
ORDER BY ASC(?languageLabel )
|
|
[HEAVY] Languages → Languages with gender and recordings counts (2)
- Section needs review.
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
SELECT ?iso
(?genderLabel as ?Gender)
(COUNT(DISTINCT ?speakerQid) as ?Speakers)
(COUNT(DISTINCT ?record) as ?Records)
WHERE {
?record prop:P2 entity:Q2 . # Filter: items where P2 'instance of' is Q2 'record'
?record prop:P4 ?language . # Filter: items where P4 'language' is Q34 'Marathi'
# OPTIONAL { ?language prop:P12 ?wikidata } # Assign value: P12 'wikidata id' into variable ?WD
OPTIONAL { ?language prop:P13 ?iso } # Assign value: P13 'iso639-3' into ?isoCode
?record prop:P5 ?speakerQid . # Assign value: P5 'speaker' into variable ?speakerQid
?speakerQid prop:P8 ?gender . # Assign value: P8 'sex or gender'
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
GROUP BY ?iso ?genderLabel
ORDER BY ASC(?iso )
|
|
Speakers
✅ Speakers → Largest number of languages recorded and known
To run on LLQS.[2]
#defaultEndpoint:Lingualibre
#Title: Speakers with recordings largest number of languages and known languages
SELECT ?speaker ?speakerLabel ?count ?languages
# Get audios, language, speaker triplet
WITH {
SELECT DISTINCT ?speaker ?language {
?audio prop:P4 ?language;
prop:P5 ?speaker.
}
} AS %speakers
# Get the count of languages per each speaker
WITH {
SELECT ?speaker (COUNT(?speaker) AS ?count) {
INCLUDE %speakers.
}
GROUP BY ?speaker
ORDER BY DESC(?count)
} AS %countOfLanguagesRecordedPerSpeaker
# Get the maximum number of languages per each speaker
WITH {
SELECT (MAX(?count) AS ?maxNumberOfLanguagesRecorded) {
INCLUDE %countOfLanguagesRecordedPerSpeaker.
}
} AS %maxNumberOfLanguagesRecorded
# Get those speakers whose count equals the maximum number of languages
WITH {
SELECT ?speaker ?count {
INCLUDE %countOfLanguagesRecordedPerSpeaker.
INCLUDE %maxNumberOfLanguagesRecorded.
FILTER(?count = ?maxNumberOfLanguagesRecorded).
}
} AS %speakersWithMostNumberOfLanguagesRecorded
# Get the languages of those speakers that have recorded audios in the
# most number of languages
WITH {
SELECT ?speaker (GROUP_CONCAT(?languageLabel; SEPARATOR = ", ") AS ?languages) {
INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
?speaker prop:P4 [
rdfs:label ?languageLabel
]
FILTER(LANG(?languageLabel) = "en").
}
GROUP BY ?speaker
} AS %languagesOfSpeakersWithMostNumberOfLanguagesRecorded
{
INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
INCLUDE %languagesOfSpeakersWithMostNumberOfLanguagesRecorded.
?speaker rdfs:label ?speakerLabel.
FILTER(LANG(?speakerLabel) = "en")
}
|
|
✅ Speakers → Countries with most speakers
To run on LLQS.[2]
Note: this queries collects declared speakers, and do not check for actual recordings.
#defaultEndpoint:Lingualibre
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count)
WITH {
SELECT DISTINCT ?speaker {
?speaker prop:P2 entity:Q3;
}
} AS %speakers
WITH {
SELECT DISTINCT
?speaker
?country
?countryLabel
?ISO3
?continentLabel
{
INCLUDE %speakers.
?speaker prop:P14 ?residence.
# Avoids weird errors.
FILTER(REGEX(?residence, "^Q[0-9]+$"))
BIND(IRI(CONCAT('http://www.wikidata.org/entity/', ?residence)) AS ?residenceId)
# Get country from wikidata
SERVICE <https://query.wikidata.org/sparql> {
?residenceId wdt:P17 ?country.
?country rdfs:label ?countryLabel;
wdt:P298 ?ISO3;
wdt:P30 ?continent.
?continent rdfs:label ?continentLabel.
FILTER(LANG(?countryLabel) = "en").
FILTER(LANG(?continentLabel) = "en").
}
}
} AS %speakersWithCountries
{
INCLUDE %speakersWithCountries.
}
GROUP BY ?country ?continentLabel ?ISO3 ?countryLabel
ORDER BY DESC(?count)
|
|
Speakers → Map of speakers by place
To run on WDQS.[1]
#defaultEndpoint:Wikidata
#defaultView:Map
PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>
SELECT DISTINCT ?lLabel ?coord WITH {
SELECT ?lLabel ?loc WHERE {
SERVICE <https://lingualibre.org/sparql> {
select DISTINCT ?lLabel ?loc {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?l llt:P2 ll:Q3 ;
llt:P14 ?loc .
?record llt:P5 ?l.
FILTER (regex(?loc, '^Q'))
}
}
}
} AS %i
WHERE {
INCLUDE %i
BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?loc)) AS ?locURL)
SERVICE <https://query.wikidata.org/sparql> {
select * {
?locURL wdt:P625 ?coord .
}
}
}
|
|
Speakers → Map of speakers with recordings
To run on WDQS.[1]
#defaultEndpoint:Wikidata
#defaultView:Map
# Same can apply to languages :
# Q3→Q4 : items speaker and language
# P14→P12 : items WD location and WD language
# P5→P4 : properties speaker and languages
# P625→P1098 : properties location and population
PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>
SELECT DISTINCT ?itemLabel ?wikidata (?info AS ?coordinates) ?records (SAMPLE(?tag) AS ?layer)
# On Lingualibre, get item's info
WITH {
SELECT *
WHERE {
SERVICE <https://lingualibre.org/sparql> { # Commented on LLQS only
SELECT DISTINCT ?itemLabel ?wikidata (COUNT(?record) AS ?records)
{
?item llt:P2 ll:Q3 ;
rdfs:label ?itemLabel;
llt:P14 ?wikidata .
?record llt:P5 ?item .
FILTER (LANG(?itemLabel)='en')
FILTER (regex(?wikidata, '^Q'))
} GROUP BY ?itemLabel ?wikidata
} # Commented on LLQS only
}
} AS %infoWikidataId
# On Wikidata, get the target info
WHERE {
INCLUDE %infoWikidataId
BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?wikidata)) AS ?infoURL)
SERVICE <https://query.wikidata.org/sparql> {
SELECT * {
?infoURL wdt:P625 ?info .
}
}
BIND(
IF(?records < 10, "<10",
IF(?records < 1000, "10-1,000",
IF(?records < 5000, "1k-5k",
IF(?records < 25000, "5k-25k",
IF(?records < 50000, "25k-50k",
">50k")))))
AS ?tag) .
}
GROUP BY ?itemLabel ?wikidata ?info ?records
ORDER BY DESC (?records)
|
|
References
- ↑ 1.0 1.1 1.2 1.3 1.4 1.5 1.6 Wikidata Query Service (WDQS) – run SPARQL Queries upon Wikidata. Run, test, download the data as json, csv or tsv. Has advanced user-friendly features such as : word hovering too see a term's meaning, code optimization, etc.
- ↑ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 LinguaLibre Query Service (LLQS) – run SPARQL Queries upon LinguaLibre. Run, test, download the data as json, csv or tsv.