Help

Difference between revisions of "SPARQL (intermediate)"

Help:SPARQL 2 explores federated queries that fetch data from both the LinguaLibre and Wikidata endpoints. They let you augment your data with Wikidata-provided language populations, statuses and countries, as well as speakers' geocoordinates, country of origin, etc.

 
(102 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{#Subtitle:'''Help:SPARQL 2''' will explore federated queries fetching data from both the LinguaLibre and Wikidata endpoints, then Wikidata Lexemes, an emerging source of lexicographic data. The duo can be a solid combo to provide lexicographic and multimedia data (audio recordings and images) to either Wikimedia modules or web developers.}}
+
{{#Subtitle:'''Help:SPARQL 2''' explores federated queries that fetch data from both the LinguaLibre and Wikidata endpoints. They let you augment your data with Wikidata-provided language populations, statuses and countries, as well as speakers' geocoordinates, country of origin, etc.}}
  
 
{{draft}}
 
{{draft}}
 
== Tools ==
 
== Tools ==
=== Lexemes Queries Generator ===
 
{{LexemeQueriesGenerator}}
 
 
 
=== SPARQL to persistent data ===
 
=== SPARQL to persistent data ===
 
''Some SPARQL queries are useful but heavy and slow. This administrator tool stores or updates the response data on LinguaLibre, within a wiki page. Stored data can then be loaded in under 0.1 second. Several datasets can also be merged on a common property, if one exists.''
 
''Some SPARQL queries are useful but heavy and slow. This administrator tool stores or updates the response data on LinguaLibre, within a wiki page. Stored data can then be loaded in under 0.1 second. Several datasets can also be merged on a common property, if one exists.''
 
{{Sparql2data}}
 
{{Sparql2data}}
  
=== Federated queries ===
+
== Federated queries==
 
* To query Lingualibre from Wikidata, use <code><nowiki>SERVICE <https://lingualibre.org/sparql></nowiki></code>.
 
* To query Lingualibre from Wikidata, use <code><nowiki>SERVICE <https://lingualibre.org/sparql></nowiki></code>.
 
* To query Wikidata from LinguaLibre, use <code><nowiki>SERVICE <https://query.wikidata.org/sparql></nowiki></code>.
 
* To query Wikidata from LinguaLibre, use <code><nowiki>SERVICE <https://query.wikidata.org/sparql></nowiki></code>.
 +
* To query Commons from Lingualibre, use <code><nowiki>SERVICE <https://commons-query.wikimedia.org/sparql></nowiki></code>.
  
==== Retrieve data of LinguaLibre from Wikidata ====
+
=== Retrieve data of LinguaLibre from Wikidata ===
 +
To run on WDQS.<ref name="WDQS">[https://query.wikidata.org <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [https://query.wikidata.org Wikidata Query Service (WDQS)] – run SPARQL queries against Wikidata. Run, test, and download the data as JSON, CSV or TSV. Offers user-friendly features such as hovering over a term to see its meaning, code optimization, etc.</ref> This query lists the existing levels in LinguaLibre.
 +
 
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Wikidata
 +
PREFIX prop: <https://lingualibre.org/prop/direct/>
 +
PREFIX entity: <https://lingualibre.org/entity/>
  
The following query shows a simple example of retrieving data of LinguaLibre from [https://query.wikidata.org/ Wikidata Query Service]. It lists the existing levels in LinguaLibre.
+
SELECT * {
 +
  SERVICE <https://lingualibre.org/sparql> {
 +
    SELECT
 +
      ?item
 +
      ?itemLabel
 +
    {
 +
      ?item prop:P2 entity:Q5.
 +
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
 +
    }
 +
  }
 +
}
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
  
<syntaxhighlight lang="sparql">
+
#defaultEndpoint:Wikidata
 
PREFIX prop: <https://lingualibre.org/prop/direct/>
 
PREFIX prop: <https://lingualibre.org/prop/direct/>
 
PREFIX entity: <https://lingualibre.org/entity/>
 
PREFIX entity: <https://lingualibre.org/entity/>
Line 33: Line 53:
 
   }
 
   }
 
}
 
}
 +
</query>
 +
|}
 +
 +
=== From LinguaLibre Query service, retrieve Wikidata's data ===
 +
:''To run on LLQS.<ref name="LLQS">[{{SERVER}}/bigdata/#query <span class="mw-ui-button mw-ui-progressive" role="button" aria-disabled="false">Endpoint</span>] [{{SERVER}}/bigdata/#query LinguaLibre Query Service (LLQS)] – run SPARQL queries against LinguaLibre. Run, test, and download the data as JSON, CSV or TSV.</ref>''
 +
:''Federated SPARQL query example still to be written.''
 +
 +
<syntaxhighlight lang="sparql">
 +
# [Example to complete]
 
</syntaxhighlight>
 
</syntaxhighlight>
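
:''Pending a reviewed example, here is a minimal, untested sketch of such a federated query, meant to run on LLQS. It assumes that `wikidata` (P12) stores a bare QID string on LinguaLibre language items, as the dead-language query further down this page suggests. `ISO 639-3 code` (P220) is picked purely for illustration, and the variable names are hypothetical.''

<syntaxhighlight lang="sparql">
#defaultEndpoint:Lingualibre
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>

SELECT ?language ?qid ?iso639_3 WHERE {
  ?language prop:P2 entity:Q4 ;   # Filter: P2 'instance of' is Q4 'language'
            prop:P12 ?qid .       # Assumption: P12 'wikidata' holds a bare QID string
  # Rebuild the full Wikidata entity IRI from the QID string
  BIND(IRI(CONCAT("http://www.wikidata.org/entity/", STR(?qid))) AS ?wikidataItem)
  # Ask the Wikidata endpoint for the matching ISO 639-3 code
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataItem wdt:P220 ?iso639_3 .
  }
}
LIMIT 50
</syntaxhighlight>

:''The `BIND` step is the reverse of the `REPLACE(STR(…), '.*/', '')` trick used in the dead-language query: instead of stripping the IRI down to a QID, it builds the IRI from the QID so the two endpoints can be joined on the same variable.''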
  
=== Notable elements ===
+
== Notable elements ==
{|class="wikitable"
+
{| class="wikitable"
!width=50%| LinguaLibre endpoint  
+
|- style="background:#CCC"
!width=50%| Wikidata endpoint
+
! LinguaLibre endpoint  
|-
+
!colspan=2| Wikidata endpoint
 +
|- style="vertical-align:top"
 
|
 
|
 
* `instance of` [[Property:P2|P2]] :
 
* `instance of` [[Property:P2|P2]] :
** `record` [[Q2]]
+
** is `record` [[Q2]]
** `speaker` [[Q3]]
+
** is `speaker` [[Q3]]
** `language` [[Q4]]
+
** is `language` [[Q4]]
 
* `language` [[Property:P4|P4]]
 
* `language` [[Property:P4|P4]]
 
* `speaker` [[Property:P5|P5]]
 
* `speaker` [[Property:P5|P5]]
Line 50: Line 80:
 
* `wikidata` [[Property:P12|P12]]
 
* `wikidata` [[Property:P12|P12]]
 
* `iso` [[Property:P13|P13]]
 
* `iso` [[Property:P13|P13]]
 +
* `media type` [[Property:P24|P24]]
 +
** media type [[Q88888]]
 +
** is `audio` [[Q88889]]
 +
** is `video` [[Q88890]]
 +
** is `written` [[Q1087276]]
 
|
 
|
 
+
For languages:
 
* `instance of` [[:d:P:P31|P31]]/[[:d:P:P279|P279]]*
 
* `instance of` [[:d:P:P31|P31]]/[[:d:P:P279|P279]]*
** `language` [[:d:Q34770]] (ethnic based) , [[:d:Q315]] (capacity)
+
** is `language` [[:d:Q34770]] (ethnic based) , [[:d:Q315]] (capacity)
** `dead language` [[d:Q45762]] (no community)
+
** is `sign language` [[:d:Q34228]]
** `extinct language` [[:d:Q38058796]] (no speaker)
+
** is `endangered language` [[:d:Q335214]]
 +
** is `severely endangered language` [[:d:Q83365366]]
 +
** is `dead language` [[d:Q45762]] (no community)
 +
** is `extinct language` [[:d:Q38058796]] (no speaker)
 +
* `ISO 639-1 code` [[:d:P:P218|P218]]
 +
* `ISO 639-2 code` [[:d:P:P219|P219]]
 +
* `ISO 639-3 code` [[:d:P:P220|P220]]
 +
* `IETF language tag` [[:d:P:P305|P305]]
 +
* `geographic coordinate` [[:d:P:P625|P625]]
 +
* `number of speakers` [[:d:P:P1098|P1098]]
 +
* `wikimedia code` [[:d:P:P424|P424]]
 +
* `native name` [[:d:P:P1705|P1705]]
 +
* `lingualibre ID` [[:d:P:P10369|P10369]]
 +
||
 +
For countries:
 
* `country` [[:d:P:P17|P17]]
 
* `country` [[:d:P:P17|P17]]
 +
** is `country` [[:d:Q6256]]
 +
** is `sovereign state` [[:d:Q3624078]]
 +
* `ISO 3166-1 alpha-3 code` [[:d:P:P298|P298]]
 
* `continent` [[:d:P:P30|P30]]
 
* `continent` [[:d:P:P30|P30]]
* `iso` [[:d:P:P298|P298]]
+
* `official language` [[:d:P:P37|P37]]
* `geographic coordinate` [[:d:P:P625|P625]]
+
For places:
* `number of speakers` [[:d:P:P1098|P1098]]
+
* `located in entity of level n` [[:d:P:P131|P131]]
 +
** `administrative territorial entity of a specific level` [[:d:Q1799794]]
 +
*** (0th level = country)
 +
*** `1st-level administrative country subdivision` [[:d:Q10864048]]
 +
*** `2nd-level administrative country subdivision` [[:d:Q13220204]]
 +
*** `3rd-level administrative country subdivision` [[:d:Q13221722]]
 +
*** `4th-level administrative country subdivision` [[:d:Q14757767]]
 +
*** `5th-level administrative country subdivision` [[:d:Q15640612]]
 +
*** `6th-level administrative country subdivision` [[:d:Q22927291]]
 +
 
 
|}
 
|}
  
 
== Languages ==
 
== Languages ==
=== ✅ Language () → List of LL languages with wd speaker population ===
+
=== ✅ Language ([[Q930]] Gascon) → List of records in this language ===
 +
To run on WDQS.<ref name="WDQS" />
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Wikidata
 +
PREFIX prop: <https://lingualibre.org/prop/direct/>
 +
PREFIX entity: <https://lingualibre.org/entity/>
 +
SELECT ?writing WHERE {
 +
  SERVICE <https://lingualibre.org/sparql> {
 +
    SELECT ?writing WHERE {
 +
      ?record prop:P2 entity:Q2;
 +
        prop:P4 entity:Q930, ?language;
 +
        prop:P3 ?url;
 +
        prop:P7 ?writing.
 +
    # FILTER(CONTAINS(STR(?audio), "LL-Q35735")) # occitan
 +
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
    }
 +
  }
 +
}
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Wikidata
 +
PREFIX prop: <https://lingualibre.org/prop/direct/>
 +
PREFIX entity: <https://lingualibre.org/entity/>
 +
SELECT ?writing WHERE {
 +
  SERVICE <https://lingualibre.org/sparql> {
 +
    SELECT ?writing WHERE {
 +
      ?record prop:P2 entity:Q2;
 +
        prop:P4 entity:Q930, ?language;
 +
        prop:P3 ?url;
 +
        prop:P7 ?writing.
 +
    # FILTER(CONTAINS(STR(?audio), "LL-Q35735")) # occitan
 +
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
    }
 +
  }
 +
}
 +
</query>
 +
|}
  
=== ✅ Languages () → List of LL languages with wikidata dead or extinct status ===
+
=== ✅ Language () → List of WD languages with speaker population >80M ===
It obtains the list of dead and extinct languages from Wikidata. This query is expected to be run in Lingua Libre SPARQL endpoint. It shouldn't be run in the SPARQL endpoint of Wikidata or Wikidata Query Service.
+
To run on WDQS.<ref name="WDQS" />
 
{| style="width:100%"  
 
{| style="width:100%"  
 
|- style="vertical-align:top;"
 
|- style="vertical-align:top;"
 
|style="padding: 0 3em;width:60%"|
 
|style="padding: 0 3em;width:60%"|
 
<syntaxhighlight lang="sparql">
 
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Wikidata
 +
SELECT DISTINCT ?item ?ISO ?itemLabel WHERE {
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
 +
  ?item p:P1098 ?statement0.
 +
  ?item p:P219 ?isolang.
 +
  ?statement0 (psv:P1098/wikibase:quantityAmount) ?numericQuantity.
 +
  FILTER(?numericQuantity > "80000000"^^xsd:decimal)
 +
  MINUS {
 +
    ?item p:P31 ?statement1.
 +
    ?statement1 (ps:P31/(wdt:P279*)) wd:Q25295.
 +
  }
 +
  OPTIONAL { ?item wdt:P218 ?ISO. }
 +
}
 +
ORDER BY ASC (?ISO)
 +
LIMIT 100
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Wikidata
 +
SELECT DISTINCT ?item ?ISO ?itemLabel WHERE {
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
 +
  ?item p:P1098 ?statement0.
 +
  ?item p:P219 ?isolang.
 +
  ?statement0 (psv:P1098/wikibase:quantityAmount) ?numericQuantity.
 +
  FILTER(?numericQuantity > "80000000"^^xsd:decimal)
 +
  MINUS {
 +
    ?item p:P31 ?statement1.
 +
    ?statement1 (ps:P31/(wdt:P279*)) wd:Q25295.
 +
  }
 +
  OPTIONAL { ?item wdt:P218 ?ISO. }
 +
}
 +
ORDER BY ASC (?ISO)
 +
LIMIT 100
 +
</query>
 +
|}
  
 +
=== Language () → List of LL languages with wd speaker population ===
 +
:''Query to create.''
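
:''Until a reviewed query is available, the following untested sketch shows one way this could be approached on LLQS. It assumes that `wikidata` (P12) stores a bare QID string on LinguaLibre language items, and it keeps the largest `number of speakers` (P1098) value when Wikidata holds several. Variable names are hypothetical.''

<syntaxhighlight lang="sparql">
#defaultEndpoint:Lingualibre
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>

SELECT ?language ?qid (MAX(?speakers) AS ?maxSpeakers) WHERE {
  ?language prop:P2 entity:Q4 ;   # Filter: P2 'instance of' is Q4 'language'
            prop:P12 ?qid .       # Assumption: P12 'wikidata' holds a bare QID string
  # Rebuild the full Wikidata entity IRI from the QID string
  BIND(IRI(CONCAT("http://www.wikidata.org/entity/", STR(?qid))) AS ?wikidataItem)
  # Fetch the speaker population from Wikidata
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataItem wdt:P1098 ?speakers .
  }
}
GROUP BY ?language ?qid
ORDER BY DESC(?maxSpeakers)
</syntaxhighlight>

:''Languages without a P1098 statement on Wikidata are dropped by this sketch. Wrapping the Wikidata triple in `OPTIONAL` would keep them, at the cost of empty population cells.''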
 +
 +
===✅ Languages → List of Sign languages on wikidata ===
 +
To run on WDQS.<ref name="WDQS" />
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Wikidata
 +
SELECT DISTINCT ?item ?itemLabel
 +
WHERE {
 +
  ?item p:P31 ?statement0.
 +
  ?statement0 (ps:P31/(wdt:P279*)) wd:Q34228.
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
 +
}
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Wikidata
 +
SELECT DISTINCT ?item ?itemLabel
 +
WHERE {
 +
  ?item p:P31 ?statement0.
 +
  ?statement0 (ps:P31/(wdt:P279*)) wd:Q34228.
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
 +
}
 +
</query>
 +
|}
 +
 +
===✅ Languages → List of whistled languages on wikidata ===
 +
To run on WDQS.<ref name="WDQS" />
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Wikidata
 +
SELECT ?whistled_language ?whistled_languageLabel
 +
WHERE {
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
 +
  ?whistled_language wdt:P31 wd:Q2737212. # Is whistled language (Q2737212)
 +
}
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Wikidata
 +
SELECT ?whistled_language ?whistled_languageLabel
 +
WHERE {
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
 +
  ?whistled_language wdt:P31 wd:Q2737212. # Is whistled language (Q2737212)
 +
}
 +
</query>
 +
|}
 +
 +
=== ✅ Languages → List of LL languages with wikidata dead or extinct status ===
 +
To run on LLQS.<ref name="LLQS" />
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Lingualibre
 
SELECT
 
SELECT
 
   ?deadLanguageLinguaLibre
 
   ?deadLanguageLinguaLibre
 
   ?deadLanguageLinguaLibreLabel
 
   ?deadLanguageLinguaLibreLabel
   ?count
+
   ?audios
 +
# List Wikidata dead/extinct languages
 
WITH {
 
WITH {
 
   SELECT DISTINCT ?deadLanguage {
 
   SELECT DISTINCT ?deadLanguage {
Line 86: Line 284:
 
   }
 
   }
 
} AS %deadLanguage
 
} AS %deadLanguage
 +
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
 
WITH {
 
WITH {
 
   SELECT ?deadLanguageLinguaLibre {
 
   SELECT ?deadLanguageLinguaLibre {
 
     INCLUDE %deadLanguage.
 
     INCLUDE %deadLanguage.
 
 
     BIND(REPLACE(STR(?deadLanguage), '.*/', '') AS ?deadLanguageQid)
 
     BIND(REPLACE(STR(?deadLanguage), '.*/', '') AS ?deadLanguageQid)
 
 
     ?deadLanguageLinguaLibre
 
     ?deadLanguageLinguaLibre
 
       prop:P2 entity:Q4;
 
       prop:P2 entity:Q4;
Line 97: Line 294:
 
   }
 
   }
 
} AS %deadLanguageLinguaLibre
 
} AS %deadLanguageLinguaLibre
 +
# For those Lingualibre languages, count audios into ?audios
 
WITH {
 
WITH {
 
   SELECT
 
   SELECT
 
     ?deadLanguageLinguaLibre
 
     ?deadLanguageLinguaLibre
     (COUNT(*) AS ?count)
+
     (COUNT(?audio) AS ?audios)
 
   {
 
   {
 
     INCLUDE %deadLanguageLinguaLibre.
 
     INCLUDE %deadLanguageLinguaLibre.
 
 
     ?audio
 
     ?audio
 
       prop:P2 entity:Q2;
 
       prop:P2 entity:Q2;
Line 109: Line 306:
 
   }
 
   }
 
   GROUP BY ?deadLanguageLinguaLibre
 
   GROUP BY ?deadLanguageLinguaLibre
} AS %count
+
} AS %audios
 +
# Main query: join the named subqueries defined above
 
{
 
{
 
   INCLUDE %deadLanguageLinguaLibre.
 
   INCLUDE %deadLanguageLinguaLibre.
   OPTIONAL{INCLUDE %count.}
+
   OPTIONAL{ INCLUDE %audios. }
 
   SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 
   SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 
}
 
}
ORDER BY DESC(?count)
+
ORDER BY DESC(?audios)
 
</syntaxhighlight>
 
</syntaxhighlight>
 
||
 
||
 
<query _pagination="10">
 
<query _pagination="10">
 +
#defaultEndpoint:Lingualibre
 
SELECT
 
SELECT
 
   ?deadLanguageLinguaLibre
 
   ?deadLanguageLinguaLibre
 
   ?deadLanguageLinguaLibreLabel
 
   ?deadLanguageLinguaLibreLabel
   ?count
+
   ?audios
 +
# List Wikidata dead/extinct languages
 
WITH {
 
WITH {
 
   SELECT DISTINCT ?deadLanguage {
 
   SELECT DISTINCT ?deadLanguage {
Line 132: Line 332:
 
   }
 
   }
 
} AS %deadLanguage
 
} AS %deadLanguage
 +
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
 
WITH {
 
WITH {
 
   SELECT ?deadLanguageLinguaLibre {
 
   SELECT ?deadLanguageLinguaLibre {
 
     INCLUDE %deadLanguage.
 
     INCLUDE %deadLanguage.
 
 
     BIND(REPLACE(STR(?deadLanguage), '.*/', '') AS ?deadLanguageQid)
 
     BIND(REPLACE(STR(?deadLanguage), '.*/', '') AS ?deadLanguageQid)
 
 
     ?deadLanguageLinguaLibre
 
     ?deadLanguageLinguaLibre
 
       prop:P2 entity:Q4;
 
       prop:P2 entity:Q4;
Line 143: Line 342:
 
   }
 
   }
 
} AS %deadLanguageLinguaLibre
 
} AS %deadLanguageLinguaLibre
 +
# For those Lingualibre languages, count audios into ?audios
 
WITH {
 
WITH {
 
   SELECT
 
   SELECT
 
     ?deadLanguageLinguaLibre
 
     ?deadLanguageLinguaLibre
     (COUNT(*) AS ?count)
+
     (COUNT(?audio) AS ?audios)
 
   {
 
   {
 
     INCLUDE %deadLanguageLinguaLibre.
 
     INCLUDE %deadLanguageLinguaLibre.
 
 
     ?audio
 
     ?audio
 
       prop:P2 entity:Q2;
 
       prop:P2 entity:Q2;
Line 155: Line 354:
 
   }
 
   }
 
   GROUP BY ?deadLanguageLinguaLibre
 
   GROUP BY ?deadLanguageLinguaLibre
} AS %count
+
} AS %audios
 +
# Main query: join the named subqueries defined above
 
{
 
{
 
   INCLUDE %deadLanguageLinguaLibre.
 
   INCLUDE %deadLanguageLinguaLibre.
   OPTIONAL{INCLUDE %count.}
+
   OPTIONAL{ INCLUDE %audios. }
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
}
 +
ORDER BY DESC(?audios)
 +
</query>
 +
|}
 +
 
 +
=== ✅ Languages → List of LL languages with wikidata Endangered language status ===
 +
To run on LLQS.<ref name="LLQS" />
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Lingualibre
 +
SELECT
 +
  ?endangeredLanguageLinguaLibre
 +
  ?endangeredLanguageLinguaLibreLabel
 +
  ?audios
 +
# List Wikidata endangered languages
 +
WITH {
 +
  SELECT DISTINCT ?endangeredLanguage {
 +
    SERVICE <https://query.wikidata.org/sparql> {
 +
      { ?endangeredLanguage wdt:P31/wdt:P279* wd:Q335214. }
 +
      UNION
 +
      { ?endangeredLanguage wdt:P31/wdt:P279* wd:Q83365366. }
 +
    }
 +
  }
 +
} AS %endangeredLanguage
 +
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
 +
WITH {
 +
  SELECT ?endangeredLanguageLinguaLibre {
 +
    INCLUDE %endangeredLanguage.
 +
    BIND(REPLACE(STR(?endangeredLanguage), '.*/', '') AS ?endangeredLanguageQid)
 +
    ?endangeredLanguageLinguaLibre
 +
      prop:P2 entity:Q4;
 +
      prop:P12 ?endangeredLanguageQid.
 +
  }
 +
} AS %endangeredLanguageLinguaLibre
 +
# For those Lingualibre languages, count audios into ?audios
 +
WITH {
 +
  SELECT
 +
    ?endangeredLanguageLinguaLibre
 +
    (COUNT(?audio) AS ?audios)
 +
  {
 +
    INCLUDE %endangeredLanguageLinguaLibre.
 +
    ?audio
 +
      prop:P2 entity:Q2;
 +
      prop:P4 ?endangeredLanguageLinguaLibre.
 +
  }
 +
  GROUP BY ?endangeredLanguageLinguaLibre
 +
} AS %audios
 +
# Main query: join the named subqueries defined above
 +
{
 +
  INCLUDE %endangeredLanguageLinguaLibre.
 +
  OPTIONAL{ INCLUDE %audios. }
 
   SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 
   SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 
}
 
}
ORDER BY DESC(?count)
+
ORDER BY DESC(?audios)
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Lingualibre
 +
SELECT
 +
  ?endangeredLanguageLinguaLibre
 +
  ?endangeredLanguageLinguaLibreLabel
 +
  ?audios
 +
# List Wikidata endangered languages
 +
WITH {
 +
  SELECT DISTINCT ?endangeredLanguage {
 +
    SERVICE <https://query.wikidata.org/sparql> {
 +
      { ?endangeredLanguage wdt:P31/wdt:P279* wd:Q335214. }
 +
      UNION
 +
      { ?endangeredLanguage wdt:P31/wdt:P279* wd:Q83365366. }
 +
    }
 +
  }
 +
} AS %endangeredLanguage
 +
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
 +
WITH {
 +
  SELECT ?endangeredLanguageLinguaLibre {
 +
    INCLUDE %endangeredLanguage.
 +
    BIND(REPLACE(STR(?endangeredLanguage), '.*/', '') AS ?endangeredLanguageQid)
 +
    ?endangeredLanguageLinguaLibre
 +
      prop:P2 entity:Q4;
 +
      prop:P12 ?endangeredLanguageQid.
 +
  }
 +
} AS %endangeredLanguageLinguaLibre
 +
# For those Lingualibre languages, count audios into ?audios
 +
WITH {
 +
  SELECT
 +
    ?endangeredLanguageLinguaLibre
 +
    (COUNT(?audio) AS ?audios)
 +
  {
 +
    INCLUDE %endangeredLanguageLinguaLibre.
 +
    ?audio
 +
      prop:P2 entity:Q2;
 +
      prop:P4 ?endangeredLanguageLinguaLibre.
 +
  }
 +
  GROUP BY ?endangeredLanguageLinguaLibre
 +
} AS %audios
 +
# Main query: join the named subqueries defined above
 +
{
 +
  INCLUDE %endangeredLanguageLinguaLibre.
 +
  OPTIONAL{ INCLUDE %audios. }
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
}
 +
ORDER BY DESC(?audios)
 
</query>
 
</query>
 
|}
 
|}
  
=== ❌ Language ([[Q34]]) Wikidata Qid(s) → Geo-coordinates ===
+
=== ✅ Languages → List of LL languages with wikidata Sign language status ===
 +
To run on LLQS.<ref name="LLQS" />
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Lingualibre
 +
SELECT
 +
  ?signLanguageLinguaLibre
 +
  ?signLanguageLinguaLibreLabel
 +
  ?audios
 +
# List Wikidata sign languages
 +
WITH {
 +
  SELECT DISTINCT ?signLanguage {
 +
    SERVICE <https://query.wikidata.org/sparql> {
 +
      { ?signLanguage wdt:P31/wdt:P279* wd:Q34228. }
 +
    }
 +
  }
 +
} AS %signLanguage
 +
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
 +
WITH {
 +
  SELECT ?signLanguageLinguaLibre {
 +
    INCLUDE %signLanguage.
 +
    BIND(REPLACE(STR(?signLanguage), '.*/', '') AS ?signLanguageQid)
 +
    ?signLanguageLinguaLibre
 +
      prop:P2 entity:Q4;
 +
      prop:P12 ?signLanguageQid.
 +
  }
 +
} AS %signLanguageLinguaLibre
 +
# For those Lingualibre languages, count audios into ?audios
 +
WITH {
 +
  SELECT
 +
    ?signLanguageLinguaLibre
 +
    (COUNT(?audio) AS ?audios)
 +
  {
 +
    INCLUDE %signLanguageLinguaLibre.
 +
    ?audio
 +
      prop:P2 entity:Q2;
 +
      prop:P4 ?signLanguageLinguaLibre.
 +
  }
 +
  GROUP BY ?signLanguageLinguaLibre
 +
} AS %audios
 +
# Main query: join the named subqueries defined above
 +
{
 +
  INCLUDE %signLanguageLinguaLibre.
 +
  OPTIONAL{ INCLUDE %audios. }
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
}
 +
ORDER BY DESC(?audios)
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Lingualibre
 +
SELECT
 +
  ?signLanguageLinguaLibre
 +
  ?signLanguageLinguaLibreLabel
 +
  ?audios
 +
# List Wikidata sign languages
 +
WITH {
 +
  SELECT DISTINCT ?signLanguage {
 +
    SERVICE <https://query.wikidata.org/sparql> {
 +
      { ?signLanguage wdt:P31/wdt:P279* wd:Q34228. }
 +
    }
 +
  }
 +
} AS %signLanguage
 +
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
 +
WITH {
 +
  SELECT ?signLanguageLinguaLibre {
 +
    INCLUDE %signLanguage.
 +
    BIND(REPLACE(STR(?signLanguage), '.*/', '') AS ?signLanguageQid)
 +
    ?signLanguageLinguaLibre
 +
      prop:P2 entity:Q4;
 +
      prop:P12 ?signLanguageQid.
 +
  }
 +
} AS %signLanguageLinguaLibre
 +
# For those Lingualibre languages, count audios into ?audios
 +
WITH {
 +
  SELECT
 +
    ?signLanguageLinguaLibre
 +
    (COUNT(?audio) AS ?audios)
 +
  {
 +
    INCLUDE %signLanguageLinguaLibre.
 +
    ?audio
 +
      prop:P2 entity:Q2;
 +
      prop:P4 ?signLanguageLinguaLibre.
 +
  }
 +
  GROUP BY ?signLanguageLinguaLibre
 +
} AS %audios
 +
# Main query: join the named subqueries defined above
 +
{
 +
  INCLUDE %signLanguageLinguaLibre.
 +
  OPTIONAL{ INCLUDE %audios. }
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
}
 +
ORDER BY DESC(?audios)
 +
</query>
 +
|}
  
 +
=== Language ([[Q34]]) → Item with Wikidata Qid(s), optional Geo-coordinates ===
 +
To run on LLQS.<ref name="LLQS" />
 
{| style="width:100%"  
 
{| style="width:100%"  
 
|- style="vertical-align:top;"
 
|- style="vertical-align:top;"
 
|style="padding: 0 3em;width:60%"|
 
|style="padding: 0 3em;width:60%"|
 
<syntaxhighlight lang="sparql">
 
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Lingualibre
 
PREFIX wd: <http://www.wikidata.org/entity/>
 
PREFIX wd: <http://www.wikidata.org/entity/>
 
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
 
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
Line 177: Line 578:
 
PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/>
 
PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/>
  
select distinct ?record ?transcription ?languageLabel ?wdQid ?wdQidLabel ?wdLabel ?coord
+
select distinct ?record ?transcription ?languageLabel ?wdQid (?wdLabel as ?wdLabelEN) ?coord
 
where {
 
where {
 
   ?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record'
 
   ?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record'
Line 183: Line 584:
 
   ?record llt:P4 ?language .      # Assign value: record's P4 'language' to variable ?language
 
   ?record llt:P4 ?language .      # Assign value: record's P4 'language' to variable ?language
 
   ?record llt:P7 ?transcription .  # Assign value: record's P7 'transcription' to variable ?transcription
 
   ?record llt:P7 ?transcription .  # Assign value: record's P7 'transcription' to variable ?transcription
   ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wikidataItem
+
   ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wdQid
 
    
 
    
 
   SERVICE <https://query.wikidata.org/sparql> {
 
   SERVICE <https://query.wikidata.org/sparql> {
Line 192: Line 593:
 
     }
 
     }
 
   }
 
   }
 
+
   SERVICE wikibase:label {  
   SERVICE wikibase:label {
 
 
     bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
 
     bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
 
   }  
 
   }  
Line 199: Line 599:
 
</syntaxhighlight>
 
</syntaxhighlight>
 
|-
 
|-
| Result type:<br>
+
|
<pre>
+
<query _pagination="10">
record transcription languageLabel wdQid wdQidLabel wdLabel coord
+
#defaultEndpoint:Lingualibre
Q196212 Tathavade Marathi Q2719024 Q2719024 Tathavade Point(73.74 18.62)
+
PREFIX wd: <http://www.wikidata.org/entity/>
Q428904 Jambavade Marathi Q24894740 Q24894740 Jambavade Point(73.85 18.51)
+
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
Q428900 Dhangavhan Marathi Q24885008 Q24885008 Dhangavhan Point(73.85 18.52)
+
PREFIX ll: <https://lingualibre.org/entity/>  
+
PREFIX llt: <https://lingualibre.org/prop/direct/>
</pre>
+
PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/>
 +
 
 +
select distinct ?record ?transcription ?languageLabel ?wdQid (?wdLabel as ?wdLabelEN) ?coord
 +
where {
 +
  ?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record'
 +
  ?record llt:P4 ll:Q34 .          # Filter: record's P4 'language' is Q34 'Marathi'
 +
  ?record llt:P4 ?language .      # Assign value: record's P4 'language' to variable ?language
 +
  ?record llt:P7 ?transcription .  # Assign value: record's P7 'transcription' to variable ?transcription
 +
  ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wdQid
 +
 
 +
  SERVICE <https://query.wikidata.org/sparql> {
 +
    OPTIONAL { ?wdQid wdt:P625 ?coord . } # Assign value: Wikidata item's wdt:P625 'coordinates' to variable ?coord
 +
    OPTIONAL {
 +
      ?wdQid rdfs:label ?wdLabel . # Assign value: Wikidata item's label to variable ?wdLabel
 +
      FILTER (LANG(?wdLabel) = "en") . # Filter: keep only the English label
 +
    }
 +
  }
 +
  SERVICE wikibase:label {
 +
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
 +
  }
 +
}
 +
</query>
 
|}
 
|}
  
== Lexemes ==
+
=== ✅ Languages → Languages with gender and recordings counts ===
=== ✅ Language ([[:d:Q12107]]) → List of wd lexemes ===
+
To run on LLQS.<ref name="LLQS" />
Example : Q12107 breton.
+
{| style="width:100%"
=== ✅ Language () → List of wd lexemes with LL audio ===
+
|- style="vertical-align:top;"
=== ✅ Language () → List of wd lexemes with LL audio and wd translation ([[:d:Q150]]) ===
+
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
#defaultEndpoint:Lingualibre
 +
SELECT ?languageLabel ?wikidata ?iso
 +
  ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
 +
  (ROUND(1000*?femalesRecords/(?femalesRecords+?malesRecords))/10 AS ?percent)
 +
WITH {
 +
  SELECT ?language ?languageLabel ?wikidata ?iso {
 +
      ?record prop:P2 entity:Q2 .    # Filter: P2 'instance of' is Q2 'record'
 +
      ?record prop:P4 ?language .    # Assign value: P4 'language' into ?language
 +
      ?language prop:P12 ?wikidata .  # Assign value: P12 'wikidata id' into ?wikidata
 +
      OPTIONAL { ?language prop:P13 ?iso . } # Assign value: P13 'iso639-3' into ?iso
 +
}
 +
GROUP BY ?language ?languageLabel ?wikidata ?iso
 +
} AS %base
 +
WITH {
 +
  SELECT ?language ?languageLabel ?iso ?genderLabel
 +
    (COUNT(DISTINCT ?females) AS ?femalesSpeakers)  
 +
    (COUNT(DISTINCT ?record) AS ?femalesRecords) {
 +
  INCLUDE %base
 +
  ?record prop:P4 ?language ; # Filter
 +
          prop:P5 ?females . # Assign value: P5 'speaker' into ?females
 +
  ?females prop:P8 entity:Q17 ;  # Filter
 +
          prop:P8 ?gender . # Assign value: P8 'gender' into ?gender
 +
    } GROUP BY ?language ?languageLabel ?iso ?genderLabel
 +
} AS %females
  
=== ✅ Language () → List of wd lexemes ([[:d:Q150]]) ===
+
WITH {
:''Strange query from [[User:VIGNERON/common.js]]''
+
  SELECT ?language ?languageLabel ?iso ?genderLabel
 +
    (COUNT(DISTINCT ?males) AS ?malesSpeakers)
 +
    (COUNT(DISTINCT ?record) AS ?malesRecords) {
 +
  INCLUDE %base
 +
  ?record prop:P4 ?language ;
 +
          prop:P5 ?males . # Assign value: P5 'speaker' into ?males
 +
  ?males prop:P8 entity:Q16 ;
 +
          prop:P8 ?gender .
 +
    } GROUP BY ?language ?languageLabel ?iso ?genderLabel
 +
} AS %males
 +
{
 +
  INCLUDE %base
 +
  INCLUDE %females
 +
  INCLUDE %males
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
}
 +
GROUP BY ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
 +
ORDER BY ASC(?languageLabel )
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="10">
 +
#defaultEndpoint:Lingualibre
 +
SELECT ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
 +
    (ROUND(1000*?femalesRecords/(?femalesRecords+?malesRecords))/10 AS ?percent)
 +
WITH {
 +
  SELECT ?language ?languageLabel ?wikidata ?iso {
 +
      ?record prop:P2 entity:Q2 .    # Filter: P2 'instance of' is Q2 'record'
 +
      ?record prop:P4 ?language .    # Assign value: P4 'language' into ?language
 +
      ?language prop:P12 ?wikidata .  # Assign value: P12 'wikidata id' into ?wikidata
 +
      OPTIONAL { ?language prop:P13 ?iso . } # Assign value: P13 'iso639-3' into ?iso
 +
}
 +
GROUP BY ?language ?languageLabel ?wikidata ?iso
 +
} AS %base
 +
WITH {
 +
  SELECT ?language ?languageLabel ?iso ?genderLabel
 +
    (COUNT(DISTINCT ?females) AS ?femalesSpeakers)
 +
    (COUNT(DISTINCT ?record) AS ?femalesRecords) {
 +
  INCLUDE %base
 +
  ?record prop:P4 ?language ; # Filter
 +
          prop:P5 ?females . # Assign value: P5 'speaker' into ?females
 +
  ?females prop:P8 entity:Q17 ;  # Filter
 +
          prop:P8 ?gender . # Assign value: P8 'gender' into ?gender
 +
    } GROUP BY ?language ?languageLabel ?iso ?genderLabel
 +
} AS %females
 +
WITH {
 +
  SELECT ?language ?languageLabel ?iso ?genderLabel
 +
    (COUNT(DISTINCT ?males) AS ?malesSpeakers)
 +
    (COUNT(DISTINCT ?record) AS ?malesRecords) {
 +
  INCLUDE %base
 +
  ?record prop:P4 ?language ;
 +
          prop:P5 ?males . # Assign value: P5 'speaker' into ?males
 +
  ?males prop:P8 entity:Q16 ;
 +
          prop:P8 ?gender .
 +
    } GROUP BY ?language ?languageLabel ?iso ?genderLabel
 +
} AS %males
 +
{
 +
  INCLUDE %base
 +
  INCLUDE %females
 +
  INCLUDE %males
 +
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 +
}
 +
GROUP BY ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
 +
ORDER BY ASC(?languageLabel )
 +
</query>
 +
|}
 +
 
 +
=== [HEAVY] Languages → Languages with gender and recordings counts (2) ===
:''Section needs review.''
To run on LLQS.<ref name="LLQS" />
{| style="width:100%"
|- style="vertical-align:top;"
|style="padding: 0 3em;width:60%"|
<syntaxhighlight lang="sparql">
#defaultEndpoint:Lingualibre
SELECT ?iso
  (?genderLabel as ?Gender)
  (COUNT(DISTINCT ?speakerQid) as ?Speakers)
  (COUNT(DISTINCT ?record) as ?Records)
WHERE {
  ?record prop:P2 entity:Q2 .    # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?language .    # Assign value: P4 'language' into ?language
  # OPTIONAL { ?language prop:P12 ?wikidata }  # Assign value: P12 'wikidata id' into ?wikidata
  OPTIONAL { ?language prop:P13 ?iso } # Assign value: P13 'iso639-3' into ?iso
  ?record prop:P5 ?speakerQid .  # Assign value: P5 'speaker' into ?speakerQid
  ?speakerQid prop:P8 ?gender .  # Assign value: P8 'sex or gender' into ?gender
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
GROUP BY ?iso ?genderLabel
ORDER BY ASC(?iso)
</syntaxhighlight>
|
<query _pagination="10">
#defaultEndpoint:Lingualibre
SELECT ?iso
  ?genderLabel
  (COUNT(DISTINCT ?speakerQid) as ?Speakers)
  (COUNT(DISTINCT ?record) as ?Records)
WHERE {
  ?record prop:P2 entity:Q2 .    # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?language .    # Assign value: P4 'language' into ?language
  # OPTIONAL { ?language prop:P12 ?wikidata }  # Assign value: P12 'wikidata id' into ?wikidata
  OPTIONAL { ?language prop:P13 ?iso } # Assign value: P13 'iso639-3' into ?iso
  ?record prop:P5 ?speakerQid .  # Assign value: P5 'speaker' into ?speakerQid
  ?speakerQid prop:P8 ?gender .  # Assign value: P8 'sex or gender' into ?gender
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
GROUP BY ?iso ?genderLabel
ORDER BY ASC(?iso)
</query>
|}

== Speakers ==
=== ✅ Speakers → Largest number of languages recorded and known ===
To run on LLQS.<ref name="LLQS" />
{| style="width:100%"
|- style="vertical-align:top;"
|style="padding: 0 3em;width:60%"|
<syntaxhighlight lang="sparql">
#defaultEndpoint:Lingualibre
#Title: Speakers with recordings largest number of languages and known languages
SELECT ?speaker ?speakerLabel ?count ?languages
Line 301: Line 833:
||
<query _pagination="5">
#defaultEndpoint:Lingualibre
#Title: Speakers with recordings largest number of languages and known languages
SELECT ?speaker ?speakerLabel ?count ?languages
Line 352: Line 885:
</query>
|}

=== ✅ Speakers → Countries with most speakers ===
To run on LLQS.<ref name="LLQS" /><br>Note: this query collects ''declared'' speakers and does not check for actual recordings.
{| style="width:100%"
|- style="vertical-align:top;"
|style="padding: 0 3em;width:60%"|
<syntaxhighlight lang="sparql">
#defaultEndpoint:Lingualibre
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count)
WITH {
Line 399: Line 933:
||
<query _pagination="20">
#defaultEndpoint:Lingualibre
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count)
WITH {
Line 440: Line 975:

=== <!-- ✅--> Speakers → Map of speakers by place ===
To run on WDQS.<ref name=WDQS />
{| style="width:100%"
|- style="vertical-align:top;"
|style="padding: 0 3em;width:60%"|
<syntaxhighlight lang="sparql">
#defaultEndpoint:Wikidata
#defaultView:Map
PREFIX ll: <https://lingualibre.org/entity/>
Line 470: Line 1,007:
  }
}

</syntaxhighlight>
||
<query _pagination="20">
#defaultEndpoint:Wikidata
#defaultView:Map
PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>

SELECT DISTINCT ?lLabel ?coord WITH {
  SELECT ?lLabel ?loc WHERE {
    SERVICE <https://lingualibre.org/sparql> {
      select DISTINCT ?lLabel ?loc {
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
        ?l llt:P2 ll:Q3 ;
           llt:P14 ?loc .
        ?record llt:P5 ?l.
        FILTER (regex(?loc, '^Q'))
      }
    }
  }
} AS %i
WHERE {
  INCLUDE %i
  BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?loc)) AS ?locURL)
  SERVICE <https://query.wikidata.org/sparql> {
    select * {
      ?locURL wdt:P625 ?coord .
    }
  }
}
</query>
|}

=== <!-- ✅--> Speakers → Map of speakers with recordings ===
To run on WDQS.<ref name=WDQS />
{| style="width:100%"
|- style="vertical-align:top;"
|style="padding: 0 3em;width:60%"|
<syntaxhighlight lang="sparql">
#defaultEndpoint:Wikidata
#defaultView:Map
# Same can apply to languages :
# Q3→Q4  : items speaker and language
# P14→P12 : items WD location and WD language
# P5→P4 : properties speaker and languages
# P625→P1098 : properties location and population

PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>

SELECT DISTINCT ?itemLabel ?wikidata (?info AS ?coordinates) ?records (SAMPLE(?tag) AS ?layer)
# On Lingualibre, get item's info
WITH {
  SELECT *
  WHERE {
    SERVICE <https://lingualibre.org/sparql> {    # Commented on LLQS only
      SELECT DISTINCT ?itemLabel ?wikidata (COUNT(?record) AS ?records)
      {
        ?item llt:P2 ll:Q3 ;
                rdfs:label ?itemLabel;
                llt:P14 ?wikidata .
        ?record llt:P5 ?item .
        FILTER (LANG(?itemLabel)='en')
        FILTER (regex(?wikidata, '^Q'))
      } GROUP BY ?itemLabel ?wikidata
    }          # Commented on LLQS only
  }
} AS %infoWikidataId
# On Wikidata, get the target info
WHERE {
  INCLUDE %infoWikidataId
  BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?wikidata)) AS ?infoURL)
  SERVICE <https://query.wikidata.org/sparql> {
    SELECT * {
      ?infoURL wdt:P625 ?info .
    }
  }
  BIND(
    IF(?records < 10, "<10",
    IF(?records < 1000, "10-1,000",
    IF(?records < 5000, "1k-5k",
    IF(?records < 25000, "5k-25k",
    IF(?records < 50000, "25k-50k",
    ">50k")))))
    AS ?tag) .
}
GROUP BY ?itemLabel ?wikidata ?info ?records
ORDER BY DESC (?records)
</syntaxhighlight>
|
<query _pagination="10">
#defaultEndpoint:Wikidata
#defaultView:Map
# Same can apply to languages :
# Q3→Q4  : items speaker and language
# P14→P12 : items WD location and WD language
# P5→P4 : properties speaker and languages
# P625→P1098 : properties location and population

PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>

SELECT DISTINCT ?itemLabel ?wikidata (?info AS ?coordinates) ?records (SAMPLE(?tag) AS ?layer)
# On Lingualibre, get item's info
WITH {
  SELECT *
  WHERE {
    SERVICE <https://lingualibre.org/sparql> {    # Commented on LLQS only
      SELECT DISTINCT ?itemLabel ?wikidata (COUNT(?record) AS ?records)
      {
        ?item llt:P2 ll:Q3 ;
                rdfs:label ?itemLabel;
                llt:P14 ?wikidata .
        ?record llt:P5 ?item .
        FILTER (LANG(?itemLabel)='en')
        FILTER (regex(?wikidata, '^Q'))
      } GROUP BY ?itemLabel ?wikidata
    }          # Commented on LLQS only
  }
} AS %infoWikidataId
# On Wikidata, get the target info
WHERE {
  INCLUDE %infoWikidataId
  BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?wikidata)) AS ?infoURL)
  SERVICE <https://query.wikidata.org/sparql> {
    SELECT * {
      ?infoURL wdt:P625 ?info .
    }
  }
  BIND(
    IF(?records < 10, "<10",
    IF(?records < 1000, "10-1,000",
    IF(?records < 5000, "1k-5k",
    IF(?records < 25000, "5k-25k",
    IF(?records < 50000, "25k-50k",
    ">50k")))))
    AS ?tag) .
}
GROUP BY ?itemLabel ?wikidata ?info ?records
ORDER BY DESC (?records)
</query>
|}

== References ==
<references />

== See also ==
* [[Help:SPARQL]]
* [https://cradle.toolforge.org/#/subject/affiant Cradle.toolforge.org]
{{Technicals}}
[[Category:Lingua Libre:Help{{#translation:}}]]

Latest revision as of 16:28, 23 April 2024


Draft

This page is a work in progress.

Tools

SPARQL to persistent data

Some SPARQL queries are meaningful but heavy and overly slow. This administrator tool stores or updates the response data on LinguaLibre, within a wikipage. Stored data can then be loaded in <0.1 second. Multiple datasets can also be merged on a shared property, if they have one.


Federated queries

  • To query Lingualibre from Wikidata, use SERVICE <https://lingualibre.org/sparql>.
  • To query Wikidata from LinguaLibre, use SERVICE <https://query.wikidata.org/sparql>.
  • To query Commons from Lingualibre, use SERVICE <https://commons-query.wikimedia.org/sparql>.

Retrieve data of LinguaLibre from Wikidata

To run on WDQS.[1] It lists the existing levels in LinguaLibre.

#defaultEndpoint:Wikidata
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>

SELECT * {
  SERVICE <https://lingualibre.org/sparql> {
    SELECT
      ?item
      ?itemLabel
    {
      ?item prop:P2 entity:Q5.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
  }
}

From LinguaLibre Query service, retrieve Wikidata's data

To run on LLQS.[2]
Federated SPARQL query example to be created.
# [Example to complete]
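
One possible sketch of such a query, modelled on the patterns used in the queries further down this page. LinguaLibre stores the Wikidata id (P12) as a plain string, so the string is first turned into a Wikidata IRI and then resolved on the Wikidata endpoint. The variable names and the LIMIT are illustrative assumptions, and the query has not been benchmarked.

```sparql
#defaultEndpoint:Lingualibre
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?language ?wikidataId ?wikidataLabel
WHERE {
  ?language prop:P2 entity:Q4 ;       # Filter: P2 'instance of' is Q4 'language'
            prop:P12 ?wikidataId .    # Assign value: P12 'wikidata id' (a string) into ?wikidataId
  FILTER(REGEX(?wikidataId, "^Q[0-9]+$"))   # Keep well-formed QIDs only
  BIND(IRI(CONCAT("http://www.wikidata.org/entity/", ?wikidataId)) AS ?wikidataItem)
  # On Wikidata, fetch the English label of the matching item
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataItem rdfs:label ?wikidataLabel .
    FILTER(LANG(?wikidataLabel) = "en")
  }
}
LIMIT 20
```

The same skeleton should carry over to other Wikidata properties by swapping the triple pattern inside the SERVICE block.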

Notable elements

LinguaLibre endpoint Wikidata endpoint

For languages:

For countries:

For places:

  • `located in entity of level n` P:P131
    • `administrative territorial entity of a specific level` d:Q1799794
      • (0th level = country)
      • `1st-level administrative country subdivision` d:Q10864048
      • `2nd-level administrative country subdivision` d:Q13220204
      • `3rd-level administrative country subdivision` d:Q13221722
      • `4th-level administrative country subdivision` d:Q14757767
      • `5th-level administrative country subdivision` d:Q15640612
      • `6th-level administrative country subdivision` d:Q22927291
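
As a hedged illustration of how these elements might be combined on WDQS, the sketch below walks up the P131 chain from a place and keeps the ancestor classed as a 1st-level subdivision. The start item wd:Q90 (Paris) is only an example chosen here, and the result depends on how the place is modelled on Wikidata.

```sparql
#defaultEndpoint:Wikidata
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?division ?divisionLabel
WHERE {
  wd:Q90 wdt:P131* ?division .                  # Follow P131 'located in' zero or more times (example place: Paris)
  ?division wdt:P31/wdt:P279* wd:Q10864048 .    # Keep 1st-level administrative country subdivisions
  ?division rdfs:label ?divisionLabel .
  FILTER(LANG(?divisionLabel) = "en")
}
```

Replacing Q10864048 with one of the other level items listed above should select a different depth.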

Languages

✅ Language (Gascon (Q930)) → List of records in this language

To run on WDQS.[1]

#defaultEndpoint:Wikidata
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>
SELECT ?writing WHERE {
  SERVICE <https://lingualibre.org/sparql> {
    SELECT ?writing WHERE {
      ?record prop:P2 entity:Q2;
        prop:P4 entity:Q930, ?language;
        prop:P3 ?url;
        prop:P7 ?writing.
    # FILTER(CONTAINS(STR(?audio), "LL-Q35735")) # occitan
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
  }
}

✅ Languages → List of WD languages with speaker population >80M

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT DISTINCT ?item ?ISO ?itemLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  ?item p:P1098 ?statement0.
  ?item p:P219 ?isolang.
  ?statement0 (psv:P1098/wikibase:quantityAmount) ?numericQuantity.
  FILTER(?numericQuantity > "80000000"^^xsd:decimal)
  MINUS {
    ?item p:P31 ?statement1.
    ?statement1 (ps:P31/(wdt:P279*)) wd:Q25295.
  }
  OPTIONAL { ?item wdt:P218 ?ISO. }
}
ORDER BY ASC (?ISO)
LIMIT 100

Languages → List of LL languages with WD speaker population

Query to create.
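
A possible starting point, offered as an untested sketch. It assumes that the speaker population lives on Wikidata as P1098 (the property used in the >80M query above) and that LinguaLibre language items carry their Wikidata id as a string in P12. MAX is used here only to collapse several population statements into one row; this is a simplification, not a statement about which value is the most reliable.

```sparql
#defaultEndpoint:Lingualibre
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?language ?languageLabel ?wikidataId (MAX(?speakers) AS ?population)
WHERE {
  ?language prop:P2 entity:Q4 ;          # Filter: P2 'instance of' is Q4 'language'
            rdfs:label ?languageLabel ;
            prop:P12 ?wikidataId .       # Assign value: P12 'wikidata id' into ?wikidataId
  FILTER(LANG(?languageLabel) = "en")
  FILTER(REGEX(?wikidataId, "^Q[0-9]+$"))
  BIND(IRI(CONCAT("http://www.wikidata.org/entity/", ?wikidataId)) AS ?wikidataItem)
  # On Wikidata, read P1098 'number of speakers'
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataItem wdt:P1098 ?speakers .
  }
}
GROUP BY ?language ?languageLabel ?wikidataId
ORDER BY DESC(?population)
```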

✅ Languages → List of Sign languages on wikidata

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT DISTINCT ?item ?itemLabel
WHERE {
  ?item p:P31 ?statement0.
  ?statement0 (ps:P31/(wdt:P279*)) wd:Q34228.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

✅ Languages → List of whistled languages on wikidata

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT ?whistled_language ?whistled_languageLabel
WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?whistled_language wdt:P31 wd:Q2737212. # Is whistled language (Q2737212)
}

✅ Languages → List of LL languages with wikidata dead or extinct status

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
SELECT
  ?deadLanguageLinguaLibre
  ?deadLanguageLinguaLibreLabel
  ?audios
# List Wikidata dead/extinct languages
WITH {
  SELECT DISTINCT ?deadLanguage {
    SERVICE <https://query.wikidata.org/sparql> {
      { ?deadLanguage wdt:P31/wdt:P279* wd:Q45762. }
      UNION
      { ?deadLanguage wdt:P31/wdt:P279* wd:Q38058796. }
    }
  }
} AS %deadLanguage
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
WITH {
  SELECT ?deadLanguageLinguaLibre {
    INCLUDE %deadLanguage.
    BIND(REPLACE(STR(?deadLanguage), '.*/', '') AS ?deadLanguageQid)
    ?deadLanguageLinguaLibre
      prop:P2 entity:Q4;
      prop:P12 ?deadLanguageQid.
  }
} AS %deadLanguageLinguaLibre
# For those Lingualibre languages, count audios into ?audios
WITH {
  SELECT
    ?deadLanguageLinguaLibre
    (COUNT(?audio) AS ?audios)
  {
    INCLUDE %deadLanguageLinguaLibre.
    ?audio
      prop:P2 entity:Q2;
      prop:P4 ?deadLanguageLinguaLibre.
  }
  GROUP BY ?deadLanguageLinguaLibre
} AS %audios
# Main query: combine the named subqueries defined above
{
  INCLUDE %deadLanguageLinguaLibre.
  OPTIONAL{ INCLUDE %audios. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?audios)

✅ Languages → List of LL languages with wikidata Endangered language status

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
SELECT
  ?endangeredLanguageLinguaLibre
  ?endangeredLanguageLinguaLibreLabel
  ?audios
# List Wikidata endangered languages
WITH {
  SELECT DISTINCT ?endangeredLanguage {
    SERVICE <https://query.wikidata.org/sparql> {
      { ?endangeredLanguage wdt:P31/wdt:P279* wd:Q335214. }
      UNION
      { ?endangeredLanguage wdt:P31/wdt:P279* wd:Q83365366. }
    }
  }
} AS %endangeredLanguage
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
WITH {
  SELECT ?endangeredLanguageLinguaLibre {
    INCLUDE %endangeredLanguage.
    BIND(REPLACE(STR(?endangeredLanguage), '.*/', '') AS ?endangeredLanguageQid)
    ?endangeredLanguageLinguaLibre
      prop:P2 entity:Q4;
      prop:P12 ?endangeredLanguageQid.
  }
} AS %endangeredLanguageLinguaLibre
# For those Lingualibre languages, count audios into ?audios
WITH {
  SELECT
    ?endangeredLanguageLinguaLibre
    (COUNT(?audio) AS ?audios)
  {
    INCLUDE %endangeredLanguageLinguaLibre.
    ?audio
      prop:P2 entity:Q2;
      prop:P4 ?endangeredLanguageLinguaLibre.
  }
  GROUP BY ?endangeredLanguageLinguaLibre
} AS %audios
# Main query: combine the named subqueries defined above
{
  INCLUDE %endangeredLanguageLinguaLibre.
  OPTIONAL{ INCLUDE %audios. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?audios)

✅ Languages → List of LL languages with wikidata Sign language status

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
SELECT
  ?signLanguageLinguaLibre
  ?signLanguageLinguaLibreLabel
  ?audios
# List Wikidata sign languages
WITH {
  SELECT DISTINCT ?signLanguage {
    SERVICE <https://query.wikidata.org/sparql> {
      { ?signLanguage wdt:P31/wdt:P279* wd:Q34228. }
    }
  }
} AS %signLanguage
# Compare with LinguaLibre languages, keep when P12 `wikidata` matches
WITH {
  SELECT ?signLanguageLinguaLibre {
    INCLUDE %signLanguage.
    BIND(REPLACE(STR(?signLanguage), '.*/', '') AS ?signLanguageQid)
    ?signLanguageLinguaLibre
      prop:P2 entity:Q4;
      prop:P12 ?signLanguageQid.
  }
} AS %signLanguageLinguaLibre
# For those Lingualibre languages, count audios into ?audios
WITH {
  SELECT
    ?signLanguageLinguaLibre
    (COUNT(?audio) AS ?audios)
  {
    INCLUDE %signLanguageLinguaLibre.
    ?audio
      prop:P2 entity:Q2;
      prop:P4 ?signLanguageLinguaLibre.
  }
  GROUP BY ?signLanguageLinguaLibre
} AS %audios
# Main query: combine the named subqueries defined above
{
  INCLUDE %signLanguageLinguaLibre.
  OPTIONAL{ INCLUDE %audios. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?audios)

Language (Marathi (Q34)) → Item with Wikidata Qid(s), optional Geo-coordinates

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX ll: <https://lingualibre.org/entity/> 
PREFIX llt: <https://lingualibre.org/prop/direct/>
PREFIX lltn: <https://lingualibre.org/prop/direct-normalized/>

select distinct ?record ?transcription ?languageLabel ?wdQid (?wdLabel as ?wdLabelEN) ?coord
where {
  ?record llt:P2 ll:Q2 . # Filter: P2 'instance of' is Q2 'record'
  ?record llt:P4 ll:Q34 .          # Filter: record's P4 'language' is Q34 'Marathi'
  ?record llt:P4 ?language .       # Assign value: record's P4 'language' to variable ?language
  ?record llt:P7 ?transcription .  # Assign value: record's P7 'transcription' to variable ?transcription
  ?record lltn:P12 ?wdQid . # Assign value: record's P12 'wikidata id' to variable ?wdQid
  
  SERVICE <https://query.wikidata.org/sparql> {
    OPTIONAL { ?wdQid wdt:P625 ?coord . } # Assign value: Wikidata item's wdt:P625 'coordinate location' to variable ?coord
    OPTIONAL {
      ?wdQid rdfs:label ?wdLabel . # Assign value: Wikidata item's label to variable ?wdLabel
      FILTER (LANG(?wdLabel) = "en") . # Filter: keep English labels only
    }
  }
  SERVICE wikibase:label { 
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Languages → Languages with gender and recordings counts

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
SELECT ?languageLabel ?wikidata ?iso
  ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
  (ROUND(1000*?femalesRecords/(?femalesRecords+?malesRecords))/10 AS ?percent)
WITH {
  SELECT ?language ?languageLabel ?wikidata ?iso {
      ?record prop:P2 entity:Q2 .     # Filter: P2 'instance of' is Q2 'record'
      ?record prop:P4 ?language .     # Assign value: P4 'language' into ?language
      ?language prop:P12 ?wikidata .   # Assign value: P12 'wikidata id' into ?wikidata
      OPTIONAL { ?language prop:P13 ?iso . } # Assign value: P13 'iso639-3' into ?iso
  }
  GROUP BY ?language ?languageLabel ?wikidata ?iso
} AS %base
WITH {
  SELECT ?language ?languageLabel ?iso ?genderLabel 
    (COUNT(DISTINCT ?females) AS ?femalesSpeakers) 
    (COUNT(DISTINCT ?record) AS ?femalesRecords) {
  INCLUDE %base
  ?record prop:P4 ?language ;    # Join: records in ?language (from %base)
          prop:P5 ?females .     # Assign value: P5 'speaker' into ?females
  ?females prop:P8 entity:Q17 ;  # Filter: P8 'gender' is Q17 (female speakers)
          prop:P8 ?gender .      # Assign value: P8 'gender' into ?gender
    } GROUP BY ?language ?languageLabel ?iso ?genderLabel
} AS %females

WITH {
  SELECT ?language ?languageLabel ?iso ?genderLabel 
    (COUNT(DISTINCT ?males) AS ?malesSpeakers)
    (COUNT(DISTINCT ?record) AS ?malesRecords) {
  INCLUDE %base
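  ?record prop:P4 ?language ;    # Join: records in ?language (from %base)
          prop:P5 ?males .       # Assign value: P5 'speaker' into ?males
  ?males prop:P8 entity:Q16 ;    # Filter: P8 'gender' is Q16 (male speakers)
          prop:P8 ?gender .      # Assign value: P8 'gender' into ?gender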
    } GROUP BY ?language ?languageLabel ?iso ?genderLabel
} AS %males
{
  INCLUDE %base 
  INCLUDE %females
  INCLUDE %males
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?languageLabel ?wikidata ?iso ?malesSpeakers ?malesRecords ?femalesSpeakers ?femalesRecords
ORDER BY ASC(?languageLabel )

[HEAVY] Languages → Languages with gender and recordings counts (2)

Section needs review.

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
SELECT ?iso 
  (?genderLabel as ?Gender)
  (COUNT(DISTINCT ?speakerQid) as ?Speakers) 
  (COUNT(DISTINCT ?record) as ?Records)
WHERE {
  ?record prop:P2 entity:Q2 .     # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?language .    # Assign value: P4 'language' into ?language
  # OPTIONAL { ?language prop:P12 ?wikidata }  # Assign value: P12 'wikidata id' into ?wikidata
  OPTIONAL { ?language prop:P13 ?iso } # Assign value: P13 'iso639-3' into ?iso
  ?record prop:P5 ?speakerQid .  # Assign value: P5 'speaker' into ?speakerQid
  ?speakerQid prop:P8 ?gender .  # Assign value: P8 'sex or gender' into ?gender
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
GROUP BY ?iso ?genderLabel
ORDER BY ASC(?iso )

Speakers

✅ Speakers → Largest number of languages recorded and known

To run on LLQS.[2]

#defaultEndpoint:Lingualibre
#Title: Speakers with recordings largest number of languages and known languages
SELECT ?speaker ?speakerLabel ?count ?languages
# Get audios, language, speaker triplet
WITH {
  SELECT DISTINCT ?speaker ?language {
    ?audio prop:P4 ?language;
           prop:P5 ?speaker.
  }
} AS %speakers
# Get the count of languages per each speaker
WITH {
  SELECT ?speaker (COUNT(?speaker) AS ?count) {
    INCLUDE %speakers.
  }
  GROUP BY ?speaker
  ORDER BY DESC(?count)
} AS %countOfLanguagesRecordedPerSpeaker
# Get the maximum number of languages recorded by any single speaker
WITH {
  SELECT (MAX(?count) AS ?maxNumberOfLanguagesRecorded) {
    INCLUDE %countOfLanguagesRecordedPerSpeaker.
  }
} AS %maxNumberOfLanguagesRecorded
# Get those speakers whose count equals the maximum number of languages
WITH {
  SELECT ?speaker ?count {
    INCLUDE %countOfLanguagesRecordedPerSpeaker.
    INCLUDE %maxNumberOfLanguagesRecorded.
    FILTER(?count = ?maxNumberOfLanguagesRecorded).
  }
} AS %speakersWithMostNumberOfLanguagesRecorded
# Get the languages of those speakers that have recorded audios in the
# most number of languages
WITH {
  SELECT ?speaker (GROUP_CONCAT(?languageLabel; SEPARATOR = ", ") AS ?languages) {
    INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
    ?speaker prop:P4 [
        rdfs:label ?languageLabel
      ]
    FILTER(LANG(?languageLabel) = "en").
  }
  GROUP BY ?speaker
} AS %languagesOfSpeakersWithMostNumberOfLanguagesRecorded
{
  INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
  INCLUDE %languagesOfSpeakersWithMostNumberOfLanguagesRecorded.
  ?speaker rdfs:label ?speakerLabel.
  FILTER(LANG(?speakerLabel) = "en")
}

✅ Speakers → Countries with most speakers

To run on LLQS.[2]
Note: this query collects declared speakers and does not check for actual recordings.

#defaultEndpoint:Lingualibre
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count)
WITH {
  SELECT DISTINCT ?speaker {
    ?speaker prop:P2 entity:Q3;
  }
} AS %speakers
WITH {
  SELECT DISTINCT
    ?speaker
    ?country
    ?countryLabel
    ?ISO3
    ?continentLabel
  {
    INCLUDE %speakers.
    ?speaker prop:P14 ?residence.
    # Keep only well-formed QIDs, so that the IRI built below is valid.
    FILTER(REGEX(?residence, "^Q[0-9]+$"))
    BIND(IRI(CONCAT('http://www.wikidata.org/entity/', ?residence)) AS ?residenceId)
    
    # Get country from wikidata
    SERVICE <https://query.wikidata.org/sparql> {
      ?residenceId wdt:P17 ?country.
      ?country rdfs:label ?countryLabel;
               wdt:P298 ?ISO3;
               wdt:P30 ?continent.
      ?continent rdfs:label ?continentLabel.
      FILTER(LANG(?countryLabel) = "en").
      FILTER(LANG(?continentLabel) = "en").
    }
  }
} AS %speakersWithCountries
{
  INCLUDE %speakersWithCountries.
}
GROUP BY ?country ?continentLabel ?ISO3 ?countryLabel
ORDER BY DESC(?count)
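
If only speakers with actual recordings are wanted, the speaker selection in the %speakers subquery could be narrowed. The sketch below is one assumed way to do so, using P5 'speaker' on records as in the map queries on this page. It is shown as a standalone query and has not been benchmarked.

```sparql
#defaultEndpoint:Lingualibre
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>

SELECT DISTINCT ?speaker {
  ?speaker prop:P2 entity:Q3 .                  # Filter: P2 'instance of' is Q3 'speaker'
  FILTER EXISTS { ?record prop:P5 ?speaker . }  # Keep speakers named by at least one record
}
```

Using the body of this query inside `WITH { ... } AS %speakers` would leave the rest of the country count unchanged.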

Speakers → Map of speakers by place

To run on WDQS.[1]

#defaultEndpoint:Wikidata
#defaultView:Map
PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>

SELECT DISTINCT ?lLabel ?coord WITH {
  SELECT ?lLabel ?loc WHERE {
    SERVICE <https://lingualibre.org/sparql> { 
      select DISTINCT ?lLabel ?loc { 
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
        ?l llt:P2 ll:Q3 ;
           llt:P14 ?loc . 
        ?record llt:P5 ?l.   
        FILTER (regex(?loc, '^Q')) 
      } 
    }
  }
} AS %i
WHERE {
  INCLUDE %i
  BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?loc)) AS ?locURL)
  SERVICE <https://query.wikidata.org/sparql> { 
    select * { 
      ?locURL wdt:P625 ?coord . 
    } 
  }
}

Speakers → Map of speakers with recordings

To run on WDQS.[1]

#defaultEndpoint:Wikidata
#defaultView:Map
# Same can apply to languages :
# Q3→Q4  : items speaker and language
# P14→P12 : items WD location and WD language
# P5→P4 : properties speaker and languages
# P625→P1098 : properties location and population

PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>

SELECT DISTINCT ?itemLabel ?wikidata (?info AS ?coordinates) ?records (SAMPLE(?tag) AS ?layer)
# On Lingualibre, get item's info
WITH {
  SELECT *
  WHERE {
    SERVICE <https://lingualibre.org/sparql> {    # Comment out this line when running on LLQS
      SELECT DISTINCT ?itemLabel ?wikidata (COUNT(?record) AS ?records) 
      {
        ?item llt:P2 ll:Q3 ;
                 rdfs:label ?itemLabel;
                 llt:P14 ?wikidata .
        ?record llt:P5 ?item .
        FILTER (LANG(?itemLabel)='en')
        FILTER (regex(?wikidata, '^Q'))
      } GROUP BY ?itemLabel ?wikidata
    }           # Comment out this line when running on LLQS
  }
} AS %infoWikidataId
# On Wikidata, get the target info
WHERE {
  INCLUDE %infoWikidataId
  BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?wikidata)) AS ?infoURL)
  SERVICE <https://query.wikidata.org/sparql> { 
    SELECT * { 
      ?infoURL wdt:P625 ?info .
    } 
  }
   BIND(
    IF(?records < 10, "<10",
    IF(?records < 1000, "10-1,000",
    IF(?records < 5000, "1k-5k",
    IF(?records < 25000, "5k-25k",
    IF(?records < 50000, "25k-50k",
    ">50k")))))
    AS ?tag) .
} 
GROUP BY ?itemLabel ?wikidata ?info ?records
ORDER BY DESC (?records)

References

  1. Endpoint Wikidata Query Service (WDQS) – run SPARQL queries upon Wikidata. Run, test, and download the data as JSON, CSV or TSV. Has advanced user-friendly features such as hovering over a word to see a term's meaning, code optimization, etc.
  2. Endpoint LinguaLibre Query Service (LLQS) – run SPARQL queries upon LinguaLibre. Run, test, and download the data as JSON, CSV or TSV.
