Difference between revisions of "SPARQL"

Help:SPARQL gathers basic SPARQL queries for Lingua Libre, demoed and ready to test, together with beginner-friendly explanations, inline comments, introductions to concepts, code snippets and a few tools. This page lets users unfamiliar with SPARQL quickly learn the basics, query the Lingua Libre database, and download the data or feed it directly to an application. To fit the most frequent uses, the page leans slightly toward web development and beginner-level JavaScript skills.

m (De-westernize: switch from Q21 / French / fra / 0x010C to Q34 / Marathi / mar / SangeetaRH)

Revision as of 09:59, 7 February 2022


Draft

December 2021 rewrite: work in progress, please do not translate yet.
  1. Done: Gather SPARQL queries related to: core, speakers, languages, audios.
  2. NOW/Opened: General content review. You may help by: a) reading and copy-editing the page's English, b) testing queries on LLQS, then editing them in or discussing improvements, c) increasing the consistency of comments.
  3. NOW/Opened: De-Westernization, replacing Q21 (French) by Q34 (Marathi) and Q42 (User:0x010C) by Q445757 (User:SangeetaRH).
  4. Later: translations.

Help welcome.

Base

Useful elements


Tools

On Wikidata, the Wikidata Query Service (WDQS) lets you practice writing SPARQL queries in an intuitive way.

References

Code snippets

Fetch data using SPARQL

Lingua Libre data can be fetched with various programming languages such as Python, JavaScript, R and others, returning JSON or other formats.

  • For a code snippet in your language: open query.wikidata.org (Wikidata Query Service, aka WDQS), run your SPARQL query, then click "Code": a pop-up window appears with various implementations.
  • To download data, click "Download".

JavaScript:
At least 3 methods exist (code snippet), for example:

Query Result's basic unit
SPARQL:
SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10
{  },
{
  "item": {
    "type": "uri",
    "value": "https://lingualibre.org/entity/Q12"
  },
  "itemLabel": {
    "xml:lang": "en",
    "type": "literal",
    "value": "beginner"
  }
},
{  }
Javascript:
var endpoint = 'https://lingualibre.org/sparql';
var sparql = 'SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10';
$.getJSON(endpoint,
	{ query: sparql, format: 'json' },
	function(data){ console.log('JQuery: ',data)}
);
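
The same request can also be sketched without jQuery. This is a minimal sketch, assuming a runtime that provides the standard `fetch` function and an endpoint that answers with the standard SPARQL JSON layout shown above (a `results.bindings` array). The helper names `buildSparqlUrl`, `simplifyBindings` and `runSparql` are illustrative and not part of Lingua Libre.

```javascript
// Endpoint taken from the jQuery example above.
const ENDPOINT = 'https://lingualibre.org/sparql';

// Build the GET URL with the same `query` and `format` parameters as the jQuery call.
function buildSparqlUrl(endpoint, sparql) {
  const params = new URLSearchParams({ query: sparql, format: 'json' });
  return endpoint + '?' + params.toString();
}

// Flatten each binding from { item: { type, value } } to { item: value }.
function simplifyBindings(data) {
  return data.results.bindings.map(binding => {
    const row = {};
    for (const [name, cell] of Object.entries(binding)) {
      row[name] = cell.value;
    }
    return row;
  });
}

// Run a query and return the flattened rows (network call, assumed reachable).
async function runSparql(sparql) {
  const response = await fetch(buildSparqlUrl(ENDPOINT, sparql));
  if (!response.ok) throw new Error('SPARQL request failed: ' + response.status);
  return simplifyBindings(await response.json());
}
```

A call such as `runSparql('SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10').then(rows => console.log(rows))` would then log plain objects such as `{ item: 'https://lingualibre.org/entity/Q12' }`.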

Merging data

Advanced SPARQL queries using COUNT() and similar functions are often slow (more than 3 seconds, sometimes more than 100 seconds). You are encouraged to run several smaller SPARQL queries and then merge the returned data. For example, the JavaScript snippet below helps web developers do so.

// Data from 3 sparql queries.
// Important: One key must be similar in all datasets, here: 'qid'
const langs = [{ qid: 'Q209', label: 'Breton', iso:'bre' }, { qid: 'Q34', label: 'Marathi', iso: 'mar' }],
    speakersFemales = [{ qid: 'Q209', genderF: 3, recordsF: 60 }, { qid: 'Q34', genderF: 21, recordsF:5046 }],
    speakersMales = [{ qid: 'Q209', genderM: 7, recordsM: 218 }, { qid: 'Q34', genderM: 85, recordsM:32964 }];
// Toolbox for merging data by same id
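var merge2ArraysBySameId = function(arr1,arr2,id1){
	return arr1.map( item1 => {
		// Find the row of arr2 sharing the same id; undefined when there is none
		var identical = arr2.find(obj => obj[id1] === item1[id1]);
		// Copy into a new object so that neither input array is mutated
		return Object.assign({}, identical, item1)
	} );
}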
// Mergings
var step1 = merge2ArraysBySameId(langs,speakersFemales,'qid');
var step2 = merge2ArraysBySameId(step1,speakersMales,'qid');
alert(JSON.stringify(step2))
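
When more than two result sets are involved, or when some rows are missing from one of them, the merge can be generalized. The following is a minimal sketch, not the page's official helper: `mergeByKey` is a hypothetical name, and the sample data is a shortened copy of the arrays above.

```javascript
// Merge any number of arrays of objects that share one key (here 'qid').
// Rows are copied, so the input arrays are left untouched.
function mergeByKey(key, ...arrays) {
  const merged = new Map();
  for (const rows of arrays) {
    for (const row of rows) {
      const previous = merged.get(row[key]) || {};
      merged.set(row[key], Object.assign({}, previous, row));
    }
  }
  return Array.from(merged.values());
}

const languages = [{ qid: 'Q209', label: 'Breton' }, { qid: 'Q34', label: 'Marathi' }];
const females = [{ qid: 'Q209', genderF: 3 }]; // no row for Q34
const males = [{ qid: 'Q34', genderM: 85 }, { qid: 'Q209', genderM: 7 }];

const table = mergeByKey('qid', languages, females, males);
console.log(JSON.stringify(table));
```

Using a `Map` keeps one entry per id in first-seen order, so a language missing from one dataset simply lacks that dataset's fields instead of raising an error.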

Lingualibre's ground

✅ Is Language (language/dialect (Q4)) → List existing languages with: LL Qid, ISO 639-3, Name

SELECT ?lang ?iso ?langLabel
WHERE {
  ?lang prop:P2 entity:Q4 . # Filter: P2 'instance of' is Q4 'language or dialect'.
  ?lang prop:P13 ?iso .     # Assign value: P13 'ISO-639-3' into ?iso.
  # Add label to each variable used.
  # ?lang now has twin variable ?langLabel
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Is Speaker (speaker (Q3)) → List existing speakers

SELECT ?speaker ?speakerLabel
WHERE {
  ?speaker prop:P2 entity:Q3 .  # Filter: P2 'instance of' is Q3 'speaker'.
  # Add labels to each variable used.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Is Language level (language level (Q5)) → List existing levels

SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q5    # Filter: P2 'instance of' is Q5 'language level'.
  # Add labels to each variable used.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Is Sex or Gender (sex or gender (Q7)) → List existing sexes or genders

SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q7    # Filter: P2 'instance of' is Q7 'sex or gender'.
  # Add labels to each variable used.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

Speaker

✅🇶 Speaker name(s) → Speaker Qid(s)

SELECT ?speakerName ?speakerId
WHERE {
  VALUES ?speakerName { "Yug" "VIGNERON" } # Assign value: one or multiple values
  # BIND: tag the plain string as an English literal so that it can match rdfs:label
  BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )
  # Syntax note: ';' chains several property-value pairs on the same subject
  ?speakerId prop:P2 entity:Q3 ;        # Filter: P2 'instance of' is Q3 'speaker'.
             rdfs:label ?speakerLabel . # Filter by value: label equal ?speakerLabel's value
}
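
In an application, the list of names usually comes from user input, so the VALUES clause has to be generated. The sketch below shows one way to do it; `sparqlString` and `speakersByNameQuery` are hypothetical helper names, and the escaping only covers backslashes and double quotes.

```javascript
// Escape a JavaScript string for use as a double-quoted SPARQL literal.
function sparqlString(text) {
  return '"' + text.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Build the "speaker name(s) → speaker Qid(s)" query for a list of names.
function speakersByNameQuery(names) {
  const values = names.map(sparqlString).join(' ');
  return [
    'SELECT ?speakerName ?speakerId',
    'WHERE {',
    '  VALUES ?speakerName { ' + values + ' }',
    '  BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )',
    '  ?speakerId prop:P2 entity:Q3 ;',
    '             rdfs:label ?speakerLabel .',
    '}'
  ].join('\n');
}

console.log(speakersByNameQuery(['Yug', 'VIGNERON']));
```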

✅🇶 Speaker Qid (SangeetaRH (Q445757)) → Speaker data, all

# Get Q445757 (User:SangeetaRH)'s data
SELECT ?anyProperty ?anyValue ?anyValueLabel
WHERE {
  entity:Q445757 ?anyProperty ?anyValue .  # Filter: of Q445757 'SangeetaRH', get any property and any values
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅🇶 Speaker Qid (SangeetaRH (Q445757)) → Speaker languages (P4)

SELECT ?languages ?languagesLabel
WHERE {
  entity:Q445757 prop:P4 ?languages . # Assign value: for Q445757 'SangeetaRH', P4 'language' into ?languages
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Speaker Qid (SangeetaRH (Q445757)) + Language LL Qid (Marathi (Q34)) → List records

SELECT ?audio ?audioLabel
WHERE {
  ?audio prop:P5 entity:Q445757 .   # Filter: P5 Speaker is Q445757 User:SangeetaRH
  ?audio prop:P4 entity:Q34 .   # Filter: P4 language is Q34 Marathi
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Speaker Qid (SangeetaRH (Q445757)) + Language LL Qid (Marathi (Q34)) → Count records

SELECT ?language ?speakerLabel (COUNT(?audio) AS ?audios)
WHERE {
  VALUES ?language { entity:Q34 }  # Assign value: Q34 'Marathi' into ?language
  VALUES ?speaker { entity:Q445757 }   # Assign value: Q445757 'SangeetaRH' into ?speaker
  ?audio prop:P5 ?speaker .   # Filter: P5 'speaker' is Q445757 'SangeetaRH'
  ?audio prop:P4 ?language .  # Filter: P4 'language' is Q34 'Marathi'
  ?audio prop:P2 entity:Q2 .  # Filter: P2 'instance of' is Q2 'record'
  # Add labels
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"}
}
GROUP BY ?language ?speakerLabel  # Group the counted records per language and speaker
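
To reuse this count for other speakers and languages, the two Qids can be turned into parameters. The following sketch makes some assumptions: `assertQid` and `countRecordsQuery` are hypothetical names, the label service is left out to keep the query short, and the generated query has not been benchmarked.

```javascript
// Accept only well-formed Qids such as 'Q34'; anything else is rejected
// so that user input cannot alter the structure of the query.
function assertQid(qid) {
  if (!/^Q[1-9][0-9]*$/.test(qid)) throw new Error('Not a Qid: ' + qid);
  return qid;
}

// Build the "speaker + language → count records" query for any pair of Qids.
function countRecordsQuery(speakerQid, languageQid) {
  return [
    'SELECT ?language ?speaker (COUNT(?audio) AS ?audios)',
    'WHERE {',
    '  VALUES ?language { entity:' + assertQid(languageQid) + ' }',
    '  VALUES ?speaker { entity:' + assertQid(speakerQid) + ' }',
    '  ?audio prop:P5 ?speaker .',
    '  ?audio prop:P4 ?language .',
    '  ?audio prop:P2 entity:Q2 .',
    '}',
    'GROUP BY ?language ?speaker'
  ].join('\n');
}

console.log(countRecordsQuery('Q445757', 'Q34'));
```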

✅ Is Speaker (speaker (Q3)) → List of accounts and associated speakers

SELECT ?linkedUser ?speakerLabel (SUBSTR(STR(?speaker),32) AS ?speakerQid)
WHERE {
  ?speaker prop:P2 entity:Q3 .  # Filter: P2 'instance of' is Q3 'speaker'.
  ?speaker prop:P11 ?linkedUser . # Assign value: P11 'linked users' into ?linkedUser.
  # Add labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
} ORDER BY DESC (?speakerLabel)
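
The `SUBSTR(STR(?speaker),32)` expression cuts the Qid out of the entity URI inside the query. The same extraction can be done on the client side. This is a sketch under the assumption that entity URIs look like `https://lingualibre.org/entity/Q12`, as in the result sample earlier on this page; `qidFromUri` is a hypothetical name.

```javascript
// Return the trailing Qid of an entity URI, or null when there is none.
function qidFromUri(uri) {
  const match = /\/entity\/(Q[0-9]+)$/.exec(uri);
  return match ? match[1] : null;
}

const uri = 'https://lingualibre.org/entity/Q12';
console.log(qidFromUri(uri));
// SPARQL's SUBSTR counts from 1 while JavaScript counts from 0,
// so SUBSTR(..., 32) corresponds to substring(31).
console.log(uri.substring(31));
```

A regular expression is less fragile than a fixed offset: it keeps working if the host name or the scheme of the URI changes length.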

Languages

✅🇶 Language name(s) in English → Language LL Qid(s)

SELECT ?languageId ?languageName
WHERE {
  VALUES ?languageName { "Marathi" "Atikamekw" "Central Bikol" } # Target values
  ?languageId 
    prop:P2 entity:Q4 ;          # Filter: P2 'instance of' is Q4 'language' AND
    rdfs:label ?languageLabel .  # Assign value label into ?languageLabel
  BIND ( STRLANG(?languageName, "en") AS ?languageLabel ) # Bind filter by English
}

✅ Language ISO-639-3 → Language LL Qid(s), Wikidata Qid, Label

SELECT ?langIso ?langId ?langWDQid ?langIdLabel
WHERE {
  VALUES ?langIso { "mar" "bre" "bcl" "atj" "ban" } # Target ISO values
  ?langId 
    prop:P2 entity:Q4 ;    # Filter: P2 'instance of' is Q4 'language' AND
    prop:P13 ?langIso ;    # Filter: P13 'ISO 639-3' is one of the ?langIso values AND
    prop:P12 ?langWDQid .  # Assign value: P12 'Wikidata id' to ?langWDQid
  # Labels
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"}
}

✅ Language LL Qid (Q34) → Count items

SELECT ?language (COUNT(?item) AS ?items) WHERE {
  VALUES ?language { entity:Q34 }
  ?item prop:P4 ?language .
}
GROUP BY ?language

✅ Language LL Qid (Q34) → Count records

SELECT ?language (COUNT(?audio) AS ?audios) WHERE {
  VALUES ?language { entity:Q34 }
  ?audio prop:P2 entity:Q2 .  # Filter: P2 'instance of' is Q2 'record'
  ?audio prop:P4 ?language .  # Filter: P4 'language' is Q34 'Marathi'
}
GROUP BY ?language

✅🇶 Language LL Qid (Q34) → Count unique words, audios, ratio

SELECT ?language 
  (COUNT(DISTINCT(?itemLabel)) AS ?words)  # Count distinct labels and assign value to ?words
  (COUNT(?audio) as ?audios) 
  (ROUND(10000*?words/?audios)/100 AS ?percent)
WHERE {
  VALUES ?language { entity:Q34 }
  ?audio prop:P4 ?language .  # Filter: P4 'language' is Q34 'Marathi'
  ?audio prop:P2 entity:Q2 .  # Filter: P2 'instance of' is Q2 'record'
  ?audio rdfs:label ?itemLabel. # Assign value: label to ?itemLabel
}
GROUP BY ?language

✅ Language LL Qid (Q34) → Count speakers

SELECT ?language (COUNT(?speaker) AS ?speakers) WHERE {
  VALUES ?language { entity:Q34 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q34 'Marathi'
}
GROUP BY ?language

✅ Language LL Qid (Q209) → List speakers

SELECT ?language ?speaker ?speakerLabel WHERE {
  VALUES ?language { entity:Q209 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q209 'Breton'
  # Labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}


✅ Language LL Qid (Breton (Q209)) → Language data, all

Case: Get all data for language Q209 'Breton'.

SELECT * WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
}

✅ Language LL Qid (Breton (Q209)) → Language data, core

SELECT * WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
  ?predicate rdf:type owl:DatatypeProperty .
}

✅ Language LL Qid (Breton (Q209)) → Property P13 (ISO 639-3)

SELECT * WHERE {
  entity:Q209 prop:P13 ?iso . # Assign value : Q209 'Breton', P13 'ISO 639-3', value into ?iso
}

✅ Languages → List existing languages' iso-639-3

SELECT * WHERE {
  ?lang prop:P13 ?code .
}

✅🇶 Language WD Qid → Language data, core

SELECT * WHERE {
  ?lang prop:P12 "Q12107" .  # Filter: P12 'Wikidata id' is Wikidata's "Q12107"
  ?lang ?predicate ?object . # Assign value: each property and value of ?lang
  ?predicate rdf:type owl:DatatypeProperty .
}

Records

✅ Record LL Qid (Cometa (Q500)) → Record data, all

SELECT * WHERE {
  entity:Q500 ?predicate ?object .
  # ?predicate rdf:type owl:DatatypeProperty .
}

✅ Record LL Qid (Cometa (Q500)) → Record data, core

SELECT * WHERE {
  entity:Q500 ?predicate ?object .
  ?predicate rdf:type owl:DatatypeProperty .
}


✅ Language (English (Q22)) + String → Record LL Qid(s)

SELECT ?itemLabel ?item
WHERE { 
  ?item prop:P2 entity:Q2 .    # Filter: P2 'instance of' is Q2 'record'
  ?item prop:P4 entity:Q22 .    # Filter: P4 'language' is Q22 'English'
  ?item rdfs:label ?itemLabel. # Assign value: label to ?itemLabel
  FILTER(CONTAINS(?itemLabel, "apple"@en)). # Filter: label contains "apple"
} LIMIT 10

✅ Language (Breton (Q209)) + Speaker (ThonyVezbe (Q584098)) + String (ni) → Record LL Qid

Case: Search in Breton language, with speaker 'ThonyVezbe', for the word 'ni'.

SELECT ?audio ?urlPointer
WHERE {
  ?audio prop:P4 entity:Q209 .    # P4 'language' is Q209 'Breton'
  ?audio prop:P5 entity:Q584098 . # P5 'speaker' is Q584098 'ThonyVezbe'
  ?audio rdfs:label ?word .       # Assign value: label to ?word
  FILTER ( STR(?word) = "ni" )    # word = 'ni'
  ?audio prop:P3 ?urlPointer.
}

✅ Language (French (Q21)) + Speaker (Justforoc (Q137047)) + String → URL pointer, filename

SELECT ?word ?audio ?urlPointer
  (replace(replace(replace(substr(STR(?urlPointer),52),"%20","_"),"%28","("),"%29",")") AS ?filename)
WHERE {
  ?audio prop:P4 entity:Q21 .      # Filter: P4 'language' is Q21 'French'
  ?audio prop:P5 entity:Q137047 .  # Filter: P5 'speaker' is Q137047 'Justforoc'
  ?audio rdfs:label ?word .        # Assign value: label to ?word
  #Filter: ?word with 'pomme' in French, non case-sensitive
  FILTER REGEX(?word, "pomme"@fr, "i" ) .
  ?audio prop:P3 ?urlPointer
}
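
The nested `replace()` calls above decode three URL escapes inside the query. On the client side the file name can be derived from `?urlPointer` instead. This is only a sketch, assuming the value has the form `…/Special:FilePath/<file name>`; the function name and the sample URL are hypothetical and serve illustration only.

```javascript
// Derive a Commons-style file name from a P3 'URL pointer' value.
function filenameFromUrlPointer(urlPointer) {
  const marker = 'Special:FilePath/';
  const index = urlPointer.indexOf(marker);
  if (index === -1) return null; // not the assumed URL shape
  const encoded = urlPointer.slice(index + marker.length);
  // Decode every percent-escape, then use underscores instead of spaces.
  return decodeURIComponent(encoded).replace(/ /g, '_');
}

// Hypothetical value, for illustration only.
const example = 'http://commons.wikimedia.org/wiki/Special:FilePath/LL-Q21%20%28fra%29-Example-pomme.wav';
console.log(filenameFromUrlPointer(example));
```

Unlike the SPARQL version, `decodeURIComponent` decodes all percent-escapes, which also covers accented characters in file names.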

Heavy queries

The queries below are too heavy to run on Lingua Libre's wiki pages, or even on the Lingua Libre Query Service.
To do: split them into smaller sub-queries, each with a single COUNT() function.
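
One possible way to split such a query is to generate one light query per gender and to merge the results afterwards by language. The sketch below is an assumption, not a tested replacement: `speakersPerLanguageQuery` is a hypothetical name, the queries have not been run against the endpoint, and they count speakers who declare a language rather than speakers who recorded in it.

```javascript
// Gender Qids as used in the heavy query of this section: Q16 'male', Q17 'female'.
const GENDERS = { M: 'Q16', F: 'Q17' };

// Build one light query: speakers of a given gender, counted per language.
function speakersPerLanguageQuery(genderQid) {
  return [
    'SELECT ?language (COUNT(DISTINCT ?speaker) AS ?speakers)',
    'WHERE {',
    '  ?speaker prop:P2 entity:Q3 .',
    '  ?speaker prop:P8 entity:' + genderQid + ' .',
    '  ?speaker prop:P4 ?language .',
    '}',
    'GROUP BY ?language'
  ].join('\n');
}

// One sub-query per gender, each with a single COUNT().
const subQueries = Object.values(GENDERS).map(speakersPerLanguageQuery);
subQueries.forEach(query => console.log(query + '\n'));
```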

❌ Languages → Name, Wikidata Qid, LLQid, Iso-639-3, and genders

Query:
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode 
(COUNT(DISTINCT(?record)) AS ?recordCount)
(COUNT(DISTINCT(?speakerLangM)) AS ?speakerM) 
(COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
  ?record prop:P2 entity:Q2 .     # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?languageQid .  # Assign value: P4 'language' into ?languageQid
  ?languageQid prop:P12 ?wdQid .  # Assign value: P12 'wikidata id' into ?wdQid
  ?languageQid prop:P13 ?isoCode. # Assign value: P13 'iso639-3' into ?isoCode
  
  # Male branch is commented out, so ?speakerLangM stays unbound and ?speakerM is 0
  #?record prop:P5 ?speakerQidM .   # Assign value: P5 'speaker' into ?speakerQidM
  #?speakerQidM prop:P8 entity:Q16 .   # Filter: P8 'sex or gender' is Q16 'male'
  #?speakerQidM prop:P4 ?speakerLangM .  # Assign value: P4 'language' into ?speakerLangM
  
  ?record prop:P5 ?speakerQidF .   # Assign value: P5 'speaker' into ?speakerQidF
  ?speakerQidF prop:P8 entity:Q17 .   # Filter: P8 'sex or gender' is Q17 'female'
  ?speakerQidF prop:P4 ?speakerLangF .  # Assign value: P4 'language' into ?speakerLangF
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . } 
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
Result:
languageQidLabel	wdQid	languageQid	isoCode	recordCount	speakerM	speakerF
French	Q150	Q21	fra	16761	0	18
Marathi	Q1571	Q34	mar	13153	0	5
Polish	Q809	Q298	pol	11686	0	1
…

❌ Is Language (language/dialect (Q4)) → List all languages with number of unique words and speakers

SELECT ?language (COUNT(?audio) AS ?nbAudio) (COUNT(?speaker) AS ?nbSpeaker) WHERE {
  ?language prop:P2 entity:Q4 .
  ?audio prop:P4 ?language .
  ?speaker prop:P4 ?language .
}
GROUP BY ?language

Others

(These old queries are not assessed yet.)

✅ Language (Breton (Q209)) → Record, speaker's language level

SELECT ?record ?recordLabel ?speakerLabel ?languageLabel ?languageLevelLabel
WHERE {
  ?record prop:P2 entity:Q2   # Filter: P2 'instance of' is Q2 'record'
	; prop:P4 entity:Q209 .   # AND P4 'language' is Q209 'Breton'
  ?record prop:P5 ?speaker .  # Assign value: record's P5 'speaker' into ?speaker
  ?record prop:P4 ?language . # Assign value: record's P4 'language' into ?language
  
  ?speaker llp:P4 ?languageStatement .    # P4 'language'
  ?languageStatement llv:P4 ?language .   # P4 'language'
  ?languageStatement llq:P16 ?languageLevel . # P16 'language level'
  # Adds labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
} ORDER BY ?languageLabel ?languageLevelLabel

✅ Language (Marathi (Q34)) → Records of Wikidata concepts with WD Qid (P12)

These items were proposed to Lingua Libre's recorder at step 3 through a SPARQL query on Wikidata, so these words have Wikidata Qids.
SELECT ?languageLabel ?recordLabel ?record ?wid
WHERE {
  ?record prop:P2 entity:Q2 .    # Filter: P2 'instance of' is Q2 'record'
  ?record prop:P4 entity:Q34 .  # Filter: P4 'language' is Q34 'Marathi'
  ?record prop:P4 ?language .    # Assign value: record's P4 'language' to variable ?language
  ?record prop:P12 ?wid .        # Assign value: record's P12 'wikidata id' to variable ?wid
  # Add labels capability
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Records → Filter by date: late 2018

SELECT
(COUNT(DISTINCT ?speaker) AS ?speakers)
(COUNT(DISTINCT ?record) AS ?records)
WHERE {
  ?record prop:P2 entity:Q2 .
  ?record prop:P6 ?date .
  ?record prop:P5 ?speaker .
  # Filters:
  FILTER(?date >= "2018-07-01T00:00:00Z"^^xsd:dateTime)
  FILTER(?date < "2019-01-01T00:00:00Z"^^xsd:dateTime)
}
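
The two date bounds can be generated from JavaScript `Date` objects, which avoids typing the `xsd:dateTime` literals by hand. This is a minimal sketch: `recordsBetweenQuery` is a hypothetical name and the generated query mirrors the one above.

```javascript
// Build the "count speakers and records in a date range" query.
// `start` is inclusive and `end` is exclusive, as in the query above.
function recordsBetweenQuery(start, end) {
  // toISOString() yields e.g. 2018-07-01T00:00:00.000Z, a valid xsd:dateTime.
  const literal = date => '"' + date.toISOString() + '"^^xsd:dateTime';
  return [
    'SELECT',
    '(COUNT(DISTINCT ?speaker) AS ?speakers)',
    '(COUNT(DISTINCT ?record) AS ?records)',
    'WHERE {',
    '  ?record prop:P2 entity:Q2 .',
    '  ?record prop:P6 ?date .',
    '  ?record prop:P5 ?speaker .',
    '  FILTER(?date >= ' + literal(start) + ')',
    '  FILTER(?date < ' + literal(end) + ')',
    '}'
  ].join('\n');
}

// Late 2018: from 1 July 2018 (inclusive) to 1 January 2019 (exclusive), in UTC.
const late2018 = recordsBetweenQuery(
  new Date(Date.UTC(2018, 6, 1)),
  new Date(Date.UTC(2019, 0, 1))
);
console.log(late2018);
```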
