Help

Difference between revisions of "SPARQL (intermediate)"

Help:SPARQL 2 will explore federated queries fetching data from both LinguaLibre's and Wikidata's endpoints, then Wikidata Lexemes, an emerging source of lexicographic data. Together, the two can provide lexicographic and multimedia data (audio recordings and images) for either Wikimedia modules or web developers.

m (Reverted edits by Yug (talk) to last revision by Rdrg109)
Tag: Rollback
Line 5: Line 5:
 
=== Lexemes Queries Generator ===
 
{{LexemeQueriesGenerator}}
 
=== SPARQL to persistent data ===
 
Some SPARQL queries are useful but heavy and slow. This administrator tool stores or updates the response data on LinguaLibre, within a wiki page. Stored data can then be loaded in under 1 second via a variation of <code>mw.loader.load('/index.php?title=MediaWiki:Mydata.js&action=raw&ctype=text/javascript');</code>.
 
{{Sparql2data}}
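The raw-page URL quoted above can also be built from outside the wiki, for example by a web developer who wants to fetch the stored data. The following is a minimal sketch, not part of the Sparql2data tool: the function name <code>raw_page_url</code> is hypothetical, and the base URL <code>https://lingualibre.org</code> is an assumption derived from the endpoint addresses used on this page.

```python
from urllib.parse import urlencode


def raw_page_url(title: str, base: str = "https://lingualibre.org") -> str:
    """Build the index.php URL that serves a wiki page as raw JavaScript.

    `base` is an assumption; the source only gives the relative path.
    """
    params = {"title": title, "action": "raw", "ctype": "text/javascript"}
    # Keep ":" and "/" unescaped so the result matches the URL quoted above.
    return base + "/index.php?" + urlencode(params, safe=":/")


print(raw_page_url("MediaWiki:Mydata.js"))
```

Any other stored page can be addressed by passing its title instead of <code>MediaWiki:Mydata.js</code>.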
 
  
 
=== Federated queries ===
Line 24: Line 20:
 
     SELECT ?item ?itemLabel {
       ?item prop:P2 entity:Q5.
       SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
+
       SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
     }
   }
Line 41: Line 37:
 
=== ✅ Language () → List of wd lexemes ([[:d:Q150]]) ===
:''Strange query from [[User:VIGNERON/common.js]]''
{| style="width:100%"
+
<pre>
|- style="vertical-align:top;"
 
|style="padding: 0 3em;width:60%"|
 
<syntaxhighlight lang="sparql">
 
 
SELECT DISTINCT ?lexemeLabel ?lexeme
WITH {
Line 63: Line 56:
 
   }
}
</syntaxhighlight>
+
</pre>
|
 
|}
 
 
 
 
== Speakers ==
=== ✅ Speakers → Largest number of languages recorded and known ===
Line 260: Line 250:
 
GROUP BY ?country ?continentLabel ?ISO3 ?countryLabel
ORDER BY DESC(?count)
</query>
 
|}
 
 
=== <!-- ✅--> Speakers → Map of speakers by place ===
 
{| style="width:100%"
 
|- style="vertical-align:top;"
 
|style="padding: 0 3em;width:60%"|
 
<syntaxhighlight lang="sparql">
 
PREFIX ll: <https://lingualibre.org/entity/>
PREFIX llt: <https://lingualibre.org/prop/direct/>

SELECT DISTINCT ?lLabel ?coord WITH {
  SELECT ?lLabel ?loc WHERE {
    SERVICE <https://lingualibre.org/sparql> {
      SELECT DISTINCT ?lLabel ?loc {
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
        ?l llt:P2 ll:Q3 ;
          llt:P14 ?loc .
        ?record llt:P5 ?l.
        FILTER (regex(?loc, '^Q'))
      }
    }
  }
} AS %i
WHERE {
  INCLUDE %i
  BIND (URI(CONCAT("http://www.wikidata.org/entity/", ?loc)) AS ?locURL)
  SERVICE <https://query.wikidata.org/sparql> {
    SELECT * {
      ?locURL wdt:P625 ?coord .
    }
  }
}
 
 
</syntaxhighlight>
 
||
 
<query _pagination="20">
 
 
</query>
|}

Revision as of 08:36, 18 January 2022


Draft

This page is a work in progress.

Tools

Lexemes Queries Generator


Federated queries

  • To query LinguaLibre from Wikidata, use SERVICE <https://lingualibre.org/sparql>.
  • To query Wikidata from LinguaLibre, use SERVICE <https://query.wikidata.org/sparql>.

The following query is a simple example of retrieving LinguaLibre data from the Wikidata Query Service. It lists the levels that exist in LinguaLibre.

PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>

SELECT * {
  SERVICE <https://lingualibre.org/sparql> {
    SELECT ?item ?itemLabel {
      ?item prop:P2 entity:Q5.
      SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
    }
  }
}
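
The wrapping pattern used in this query (an inner SELECT placed inside a SERVICE clause) can be generated programmatically. The following is a minimal sketch that only builds the query text and sends nothing over the network; the helper name <code>federate</code> and the constant names are hypothetical, while the endpoint URLs, prefixes and inner query are the ones given above.

```python
LINGUALIBRE_ENDPOINT = "https://lingualibre.org/sparql"
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def federate(inner_select: str, endpoint: str, prefixes: str = "") -> str:
    """Wrap `inner_select` in a SERVICE clause that targets `endpoint`."""
    # Indent the inner query so the generated text stays readable.
    indented = "\n".join(
        "    " + line for line in inner_select.strip().splitlines()
    )
    body = (
        "SELECT * {\n"
        "  SERVICE <" + endpoint + "> {\n"
        + indented + "\n"
        "  }\n"
        "}"
    )
    prefixes = prefixes.strip()
    return prefixes + "\n\n" + body if prefixes else body


PREFIXES = """
PREFIX prop: <https://lingualibre.org/prop/direct/>
PREFIX entity: <https://lingualibre.org/entity/>
"""

LEVELS = """
SELECT ?item ?itemLabel {
  ?item prop:P2 entity:Q5.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
}
"""

# Query meant to be pasted into the Wikidata Query Service.
query = federate(LEVELS, LINGUALIBRE_ENDPOINT, PREFIXES)
print(query)
```

Passing <code>WIKIDATA_ENDPOINT</code> instead gives the opposite direction, for queries that run on LinguaLibre and fetch data from Wikidata.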

Languages

✅ Language () → List of LL languages with wd speaker population

Lexemes

✅ Language (d:Q12107) → List of wd lexemes

Example: Q12107 (Breton).

✅ Language () → List of wd lexemes with LL audio

✅ Language () → List of wd lexemes with LL audio and wd translation (d:Q150)

✅ Language () → List of wd lexemes (d:Q150)

Strange query from User:VIGNERON/common.js
SELECT DISTINCT ?lexemeLabel ?lexeme
WITH {
  SELECT ?lexeme ?lexemeLabel ?lexical_category WHERE {
    ?lexeme a ontolex:LexicalEntry ;
            dct:language wd:Q12107 ; 
            wikibase:lemma ?lexemeLabel .
    OPTIONAL {
      ?lexeme wikibase:lexicalCategory ?lexical_category .
    }
  }
} AS %results
WHERE {
  INCLUDE %results
  OPTIONAL {        
    ?lexical_category rdfs:label ?lexical_categoryLabel .
    FILTER (LANG(?lexical_categoryLabel) = "en")
  }
}
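
The query above hard-codes <code>wd:Q12107</code> as the language. To reuse it for another language, the Q-id can be substituted into a template. The following is a minimal sketch under two assumptions: the template is a simplified form of the query above (no named subquery, no lexical category), and the result is run on the Wikidata Query Service, where the <code>ontolex:</code>, <code>dct:</code>, <code>wd:</code> and <code>wikibase:</code> prefixes are predefined. The names <code>LEXEMES_TEMPLATE</code> and <code>lexemes_query</code> are hypothetical.

```python
import re

# Simplified variant of the lexeme query; %s is replaced by a language Q-id.
LEXEMES_TEMPLATE = """SELECT DISTINCT ?lexemeLabel ?lexeme
WHERE {
  ?lexeme a ontolex:LexicalEntry ;
          dct:language wd:%s ;
          wikibase:lemma ?lexemeLabel .
}"""


def lexemes_query(language_qid: str) -> str:
    """Return the lexeme query for one language, e.g. "Q12107" or "Q150"."""
    # Reject anything that is not a plain Q-id before putting it in the query.
    if not re.fullmatch(r"Q[1-9][0-9]*", language_qid):
        raise ValueError("not a Wikidata Q-id: %r" % language_qid)
    return LEXEMES_TEMPLATE % language_qid


print(lexemes_query("Q150"))
```

Validating the Q-id first keeps malformed input out of the generated query text.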

Speakers

✅ Speakers → Largest number of languages recorded and known

#Title: Speakers with recordings in the largest number of languages, and their known languages
SELECT ?speaker ?speakerLabel ?count ?languages
# Get the distinct (speaker, language) pairs from the audio records
WITH {
  SELECT DISTINCT ?speaker ?language {
    ?audio prop:P4 ?language;
           prop:P5 ?speaker.
  }
} AS %speakers
# Count the languages recorded by each speaker
WITH {
  SELECT ?speaker (COUNT(?speaker) AS ?count) {
    INCLUDE %speakers.
  }
  GROUP BY ?speaker
  ORDER BY DESC(?count)
} AS %countOfLanguagesRecordedPerSpeaker
# Get the maximum number of languages recorded by a single speaker
WITH {
  SELECT (MAX(?count) AS ?maxNumberOfLanguagesRecorded) {
    INCLUDE %countOfLanguagesRecordedPerSpeaker.
  }
} AS %maxNumberOfLanguagesRecorded
# Get those speakers whose count equals the maximum number of languages
WITH {
  SELECT ?speaker ?count {
    INCLUDE %countOfLanguagesRecordedPerSpeaker.
    INCLUDE %maxNumberOfLanguagesRecorded.
    FILTER(?count = ?maxNumberOfLanguagesRecorded).
  }
} AS %speakersWithMostNumberOfLanguagesRecorded
# Get the known languages of the speakers who have recorded audio in the
# largest number of languages
WITH {
  SELECT ?speaker (GROUP_CONCAT(?languageLabel; SEPARATOR = ", ") AS ?languages) {
    INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
    ?speaker prop:P4 [
        rdfs:label ?languageLabel
      ]
    FILTER(LANG(?languageLabel) = "en").
  }
  GROUP BY ?speaker
} AS %languagesOfSpeakersWithMostNumberOfLanguagesRecorded
{
  INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
  INCLUDE %languagesOfSpeakersWithMostNumberOfLanguagesRecorded.
  ?speaker rdfs:label ?speakerLabel.
  FILTER(LANG(?speakerLabel) = "en")
}
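
The chain of named subqueries above (distinct pairs, count per speaker, maximum, filter on the maximum) can be easier to follow when written out procedurally. The following is a minimal sketch of the same steps on in-memory data; the function name and the sample pairs are made up for illustration and do not come from LinguaLibre.

```python
from collections import defaultdict


def speakers_with_most_languages(recordings):
    """Map each top speaker to the number of languages they recorded.

    `recordings` is an iterable of (speaker, language) pairs, one per audio.
    """
    # %speakers: keep each (speaker, language) pair only once.
    languages_by_speaker = defaultdict(set)
    for speaker, language in recordings:
        languages_by_speaker[speaker].add(language)
    if not languages_by_speaker:
        return {}
    # %countOfLanguagesRecordedPerSpeaker
    counts = {s: len(langs) for s, langs in languages_by_speaker.items()}
    # %maxNumberOfLanguagesRecorded
    top = max(counts.values())
    # %speakersWithMostNumberOfLanguagesRecorded
    return {s: c for s, c in counts.items() if c == top}


# Hypothetical sample data: speaker A recorded French twice.
sample = [
    ("A", "fra"), ("A", "bre"), ("A", "fra"),
    ("B", "eng"),
    ("C", "eng"), ("C", "deu"),
]
print(speakers_with_most_languages(sample))
```

As in the SPARQL version, ties are kept: every speaker who reaches the maximum is returned.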


✅ Speakers → Countries with most speakers

SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count)
WITH {
  SELECT DISTINCT ?speaker {
    ?speaker prop:P2 entity:Q3.
  }
} AS %speakers
WITH {
  SELECT DISTINCT
    ?speaker
    ?country
    ?countryLabel
    ?ISO3
    ?continentLabel
  {
    INCLUDE %speakers.
    ?speaker prop:P14 ?residence.
    # Skip residence values that are not Wikidata Q-ids; they would produce invalid IRIs.
    FILTER(REGEX(?residence, "^Q[0-9]+$"))
    BIND(IRI(CONCAT('http://www.wikidata.org/entity/', ?residence)) AS ?residenceId)
    
    # Get country from wikidata
    SERVICE <https://query.wikidata.org/sparql> {
      ?residenceId wdt:P17 ?country.
      ?country rdfs:label ?countryLabel;
               wdt:P298 ?ISO3;
               wdt:P30 ?continent.
      ?continent rdfs:label ?continentLabel.
      FILTER(LANG(?countryLabel) = "en").
      FILTER(LANG(?continentLabel) = "en").
    }
  }
} AS %speakersWithCountries
{
  INCLUDE %speakersWithCountries.
}
GROUP BY ?country ?continentLabel ?ISO3 ?countryLabel
ORDER BY DESC(?count)
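
The FILTER and BIND pair in this query turns a residence value stored as plain text on LinguaLibre into a Wikidata entity IRI, and drops values that are not Q-ids. The following is a minimal sketch of the same check outside SPARQL; the function name <code>residence_to_iri</code> is hypothetical, and the pattern and IRI prefix are the ones used in the query.

```python
import re
from typing import Optional

WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def residence_to_iri(residence: str) -> Optional[str]:
    """Return the Wikidata entity IRI for a Q-id string, or None otherwise."""
    # Same test as FILTER(REGEX(?residence, "^Q[0-9]+$")) in the query.
    if re.fullmatch(r"Q[0-9]+", residence) is None:
        return None
    # Same concatenation as BIND(IRI(CONCAT(...)) AS ?residenceId).
    return WIKIDATA_ENTITY + residence


print(residence_to_iri("Q90"))
print(residence_to_iri("Paris"))
```

Values that fail the check are skipped rather than passed on, which mirrors what the FILTER does before the federated call to Wikidata.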

See also

Lingua Libre technical helps
Template {{Speakers category}} • {{Recommended lists}} • {{To iso 639-2}} • {{To iso 639-3}} • {{Userbox-records}} • {{Bot steps}}
Audio files How to create a frequency list? • Convert files formats • Denoise files with SoX • Rename and mass rename
Bots Help:Bots • LinguaLibre:Bot • Help:Log in to Lingua Libre with Pywikibot • Lingua Libre Bot (gh) • Olafbot • PamputtBot • Dragons Bot (gh)
MediaWiki MediaWiki: Help:Documentation opérationelle Mediawiki • Help:Database structure • Help:CSS • Help:Rename • Help:OAuth • LinguaLibre:User rights (rate limit) • Module:Lingua Libre record & {{Lingua Libre record}} • JS scripts: MediaWiki:Common.js • LastAudios.js • SoundLibrary.js • ItemsSugar.js • LexemeQueriesGenerator.js (pad) • Sparql2data.js (pad) • LanguagesGallery.js (pad) • Gadgets: Gadget-LinguaImporter.js • Gadget-Demo.js • Gadget-RecentNonAudio.js
Queries Help:APIs • Help:SPARQL • SPARQL (intermediate) (stub) • SPARQL for lexemes (stub) • SPARQL for maintenance • Lingualibre:Wikidata (stub) • Help:SPARQL (HAL)
Reuses Help:Download datasets • Help:Embed audio in HTML
Unstable & tests Help:SPARQL/test
Categories Category:Technical reports