Help

Difference between revisions of "SPARQL (intermediate)"

Help:SPARQL 2 will explore federated queries fetching data from both LinguaguaLibre and Wikidata's endpoints, then Wikidata Lexemes, an emerging source of lexicographic data. The duo can be a solid combo to provide lexicographic and multimedia (audio recordings and images) for either Wikimedia modules or web developers.

Line 144: Line 144:
 
   FILTER(LANG(?speakerLabel) = "en")
 
   FILTER(LANG(?speakerLabel) = "en")
 
}
 
}
 +
</query>
 +
|}
 +
 +
 +
=== ✅ Speakers → Countries with most speakers ===
 +
{| style="width:100%"
 +
|- style="vertical-align:top;"
 +
|style="padding: 0 3em;width:60%"|
 +
<syntaxhighlight lang="sparql">
 +
</syntaxhighlight>
 +
||
 +
<query _pagination="5">
 +
SELECT ?country ?continentLabel ?ISO3 ?countryLabel (COUNT(?country) AS ?count)
 +
WITH {
 +
  SELECT DISTINCT ?speaker {
 +
    ?speaker prop:P2 entity:Q3;
 +
  }
 +
} AS %speakers
 +
WITH {
 +
  SELECT DISTINCT
 +
    ?speaker
 +
    ?country
 +
    ?countryLabel
 +
    ?ISO3
 +
    ?continentLabel
 +
  {
 +
    INCLUDE %speakers.
 +
    ?speaker prop:P14 ?residence.
 +
    # Avoids weird errors.
 +
    FILTER(REGEX(?residence, "^Q[0-9]+$"))
 +
    BIND(IRI(CONCAT('http://www.wikidata.org/entity/', ?residence)) AS ?residenceId)
 +
   
 +
    # Get country from wikidata
 +
    SERVICE <https://query.wikidata.org/sparql> {
 +
      ?residenceId wdt:P17 ?country.
 +
      ?country rdfs:label ?countryLabel;
 +
              wdt:P298 ?ISO3;
 +
              wdt:P30 ?continent.
 +
      ?continent rdfs:label ?continentLabel.
 +
      FILTER(LANG(?countryLabel) = "en").
 +
      FILTER(LANG(?continentLabel) = "en").
 +
    }
 +
  }
 +
} AS %speakersWithCountries
 +
{
 +
  INCLUDE %speakersWithCountries.
 +
}
 +
GROUP BY ?country ?continentLabel ?ISO3 ?countryLabel
 +
ORDER BY DESC(?count)
 
</query>
 
</query>
 
|}
 
|}

Revision as of 21:12, 8 January 2022


Draft
Twemoji12 1f3d7.svg
Twemoji12 1f3d7.svg

This page is a work in progress.

Tools

  1. Lexemes Queries Generator


Languages

✅ Language () → List of LL languages with wd speaker population

Lexemes

✅ Language (d:Q12107) → List of wd lexemes

Example : Q12107 breton.

✅ Language () → List of wd lexemes with LL audio

✅ Language () → List of wd lexemes with LL audio and wd translation (d:Q150)

✅ Language () → List of wd lexemes (d:Q150)

Strange query from User:VIGNERON/common.js
SELECT DISTINCT ?lexemeLabel ?lexeme
WITH {
  SELECT ?lexeme ?lexemeLabel ?lexical_category WHERE {
    ?lexeme a ontolex:LexicalEntry ;
            dct:language wd:Q12107 ; 
            wikibase:lemma ?lexemeLabel .
    OPTIONAL {
      ?lexeme wikibase:lexicalCategory ?lexical_category .
    }
  }
} AS %results
WHERE {
  INCLUDE %results
  OPTIONAL {        
    ?lexical_category rdfs:label ?lexical_categoryLabel .
    FILTER (LANG(?lexical_categoryLabel) = "en")
  }
}

Speakers

✅ Speakers → Largest number of languages recorded and known

#Title: Speakers with recordings largest number of languages and known languages
SELECT ?speaker ?speakerLabel ?count ?languages
# Get audios, language, speaker triplet
WITH {
  SELECT DISTINCT ?speaker ?language {
    ?audio prop:P4 ?language;
           prop:P5 ?speaker.
  }
} AS %speakers
# Get the count of languages per each speaker
WITH {
  SELECT ?speaker (COUNT(?speaker) AS ?count) {
    INCLUDE %speakers.
  }
  GROUP BY ?speaker
  ORDER BY DESC(?count)
} AS %countOfLanguagesRecordedPerSpeaker
# Get the maximum number of languages per each speaker
WITH {
  SELECT (MAX(?count) AS ?maxNumberOfLanguagesRecorded) {
    INCLUDE %countOfLanguagesRecordedPerSpeaker.
  }
} AS %maxNumberOfLanguagesRecorded
# Get those speakers whose count equals the maximum number of languages
WITH {
  SELECT ?speaker ?count {
    INCLUDE %countOfLanguagesRecordedPerSpeaker.
    INCLUDE %maxNumberOfLanguagesRecorded.
    FILTER(?count = ?maxNumberOfLanguagesRecorded).
  }
} AS %speakersWithMostNumberOfLanguagesRecorded
# Get the languages of those speakers that have recorded audios in the
# most number of languages
WITH {
  SELECT ?speaker (GROUP_CONCAT(?languageLabel; SEPARATOR = ", ") AS ?languages) {
    INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
    ?speaker prop:P4 [
        rdfs:label ?languageLabel
      ]
    FILTER(LANG(?languageLabel) = "en").
  }
  GROUP BY ?speaker
} AS %languagesOfSpeakersWithMostNumberOfLanguagesRecorded
{
  INCLUDE %speakersWithMostNumberOfLanguagesRecorded.
  INCLUDE %languagesOfSpeakersWithMostNumberOfLanguagesRecorded.
  ?speaker rdfs:label ?speakerLabel.
  FILTER(LANG(?speakerLabel) = "en")
}
... Loading ...


✅ Speakers → Countries with most speakers

... Loading ...

See also

Lingua Libre technical helps
Template {{Speakers category}} • {{Recommended lists}} • {{To iso 639-2}} • {{To iso 639-3}} • {{Userbox-records}} • {{Bot steps}}
Audio files How to create a frequency list?Convert files formatsDenoise files with SoXRename and mass rename
Bots Help:BotsLinguaLibre:BotHelp:Log in to Lingua Libre with PywikibotLingua Libre Bot (gh) • OlafbotPamputtBotDragons Bot (gh)
MediaWiki MediaWiki: Help:Documentation opérationelle MediawikiHelp:Database structureHelp:CSSHelp:RenameHelp:OAuthLinguaLibre:User rights (rate limit) • Module:Lingua Libre record & {{Lingua Libre record}}JS scripts: MediaWiki:Common.jsLastAudios.jsSoundLibrary.jsItemsSugar.jsLexemeQueriesGenerator.js (pad) • Sparql2data.js (pad) • LanguagesGallery.js (pad) • Gadgets: Gadget-LinguaImporter.jsGadget-Demo.jsGadget-RecentNonAudio.js
Queries Help:APIsHelp:SPARQLSPARQL (intermediate) (stub) • SPARQL for lexemes (stub) • SPARQL for maintenanceLingualibre:Wikidata (stub) • Help:SPARQL (HAL)
Reuses Help:Download datasetsHelp:Embed audio in HTML
Unstable & tests Help:SPARQL/test
Categories Category:Technical reports