SPARQL for lexemes

Latest revision as of 15:42, 26 February 2024

Tools

Lexemes Queries Generator

{{LexemeQueriesGenerator}}

Notable elements

Table of notable Wikidata properties to create.

LinguaLibre endpoint

For recordings:

  • `instance of` P2:
      • is `record` Q2
      • is `speaker` Q3
      • is `language` Q4
  • `language` P4
  • `speaker` P5
  • `gender` P8
  • `wikidata` P12
  • `iso` P13
  • `media type` P24
      • media type Q88888
      • is `audio` Q88889
      • is `video` Q88890
      • is `written` Q1087276

Wikidata endpoint

For languages:

  • `instance of` P31/P279*
      • is `language` d:Q34770 (ethnic based), d:Q315 (capacity)
      • is `sign language` d:Q34228
      • is `endangered language` d:Q335214
      • is `severely endangered language` d:Q83365366
      • is `dead language` d:Q45762 (no community)
      • is `extinct language` d:Q38058796 (no speaker)
  • `ISO 639-1 code` P218
  • `ISO 639-2 code` P219
  • `ISO 639-3 code` P220
  • `IETF language tag` P305
  • `geographic coordinate` P625
  • `number of speakers` P1098
  • `wikimedia code` P424
  • `native name` P1705
  • `lingualibre ID` P10369

For lexemes:

  • TO BE COMPLETED!
  • Q82042 'part of speech'
  • P5137 'item for this sense'

Lexemes

See LinguaLibre:Technical board/Reports/2021/Wikidata Lexemes & Lingua Libre coordination assessment.

Part of speech

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT ?item ?itemLabel
WHERE {
  ?item wdt:P31 wd:Q82042 .   # Q82042 'part of speech'
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
# GROUP BY ?item
ORDER BY ASC(?itemLabel)
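
The query above only returns direct instances of Q82042. As a minimal sketch, the `instance of` P31/P279* path listed under Notable elements can be applied here too, assuming some parts of speech are modelled as instances of a subclass of Q82042 (that modelling is an assumption, not something this page states):

```sparql
#defaultEndpoint:Wikidata
SELECT DISTINCT ?item ?itemLabel
WHERE {
  # Instance of Q82042 'part of speech', or of any of its subclasses
  ?item wdt:P31/wdt:P279* wd:Q82042 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ASC(?itemLabel)
```

DISTINCT is used because an item can reach Q82042 through several subclass paths and would otherwise appear more than once.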

✅ Language (d:Q12107) → List of Wikidata lexemes for Breton

To run on WDQS.[1] Source: User:VIGNERON/common.js.

#defaultEndpoint:Wikidata
SELECT DISTINCT ?lexemeLabel ?lexeme
WITH {
  SELECT ?lexeme ?lexemeLabel ?lexical_category WHERE {
    ?lexeme a ontolex:LexicalEntry ;
            dct:language wd:Q12107 ; 
            wikibase:lemma ?lexemeLabel .
    OPTIONAL {
      ?lexeme wikibase:lexicalCategory ?lexical_category .
    }
  }
} AS %results
WHERE {
  INCLUDE %results
  OPTIONAL {        
    ?lexical_category rdfs:label ?lexical_categoryLabel .
    FILTER (LANG(?lexical_categoryLabel) = "en")
  }
}

A simpler variant that lists only the lexemes and their lemmas:

#defaultEndpoint:Wikidata
SELECT ?lexeme ?lemma
WHERE {
  ?lexeme dct:language wd:Q12107; 
          wikibase:lemma ?lemma.
}

✅ Language () → List of Wikidata lexemes with Lingua Libre audio

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT * WHERE {
  ?l ontolex:lexicalForm ?x .
  ?x wdt:P443 ?value .
  FILTER regex (str(?value), "^http://commons.wikimedia.org/wiki/Special:FilePath/LL-").
}
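
The query above is not restricted to any language. A possible sketch of a per-language version, reusing Breton (d:Q12107) from the earlier example purely for illustration; the variable names and the choice of `STRSTARTS` instead of `regex` are assumptions of this sketch:

```sparql
#defaultEndpoint:Wikidata
SELECT ?lexeme ?lemma ?form ?audio WHERE {
  ?lexeme dct:language wd:Q12107 ;       # Breton; replace with the target language
          wikibase:lemma ?lemma ;
          ontolex:lexicalForm ?form .
  ?form wdt:P443 ?audio .                # P443 'pronunciation audio'
  # Keep only Commons files whose name starts with the "LL-" prefix
  FILTER(STRSTARTS(STR(?audio), "http://commons.wikimedia.org/wiki/Special:FilePath/LL-"))
}
```

`STRSTARTS` does a plain prefix comparison, so the dots in the URL need no escaping as they would in a regular expression.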

✅ Language () → List of Wikidata lexemes with (Lingua Libre audio and) Wikidata translation (d:Q150)
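
No query is given for this case. A minimal sketch, assuming translations are recorded as P5972 'translation' statements between senses and that Lingua Libre audio is attached to forms through P443; lexemes whose translations are modelled differently will not appear:

```sparql
#defaultEndpoint:Wikidata
SELECT DISTINCT ?lexeme ?lemma ?audio ?frLexeme ?frLemma WHERE {
  ?lexeme wikibase:lemma ?lemma ;
          ontolex:lexicalForm ?form ;
          ontolex:sense ?sense .
  ?form wdt:P443 ?audio .                # P443 'pronunciation audio'
  FILTER(STRSTARTS(STR(?audio), "http://commons.wikimedia.org/wiki/Special:FilePath/LL-"))
  ?sense wdt:P5972 ?frSense .            # P5972 'translation' (assumed modelling)
  ?frLexeme ontolex:sense ?frSense ;
            dct:language wd:Q150 ;       # Q150 'French'
            wikibase:lemma ?frLemma .
}
LIMIT 100
```

The LIMIT is only there to keep the sketch quick to run; remove it to get the full list.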

✅ Concept (Q3142) → All lexemes whose sense is this concept

To run on WDQS.[1]

#defaultEndpoint:Wikidata
#title: Lexemes with senses linked to the item about the colour "red" (Q3142)
SELECT ?wikidataLexeme ?languageLabel ?lemma ?sense WHERE {
  ?wikidataLexeme dct:language ?language ; 
     wikibase:lemma ?lemma ; 
     ontolex:sense ?sense.
  ?language rdfs:label ?languageLabel .
  Filter(lang(?languageLabel)="en").
  ?sense wdt:P5137 wd:Q3142 . # Filter : P5137 'item for this sense' is Q3142 'red'
}
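
For coordination with Lingua Libre it can help to see which of these lexemes already carry a recording. A possible sketch, assuming pronunciation audio is attached to forms through P443; the OPTIONAL block and the variable names are choices of this sketch, not part of the original query:

```sparql
#defaultEndpoint:Wikidata
SELECT ?wikidataLexeme ?languageLabel ?lemma ?audio WHERE {
  ?wikidataLexeme dct:language ?language ;
     wikibase:lemma ?lemma ;
     ontolex:sense ?sense .
  ?sense wdt:P5137 wd:Q3142 .            # P5137 'item for this sense' is Q3142 'red'
  ?language rdfs:label ?languageLabel .
  FILTER(LANG(?languageLabel) = "en")
  OPTIONAL {
    ?wikidataLexeme ontolex:lexicalForm ?form .
    ?form wdt:P443 ?audio .              # P443 'pronunciation audio', if any
  }
}
ORDER BY ?languageLabel
```

Rows where ?audio is empty are lexemes for the concept that have no pronunciation audio on any form yet.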


✅ Given language (d:Q5146) → List existing parts of speech

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT ?pos ?posLabel (COUNT(?wikidataLexeme) AS ?quantity)
WHERE {
  ?wikidataLexeme dct:language wd:Q5146 ;     # Portuguese
        wikibase:lexicalCategory ?pos .       # Parts of speech
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?pos ?posLabel
ORDER BY DESC(?quantity)

✅ Given language (d:Q9192) → List Wikidata lexemes

To run on WDQS.[1]

#defaultEndpoint:Wikidata
SELECT ?wikidataLexeme ?posLabel (GROUP_CONCAT(?lemma;separator=" / ") as ?lemmas )
WHERE {
  ?wikidataLexeme dct:language wd:Q9192 ;    # Chinese
     wikibase:lemma ?lemma ;                 # Words
     wikibase:lexicalCategory ?pos .         # Part of speech
  ?pos rdfs:label ?posLabel .
  Filter(lang(?posLabel)="en").
}
GROUP BY ?wikidataLexeme ?posLabel

References

  1. WDQS: Wikidata Query Service, https://query.wikidata.org/
