Help

Difference between revisions of "Renaming"

Latest revision as of 20:42, 28 December 2023

Renaming using the file name's fields

Given file names of the form ./LL-{codesLang}-{speaker}-{word}.wav, for example ./LL-Q150_(fra)-Tsaag_Valren-zèbre.wav:

mkdir -p ./new                                # create the output directory
for file in ./LL-*.wav
do
   isolang=$(basename "$file" | cut -d- -f2 | cut -d_ -f2 | tr -d "()")    # split on "-", select field 2: "Q150_(fra)"; split on "_", select field 2; strip the parentheses: "fra"
   key=$(basename "$file" .wav | cut -d- -f4-)                             # strip the extension, split on "-", select field 4 onward: "zèbre"
   cp "$file" ./new/"$isolang"-"$key".wav                                  # ./new/fra-zèbre.wav
done
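
The same field extraction can also be sketched with shell parameter expansion, which avoids starting several processes per file. The helper name ll_new_name is hypothetical, and the sketch assumes that the speaker name contains no "-".

```shell
# Hypothetical helper: derive "{iso}-{word}.wav" from a Lingua Libre file name.
# Assumes the pattern LL-{Qid}_({iso})-{speaker}-{word}.wav, with no "-" in the speaker name.
ll_new_name() {
    name=${1##*/}          # drop the directory part
    name=${name%.wav}      # drop the extension
    rest=${name#LL-}       # Q150_(fra)-Tsaag_Valren-zèbre
    lang=${rest%%-*}       # Q150_(fra)
    iso=${lang#*\(}        # fra)
    iso=${iso%\)}          # fra
    rest=${rest#*-}        # Tsaag_Valren-zèbre
    word=${rest#*-}        # zèbre (any further "-" stays inside the word)
    printf '%s-%s.wav\n' "$iso" "$word"
}

ll_new_name "./LL-Q150_(fra)-Tsaag_Valren-zèbre.wav"    # prints fra-zèbre.wav
```

Inside the loop, the target could then be written as ./new/"$(ll_new_name "$file")".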

Renaming using metadata

IMPORTANT: This example processes a file produced by SWAC Recorder. Lingua Libre's file metadata is stored in the cloud: on the audio, speaker and language pages.

Dependencies

Warning: avconv has been removed from Ubuntu packages since 18.04. Use ffmpeg instead.
sudo apt-get install lame avconv                       # examine audio file's properties.
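
Because avconv may not be installed on recent systems, a script can check which tool is present before using it. The following is a minimal sketch; the function name pick_probe_tool is hypothetical.

```shell
# Hypothetical helper: print the name of the first available tool,
# preferring ffmpeg and falling back to avconv.
pick_probe_tool() {
    for tool in ffmpeg avconv; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    echo "neither ffmpeg nor avconv found" >&2
    return 1
}
```

A script could then run probe=$(pick_probe_tool) || exit 1 and call "$probe" -i "$file" wherever avconv appears in this page.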

avconv's output

avconv -i ./cmn-jiāoliú.flac 2>&1                       # print the file's metadata (available for some formats only)
ffmpeg version 2.8.14-0ubuntu0.16.04.1 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
[...]
Input #0, flac, from './cmn-jiāoliú.flac':
  Metadata:
    TITLE           : 交流
    LICENSE         : Creative Commons BY-SA 3.0 U.S
    COPYRIGHT       : (c) 2009 Yue Tan
    ARTIST          : Tan
    DATE            : 2009-07-08
    GENRE           : Speech
    SWAC_LANG       : cmn
    SWAC_TEXT       : 交流
    SWAC_ALPHAIDX   : jiāoliú
    SWAC_SPEAK_NAME : Tan
    SWAC_SPEAK_GENDER: F
    SWAC_SPEAK_BIRTH_YEAR: 1978
    SWAC_SPEAK_LANG : zho
    SWAC_SPEAK_LANG_REGION: Liaoning
    SWAC_SPEAK_LIV_COUNTRY: FR
    SWAC_SPEAK_LIV_TOWN: Caen
    SWAC_PRON_PHON  : jiāoliú
    SWAC_COLL_SECTION: HSK niveau II
    SWAC_COLL_LICENSE: Creative Commons BY-SA 3.0 U.S
    SWAC_COLL_COPYRIGHT: (c) 2009 Yue Tan
    SWAC_TECH_DATE  : 2009-07-08
    SWAC_TECH_SOFT  : Shtooka Recorder/1.3
  Duration: 00:00:01.40, start: 0.000000, bitrate: 447 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, mono, s16
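
A single tag can be isolated from this output with sed. The sketch below works on a saved excerpt of the output above, not on a real file, and the helper name swac_field is hypothetical. Anchoring the pattern at the start of the line keeps SWAC_LANG from also matching SWAC_SPEAK_LANG.

```shell
# Hypothetical helper: print the value of one metadata tag read from stdin.
swac_field() {
    # -n suppresses default output; p prints only the lines where the substitution matched.
    sed -n "s/^ *$1 *: //p"
}

# Excerpt of the metadata dump shown above.
sample='  Metadata:
    TITLE           : 交流
    SWAC_LANG       : cmn
    SWAC_TEXT       : 交流
    SWAC_ALPHAIDX   : jiāoliú
    SWAC_SPEAK_LANG : zho'

printf '%s\n' "$sample" | swac_field SWAC_TEXT     # prints 交流
printf '%s\n' "$sample" | swac_field SWAC_LANG     # prints cmn
```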

Renaming

mkdir -p ./new                                # create the output directory
for file in ./cmn-*.flac
do
    key=$(avconv -i "$file" 2>&1 | sed -ne 's/.*SWAC_TEXT *: //p')     # print the metadata, assign the SWAC_TEXT value to the variable
    cp "$file" ./new/cmn-"$key".flac                                   # ./new/cmn-交流.flac
done
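
A file without a SWAC_TEXT tag yields an empty key, and two recordings of the same word yield the same target name. The sketch below guards against both cases. The helper name safe_copy and the choice to skip rather than overwrite are assumptions, not part of the original procedure.

```shell
# Hypothetical helper: copy a file to ./new/{lang}-{key}.flac, skipping unsafe cases.
safe_copy() {
    src=$1 lang=$2 key=$3
    if [ -z "$key" ]; then
        echo "skip $src: no SWAC_TEXT value" >&2
        return 1
    fi
    case $key in
        */*) echo "skip $src: key contains a slash" >&2; return 1 ;;
    esac
    dest=./new/$lang-$key.flac
    if [ -e "$dest" ]; then
        echo "skip $src: $dest already exists" >&2
        return 1
    fi
    mkdir -p ./new && cp -- "$src" "$dest"
}
```

Inside the loop, the cp line could then be replaced with safe_copy "$file" cmn "$key".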
