Latest revision as of 17:14, 4 January 2024

Converting audio files is a frequently needed task when serving audio files to target services and users.

How to convert a large batch of audio files?

Dependencies

Warning: avconv has been removed from the Ubuntu packages since 18.04. Use ffmpeg instead.

sudo apt-get install lame ffmpeg
man lame                           # then search for parameters via "/{param}". ex: /-m
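
Before starting a long batch, it can help to confirm that the tools are actually installed. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of lame or ffmpeg.

```shell
# Print the name of every required command that is not found on PATH.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Warn early instead of failing halfway through a batch.
missing=$(missing_tools lame ffmpeg ffprobe)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```

If the function prints nothing, all listed commands were found.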

Technolect

cbr: constant bit rate.
abr: average bit rate.
vbr: variable bit rate.

For more, see man lame.
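
As a rough illustration, each of the three modes maps to its own lame options: --cbr with -b for constant bit rate, --abr for average bit rate, and -V for variable bit rate. The helper below only prints the options; its name lame_mode_args is hypothetical.

```shell
# Print the lame bitrate options for a given mode.
# $1: cbr | abr | vbr
# $2: kbps for cbr/abr, or a quality level 0-9 for vbr (0 = best quality)
lame_mode_args() {
  case "$1" in
    cbr) printf '%s\n' "--cbr -b $2" ;;
    abr) printf '%s\n' "--abr $2" ;;
    vbr) printf '%s\n' "-V $2" ;;
    *)   printf 'unknown mode: %s\n' "$1" >&2; return 1 ;;
  esac
}

lame_mode_args cbr 96   # prints: --cbr -b 96
lame_mode_args abr 24   # prints: --abr 24
lame_mode_args vbr 4    # prints: -V 4
```

The printed options can be passed to lame unquoted so the shell splits them into separate arguments, for example: lame $(lame_mode_args abr 24) -m m input.wav output.mp3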

Helpers

$ mkdir -p ./new/                    # create folder, if not existing (-p)
$ file="./data/cmn-0a0a8a8b.flac"    # store the path to the .flac file in the variable "$file"
$ ffprobe -hide_banner "$file"       # print out metadata of $file, for some formats only

$ ffprobe -hide_banner ./data/cmn-0a0a8a8b.ogg
Input #0, ogg, from './data/cmn-0a0a8a8b.ogg':
  Duration: 00:00:01.25, start: 0.000000, bitrate: 99 kb/s
  Stream #0:0: Audio: vorbis, 44100 Hz, mono, fltp, 80 kb/s
    Metadata:
      TITLE           : 高低
      LICENSE         : Creative Commons BY-SA 3.0 U.S
      COPYRIGHT       : (c) 2009 Yue Tan
      ARTIST          : Tan
      DATE            : 2009-07-08
      GENRE           : Speech
      SWAC_LANG       : cmn
      SWAC_TEXT       : 高低
      SWAC_ALPHAIDX   : gāodī
      SWAC_SPEAK_NAME : Tan
      SWAC_SPEAK_GENDER: F
      SWAC_SPEAK_BIRTH_YEAR: 1978
      SWAC_SPEAK_LANG : zho
      SWAC_SPEAK_LANG_REGION: Liaoning
      SWAC_SPEAK_LIV_COUNTRY: FR
      SWAC_SPEAK_LIV_TOWN: Caen
      SWAC_PRON_PHON  : gāodī
      SWAC_COLL_SECTION: HSK niveau IV
      SWAC_COLL_LICENSE: Creative Commons BY-SA 3.0 U.S
      SWAC_COLL_COPYRIGHT: (c) 2009 Yue Tan
      SWAC_TECH_DATE  : 2009-07-08
      SWAC_TECH_SOFT  : Shtooka Recorder/1.3

Simple batch format conversion

for file in ./flac/*.flac
do
  key=$(basename "$file" .flac).mp3                                 # name of the file minus .flac, plus .mp3 
  lame --abr 24    -m m -h --resample 22.05 "$file" "./new/$key";
done
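
The loop above can be made safer to re-run. The sketch below is a dry run under a few assumptions: the helper name mp3_path is hypothetical, files already present in ./new/ are skipped, and an empty ./flac/ folder produces no work. It prints each lame command instead of running it.

```shell
# Map an input path to its output path: ./flac/a.flac -> ./new/a.mp3
# $1: input .flac path, $2: output folder (trailing slash is optional)
mp3_path() {
  printf '%s/%s.mp3\n' "${2%/}" "$(basename "$1" .flac)"
}

for file in ./flac/*.flac; do
  [ -e "$file" ] || continue        # an unmatched glob stays literal: skip it
  out=$(mp3_path "$file" ./new)
  if [ -e "$out" ]; then continue; fi   # already converted by an earlier run
  # Dry run: remove "echo" to actually convert.
  echo lame --abr 24 -m m -h --resample 22.05 "$file" "$out"
done
```

Printing the commands first makes it easy to check the generated file names before any audio is written.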

Metadata-based format conversion

This example works on audio files made with the SWAC recorder, available at https://packs.shtooka.net. Those files carry rich metadata, including a SWAC_TEXT field. In this example, we assume a folder containing the file audio-0a6f36g.flac, whose metadata includes SWAC_TEXT : 很. The value of SWAC_TEXT is used to build the name of each output file.

mkdir -p ./new-24k ./new-96k                                                     # create both output folders, if not existing (-p)
for file in ./flac/*.flac
do
  key=$(ffmpeg -i "$file" 2>&1 | sed -ne 's/.*SWAC_TEXT *: //p')                 # read the metadata, assign SWAC_TEXT's value to $key
  lame --abr 24    -m m -h --resample 22.05 "$file" "./new-24k/cmn-$key.mp3"     # ex: cmn-很.mp3 (24k abr)
  lame --cbr -b 96 -m m -h --resample 22.05 "$file" "./new-96k/cmn-$key.mp3"     # ex: cmn-很.mp3 (96k cbr)
done
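
The sed expression used to read SWAC_TEXT can be tried on its own, without any audio file. The sketch below feeds it a few lines copied from the ffprobe output in the Helpers section. The function name swac_text, the replacement of "/" by "_", and the "untitled" fallback are assumptions added for illustration, not part of the original recipe.

```shell
# Read ffmpeg/ffprobe metadata text on stdin and print the first SWAC_TEXT value.
# Any "/" is replaced by "_" so the value is safe to use inside a file name.
swac_text() {
  sed -n 's/.*SWAC_TEXT *: //p' | head -n 1 | tr '/' '_'
}

# Sample lines copied from the ffprobe output shown in the Helpers section.
sample='      SWAC_LANG       : cmn
      SWAC_TEXT       : 高低
      SWAC_ALPHAIDX   : gāodī'

key=$(printf '%s\n' "$sample" | swac_text)
if [ -z "$key" ]; then
  key="untitled"        # hypothetical fallback when the field is missing
fi
echo "cmn-$key.mp3"     # prints: cmn-高低.mp3
```

Checking for an empty key matters because a file without the SWAC_TEXT field would otherwise be written as cmn-.mp3, and every such file would overwrite the previous one.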


@@ Line 1: / Line 1: @@
+'''Converting audio files''' is a frequently needed task when serving audio files to target services and users.
 = How to convert large batch of audios ? =
 == Dependencies ==
-<source lang="bash">
+:''<span style="color:#B10000">Warning : <code>avconv</code> have been removed from Ubuntu packages since 18.04. Use ffmpeg instead.</span>''
-sudo apt-get install lame avconv
+<syntaxhighlight lang="bash">
+sudo apt-get install lame ffmpeg
 man lame                           # then search for parameters via "/{param}". ex: /-m
-</source>
+</syntaxhighlight>
-== Technolecte ==
+== Technolect ==
 * <code>cbr</code>: constant bit rate.
 * <code>abr</code>: average bit rate.
@@ Line 13: / Line 16: @@
 == Helpers ==
-<source lang="bash">
+<syntaxhighlight lang="bash">
-mkdir -p ./new/                     # create folder, if not existing (-p)
+$ mkdir -p ./new/                    # create folder, if not existing (-p)
-file="./dir/audio-0a6f36g.flac"     # path to .flac file into varible &quot;$file&quot;
+$ file="./data/cmn-0a0a8a8b.flac"    # path to .flac file into varible &quot;$file&quot;
-avconv -i "$file" 2>1               # print out metadata of $file, for some formats only
+$ ffprobe -hide_banner "$file"       # print out metadata of $file, for some formats only
-</source>
+$ ffprobe -hide_banner ./data/cmn-0a0a8a8b.ogg
+Input #0, ogg, from './data/cmn-0a0a8a8b.ogg':
+  Duration: 00:00:01.25, start: 0.000000, bitrate: 99 kb/s
+  Stream #0:0: Audio: vorbis, 44100 Hz, mono, fltp, 80 kb/s
+    Metadata:
+      TITLE           : 高低
+      LICENSE         : Creative Commons BY-SA 3.0 U.S
+      COPYRIGHT       : (c) 2009 Yue Tan
+      ARTIST          : Tan
+      DATE            : 2009-07-08
+      GENRE           : Speech
+      SWAC_LANG       : cmn
+      SWAC_TEXT       : 高低
+      SWAC_ALPHAIDX   : gāodī
+      SWAC_SPEAK_NAME : Tan
+      SWAC_SPEAK_GENDER: F
+      SWAC_SPEAK_BIRTH_YEAR: 1978
+      SWAC_SPEAK_LANG : zho
+      SWAC_SPEAK_LANG_REGION: Liaoning
+      SWAC_SPEAK_LIV_COUNTRY: FR
+      SWAC_SPEAK_LIV_TOWN: Caen
+      SWAC_PRON_PHON  : gāodī
+      SWAC_COLL_SECTION: HSK niveau IV
+      SWAC_COLL_LICENSE: Creative Commons BY-SA 3.0 U.S
+      SWAC_COLL_COPYRIGHT: (c) 2009 Yue Tan
+      SWAC_TECH_DATE  : 2009-07-08
+      SWAC_TECH_SOFT  : Shtooka Recorder/1.3
+</syntaxhighlight>
 == Simple batch format conversion ==
-<source lang="bash">
+<syntaxhighlight lang="bash">
 for file in ./flac/*.flac
 do
@@ Line 26: / Line 57: @@
    lame --abr 24    -m m -h --resample 22.05 "$file" "./new/$key";
 done
-</source>
+</syntaxhighlight>
 == Metadata-based format conversion ==
-This example works on SWAC recorder audio files having the <code>SWAC_TEXT</code> metadata field. In this exemple, we assume a folder with file <code>audio-0a6f36g.flac</code> and metadata <code>SWAC_TEXT    : 很</code>.
+This example works on SWAC recorder audio files available at https://packs.shtooka.net. Those files contain rich metadata, with a <code>SWAC_TEXT</code> metadata field. In this example, we assume a folder with file <code>audio-0a6f36g.flac</code> and metadata <code>SWAC_TEXT    : 很</code>.
-<source lang="bash">
+<syntaxhighlight lang="bash">
 for file in ./flac/*.flac
 do
-   key=$(avconv -i "$file" 2>&1 | sed -ne 's/.*SWAC_TEXT *: //p')                 # print metadata, assign SWAC_TEXT's value to variable.
+   key=$(ffmpeg -i "$file" 2>&1 | sed -ne 's/.*SWAC_TEXT *: //p')                 # print metadata, assign SWAC_TEXT's value to variable.
    lame --abr 24    -m m -h --resample 22.05 "$file" "./new-24k/cmn-$key.mp3";    # ex: cmn-很.mp3 (24k abr)
    lame --cbr -b 96 -m m -h --resample 22.05 "$file" "./new-96k/cmn-$key.mp3";    # ex: cmn-很.mp3 (96k cbr)
-done</source>
+done</syntaxhighlight>
+== See also ==
+* [[:c:Commons:Audio]]
+{{Technicals}}
-{{LinguaLibre scripts}}
+[[Category:Lingua Libre:Help{{#translation:}}]]


Difference between revisions of "Converting audios"