Bringing Ogg/Vorbis encoding to Chrome’s Audio API

Firefox can pull your silky voice in over a microphone and generate Ogg/Vorbis. This format is small enough to allow anyone to upload 3 minutes of audio in as long as it takes to send a large photo by email.

Chrome, however, only gives us PCM data. So what are you gonna do? Spend an hour uploading WAV files to a server? No thanks.
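To put rough numbers on that gap, here is a back-of-the-envelope sketch. The 16-bit mono 44.1kHz PCM format and the 64 kbps Vorbis bitrate are assumptions chosen for illustration, and the helper names are hypothetical.

```javascript
// Rough upload sizes for a 3-minute recording.
// Assumed for illustration: 16-bit mono PCM at 44.1kHz, Vorbis at about 64 kbps.
function pcmBytes(seconds, sampleRate, channels, bytesPerSample) {
  return seconds * sampleRate * channels * bytesPerSample;
}

function compressedBytes(seconds, bitsPerSecond) {
  return seconds * bitsPerSecond / 8;
}

var seconds = 3 * 60;
var wavSize = pcmBytes(seconds, 44100, 1, 2);  // 15,876,000 bytes, ignoring the WAV header
var oggSize = compressedBytes(seconds, 64000); // 1,440,000 bytes, ignoring container overhead

console.log('PCM: ' + wavSize + ' bytes, Vorbis: ' + oggSize + ' bytes');
```

Under those assumptions the raw PCM is roughly eleven times larger than the encoded audio, and stereo PCM doubles the gap.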

I’m going to use Emscripten to compile libogg + libvorbis into JavaScript, and then encode the PCM data arriving from the mic into a more bandwidth-friendly format. Also: it would scale better not to have the web server do all of the audio encoding :)

Build step:

I have a tiny wrapper that performs Vorbis encoding, feeds packets into Ogg, and outputs encoded data into a buffer that can then be played via <audio> or pushed to a server via XMLHttpRequest. The above build step compiles and exports the minimal amount of JavaScript needed to support only the library features that I’m using.

The Emscripten overhead/bootstrap: 200KB
A full build exporting all ogg/vorbis functions: 4MB
Minimal build exporting just the critical parts (above): 1.6MB

1.6MB isn’t awful. Gzip it and it’s fairly manageable. But we can do better!

Crack open the generated JavaScript

Scanning through, the first thing I notice is gigantic blobs of binary data (these are memory initializers: the static data, such as lookup tables, of the ogg/vorbis libraries). These account for a good 1MB of the generated file. A good target.

It looks like this:

Each byte is represented as an integer (0-255), written out as ASCII text. This means a single byte like “119” takes 3 bytes on disk, plus one more for the comma that separates it from the next value. My first thought: shrink it by storing it in pure binary form. But you can’t just write raw binary into a JavaScript file; that will generate random characters and break the code. So you need to encode it somehow. Base64 is fast and well supported, so it is worth trying.
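To see how much the encoding could save, here is a small sketch that compares the two text representations. The helper names and the sample data (every byte value 0..255 once) are assumptions for illustration; real blobs will have a different mix of values.

```javascript
// Size of a byte array written as comma-separated decimal text, e.g. "119,7,255".
function asciiListSize(bytes) {
  var size = 0;
  for (var i = 0; i < bytes.length; i++) {
    size += String(bytes[i]).length; // 1 to 3 digits per byte
  }
  return size + Math.max(bytes.length - 1, 0); // one comma between entries
}

// Size of the same bytes as padded base64: 4 characters per 3 input bytes.
function base64Size(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Sample data: every byte value 0..255 once (an assumption, not real blob data).
var sample = [];
for (var b = 0; b < 256; b++) {
  sample.push(b);
}

var asciiSize = asciiListSize(sample);   // 913 characters
var b64Size = base64Size(sample.length); // 344 characters

console.log('ASCII list: ' + asciiSize + ', base64: ' + b64Size);
```

For this sample, base64 needs well under half the space of the integer list, which is in line with the savings measured on the real build.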

How do we do that? Need to find the code in Emscripten responsible for generating these blobs. Scan the emscripten folder for the “memory initializer” comment and see where that takes us. Turns out: tools/shared.py

This method generates the big blobs. Tweak it to use base64 encoding:

Original output was done via:  ','.join(contents),
Raw binary (bad idea) can be done via:  ''.join(struct.pack('B',int(i)) for i in contents),
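Base64 encoded binary (nice and safe) via:  base64.b64encode(''.join(struct.pack('B',int(i)) for i in contents)),
(This needs base64 and struct imported at the top of shared.py, if they are not already, and the ''.join of packed bytes assumes Python 2 strings.)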

Disclaimer: I don’t know Python at all, so the above code may be hazardous to your health.

Next: modifying the allocate() function to decode base64 encoded binary.

We just changed the parameters fed into the allocate() method, so we need to modify it to be able to deal with base64 encoded data.

But I don’t want to modify the global allocate(), since that may have unintended consequences. Rather, let’s make an alternative one to use with our large binary blobs.

1) Search-and-replace  /* memory initializer */ allocate  with  /* memory initializer (tweaked) */ allocateBase64Encoded

2) Create this modified version of allocate() called allocateBase64Encoded():

The above allocator is extremely stripped down. It relies on a base64-string-to-Uint8Array decoder I borrowed from the Mozilla docs, and it only handles ALLOC_NONE “i8” (unsigned char) slabs. The above snippet can be injected into Emscripten’s src/preamble.js or manually into generated code on an individual basis.
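A minimal sketch of what such an allocator and the search-and-replace step might look like. This is written under assumptions: HEAPU8 and ALLOC_NONE are stand-ins for what Emscripten-generated code already provides, base64ToUint8Array and retargetInitializers are hypothetical names, and the decoder is a simple hand-rolled one rather than the one from the Mozilla docs.

```javascript
// Stand-ins so the sketch runs on its own; Emscripten output defines the real ones.
var HEAPU8 = new Uint8Array(64);
var ALLOC_NONE = 4; // assumed value, used here only as a marker

var B64_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode a standard base64 string into a Uint8Array (hypothetical helper).
function base64ToUint8Array(b64) {
  var clean = b64.replace(/[^A-Za-z0-9+\/]/g, ''); // drop '=' padding and whitespace
  var out = new Uint8Array(Math.floor(clean.length * 3 / 4));
  var acc = 0, bits = 0, o = 0;
  for (var i = 0; i < clean.length; i++) {
    // Shift in 6 bits per character, keeping only the low 16 bits around.
    acc = ((acc << 6) | B64_CHARS.indexOf(clean.charAt(i))) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[o++] = (acc >> bits) & 0xff; // emit one full byte
    }
  }
  return out;
}

// Stripped-down allocate(): only ALLOC_NONE "i8" slabs, copied straight to ptr.
function allocateBase64Encoded(b64, types, allocator, ptr) {
  if (types !== 'i8' || allocator !== ALLOC_NONE) {
    throw new Error('allocateBase64Encoded only handles ALLOC_NONE "i8" slabs');
  }
  HEAPU8.set(base64ToUint8Array(b64), ptr);
  return ptr;
}

// Step 1 as a plain string replacement over the generated source (hypothetical helper).
function retargetInitializers(source) {
  return source
    .split('/* memory initializer */ allocate')
    .join('/* memory initializer (tweaked) */ allocateBase64Encoded');
}

// Usage: "dwf/" is the base64 form of the bytes 119, 7, 255.
allocateBase64Encoded('dwf/', 'i8', ALLOC_NONE, 8);
```

Keeping this separate from the global allocate() means only the rewritten memory initializer calls take the base64 path; every other allocation in the generated code is untouched.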

Resulting JavaScript blobs:

No more ASCII integer arrays, but rather a nice base64 string. And we save a good 600KB!
Minimal build using base64 encoding: 1.0MB

Shrinking the binary libraries

libvorbisenc.so takes up 440KB, libvorbis.so 220KB, libogg.so 30KB. So let’s tackle the fat library first.

Running llvm-ar t libvorbisenc.a  tells us that the entire archive is just vorbisenc.o. Taking a look at vorbisenc.c, we can spot a lot of large headers responsible for its 440KB weight. Let’s gut some of these headers:

We will only support 44kHz. Delete vorbisenc.o and rebuild:

The encoder object shrank by a good 270KB!

Minimal gutted base64 build: 730KB

Gzipped: 200KB! This is VERY manageable.

13 thoughts on “Bringing Ogg/Vorbis encoding to Chrome’s Audio API”

  1. Awesome!!!! 730 KB with an encoder in JS is pretty awesome!!

  2. Hi, this is good work!

    Can you create a step-by-step howto and publish this (for Linux; I use Ubuntu 12.04 64-bit), e.g. on GitHub or here?

    Or put the compiled JS files on the blog as a download?

    The big question is encoder performance: do you have performance data?

    I tried a Speex encoder built into a realtime HTML5 upstream application. The performance results are good overall. My laptop is a 4-core machine and it works fine. The Speex source is parsed in blocks (4096 samples per block), uploaded to the server, and restreamed to many clients.

    • can you share how you implemented the speex.js encoding? I am attempting this, and am having issues, any help much much appreciated!

      failing that, some more explanation as to how to get the emscripten method working would be so helpful!

  3. Allan Simon says:

    At the beginning of your article you say

    “Firefox can pull your silky voice in over a microphone and generate Ogg/Vorbis.”

    How ? I’m pretty interested by it, any link or piece of code would be really helpful, thanks

  4. Pingback: Browser Audio Encoding, Atom Open Source | 8tut.com

  5. Very interesting, but useless without detailed information.

  6. I’m very interested in this as well. Have you published this library at all?

  7. Great work! and nice post ;)
    Thanks!

  8. Hey, i’ve tried all of the above and I get this when I run the emcc -O2 command above:

    warning: unresolved symbol: vorbis_analysis_buffer
    warning: unresolved symbol: vorbis_encode_init_vbr
    warning: unresolved symbol: ogg_stream_packetin
    warning: unresolved symbol: vorbis_comment_clear
    warning: unresolved symbol: vorbis_info_clear
    warning: unresolved symbol: vorbis_block_clear
    warning: unresolved symbol: vorbis_comment_add_tag
    warning: unresolved symbol: ogg_stream_flush
    warning: unresolved symbol: ogg_stream_clear
    warning: unresolved symbol: vorbis_bitrate_flushpacket
    warning: unresolved symbol: vorbis_analysis_init
    warning: unresolved symbol: ogg_stream_pageout
    warning: unresolved symbol: vorbis_analysis_headerout
    warning: unresolved symbol: vorbis_analysis_wrote
    warning: unresolved symbol: vorbis_analysis
    warning: unresolved symbol: vorbis_info_init
    warning: unresolved symbol: vorbis_dsp_clear
    warning: unresolved symbol: vorbis_analysis_blockout
    warning: unresolved symbol: ogg_stream_init
    warning: unresolved symbol: vorbis_block_init
    warning: unresolved symbol: vorbis_bitrate_addblock
    warning: unresolved symbol: vorbis_comment_init

    I built libogg and libvorbis from source using emconfigure and placed them in the library directory as above, but it cannot find the symbols when linking…

    Oh no!

    D.

  9. Can’t you just provide working .js file or real demo on the web? Prove that you actually got it working :)

  10. Check out https://github.com/Garciat/libvorbis.js for a working distribution of the library.

  11. Amazing work. I’m trying to do the same but converting to m4a files; I only found ffmpeg, but it is too big and too slow :(
