WebAudio is a JavaScript API for processing and synthesizing audio in web applications. It was first implemented in Google Chrome and later in Safari. The intention behind this blog post is to explain how other WebKit ports can support the WebAudio APIs using GStreamer. The WebKit GTK+ port is already up to speed, and the EFL and Qt ports are within reach as well.

The Chromium WebAudio demos don’t work in Epiphany yet because WebKitGTK+ doesn’t ship with WebAudio support enabled by default and the WebView used by Epiphany doesn’t enable the enable-web-audio WebKitWebSetting. However, for the brave people willing to build WebKitGTK+ themselves, here are the steps to follow:

$ Tools/Scripts/build-webkit --gtk --web-audio
$ Tools/Scripts/run-launcher --gtk --enable-web-audio=TRUE

I especially like the drum machine demo, pretty impressive stuff! So, back to the topic. Fortunately the WebAudio AudioContext was designed to allow multiple platform backends in WebCore; I’m going to talk about the GStreamer backend. It was first prototyped by Zan Dobersek, who kindly passed the task on to me; I continued the work and upstreamed it to WebKit trunk.

The two biggest features of the backend are:

1. Decoding audio data from a media file or from a memory pointer and pulling it into the WebAudio machinery of WebKit where it’s going to be processed by the WebAudio pipeline.

2. Getting audio data back as WAV from the WebAudio pipeline and playing it :)

For the first step we use decodebin2 with data coming from either filesrc or giostreamsrc. The decoded data is split into mono channels using the deinterleave element, passed to appsinks and converted to AudioChannels, which internally store the data in FloatArrays. Every AudioChannel is stored in an AudioBus instance which we pass on to WebCore for further internal processing. The implementation can be found in the AudioFileReaderGStreamer.cpp file.
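To make the deinterleaving step concrete, here is a small conceptual sketch in Python of what the deinterleave element and the appsinks achieve together. The function name and the data shapes are mine, not WebKit’s; the real work is done by GStreamer on raw buffers, this only illustrates the idea of turning interleaved frames into per-channel float data, analogous to WebCore’s AudioChannel.

```python
# Conceptual sketch, not WebKit code: interleaved stereo frames
# [L0, R0, L1, R1, ...] are split into one float list per channel,
# similar to what deinterleave + appsinks feed into AudioChannels.

def deinterleave(samples, channels):
    """Split interleaved samples into per-channel lists."""
    return [samples[c::channels] for c in range(channels)]

interleaved = [0.1, -0.1, 0.2, -0.2, 0.3, -0.3]  # L, R, L, R, L, R
left, right = deinterleave(interleaved, 2)
print(left)   # [0.1, 0.2, 0.3]
print(right)  # [-0.1, -0.2, -0.3]
```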

Internally the WebAudio stack also relies on our GStreamer FFTFrame implementation which uses the gst-fft library to perform Fast Fourier Transforms.
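For readers unfamiliar with what FFTFrame is used for: WebAudio nodes such as the convolver need forward and inverse transforms of audio blocks. As a rough illustration of the round trip involved, here is a naive direct DFT in Python; gst-fft of course implements an optimized FFT in C, and none of these names come from the WebKit or gst-fft APIs.

```python
import cmath

# Naive O(n^2) DFT, purely illustrative of the forward/inverse
# transform pair that FFTFrame exposes via gst-fft.

def dft(xs):
    """Forward discrete Fourier transform of a real signal."""
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(xs))
            for k in range(n)]

def idft(Xs):
    """Inverse transform; returns the real part of the signal."""
    n = len(Xs)
    return [sum(X * cmath.exp(2j * cmath.pi * k * i / n)
                for i, X in enumerate(Xs)).real / n
            for k in range(n)]

signal = [0.5, -0.25, 1.0, 0.0]
roundtrip = idft(dft(signal))  # recovers the original samples
```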

Now let’s talk about playback! Once the WebAudio stack has finished processing data, and if the web developer created an AudioDestinationNode as part of the WebAudio pipeline, the user needs to hear something playing :) As the mighty reader might guess, the magic is done in the AudioDestinationGStreamer.cpp file! I decided to implement most of the logic for reading data from the AudioBus as a GStreamer source element, called WebKitWebAudioSourceGStreamer. This element takes every AudioChannel of the AudioBus, converts the FloatArray data to GStreamer buffers, interleaves the channels and encodes the whole thing as WAV data using the wavenc element. In the AudioDestination GStreamer pipeline we then use this shiny source element, parse the WAV data and pass it on to the audio sink!
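The interleave-then-encode step above can be sketched conceptually in Python. This is not WebKit code: the real element works on float GStreamer buffers and delegates the WAV muxing to wavenc, while this sketch quantizes to 16-bit PCM and uses Python’s standard wave module, just to show the shape of the data flowing out of the source element.

```python
import io
import struct
import wave

def interleave(channels):
    """Merge per-channel float lists into one interleaved list."""
    return [s for frame in zip(*channels) for s in frame]

def to_wav(channels, rate=44100):
    """Interleave float samples in [-1, 1], quantize them to 16-bit
    PCM and wrap them in a WAV container (wavenc's job in the real
    pipeline, which keeps float samples instead of quantizing)."""
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in interleave(channels))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(len(channels))
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(pcm)
    return buf.getvalue()

wav = to_wav([[0.0, 0.5], [0.0, -0.5]])  # two mono channels, 2 frames
print(wav[:4])  # b'RIFF'
```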

What’s next in our roadmap?

  • I’m not sure this source element was the correct approach in the end; at some point I’ll probably refactor it into the AudioDestination implementation.
  • Port to GStreamer 0.11 APIs! This work is on-going already for the MediaPlayer.
  • There are still some bugs to fix before I feel really confident about enabling this feature by default. We should also add support for reading data from HTML5 Media elements.
  • The WebKit webaudio layout tests are almost passing on the build bots, this work is on-going as well.
  • Enable WebAudio by default in WebKitGTK+’s build and in Epiphany, targeting GNOME 3.6.

I would like to thank Igalia for allowing me to dedicate work time on this project :)
