Graphics overlays are everywhere in today's live video broadcasting industry. In this post I introduce a new demo relying on GStreamer and WPEWebKit to deliver low-latency, web-augmented video broadcasts.

Readers of this blog might remember a few posts about WPEWebKit and a GStreamer element we at Igalia worked on. In December 2018 I introduced GstWPE, and a few months later I blogged about a proof-of-concept application I wrote for it. So, learning from this first iteration, I wrote another demo!

The first demo was already quite cool, but it had a few downsides:

  1. It worked only on desktop (running in a Wayland compositor). The Wayland compositor dependency can be a burden in some cases. Ideally we could imagine GstWPE applications running “in the cloud”, on bare-metal machines without a GPU.
  2. While it was cool to stream to Twitch, YouTube and the like, these platforms currently ingest only RTMP streams. That means the latency introduced can be quite significant, depending on network conditions of course, but even in ideal conditions it was between one and two seconds. This is not great, in the world we live in.

To address the first point, WPE founding engineer Žan Doberšek enabled software rasterization support in WPE and its FDO backend. This is great because it allows WPE to run on machines without a GPU (such as continuous-integration builders and test bots), but also “in the cloud”, where machines with a GPU are less affordable than bare-metal ones! Following up, I enabled this feature in GstWPE. The source element's caps template now has video/x-raw, in addition to video/x-raw(memory:GLMemory). To force swrast, you need to set the LIBGL_ALWAYS_SOFTWARE=true environment variable. The downside of swrast is that you need a good CPU; how good depends, of course, on the video resolution and framerate you want to target.
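As a minimal sketch, here is what running wpesrc without a GPU could look like. The URL, caps and sink below are illustrative choices, not taken from the demo; wpesrc ships in gst-plugins-bad.

```shell
# Sketch: force software rasterization so wpesrc can run without a GPU.
# Assumes a GStreamer install that provides the wpesrc element.
export LIBGL_ALWAYS_SOFTWARE=true
gst-launch-1.0 wpesrc location="https://wpewebkit.org" \
  ! "video/x-raw,width=1280,height=720,framerate=30/1" \
  ! videoconvert ! autovideosink
```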

On the latency front, I decided to switch from RTMP to WebRTC! This W3C spec isn’t only about video chat! With WebRTC, sub-second one-to-many live broadcasting can be achieved without much effort, provided you have a good SFU (Selective Forwarding Unit). For this demo I chose Janus, because its APIs are well documented and it’s a cool project! I’m not sure it would scale very well in large deployments, but for my modest use case it fits very well.

Janus has a plugin called video-room which allows multiple participants to chat. Now imagine a single participant publishing its video stream and multiple “clients” connecting to that room without sharing any video or audio stream of their own: that’s one-to-many broadcasting. As it turns out, GStreamer applications can already connect to this video-room plugin using GstWebRTC! A demo was developed in Python by tobiasfriden and saket424; it recently moved to the gst-examples repository. As I kind of prefer to use Rust nowadays (whenever I can, anyway), I ported this demo to Rust, and it was upstreamed in gst-examples as well. This specific demo streams the video test pattern to a Janus instance.

Adapting this Janus demo was then quite trivial. By relying on a video mixer approach similar to the one I used for the first GstWPE demo, I had a GstWPE-powered web view streaming to Janus.
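The mixing itself can be sketched in gst-launch form. The standard compositor element blends a transparent web page rendered by wpesrc over a base video source; the overlay URL, caps and the draw-background property usage are assumptions here, and the actual demo builds a similar pipeline programmatically in Rust.

```shell
# Sketch: composite a transparent wpesrc-rendered page over a test source.
# Assumes an overlay server on localhost:3000; not the demo's exact pipeline.
gst-launch-1.0 \
  compositor name=mix sink_1::zorder=1 ! videoconvert ! autovideosink \
  videotestsrc ! "video/x-raw,width=1280,height=720" ! mix.sink_0 \
  wpesrc location="http://127.0.0.1:3000/" draw-background=false \
    ! "video/x-raw,width=1280,height=720" ! mix.sink_1
```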

The next step was the actual graphics overlay infrastructure. In the first GstWPE demo I had a basic GTK UI for editing the overlays on the fly. That couldn’t be used for this new demo, because I wanted to run it headless. After doing some research I found a really nice NodeJS app on GitHub, developed by Luke Moscrop, who’s actually one of the main developers of the BBC Brave project. The Roses CasparCG Graphics app was developed in the context of the Lancaster University Students’ Union TV Station; it starts a web server on port 3000 with two main entry points:

  • An admin web UI (at /admin/) allowing you to create and manage overlays, such as sports scoreboards, info banners, and so on.
  • The target overlay page (at the root location of the server), which is a web page without a predetermined background, displaying the overlays with HTML, CSS and JS. This page is meant to be fed to CasparCG (or GstWPE :))
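Assuming the app runs locally on its default port, the two entry points can be poked at like this (paths as described above, host and port being my local-setup assumptions):

```shell
# Check the two entry points of a locally running overlay server:
curl -I http://127.0.0.1:3000/admin/   # admin UI: create and manage overlays
curl -I http://127.0.0.1:3000/         # overlay page, to be fed to GstWPE
```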

After making a few tweaks to this NodeJS app, I can now:

  1. Start the NodeJS app, load the admin UI in a browser and enable some overlays
  2. Start my native Rust GStreamer/WPE application, which:
    • connects to the overlay web-server
    • mixes a live video source (a webcam, for instance) with the WPE-powered overlay
    • encodes the video stream to H.264, VP8 or VP9
    • sends the encoded RTP stream using WebRTC to a Janus server
  3. Let “consumer” clients connect to Janus with their browser, in order to see the resulting live broadcast.
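Put together, step 2 boils down to a pipeline along these lines. This is only a conceptual sketch: webrtcbin requires application-driven signaling against Janus's video-room API (which the Rust demo implements), so this gst-launch line will not negotiate a session on its own, and the overlay URL and encoder settings are placeholders.

```shell
# Conceptual pipeline of the native app (signaling omitted): webcam and
# WPE-rendered overlay are mixed, VP8-encoded, RTP-payloaded and handed
# to webrtcbin for delivery to Janus over WebRTC.
gst-launch-1.0 \
  compositor name=mix sink_1::zorder=1 \
    ! videoconvert ! vp8enc deadline=1 ! rtpvp8pay \
    ! "application/x-rtp,media=video,encoding-name=VP8,payload=96" \
    ! webrtcbin name=sender \
  v4l2src ! videoconvert ! mix.sink_0 \
  wpesrc location="http://127.0.0.1:3000/" draw-background=false \
    ! videoconvert ! mix.sink_1
```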

(If the video doesn’t display, here is the YouTube link.)

This is pretty cool and fun, as my colleague Brian Kardell mentions in the video. Working on this new version gave me more ideas for the next one. And very recently, the audio rendering protocol was merged in WPEBackend-FDO! That means even more use cases are now unlocked for GstWPE.

This demo’s source code is hosted on GitHub. Feel free to open issues there; I am always interested in getting feedback, good or bad!

GstWPE is maintained upstream in GStreamer and relies heavily on WPEWebKit and its FDO backend. Don’t hesitate to contact us if you have specific requirements or issues with these projects :)