Next steps: Video streaming and production

I’ve done a lot of blogging on radio, and on Rivendell in particular. I’m a huge proponent of open tools and technologies wherever possible, because they provide tons of flexibility, are cheap, and in many cases are just as powerful or easy to use as the commercial stuff. Radio and audio are complex, but why stop there? At Insanity we’ve been evaluating video streaming as a way of adding to our existing broadcasts and coverage, as video can be far more engaging to consumers than audio, particularly in the YouTube era. But at Insanity we have one major problem: we don’t have any money!

So, you might figure that’s a problem. You’d be partly right. Video involves a lot more data, which means loads more number-crunching, more bandwidth, and so on. Not to mention that the equipment needed to capture it is immensely more expensive than equivalent-quality audio gear. I’d like, though, to highlight a few nice things about open source video and production.
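To put “a lot more data” in perspective, here’s a back-of-the-envelope sketch (my illustrative numbers, not from any spec sheet) comparing uncompressed PAL SD video against CD-quality stereo audio:

```python
def raw_bitrate_mbps(width, height, fps, bytes_per_pixel):
    """Uncompressed active-picture bitrate in megabits per second."""
    return width * height * fps * bytes_per_pixel * 8 / 1e6

# PAL SD frame, 4:2:2 sampling at 8 bits = 2 bytes per pixel
video_mbps = raw_bitrate_mbps(720, 576, 25, 2)
# 44.1 kHz stereo 16-bit PCM
audio_mbps = 44100 * 2 * 2 * 8 / 1e6

print(f"video: {video_mbps:.0f} Mbit/s")   # ~166 Mbit/s
print(f"audio: {audio_mbps:.1f} Mbit/s")   # ~1.4 Mbit/s
print(f"ratio: {video_mbps / audio_mbps:.0f}x")
```

Over a hundred times the data rate before you even start compositing or encoding — which is why the hardware budget and CPU requirements below look the way they do.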

For any broadcast worthy of the description, we want to be able to do this:

  • Work with multiple camera feeds, both close to and far from the computer
  • Fade between cameras
  • Overlay static graphics
  • Overlay dynamic text
  • Add decent-quality audio
  • Stream to as wide an audience as possible

So let’s start with cameras. Cameras are expensive. Buying handicam-type consumer-grade things won’t cut the mustard as far as I’m aware, and HDMI is the only output those cameras present; I don’t know enough to speak on HDMI capture (which things like the Blackmagic Studio can purportedly do under Linux). Firewire/IEEE 1394 remains a popular option for direct capture on Linux. In broadcast, SDI and HD-SDI are now the main formats for video/audio transport, and affordable cameras (I say affordable: low-end professional cameras, ~£3,500 on eBay) support it as standard; again, Blackmagic cards and other capture devices are available with Linux support. SD/HD-SDI is to my mind the ideal format to work with because, as a transport protocol, it is capable of long cable runs, meaning your cameras can roam around or be placed wherever makes most sense from a camerawork perspective. You pay extra, but compared to any other capture format it makes loads of sense. PC capture cards are not particularly expensive: a few hundred pounds gets you a card with SDI I/O, with balanced audio I/O and analogue video I/O thrown in, too. Stick that in a PCI-E capable box and you’re good!

This brings me to topic two: CPU power. Specify the fastest box you can. Right now I’m testing an encoding-and-compositing-on-one-box setup, and with no cameras connected I’m at 80% CPU usage on a 2.8 GHz Core 2 Duo. This stuff needs brains. Intel i7 processors are the way to go right now, or mid- to high-end Xeons. RAM, too: at least 4 gigabytes, and more wouldn’t hurt at all, especially if you use videos in your composites. Aim to keep all your data in RAM, or use an SSD. Multiple computers with gigabit networking will let you split loads across machines, as I’ll talk about in a bit.

So far we’ve spent a lot on cameras and computers. Fortunately, that’s all you need (audio equipment aside). Once we get everything in as a source in Linux, we can do the fun stuff: compositing our graphics, videos and text on top of the cameras, switching between them, and then broadcasting.

For composition we have a lot of choice. We can use freej, which I am very interested in but have so far been unable to get working properly on Ubuntu. Or we can use WebcamStudio, a fairly flexible package that lets you composite lots of source types, arranging them into layouts. So we might have an intro layout with placeholder graphics describing the stream, for while we’re setting up the cameras; then a couple of layouts carrying our camera feeds. For a test setup I wrote a small Python script to fetch the last few tweets from our on-campus media, then told WebcamStudio to run that command and put its output on screen as a rolling text item. Now we’ve got our video cameras with contributions from news teams overlaid. We’re starting to look professional!
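The script itself is nothing special — WebcamStudio just runs a command and scrolls whatever appears on stdout. A minimal sketch along these lines would do the job, assuming the tweets are available as an RSS feed; the feed URL and the `format_ticker` helper here are my placeholders, not the actual script:

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Placeholder -- point this at whatever feed your campus media publishes.
FEED_URL = "http://example.com/campus-media/feed.rss"

def format_ticker(xml_text, limit=3):
    """Join the newest `limit` RSS item titles into one scrolling line."""
    root = ET.fromstring(xml_text)
    titles = [item.findtext("title", "").strip() for item in root.iter("item")]
    return "  |  ".join(t for t in titles[:limit] if t)

def main():
    # WebcamStudio captures this stdout as the rolling text item.
    with urlopen(FEED_URL) as response:
        print(format_ticker(response.read().decode("utf-8")))
```

Point WebcamStudio’s “run command” source at `python ticker.py` (with `main()` invoked at the bottom) and refresh it on whatever interval suits your stream.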

Audio is a piece of cake: all we need to do is capture it and encode it with the video. The capture side is too complex to detail here, but suffice it to say you bring your final audio mixdown into a soundcard input, which then feeds into the next piece of software for this post: Flumotion.

Flumotion is really neat. It lets you build GStreamer chains very easily for broadcasting, and adds a lot of functionality on top of that, like test chains and, crucially, the ability to split tasks across computers fairly seamlessly. You can make your rig a network of three machines and, for instance, put video compositing on one, capture on another, and delegate combining and encoding of those streams to a third. There are loads of ways to work with this. Another approach I’ve considered would be to encode each camera’s feed with Dirac (or another high-quality intermediate codec) and serve it from a local Icecast or HTTP server, pull that compressed feed down into the compositor, then do the lossy for-broadcast encoding prior to distribution. This has the added benefit that you can write the Dirac-encoded data straight to disk, giving you a clean feed of each camera input for later editing and production use. Perfect!

So, how about streaming? Well, it turns out this is pretty simple. We use Flumotion to generate a Theora/Vorbis-encoded Ogg container, throw it at Icecast2 using the shout2 module, and then use the HTML5 video tag, with a fallback to the Cortado Java player, to embed it in pages. This works without any plugins on most modern browsers, and seamlessly falls back to Java if HTML5/Theora/Vorbis support is lacking. Perfect! Icecast2 gives us failover and relay options for expansion and redundancy for free, and doesn’t need much in the way of resources. Because we push to it, we can host it outside whatever networks we might be using, and we don’t have to worry so much about firewalls at the venue, as long as we can send out to our remote host.
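The embed page can be as simple as the fragment below — a sketch, with the stream URL, mount point and dimensions as placeholders for your own Icecast2 setup. Browsers that handle Theora/Vorbis play the `video` element; the rest fall through to the Cortado applet inside it:

```html
<!-- stream.example.com:8000/live.ogg is a placeholder mount point -->
<video src="http://stream.example.com:8000/live.ogg" controls width="640" height="360">
  <applet code="com.fluendo.player.Cortado.class"
          archive="cortado.jar" width="640" height="360">
    <param name="url" value="http://stream.example.com:8000/live.ogg"/>
  </applet>
</video>
```

The fallback costs nothing when it isn’t needed: browsers that can play the `src` directly never instantiate the applet.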

The main issue here is making everything realtime. With video, that’s expensive: I reckon a two-camera setup, including buying computers and ancillary equipment, comes to somewhere around £10,000 to £15,000 in setup costs. But that’s actually pretty good considering what it gets you. You’re now a real, bona fide online TV station capable of real-time coverage, something very few places can boast. For student media that’s a massive boon: the people running the media and the students involved can learn what amounts to the most stressful and complex aspects of TV broadcast production in a fairly safe setting, while for students consuming the media it means much better coverage, more engaging events and more fun in general.

This is a very exciting time for student media, with a lot of room for experimentation and improvement in trying to provide the absolute best in a safe, controlled environment. Of course, student media means student budgets, but TV is rapidly becoming affordable, and is certainly within the grasp of any SU in the country that wishes to go down this path.
