Interfacing SilentJack and Nagios

So, silence detection is a big deal when it comes to monitoring broadcast audio systems. You want to be sure your stuff is making noise. If your sustainer’s not putting anything out, it’s not a lot of good.

SilentJack is an awesome little utility from the king of ‘oh, that’s a handy little program for broadcast’, Nicholas Humfrey. This guy’s getting a beer if I ever meet him. But it’s not a simple drop-in tool for monitoring, sadly – we need to do a bit of work to make it so.

Monitoring radio, and the joys of realtime feedback

Monitoring and logging of broadcast systems is often overlooked in smaller setups like university radio stations. Which is a shame. If we’d had monitoring in place last year at Insanity, we’d have known exactly how much dead air went out. Now, we’ve got a much better setup. It’s cheap(ish), a little bit homebrew, but there’s nothing fundamentally wrong about that. It works, and I know we’ve had 53m 34s of dead air in the last year – and it’s all accounted for. We also know when we were running on backup audio – showing that we’d have been on dead air for a goodly 1 day, 17 hours, if it weren’t for our silence detector and backup playout unit.

So, what’s the best way to approach all this? With a new set of equipment to monitor, and a lot of old equipment still unmonitored (our desk’s GPIO and the AM transmission gear, for instance), hardware GPIO lines are the order of the day. Mostly, this can be done with relays, pull-up resistors, optoisolators and a steady hand with a soldering iron (bolted onto an Arduino – see my last post). This is all well and good – but now you need to get that into a sensible format, record it, log it, and take action depending on the state (emails, indicators in the studio, etc). Which is where Nagios comes in.

Nagios is scarily flexible. It’s also extremely lightweight for small setups, and very easy to configure, with excellent documentation. And since it’s widely used in larger IT outfits, there’s an infinitude of addons, plugins, and so on to expand it. It’s at heart a monitoring program, with support for active checks (where Nagios polls the host or service), and passive checks (where the host/service reports in to Nagios itself). Once you’ve gotten things inside Nagios you can use a flexible set of rules for notifications, and use some of the many addons (we use check_mk’s livestatus) to get data out into other programs. We use CoffeeSaint on dashboards to provide single heads-up displays (with camera displays combined), and we’ll soon have Nagios integrated into the Insanity website’s admin backend so that staff can see a general overview of the system alongside all the other vital information like listening figures and cameras from wherever they are in the world.
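To make the passive side concrete, here is a minimal Python sketch of what a passive result looks like when it is handed to Nagios through its external command file. The `PROCESS_SERVICE_CHECK_RESULT` command and the 0/1/2/3 return codes are standard Nagios; the host name, service name and command file path are assumptions for illustration and will differ on a real install.

```python
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def passive_result_command(host, service, code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, code, output
    )


def submit(command_line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command to the Nagios command pipe.

    The default path is an assumption; the real one is whatever
    `command_file` is set to in nagios.cfg.
    """
    with open(command_file, "w") as pipe:
        pipe.write(command_line)


if __name__ == "__main__":
    # Hypothetical host and service names, for illustration only.
    print(
        passive_result_command(
            "silence-detector", "Main Audio", CRITICAL, "Main input silent"
        ),
        end="",
    )
```

With an active check, Nagios would run a plugin itself and read the same return code from its exit status; with a passive check, something else decides the state and reports it in this form.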

At the moment, because we’re poor, we don’t have ethernet shields for our Arduinos, so we have an Arduino Mega and an Arduino sharing two jobs: driving a matrix LED display, and monitoring our silence detector. The latter is done by the Mega, which will later be expanded with ethernet and various boards to break out the DB25/DB15/DB9 connectors from the mixing desk into Arduino-compatible levels. For now, though, it’s just hooked up on USB. The Mega also drives a set of Bliptronics LED RGB modules; these are currently spread along the top of the mixing desk in the studio, where presenters can see them.

The Mega is set up to simply watch the two inputs of the silence detector and their alarm states, and to write one of four messages describing that state to serial over USB. There it’s picked up by a small Ruby script, which then sends an NSCA (Nagios Service Check Acceptor) packet to the NSCA daemon on the monitoring box. This packet contains the information about the silence detector in a format that maps to the host and services described in Nagios. This means we get very rapid updates on the silence detector compared to active monitoring. Even more rapid is the LED strip, which is controlled wholly in C. If the station’s output gets switched to the backup source because the silence detector thinks the studio is too quiet, then the strip goes red (normally green). A presenter in the studio now knows that there’s a problem and that they’re no longer going out on air, and they can either work out the problem by themselves or call for support. Previously the staff wouldn’t have known that this had even happened without going out and listening to the radio.
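The serial-to-NSCA step can be sketched roughly as follows. This is not our actual script (that one is Ruby), just a Python illustration of the idea. The four message strings, the host name, the service names, the monitoring host and the config path are all made up for the example. The tab-separated line format and the `-H` and `-c` options are those of the stock `send_nsca` client.

```python
import subprocess

HOST = "silence-detector"                  # hypothetical Nagios host name
SERVICES = ("Main Audio", "Backup Audio")  # hypothetical service names

# Hypothetical serial messages mapped to (main, backup) Nagios return codes.
# 0 = OK, 2 = CRITICAL.
MESSAGES = {
    "BOTH_OK": (0, 0),
    "MAIN_SILENT": (2, 0),
    "BACKUP_SILENT": (0, 2),
    "BOTH_SILENT": (2, 2),
}


def nsca_lines(message, host=HOST):
    """Turn one serial message into send_nsca input lines, one per service.

    Each line is: host <tab> service <tab> return code <tab> plugin output.
    Anything unrecognised is reported as UNKNOWN (3) for both services.
    """
    message = message.strip()
    codes = MESSAGES.get(message, (3, 3))
    return [
        "{0}\t{1}\t{2}\t{3}\n".format(host, service, code, message)
        for service, code in zip(SERVICES, codes)
    ]


def send(lines, nsca_host="monitor.example.org", config="/etc/send_nsca.cfg"):
    """Pipe the lines to the send_nsca client (host and config are assumptions)."""
    subprocess.run(
        ["send_nsca", "-H", nsca_host, "-c", config],
        input="".join(lines).encode(),
        check=True,
    )


if __name__ == "__main__":
    # In practice the message would be a line read from the USB serial port.
    for line in nsca_lines("MAIN_SILENT\r\n"):
        print(line, end="")
```

Sending a result only when a message arrives from the Mega is what keeps the updates quick: Nagios never has to wait for a polling interval to come round.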

It’s one thing to have standards, and another thing entirely to monitor your performance and hold yourself to them. Once you’re measuring, a problem shows up in your metrics and you can fix it. We’ve dramatically improved the actual quality of broadcast radio that Insanity puts out in the last year. Partly it’s a human change: we’ve had some great presenters join us this year and a very determined board. But there’s also an element of technical change, with a lot of instant feedback for presenters, and this has really helped people spot problems that previously they wouldn’t even have known were there.