I’m sure this post’s been done a million times over by now, but enough people at work have asked me how my video/audio setup works that at this point it’s quicker to have a blog post to point at!
I’m pretty sorted for audio and video for videoconferencing. Given my days are usually back-to-back video calls, that’s fairly important to me – I want people to be able to see me clearly and hear me clearly (and ideally vice versa, but that’s not always the case). Non-verbal communication is important for effective communication, especially with difficult topics and emotional issues.
In the past I’ve been almost entirely audio-only – playing games with friends, that’s all you need, and that’s most of what I did, chatting on Mumble etc. The de rigueur kit for this is a decent headset or a separate mic/headphone pairing – headphones for spatial awareness (thanks to HRTFs), but also to avoid feedback.
Audio, as it was (Feb 2020)
My setup at the time was a 2nd generation Focusrite Scarlett 2i2 – a well-loved and widely used 2 in, 2 out (stereo pair + headphone) USB audio interface. Into this I had plugged a custom breakout/switch box which went to a Beyerdynamic DT109. The DT109’s a weird headset – it’s the microphone-equipped version of the classic DT100 you’ll see in old (and some new) BBC music videos etc, primarily designed for camera operators and the like.
This had a few issues – the 2i2 occasionally fell over and just spat crackly audio at the PC, and the DT109 microphone was designed for clarity, not quality. Which is to say I was entirely intelligible but didn’t quite sound myself. However, it basically worked. I used the headphone jack for my headphones, and the main outputs fed my monitor speakers (a pair of Genelec 8320s and a 7350 subwoofer).
I did have an old webcam – a Microsoft thing which purported to do HD – but it wasn’t much to write home about. Good enough for occasional family Skype, but nothing special.
Limitations of space and budget
At no point did I want to spend a fortune on all this, but I should caveat the below by saying I’m doing all this in a pretty tiny cottage in which my “office” is also my partner’s office, a bedroom, and part-time workshop.
I’m not so constrained by budget – and I am very much of the opinion that if you go “cheap” you end up buying the expensive one down the line and paying 50% more (at least) in the end. However, I do strongly recommend the approach espoused by Adam Savage on tools – if you’ve not got one and have no experience, buy the cheapest possible thing and then decide if it’s worth it to you to have a good one. You’ll get a better understanding of what makes a good one, and be able to make better decisions. Plan accordingly!
Enter COVID-19 (March 2020)
My usual working pattern was to be in the office/lab every day. I’m very close to the office, so this isn’t so much of a drag, and a lot of what I do is to do with things, physical bits of plastic and metal and glass, so being able to get my hands on stuff is pretty key.
When the pandemic hit I realised I’d be videoconferencing and spending a lot of time at home, so I figured I had to fix a few things:
- Ergonomics – so I wouldn’t end up with chronic RSI/carpal tunnel or back issues
- Video – so people could see me and I could have proper conversations
- Audio – so I could be heard clearly and well, and without background noise
Ergonomics were fairly easy – I’ve had a Herman Miller Aeron for a while which works very well for me as a chair. I swapped my cheap-as-chips desk from an office surplus store for a Fully Jarvis electronic standing desk, and I try to stand up for half a day at least every other day. I got a “moving mat” to stand on, which is crucial both for comfort and to keep you moving around a bit as you stand.
Standing desks (of the moving type) also force you to think carefully about cable management and monitor positions, and I adjusted my monitor mounting setup a few times just to get everything movable. I have four monitors – one of which is also a drawing tablet (which is a godsend for engineering discussions). All but one are mounted on simple and cheap VESA arms of the solid-pole-and-clamp variety.
I started out by buying a cheap chroma key background (greenscreen) and stands. I figured I had a lot of crap in the background of my shot – rather unavoidable mess (though my partner will disagree) that I’d rather avoid. At the time, AI-based “cutouts” were pretty basic and I knew how to make chroma key work.
I also added three lights, which I still use today. I don’t get much natural light in this room, with a tiny window.
They’re cheap “Neewer” brand dual colour temperature LED panels. I mounted them on cheap ceiling spigots and in one case a clamp – they have the “Manfrotto” standard spigots for which there are a lot of cheap and cheerful mounting options and tools available. I also put them on cheap TP-Link smart plugs so I can switch them off and on with my phone. These are a lot cheaper than things like the Elgato Key Lights, and though the light quality isn’t as good as “proper” cine lights like things from Aputure they’re good enough (they measure as having a CRI of about 93 on my meter).
I set up the lights in a key/fill/backlight setup and I’ve not adjusted them since – I have them set to produce a rough 5000K output (with 3000K on the back light, for some contrast) at about half their maximum brightness, which is plenty.
I stuck to my Microsoft HD camera for the time, and it worked OK.
I decided to make quite minor adjustments to my audio setup. I wanted a better microphone, and a switch in my DT109 breakout box died, so I dug out some old DT250s and bought a Rode NT1-A condenser microphone kit. This worked pretty well but picked up a lot of background noise, being a nice sensitive large-diaphragm condenser.
I still had a bunch of issues with the Focusrite, but at the time didn’t want to spend a bunch on the box I wanted to replace it with, so stuck with it for the time being.
To glue it all together I used the fabulous Open Broadcaster Software, OBS. This is open source and powers probably 95% of streamers and commercial internet broadcasts, esports, and so on. It’s incredibly powerful but also pretty simple.
My OBS setup was simple:
- Camera video came in and used the in-built “Chroma Key” filter to produce video that was just me, cropped to just capture my face/head
- I put a background image behind me (or fed in a “VLC Source” from an IP camera in the garden for a “live” background)
- Audio went through a basic noise reduction filter and a compressor
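As a rough illustration of what the chroma key and background steps above boil down to, here is a minimal sketch in plain Python. It is not how OBS implements its filter: real keyers build a soft-edged matte and suppress green spill, whereas this makes a hard yes/no decision per pixel. The function name, the thresholds and the tiny test image are all made up for the example.

```python
def chroma_key(frame, background, min_green=100, dominance=1.3):
    """Composite `frame` over `background`, replacing green-screen pixels.

    Both images are lists of rows of (r, g, b) tuples with values 0-255
    and must be the same size. The thresholds are illustrative guesses,
    not OBS defaults.
    """
    out = []
    for fg_row, bg_row in zip(frame, background):
        row = []
        for (r, g, b), bg_pixel in zip(fg_row, bg_row):
            # Call a pixel "screen" when green is bright and clearly
            # stronger than both red and blue.
            is_screen = g >= min_green and g > dominance * max(r, b)
            row.append(bg_pixel if is_screen else (r, g, b))
        out.append(row)
    return out


# Hypothetical 2x2 frame: two green-screen pixels, two "skin" pixels.
GREEN = (20, 220, 30)
SKIN = (200, 160, 140)
SKY = (90, 140, 230)

frame = [[GREEN, SKIN], [SKIN, GREEN]]
background = [[SKY, SKY], [SKY, SKY]]
result = chroma_key(frame, background)
```

The green pixels are swapped for the background and everything else passes through untouched, which is why even lighting on the screen matters so much: shadows and wrinkles push pixels across the threshold.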
To actually get this as a usable feed in all my videoconferencing tools I used Virtual Audio Cable (VAC) on Windows to take the “monitor” output from OBS and present it as a new system audio input, and used the OBS Virtualcam plugin to present the output of OBS as a virtual webcam input. OBS has since added a native virtual camera, so this isn’t needed any more.
Basically, I boot up OBS and get a nice confidence monitor for video and audio levels. Then anything like Teams gets pointed at the virtual inputs and it’s none the wiser that I’m fiddling around with the inputs.
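For anyone wondering what an audio level meter is actually showing, the sketch below computes peak and RMS levels in dBFS for a block of samples. It assumes floating-point samples where 1.0 is full scale, and the function names are invented for the example; it is not taken from OBS.

```python
import math


def peak_dbfs(samples):
    """Peak level relative to a full-scale value of 1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak)


def rms_dbfs(samples):
    """RMS level relative to a full-scale value of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


# One second of a 1 kHz sine at half of full scale, sampled at 48 kHz.
sample_rate = 48_000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate)
        for n in range(sample_rate)]

peak = peak_dbfs(tone)  # about -6.0 dBFS
rms = rms_dbfs(tone)    # about -9.0 dBFS
```

The gap between the two numbers is why meters usually show both: peak tells you how close you are to clipping, RMS is closer to how loud it sounds.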
And this worked really well! I had lots of control over what people saw, and though the cropped HD feed was a bit rubbish in terms of resolution the lighting worked well.
Skip to the endgame – March 2021
Over the course of a year, I replaced almost every element I’ve described above. Some of this was for reasons other than videoconferencing – I’ve been getting into making “music” with some synthesisers, which drove a lot of the audio side.
The key upgrades that actually made a difference were the microphone, the audio interface/mixer, and a new webcam.
Software improvements over the last year have also made a big difference to things.
I spent maybe a month with a borrowed camera from work – a Nikon Z6 with a Zeiss 50mm/f1.2 Milvus F-mount lens. I hooked this up with an HDMI-to-USB box, and used OBS to delay the audio to match (as the video arrived about 120ms late). This produced fabulous video and I could use an Atomos Ninja HDMI monitor as a screen to look at right next to the camera. It also let me nicely defocus the background and lose the greenscreen – blurry mess with pretty bokeh from some fairy lights isn’t quite so objectionable!
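To make the sync fix concrete: if the video arrives late, the audio has to be held back by the same amount. The sketch below converts a delay in milliseconds to samples and applies it to a block of audio. The 48 kHz sample rate is an assumption for the example, and the function names are hypothetical; in practice a sync offset setting in the software does this for you.

```python
def delay_in_samples(delay_ms, sample_rate_hz):
    """Number of whole samples that corresponds to a delay in milliseconds."""
    return round(delay_ms * sample_rate_hz / 1000)


def delay_audio(samples, delay_samples):
    """Delay a block by prepending silence, keeping the block length fixed."""
    if delay_samples <= 0:
        return list(samples)
    padded = [0.0] * delay_samples + list(samples)
    return padded[:len(samples)]


# 120 ms of video latency at an assumed 48 kHz audio sample rate.
offset = delay_in_samples(120, 48_000)
delayed = delay_audio([1.0, 2.0, 3.0, 4.0], 2)
```

Here `offset` comes out at 5760 samples. A real implementation would carry the samples pushed off the end of one block into the next rather than discarding them.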
However, it couldn’t move with my desk, and was a huge bulky thing to deal with. So rather than go buy something similar and make that setup permanent, I decided to just upgrade my webcam and go down the AI route as the cheaper option.
My camera is a Logitech Brio 4K. For a while these were unobtanium, being the best reasonably-priced webcam out there – but eventually I nabbed one and it was a huge upgrade from the Microsoft one. Colour science is still a bit odd but I could fix the white balance setting to match my lighting conditions which helped consistency a lot. The extra resolution let me crop and still get an HD stream for OBS.
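The crop works because a 4K frame has twice the width and height of a 1080p one, so you can cut a full HD window out of it without upscaling. A small sketch of the arithmetic, assuming a 3840×2160 source frame (the function and its parameters are made up for illustration):

```python
def crop_window(src_w, src_h, out_w, out_h, center_x=0.5, center_y=0.5):
    """Return (left, top, width, height) of a crop inside the source frame.

    center_x / center_y give the desired crop centre as a fraction of the
    source size; the window is clamped so it never leaves the frame.
    """
    if out_w > src_w or out_h > src_h:
        raise ValueError("crop is larger than the source frame")
    left = round(center_x * src_w - out_w / 2)
    top = round(center_y * src_h - out_h / 2)
    left = min(max(left, 0), src_w - out_w)
    top = min(max(top, 0), src_h - out_h)
    return left, top, out_w, out_h


# Centred 1080p window in an assumed 3840x2160 frame.
centred = crop_window(3840, 2160, 1920, 1080)
# Same window pushed towards the top of the frame, clamped at the edge.
high = crop_window(3840, 2160, 1920, 1080, center_y=0.1)
```

With those numbers `centred` is `(960, 540, 1920, 1080)`: a quarter of the sensor area, but still a native-resolution HD picture.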
I added XSplit into the mix, which is a software tool that separates you from the background. I just use this to apply a modest blur in the background. It’s nowhere near as nice or consistent as a good lens with shallow depth of field, but it works pretty well for what it is and doesn’t need a huge lens, so my camera can still perch on top of my monitor and move with the desk.
Besides that, I’ve not done much. I tinkered with LUTs a bit to do colour correction but found that if I get the white balance locked off I don’t really need it.
Audio I did a lot on.
I’m an audio nerd at heart, and as I was getting into synths and wanting more in the way of flexibility, I opted to go fairly high end on this stuff.
In the end I replaced the NT1-A with a Shure SM7B on a desk-mounted arm – the classic “streamer/vlogger” mic for good reason, since it’s a dynamic mic with a good pickup pattern and nice and robust. It doesn’t pick up much background noise.
This needs a really good preamp with lots of gain to work, though, since it’s dynamic. The traditional approach is to use a phantom powered box like a Cloudlifter to add in another 20dB of gain, but this adds noise and another bit of kit, so I opted to just get some better preamps, since I also wanted to add some more audio inputs to my PC.
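For a sense of scale on those gain figures, decibels of gain convert to a voltage multiplier as 10^(dB/20). The sketch below does the conversion both ways; the 40 dB preamp figure in the example is a hypothetical number chosen to show how gains in a chain add up, not a measurement of any of the kit above.

```python
import math


def db_to_voltage_ratio(db):
    """Voltage multiplier for a gain given in decibels."""
    return 10 ** (db / 20)


def voltage_ratio_to_db(ratio):
    """Gain in decibels for a given voltage multiplier."""
    return 20 * math.log10(ratio)


booster_db = 20   # the extra gain from an inline booster, per the text
preamp_db = 40    # hypothetical preamp setting, for illustration only

booster_ratio = db_to_voltage_ratio(booster_db)            # 10x
total_ratio = db_to_voltage_ratio(booster_db + preamp_db)  # 1000x
```

Gains in dB add while the ratios multiply, so 20 dB of extra gain is a tenfold boost in signal voltage before the preamp does any work.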
I did this by adding a Solid State Logic SiX mixer, which is not something I’d generally recommend to anyone not also doing music stuff. It’s a stupidly versatile desk with incredibly good preamps, tons of routing options, great built-in compressors, lots of monitor control options, and also costs several arms and legs. Much cheaper options are available, like SSL’s new USB interfaces. This did a great job of replacing some of the functions of the Focusrite and providing some superb preamps. It’ll last me for decades to come and also accommodates all my current synths as inputs with room to spare.
I’d originally run this into the Focusrite but finally tired of all the crashes and glitches and replaced that with an RME Fireface. Again, high end, but very solid and reliable which is what I wanted. I don’t run it in its highest-end mode – I use 96kHz/24-bit which lets all the anti-aliasing filters have plenty of space without delving into madness like 192kHz audio. The preamps are actually good enough to directly use the SM7B, so I could skip the SiX for that, but for now I’ll stick to the SiX preamps.
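The "plenty of space" point can be put into numbers. An anti-aliasing filter has to roll off between the top of the audible band and the Nyquist frequency (half the sample rate). The sketch below works out that gap and the raw data rate for a few sample rates, assuming 20 kHz as the top of the audible band; the function names are invented for the example.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def transition_band_hz(sample_rate_hz, audio_top_hz=20_000):
    """Room the anti-aliasing filter has between the audio band and Nyquist."""
    return nyquist_hz(sample_rate_hz) - audio_top_hz


def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


# Filter headroom at common sample rates (20 kHz audio band assumed).
room = {rate: transition_band_hz(rate) for rate in (44_100, 96_000, 192_000)}

# Data rate for a stereo pair at 96 kHz / 24-bit.
stereo_96k = pcm_bit_rate(96_000, 24, 2)
```

At 44.1 kHz the filter has about 2 kHz to work in; at 96 kHz it has 28 kHz, which allows a much gentler filter. Going to 192 kHz doubles the data rate again for headroom that is already ample.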
The headphones I’ll probably upgrade – the 250s are pretty good all-rounders but aren’t great for all-day wear.
On the software side I haven’t changed a lot – I still use OBS and VAC to make everything appear as virtual inputs. However, I now use NVIDIA’s new noise reduction filtering in OBS, along with a Solid State Logic native VST plugin to de-ess and de-pop. This removes the worst sibilance from my speech and worked out a lot cheaper than buying a hardware unit to do the same.
XSplit keeps improving, so that’s worked out pretty well.
What I’d recommend
There’s lots of good advice on the budget/low-end of things, so I’ll stick to what I’d recommend if you wanted a “really good” setup that was going to give you a superb output.
- Video: Logitech Brio or Canon’s EOS-M videoconferencing kit, and whatever lighting makes sense for your environment
- Audio: SSL 2/2+ with Shure SM7B, or a Shure/Rode lapel mic kit
- Software: OBS, NVIDIA’s noise-removal plugin, and the built-in plugins or VSTs of your choice for any audio compression etc
The Canon EOS-M based VC kits are pretty solid as a step up from a webcam and are actually light enough to mount on a monitor if you wanted to. They’ll give you much better video than a webcam could ever do. However, the Brio is pretty solid.
Lighting is the biggest thing to upgrade for most people. I’d avoid ringlights – get LED panels, or COB lights that can accept diffusers. There are tons out there which will work great, and even the cheap ones are a whole lot better than having nothing.
For audio, get a good interface like the SSL 2/2+ or RME Fireface/Babyface that can handle the SM7B. Alternatively, Shure now have an SM7B-alike which has USB built in, which makes matters easier. My partner uses a Rode USB mic which works very well, too. These USB mics will have built-in headphone amplifiers that’ll suit most headphones and in-ear options, and let you put a bit of sidetone in so you can hear yourself without latency.
Software – OBS is unbeatable, I think. It’s certainly so widely used that it’s the best supported thing out there.
For multiple cameras there are things like vMix, but honestly I’ve done mad things with OBS. I can use NDI over our VPN to pull in video from our lab’s camera system and switch my own video out for it – I’ve used this to remotely pan/tilt a camera over to a demonstration unit on the wall, switch my face for the lab camera, and talk engineers through problems they’re having, for instance. You can do an awful lot with free software (though do donate to the OBS team).