2015 Shortcut: When I wrote this article Audacity didn’t have an automatic center-panned vocal-cancelling effect, but now it does. So rather than doing the split-stereo / invert-one-track / play-both-as-mono trick (and that’s pretty much all there is to it), you should be able to find the Vocal Remover option in the Effect menu. That said, it’s more fun and interesting, and can give better results, if you do it yourself! =D
I found this trick the other day whilst stumbling the Interwebs and thought I’d do a quick write-up with pictures to make it as easy as possible. For this exercise we’re going to be using a piece of free audio software called Audacity, which you can get for Linux, Windows and Mac.
Update: If you’re trying this out on a Mac, please make sure you get Audacity 1.3 Beta or newer – the stable 1.2 version appears to be missing the equaliser’s decibel-range slider, which you need towards the end of the process!
The track I’m using in this example is the first 50 seconds of Ben Folds – Zak and Sara, where the voice kicks in at the 11 second mark, and the original sounds like this:
Once you’ve got a copy of Audacity for your platform of choice, fire it up and follow these simple steps to get rid of the vocals from most songs:
1.) Import Some Audio
From the menu in Audacity, choose File | Import | Audio and then select an mp3 (or any audio format Audacity understands) to work with.
2.) Duplicate The Tracks
We’re going to come back later and use the bass from a copy of the track to give the result a nice, full sound – so for now just duplicate your imported audio by going to Edit | Duplicate:
Once you’ve duplicated the tracks, we’ll mute our copy for now by clicking on the Mute button to the left of the waveform as shown:
3.) Separate Our Original Tracks, Convert To Mono and Invert One Of Them
This is the key part of the process. Lead vocals are commonly recorded in mono and then mixed equally into both stereo channels, so the vocal is identical on the left and the right. If we split the two channels, invert one of them and play both back as mono, anything that is identical in both channels cancels itself out! And since the lead vocal is usually the most prominent thing that’s identical in both channels (i.e. mono mixed to stereo), it’s mostly the vocals that magically disappear from the sound! Ha!
So, to start off we need to click on the little down-arrow to the left of our original waveform and select Split Stereo Track:
Once the waveform’s been split (so we can mess with both channels individually) double click in the lower of the two waveforms (the right channel) to select it all, and then from the menu choose Effect | Invert as shown:
Now for the last really important step – simply set both left and right channels to output as mono by clicking on the little down-arrow to the left of each waveform and selecting Mono. Don’t forget to set both of them to Mono or the magic won’t happen!
With that done, give it a play and see what happens! With any luck, there won’t be any vocals in the track – so with my example, it now sounds like this:
You’ll notice at the end that some vocals come back (the backing singing etc.) – why? Because they weren’t recorded as a mono source, and hence don’t get cancelled out by the inversion we did earlier. So this technique won’t work for all songs, only ones where the voice is recorded in mono and then mixed into stereo, which to be fair, I think is a pretty large swathe of ’em. It’d be perfect for karaoke or something like that anyway, because you’d want the backing vocals there!
If you want to know more about how this waveform cancellation works, you can always look up Superposition of Waves, but I’ll leave that as an exercise for the curious =D
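For anyone who’d rather see the cancellation as numbers, here’s a minimal Python sketch of what step 3 does. It isn’t what Audacity runs internally, just the same arithmetic on a made-up mix: the tone frequencies, the 8000Hz sample rate and all the names (tone, peak, lead_vocal and so on) are assumptions for illustration only.

```python
import math

RATE = 8000  # samples per second (arbitrary, just for this illustration)
N = 800      # 0.1 seconds of audio

def tone(freq_hz, amplitude=1.0):
    """Return N samples of a sine wave at freq_hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / RATE) for n in range(N)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

# Hypothetical mix: lead vocal and bass are dead centre (identical in both
# channels), the guitar is only on the left, the backing vocal only on the right.
lead_vocal = tone(440)
bass = tone(80)
guitar = tone(1000, 0.5)
backing_vocal = tone(660, 0.5)

left = [v + b + g for v, b, g in zip(lead_vocal, bass, guitar)]
right = [v + b + k for v, b, k in zip(lead_vocal, bass, backing_vocal)]

# Step 3: invert the right channel, then play both as mono (i.e. add them up).
inverted_right = [-s for s in right]
mono = [l + r for l, r in zip(left, inverted_right)]

# Everything centred has cancelled; only guitar minus backing vocal is left.
expected = [g - k for g, k in zip(guitar, backing_vocal)]
residual = peak([m - e for m, e in zip(mono, expected)])
print("difference from guitar-minus-backing-vocal:", residual)
```

The printed difference is effectively zero (floating-point rounding aside), so the lead vocal really has gone, while the side-panned backing vocal survives. Notice that the centred bass cancels too, which is why step 4 adds it back.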
4.) Filter Our Original To Add Back The Bass
Update: BigFuz points out in the comments below that there’s an easier way than equalisation to filter our copy so that it only keeps the bass: use a Low Pass Filter and just enter a value of 200Hz or 250Hz (whichever works best for you). You won’t be able to add back both bass and treble with a single pass using this method, but you may not want or need to! To apply a low pass filter to the copy, you can just select Effect | Effects 1 to 9 | Low Pass Filter from the menu – too easy! Relatedly (and yeah, it’s a bit obvious, but I use this to keep track myself), a quick way to remember which way around low-pass/high-pass goes is that a low pass filter allows everything below the given frequency to pass through, so a high pass filter must allow everything above the given frequency to pass through.
The voice-cancelled audio from step 3 sounds pretty good, and the vocals are definitely gone, but in the process we’ve stripped out a lot of the lower frequency sounds (i.e. the bass), because bass is usually mixed dead-centre too and so cancels out along with the vocals. So remember when we duplicated our waveform and muted it right at the beginning? This is where it fits in…
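To make the low-pass idea concrete, here’s a rough Python sketch of about the simplest low-pass filter there is (a "one-pole" filter). This is an assumption-laden toy, not Audacity’s Low Pass Filter effect: it rolls off far more gently than a real plug-in would, and the function names and test tones are made up for the example.

```python
import math

RATE = 44100  # CD-quality sample rate

def tone(freq_hz, seconds=0.5):
    """Return a full-volume sine wave at freq_hz."""
    n = int(RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / RATE) for i in range(n)]

def low_pass(samples, cutoff_hz):
    """One-pole low-pass: each output only moves part of the way towards the input."""
    alpha = 1 - math.exp(-2 * math.pi * cutoff_hz / RATE)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # slow changes get through, fast wiggles are smoothed away
        out.append(y)
    return out

def settled_peak(samples):
    """Peak level of the second half, ignoring the filter's start-up wobble."""
    return max(abs(s) for s in samples[len(samples) // 2:])

bass_peak = settled_peak(low_pass(tone(50), 200))      # below the cutoff: stays close to 1
treble_peak = settled_peak(low_pass(tone(5000), 200))  # above the cutoff: well under 0.1
print(bass_peak, treble_peak)
```

The 50Hz tone passes through almost untouched while the 5000Hz tone is squashed to a small fraction of its original level, which is exactly the "low frequencies pass" behaviour described above.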
Un-mute our duplicated (and still stereo) audio copy by clicking on the Mute button to the left of the waveform, double click on the waveform to select it all, and then from the menu choose Effect | Equalization as shown:
When the equalisation window pops up, we’re going to filter it so that all sounds above 200Hz are stripped out. To do this, just click somewhere on the main part of the window and a white dot will appear, click again and another will – then click on them to drag them around until you get a shape that looks kinda like this:
Notice that I’ve dragged the bottom-left slider all the way down to get access to the full 120dB and not just the 30dB on the scale by default.
You might have to have a bit of a play to get it right, but all we’re really doing is saying “Leave anything with a frequency of 200Hz or less alone, but drop the volume of anything over that frequency by around 120dB” (i.e. remove it entirely!).
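To put numbers on why the full 120dB range matters, here’s a small Python sketch that converts a decibel cut into a plain amplitude multiplier using the standard gain = 10^(dB/20) relationship. The eq_gain function is a hypothetical, idealised stand-in for the curve drawn in the equaliser (a hard step at 200Hz rather than the slope you’ll actually get by dragging dots around).

```python
def db_to_gain(db):
    """Convert a level change in decibels to an amplitude multiplier."""
    return 10 ** (db / 20)

def eq_gain(freq_hz, cutoff_hz=200, cut_db=-120):
    """Idealised EQ curve: untouched at or below the cutoff, cut by cut_db above it."""
    return 1.0 if freq_hz <= cutoff_hz else db_to_gain(cut_db)

gain_30 = db_to_gain(-30)    # about 0.032, so roughly 3% of the amplitude survives
gain_120 = db_to_gain(-120)  # about 0.000001, i.e. one millionth

print(f"-30dB  -> x{gain_30:.4f}")
print(f"-120dB -> x{gain_120:.7f}")
print("100Hz gain:", eq_gain(100), " 1000Hz gain:", eq_gain(1000))
```

A 30dB cut still leaves about 3% of the original amplitude, which can be quietly audible, whereas a 120dB cut leaves a millionth of it, which is as good as silence.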
If you mute our top two mono tracks and play it back, you should get the filtered version of the stereo track with only the bass remaining, which for my example sounds like this:
5.) Un-mute Our Original Voice Cancelled Tracks
With the vocal-free (but a bit tinny) audio playing at the same time as our bass-only version, we get a pretty neat sound with good bass and no vocals! Result! =D
You can then just go to File | Export to save the finished vocal-free version to an mp3 or such, if you wanted to keep it.
I’ve read that some people like to cut out the section between 200Hz and 1000Hz or so (1kHz, although I’ve also seen people push it up to 6kHz) to keep the low-end and high-end sounds, but when I was playing with this I kept getting some voice creeping back into the mix. This could well have been because I was only dropping 30dB when I was messing around with it though – so go nuts and experiment if ya wanna!
The shape I used for that EQ setting was:
With that all said and done, I hope you found this guide useful – I didn’t come up with the technique or anything like that, I just saw a 10 line how-to and had to mess around for half an hour to get it to work, so thought I could knock up a quick guide that shows how it’s done really clearly, and I hope you have fun with the technique!