Anatomy of the human voice
The human voice is one of evolution’s finest efforts – let’s take a peek at how it works. The lungs force air through the larynx (or voicebox), causing the two vocal folds inside to vibrate. In effect, it’s an oscillator, and we control its pitch with our muscles. This rasping, buzzing tone resonates through the vocal tract – the space inside your neck, mouth and nose. The tract acts as a filter, EQing the larynx’s output and adding resonant peaks called formants. Each vowel sound is characterised chiefly by the relationship between two such formants, and by changing the shape of our vocal tract, we shift the formants into distinctive ‘shapes’ to create vowels. Formant frequencies vary from person to person, and with age and gender, but there are some basic vowel ‘formulas’ that make them identifiable to us.
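To make the oscillator-plus-filter idea concrete, here is a minimal Python sketch: a sawtooth wave stands in for the buzzing vocal folds, and two simple resonant filters stand in for the first two formants. Everything named here (`VOWEL_FORMANTS`, `sawtooth`, `resonator`, `sing_vowel`) is hypothetical, and the formant frequencies and bandwidths are rough illustrative values rather than measurements of any particular voice.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)

# Approximate (F1, F2) pairs in Hz. Illustrative assumptions only:
# real formants vary from person to person.
VOWEL_FORMANTS = {
    "ah": (700.0, 1100.0),
    "ee": (300.0, 2300.0),
    "oo": (300.0, 800.0),
}

def sawtooth(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Buzzy, harmonically rich tone standing in for the vocal folds."""
    return [2.0 * ((i * freq_hz / sample_rate) % 1.0) - 1.0
            for i in range(n_samples)]

def resonator(signal, centre_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonant filter that emphasises frequencies near centre_hz."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    a2 = -r * r
    gain = 1.0 - r  # rough level control, not an exact normalisation
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def sing_vowel(vowel, pitch_hz=110.0, seconds=0.5):
    """Filter the buzzy source through two formant peaks and sum them."""
    source = sawtooth(pitch_hz, int(seconds * SAMPLE_RATE))
    f1, f2 = VOWEL_FORMANTS[vowel]
    band1 = resonator(source, f1, 80.0)   # bandwidths are assumed values
    band2 = resonator(source, f2, 120.0)
    return [a + b for a, b in zip(band1, band2)]
```

Calling `sing_vowel("ee", pitch_hz=220.0)` changes the pitch of the source without moving the formants, which is the separation between oscillator and filter described above.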
Consonants complete the puzzle. By forcing air through small gaps in the vocal tract (eg, between the teeth and tongue, or the tongue and palate), we produce fricative sounds. The most commonly referenced fricative in music production is sibilance – S sounds – as these can prove overbearing in a mix. We also use the lips to produce ‘plosive’ pops – Bs and Ps – and we block the mouth with the lips or tongue so that air escapes through the nose, producing nasal sounds like M and N.
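A fricative can be roughed out in the same spirit: swap the pitched oscillator for a noise generator and filter away the low end to leave a hiss. The Python sketch below uses hypothetical helper names (`white_noise`, `high_pass`, `hiss`), and the 4kHz cutoff is an illustrative assumption rather than a measured sibilance frequency.

```python
import math
import random

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)

def white_noise(n_samples, seed=0):
    """Turbulent air stand-in: uniform random samples between -1 and 1."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def high_pass(signal, cutoff_hz, sample_rate=SAMPLE_RATE):
    """One-pole high-pass filter that attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_x = prev_y = 0.0
    for x in signal:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

def hiss(seconds=0.25, cutoff_hz=4000.0, seed=0):
    """Short burst of high-passed noise, loosely resembling an S sound."""
    return high_pass(white_noise(int(seconds * SAMPLE_RATE), seed), cutoff_hz)
```

The filter here is deliberately gentle. A steeper filter or a band-pass would give a more focused hiss, at the cost of a longer example.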
Armed with this rudimentary (if a bit gross) knowledge, you can now hopefully appreciate how, for example, formant filters in synths can make oscillators seem to speak or sing (hint: they use multiple peaking filters), how a plugin can adjust vocal formants to change the apparent gender of a vocal, or how speech can be synthesised using filtered oscillators and noise generators.
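The formant-shifting idea can be sketched in a few lines of Python: move the resonant peaks up or down by a ratio while leaving the pitch alone. The `VoiceSettings` class and `shift_formants` function are hypothetical, and the ratio is purely illustrative. This is not how any particular plugin works: a real formant shifter has to estimate and reshape the resonances of recorded audio, which this sketch does not attempt.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical synth-voice settings: source pitch plus formant peaks."""
    pitch_hz: float
    formants_hz: tuple

def shift_formants(voice, ratio):
    """Scale every formant frequency by ratio, leaving the pitch untouched."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    shifted = tuple(f * ratio for f in voice.formants_hz)
    return replace(voice, formants_hz=shifted)

# Usage: raise the formants by an assumed 20% without changing the note.
original = VoiceSettings(pitch_hz=110.0, formants_hz=(700.0, 1100.0))
brighter = shift_formants(original, 1.2)
```

Because pitch and formants are stored separately, the same note can be given a smaller-sounding or larger-sounding vocal tract simply by changing the ratio.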