We hear common audio terms like distortion and frequency response mentioned all the time, yet there is comparatively little discussion of one of the most fundamental aspects of how the human hearing system works and its profound impact on the performance of digital systems. This is odd, as the original research dates all the way back to 1933 at Bell Labs, where experiments were performed to understand the capabilities of the human hearing system: http://en.wikipedia.org/wiki/Fletcher–Munson_curves
The work is named after its researchers, Fletcher and Munson, and is summed up in a nice graph:
[Figure: Fletcher–Munson equal-loudness curves]
Each line shows the sound pressure level needed at different frequencies for a sound to be perceived as equally loud. In other words, to maintain equal perceived loudness, the level needs to rise drastically at lower and higher frequencies. The ear is clearly most sensitive to frequencies in the middle bands, around 2 kHz to 3 kHz. This is probably due to our need to hear other humans well, since much of what makes speech intelligible (consonants in particular) falls in the same range. It may also have something to do with early man needing to hear hidden danger from animals and such.
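To put rough numbers on the shape of the curves, here is a minimal sketch. It does not use the Fletcher & Munson data itself. As a stand-in it uses the standard A-weighting formula, which was loosely modeled on one of the lower-level equal loudness curves, so treat the output as an approximation of relative ear sensitivity rather than a reproduction of the graph. The function name `a_weighting_db` is my own.

```python
import math

def a_weighting_db(freq_hz: float) -> float:
    """Relative sensitivity in dB from the standard A-weighting curve.

    Assumption: A-weighting is used here only as a convenient stand-in
    for the inverse of a low-level equal loudness curve.
    """
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.00

# Print the relative sensitivity at a few frequencies.
for f in (50, 100, 1000, 2500, 10000):
    print(f"{f:>6} Hz: {a_weighting_db(f):+6.1f} dB")
```

The values come out strongly negative in the bass, peak a little above 0 dB in the low kHz region, and fall off again toward the top of the band, which is the same general shape as the curves turned upside down.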
Lossy audio compression systems (e.g. MP3) universally use this phenomenon to great effect to reduce the amount of storage needed for a piece of music. They convert the audio samples into the frequency domain, and then assign less resolution to lower and higher frequencies, saving bits to allocate to the critical mid-band region. The effective resolution, for example, may be just 4 or 5 bits at high frequencies, whereas the entire 16 bits of CD audio samples are preserved for mid frequencies. Reduced allocation of bits to low and high frequencies causes distortion at those frequencies, but since the ear is less sensitive there, it will likely not be (very) audible.
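The trade-off between bits and distortion can be sketched with a toy quantizer. This is not how an MP3 encoder is actually built (a real codec quantizes transform coefficients under a psychoacoustic model). The band names and the `BAND_BITS` table are hypothetical, with the 16-bit and 5-bit figures taken from the paragraph above and the low-band figure simply assumed.

```python
import math

# Hypothetical allocation: bits of resolution given to each band.
BAND_BITS = {"low": 6, "mid": 16, "high": 5}

def quantize(value: float, bits: int) -> float:
    """Round a value in [-1, 1] to the nearest step of a uniform quantizer."""
    step = 2.0 / (2 ** bits)
    return round(value / step) * step

def approx_snr_db(bits: int) -> float:
    """Textbook signal-to-noise estimate for a full-scale sine wave."""
    return 6.02 * bits + 1.76

value = 0.3  # an arbitrary coefficient to quantize
for band, bits in BAND_BITS.items():
    error = abs(quantize(value, bits) - value)
    print(f"{band:>4}: {bits:2d} bits, error {error:.6f}, "
          f"SNR about {approx_snr_db(bits):.0f} dB")
```

Each bit removed costs roughly 6 dB of signal-to-noise ratio, so the bands with fewer bits carry far more quantization distortion. The bet the encoder makes is that the ear will not notice it there.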
Getting far more esoteric, let’s say we are trying to understand an artifact in digital audio reproduction called “jitter.” Jitter is a variation in timing. Instead of every audio sample from a CD source arriving precisely every 1/44100 of a second, some samples come slightly earlier or later. Debates rage across the Internet as to whether such variations cause audible distortion. Jitter is a complex topic and one that I will cover in depth in another article, but for now, let’s accept that the factors that vary timing and hence cause jitter have a frequency of their own. Understanding that makeup is super important in judging whether jitter can be audible or not. Why? The answer is in the above graphs! If the jitter frequency lies in the 2 to 3 kHz range, then it is far more likely to be audible than not. So be dubious of tests which claim jitter is not audible but were performed with frequencies outside of this range (or worse yet, do not state what frequency was used). They run afoul of the equal loudness curves.
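Here is a minimal sketch of what "jitter with a frequency of its own" means. It assumes the simplest possible case, a purely sinusoidal timing error, and the 2.5 kHz jitter frequency and 1 ns peak amplitude are values I picked for illustration, not measurements. The sideband estimate uses the usual small-angle approximation for phase modulation.

```python
import math

SAMPLE_RATE = 44_100      # CD sample rate, from the text
JITTER_FREQ_HZ = 2_500    # assumed: timing error varies at 2.5 kHz
JITTER_PEAK_S = 1e-9      # assumed: 1 ns peak timing error

def sample_time(n: int) -> float:
    """Arrival time of sample n: the ideal time plus a sinusoidal error."""
    ideal = n / SAMPLE_RATE
    return ideal + JITTER_PEAK_S * math.sin(2 * math.pi * JITTER_FREQ_HZ * ideal)

def sideband_level_db(tone_hz: float) -> float:
    """Level of each jitter sideband relative to a pure tone.

    Small-jitter approximation: modulation index = 2*pi*f*t_peak,
    and each sideband has about half that amplitude.
    """
    modulation_index = 2 * math.pi * tone_hz * JITTER_PEAK_S
    return 20 * math.log10(modulation_index / 2)

# Timing error of the first few samples, in nanoseconds.
for n in range(4):
    error_ns = (sample_time(n) - n / SAMPLE_RATE) * 1e9
    print(f"sample {n}: {error_ns:+.3f} ns")

tone = 10_000  # an arbitrary test tone
print(f"sidebands at {tone - JITTER_FREQ_HZ} Hz and {tone + JITTER_FREQ_HZ} Hz, "
      f"about {sideband_level_db(tone):.0f} dB relative to the tone")
```

In this simplified model the jitter frequency sets how far the distortion products land from the tone, which is why a test that does not state its jitter frequency tells you little.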
One other interesting observation from the graphs: if you look at them, you realize that they are not parallel to each other. This means that depending on the level of the sound, the response of the ear changes! For example, we are less sensitive to low frequencies at low levels than at high levels. Old-time audiophiles will remember the “Loudness” switches on amplifiers, designed to counteract this effect by boosting the low frequencies.
A better solution than a loudness switch is to utilize the power of signal processing in today’s processors to adaptively change the system response to match that of the ear. In other words, the volume control would not just change the volume but also shape the frequency response to match the loudness curves. That way, the volume control truly does what it is supposed to: change the perceived volume across all frequency bands equally. The way it is now, the frequency response changes perceptibly as you turn the dial, which is not correct.
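As a rough sketch of the idea, the function below turns a single volume setting into per-band gains, adding more bass the further the volume is turned down. Every constant here is hypothetical and chosen only for illustration. A real implementation would derive the compensation from the equal loudness curves themselves and would use many more bands.

```python
REFERENCE_DB = 0.0    # assumed: volume setting where no compensation is applied
BASS_SLOPE = 0.3      # hypothetical: dB of bass boost per dB of attenuation
MAX_BOOST_DB = 12.0   # hypothetical cap on the bass boost

def band_gains_db(volume_db: float) -> dict[str, float]:
    """Map one volume setting to gains for a low band and a mid band."""
    attenuation = max(0.0, REFERENCE_DB - volume_db)
    bass_boost = min(MAX_BOOST_DB, BASS_SLOPE * attenuation)
    return {"low": volume_db + bass_boost, "mid": volume_db}

# Show how the two bands drift apart as the volume comes down.
for volume in (0.0, -20.0, -60.0):
    gains = band_gains_db(volume)
    print(f"volume {volume:+6.1f} dB -> low {gains['low']:+6.1f} dB, "
          f"mid {gains['mid']:+6.1f} dB")
```

Unlike a loudness switch, which is either on or off, the boost here tracks the dial continuously.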
There is one room correction device which claims to have the above feature: TacT. While that is laudable, an even better solution would first test your hearing, create custom equal loudness curves for your ears (and compensate for losses in your hearing!) and adapt to those, rather than to the averages the research found across many subjects.
In future installments, I will describe other “psychoacoustics” data we have about the human hearing system. So, “do come back, you hear?”