Ethan Winer's definition of Audio transparency

Status
Not open for further replies.
Well, what exactly do you listen for? How do you hear jitter? What kind of audible artifacts does it introduce such that you could reliably say, "Ah, jitter, there it is"?

Rob:)

This is a question for Amir, Rob. I believe when he was managing a team that developed codecs, he trained himself to hear digital artifacts, to the point where now he can't not hear them. Sounds like sonic self-mutilation to me. I admire his dedication, but I do not envy his skills and training. I'm the opposite extreme, which may explain much. You can clue me into a quite audible distortion, I can listen for it, get it, and then it's gone because I find myself listening to the music and losing the sound I was listening for. In quick A/B switching I can pick up audible differences. Give me more than a few minutes and unless it is some obvious distortion, I'm liable to lose it. I've tried listening to audio with music I really dislike. That helps me stay focused on the sound. Give me good music, even decent music, and I'm going to listen for the sound of the music, not the "pre-echo."

I'm blessed.

Tim
 
Hello Amir

Maybe this isn't the right thread but what causes the jitter to be higher in HDMI vs Coax or Optical??

Rob:)
I have some general educated guesses:

1. Learning curve. We have nearly three decades of experience with S/PDIF in the field. Over that time engineers have figured out how to reject jitter there even in lower cost implementations. HDMI comparatively is very young. Its mass adoption has only been in the last 5 to 6 years.

2. HDMI is designed and usually implemented by video guys. HDMI came from DVI where it was a video only standard. Audio is embedded in the unused region of video signal and is slaved to video clock as a result. This is the right solution for video as nothing is more annoying than audio and video getting out of sync. Sadly we are forced to use video even if playing pure music. HDMI cannot function without a video signal.

3. In today's world, video jitter is immaterial. Our displays have fixed pixel locations so even if the incoming signal is dancing around as far as timing, we eliminate that completely by knowing the next pixel location and depositing the video pixel value there. This means within reason, video clock can be very "dirty" as far as timing and video still works. Unfortunately audio doesn't work this way and jitter makes it through its pipeline.

4. Since HDMI video is mandatory, as soon as you use that interface you have a lot more signal activity inside your receiver. The video circuits are all humming even though they are getting a black signal (or whatever the screen saver is in your player). This means that even if you got a clean clock out of HDMI, you wind up corrupting it with the activity inside your receiver. It does not take much to have a clock drift by picoseconds or nanoseconds.

5. There is no scrutiny. S/PDIF constantly gets measured and reviewed by audio magazines. So if you are building audio-only products, you get subjected to its performance here so engineers will likely measure and make sure their design is good. HDMI devices however are very rare in the 2-channel audio world so that interface doesn't get measured by those magazines. And unfortunately the home theater type magazines lack the equipment and knowledge to test for its performance. So with the rare exception of Paul Miller in the UK, no one ever publicizes bad performance here. So if you are an equipment maker, as long as audio plays reliably, you are golden. I am confident that 99% of these companies never measure jitter over HDMI.

6. The companies making HDMI silicon are not audio-focused companies. They are in the business of building high-speed interface chips at the lowest cost they can. Right now, their focus is in building "4K" chips to take advantage of the marketing hype around 4K displays (4X more resolution than current 1080p units). No one is going to pay them a cent extra for lower jitter. And due to very low volumes, high-end companies have no buying power to voice their opinion otherwise. All effort is focused around the company who buys a million chips, not a few hundred.
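To put "picoseconds or nanoseconds" in perspective, a common rule of thumb (not something stated in the post above) bounds the signal-to-noise ratio of a full-scale sine at frequency f, sampled with random clock jitter of RMS value t_j, at SNR = -20*log10(2*pi*f*t_j). A minimal Python sketch under that assumption follows; the function name is made up for illustration, and real jitter is often periodic rather than random, so treat the numbers as rough guides only:

```python
import math

def jitter_limited_snr_db(freq_hz: float, jitter_rms_s: float) -> float:
    """Rule-of-thumb SNR ceiling for a full-scale sine with random clock jitter."""
    return -20 * math.log10(2 * math.pi * freq_hz * jitter_rms_s)

# Worst case in the audio band: a 20 kHz full-scale tone.
for jitter in (10e-12, 100e-12, 1e-9, 10e-9):
    snr = jitter_limited_snr_db(20_000, jitter)
    print(f"{jitter * 1e12:8.0f} ps RMS -> about {snr:5.1f} dB")
```

Under this approximation, each tenfold increase in jitter costs 20 dB, which is why nanosecond-level timing errors can matter for audio even though video tolerates them.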
 
I missed the 120 dB SPL reference in the caption the first time around. That is ridiculously loud, unless it refers to peak level rather than average. But if it's peak, the crest factor is crucial. So I'll assume average, since that's how noise is usually assessed. To put this in context, 100 dB SPL is ear-splitting loud. That's the loudest I can stand to listen to normal music in my living room, and I like loud music! So once you scale everything down by 20 dB to make this graph practical, even 16 bits is enough to be transparent.

Just curious, where is this graph from? A book? A web article? I'd like to read more about the context.

--Ethan
I addressed all of these points in the 192K thread. Here it is again:

In other words, assuming we want 20 bits, we conclude that we need 20 bits. Hard to argue against that, aside from the fact that it's a bit circular.
No, I showed that you can do with 16 if you used the proper signal processing.

My original point was that as long as you listen at a level where the max amplitude is 96 dB SPL (which I consider loud enough), just considering ATH is enough to conclude that 16 bit is enough.
Now *that* is circular :D. You pick 96 based on what? That it happens to be the dynamic range of 16 bits/CD? You can't solve an equation based on the variables you choose. But rather, what the customer needs. The customer is a high-end one, who wants the absolute best fidelity and nothing lost in the capture of the source. For that, we can look to some research:

http://www.aes.org/e-lib/browse.cfm?elib=11981
Author: Fielder, Louis D.

"Dynamic Range Requirement for Subjective Noise Free Reproduction of Music

A dynamic range of 118 dB is determined necessary for subjective noise-free reproduction of music in a dithered digital audio recorder. Maximum peak sound levels in music are compared to the minimum discernible level of white noise in a quiet listening situation. Microphone noise limitations, monitoring loudspeaker capabilities, and performance environment noise levels are also considered.
....
The recent emergence of PCM recording techniques for music reproduction and the desire to standardize this format involves a re-examination of dynamic range requirements for natural music reproduction. Standardization of a 16 bit linear format would limit the dynamic range capability to 96 dB, and limit the quality of future PCM recorders if a wider range eventually became necessary.
....
The most accurate of previous examinations of dynamic range requirements was done by Fletcher [1] , who argued that 100 dB dynamic range was necessary.... Fletcher ignored the ear's ability to detect a noise source below that of the room noise by source localization.
...
For this particular microphone, the overload point is 130 dB and thus would allow the capturing of an equivalent dynamic range of 121 dB if peak levels of 130 dB exist in a performance. From the tabulation on peak sound levels close to musical instruments in Table 3, it is seen that musical instruments are capable of producing these high sound levels especially at distances less than 3 feet.
...
Four different microphones were measured which had overload levels between 120 to 140 decibels. They were all condenser microphones and as the graph shows the noise levels in the 3 - 7 kHz region were within 5 dB of each other. In summary, it is shown that close talking techniques and the proper selection of a microphone produces no limitation or reduction on the dynamic range requirement as determined by the playback experiments. Even a natural miking technique results in only a 9 dB white noise threshold.

In conclusion, several experiments were made to determine the dynamic range requirement for a recording system to produce no audible hiss when used to play back music at natural listening levels. These experiments resulted in a dynamic range requirement of 118 dB (non-amplified music), 124 dB (amplified music) for the professional, and 106 dB for the high quality consumer playback system."


http://www.aes.org/e-lib/browse.cfm?elib=7948
Author: Fielder, Louis D. (1995)

"Dynamic-Range Issues in the Modern Digital Audio Environment

The peak sound levels of music performances are combined with the audibility of noise in sound reproduction circumstances to yield a dynamic-range criterion for noise-free reproduction of music. This criterion is then examined in light of limitations due to microphones, analog-to-digital conversion, digital audio storage, low-bit-rate coders, digital-to-analog conversion, and loudspeakers. A dynamic range of over 120 dB is found to be necessary in the most demanding circumstances, requiring the reproduction of sound levels of up to 129 dB SPL. Present audio systems are shown to be challenged to yield these values.
....
A survey of the dynamic range capabilities of ADCs shows values of 90-110 dB, with the highest value for the best configurations of 20-bit word length converters. Analog Devices, Crystal Semiconductor, and Ultra Analog all make ADCs with dynamic ranges of 106-110 above 1 kHz. Unfortunately these values of dynamic-range performance are inadequate to meet the professional and most demanding of the consumer requirements, and techniques to increase the apparent dynamic-range characteristics are necessary."


Granted, not everyone requires such dynamic range. But if we are to establish what is an appropriate distribution specification, it had better accommodate all that we can throw at it. One has to remember that there is no better customer of music than the high-end buyer. They are the ones shelling out thousands and often tens of thousands of dollars on music, and they are the least apt to go and steal MP3s. So if you are going to set a standard, you better take good care of them.

Now, if you listen at a level above that, 16 bits is still enough, but then you have to involve simultaneous masking to show that it is. In other words, if you have music playing at 120 dB, it's going to severely degrade the sensitivity of your hearing. Not to mention -- as Ethan pointed out -- that finding a room where the background noise is below ATH is not exactly easy.
Not really. I am not sitting there listening to a tone at 120 db. A transient may just last a few milliseconds. Home theaters routinely hit 100+ db. THX spec for example requires 105 db. I am not seeing warning signs on such equipment saying you are going to go deaf.

Now in the old days of slow Internet and expensive hard disks, sure, we could argue these points and I used to do the same :). But technology and infrastructure has moved on. It is time that we don't short-change the customer knowingly in the interest of economizing for the sake of economizing. People are paying good money for music and like to feel, and be confident, that they are getting all the quality they could. Taking some of it out because we think they shouldn't have it doesn't make sense.
 
I addressed all of these points in the 192K thread. Here it is again:

I don't see addressed my point that 120 dB SPL is unreasonably loud. Further, what does this have to do with audio sounding "bleached" due to not using enough bits? How does it disprove the fact that all bit depth affects is the noise floor? Isn't this the definition of a straw man? Person 1 makes a statement, and person 2 addresses a different statement. Or is that a red herring? :D

--Ethan
 
I don't see addressed my point that 120 dB SPL is unreasonably loud.
I did. No one is playing a pure tone at 120 dB SPL. But rather, letting peaks get up there for a second or two. And besides, whoever is bothered by that can turn down the volume, whereas if you limited the dynamic range in recording, there is no going back.

Further, what does this have to do with audio sounding "bleached" due to not using enough bits? How does it disprove the fact that all bit depth affects is the noise floor? Isn't this the definition of a straw man?
I think the plot is lost :). We are discussing your statement in front of the professional audio industry that 100 dB is the definition of transparency. That number is above the 96 dB that can be captured using the 16-bit format. So I showed how Bob Stuart makes the same point there with that graph. I am still unclear if you are backtracking from that AES presentation or sticking to it. In that regard, I don't know if you are defending 100 dB in this statement or 80 or whatever. Can you please clarify?
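For reference, the 96 dB figure for 16 bits comes from the ratio of full scale to one quantization step, 20*log10(2^N). A minimal Python sketch of that arithmetic (the function name is illustrative; the formula ignores the roughly 1.76 dB sine-wave correction and any gain from dither or noise shaping):

```python
import math

def pcm_dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 20, 24):
    print(f"{bits}-bit: {pcm_dynamic_range_db(bits):.1f} dB")
```

By this measure, 16 bits gives about 96 dB and 20 bits about 120 dB, which is where the numbers being argued over in this thread come from.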
 
I don't see addressed my point that 120 dB SPL is unreasonably loud.


Hello Ethan

Depends on where it is in the frequency band. 4k sure run for cover but drop it down into the LFE bass region and it's quite tolerable for short bumps and bangs. I have hit 120db peaks in my HT in the LFE region and it shakes the daylights out of you but not much more.

Hello Amir

Thanks for your post.

Rob:)
 
Musical peaks aren't usually even "a second or two", but rather a fraction of a second. Even well-recorded rock and jazz will hit peaks of 120 dB when the average level is closer to 100 dB, and with classical the average volume ( when peaks hit 120 dB) is more like 90-95 dB. Loud, sure; unreasonably loud, no way.
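The gap between peak and average level described here is the crest factor. A rough Python sketch of how it could be computed from samples; the two test signals are synthetic stand-ins chosen for illustration, not measurements of real music:

```python
import math

def crest_factor_db(samples):
    """Peak level relative to RMS (average) level, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# A steady sine wave has very little peak-to-average headroom.
sine = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]

# One full-scale sample in every 100: a crude stand-in for sparse transients.
transient = [1.0 if k % 100 == 0 else 0.0 for k in range(1000)]

print(f"sine:      {crest_factor_db(sine):.1f} dB")
print(f"transient: {crest_factor_db(transient):.1f} dB")
```

The sine comes out near 3 dB, while the sparse-transient signal reaches 20 dB, the same spread as 120 dB peaks over a 100 dB average.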
 
Musical peaks aren't usually even "a second or two", but rather a fraction of a second. Even well-recorded rock and jazz will hit peaks of 120 dB when the average level is closer to 100 dB, and with classical the average volume ( when peaks hit 120 dB) is more like 90-95 dB. Loud, sure; unreasonably loud, no way.

Are you guys listening at 100dB averages? And you think you can hear audiophile subtleties? Hell it's a wonder you can hear anything.

Tim
 
Are you guys listening at 100dB averages? And you think you can hear audiophile subtleties? Hell it's a wonder you can hear anything.

Not even close; my average is in the mid 80s for music. Drop in a movie, especially one of the newer Blu-rays such as Super 8, and duck. I am not sure what level the LFE is at, but it's loud as hell in certain parts for dramatic effect, and if you don't know it's coming there is some real jump factor in there. Another example is the cannon broadside in Master and Commander. I usually listen to movies at around -6 dB or so from reference, depending on the soundtrack and type of movie.

Rob:)
 
Are you guys listening at 100dB averages? And you think you can hear audiophile subtleties? Hell it's a wonder you can hear anything.

Tim

On a regular basis, of course not. On occasion? Probably; it's significantly quieter than a rock concert, where much of the audience is listening at closer to 110 dB average.

But the point is that it's an easily achievable and not unrealistic volume, and source sound quality should be up to it.
 
The threshold of pain is ~130 dB, so unless "ear-splitting" has nothing to do with actual pain, something else is causing that pain with music material at 100 dB.

I'm with Rob: it's where that amplitude is in the spectrum. Hearing sensitivity by frequency, basic stuff.

I'm still amazed at why people are so focused on the dynamic range of higher bit length and the frequency extremes of sampling rate not the resolution.
 
...I'm still amazed at why people are so focused on the dynamic range of higher bit length and the frequency extremes of sampling rate not the resolution.

Because over in the 24/192 topic it is claimed that the only thing related to bit-depth is the noise floor, so the concept of increased resolution separate from that is a red herring. I'm not saying I agree with that, but there appears to be no engineering argument to counter that.
 
Because over in the 24/192 topic it is claimed that the only thing related to bit-depth is the noise floor, so the concept of increased resolution separate from that is a red herring. I'm not saying I agree with that, but there appears to be no engineering argument to counter that.

There is no engineering argument to counter that, that I'm aware of. It will be pretty big news if you've got one.

Tim
 
Depends on where it is in the frequency band. 4k sure run for cover but drop it down into the LFE bass region and it's quite tolerable for short bumps and bangs. I have hit 120db peaks in my HT in the LFE region and it shakes the daylights out of you but not much more.

Yes indeed, and this is not unlike my frequently repeated explanation that masking is equally a factor with artifact audibility. That's why it's impossible to pin this down to a single number. As I also said earlier, trying to identify a single number shows a lack of understanding of both how audio works and how we hear. I'm satisfied with keeping artifacts 80 dB below the music. If we can get them to -90 or even -100, all the better. But music can sound excellent, and highly satisfying, with only 80 dB separation between the signal and any noises. That people argue endlessly over the gnat of 0.01 percent distortion while ignoring the elephant of 30+ dB response errors in their room is truly mind boggling!
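The percent and dB figures above are two ways of writing the same amplitude ratio. A small Python sketch of the conversion (helper names are illustrative only):

```python
import math

def percent_to_db(percent: float) -> float:
    """Amplitude ratio given in percent, converted to dB relative to the signal."""
    return 20 * math.log10(percent / 100)

def db_to_percent(db: float) -> float:
    """Level in dB relative to the signal, converted to an amplitude percentage."""
    return 100 * 10 ** (db / 20)

print(f"0.01 percent is {percent_to_db(0.01):.0f} dB")
print(f"-100 dB is {db_to_percent(-100):.4f} percent")
```

So 0.01 percent distortion sits 80 dB below the music, and artifacts at -100 dB correspond to 0.001 percent.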

--Ethan
 
Please see my post directly above.

--Ethan
I am still confused, Ethan. You seem to scoff at the idea of picking one number here, yet this is what you said in your presentation:

"22:10 Aside from devices that intentionally add “color” by changing the frequency response or adding distortion, it’s generally accepted that audio gear should aim to be transparent. This is easily tested by measuring the above four parameters with various test signals. If the frequency response is flat to less than 1/10th dB from 20 Hz to 20 KHz, and the sum of all noise and distortion is at least 100 dB below the music, a device can be said to be audibly transparent. A device that’s transparent will sound the same as every other transparent device, whether a microphone preamp or DAW summing algorithm."

Seems like you picked one number -- 100 dB -- and positioned that as the minimum standard with the comment, "at least 100 dB below the music." My question simply is whether we should be going by your rule above or not.
 
There is no engineering argument to counter that, that I'm aware of. It will be pretty big news if you've got one.

Tim
I think there is a practical one there. The only way the math helps us there is by use of proper dither. If you don't apply dither, then you have distortion and distortion can be audible even through noise. So we need to see if as a practical matter people know to apply dither and what profile of it. I think the real knowledge of dither is not there in the music community. We went through a version of this in HD video. Movies are scanned at 10 bits but delivery formats for video to consumers are 8 bits. We immediately saw picture banding from content coming from major studios/production houses because people didn't understand the need for dither. I think the situation is better for music but not enough to rest easy :).
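As a toy illustration of the dither point, here is a Python sketch that quantizes a sine smaller than half a quantization step, with and without triangular (TPDF) dither. The signal level, length, and seed are arbitrary choices for the demo, and this extreme case is picked only because it makes the effect easy to see:

```python
import math
import random

def quantize(x: float) -> int:
    """Round to the nearest integer step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng: random.Random) -> float:
    """Triangular-PDF dither spanning +/-1 LSB (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
# A sine with 0.4 LSB peak amplitude: below half a quantization step.
signal = [0.4 * math.sin(2 * math.pi * k / 100) for k in range(n)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf_dither(rng)) for s in signal]

def signal_gain(out):
    """Least-squares estimate of how much of the original signal is in the output."""
    num = sum(o * s for o, s in zip(out, signal))
    den = sum(s * s for s in signal)
    return num / den

print("undithered: all zeros?", all(v == 0 for v in plain))
print("dithered gain of original signal:", round(signal_gain(dithered), 2))
```

Without dither the low-level signal is simply erased; with dither it survives at close to its original level underneath added noise, which is the trade the math relies on.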
 
My question simply is whether we should be going by your rule above or not.

LOL, no, you should not go by one rule. I used 100 dB to be perfectly safe, to make a point I could be certain cannot be refuted. In practice, 80 dB is probably enough. Surely 90 dB is enough. Again, I've never heard the noise floor of 16-bit audio on normal music playing at normal levels. So I'm absolutely certain that -96 is a safe number too. Let's go with that one, okay?

I hope this satisfies you! :rolleyes:

The obsession expressed over this audio gnat is staggering. It's as if you guys are trying with all your might to find a chink in my armor, to find some infinitesimal thing you can find fault with. As if that would somehow discredit my efforts to define fidelity in practical terms. It's not working. Maybe you need to start yet another thread with my name in the title. :cool:

--Ethan
 