Audible Jitter/amirm vs Ethan Winer

Amir, I have 2 questions for now:

1) I've read the description you've posted here and (in what seems like another lifetime;)) at AVS, and IIRC in all of the tests you've run you were listening through cans. Have you heard what you describe as jitter through loudspeakers?

2) Whether or not the tests you've run may be properly characterized as DBT or not, they would appear to be of the AB variety. Have you correctly identified what you describe as jitter by passing an ABX test?
 
Amir said,

"Putting it all together, jitter is indeed a small level of distortion. It is not a problem for the general public or for those with very limited funds. For the discerning audiophile, though, it should be a consideration. Investing in good gear with good digital hygiene can help eliminate it, bringing digital closer to its ideal characteristic of being transparent to the source."

I like the phrase "good digital hygiene can help eliminate jitter..." I have been peeling away layers of distortion for 8 years, using the same speakers and amps. The solid state amps I left behind were a big part of the problem. Later, the preamp, cabling, and source were all replaced, with the test being whether I could hear more within the recording.

I have a rather rare disc, "Lavin," where the first notes are plucked from the strings of a grand piano. What you hear is everything your system can do. When you get those raw string sounds, their reverberations, and their decay correct, you know it, if you know pianos intimately.
 
1) I've read the description you've posted here and (in what seems like another lifetime;)) at AVS, and IIRC in all of the tests you've run you were listening through cans. Have you heard what you describe as jitter through loudspeakers?
No, I have not tested jitter using speakers. While we had a $300K+ listening room at Microsoft, I was always too busy at work to go and use it to run these tests. Instead, I would do the testing at home using my audio workbench, which has a nice headphone amp with A/B inputs, letting me do quick tests with minimal gear in the loop.

I was asked once whether I thought it would be as easy to hear these artifacts with speakers and I said no. Happy to elaborate once we cover some of the main points of the debate.
2) Whether or not the tests you've run may be properly characterized as DBT or not, they would appear to be of the AB variety. Have you correctly identified what you describe as jitter by passing an ABX test?
No. It is difficult and expensive enough to run these tests as AB. :) While I have run countless ABX tests for other types of distortion using computers, doing so with CE hardware becomes very challenging for someone trying to satisfy personal curiosity. As it is, I had to buy two of many things, from content to hardware. Think about how you would structure an ABX test for yourself to run at home and you'll see the issues :).

To address your implied point and that of Glen, I have always readily admitted that my test results are not fit for publication nor absolute proof of anything. But the opposite is also true: they can't be easily dismissed as sighted and clearly subject to bias. In addition, I usually find far more flaws in the tests of people who say they didn't hear digital audio artifacts than they find in mine :). So until they raise the bar on what they do, I think I am fine stating that my results have value.
 
I don't think there's any question that your results have value, Amir. Blind listening, no matter how informal and inconclusive, is always more objective than sighted listening. I don't even understand how people deny that. With that said, even as a very frequent headphone listener, I'm not going to concern myself much with a distortion that many dB below the music. I think I can rest assured that I'll never hear it over the voices in my head. :)

P
 
I have read this debate with a great level of fascination. Although Ethan is a friend of mine, and we are often on the same page about audio issues of this nature, Amir presented a lot of meaningful facts which have merit.

Some people MAY be able to hear certain distortions that others aren't trained to recognize. In a way, it's analogous to the existence of God. Some swear by the existence of God, others say they cannot perceive it therefore it does not exist.

When Ethan presented his test of 3 digital recording systems last month, I was actually quite shocked that I could hear a difference between all three files (once he fixed his mistake with the duplicate files). While sorting out the low-end DAC was easy because of the noise, I discovered something different about the IMPACT in the recordings between the two higher-end DACs. One had definite impact and the other was more restrained. I think I was hearing a difference in the linearity of the two, causing subtle, but definite, dynamics differences.

Another, more extreme, example was my Ultimate Fireworks Video, which I recorded at 24 bits on the launch site for a fireworks display. I played this recording back on 3 different systems. All were "high end" for their type of device. Two were internal computer cards. One of them sounded gritty and granulated to the point where it seemed like 100% THD on the preamble before the fireworks, which was peaking at -85 dB in the recording. The National Anthem was playing on a car stereo 600' away, along with other ambient sound from the middle of the airfield at Danbury Airport, where the launchers were situated. The other computer sound card did a much better job reproducing the ambient sound, though there was much noise, clock buzz, etc. Finally, the high-end pro gear that recorded it was used to play it back, and once again it was like being there.

With regard to jitter affecting S/N, this may explain why going from 24 bits to 16 bits with NO dither resulted in the introduction of hiss where none was audible in the original 24-bit recording. Somewhere during the software conversion, there may be the equivalent of jitter being introduced into the resulting file, which is the only sensible explanation I can see for this noise appearing after a mere conversion to 16 bits. I experimented with all of the dither types and spectral shaping curves and eventually found a combination that was quieter than "no dither" on the converted file, but the introduction of hiss by merely converting the file was still somewhat baffling to me.
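For anyone who wants to reproduce the effect being described, here is a minimal, hypothetical sketch of the conversion step in question: quantizing a quiet signal to 16 bits with no dither versus with TPDF dither. It assumes numpy; the 997 Hz tone and its level are arbitrary choices for illustration, not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 44100
t = np.arange(fs) / fs
x = 0.001 * np.sin(2 * np.pi * 997 * t)   # quiet tone, about -60 dBFS

lsb = 1.0 / 2**15                          # one 16-bit LSB at full scale 1.0

# Plain 16-bit rounding ("no dither"): the error is correlated with the signal
undithered = np.round(x / lsb) * lsb

# TPDF dither: sum of two uniform random values, 2 LSB peak-to-peak total
dither = (rng.uniform(-0.5, 0.5, fs) + rng.uniform(-0.5, 0.5, fs)) * lsb
dithered = np.round((x + dither) / lsb) * lsb

for name, y in (("no dither", undithered), ("TPDF dither", dithered)):
    err = y - x
    rms_db = 20 * np.log10(np.sqrt(np.mean(err ** 2)))
    print(f"{name:12s} error RMS: {rms_db:6.1f} dBFS")
```

The dithered file can measure slightly noisier in RMS terms, yet its error is decorrelated from the signal, which is one conventional explanation for why a well-chosen dither can sound quieter than undithered truncation distortion.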

DACs DO sound different, and often vastly so at very small signal levels. At normal music levels, they can definitely affect the dynamics, as I discovered when listening to Ethan's 3 test files on my reference system.

BTW, last month the DAC on the left channel of my Denon DCD-590 started sounding 'tizzy'. I troubleshot by playing the disc on my Oppo, and it sounded good again. So the DAC, or some component related to it, has gone south in the Denon player. It was audible with my Bridgeport Symphony recording that I produced in Nov 2008. Had I had a pop/rock CD on, it would have gone unnoticed.

Anyway, great and stimulating debate guys!
 
I just think it is important to be accurate. If what you're doing is sort of single blind testing controlled solely by yourself, call it that and don't assert that you've done double blind testing.

And, I agree, the results of single blind testing are not proof of anything nor acceptable for publication.

However, just like any anecdotal testimony given here between friends, any and all results are interesting and suitable for discussion.
 
I just think it is important to be accurate. If what you're doing is sort of single blind testing controlled solely by yourself, call it that and don't assert that you've done double blind testing.
What would be the addition/difference that would make it double blind? Would appreciate a complete answer not just further assertions that I did it wrong :).

And, I agree, the results of single blind testing are not proof of anything nor acceptable for publication.
Well, results of double-blind tests are not proof of anything either. Look at the test at hand in the paper. I have shown that its conclusions were not merited despite using ABX methodology. I'd take a good single-blind test over a bad double-blind test every day of the week and twice on Sunday! If I had 100 people who could tell jitter in a blind test out of a population of 100, it would make for significant news. I am pretty sure it would make Ethan switch sides if nothing else :).
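For a sense of why that would be news, a back-of-the-envelope check: assuming each of the 100 listeners made even a single forced A/B call, the chance of a perfect score from pure guessing is vanishingly small (this is just arithmetic, not a claim about any actual test).

```python
# Probability that 100 listeners all guess a single two-way
# (A/B) trial correctly by pure chance: (1/2)^100
p = 0.5 ** 100
print(f"{p:.1e}")   # about 7.9e-31
```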

However, just like any anecdotal testimony given here between friends, any and all results are interesting and suitable for discussion.
Damned by faint praise :D.
 
If I had 100 people who could tell jitter in a blind test out of a population of 100, it would make for significant news.

That would also raise the question of why (sort of) single blind rather than the more credible double blind or ABX. I think Ethan has already stated that all you need is one person to pass a DBT or ABX to back your claim and he would change his opinion.
 
Damned by faint praise :D.

Not my intention. I am being sincere when I say I have enjoyed the interchange and have found the viewpoints expressed on both sides to be very interesting. Credit goes to both of you. Hopefully, you've displayed the template for more of this kind of debate.
 
What would be the addition/difference that would make it double blind? Would appreciate a complete answer not just further assertions that I did it wrong :).

Double blind simply means that neither the subject, nor the people conducting the test, know which sample is being tested when. You may have accomplished the same thing through your informal methodology (I've used exactly the same methodology myself, by the way), but you were both conducting the test and the subject of the test. The X is the core of the test. Randomization is typically applied through software in this kind of testing, where you'll be presented with A, presented with B, then presented with X, which could be A or B, and asked to identify it.
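To make that concrete, here is a toy sketch of the software-driven loop being described, with hypothetical function names; real ABX tools (foobar2000's ABX comparator, for example) handle playback and the statistics for you. This sketch only shows the hidden randomized X:

```python
import random

def abx_session(trials=16):
    """Each trial: the listener may audition A and B freely, then X,
    which is secretly A or B; the only task is to name X."""
    correct = 0
    for n in range(1, trials + 1):
        x = random.choice(["A", "B"])       # hidden assignment per trial
        # play("A"); play("B"); play(x)     # playback is stubbed out here
        guess = input(f"Trial {n}: X is A or B? ").strip().upper()
        correct += (guess == x)
    print(f"{correct}/{trials} correct")

abx_session()
```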

Your methodology may have been "blind enough," but it wasn't AB/X. By definition it can't be if you were doing this alone and going back and forth between two examples instead of randomly presenting A, then B, then X, identifying X, and repeating. And you may have done enough trials to reach a statistically significant sample, but if you did, you have a lot more patience than I. I've done the "push the button until you forget where you are, A/B for a while, then look" test you've described. I want to declare it audible or inaudible pretty quickly and then get on to something more productive or entertaining.

P
 
Double blind simply means that neither the subject, nor the people conducting the test, know which sample is being tested when. You may have accomplished the same thing through your informal methodology (I've used exactly the same methodology myself, by the way), but you were both conducting the test and the subject of the test. The X is the core of the test. Randomization is typically applied through software in this kind of testing, where you'll be presented with A, presented with B, then presented with X, which could be A or B, and asked to identify it.
Exactly. I was trying to get rsbeck to answer this way, because then he would realize that what he is asking for and what I ran are extremely close to each other. To expand, the distinction of double blind is that it removes conductor bias. A conductor could, on purpose or not, change how the test is conducted. For example, he could give me one sequence of clips, and you a different set, after hearing whether I am an audiophile or not. That could impact the outcome in a positive or negative way.

If I had my wife do the switching with my back turned, we would all agree that this would fall into the classic category of double blind, because a person would be conducting the test without the knowledge or ability to change how it was run. But think about it: it would not add any more accuracy than me randomizing the sequence and then switching the sources without knowing which one is which.

The remaining question for readers is then whether I did the randomization correctly or not. If I did, no more value comes from having someone else conduct the test instead of me.

Your methodology may have been "blind enough," but it wasn't AB/X. By definition it can't be if you were doing this alone and going back and forth between two examples instead of randomly presenting A, then B, then X, identifying X, and repeating.
ABX is an example of double-blind methodology, but not the definition thereof. It is designed to remove the chance of cooked sequences and of the test harness itself introducing bias in some manner. We also use it to find out which testers were just guessing and had no idea which one was which. This aspect, however, does not apply to trained listeners. There are also some other distinctions I won't drill into here.

I personally put a lot of value in people running a blind AB test, and I am confident Ethan does too, as those are the type of tests he has put forth in his talks and on his web site. Sure, some more confidence comes from ABX-style testing, especially if you don't know how a test was run and who ran it. But a substantial amount of validity comes from the fact that the experiment is blind.

And you may have done enough trials to reach a statistically significant sample, but if you did, you have a lot more patience than I. I've done the "push the button until you forget where you are, A/B for a while, then look" test you've described. I want to declare it audible or inaudible pretty quickly and then get on to something more productive or entertaining.

P
I actually take it one level further. I change which source is which, re-randomize, and run the test again. At the end, I look to see which one was which. If the results are consistent, then I believe them.

The above is important because of variations in hardware and content. For example, it might seem that if I have two identical CD players and two copies of the same title, then switching sources should not be necessary. But jitter and similar audio artifacts can occur due to very small differences, and we need to rule them out. Tolerances in, say, power supply components could let more AC ripple onto the DAC clock source in one player than in the other. Or one CD could have been stamped differently than the other.

Fortunately, I have never found the switch to disagree with the first set of results, but I don't sleep easy until I can verify the difference a second time.

Running the test twice also rules out other things that could invalidate it, such as the starting sequence not being random.
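A hypothetical sketch of that two-pass protocol, for readers who want to picture it: randomize which unit carries which content, run blind trials, swap the units, re-randomize, and only then unblind. The player names and trial counts below are made up for illustration, and the same single-operator caveat debated in this thread applies, since the listener also wrote the assignment.

```python
import random

def blind_pass(assignment, trials=8):
    """One blind pass. `assignment` maps each physical player to the
    content it carries; a hidden coin flip picks the player per trial."""
    log = []
    for n in range(1, trials + 1):
        player = random.choice(list(assignment))   # hidden from the listener
        call = input(f"Trial {n}: clean or jittered? ").strip().lower()
        log.append((assignment[player], call))
    return log

# Pass 1, then swap which player carries the jittered content, so that a
# chassis-level fault can't masquerade as the variable under test.
log_1 = blind_pass({"player_1": "jittered", "player_2": "clean"})
log_2 = blind_pass({"player_1": "clean", "player_2": "jittered"})

# Unblind only after both passes and check the calls for consistency.
hits = sum(truth == call for truth, call in log_1 + log_2)
print(f"{hits}/{len(log_1) + len(log_2)} calls matched the hidden content")
```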
 
After reading through the description, I'd say that Amir's methodology would most likely replicate the results and procedural requirements of a double-blind test. However, as stated, it could not be reported as an "official" double-blind test due to the variances in how the test was conducted.

Personally, I feel it's important that we as audiophiles learn how to assess equipment via methods like this. Since we often have only ourselves to perform the work, developing standardized one-operator protocols for "blind" listening seems like a prudent idea.

Lee
 
That would also raise the question of why (sort of) single blind rather than the more credible double blind or ABX. I think Ethan has already stated that all you need is one person to pass a DBT or ABX to back your claim and he would change his opinion.
I don't think he has asked for ABX. As I noted, he uses AB tests to prove his own points, so it would be odd to suddenly require a higher standard from the other person.

As for wanting that one person, he has that: me! Have him cross-examine me all he wants to get comfortable that I knew what I was doing. Only then does he get to dismiss my results.

On my side, I have gotten him to say that he has heard the very distortions he says don't exist when going from 16 to 24 bits. His defense is that he had to turn up the volume to hear them. I am fine with that. As long as he agrees that the distortion is indeed there, and can be heard at elevated levels, then we are a hell of a long way from "you can't hear jitter no matter what." If you want to turn this into a double-blind test, I could force him to use the very clip he used in that test, turn up the volume as he did, and I would win! :) After all, we know that he could hear the difference.

Now if you want to tell me that he likes to cook the tests by using loud material so that I can't turn up the volume as he has done in his test, then we don't need to do that either. I have already conceded that there are a million ways jitter is not audible and that digital audio fidelity is not about loud signals. What he has to prove is that jitter is never audible. If he indeed heard something when he turned up the volume, then what better evidence do we need?

Oh yes, proving that what he heard was due to jitter. I don't actually know if it was, since I can't measure his gear. But I would declare victory anyway, since it doesn't matter why the fidelity was not there. His own test proves that accuracy does matter and that digital is not perfect in all of its forms. That is something he would argue against just as much as he does jitter.

Indeed, in this discussion we are trying to figure out whether jitter reduces effective resolution below 16 bits and, if it does, whether that matters. If he has found that there is a difference between 16 bits and 24, surely he has conquered hearing a much more difficult impairment than the one we are talking about here!

It is a subject for another test, but I suspect the test he really ran above was 16 bits versus lower than 16 bits. Most "24-bit" converters are not linear beyond 16 bits. Run them in 16-bit mode, and they wind up only being good to 14 bits. So I shouldn't be unfair to him by making that last point :). BTW, he would only know this if he had seen measurements of his DAC's linearity, as I have for all the devices I have tested. I am always disappointed when the advocates of objective testing do away with measurements because it is too hard, yet demand that others go through a lot of work to prove their point of view. Either you believe in objectivity or you don't. Start with measurements and then move on to listening tests, which tend to be subjective at some level.
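To put rough numbers on "good to 14 bits": the textbook figure for an ideal n-bit converter reproducing a full-scale sine is an SNR of about 6.02n + 1.76 dB, so losing two effective bits costs roughly 12 dB of dynamic range.

```python
# Ideal quantization SNR for a full-scale sine: 6.02*n + 1.76 dB
for bits in (16, 14):
    print(f"{bits} bits -> {6.02 * bits + 1.76:.1f} dB")
# 16 bits -> 98.1 dB
# 14 bits -> 86.0 dB
```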
 
A partner can make life much, much simpler... When I evaluated several DACs last year, my partner used a die to choose which DAC was to be connected for any given trial. He'd roll the die, connect the indicated DAC (or make no change, if it was the same DAC as the last trial), and leave the room. I'd walk in, listen for a minute or two, and select which DAC I thought was in use. My view of the gear was completely blocked, i.e. no visible clues. Volumes were level-matched to within 0.1 dB. The big drawback to this technique is acoustic memory: each observation was separated by approximately 2 minutes, so the differences needed to be significant enough to be heard after such a long time lapse.
 
Now if you want to tell me that he likes to cook the tests by using loud material so that I can't turn up the volume as he has done in his test, then we don't need to do that either.

I would argue that riding the volume up during quiet passages, or any manner of playing the material at abnormally high volume for the purpose of pushing the distortions into audibility, is cooking the tests. But with that said, Amir, your methodology, while not "double" and containing no "X," is sufficiently blind in my book. I've used very similar methods often, not to prove anything to anyone else, but to determine whether or not a new component is earning its keep in my systems. It works for me. FWIW, I use the high end of normal listening volume (too high for an extended listening session), on headphones. I call what I can't hear under those conditions "inaudible." But your hearing, your comfort with extreme volume, and your mileage may vary.

P
 
We may wish that we could be both the subject and conductor of the test and still call it double blind, but as long as you are both the subject and conductor, it isn't double blind. So, it is inaccurate to assert that you have conducted your tests double blind. Since your methodology is a sort of informal one where you claim to confuse yourself and then A/B, you still have all kinds of possibilities for experimenter bias that cannot be ruled out. That's why double blind tests are conducted with another level of blindness. Scientists understand that even the most objective among them is subject to experimenter bias -- because they are human.

IMO, even calling your test "single blind" would be problematic because you claim to be blinding yourself through intentional confusion and cannot rule out experimenter bias. So, it isn't an insult to ask that experimenter bias be completely ruled out before calling your tests blind or double blind and it is hardly an insult to ask that correct terminology be used when reporting. Call it what it is -- it is a sort of single blind test. Calling for accuracy is not an attack nor IMO should anyone feel attacked.
 
Yep, all kinds of opportunity for bias and inaccuracy, which is why I use that methodology for testing equipment in my own system, not to show evidence of anything to anyone else. I would still vote for the "confuse myself single blind A/B" comparison over any variety of sighted listening tests, even panels. Audiophiles who think they are giving themselves and their gear a fair shot with sighted listening "testing" are fooling themselves. Literally and figuratively.

P
 
I would still vote for the "confuse myself single blind A/B" comparison over any variety of sighted listening tests, even panels.

That is too funny. I think you should "trademark" that term in audio at the very least :)
 
We may wish that we could be both the subject and conductor of the test and still call it double blind, but as long as you are both the subject and conductor, it isn't double blind. So, it is inaccurate to assert that you have conducted your tests double blind. Since your methodology is a sort of informal one where you claim to confuse yourself and then A/B, you still have all kinds of possibilities for experimenter bias that cannot be ruled out. That's why double blind tests are conducted with another level of blindness.
You can rule out experimenter bias by asking me questions. That is normally impossible when a paper is presented and we can't question the participant, hence the reason those protocols tend to be more rigorous. Such is not the case here.

Keep in mind that even if a test is run double-blind, it can just as well be biased. Note how the paper being discussed started with a less important type of jitter. No amount of test methodology saves you after you make that mistake. What is important, then, is to understand the true nature of what is being presented, rather than being impressed or disappointed by superficial aspects of it.

Scientists understand that even the most objective among them is subject to experimenter bias -- because they are human.
I hope your implication is not that I don't understand this, lest I catch you explaining to a Chinese chef how to make stir-fry :D.
So, it isn't an insult to ask that experimenter bias be completely ruled out before calling your tests blind or double blind and it is hardly an insult to ask that correct terminology be used when reporting. Call it what it is -- it is a sort of single blind test. Calling for accuracy is not an attack nor IMO should anyone feel attacked.
I don't need to tell you the difference between asking a question which is respectful of the person's experience and credentials, and the type of indirect questioning you do when you don't believe a person ;).

You don't see me grilling Ethan as to whether he knew how to run his audio workstation software to inject noise properly at the levels he says he did. That's because even though there could be opportunity for mistakes and operator bias, I consider his qualifications and ethics high enough to make them immaterial. More importantly, if I did mention it, he would likely take it as an insult implying that he doesn't know how to use the very software he makes a living from.

90% of the information in my posts had nothing to do with my own testing anyway. What I wish is that you would consider the data which is free of bias and comment on it, rather than spending the entire Saturday correcting me on whether I understand what single- or double-blind testing means in audio circles :). You need to trust me that I know the difference and that I was after objective data to steer software development for a $14B business. You don't set yourself up to fool yourself when you have something like that at stake...

So let's move on and talk about other points. I think we have beat this one to death.
 
