Double blind testing and stress

As moderator let me sum up:

The odds against guessing a 1-in-4 choice correctly four times in a row are insignificant.
Cables are a sham and ABX/DBT will prove it.
If there are differences in cables they are insignificant.
Those who claim there are differences should be forced to prove it.
Anecdotal evidence is not sufficient.
They must be subjected to a proper ABX/DBT protocol.
They will not do it because they are afraid they will fail.

Now let me humbly suggest that these arguments have been made ad nauseam since ABX/DBT made its appearance. Except for your math, ABX/DBT tests have typically involved two cables, with ten trials taken as a sufficient guard against guessing. If you google it you will find there have been plenty of cable shootouts involving ABX/DBT, with mixed results. Many have tried to induce high-end reviewers and audiophiles to engage in ABX/DBT, again with mixed results.
You are of course free to discuss the issue. I would hope that those who do would try to bring something new to the argument.
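
For anyone who wants to put numbers on the ten-trial, two-cable protocol mentioned above, here is a rough sketch in Python. It assumes every trial is an independent 50/50 guess; the function name `p_at_least` is just an illustration, not part of any ABX standard.

```python
from math import comb

def p_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# How often would a pure guesser reach each score in a ten-trial test?
for correct in range(7, 11):
    print(f"{correct}/10 or better by guessing: {p_at_least(correct, 10):.4f}")
```

The lower that probability, the harder it is to write off a good score as luck. Where to draw the line is a judgment call this sketch does not settle.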
 
you mean it stuck in your craw that much when this statement "You could flip a coin and get four heads or four tails in a row but not likely" was challenged that you have spent the next day trying to justify it?? (esp when it turns out the odds are not as great as you implied)

BTW, I missed where he had to choose between four components, this tends to suggest he had to choose between two??? This argument was made a while ago in Stereophile when JGH was asked to distinguish the SP9 vs the SP14.

So the 1/4 four times escapes me slightly.

You made the claim that I would be a sucker in some city in the desert for thinking the odds against four heads in a row were not THAT great. You failed to consider that whatever happens in that desert city, it is all based around 'odds', and as such each game is different in terms of risk and reward, and hence not that applicable.

(if you were referring to me) then I personally have never made the claim that cables are a sham or there are no differences under all conditions, however I DO make the claim that the differences are nowhere near as extreme as made out, to wit 'a $40 000 pair of speakers can be made to sound shit with poor cables'.

THAT, notwithstanding your protestations above, should be a walk in the park under blind test conditions no??

WHY should anecdotal evidence be enough when certain specific unconditional claims are made?? If anecdotal claims are made, then yeah, anecdotal evidence is ok. If scientific claims are made, then scientific evidence is needed.

Your thoughts on the stress involved??

And, why the specific mention of you posting as MODERATOR? I don't get why you made that point?? Do you have two very different roles in this thread, normal interested party and another as moderator? How do you see those two different roles, what differentiates the two??

I too would like to see something new, a plausible explanation of why not knowing what the component is makes a difference to the actual sound field. Do you have new thoughts on that? If you do, are they as a result of your own experiences with undergoing a dbt??
 
I have little to add to this old debate, but here goes:

Statistically significant sample

Margin for error

In a properly conducted study, these two concepts answer every rational objection that is typically made against ABX testing. But almost none of the testing we're talking about is properly conducted. Still, even in the most informal, uncontrolled, personal listening test, I don't know how anyone can argue that listening while looking at your shiny new toy and sitting on your recently emptied wallet could possibly be as objective and revealing as evaluating the new acquisition in your reference system with eyes closed. No science required, just common sense. Yet audiophiles make that argument all the time.

With that said, should those who claim differences in cables, isolation plinths, shakti stones, and extension cords be required to prove their claims through blind testing? Of course not. They should be allowed to waste their money any way they see fit. And the rest of us should be allowed to be amused.

P
 
you mean it stuck in your craw that much when this statement "You could flip a coin and get four heads or four tails in a row but not likely" was challenged that you have spent the next day trying to justify it?? (esp when it turns out the odds are not as great as you implied)

BTW, I missed where he had to choose between four components, this tends to suggest he had to choose between two??? This argument was made a while ago in Stereophile when JGH was asked to distinguish the SP9 vs the SP14.

So the 1/4 four times escapes me slightly.

You made the claim that I would be a sucker in some city in the desert for thinking the odds against four heads in a row were not THAT great. You failed to consider that whatever happens in that desert city, it is all based around 'odds', and as such each game is different in terms of risk and reward, and hence not that applicable.


Recently our role as moderators has been re-emphasized. As much as I would like to exchange barbs with you I decline.

You challenged the notion that guessing four in a row was not that difficult. My point is you are just mathematically incorrect. You remain so. The city in the desert comment was just a way of making a point. I did not call you a sucker. Name calling will not be tolerated any more by the moderators or members, if it ever was.
The JGH test was an example provided by me. The test under consideration involved identifying four types of digital data. Therefore the odds for each choice would be 1/4. I remain ready to accept your wager with those odds any day. Nothing stuck in my craw. You yourself stated you did not know and asked me whether I had calculated the odds. You are correct, gaming theory is more complicated than just figuring odds.
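
To keep the two sets of odds in this exchange apart, here is a small sketch in Python. It assumes every guess is independent and every option equally likely; `p_all_correct` is a made-up helper name for illustration.

```python
from fractions import Fraction

def p_all_correct(choices: int, trials: int) -> Fraction:
    """Chance of guessing right on every one of `trials` picks among `choices` options."""
    return Fraction(1, choices) ** trials

four_way = p_all_correct(4, 4)        # four correct picks among four types of data
coin_specific = p_all_correct(2, 4)   # four heads in a row
coin_either = 2 * coin_specific       # four heads OR four tails in a row

print("1-in-4 choice, four times running:", four_way)
print("four heads in a row:", coin_specific)
print("four heads or four tails in a row:", coin_either)
```

The four-way case and the coin-flip case differ by a wide margin, which may be why the two sides kept talking past each other.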

PP I see nothing wrong with a properly conducted double blind protocol. Indeed I referred to one in the phase alignment threads. ABX is a mere type of DBT. I don't like it.

As to who should be subjected to ABX/DBT: I submit everyone. In your choice of computer and active speaker did you subject yourself to a rigorous double blind protocol against a well set up vinyl system? Correct me if I am wrong. I suspect not. I suspect that you indulged in your pre-expectation bias that digital was better and it was not necessary. Correct me if I am wrong. Anecdotal evidence seems to be sufficient in support of my own dogma. It's the other guy who has to prove his point.
Like I said, nothing new here. You gentlemen will have the last word. Unless I need to correct some factual error.
 
Well, here's my last word, FWIW. I don't think any music lover should be subjected to DBT. If he wants to employ some informal version, as I do when it is practical, good. But this is about enjoyment of art, and how each of us gets there is pretty personal. If you like vinyl and electrostats, and I like hard discs and active speakers, to each our own bliss, no need for verification. Who would be subject to DBT, in a more perfect world, would be all of those, in positions of self-proclaimed authority, making performance claims that are not otherwise supported outside of their self-proclaimed authority. Digital audio designers further reducing jitter below levels that are broadly considered to be inaudible and claiming more "inner detail" or a "deeper sound stage." Turntable manufacturers claiming a "more natural," "life-like" sonic signature, a "euphonic musicality." I.e., manufacturers selling expensive solutions to problems that may not exist or even more expensive improvements backed up only by a bit of fluffy marketing copy. In a better world, these guys, and especially the audiophile press that enables them, would all be subject to rigorous, independent, statistically sound double-blind testing of their claims, their ears and their honesty. But the world is not perfect and I'm not holding my breath.

P
 
Whether a probability of 1/16 is characterized as "astronomical" is beside the point. It's what's at risk that counts.

If a decision to buy a $40k cable which makes no real difference is based on a test outcome of 1/16 then the risk is $2.5k and would be characterized as a loss.

If all other outcomes result in a "not buy" decision then the risk is $37.5k saved.
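
The arithmetic behind those two figures, as a quick sketch in Python. It assumes the cable truly makes no difference and that the buy decision hinges on a fluke result with probability 1/16; the variable names are mine.

```python
cable_price = 40_000   # dollars, the $40k cable from the example
p_fluke = 1 / 16       # chance a pure guesser passes the test anyway

# Expected money wasted: you only buy when the fluke happens.
expected_loss = cable_price * p_fluke

# Expected money kept: every other outcome leads to "not buy".
expected_saved = cable_price * (1 - p_fluke)

print(f"expected loss:  ${expected_loss:,.0f}")
print(f"expected saved: ${expected_saved:,.0f}")
```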

BUT, I think the risk that is most feared is calculated with a different formula: Risk = probability X (Cost to my ego).

Really. How many of us can afford to pay such a price?
 
Forgot to answer this question:

Your thoughts on the stress involved??

It's axiomatic. The greater your stake in the outcome the greater your stress in the test. When I'm under a lot of stress my system sounds like crap.
 
Cables and blind testing...the age-old boreass debate and one, quite frankly, I wish could be put to bed for good. That of course will never happen. I have played around swapping speaker cables out in my system, one that is highly resolving by the way, and have heard differences. Could I duplicate that under a controlled DBT? Probably not since my sonic memory is about as long as my ____ (fill in the blank). I have a friend who recently bought a $6K pair of gold ICs. Would I do that? No. Do I think they make a difference? If he thinks so, then yes they did. Either way it's none of my business how he spends his money. If a guy wants to spend a ton of money on cables and it gives him satisfaction, fine. If a guy pays $200 for a bottle of wine, again, none of my business. Most of this hobby can't be justified economically anyhow so just accept that fact and let the "crazies" who can afford to spend $30K on speaker wires spend it.
 
Cables and blind testing...the age-old boreass debate and one, quite frankly, I wish could be put to bed for good. That of course will never happen. I have played around swapping speaker cables out in my system, one that is highly resolving by the way, and have heard differences. Could I duplicate that under a controlled DBT? Probably not since my sonic memory is about as long as my ____ (fill in the blank). I have a friend who recently bought a $6K pair of gold ICs. Would I do that? No. Do I think they make a difference? If he thinks so, then yes they did. Either way it's none of my business how he spends his money. If a guy wants to spend a ton of money on cables and it gives him satisfaction, fine. If a guy pays $200 for a bottle of wine, again, none of my business. Most of this hobby can't be justified economically anyhow so just accept that fact and let the "crazies" who can afford to spend $30K on speaker wires spend it.

Amen brother
 
So you're in your own room, you've got an ABX box in place to compare two different speaker cables. Doesn't matter which ones. It could be Nordost Valhalla versus Rat Shack. It could be Transparent's $30K+ versus Blue Jeans.

No one is watching you. You are alone, sitting in the sweet spot, in your most comfortable clothes. You can perform this test using your own gear, your own carefully selected music, at your own leisure. There is no time limit. You can do this test in 1 day, 1 week, 1 month, 1 year. No one will even know you are engaged in this scientific undertaking. You decide when to listen. You decide when to take a break. You decide when to flip the switch on the ABX box. You decide when to stop the test and look at the data.

Where is the stress?

How is this not a far more rational way of getting to the truth of the matter than sighted listening tests?

The bottom line is that there is no way to reason someone out of a position when that person didn't use reason to acquire the position in the first place. In other words, there is no way to reason with a true believer.
 
Well, here's my last word, FWIW. I don't think any music lover should be subjected to DBT. If he wants to employ some informal version, as I do when it is practical, good. But this is about enjoyment of art, and how each of us gets there is pretty personal. If you like vinyl and electrostats, and I like hard discs and active speakers, to each our own bliss, no need for verification. Who would be subject to DBT, in a more perfect world, would be all of those, in positions of self-proclaimed authority, making performance claims that are not otherwise supported outside of their self-proclaimed authority.
Well said :).

Digital audio designers further reducing jitter below levels that are broadly considered to be inaudible and claiming more "inner detail" or a "deeper sound stage."
I have read every paper I have found from AES and elsewhere and have yet to see anything that comes close to telling me what jitter is audible. The methodology used in the papers that do exist is extremely easy to invalidate (the problem set is infinite, so even if they wanted to prove this point, they likely couldn't). I could be mean and put you on the spot and say "prove this declaration" but I won't :D.

In a better world, these guys, and especially the audiophile press that enables them, would all be subject to rigorous, independent, statistically sound double-blind testing of their claims, their ears and their honesty. But the world is not perfect and I'm not holding my breath.

P
That is not the only way to analyze such claims. We can first look to engineering principles and see if they can explain the design. Only when that fails us do we need to resort to the difficult, expensive and time-consuming process of blind testing.

For example, I know MP3 encoders at 128kbps tend to roll off the high frequencies. We don't need double blind test to realize this. Likewise, mathematics shows us the level of jitter we need to get below to reproduce 16 bits of audio samples. Anything less means a degradation. I expect quality hardware to not degrade the audio samples below CD's resolution -- dare I say whether I can hear it or not. It is like buying a car that advertises it can do 150 miles an hour but in reality tests show that it can only do 100. Should I not care because I can't find a road here to test it?
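
One common back-of-the-envelope version of that jitter calculation, sketched in Python. It assumes a full-scale sine wave at the top of the audio band and asks that the timing error never move a sample by more than half an LSB; other criteria give different numbers, and the function name is only illustrative.

```python
import math

def max_jitter_seconds(bits: int, freq_hz: float) -> float:
    """Largest timing error that keeps a full-scale sine within half an LSB.

    A sine of amplitude A has a maximum slope of 2*pi*f*A, and half an LSB
    is A / 2**bits, so the allowed timing error is their ratio.
    """
    return 1.0 / (2 * math.pi * freq_hz * 2**bits)

# 16-bit audio, 20 kHz tone (assumed worst case for the audio band).
limit = max_jitter_seconds(16, 20_000.0)
print(f"jitter budget: {limit * 1e12:.0f} ps")
```

Under these assumptions the budget lands on the order of a hundred picoseconds. Whether anyone can hear a shortfall against it is a separate question.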
 
I too would like to see something new, a plausible explanation of why not knowing what the component is makes a difference to the actual sound field. Do you have new thoughts on that? If you do, are they as a result of your own experiences with undergoing a dbt??
Given our current state of human understanding, there is no possible answer to your question. I should qualify that: there is no rational answer to your question.

Again, you can't reason someone out of a position who didn't use reason to get to that position. You can't reason with a true believer.
 
So you're in your own room, you've got an ABX box in place to compare two different speaker cables. Doesn't matter which ones. It could be Nordost Valhalla versus Rat Shack. It could be Transparent's $30K+ versus Blue Jeans.

No one is watching you. You are alone, sitting in the sweet spot, in your most comfortable clothes. You can perform this test using your own gear, your own carefully selected music, at your own leisure. There is no time limit. You can do this test in 1 day, 1 week, 1 month, 1 year. No one will even know you are engaged in this scientific undertaking. You decide when to listen. You decide when to take a break. You decide when to flip the switch on the ABX box. You decide when to stop the test and look at the data.

Where is the stress?
I have done this many times. It is true that stress is gone. ABX of CD/LP though requires a lot more energy than ABX of music files on a computer. To do this you need:

1. Two identical sources. I use two DVD audio players. If you don't have two identical players, do the test with the players you do have. Once you develop a preference, do the test again with the sources reversed (and randomized). If preference moves, then it was due to the player and your results are invalid. If not, then you are golden. I assume most people don't have two identical expensive LP players :). So they should likely try to test in digital domain as that also allows fast remote control such as hitting play on two players at once.

2. Two identical pieces of content (I use DVD-A).

3. The content above needs to be revealing of differences.

4. A very transparent switching and amplification device. I use my Stax headphone amp(s) for IC tests. There, I worry whether it is representative of my normal amplifier. But it is a decent compromise to test whether there is any difference at all. I don't know how one does the same for speaker wires since it could violate rule 5 below.

5. Switching time must be very fast. I like this to be a fraction of a second so that I can go back and forth instantly. Using things like HDMI for example is out of the question as it mutes audio while attempting to lock. Ditto for stopping, switching speaker cables, and playing again. This is why I have no opinion on speaker cables. Never tested it because I don't know how.

6. A ton of time on your hands! Even a simple A/B test blows an entire weekend for me.
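
The "reversed (and randomized)" part of the checklist above can be sketched in Python roughly like this. It is only an illustration: it assumes somebody or something other than the listener holds the key until the end, and the helper names `make_schedule` and `score` are made up.

```python
import random

def make_schedule(trials: int, seed=None) -> list:
    """Randomly decide which source ('A' or 'B') sits behind the hidden position on each trial."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(trials)]

def score(schedule: list, answers: list) -> int:
    """Count the trials where the listener's answer matches the hidden source."""
    return sum(hidden == said for hidden, said in zip(schedule, answers))

# Example: 16 trials; the listener's answers here are placeholders, not real data.
schedule = make_schedule(16, seed=1)
answers = ["A"] * 16
print(f"{score(schedule, answers)} of {len(schedule)} correct")
```

Leaving `seed` unset gives a fresh random order each session. A fixed seed is only useful for checking that the script itself behaves.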

How is this not a far more rational way of getting to the truth of the matter than sighted listening tests?
I highly recommend people attempt at least one rigorous test at home if not for anything other than being able to challenge the other guy good :D.
 
I have done this many times. It is true that stress is gone.
I have added bold to your post. The true believers should take note.

ABX of CD/LP though requires a lot more energy than ABX of music files on a computer. To do this you need:

1. Two identical sources. I use two DVD audio players. If you don't have two identical players, do the test with the players you do have. Once you develop a preference, do the test again with the sources reversed (and randomized). If preference moves, then it was due to the player and your results are invalid. If not, then you are golden. I assume most people don't have two identical expensive LP players :). So they should likely try to test in digital domain as that also allows fast remote control such as hitting play on two players at once.

2. Two identical pieces of content (I use DVD-A).

3. The content above needs to be revealing of differences.

4. A very transparent switching and amplification device. I use my Stax headphone amp(s) for IC tests. There, I worry whether it is representative of my normal amplifier. But it is a decent compromise to test whether there is any difference at all. I don't know how one does the same for speaker wires since it could violate rule 5 below.

5. Switching time must be very fast. I like this to be a fraction of a second so that I can go back and forth instantly. Using things like HDMI for example is out of the question as it mutes audio while attempting to lock. Ditto for stopping, switching speaker cables, and playing again. This is why I have no opinion on speaker cables. Never tested it because I don't know how.

6. A ton of time on your hands! Even a simple A/B test blows an entire weekend for me.
Are you trying to describe an ABX test or some sort of ABC/HR test?

We must be clear for the scientifically challenged to distinguish between blind testing which is used to determine preference versus blind testing to determine whether or not there does exist a difference.
 
I have read every paper I have found from AES and elsewhere and have yet to see anything that comes close to telling me what jitter is audible. The methodology used in the papers that do exist is extremely easy to invalidate (the problem set is infinite, so even if they wanted to prove this point, they likely couldn't). I could be mean and put you on the spot and say "prove this declaration" but I won't :D.

Fair enough. I haven't read them all by any means, but I've read enough to know that "what's audible" when it comes to jitter is pretty debatable. I concede the point. My personal experience, FWIW, is that I've heard what I think might be jitter (it's really hard to know) a few times in some really bad products, but do not hear it in any quality implementations. I think (opinion again) jitter is a paper tiger.

That is not the only way to analyze such claims. We can first look to engineering principles and see if they can explain the design. Only when that fails us do we need to resort to the difficult, expensive and time-consuming process of blind testing.

Of course. And we can measure first. Given appropriate impedances, sufficiently low noise and distortion, sufficiently broad, flat frequency response and more than enough headroom for the load to be driven, electronics should be pretty transparent. All of this can be measured. I'm just so accustomed to audiophiles rejecting measurements that sometimes I don't even bring it up. Blind testing, taken to statistical validity, is time-consuming, but it doesn't have to be all that difficult or expensive. Still, I'd go to design and measurement first and only use abx for verification. And all of that is electronics only. All bets are off when it comes to transducers, the wild card in the playback game.

For example, I know MP3 encoders at 128kbps tend to roll off the high frequencies. We don't need double blind test to realize this. Likewise, mathematics shows us the level of jitter we need to get below to reproduce 16 bits of audio samples. Anything less means a degradation. I expect quality hardware to not degrade the audio samples below CD's resolution -- dare I say whether I can hear it or not. It is like buying a car that advertises it can do 150 miles an hour but in reality tests show that it can only do 100. Should I not care because I can't find a road here to test it?

You probably shouldn't spend much energy caring about that which has no impact on you, but many seem to. They seem to believe that theoretical ability to reach 150 mph somehow affects the quality of their driving experience at 55, and some of those people, no many of those people, are high-end manufacturers and reviewers selling their opinions, and their dismissal of any science that contradicts their opinions, to the hobby. I think it has been bad for the industry and the hobby, a very counter-productive turn away from the goal of higher fidelity to the source toward the pursuit of some tonal signature we convince ourselves, in the absence of belief in evidence, is somehow more natural. But mostly I just find it annoying. :)

P
 
So you're in your own room, you've got an ABX box in place to compare two different speaker cables. Doesn't matter which ones. It could be Nordost Valhalla versus Rat Shack. It could be Transparent's $30K+ versus Blue Jeans.

No one is watching you. You are alone, sitting in the sweet spot, in your most comfortable clothes. You can perform this test using your own gear, your own carefully selected music, at your own leisure. There is no time limit. You can do this test in 1 day, 1 week, 1 month, 1 year. No one will even know you are engaged in this scientific undertaking. You decide when to listen. You decide when to take a break. You decide when to flip the switch on the ABX box. You decide when to stop the test and look at the data.

Where is the stress?

How is this not a far more rational way of getting to the truth of the matter than sighted listening tests?

The bottom line is that there is no way to reason someone out of a position when that person didn't use reason to acquire the position in the first place. In other words, there is no way to reason with a true believer.

Then you best keep the results to yourself 'cause no one will believe you anyway.
 
Every year most car racing circuits have a race where all the cars are of the same model and specifications. Somebody wins that race. So somebody is able to make a car go faster, corner faster, and last longer than the other cars. We all agree there must be some reason consistent with the laws of physics why the winning driver is able to cause his car to outperform the other cars. We can't necessarily attribute that to the driver. Then we have to explain why the Ferrari Formula 1 car so often finishes first and second.

Indeed you are dead wrong. The ability of a car to do 150 mph does have an effect on the car's performance at 55 mph. Inter alia it will accelerate faster from 0 to 55 mph, will operate at a lower rpm at 55, and will reach 55 mph faster than a car whose top speed is, say, 100 mph. It is well accepted that wide bandwidth has an effect on the audible spectrum.

This argument has been made thousands of times. It's not that we ignore measurements. We just realize they don't tell the whole story.

We do know one thing. Music stored in a digital medium is not identical to real music. Something happened. If it's not jitter it's something else. If we spend all our time arguing about ABX/DBT we are never going to find the answer.
 
Are you trying to describe an ABX test or some sort of ABC/HR test?
I am talking about A/B tests in general.

We must be clear for the scientifically challenged to distinguish between blind testing which is used to determine preference versus blind testing to determine whether or not there does exist a difference.
I personally don't think consumers should concern themselves at such levels. To the extent they do any kind of blind testing, it is a big step up from where they are today. Taking it to ABX level, testing the null hypothesis, level matching to the nth degree, etc. is needed if you are going to publish a paper in AES :). But otherwise, you just want to test yourself once in a while to roughly understand the limits of your perception.

What I am not a fan of is subjecting everyday audiophiles to trick tests (sorry Ethan :)), taking negative results as definitive proof, etc. Again, if you are doing this commercially you may do such things but doing so with everyday audiophiles takes the fun out of their hobby.

How is this for straddling the fence? :D
 
You probably shouldn't spend much energy caring about that which has no impact on you, but many seem to. They seem to believe that theoretical ability to reach 150 mph somehow affects the quality of their driving experience at 55, and some of those people, no many of those people, are high-end manufacturers and reviewers selling their opinions, and their dismissal of any science that contradicts their opinions, to the hobby. I think it has been bad for the industry and the hobby, a very counter-productive turn away from the goal of higher fidelity to the source toward the pursuit of some tonal signature we convince ourselves, in the absence of belief in evidence, is somehow more natural. But mostly I just find it annoying. :)

P
I am not sure I would go along with that in the context of this forum and people you are addressing. This is not "what is good enough" but "what is best." If I can hear a difference between two DACs and you trust my judgement, you should be fine in spending 10X more money to get that equipment and feel good about it even if you can't hear the difference readily. It is not like they are asking you to donate money for them to buy the gear :).

Personally, I can be frugal as hell :). So don't take the above as something I always practice. Yes, I have esoteric gear in some cases, but in others, everyday stuff does the job. In discussing things here, I don't want to impose my desire for high ROI on folks. It is their choice to go there or not. If someone asks me to tell them the best camera they could buy, I could name that. And they can do that even though they don't yet know how to take a decent picture. At least they know the equipment is not at fault when their pictures don't have the fidelity they need.
 
I am not sure I would go along with that in the context of this forum and people you are addressing. This is not "what is good enough" but "what is best." If I can hear a difference between two DACs and you trust my judgement, you should be fine in spending 10X more money to get that equipment and feel good about it even if you can't hear the difference readily. It is not like they are asking you to donate money for them to buy the gear .

If they absolutely trust your judgement and you have no skin in the game, sure, but you didn't call the forum "what's expensive" or even "what's theoretically best," or "What Goldeneared Magazine" thinks is best. If you can't hear it, feel it, taste it, smell it...experience it, it isn't better, much less best. And you don't have to be frugal. You can spend the money on the best of something else. I choose single-barrel bourbon. :) Or better yet, in the context of this discussion, spend it on an upgrade you can hear.

P
 

About us

  • What’s Best Forum is THE forum for high end audio, product reviews, advice and sharing experiences on the best of everything else. This is THE place where audiophiles and audio companies discuss vintage, contemporary and new audio products, music servers, music streamers, computer audio, digital-to-analog converters, turntables, phono stages, cartridges, reel-to-reel tape machines, speakers, headphones and tube and solid-state amplification. Founded in 2010 What’s Best Forum invites intelligent and courteous people of all interests and backgrounds to describe and discuss the best of everything. From beginners to life-long hobbyists to industry professionals, we enjoy learning about new things and meeting new people, and participating in spirited debates.
