Is ABX Finally Obsolete?

Let me see if I can break this down in a way that helps you understand why we don't need to invent a better double blind methodology, even if any of us were qualified to design one...

Okay, then we have a situation where an audiophile claims to hear something that is not confirmed by conventional measurements. Said audiophile accepts an ABX challenge. His results are statistically insignificant. He scores, say, 10 out of 20.

Whether or not the results are statistically significant is not determined by the score. It is determined by the testing methodology, the number of trials, and the margin of error.

We can come to several conclusions: he could not reliably hear the difference; there is something wrong with the way he approached the test; the test is fundamentally flawed; or, as I have suggested, he did not want to hear a difference.

If the methodology is sound and enough trials are run to exceed the margin of error, the audiophile's score of 10 correct out of 20 tells us, statistically, that he could not reliably hear the difference. Period. Neither I nor anyone with any sense would argue that ABX testing is conclusive regardless of the quality of the execution. I've run quite a few casual, blind A/B listening tests on myself, at home. I understand that I've only made a point for myself when I've done so. I can't speak for anyone else, but that kind of ABX testing is not what I'm talking about here. I'm talking about the kind Meyer and Moran did, the kind Stereophile did in comparing amps years ago, the kind Harman is doing all the time in their labs.
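For anyone who wants the arithmetic behind that, here's a minimal back-of-the-envelope sketch in Python. It assumes independent forced-choice trials with a 50% chance of guessing each one correctly; the helper name is mine, just for illustration:

from math import comb

def p_at_least(k, n, p=0.5):
    """Chance of getting k or more correct out of n trials by pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(p_at_least(10, 20))  # about 0.59 - a coin flip does this most of the time
print(p_at_least(15, 20))  # about 0.02 - roughly where a 20-trial run clears the usual 5% bar

In other words, 10 out of 20 is exactly what guessing looks like, and a 20-trial run needs about 15 correct before chance stops being a plausible explanation.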

Many have suggested the problem is auditory memory.

Many have. It is well-understood, in scientific, not audiophile, circles, that subtle differences are most reliably detected through quick switching. All hearing tests are based on this understanding. Would you endeavor to re-design your ophthalmologist's vision test if you didn't like the results? Would you insist on a whole new, self-designed testing methodology because you were seeing an H instead of an A on that fourth line down? Or would you, perhaps, accept that the collected wisdom of the people who have long designed and executed vision tests might be more rational than your desire to have 20/20 vision?

Although it is often stated that you can listen as long as you like, rapid switching is a much more cost-effective method.

Eliminate "cost" from the sentence above and you have embraced the collective scientific wisdom.

Surely even proponents of ABX do not think it is perfect. Perhaps we can come up with a better double blind methodology without getting offended and calling each other names.

I don't think anything is perfect and I have no interest in calling you names, Gregg, but we're going to need an audiologist, a research scientist and a statistician to come up with a better methodology. This one is a tested, tweaked and widely accepted method of testing the perception of sensory differences, in audio and beyond. The only place I'm aware of where it is broadly considered invalid is audiophile forums. Even wine snobs seem to accept the results more gracefully.

Tim
 
Tim, I do not feel the need to list everything I have done or read on ABX. In fact, I sort of wish I could ignore it. I could challenge your hypotheticals; see above. I think I have done enough of that in this thread. I already posted on this site the methodology for proving the null set.

Look here. A common misconception is that failing to prove the existence of an audible difference is proof of the lack of an audible difference. See http://en.wikipedia.org/wiki/Statistical_power
 
Tim, I do not feel the need to list everything I have done or read on ABX. In fact, I sort of wish I could ignore it. I could challenge your hypotheticals; see above. I think I have done enough of that in this thread. I already posted on this site the methodology for proving the null set.

OK. I didn't state any hypotheticals other than the vision test, but OK.

Tim
 
Could you help me find a cookbook for ABX? I am not looking for detailed explanations or ABX theory - just a few practical recipes that tell me everything I need to carry out a valid ABX test.

Aspects I would like to know are the optimum duration of each music sample, the pause time between A, B and X, the pause time between each test set, whether a track can be repeated, etc. As all these aspects can influence the outcome of the test, I suppose that some strict, widely accepted rules must exist.

Thanks!:cool:
 
Could you help me find a cookbook for ABX? I am not looking for detailed explanations or ABX theory - just a few practical recipes that tell me everything I need to carry out a valid ABX test.

Aspects I would like to know are the optimum duration of each music sample, the pause time between A, B and X, the pause time between each test set, whether a track can be repeated, etc. As all these aspects can influence the outcome of the test, I suppose that some strict, widely accepted rules must exist.

Thanks!:cool:

I'm sure Sean could provide you with guidelines, but I have no idea if Harman considers their specific methodologies proprietary or not.

Tim
 
Tim, I do not feel the need to list everything I have done or read on ABX. In fact, I sort of wish I could ignore it. I could challenge your hypotheticals; see above. I think I have done enough of that in this thread. I already posted on this site the methodology for proving the null set.

Look here. A common misconception is that failing to prove the existence of an audible difference is proof of the lack of an audible difference. See http://en.wikipedia.org/wiki/Statistical_power

It may be a common misconception, Gregg, but it's not mine. The link you provided is a good one, and it demonstrates how much effort is required to reduce the chance of error enough to have confidence that your results are significant, much less proof. Proof is awfully evasive. Atomic theory is still called a "theory," after all, and is rigorously questioned somewhere, I'm sure. But we can agree that "failing to prove the existence of an audible difference" does not prove that there is no difference.

It is, however, awfully close. :)

Rockets are launched, medicines are tried, surgeries are performed and consumer products are taken to market with no more "proof." Are mistakes made? You bet. Does even the most successful result "prove" that the results will always be the same? Of course not. It's all about probabilities, my friend, and while good ABX testing can bring the possibility of error down to something so vanishingly small that it is completely lost in the shadow of the occurrence of the placebo effect, audiophiles will always believe they are the exception, that their golden ears are hearing night and day differences in that tiny little space that the data has dismissed. It is, evidently, the nature of the beast. And don't get offended; I'm not calling you a beast :).
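To put a rough number on "it's all about probabilities," here's an illustrative Python sketch of how trial count drives the power of a simple one-sided ABX test. The helper names are mine, and the 70% hit rate is only an assumed example of a listener who genuinely hears a subtle difference:

from math import comb

def p_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power(n, p_true, alpha=0.05):
    """Chance a listener with true hit rate p_true clears the lowest score
    that keeps the false-positive rate for a pure guesser at or below alpha."""
    k_crit = next(k for k in range(n + 1) if p_at_least(k, n, 0.5) <= alpha)
    return p_at_least(k_crit, n, p_true)

for n in (16, 20, 40, 60):
    print(n, round(power(n, 0.7), 2))

Under these assumptions, 20 trials catch that listener less than half the time, while roughly 40 trials get you near the conventional 80% power. Missing a real but subtle difference is usually a matter of running too few trials, not a hole in blind testing itself.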

Tim
 
Perhaps Mike Lavigne could chime in at this point regarding his cable blind test in response to the James Randi challenge.
 
Tim, I won't nitpick because I get your point. For example, tiny errors in directional equipment can send you hurtling into outer space, never to return.
Harman does DBT but not much ABX. They are seeking far more detailed info than "can you tell a difference?" I gave a link to that before on this site. You may not know what speaker you are listening to, but they are asking for far more detailed preference info.

I am sure everyone already knows, but ABX is a form of DBT, not DBT itself. ABX is intended to do but one basic thing: hide the identity of the equipment under evaluation. A true ADBT involves so much more - like, does the medicine work?

Suppose the group is consistently wrong? Suppose we kept data and found they were not only consistently wrong - say, 4 out of 20 correct - but further investigation revealed they were wrong on the same rounds? Suppose we inserted an artifact in the sample to see if they could spot and remember it?
 
Tim, I won't nitpick because I get your point. For example, tiny errors in directional equipment can send you hurtling into outer space, never to return.
Harman does DBT but not much ABX. They are seeking far more detailed info than "can you tell a difference?" I gave a link to that before on this site. You may not know what speaker you are listening to, but they are asking for far more detailed preference info.

I am sure everyone already knows, but ABX is a form of DBT, not DBT itself. ABX is intended to do but one basic thing: hide the identity of the equipment under evaluation. A true ADBT involves so much more - like, does the medicine work?

Which one is it, then? That you won't nitpick or that you're going to nitpick? :)

I'll pick a few nits to make sure we are at least accurate about terms. ABX is not a form of DBT. Let's break it down:

Sighted AB is a listening party; calling it a test gives it way too much credit.

Blind AB is a BT (blind test).

Every blind test intends to hide the identity of what is being tested; that's why it's blind.

Add the X (ABX) and you add a control ("what is X?").

Add the D (DBT) and even the person conducting the test doesn't know what is being tested at any given time (Double Blind Test).

An ABX is not necessarily double blind. I'm not familiar with "ADBT."
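For what it's worth, the X-as-control idea fits in a few lines of Python. This is only a sketch of the bookkeeping, not anyone's lab protocol; ask_listener is a stand-in for whatever playback and answer collection you actually use:

import random

def ask_listener(trial):
    """Stand-in for playback and answer collection; here it just asks at the console."""
    return input(f"Trial {trial}: is X the same as A or B? ").strip().upper()

def run_abx_session(n_trials=20, seed=None):
    """On each trial X is secretly A or B; count how often the listener names it."""
    rng = random.Random(seed)
    correct = 0
    for trial in range(1, n_trials + 1):
        x = rng.choice(["A", "B"])   # hidden identity of X for this trial
        correct += (ask_listener(trial) == x)
    return correct

if __name__ == "__main__":
    print(run_abx_session(), "correct out of 20")

Making it double blind is then a question of who gets to see x before the answers are in; if the person running the session can see it, it isn't a DBT.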

Suppose the group is consistently wrong? Suppose we kept data and found they were not only consistently wrong - say, 4 out of 20 correct - but further investigation revealed they were wrong on the same rounds?

I've never seen results like that, but if I did, I would suppose something was very wrong with the testing methodology.
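For what it's worth, a score that low is itself a statistically unlikely outcome under pure guessing, which is exactly why it would point at a systematic problem with the test rather than bad luck. Using the same binomial arithmetic as before:

from math import comb

# chance of 4 or fewer correct out of 20 by coin-flipping
print(sum(comb(20, i) for i in range(5)) / 2**20)   # about 0.006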

Tim
 
I'm sure Sean could provide you with guidelines, but I have no idea if Harman considers their specific methodologies proprietary or not.

Tim
If you mean the speaker tests they do, it is not ABX. It is double blind subjective ratings applied to what you are hearing. It is not a binary test of whether you can tell which one "X" is.

I have never appreciated the fascination with "ABX" tests. When talking about objective methodology, I use the term double blind testing. It is the blind aspect that gets rid of bias. Whether the specifics of ABX are good or not is another matter. Yet in every argument I have with people, they keep asking about "ABX" tests.

I venture to guess there are 100 times more blind tests performed of other types than ABX in the field of audio/video.
 
That sounds awfully familiar. :)

Sort of reminds me of when I was in school. Reginald has ability but he will not apply himself. :eek:
 
If you mean the speaker tests they do, it is not ABX. It is double blind subjective ratings applied to what you are hearing. It is not a binary test of whether you can tell which one "X" is.

I have never appreciated the fascination with "ABX" tests. When talking about objective methodology, I use the term double blind testing. It is the blind aspect that gets rid of bias. Whether the specifics of ABX are good or not is another matter. Yet in every argument I have with people, they keep asking about "ABX" tests.

I venture to guess there are 100 times more blind tests performed of other types than ABX in the field of audio/video.


Yeah, what he said, Tim. What you want to know is: does the medicine work?

"ADBT." Sorry that should have been a DBT.
 
Yeah, what he said, Tim. What you want to know is: does the medicine work?

"ADBT." Sorry that should have been a DBT.

True dat. But you don't know whether or not the medicine works if you don't know what you took. I'm not hung up on ABX vs DBT, or even DBT vs. blind. I'm not even all that hung up on good methodology for my own purposes, to be honest. I just think evaluating components blind - even if it is casual, with no pretense of statistical validity - is a couple of huge steps ahead of sighted listening, which is just a bad way to compare two pieces of gear, period.

Tim
 
I have never appreciated the fascination with "ABX" tests. When talking about objective methodology, I use the term double blind testing. It is the blind aspect that gets rid of bias. Whether the specifics of ABX are good or not is another matter. Yet in every argument I have with people, they keep asking about "ABX" tests.

I venture to guess there are 100 times more blind tests performed of other types than ABX in the field of audio/video.
+1
 
True dat. But you don't know whether or not the medicine works if you don't know what you took. I'm not hung up on ABX vs DBT, or even DBT vs. blind. I'm not even all that hung up on good methodology for my own purposes, to be honest. I just think evaluating components blind - even if it is casual, with no pretense of statistical validity - is a couple of huge steps ahead of sighted listening, which is just a bad way to compare two pieces of gear, period.

Add "and level matched" after "blind," and I think you've nailed it.

Often, what people hear as "subtle sonic differences," after all, are just small broadband level differences.
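As an aside, matching levels is mostly arithmetic. Here's a rough sketch in plain Python (my own helper names; the 0.1 dB figure is the tolerance commonly quoted for this kind of comparison, not an official standard):

import math

def rms(samples):
    """Root-mean-square level of a block of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def gain_db_to_match(reference, other):
    """Gain in dB to apply to 'other' so its RMS level matches 'reference'.
    Aim to get the residual mismatch well under ~0.1 dB before comparing."""
    return 20 * math.log10(rms(reference) / rms(other))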
 
Matched levels do present quite a problem. If the original differences they heard were based on "level differences," once those are removed they may no longer "recognize" the equipment they thought superior. Moreover, the difference they heard may now have disappeared.
 
Add "and level matched" after "blind," and I think you've nailed it.

Often, what people hear as "subtle sonic differences," after all, are just small broadband level differences.

Absolutely.

Tim
 
True. The argument always seems to end with the following result:

audiophool: YOUR hearing may not be good enough to hear the differences, but I hear them so na, na, na na na.
rational music lover: you're deluded.

The funny thing is, most of both of those statements (everything except for the first clause in "audiophool") is true! The reason is that the two approach audio from fundamentally opposed weltanschauungen: one that views audio as a straightforward application of engineering principles to achieve a desired result, and one that thinks there's magic in the machine.
Where these two points of view can resolve their differences and meaningfully merge toward agreement is, firstly, for the "rational music lover" to accept that something more than a "straightforward" application is required. It will still all be totally "correct" principles, but doing it in a relatively unsophisticated and dumbed-down manner will not achieve the result the other party knows is possible.

On the other hand, the "audiophool" has to jettison the belief in "magic", there ain't no such animal, not in audio at least! By luck and endless fiddling he has managed to achieve superior qualities in his sound compared to what the other party is content with, but he needs to understand that there are still very real, very rational principles at work making it happen for him. He may not understand the principles, almost no-one may appreciate what they are, but you can be certain the underlying mechanisms are perfectly capable of being fully, "scientifically", understood if someone makes a major effort in tracking down what's going on ...

Frank
 