A question for Mr. Science....

If I recall correctly Tim, the construction is a tradition. It's not meant to be a presumption but rather a statement to be proven or disproven. Stating it in the positive removes ambiguity.

It's like a game of Name that Tune. "I can name that song in 5 notes", "I can do it in 3", "Prove it"
 
The statistical analysis used addressed the hypothesis/supposition/question "Listeners can perceive differences"; it did not address "listeners cannot perceive differences" or "can listeners perceive differences or not".

I'm probably being really dense here, but here, again, is everything we have about this study:

It is currently common practice for sound engineers to record digital music using high-resolution formats, and then down sample the files to 44.1kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1kHz and 88.2kHz with the same analog chain and type of AD-converter. Sixteen expert listeners were asked to compare 3 versions (44.1kHz, 88.2kHz and the 88.2kHz version down-sampled to 44.1kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2kHz and their 44.1kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2kHz and files recorded at 44.1kHz

We've talked the first part to death, and its imprecise language indicates, at least to me, that they are just gathering data (this study aims at investigating ...whether....). So is this the part where you got the statistical analysis used? --

Sixteen expert listeners were asked to compare 3 versions (44.1kHz, 88.2kHz and the 88.2kHz version down-sampled to 44.1kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2kHz and their 44.1kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2kHz and files recorded at 44.1kHz

Where, in there, do you see the hypothesis that listeners can hear a difference? I see that result reported, but that's a pretty different thing....

I have always assumed these things were deliberately kept as neutral as possible; they gathered the input, came to conclusions (hypothesis?), then subsequently challenged/tested those conclusions (null hypothesis?). I've conducted AB/X listening, informally, on myself, and that's the way I approached it -- "Let's see if I can hear a difference," not "I can hear a difference...let's see."

If I've had it wrong all along, I'm cool with that. God knows I've been wrong before. But I want to understand it.

Tim
 
Let me try again. Hypothesis: listeners can hear a difference. Test of hypothesis: "can listeners hear a difference", or "whether or not listeners can hear a difference". Do those last two sound like the same thing to you (they do to me, excepting the terrible syntax of the second)?
 

Let me try again: The syntax is fine; the meaning is very different. I'm being a linguist, not a scientist, perhaps, but the hypothesis you're stating is your statement. What they said could just as easily mean the opposite.

Tim
 
No, because the statistical data analysis they performed was on the hypothesis I stated; so regardless of how you interpret the wording, that was the test they performed.
 
listeners can perceive differences between musical files recorded at 44.1kHz and 88.2kHz with the same analog chain and type of AD-converter.

Why not simply change it to "Can listeners perceive differences between musical files recorded at 44.1kHz and 88.2kHz with the same analog chain and type of AD-converter?" You are, after all, trying to find out if they can indeed hear differences. You could run the tests with the same clips encoded at the different sample rates and look for a statistically significant difference.

Rob:)
 
The first question one needs to ask is: What is meant by "listeners can perceive"? Is it all people? Is it 50% of people? What is reasonable/clinically significant? Assume that it's x% of people. Also assume that one of the exclusion criteria is that people with hearing problems cannot participate in the study, unless an independent governance group deems it unethical :)

Study design

This is a double-blind study. Subjects are told they will hear two music files which may or may not be different. The reason for this is that subjects might otherwise be inclined to say that they hear a difference when there is none. A given subject is randomly assigned to only one of the 4 sequences of treatments:
1. Musical file A then musical file B
2. Musical file B then musical file A
3. Musical file A then musical file A
4. Musical file B then musical file B

The null hypothesis to be tested is that the proportion of people who can correctly perceive a difference (I prefer “detect”) is less than or equal to 0.0x. The alternative hypothesis is that this proportion is > 0.0x.
Assume that the level of significance is 5%.

Conduct the study, unblind the results, calculate the p-value, etc.
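
For anyone who wants to see what "calculate the p-value" could look like, here is a minimal sketch of a one-sided exact binomial test. It is only an illustration under assumptions: each subject is reduced to a single yes/no outcome ("correctly detected" or not), and the threshold p0, the sample size and the count of detections are all made-up numbers, not values from any study.

```python
from math import comb


def binomial_p_value(successes, n, p0):
    """One-sided exact p-value: P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical numbers, chosen only to show the mechanics.
p0 = 0.05          # assumed threshold proportion under the null hypothesis
alpha = 0.05       # level of significance, as in the post
n_subjects = 40    # assumed number of subjects
n_detected = 6     # assumed number who correctly detected a difference

p_value = binomial_p_value(n_detected, n_subjects, p0)
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("reject H0 (proportion <= p0)" if reject_null else "fail to reject H0")
```

Failing to reject the null here would not show that nobody can hear a difference; it would only mean the data were not strong enough to rule out a proportion at or below p0.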

Mike
 
Is a hypothesis always a positive? That's the only way I can think of that you could come to the conclusions you've reached.

Tim
 
If I may intrude, again, a hypothesis is stated as a positive but the conclusion can go anywhere from fully affirmative to fully negative and anywhere in between (level of significance). If the hypothesis were to cover the whole gamut of probable outcomes it would be a real pain to communicate it to the community and would lead to a ping pong match to determine the goal. The OP is a perfect example of this. The "whether or not" jumbles things up. It isn't necessary since the results should cover whatever outcome arises anyway. By stating the hypothesis in the positive you are in effect putting a stake in the ground as a reference point. A negative would cover every place else except the point.
 

Thank you.

Tim
 
:)
 
And, to elaborate a bit more, there can be 4 very distinct "operations" in this "game":

Observations: something is seen to happen which is interesting, which indicates (key word!) that there is some underlying "truth", which is as yet not fully comprehended.

Hypothesis: a theory is devised which attempts to explain the observations as being the result of some underlying mechanism, or as a phenomenon, in a meaningful and rational way (no magic!)

Testing: a carefully controlled set of procedures is devised which should provide data and thus, evidence, one way or the other, about the hypothesis. This is where the ABX lives, and nowhere else!!

Analysis: you digest the experimental results from the testing, and apply statistical procedures to this process so that you end up with a certain level of certainty as to whether the hypothesis is right or not. Technically, nothing is 100% "true" or not, you just have a certain level of confidence in your conclusion. In other words, you never really "know" the Truth, that's God's domain!! ;)

Frank
 
There's an interesting article about ABX in the AES Journal by Les Leventhal, called "Type I and Type II Errors in the Statistical Analysis of Listening Tests". It's well worth reading, as it talks about the null hypothesis and lots of other details regarding statistical analysis of ABX test results. You can download the article here. There's also some lengthy discussion with Mr. Leventhal and others in some Stereophile letters to the editor.
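
To make the two error types in that title concrete, here is a rough sketch of how they could be computed for a single listener's ABX run. Everything in it is an assumption for illustration: the number of trials, the 5% significance level, and the "true" hit rate of a listener who really can hear something are hypothetical, and none of it is taken from the article.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, alpha, p_guess=0.5):
    """Smallest number of correct answers r with P(X >= r | guessing) <= alpha."""
    for r in range(n + 1):
        if upper_tail(r, n, p_guess) <= alpha:
            return r
    return n + 1  # no achievable score is significant at this alpha


def error_rates(n, alpha, p_true, p_guess=0.5):
    """Return (critical count, Type I error rate, Type II error rate)."""
    r = critical_count(n, alpha, p_guess)
    type_1 = upper_tail(r, n, p_guess)      # guesser passes by luck
    type_2 = 1.0 - upper_tail(r, n, p_true)  # real ability goes undetected
    return r, type_1, type_2


# Hypothetical scenario: 16 trials, 5% level, listener truly right 70% of the time.
r, type_1, type_2 = error_rates(n=16, alpha=0.05, p_true=0.7)
print(f"need at least {r} of 16 correct")
print(f"Type I error rate:  {type_1:.3f}")
print(f"Type II error rate: {type_2:.3f}")
```

Trying a few values of n and p_true shows the trade-off: with few trials and a subtle difference, the chance of missing a real effect can be much larger than the chance of a false positive.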
 
