Hmm, how come ALL of the previous posts I quoted came up again, even though I had answered them before and this is a completely new post??
Even though we focus on two outcomes, hearing a difference and not, I am saying there is another: the transition point. In this area, testers are by definition unsure, so sometimes they vote correctly and other times not.
I am pretty sure this is where I have had problems with your hypothesis before. I think it lies in THIS sentence...
In this area, testers are by definition unsure, so sometimes they vote correctly and other times not. I do get your idea of oscillation, analogous to the transition region between a 0 and a 1 in digital logic.
See, I take it that they vote correctly each time. THAT there was a difference each time, and they did not hear it, is neither here nor there... is it? If they heard it they voted so; if they did not hear it they voted so. One time they heard it (say) and voted 'yes'; the next time they did not hear it and voted 'no'. You say 'Well, they SHOULD have heard it, so we are getting a null result unfairly' (or words to that effect).
I am saying 'We are getting a null result fairly, because sometimes they heard it and sometimes they didn't'. Same set of tests, exactly the same result and conclusion (a null result), yet with a completely different 'spin' (if you will) on it.
It seems you are setting it up so that there is a right or wrong answer, as if they SHOULD have heard it or some such? That, I thought, would be the essential question here: CAN they hear it or not, yes or no?
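To make that concrete, here is a quick toy simulation (Python; the model and the numbers are my own assumptions, not anyone's actual test protocol). A listener at the transition point genuinely detects the difference on some fraction of trials and guesses on the rest, voting honestly every time:

```python
import random

def run_test(n_trials=16, p_detect=0.3):
    """One yes/no test run for a 'transition point' listener.

    Assumed model: on each trial the listener truly hears the
    difference with probability p_detect and votes correctly;
    otherwise they guess with 50/50 odds.
    """
    correct = 0
    for _ in range(n_trials):
        if random.random() < p_detect:
            correct += 1              # heard it, honest vote is right
        elif random.random() < 0.5:
            correct += 1              # didn't hear it, lucky guess
    return correct

random.seed(1)
runs = 10_000
# 12/16 correct is roughly the 5% significance mark for pure guessing.
passes = sum(run_test() >= 12 for _ in range(runs))
print(f"runs reaching significance: {passes / runs:.1%}")
```

With those assumed numbers the honest votes land correct about 65% of the time, yet the large majority of runs still come out null. Which reading you put on that, 'fairly null' or 'unfairly null', is exactly the 'spin' question above.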
They also tend to second-guess themselves because they are being asked to give a yes/no answer. I believe both of these factors push the statistics toward being inconclusive. And since inconclusive is taken as 'statistically can't tell the difference', I theorize that we tend to opt for negative findings in this situation.
HOW do you know they tend to second-guess themselves?? Maybe they do, maybe they don't; it just reads as a statement of fact (which it could be, for all I know).
IF they reported a sense of second-guessing, well, to me that is just as validly explained by saying the two stimuli were so close together that telling them apart was genuinely difficult. No need to decide anything more than that.
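For what it's worth, the pass mark in a yes/no test isn't a matter of opinion; it drops straight out of the binomial distribution. A minimal sketch (plain Python, exact tail probabilities, nothing assumed beyond p = 0.5 for guessing):

```python
from math import comb

def tail_prob(n, k):
    """P(at least k correct out of n trials) for a pure guesser (p = 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Smallest score that reaches the usual 5% significance level.
for n in (10, 16, 20):
    k = next(k for k in range(n + 1) if tail_prob(n, k) <= 0.05)
    print(f"{n} trials: {k} correct needed (guessing odds {tail_prob(n, k):.3f})")
```

Anything under those marks is reported as a null whether the listener guessed every trial or heard the difference half the time; the statistic itself cannot tell those two cases apart, which is why the transition region matters.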
I am further interested in how we quantify that region. One way to do that, in my opinion, is to scrutinize the test itself. In compression testing, we have a set of audio tests we know to be revealing.
How do we know they are revealing, if not by blind tests? IS that how 'we know'??
The content can be shown mathematically to be revealing of differences.
I don't quite understand 'mathematically' here, unless you mean things like 'only 10 dB down' or the like. I.e., we are using JNDs?? (Which I further assume are established by large-scale blind testing?? Maybe not in the same sense as we use it for audiophile stuff, just 'unknown stimuli given to listeners to see what *we* can hear or not hear'.)
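In case 'only 10 dB down' is indeed the sort of figure meant, the arithmetic behind it is simple enough (a sketch of the standard dB conversions; the example levels are just ones I picked):

```python
# Standard conversions: dB = 20*log10(amplitude ratio) = 10*log10(power ratio).
for db_down in (10, 60, 96):
    amplitude = 10 ** (-db_down / 20)
    power = 10 ** (-db_down / 10)
    print(f"-{db_down} dB: amplitude x{amplitude:.3g}, power x{power:.3g}")
```

So '10 dB down' is only about a third of the amplitude, while 96 dB down (the nominal floor of 16-bit audio) is a factor of sixty-odd thousand; whether either sits above or below a JND is exactly the large-scale blind-testing question.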
I just get the idea that you are arguing from 'they SHOULD have heard it, yet didn't' and are using that to hang everything off. Wasn't there something about a Swedish radio codec that is used to show why DBTs cannot find small differences? Is that the type of example you have in mind here?