da ear bone's connected to da, goosebump bone, da goosebump bone's connected to da, heart bone, da heart bone's connected to da, pupil bone ...My ears, I do not trust that much. My goosebumps and vital signs, I trust much more.
Once again, Vince, very nicely put: "peels off distortion" is a good way of expressing the modus operandi people should adopt in assessing the performance of a configuration ...It is gratifying when one successfully peels off distortion from their system. In my case, I just have to depend on the creators of the CD I am listening to to get it right.
After all the distortion and obfuscation are cleared, you are left with the limitations of your speakers. I will be the first to recognize my system may not match well with "Death Metal."
mep --
You make a good point. I think you point out a trap we easily fall into: we mistake something that sounds 'different' for something 'better', which it is not always. I usually require an extended audition or multiple auditions to distinguish between the two.
I trust my ears, but only with fast switching when trying to discern differences between suitable components (speakers are easy, obviously, if they are different brands, though steps up within a brand can be harder to diagnose).
Your mileage most certainly will vary. My audio memory is too short and I know it.
Tom
You are, however, right as rain about the scientific and statistical community, which claims, and offers good evidence, that differences are more accurately identified when the A/B switching is fast, and that listening over time only allows the time to muddy the waters with perceptual bias.
The fact is that memory traces decay rather fast, within a matter of seconds, and errors may become huge, even for simple tasks. There is an abundance of evidence for the rather poor performance of auditory memory, as far as comparison tests are concerned. I'm currently reading literature relevant to this aspect, e.g.:
Bachem, "Time factors in relative and absolute pitch determination", J. of the Acoustical Society of America 1954, p.741
Bodohoska, "Immediate and short memory: recall of simple auditory stimuli", Acta Psychologica 1976, p.341
Kinchla, "A diffusion model of perceptual memory", Perception & Psychophysics 1967, p.219
And no, I would not trust my ears, because of (1) experimental bias and (2) poor auditory memory. That's why I bought all of my gear without any prior auditioning.
Klaus
Of course the ears are the final arbiter. As such they can NEVER BE WRONG!!!!
What we are talking about is how the brain interprets what the ears hear. That is subject to infinite variation.
There is also plenty of data which indicates that what the brain interprets is pretty consistent from brain to brain, and that, in fact, the brain compensates for minor hearing deficiencies to bring us closer to the same "truth," not further apart. So while I agree that the ears are the final arbiter, they are only the arbiter of what I like; no more, no less.
Tim
There's plenty of data on the norm, but unfortunately, scientific repeatable testing for outliers is very much more difficult - which is when we rely on anecdotal evidence.
Sometimes, when I'm doing design, I have a problem in a narrow frequency range. Like a high-Q resonance. It can be very difficult for instruments to pick this up because an impulse won't be long enough to excite it, and it may be a resonance that is created by a frequency that is in the tweeters and a separate one in the midrange, where even a slow sweep does not show it up easily enough.
This is where I rely on my ultimate "instrument" - my sister.
I grab her, and play the piece of music that excites the resonance, and she will go F3# or something. Relative pitch is when you hear a scale, then one note, and are able to identify it. She has absolute (perfect) pitch, and can ALWAYS identify the note being played. Even in isolation. Play her a single note out of nowhere and she can identify it.
However, eventually, after all the instruments are done, it is still my ears that tell me what I like. I still have not found the entire suite of measurements that I can make that will determine that I will like the *final* design sans listening. On the other hand, I would have no clue where to start if I didn't have measurements. So, measurements and instruments are a foundation. Without a foundation, the house will fall over. The best foundation will give you the best possibility that you will end up with a great end-product.
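To put rough numbers on the high-Q resonance problem described above, here is a minimal sketch, assuming the resonance behaves like a simple second-order resonator. The Q of 50 is made up for illustration, and `note_frequency` and `resonance_timing` are hypothetical helper names, not anything from a measurement package. It converts a note name like F#3 to a frequency and estimates how narrow the resonance is and how long a measurement would have to dwell on it to resolve it.

```python
import math

def note_frequency(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note number (A4 = 69)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)

def resonance_timing(f0_hz: float, q: float) -> dict:
    """Rough timing figures for a second-order resonance at f0 with quality factor q."""
    bandwidth_hz = f0_hz / q               # -3 dB bandwidth of the peak
    decay_tau_s = q / (math.pi * f0_hz)    # time for the ringing envelope to fall to 1/e
    min_window_s = 1.0 / bandwidth_hz      # rule-of-thumb window needed to resolve the peak
    return {
        "bandwidth_hz": bandwidth_hz,
        "decay_tau_s": decay_tau_s,
        "min_window_s": min_window_s,
    }

if __name__ == "__main__":
    f_sharp_3 = note_frequency(54)         # F#3 is MIDI note 54
    info = resonance_timing(f_sharp_3, q=50.0)  # Q = 50 is an assumed value
    print(f"F#3 = {f_sharp_3:.1f} Hz")
    for name, value in info.items():
        print(f"{name}: {value:.3f}")
```

The higher the Q, the narrower the peak and the longer the window needed, which is consistent with a short impulse or a sweep that passes through too quickly missing it.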
It seems to me that a/b/x testing is good for distinguishing differences, but it can be weak on letting you distinguish preferences.
It won't tell you, for example, if a system is fatiguing.
Only an extended audition will do that.
Read Sean's blog. The people at Harman have found AB/X testing very useful in distinguishing preferences, and in areas outside of high-end audio, where there is a mysterious cultural bias against such testing, it is used to distinguish preferences all the time. Fatiguing? Now that's a unique attribute specifically related to long-term listening. That would be a challenge for AB/X testing. I wonder, though, if you could correlate AB/X testable attributes with sounds that are found to be fatiguing over the long term? I'll bet you could. I'll bet sounds perceived to be too bright or harsh in quick-switched AB/X listening tests would be exactly the same that would be considered fatiguing in long-term listening.
I wonder if anyone is running these kinds of tests? Sean, are you listening? Any testing going on at Harman to identify sonic attributes that are fatiguing, then map these to attributes that can be quickly identified in AB/X tests?
Tim
I think we need to first clarify what the different purposes are between an AB/X test versus a Preference Test. AB/X tests are designed to measure how reliably you can detect ANY audible difference between two components. You switch between component A and B, and then indicate whether X = A or B. It doesn't attempt to establish a preference.
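For anyone who wants to score their own AB/X session, here is a minimal sketch of the usual way to decide whether a listener "reliably" identified X: a one-sided binomial test against pure guessing. The function name `abx_p_value` and the 16-trial example are my own assumptions for illustration, not a description of any particular lab's procedure.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (each guess has a 50% chance of being right)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial tail from `correct` up to `trials`.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

if __name__ == "__main__":
    for correct in (8, 12, 14):
        print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.4f}")
```

With 16 trials, 12 correct gives a p-value just under 0.04, so under the common 0.05 convention that is about the minimum score that counts as hearing a difference.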
Preference tests allow the listener to compare A versus B (paired comparison test) or compare A versus B,C,D... (a multiple comparison test) and the listener rates or rank orders each test object on a preference scale. The test assumes that there are audible differences between the test objects. If the audible differences are near threshold, a preference test is probably not the right test.
I've designed some hybrid tests for testing different power amplifiers that combine AB/X with preference. The listener indicates what X is, and also rates A and B on a 10-point preference scale. If they can't reliably identify X - then you can assume their preferences are meaningless or statistically random (and typically they are).
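The hybrid idea above can be sketched roughly as follows: only report a preference when the listener's identification of X is better than chance. This is an illustration under my own assumptions (a 0.05 threshold, a simple mean of rating differences, and the hypothetical names `guessing_p_value` and `summarize_hybrid`), not the actual analysis used in those tests.

```python
from math import comb
from statistics import mean

def guessing_p_value(correct: int, trials: int) -> float:
    """Chance of at least `correct` right answers in `trials` coin-flip guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def summarize_hybrid(trials, alpha: float = 0.05) -> dict:
    """Summarize a hybrid session.

    `trials` is a list of (identified_x_correctly, rating_a, rating_b) tuples,
    with ratings on a 10-point preference scale. The preference result is
    withheld (None) unless X was identified more reliably than guessing.
    """
    if not trials:
        raise ValueError("need at least one trial")
    n = len(trials)
    correct = sum(1 for ok, _, _ in trials if ok)
    p = guessing_p_value(correct, n)
    reliable = p < alpha
    # Positive value means A was rated higher than B on average.
    pref = mean(a - b for _, a, b in trials) if reliable else None
    return {"correct": correct, "trials": n, "p_value": p,
            "reliable": reliable, "mean_pref_a_minus_b": pref}

if __name__ == "__main__":
    session = [(True, 7, 5)] * 14 + [(False, 6, 6)] * 2  # made-up ratings
    print(summarize_hybrid(session))
```

The design choice is simply to treat the ratings as noise whenever identification fails the chance test, which matches the point that preferences from listeners who can't pick X tend to be statistically random.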
The tests typically last 20-30 minutes and we've found no effects related to fatigue, particularly if the listeners are well trained. Listener fatigue tends to set in when tests are too long (> 40 minutes) or the listener's task is too difficult. I suspect that listening at high SPLs (90+ dB) and listening to highly distorted sound could also cause an early onset of listener fatigue in listening tests. We normally listen at comfortable playback levels unless we are testing for dynamic compression and max SPL capability. For ethical reasons, these are two parameters we'd rather establish via objective measurements, or else use the marketing/sales team as our listeners [just kidding].