Conclusive "Proof" that higher resolution audio sounds different

Do these measure harmonic structure? Noise has a signature, such as high-frequency smearing or harmonic distortion, and it affects clarity.

Yes, it would. In this particular case the FFT is showing bins approximately 11 Hz wide; each 11 Hz section makes up one point of the graph. A 1 kHz tone would show a spike near 1 kHz, and others at 2 kHz or 3 kHz if there were harmonics.
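As a sketch of the arithmetic: bin width is just sample rate divided by FFT length. The 44100 Hz rate and 4096-point length below are my assumptions, chosen so the bin width comes out near the 11 Hz mentioned, and `numpy` stands in for whatever analyzer produced the graph.

```python
import numpy as np

fs = 44100                    # assumed sample rate (Hz)
n = 4096                      # assumed FFT length
bin_width = fs / n            # ~10.77 Hz per bin, close to the "11 Hz" above

t = np.arange(n) / fs
# A 1 kHz tone with a small 2nd harmonic at 2 kHz
x = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.sin(2 * np.pi * 2000 * t)

spectrum = np.abs(np.fft.rfft(x * np.hanning(n)))
freqs = np.fft.rfftfreq(n, 1 / fs)

peak1 = freqs[np.argmax(spectrum)]              # strongest bin, near 1 kHz
mask = freqs > 1500
peak2 = freqs[mask][np.argmax(spectrum[mask])]  # strongest bin above 1.5 kHz, near 2 kHz
print(bin_width, peak1, peak2)
```

Both peaks land within one bin of the true frequencies, which is all the resolution an 11 Hz-bin FFT can promise.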

Noise need not be high frequency; it can have any frequency. When I say the noise is audible here in the difference, it is only audible with amplification of some 40 decibels when good resampling was used. If you just listen to the file on its own, you hear silence. With amplification you hear something much like interstation hiss from FM with quieting turned off.
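For a sense of scale, 40 dB of amplification is an amplitude gain factor of 100. A minimal sketch, with synthetic noise standing in for the actual difference file (the roughly -80 dBFS level is my assumption for illustration):

```python
import numpy as np

# Hypothetical "difference file": near-silence, modeled as low-level noise.
rng = np.random.default_rng(0)
residual = rng.normal(0.0, 1e-4, 44100)   # ~1 second at an assumed 44.1 kHz

gain_db = 40
gain = 10 ** (gain_db / 20)               # 40 dB -> x100 in amplitude
boosted = np.clip(residual * gain, -1.0, 1.0)

print(gain)   # 100.0
```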

The only effects that matter are those you can hear. Aural perception is limited to 20 kHz and below. EMI can cause disturbances which affect that range, or it can cause no change whatsoever. You can only hear EMI through its effect on signals in the audible range. So if you know enough about signals in the audible range, you can determine if they sound different. Why they are different, if they are, is another issue and could be any one of many things. EMI isn't some special condition that is heard just by its presence.

By your bolded statement you just turned the whole high-end cable business on its ear.
 
Max,

There seems to be some confusion in your question between "null" or "negative" and "not valid". I cannot see any systematic connection between nulls and sighted tests.

Hi Micro. My questions were based on this comment of yours - The referred controls for false negatives are needed only if the experiment returns a null (a negative, the usual result of poorly carried tests).

Sorry if I confused things. The connection I'm making between nulls and sighted tests is based on the fact that, before the blind-testing part of your typical forum blind-test, participants are usually given the opportunity to listen sighted first, and they nearly always (as far as I can tell) report audible differences. Often, though, the participants then proceed to listen blind, i.e. with knowledge removed, and cannot reliably differentiate the products being tested.

I'm asking, based on this typical scenario, why the null result might be rejected, in a case like this?

It seems to me that in these typical cases, the controls (removal of knowledge) have the desired effect of removing all non-aural stimuli, which in turn often exposes the sighted results as false positives.

I do not see how any other possible variable could cause sighted and blind results to differ bar the one very important one, knowledge.

I'm also wondering why you suggest that a null result in a given blind-test is usually the result of poorly carried out tests, given what I've outlined above. Is there data available that suggests so?

A null is never 100% valid - but if you have controls its comprehensiveness will be much wider.

As the Wiki piece suggests, a null should not be rejected once controls were in place. What seems to be the point of contention on this thread is what constitutes 'controls', what relevance some do or do not have to the validity of a given test.

I think it's wise to follow the tried-and-trusted procedures à la the Wiki piece, Tim's link, etc., and not to let those with commercial interests determine what's best, or what's valid.

The requested evidence is given by the non-existence of tests resulting in any significant positive identifications - and in part by believing that the people who wrote the specifications asking for controls know what they are addressing. Unfortunately the tests we refer to are never properly documented, and any post asking for details is never answered. The only rigorous challenge about "small differences" (electronics, cables and sources) I know of was the Quad challenge, properly reported and analyzed in Electronics and Wireless World many decades ago.

I agree that more in-depth reporting of your typical blind-test procedures would be good. The ABX test of DACs recently hosted by a guy on Pink Fish was very thorough though, and a null was the result (differences were reported sighted first).
 
Jeez, Tim, all tests, not just ABX, have certain criteria that need to be met before they are considered valid, sensitive tests. This is testing 101. Part of this is usually pre-screening of the participants. Nobody that I know of ever skips this & then just changes the participants after a test & runs it again (unless it was a badly designed test in the first place). I'm really struggling to understand what is so difficult about all of this.

No one is talking about eliminating pre screening, John, I'm talking about your assertion that trained listeners are required for a valid blind test. And this response of yours to Frantz is interesting...

"One positive result is valid but any number of negative results means the controls have to be examined" - this is a blatant (maybe willful) misunderstanding of the nature of the test.

...because you're the one who said, "a positive result proves the test," and, in the same post, "a negative result throws an immediate spotlight on the controls in place." Have you sunk to attributing your most outrageous statements to me, once you realize they are unsupportable?

If that isn't the same as "one positive result is valid but any number of negative results means the controls have to be examined," I think your communications skills need some work. Actually it's much worse than that. "A positive result proves the test?" What science teacher told you that any one result proves anything? And did you really say that in the same sentence with "statistical analysis?"

Now go ahead and deny that you said what you said. For my part, I won't be opening this thread again.

Tim
 
Frantz,

Your final remark just shows you have not read the debates. I can imagine your time is too valuable to go through others' hundreds of contributions, but the "is there anything else" is really amusing.

Thanks micro for thinking for me . Always appreciated :)
 

:)

mistyping?
 
I think they started out in the cable business on their ear. The obvious step to reduce EMI effects is going with balanced connections.

Still wrong.
 
Only in the field of alternative fantasy physics.

I should just let that go, but this is the reason AES engineers think audiophiles are crackpots.
 
No one is talking about eliminating pre screening, John, I'm talking about your assertion that trained listeners are required for a valid blind test.
It depends on the level of sensitivity that is needed for the test, i.e. what level of small impairment is being analysed, Tim. As the level of impairment being analysed gets smaller, the need for trained listeners increases. You are resisting every explanation with an argument which is blinding you to the blindingly obvious.
And this response of yours to Frantz is interesting...

...because you're the one who said, "a positive result proves the test," and, in the same post, "a negative result throws an immediate spotlight on the controls in place." Have you sunk to attributing your most outrageous statements to me, once you realize they are unsupportable?

If that isn't the same as "one positive result is valid but any number of negative results means the controls have to be examined," I think your communications skills need some work.
Of course the two statements are the same in essence - but I was stating the obvious truth & you were using the same phrases as an example of absurdity. That's the understanding that you are missing.
Actually it's much worse than that. "A positive result proves the test?" What science teacher told you that any one result proves anything? And did you really say that in the same sentence with "statistical analysis?"

Now go ahead and deny that you said what you said. For my part, I won't be opening this thread again.

Tim
Tim, the statistical analysis is contained within the positive result - it wouldn't be a positive result without a 95%-or-greater level of confidence. You will probably, in the future, look back on your statements in this thread & realise that you were so off base that you will wish them all to be deleted.
In the meantime I hope you & Frantz at least try doing this ABX test. Max did, & he at least has some exposure to how it works, but seems to reject (or maybe doesn't understand) the statistical elements of it?
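On the statistical element: the usual analysis of an ABX run is a one-sided binomial test against pure guessing. A minimal sketch (the 16-trial numbers below are illustrative, not from any test discussed here):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided p-value: the chance of scoring >= `correct` hits in
    `trials` ABX trials if the listener is purely guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 of 16 correct clears the conventional 95% confidence bar (p < 0.05)...
print(round(abx_p_value(12, 16), 4))   # 0.0384
# ...while 10 of 16 does not.
print(round(abx_p_value(10, 16), 4))   # 0.2272
```

This is why a single lucky trial proves nothing: only an accumulated run of correct answers pushes the p-value below the chosen threshold.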
 

Now probably not with your interpretation.

I started a thread on the Ampex list, if anybody's interested, on what the AES guys think...

Here's a few comments so far

"Look up the works of the late, great Neil Muncy, former member of this list. He wrote the book on the subject. Or at least all the seminal articles."

"Well, without signal grounding, there is more noise than signal."

I'm guilty of obsessions, as others here are... it's time for me to leave.
 
Hi Micro. My questions were based on this comment of yours - The referred controls for false negatives are needed only if the experiment returns a null (a negative, the usual result of poorly carried tests).

Sorry if I confused things. The connection I'm making between nulls and sighted tests is based on the fact that, before the blind-testing part of your typical forum blind-test, participants are usually given the opportunity to listen sighted first, and they nearly always (as far as I can tell) report audible differences. Often, though, the participants then proceed to listen blind, i.e. with knowledge removed, and cannot reliably differentiate the products being tested.

I'm asking, based on this typical scenario, why the null result might be rejected, in a case like this? It seems to me that in these typical cases, the controls (removal of knowledge) have the desired effect of removing all non-aural stimuli, which in turn often exposes the sighted results as false positives.

A null with no controls, carried out after such sighted situations, should be rejected for exactly the same reason you reject the sighted results - these "challenges" are not meaningful or serious. You say it all when you say your typical forum blind-test. No one will say all sighted listening is valid - sighted listening must have its own controls, otherwise we get false positives.


I do not see how any other possible variable could cause sighted and blind results to differ bar the one very important one, knowledge.
I'm also wondering why you suggest that a null result in a given blind-test is usually the result of poorly carried out tests, given what I've outlined above. Is there data available that suggests so?
As the Wiki piece suggests, a null should not be rejected once controls were in place. What seems to be the point of contention on this thread is what constitutes 'controls', what relevance some do or do not have to the validity of a given test.

I think it's wise to follow the tried-and-trusted procedures à la the Wiki piece, Tim's link, etc., and not to let those with commercial interests determine what's best, or what's valid.
I agree that more in-depth reporting of your typical blind-test procedures would be good. The ABX test of DACs recently hosted by a guy on Pink Fish was very thorough though, and a null was the result (differences were reported sighted first).

The main question is that proper tests are beyond what non-expert people can do at home in a reasonable time with reasonable effort. Implementing serious controls takes a lot of effort. The ITU recommendation refers to it clearly - you need to use known and confirmed positives in these tests. I would not know what to use - I would need to research this subject before entering such matters.
I have no knowledge of the DAC tests you reported - what were the control tests?

BTW You have a point when you say that most sighted listening results in false positives. I am not interested in them - I am interested in the true sighted positives. Think about the Devialet. Tens of people describe its bass performance in similar ways, claiming large differences from other excellent amplifiers. No one tested it blind. What should we think? After all it measures similar to many others - too good for audio use. :)
 
A null with no controls, carried out after such sighted situations, should be rejected for exactly the same reason you reject the sighted results - these "challenges" are not meaningful or serious. You say it all when you say your typical forum blind-test. No one will say all sighted listening is valid - sighted listening must have its own controls, otherwise we get false positives.

The main question is that proper tests are beyond what non-expert people can do at home in a reasonable time with reasonable effort. Implementing serious controls takes a lot of effort. The ITU recommendation refers to it clearly - you need to use known and confirmed positives in these tests. I would not know what to use - I would need to research this subject before entering such matters.
I have no knowledge of the DAC tests you reported - what were the control tests?

BTW You have a point when you say that most sighted listening results in false positives. I am not interested in them - I am interested in the true sighted positives. Think about the Devialet. Tens of people describe its bass performance in similar ways, claiming large differences from other excellent amplifiers. No one tested it blind. What should we think? After all it measures similar to many others - too good for audio use. :)

Well, true sighted positives would occur even if you flipped a coin. That is, someone would claim a difference which also would be found in blind testing, or would match up with some measurement. In most sighted testing you have no way to differentiate one from the other.

As to the ITU guidelines, I believe they call one aspect anchor points: known audible differences mixed in at random. Quite simple to add to forum tests. The recently discussed AVS files did this by accident; some files were 0.2 dB louder, known to be at the edge of an audible difference. I don't know how the group results tallied up, but several people did ABX those. Whether a level difference, an FR difference or something else is best would likely depend on the focus of what is being tested for in each instance.

Now, you also would like conditions to be as close to identical as possible for everyone. That's not possible for forum-type testing. Several consequences follow for interpreting such tests, but I will leave that for now, even though the bulk of the ITU guidelines relate to this factor. Most of the rest are about statistical matters.

I also found a post from J_J about positive and negative controls. It says something known to be audible functions as a positive control. Negative control consists of presenting something identical twice. If your results for the two presentations differ by much there is something wrong with the test. Both of these could be used on forum tests though rarely are.

Here it is in j_j 's words:

Any test needs both positive controls and negative controls.
A positive control is something the subject SHOULD hear. A failure there suggest that the test is broken.
A negative control is presenting the exact same thing twice (something you can't do well with LP's and tape, by the way). You should not see any discrimination there. If you do, something is wrong.
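The anchor-point idea and j_j's two controls can be sketched as a trial list. Everything here (the trial counts, the "+1 dB" positive control) is a made-up illustration, not a description of any actual test:

```python
import random

def build_trials(n_real=10, n_pos=3, n_neg=3, seed=1):
    """Assemble a shuffled trial list mixing the real comparison with
    j_j-style controls: positive controls (a change assumed known to be
    audible, here a hypothetical +1 dB level shift) and negative
    controls (the identical presentation twice)."""
    trials = (
        [("A", "B")] * n_real        # the actual comparison under test
        + [("A", "A+1dB")] * n_pos   # positive control: should be heard
        + [("A", "A")] * n_neg       # negative control: nothing to hear
    )
    random.Random(seed).shuffle(trials)
    return trials

trials = build_trials()
print(len(trials))   # 16
```

Missing the positive controls suggests the test is broken (insensitive); "hearing" the negative controls suggests the result is noise.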
 
Yes, J_J's more comprehensive list is worth repeating here, although it's been posted a number of times already.

I've also highlighted phrases related to what has been discussed here (ad nauseam).
A short, and undoubtedly insufficient list (since I'm writing this off the cuff) would be:

1) listener training
2) quiet, single-listener situation, with equipment, acoustics, etc of appropriate quality
3) negative and positive controls, and stimulus repetition for evaluation of consistency
4) perfect time alignment and level alignment (either of those off by much at all will absolutely result in a positive result)
5) feedback during training and after each individual trial
6) consistent A and B stimuli, which the subject is permitted to know, and who can refresh their recollection at any time. This is also an element that can easily cause any test to be positive by mistake.
7) transientless, quiet switching between the signals, with extremely low latency. Switch transients can cause either lower sensitivity or unblind a test, depending on how they arise.
8) the ability to loop the test material under user control
9) of course the setup must be double-blind, ordering must be varied, etc. All standard test confusion issues must be satisfied.


That's just a few, and not even close to a full set, but just that much shows how it isn't easy to run a good test.

What esldude is reporting is a timing alignment issue of about 10 ms, which would invalidate the test. I believe the volume level differences (0.2 dB) have been rectified in subsequent versions of ArnyK's & Scott's files? I believe a timing misalignment has been shown directly for Scott's files on AVS, i.e. by showing & comparing the waveforms side by side for the 24/96 & 16/44 files.

If this timing misalignment is correct for both ArnyK's & Scott's files then this ABX test is invalidated.

This sort of timing delay would not be noticeable in normal, full song listening - it's only noticeable when using a section of the song & doing quick A/B switching, I believe.

BTW, ArnyK's later samples with the test tones at the end were a sort of control - to test if people's playback equipment suffered from IMD. Unfortunately, they may also have been badly conceived?
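A misalignment of this size is easy to detect numerically. One common approach (my sketch, not how the AVS comparison was actually done) is to locate the peak of the cross-correlation between the two files:

```python
import numpy as np

def estimate_offset_ms(a, b, fs):
    """Estimate how far `b` lags `a` (in ms) from the cross-correlation
    peak. A sketch: real files would need loading and normalisation first."""
    corr = np.correlate(b, a, mode="full")
    lag = np.argmax(corr) - (len(a) - 1)
    return 1000 * lag / fs

fs = 8000                      # low rate keeps the brute-force correlation cheap
rng = np.random.default_rng(0)
x = rng.normal(size=4000)      # half a second of noise as a stand-in signal
shift = int(0.010 * fs)        # simulate a 10 ms misalignment
y = np.concatenate([np.zeros(shift), x[:-shift]])

print(estimate_offset_ms(x, y, fs))   # 10.0
```

Correcting the files is then just trimming one of them by the estimated number of samples before re-running the comparison.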
 
A null with no controls, carried out after such sighted situations, should be rejected for exactly the same reason you reject the sighted results - these "challenges" are not meaningful or serious.

Micro, we're still not understanding each other, possibly my fault.

I'm not talking about a null with no controls. I'm talking about blind tests, i.e. controlled tests, where participants report audible differences while knowing what they're listening to (DAC A or DAC B, for example) during sighted listening before the blind test begins, in the same venue, using the same system, on the same day. They then fail to reliably identify audible differences when they don't know what they're listening to: they know they're hearing one of the devices under test at a given time, but not which one, and they cannot reliably distinguish A from B, or say which of A or B is X in an ABX test.

This is a scenario that's extremely common among the regular forum blind A/B or blind ABX tests.

I'm suggesting that the reason for the different reports, i.e., positive sighted but nulls with knowledge removed (same participants, same system, same day) is because of expectation bias/placebo creating false positives sighted. The controls, i.e., the subsequent removal of knowledge, do their job in these instances. The null results indicate that the reported sighted differences were simply imagined.

I'm arguing that these null results should not be rejected - and the Wiki link backs this up.

You say it all when you say your typical forum blind-test. No one will say all sighted listening is valid - sighted listening must have its own controls, otherwise we get false positives.

Sighted listening is only valid to each listener on a personal basis. It's never valid in terms of proof. Your typical forum run controlled blind-test is different, because of the controls, though one needs statistically relevant data for proof. Also, null results don't prove anything 100%, but they very strongly indicate no audible differences were present.

The main question is that proper tests are beyond what non-expert people can do at home in a reasonable time with reasonable effort. Implementing serious controls takes a lot of effort. The ITU recommendation refers to it clearly - you need to use known and confirmed positives in these tests. I would not know what to use - I would need to research this subject before entering such matters.

I disagree. It's not at all difficult to level match using a competent, appropriate tool. Removal of knowledge is not difficult either, with a bit of planning and common sense. What else is there, in your opinion, that needs controlling? Let's not forget that when participants report differences during the sighted part of a blind-test, this rules out negative expectation bias - they'd be expecting to hear differences blind.
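Level matching itself is indeed mechanical: scale one file so its RMS level equals the other's. A minimal sketch with synthetic signals (the 440 Hz tones and amplitudes are illustrative):

```python
import numpy as np

def match_level(reference, target):
    """Scale `target` so its RMS level matches `reference`.
    Returns the scaled signal and the gain applied, in dB."""
    gain = np.sqrt(np.mean(reference ** 2) / np.mean(target ** 2))
    return target * gain, 20 * np.log10(gain)

t = np.linspace(0, 1, 8000, endpoint=False)
a = 0.5 * np.sin(2 * np.pi * 440 * t)
b = 0.3 * np.sin(2 * np.pi * 440 * t)   # same signal, quieter

b_matched, gain_db = match_level(a, b)
print(round(gain_db, 2))   # 4.44
```

In practice one would match on a representative loud passage rather than a whole file, but the arithmetic is no harder than this.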

I have no knowledge of the DAC tests you reported - what were the control tests?

It was a controlled blind ABX test. The listeners all reported audible differences first, before the controlled testing, between several DACs varying hugely in price. When the controlled testing started, none could reliably identify them, or say what X was. A null result that should not be rejected.

BTW You have a point when you say that most sighted listening results in false positives. I am not interested in them - I am interested in the true sighted positives. Think about the Devialet. Tens of people describe its bass performance in similar ways, claiming large differences from other excellent amplifiers. No one tested it blind. What should we think? After all it measures similar to many others - too good for audio use. :)

I've seen many glowing reports of the Devialet but as you say, no blind-tests yet to confirm a difference in comparison to other amps.
 
Again, it's very clear that swapping possible false positives (sighted) for possible false negatives (blind) gets no one anywhere in deducing anything. Nothing can be drawn from such listening sessions except that more care is needed if one wants to derive any implications from them (that particular forum listening test had only two variables controlled: sightedness & level matching). In other words, attempting to eliminate the possibility of false negatives (blind) is the only way to gain better assurance that the results are correct - hence the need for the controls mentioned by J_J in his summary of what is contained in the ITU guidelines, "Methods for the subjective assessment of small impairments in audio systems".
 

About us

  • What’s Best Forum is THE forum for high end audio, product reviews, advice and sharing experiences on the best of everything else. This is THE place where audiophiles and audio companies discuss vintage, contemporary and new audio products, music servers, music streamers, computer audio, digital-to-analog converters, turntables, phono stages, cartridges, reel-to-reel tape machines, speakers, headphones and tube and solid-state amplification. Founded in 2010 What’s Best Forum invites intelligent and courteous people of all interests and backgrounds to describe and discuss the best of everything. From beginners to life-long hobbyists to industry professionals, we enjoy learning about new things and meeting new people, and participating in spirited debates.
