Conclusive "Proof" that higher resolution audio sounds different

jkeny · Aug 7, 2014

Phelonious Ponk said:
Fair enough. I'll leave it to Max, who is doing fine work here. Did we agree that you were wrong? No...probably not.

Tim

Is this a tag-team or something?
I also agree to disagree with your Artistic Director.

Phelonious Ponk · Aug 7, 2014

jkeny said:
Is this a tag-team or something?
I also agree to disagree with your Artistic Director.

Nope. Just two separate sources of reason.

Tim

esldude · Aug 7, 2014

jkeny said:
I'm finished with this - I think you know what the answer is & don't need to ask - uncontrolled variables = what?

Long term sighted listening variables:

Ambient noise levels at times of day or times of year.
Weather variables which cause noise, temperature and humidity variables even indoors with HVAC.
Personal stress variables from many sources like:
Your daughter's new boyfriend, unhappy spouse, vacations or lack thereof, unexpected expenses of living, expected periodic expenses of living, changes at work, changes with close friends in all these areas and their effect on you, relatives in a myriad of ways. Along with a lowering of stress from the opposite of all these or when good things happen.
Industrial activity in your area varies by time of year, day and weather which also affects low frequency ambient noise levels.

And on, and on, and on, and on, and on .......................................... this list being not even 1 tenth of one percent of the categories.

In other words, long term sighted listening is sure to have more variables than any other possible listening situation. Even careful research into such matters by people in the field finds exactly the result you would expect with such a multitude of variables involved. Yet some hold out, though without evidence, that it is the better way.

Now what is known are many psychological mechanisms by which people feel much better, far more comfortable and become emotionally convinced of what they experience in such conditions despite the underlying variability.

esldude · Aug 7, 2014

maxflinn said:
The most important variable by far is knowledge, closely followed by volume level.

The rest are not anywhere near as important.

This deserves showing up again, so plus one.

jkeny · Aug 8, 2014

This is all fine but I would like to present my summary for the defense, your honour (this feels like a trial):

- We all have an ability to tell gross differences in sound when sighted, right?
- what you are arguing about is the cut-off point at which this becomes unreliable (see Amir's experience below)
- We probably all have different cut-off points where we can't reliably tell differences.
- Yes there are influencing factors with any listening, sighted or blind, long-term or short but over the long term these factors change with the result that a particular factor has less of an influence than it does in a single-shot (or short) listening test (sighted or blind). Look how much work goes into trying to control some of these factors in a well designed blind test.
- I wouldn't say that sightedness is the biggest influencing factor. The biggest factor is expectation bias. Removing sightedness denies a visual anchor for expectation bias. There are many forms of expectation bias. The Nocebo effect is just as strong an influencing factor i.e. "I expect not to hear a difference" & it is much less visible/obvious than sightedness. There are lots of other biases which have a strong influence on our perceptions - this whole section of the thread itself is a study in confirmation bias, each looking solely for evidence that confirms their position.
- In other words, by aggregation, the long term nature of the listening reduces the effect of any particular factor compared to single-shot listening.
- This aggregation of results leading to a conclusion is also at play with many other people on many different systems, with many different influencing factors. This aggregation has a tendency to balance out influencing factors & result in the wisdom of crowds.

I would also cite what Amir wrote (& JackD alluded to) - look at how Amir had to pick the differences between the tracks - he did so sighted (he knew which was A & which B) - he could EASILY do this all day 100% of the time. Look at how his attitude changed when he entered the "blind" X part of the test - he was much less sure, much more likely to second-guess himself. So which approach are you telling me gives the most reliable results - the easy sighted 100% differences or the difficult unsighted ones?

So maybe you are right & maybe I am right? My main point is that once-off or short sessions are much more prone to being influenced by these factors & it's very difficult to control for them. Long term listening has a more balanced conclusion given long enough & enough different systems, people, etc.

All I know is that I (& many others) prefer to evaluate which device I want to live with - by living with it (if possible).
My experience tells me that it brings me the correct result 99% of the time

I also take exception to the sort of remarks that seem to percolate through these posts - remarks that suggest listeners are being taken for fools by knowing manufacturers/retailers in audio.

Robh3606 · Aug 8, 2014

I also take exception to the sort of remarks that seem to percolate through these posts - remarks that suggest listeners are being taken for fools by knowing manufacturers/retailers in audio.

Hello John

It's not about being a fool, it's about not being in control of your own biases. That's the point of blind testing. Removing sight based biases, whatever you want to call them be it expectation and so on, has a significant effect on the results. Check out Toole's book, Chapter 17.5, Bias from Nonauditory Factors. Just read the whole of Chapter 17, Subjective Evaluations, if you have not already.

Rob

jkeny · Aug 8, 2014

Robh3606 said:
Hello John

It's not about being a fool, it's about not being in control of your own biases. That's the point of blind testing. Removing sight based biases, whatever you want to call them be it expectation and so on, has a significant effect on the results. Check out Toole's book, Chapter 17.5, Bias from Nonauditory Factors. Just read the whole of Chapter 17, Subjective Evaluations, if you have not already.

Rob

Rob, that control of biases (or in most cases lack of control) was what MOST of the rest of my post was about.

I find that too much emphasis is focussed on sightedness & it becomes the only variable that is considered worth controlling. The other variables/factors are not given due consideration (I know volume matching is considered) but I'm talking about all the many psychological cognitive factors that can influence the outcome. Look at the short list esldude gave for long term testing - these don't go away in blind testing - these don't go away in short A/B testing. My preference for long-term listening comes from this very point - it normalises these influences & doesn't introduce new ones - the ones that Amir detailed & most people feel when being "tested".

My point has always been unless these factors are controlled then the test is not valid. So the debate has become which is the better INVALID test.

Let me go back to how this long-term listening topic started in this thread - as Tim reminded me, it was Ron Party who posted this: "I was speaking, perhaps ineloquently, to the claims we've all read that long term and not short term listening is what is required to reliably and repeatably detect certain differences." I think he was trying to ascertain just how significant these differences were to long-term listening. Well, we have a body of people who claim high-res is significantly better sounding than RB. They have formed this opinion over long-term listening. So are these people now proven to be correct, seeing as these ABX results confirm their claims? Or are you going to tell me that they just guessed correctly? How do you know?

My PS thought was a PS because I notice that a theme tends to run through some of the posts, Rob - it wasn't my main point, just a PS

FrantzM · Aug 8, 2014

jkeny said:
Rob, that control of biases (or in most cases lack of control) was what MOST of the rest of my post was about.

I find that too much emphasis is focussed on sightedness & it becomes the only variable that is considered worth controlling. The other variables/factors are not given due consideration (I know volume matching is considered) but I'm talking about all the many psychological cognitive factors that can influence the outcome. Look at the short list esldude gave for long term testing - these don't go away in blind testing - these don't go away in short A/B testing. My preference for long-term listening comes from this very point - it normalises these influences & doesn't introduce new ones - the ones that Amir detailed & most people feel when being "tested".

My point has always been unless these factors are controlled then the test is not valid. So the debate has become which is the better INVALID test.

Let me go back to how this long-term listening topic started in this thread - as Tim reminded me, it was Ron Party who posted this: "I was speaking, perhaps ineloquently, to the claims we've all read that long term and not short term listening is what is required to reliably and repeatably detect certain differences." I think he was trying to ascertain just how significant these differences were to long-term listening. Well, we have a body of people who claim high-res is significantly better sounding than RB. They have formed this opinion over long-term listening. So are these people now proven to be correct, seeing as these ABX results confirm their claims? Or are you going to tell me that they just guessed correctly? How do you know?

My PS thought was a PS because I notice that a theme tends to run through some of the posts, Rob - it wasn't my main point, just a PS

John

Allow me to look more into your post.

The OP was that under conditions of blind testing, that is knowledge removed, not of a person being actually blindfolded, several people were able, and with a high degree of certainty, to hear clear differences between Hi-Rez and lower rez. I believe the most important aspect is knowledge removed, not sight, as you are grasping to that notion. I would think that the post by max made it clear long term: There are too many factors and they add up, they don't normalize, they can't normalize since you have no control over these and remember you know the products in question. You compare this to training ... You would have to admit that it is stretching to the limit of most human mental elasticity ...

You then write

My point has always been unless these factors are controlled then the test is not valid. So the debate has become which is the better INVALID test.

A perfect example of per absurdo argument is "if this is not perfect then it is not valid". Wouldn't you allow us that given two tests, both of them imperfect, the one that is less subject to biases would have the stronger probability of being the better, conservatively 51%? In that regard does "long term" qualify when the multitude of extraneous factors is enough to remove the notion of "test" from the activity?

And when we talk about a "Body of people", how large is that body? and under what conditions did this "body" arrive to these conclusions? Wouldn't the absence of controls make their claim invalid from the start?

Robh3606 · Aug 8, 2014

The other variables/factors are not given due consideration

Hello John

Well they don't really change do they?? That's the point. All the "pressure" is there either way if you are more worried about fitting in with the crowd, proving a point, or just being honest about what you can actually hear. Being honest in front of this crowd being the most difficult.

My preference for long-term listening comes from this very point - it normalizes these influences & doesn't introduce new ones -

Do you really think a technique that normalizes is the best for finding differences?? Seems to me it would do a great job at masking them. To a point I understand what you are saying; I have my "house sound" that my systems are voiced to. I had a person tweak an EQ and I knew in minutes that something was up. We have all had issues where this or that was off and you know right away. I am not saying it is not useful, just not the best way to look for differences that may not be obvious to the casual listener.

Rob

Stereoeditor · Aug 8, 2014

jkeny said:
My point has always been unless these factors are controlled then the test is not valid. So the debate has become which is the better INVALID test.

A great point, to which the answer seems dependent on preference and politics rather than absolute objectivity.

John Atkinson
Editor, Stereophile

jkeny · Aug 8, 2014

Robh3606 said:
Hello John

Well they don't really change do they?? That's the point. All the "pressure" is there either way if you are more worried about fitting in with the crowd, proving a point, or just being honest about what you can actually hear. Being honest in front of this crowd being the most difficult.

Well let's look at esldude's list of factors & consider if these remain fixed from day to day over long-term listening (not to mention that they are NOT the same for everyone)?

Ambient noise levels at times of day or times of year.
Weather variables which cause noise, temperature and humidity variables even indoors with HVAC.
Personal stress variables from many sources like:
Your daughter's new boyfriend, unhappy spouse, vacations or lack thereof, unexpected expenses of living, expected periodic expenses of living, changes at work, changes with close friends in all these areas and their effect on you, relatives in a myriad of ways. Along with a lowering of stress from the opposite of all these or when good things happen.
Industrial activity in your area varies by time of year, day and weather which also affects low frequency ambient noise levels.

And on, and on, and on, and on, and on .......................................... this list being not even 1 tenth of one percent of the categories.

Do you really think a technique that normalizes is the best for finding differences?? Seems to me it would do a great job at masking them.

I didn't say normalises differences, I said normalises the influencing factors. Let me use this very apt example - the factors interfere with the correct sensing of the actual signal, right? These factors are many & varied & change from person to person, day to day (not all, but a lot do - see esldude's list).

There are two ways to try to better sense the signal - remove/control the factors or do what averaged FFT measurements do. What's that? The measurement is averaged over many, many runs so that the factors become random noise, because they are not the same for everybody & so are not as additive. However, the signal is the same in each run & even though it's a small signal & for any particular run the factors could swamp it & prevent it being sensed correctly - when aggregated over many runs, this signal is additive & rises above the grass of noise that represents the influence of the factors.
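The averaging idea can be sketched in a few lines of Python. This is only an illustration, not anything posted in the thread: the sine "signal", the noise level, the run count and the function names are all made-up assumptions, and the sketch only holds if the interfering factors behave like independent, zero-mean noise from run to run. A factor that pushes the same way in every run would not average out.

```python
import math
import random

def noisy_run(signal, noise_sd, rng):
    """One hypothetical run: the fixed signal plus fresh zero-mean noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in signal]

def average_runs(signal, noise_sd, n_runs, rng):
    """Average n_runs noisy copies of the signal, sample by sample."""
    total = [0.0] * len(signal)
    for _ in range(n_runs):
        for i, value in enumerate(noisy_run(signal, noise_sd, rng)):
            total[i] += value
    return [t / n_runs for t in total]

def rms_error(estimate, signal):
    """RMS distance between an estimate and the true signal."""
    squared = sum((e - s) ** 2 for e, s in zip(estimate, signal))
    return math.sqrt(squared / len(signal))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Assumed small signal (amplitude 0.1) buried in noise ten times larger.
signal = [0.1 * math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]

single_error = rms_error(noisy_run(signal, 1.0, rng), signal)
averaged_error = rms_error(average_runs(signal, 1.0, 400, rng), signal)

print(f"single run error:      {single_error:.3f}")
print(f"400-run average error: {averaged_error:.3f}")
```

With independent noise the error of the average shrinks roughly with the square root of the number of runs, which is why the small signal only emerges after many runs.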

To a point I understand what you are saying; I have my "house sound" that my systems are voiced to. I had a person tweak an EQ and I knew in minutes that something was up. We have all had issues where this or that was off and you know right away. I am not saying it is not useful, just not the best way to look for differences that may not be obvious to the casual listener.

Rob

So your "house sound" wasn't arrived at through blind listening, right? It was arrived at over long-term listening, I presume? Did you find that it fluctuated from day to day, depending on influences? Does it fluctuate now based on any of the factors that have been mentioned? Do you have to close your eyes to be sure that your system still sounds the same?

jkeny · Aug 8, 2014

FrantzM said:
John

Allow me to look more into your post.

The OP was that under conditions of blind testing, that is knowledge removed, not of a person being actually blindfolded, several people were able, and with a high degree of certainty, to hear clear differences between Hi-Rez and lower rez. I believe the most important aspect is knowledge removed, not sight, as you are grasping to that notion.

I mean exactly that Frantz, as everyone knows - visible knowledge removed - no grasping here. But there is a distinction here & possibly why I'm reluctant to use your "knowledge removed" phrase. Just how much knowledge is removed? If I still know I'm testing cables, then all knowledge isn't removed - my expectation biases are still in play, same applies to amplifiers, DACs, etc - knowing what is being tested is biasing knowledge. So the phrase "knowledge removed" is a pretense - it doesn't really mean what it says.

Not removing expectation bias greatly influences results. Would this invalidate the results, in your opinion?

I would think that the post by max made it clear long term: There are too many factors and they add up, they don't normalize, they can't normalize since you have no control over these

I differ from this view. If done over time the influence of factors will be averaged out (see my FFT example). If I did a test on a particular day then the factors at play on that day have more of an influence on the result than if I did it over a week, a month (presuming factors change). Multiply this by all the people doing long-term listening & the factors become normalised - no one factor becoming more influential than it should normally.

All this talk about removing a single sightedness factor being of benefit ignores the fact that removing this factor also introduces new factors that weren't there before & there is no control over - I'm talking here about forum organised blind tests. If I do a single shot test (not controlling for anything other than sightedness & level) then the other psychological factors (baggage) that we carried into the test plus the new factors introduced by the test will all have an influence. You are saying that all these factors are outweighed by removing sightedness & therefore the test is better than if done over an extended period where the fluctuations in influence of these factors has to happen?

and remember you know the products in question. You compare this to training ... You would have to admit that it is stretching to the limit of most human mental elasticity ...

I don't know what you are saying here?

You then write A perfect example of per absurdo argument is "if this is not perfect then it is not valid". Wouldn't you allow us that given two tests, both of them imperfect, the one that is less subject to biases would have the stronger probability of being the better, conservatively 51%? In that regard does "long term" qualify when the multitude of extraneous factors is enough to remove the notion of "test" from the activity?

Yes, if it's not valid, then it is not any better than anecdotal evidence, is it? You are arguing that even though you accept the test is not valid, that by removing a single bias, that it is more accurate or better test (with no control over any other factors/biases)?

And when we talk about a "Body of people", how large is that body? and under what conditions did this "body" arrive to these conclusions? Wouldn't the absence of controls make their claim invalid from the start?

Well, I'm talking about the body of people who have listened to X. Obviously if this is a small number then it doesn't have much value. And please don't bring into it the number of people who believe in flying saucers or miracles or whatever. We make judgements based on type of people & numbers as to whether we value their opinions. Is this not what happens on every audio forum - we ask for advice about X, judge the number of responses & the "quality" of those responses to reach a tentative conclusion which we confirm or otherwise.

jkeny · Aug 8, 2014

Stereoeditor said:
A great point, to which the answer seems dependent on preference and politics rather than absolute objectivity.

John Atkinson
Editor, Stereophile

Yes, this is what the debaters here seem to miss - the tests are invalid but they are arguing that their test is more valid than my test - let's not even deem them tests, as someone said - let's correctly call them anecdotes - their anecdote is more valid than mine

Phelonious Ponk · Aug 8, 2014

Stereoeditor said:
A great point, to which the answer seems dependent on preference and politics rather than absolute objectivity.

John Atkinson
Editor, Stereophile

See rob's post above. The answer is only dependent upon that if you've forced reality into a very tight, black and white corner. Why would a hobby dominated by subjectivists suddenly become so rigid in this one, small area? People are bending over backwards to find a reason why a comparative listening method -- let's not call it a test; the casual listening being held up as the benchmark certainly doesn't rise to that level -- which eliminates many opportunities for bias is no better than one that begs for them. It makes no logical sense. The position is not "valid."

Tim

jkeny · Aug 8, 2014

Phelonious Ponk said:
See rob's post above. The answer is only dependent upon that if you've forced reality into a very tight, black and white corner. Why would a hobby dominated by subjectivists suddenly become so rigid in this one, small area? People are bending over backwards to find a reason why a comparative listening method -- let's not call it a test; the casual listening being held up as the benchmark certainly doesn't rise to that level -- which eliminates many opportunities for bias is no better than one that begs for them. It makes no logical sense. The position is not "valid."

Tim

Tim, I agree it's not a test but how many people on audio forums do you see citing "blind test results" when they are talking about these forms of casual listening. So this is at least progress - they are anecdotes, that's all.

So to your point - why should removing one bias (what are the many that you talk about?) surely be better than removing no biases? It's an unfounded statement.
How is it better if any of the remaining biases, collectively or individually, are substantially influential enough to mask the difference?
The differences are still masked by the remaining biases so how can this be better?
Unless you can prove that the remaining biases are not of a strong enough influence to mask the differences. But then you would need to know what they are & measure them & compare the result with/without these biases. Oh, wait, you've just done a fully controlled test

maxflinn · Aug 8, 2014

Phelonious Ponk said:
Why would a hobby dominated by subjectivists suddenly become so rigid in this one, small area? People are bending over backwards to find a reason why a comparative listening method -- let's not call it a test; the casual listening being held up as the benchmark certainly doesn't rise to that level -- which eliminates many opportunities for bias is no better than one that begs for them. It makes no logical sense. The position is not "valid."

Tim

I completely agree, Tim..

IMO, looking to invalidate virtually all non-sighted testing on the basis of it not accounting for enough variables - whilst looking to validate sighted-testing even though it accounts for none, could be put under the Wiki definition of 'illogical'.

Also, re John's point, aggregation of subjective sighted opinions along with non-sighted reports is pointless, IMO.

maxflinn · Aug 8, 2014

John, variables such as we're discussing cannot alter a speaker's output (apart from SPL), nor mask differences.

Removal of knowledge is of paramount importance, then level matching. The former deals with the psychological and the latter the fact that louder is different.

Nocebo is a red herring once listeners report differences sighted, and allowing for the fact that almost all blind-tests have a listening panel that include at least some believers.

Can you state some variables apart from the three I've mentioned that, if not accounted for, could invalidate a blind-test, and why?

jkeny · Aug 8, 2014

Just stating opinions without any supporting text or counter-argument, is not very productive.
This is why I get the feeling of a tag-team in action - say it enough times maybe?

OK, I see you actually made another post while I was replying:

John, variables such as we're discussing cannot alter a speaker's output, nor mask differences.

Hmmm, are you denying the effect of psychological variables in masking the perception of differences? This is a new angle, please explain

Removal of knowledge is of paramount importance, then level matching. The former deals with the psychological and the latter the fact that louder is different.

Yes, and...............?

Nocebo is a red herring once listeners report differences sighted, and allowing for the fact that almost all blind-tests have a listening panel that include at least some believers.

Another interesting contradiction - Nocebo means that someone won't hear a difference sighted or blind so not sure what your point is?

Can you state some variables apart from the three I've mentioned that, if not accounted for, could invalidate a blind-test, and why?

Totally confused now

esldude · Aug 8, 2014

jkeny said:
Tim, I agree it's not a test but how many people on audio forums do you see citing "blind test results" when they are talking about these forms of casual listening. So this is at least progress - they are anecdotes, that's all.

So to your point - why should removing one bias (what are the many that you talk about?) surely be better than removing no biases? It's an unfounded statement.
How is it better if any of the remaining biases, collectively or individually, are substantially influential enough to mask the difference?
The differences are still masked by the remaining biases so how can this be better?
Unless you can prove that the remaining biases are not of a strong enough influence to mask the differences. But then you would need to know what they are & measure them & compare the result with/without these biases. Oh, wait, you've just done a fully controlled test

Classic routines from the Last Comic Standing___Illogical edition.

jkeny · Aug 8, 2014

maxflinn said:
I completely agree, Tim..

IMO, looking to invalidate virtually all non-sighted testing on the basis of it not accounting for enough variables

Max, I suspect that you are confusing terms here. "Not valid" means that the test is not of a sufficiently rigorous standard to provide a relatively accurate result. Your use of "invalidate" seems to me that you think I'm saying blind tests are irrelevant - I'm not - they are of the same value as anecdotes. But I don't dismiss anecdotes as irrelevant - maybe you do?

- whilst looking to validate sighted-testing even though it accounts for none, could be put under the Wiki definition of 'illogical'.

Again we are talking about anecdotal evidence, Max - we all agree that uncontrolled blind tests are not valid (even your SVP, Tim), just the same as sighted-tests.

Also, re John's point, aggregation of subjective sighted opinions along with non-sighted reports is pointless, IMO.

I'm not sure what this means or what your point is?
