Conclusive "Proof" that higher resolution audio sounds different

Orb? What are you talking about? That I don't doubt that Amir heard what he reports hearing, or that I would call anything requiring training and a very specific methodology (as it did in this example) to identify "subtle"? Notice the "I" in both of those. Either way, I don't recall any of you telling me, 50 pages or so ago, that I could neither presume to trust Amir's judgement nor have an opinion of what is subtle.

tim

Yes we did,
quite a few of us have mentioned several times in the distant past that it is wrong to go beyond the scope and focus of what such an ABX test provides, and sorry but you are doing just that.
It is a tool to identify differences with trained listeners; nothing more can be read into it other than that differences have been heard. You cannot then say it shows that audibility is trivial for normal listening, in the same way people would be wrong to say this is having an impact on listeners; to support either of those views it would need to be correlated to listening behaviour, same as with tolerances-thresholds.
Anyway dropping out of this conversation again because this has all been mentioned before and not just by me.
Cheers
Orb
 
esldude,
it's a pain in the backside, but do you know which set of files you have?
Just asking, as the keys were put up several times by Arny I think. I also still feel it is worth comparing to TPDF, as I understand you used a different dither, while TPDF is the recommended one at this stage.
BTW I tend to agree this is about how digital music is "handled", which becomes even more interesting when considering the complete chain from, say, mics (possibly compounded with PDM mics) to master file to consumer PC to DAC.
It would be great if you could try with TPDF, since you passed with the original files and could not differentiate using the dither you applied (I know I asked you before and you kindly gave the information, but in case anyone else tries, please could you mention it again - sorry).

Thanks again
Orb
 
Yes we did,
quite a few of us have mentioned several times in the distant past that it is wrong to go beyond the scope and focus of what such an ABX test provides, and sorry but you are doing just that.
It is a tool to identify differences with trained listeners; nothing more can be read into it other than that differences have been heard. You cannot then say it shows that audibility is trivial for normal listening, in the same way people would be wrong to say this is having an impact on listeners; to support either of those views it would need to be correlated to listening behaviour, same as with tolerances-thresholds.
Anyway dropping out of this conversation again because this has all been mentioned before and not just by me.
Cheers
Orb

Read what is written, not what you would like to argue against. You're not telling me to stay within the scope of ABX testing (which you're improperly defining), you're telling me I'm not entitled to an opinion. If you're concerned about the scope of ABX testing, you should be talking to the people who think it is not only OK, but essential, to base testing methodology on results. And by the way, while we're talking about testing scope and methodology, blind testing is not limited either to small differences or to trained listeners. Not among research professionals. Not even on this board.

Yes, this has all been mentioned before and it is all still wrong.

Tim
 
I also find the reference to "trained listeners" strange. Would it be that an ABX test is only valid with trained subjects? And in that context, what are we to make of sighted tests? Are they more valid?

This is what I get from the whole thing. Amir contended, and has facts to prove, that there are differences between Hi-Rez and not Hi-Rez, and those differences can be perceived under blind conditions. Such differences cannot be explained by bias, knowledge or the like with the test material provided. We will debate all we want that blind tests are not adequate or sufficient. They remain a better tool for finding differences than sighted tests, where the sheer number of variables/biases is bewildering. Is there anything else? Please educate me.
 
esldude,
it's a pain in the backside, but do you know which set of files you have?
Just asking, as the keys were put up several times by Arny I think. I also still feel it is worth comparing to TPDF, as I understand you used a different dither, while TPDF is the recommended one at this stage.
BTW I tend to agree this is about how digital music is "handled", which becomes even more interesting when considering the complete chain from, say, mics (possibly compounded with PDM mics) to master file to consumer PC to DAC.
It would be great if you could try with TPDF, since you passed with the original files and could not differentiate using the dither you applied (I know I asked you before and you kindly gave the information, but in case anyone else tries, please could you mention it again - sorry).

Thanks again
Orb

Well, since last posting about it many pages ago, I have found time to ABX triangular (TPDF) dither. I also found only random results when redithered. I used Audacity for the redithering, which since version 2.0.3 has used the SoX algorithm. It results in this FFT, which still looks better than Arny's original resampling.

[Attached image: Diff Jangling keys triangular dither.jpg - FFT of the difference file with triangular dither]

I found Arny's original files with the attached high-frequency tones for checking for IMD. Other than the extra tones they seem to be the same as what I used earlier, having the same artifacts from resampling. In addition I made my own jangling keys recording with plenty of content up into the 30 kHz plus range, made at 96/24 with zero manipulation in between. That also is indistinguishable by me once resampled well.

As for which dither is preferred, there is conflicting information, with some saying TPDF is always preferred. Others say that if you know no other manipulation will occur, shaped is better. As you can see comparing the two, shaped has lower noise floors below and especially around 3 kHz. It rises above TPDF in the upper octave and a half, where our hearing is less sensitive. In any case, once resampled with SoX I cannot discern a difference with either.

There are at least 3 things that could be audible here: the 96 vs 44.1 kHz sample rate and the related frequency limits, 24 vs 16 bit, and the effects of dither. Comparing 96/24 and 96/16 could tease out at least the latter two. The other worthwhile test would be 96/24 vs 44/24.
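
For anyone who wants to reproduce the redithering step outside Audacity, here is a rough sketch in Python (just an illustration; it assumes the audio is already loaded as float samples in the -1 to +1 range, and the 1 kHz test tone at the end is only there to make it self-contained) of applying TPDF dither when truncating to 16 bits:

Code:
import numpy as np

rng = np.random.default_rng()

def to_16bit_tpdf(x):
    """Quantise float audio (scaled to +/-1) to 16-bit integers with TPDF dither."""
    lsb = 1.0 / 32768.0  # one 16-bit LSB for full-scale +/-1 audio
    # TPDF dither: the sum of two independent uniform noises of +/-0.5 LSB each,
    # giving a triangular probability density spanning +/-1 LSB.
    tpdf = (rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)) * lsb
    return np.clip(np.round((x + tpdf) * 32767), -32768, 32767).astype(np.int16)

# Illustrative input: a 1 kHz tone sampled at 96 kHz
fs = 96000
t = np.arange(fs) / fs
tone = 0.5 * np.sin(2 * np.pi * 1000 * t)
quantised = to_16bit_tpdf(tone)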
 
Yes we did,
quite a few of us have mentioned several times in the distant past that it is wrong to go beyond the scope and focus of what such an ABX test provides, and sorry but you are doing just that.
It is a tool to identify differences with trained listeners; nothing more can be read into it other than that differences have been heard. You cannot then say it shows that audibility is trivial for normal listening, in the same way people would be wrong to say this is having an impact on listeners; to support either of those views it would need to be correlated to listening behaviour, same as with tolerances-thresholds.
Anyway dropping out of this conversation again because this has all been mentioned before and not just by me.
Cheers
Orb

Actually the difference in results for trained vs untrained listeners has been researched by scientists in the field. The answer on the importance of training and experience is: it depends. It depends on what is being tested for in particular. Some types of differences appear to be perceived equally well by both groups. Others show some difference in how small and how reliably differences can be heard. I don't know of any difference that untrained listeners pick up less well that would not also fall into the range of 'subtle' that Tim is referring to here, even for trained listeners.
 
I also find the reference to "trained listeners" strange. Would it be that an ABX test is only valid with trained subjects? And in that context, what are we to make of sighted tests? Are they more valid?

This is what I get from the whole thing. Amir contended, and has facts to prove, that there are differences between Hi-Rez and not Hi-Rez, and those differences can be perceived under blind conditions. Such differences cannot be explained by bias, knowledge or the like with the test material provided. We will debate all we want that blind tests are not adequate or sufficient. They remain a better tool for finding differences than sighted tests, where the sheer number of variables/biases is bewildering. Is there anything else? Please educate me.

Other than the position taken by a couple of members here that blind tests are not only sufficient but require little control/scrutiny as long as the results are positive, yet must be re-run through a rigorous (but pretty poorly defined, here) set of controls if a negative result occurs, I think you've got it about right. I wouldn't say the very limited results reported here meet the standards for scientific fact, but you're really close, Frantz.

Tim
 
Actually the difference in results for trained vs untrained listeners has been researched by scientists in the field. The answer on the importance of training and experience is: it depends. It depends on what is being tested for in particular. Some types of differences appear to be perceived equally well by both groups. Others show some difference in how small and how reliably differences can be heard. I don't know of any difference that untrained listeners pick up less well that would not also fall into the range of 'subtle' that Tim is referring to here, even for trained listeners.

Totally agree, and that usually falls into a specific focus, such as that identified by the Harman studies on perceived speaker quality. But it is pretty clear that in this context trained listeners also includes how to listen - isolation, methodology for specific anomalies/traits within ABX - in the same way one has to use trained listeners and practice when doing similar tests involving distortion, if you want to get identification down to lowish levels; I think Sean Olive also did something similar a few years ago.
I think it is pretty clear the approach of trained listeners to this test is very different to that of untrained ones; just look at the approach before Amir and I mentioned techniques, and then how it helped a couple of members.

But again it does not matter if the difference is subtle or not; it matters how it affects listening behaviour, if Tim wishes to say it is negligible for normal listening.
Some effects can be quite large before affecting listener behaviour in terms of tolerance-threshold and/or satisfaction-emotional connection, and yet others much lower.
So it is wrong to claim (I appreciate you are just adding to Tim's original POV) that it would be insignificant without understanding why it is happening, and then taking it further to understand the mechanism and its implications for listeners.
Anyway, that is for a different thread IMO, but taking this to the next step is an assumption.

And thanks for doing the TPDF test, much appreciated, as I was wondering if this was a possibility; blast, one theory possibly gone :D

Cheers
Orb
 
I also find the reference to "trained listeners" strange. Would it be that an ABX test is only valid with trained subjects? And in that context, what are we to make of sighted tests? Are they more valid?

This is what I get from the whole thing. Amir contended, and has facts to prove, that there are differences between Hi-Rez and not Hi-Rez, and those differences can be perceived under blind conditions. Such differences cannot be explained by bias, knowledge or the like with the test material provided. We will debate all we want that blind tests are not adequate or sufficient. They remain a better tool for finding differences than sighted tests, where the sheer number of variables/biases is bewildering. Is there anything else? Please educate me.
Frantz, I don't think you have tried running this ABX test? If you had, you might understand that success for most depends on isolating the same section from both audio samples where the listener can hear a distinct difference & is confident that (s)he can then do it blind. It's a self-training step. But it has been pointed out 50 pages ago that in order to identify such a section the listener must be able to listen with great attention & have a certain ability to know what to look out for, i.e. previous experience of spotting minor audible differences. I would suggest that there is a vast range that people fall into in this regard - at one end we have people like Amir, who has done this in the past as part of his role at MS; there is a middle group who have a wide exposure to audio equipment & have developed a certain acquaintance with some forms of audible differentiation; there is a large body of people who don't care, have no interest, and just want the music; & there is a fundamentalist group who deny such differences can possibly exist.

I would suggest that this test will give different results depending on which group the listener belongs to!
 
Frantz, I don't think you have tried running this ABX test? If you had, you might understand that success for most depends on isolating the same section from both audio samples where the listener can hear a distinct difference & is confident that (s)he can then do it blind. It's a self-training step. But it has been pointed out 50 pages ago that in order to identify such a section the listener must be able to listen with great attention & have a certain ability to know what to look out for, i.e. previous experience of spotting minor audible differences. I would suggest that there is a vast range that people fall into in this regard - at one end we have people like Amir, who has done this in the past as part of his role at MS; there is a middle group who have a wide exposure to audio equipment & have developed a certain acquaintance with some forms of audible differentiation; there is a large body of people who don't care, have no interest, and just want the music; & there is a fundamentalist group who deny such differences can possibly exist.

I would suggest that this test will give different results depending on which group the listener belongs to!

Not sure what your point is. The definition of an ABX test is clear and doesn't leave much to interpretation. I am quoting Wikipedia:
From Wikipedia, the free encyclopedia
An ABX test is a method of comparing two choices of sensory stimuli to identify detectable differences between them. A subject is presented with two known samples (sample A, the first reference, and sample B, the second reference) followed by one unknown sample X that is randomly selected from either A or B. The subject is then required to identify X as either A or B. If X cannot be identified reliably with a low p-value in a predetermined number of trials, then the null hypothesis cannot be rejected and it cannot be proven that there is a perceptible difference between A and B.

There is nowhere in this definition where "training" figures. It is clear that training might be helpful, but it is not a requisite for a valid ABX test.

I have agreed a long time ago that an ABX test is not trivial, but even half-baked and informal it seems to remove a large number of biases that other tests fail to take into account. As for the middle group you are talking about, we could call them informally trained, but shouldn't we test the reliability of their observations? Or should we take them at face value because they have been exposed long term to the so-called differences? And underlying this, not completely stated but too often implied, is the fact that those who maintain these observations have systems that are deemed of greater resolution. Why then, under blind conditions, does the reliability of their observations fall so abysmally and so systematically? Stress? Really?
I have no problem that you contest the validity of forum AB tests, that is fine, but trying so vociferously to show that sighted tests are a better alternative remains a quixotic battle. It may satisfy you and/or some, but the facts you have so far presented do not support it.

And just for kicks: As for fundamentalist groups .. Pendulum swings both ways
 
Not sure what your point is. The definition of an ABX test is clear and doesn't leave much to interpretation. I am quoting Wikipedia:

There is nowhere in this definition where "training" figures. It is clear that training might be helpful, but it is not a requisite for a valid ABX test.

I have agreed a long time ago that an ABX test is not trivial, but even half-baked and informal it seems to remove a large number of biases that other tests fail to take into account. As for the middle group you are talking about, we could call them informally trained, but shouldn't we test the reliability of their observations? Or should we take them at face value because they have been exposed long term to the so-called differences? And underlying this, not completely stated but too often implied, is the fact that those who maintain these observations have systems that are deemed of greater resolution. Why then, under blind conditions, does the reliability of their observations fall so abysmally and so systematically? Stress? Really?
I have no problem that you contest the validity of forum AB tests, that is fine, but trying so vociferously to show that sighted tests are a better alternative remains a quixotic battle. It may satisfy you and/or some, but the facts you have so far presented do not support it.

And just for kicks: As for fundamentalist groups .. Pendulum swings both ways

Frantz, you are still missing the point - in doing the ABX test, an initial stage for everyone (let's say) is that they identify a section which they feel is different between the two audio samples. This is done before the ABX test run itself, no? How do you think they set about doing this?

This identification of differences in this pre-trial stage is the crucial step & is actually the test - if no differences are heard at this stage, there's usually no point in doing ABX trial runs, is there? (Unless of course you think that differences can be sensed subconsciously in these tests?) Now, the ability to identify the sections that best show differences between the samples, in the pre-trial run, is being exercised sighted, right? So this is the hurdle to overcome first. This is called training yourself, & some will already have this ability well formed, others not so much, others not at all, others couldn't care less, & then there is the nocebo group. All of these will have greater or lesser success with this pre-trial. The ABX section itself is just a confirmation (or otherwise) that you have correctly chosen differences which are really there - testing the reliability of their observations, as you put it.

That's why I asked you if you have tried doing this ABX test. It might be worthwhile to do so in order to understand its workings a bit better. There seems to be an awful lot of time/energy being wasted on this thread trying to explain the operation of the test & what statistical significance means to people who haven't tried running this ABX test. Just try it, honestly, first before giving a Wikipedia quote about experience/training.
 
Frantz,
training can also be informal (of much less value though, depending upon the scope-focus-context), where guidance is provided on what to do, as seen by some who decided to retry this specific ABX test after suggestions in the threads on how to approach the listening and also the ABX.
Cheers
Orb
 
From Wikipedia, the free encyclopedia

An ABX test is a method of comparing two choices of sensory stimuli to identify detectable differences between them. A subject is presented with two known samples (sample A, the first reference, and sample B, the second reference) followed by one unknown sample X that is randomly selected from either A or B.

The subject is then required to identify X as either A or B. If X cannot be identified reliably with a low p-value in a predetermined number of trials, then the null hypothesis cannot be rejected and it cannot be proven that there is a perceptible difference between A and B.


I thought the highlighted section needed, er, highlighting :)
 
John

The whole thing about ABX is the statistical reliability of the observation. It removes the "feel" that I heard a difference; it tries to establish a degree of certainty. This being a human thing there is no perfection, but a score reliably above the 50% chance level, over enough trials, places it outside the realm of total uncertainty. You moved the goalposts, posing yourself as a teacher to an ignorant student. If your point in the debate is just to win it, that is one of the available strategies. If it is to sustain your point that blind tests are no more valid than sighted ones, you need a better line of argumentation.

P.S. Thanks Max for the highlighting
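
To put rough numbers on "reliably above the 50% chance level", here is a quick sketch (assuming scipy is available; the scores are purely illustrative) of the one-sided binomial p-value for an ABX run - the probability of doing at least that well by pure guessing:

Code:
from scipy.stats import binomtest

# Chance of scoring at least this well by guessing (p = 0.5 per trial).
for correct, trials in [(8, 10), (12, 16), (16, 20)]:  # illustrative scores only
    p = binomtest(correct, trials, p=0.5, alternative="greater").pvalue
    print(f"{correct}/{trials} correct: p = {p:.3f}")

# 8/10 comes out just above 0.05, which is why a handful of trials
# scored "above 50%" is not, on its own, a statistically reliable result.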
 
Do these measure harmonic structure? Noise has a signature that is high-frequency smearing or harmonic distortion; it affects clarity.

Yes it would. In this particular case the FFT is showing bins approximately 11 Hz wide; each 11 Hz section makes up one point of the graph. A 1 kHz tone would show a spike near 1 kHz, and others at 2 kHz or 3 kHz if there were harmonics.

Noise need not be high frequency; it can have any frequency. When I say noise is audible here in the difference, it is only audible with amplification of some 40 decibels when good resampling was used. If you just listen to the file on its own, you hear silence. With amplification you hear something much like interstation hiss from FM with quieting turned off.
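
As a rough sanity check on that bin width (the actual FFT length is not stated here, so the 8192-point figure below is only an assumption that happens to land near 11 Hz):

Code:
fs = 96000       # sample rate of the hi-res files
n_fft = 8192     # assumed FFT length; not stated above
bin_width = fs / n_fft
print(f"{bin_width:.1f} Hz per bin")  # ~11.7 Hz, close to the ~11 Hz quoted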

I agree, but EMI is much harder to distinguish at other frequencies. Again, EMI affects clarity, so the harmonics are affected, not necessarily the fundamental tone. That is why system resolution is important, as well as identification.

And when I listened to the two files I focused on the clarity of the upper band of the jingling keys. The clarity was the only marker I needed.
 
I understand that getting a real grip on AB and ABX testing protocols can be difficult. Heck, google it and you'll be 6-7 pages in before you get past all the audiophile denials on message boards. Stick with it, though, and sooner or later you'll get to real research people talking about proper methodologies instead of folks with clear agendas trying to manipulate methodologies to deliver the results they'd like to see:

http://www.hopkins-research.com/abx.htm

Nothing here about "pre-trial" stages or pre-training requirements. Certainly nothing about different control requirements based on results. A lot about statistical requirements. What they are testing is perception, and there's no doubt that different people can perceive different things. When a long string of nulls is returned in statistically significant numbers, and you believe it was the participants, not what was being tested, that caused that result, you run the test again, changing that variable - the participants - and see if you get a different, statistically significant result. You do not declare that all participants in all tests must meet criteria you expect to deliver the result you want, and you do not change a bunch of variables (the controls John and Micro say are required for null results but not for positive ones) and re-test only the subjects who returned a result you didn't like.

I don't know where you guys are getting this stuff, but I wish you'd tell me so I could avoid it.

Tim
 
John

The whole thing about ABX is the statistical reliability of the observation. It removes the "feel" that I heard a difference; it tries to establish a degree of certainty. This being a human thing there is no perfection, but a score reliably above the 50% chance level, over enough trials, places it outside the realm of total uncertainty. You moved the goalposts, posing yourself as a teacher to an ignorant student. If your point in the debate is just to win it, that is one of the available strategies. If it is to sustain your point that blind tests are no more valid than sighted ones, you need a better line of argumentation.

P.S. Thanks Max for the highlighting
Frantz, have you tried running this or any ABX test? Can you tell me where my description of the preparation is wrong? If not, can you say how training is or is not a cogent factor in the sense I outlined?

I'm not trying to put myself forth as a teacher & sorry if it comes across like that. What amazes me is what is being disputed here - firstly, the whole statistical basis of ABX testing, with statements like "one positive result is valid but any number of negative results means the controls have to be examined" - well duh, yeah; if you can't show the test has been controlled properly then the results are suspect, of course. This is a blatant (maybe willful) misunderstanding of the nature of the test. The latest sideshow about training is similarly confused.

I was just suggesting that people actually try the test before making such wildly uninformed statements - it would help the SNR on this thread.
 
I also find the reference to "trained listeners" strange. Would it be that an ABX test is only valid with trained subjects? And in that context, what are we to make of sighted tests? Are they more valid?

This is what I get from the whole thing. Amir contended, and has facts to prove, that there are differences between Hi-Rez and not Hi-Rez, and those differences can be perceived under blind conditions. Such differences cannot be explained by bias, knowledge or the like with the test material provided. We will debate all we want that blind tests are not adequate or sufficient. They remain a better tool for finding differences than sighted tests, where the sheer number of variables/biases is bewildering. Is there anything else? Please educate me.

Frantz,

Your final remark just shows you have not read the debates. I can imagine your time is too valuable to go through hundreds of others' contributions, but the "is there anything else" is really amusing.
 
Jeez, Tim, all tests, not just ABX, have certain criteria that need to be met before they are considered valid, sensitive tests. This is testing 101. Part of this is usually pre-screening of the participants. Nobody that I know of ever skips this & then just changes the participants after a test & runs it again (unless it was a badly designed test in the first place). I'm really struggling to understand what is so difficult about all of this.
 
I agree, but EMI is much harder to distinguish at other frequencies. Again, EMI affects clarity, so the harmonics are affected, not necessarily the fundamental tone. That is why system resolution is important, as well as identification.

The only effects that matter are those you can hear. Aural perception is limited to 20 kHz and below. EMI can cause disturbances which affect that range, or it can cause no change whatsoever. You can only hear EMI through its effects on signals in the audible range. So if you know enough about signals in the audible range, you can determine whether they sound different. Why they are different, if they are, is another issue and could be any one of many things. EMI isn't some special condition that is heard just by its presence.
 
