Conclusive "Proof" that higher resolution audio sounds different

Micro, we're still not understanding each other, possibly my fault.

I'm not talking about a null with no controls. I'm talking about blind tests, or controlled tests, where participants first listen sighted, knowing what they're listening to - DAC A or DAC B, for example - and report audible differences. Then, in the same venue, on the same system, on the same day, the blind test begins. Now they only know they're listening to one of the things being tested at a given time, but not which one, and they fail to reliably distinguish A from B, or to say which of A or B is X if it's an ABX test.

This scenario is extremely common in the typical forum blind A/B or blind ABX test.

I'm suggesting that the reason for the different reports, i.e., positives sighted but nulls with knowledge removed (same participants, same system, same day), is expectation bias/placebo creating false positives when sighted. The controls, i.e., the subsequent removal of knowledge, do their job in these instances. The null results indicate that the reported sighted differences were simply imagined.

I'm arguing that these null results should not be rejected - and the Wiki link backs this up.



Sighted listening is only valid to each listener on a personal basis; it's never valid as proof. Your typical forum-run controlled blind test is different, because of the controls, though one needs statistically significant data for proof. Also, null results don't prove anything 100%, but they very strongly indicate that no audible differences were present.
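
On the "statistically significant" point, the arithmetic is simple binomial statistics: under the null hypothesis of guessing, each trial is a coin flip, so the p-value of a score is the chance of doing at least that well by luck. A minimal sketch (plain Python, with hypothetical scores for illustration):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` in an
    A/B or ABX test by guessing alone (binomial with p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

print(abx_p_value(13, 16))  # ~0.011 -> significant at the usual 5% level
print(abx_p_value(10, 16))  # ~0.227 -> consistent with guessing: a null
```

This is also why a handful of trials proves little either way: the same roughly 62% hit rate looks like a null at 16 trials but becomes significant with enough trials.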



I disagree. It's not at all difficult to level match using a competent, appropriate tool. Removal of knowledge is not difficult either, with a bit of planning and common sense. What else, in your opinion, needs controlling? Let's not forget that when participants report differences during the sighted part of a blind test, this rules out negative expectation bias - they'd be expecting to hear differences blind.
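
As an illustration of how little tooling the level-matching step takes, here is a minimal sketch (assuming numpy and the soundfile library; filenames are hypothetical) that measures the RMS difference between two captures in dB and writes a gain-corrected copy. For live devices the same idea applies with a voltmeter or SPL meter at the listening position.

```python
import numpy as np
import soundfile as sf  # assumed available: pip install soundfile

a, rate_a = sf.read("dac_a_capture.wav")  # hypothetical capture of DAC A
b, rate_b = sf.read("dac_b_capture.wav")  # hypothetical capture of DAC B

rms = lambda x: np.sqrt(np.mean(np.square(x)))
diff_db = 20 * np.log10(rms(a) / rms(b))
print(f"level difference: {diff_db:+.2f} dB")

# scale B to match A's RMS level before any listening comparison
sf.write("dac_b_matched.wav", b * 10 ** (diff_db / 20), rate_b)
```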



It was a controlled blind ABX test. The listeners all reported audible differences first, before the controlled testing, between several DACs varying hugely in price. When the controlled testing started, none could reliably identify them or say what X was. A null result that should not be rejected.



I've seen many glowing reports of the Devialet but as you say, no blind-tests yet to confirm a difference in comparison to other amps.

Although I understand your points, unless you decide to read elsewhere about controls and the statistical nature of listening tests, we will go nowhere debating sighted versus blind tests - we are speaking different dialects, and I do not know how to make it simpler.

Both types of listening need control mechanisms - and as noted before, audiophile challenges are a good example of how they should not be carried out.

These debates chase their tails because we do not move outside repetition, Wikipedia and forum culture. There are excellently written standards and articles by known audio writers, even by known audio designers, but very few care about them, perhaps because they are not freely or easily available.
 
What we also see here in the thread recently is the corollary of what has been argued about for many pages now - it may well be that the ABX tests on ArnyK's & Scott's files turn out not to be valid tests, because the files themselves have timing differences between the original & downsampled versions.

This proves how relevant the controls are for any valid test - in this case, specifically, point 4 on his list: "4) perfect time alignment and level alignment (either of those off by much at all will absolutely result in a positive result)"

But we can't just take one control & say, yes that's important but the others aren't.
The test itself didn't include any positive or negative controls; only later were test tones introduced, to examine possible IMD in the playback equipment - a form of control - but the test could also have included hidden anchors to determine whether false positives or false negatives were being reported (see the sketch below).
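
For what hidden anchors might look like in practice, here is a minimal sketch (hypothetical, not from the test under discussion): alongside the genuine A/B trials, the schedule mixes in same-vs-same trials as a negative control (any reported difference there is a false positive) and an obviously degraded copy as a positive control (failing to hear it flags an insensitive listener or setup).

```python
import random

# Genuine comparisons plus two kinds of hidden anchors (illustrative counts)
trials = (
    [("A", "B")] * 10            # the real question under test
    + [("A", "A")] * 3           # negative controls: no audible difference exists
    + [("A", "A_degraded")] * 3  # positive controls: difference should always be heard
)
random.shuffle(trials)  # the listener must not know which kind each trial is

for i, (x, y) in enumerate(trials, 1):
    print(f"trial {i:02d}: {x} vs {y}")
```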

I think in a lot of ways ArnyK's test shows the prevailing attitude to this - blind tests have almost always delivered a null result in the past, so no real thought is given to the flaws within the test & how to control or examine them. If nothing else, these positive test results (whether they prove to be based on bad samples or not) have thrown a light on proper testing procedure & the propensity of improper testing procedure to deliver false negatives (nulls).
 
Micro, the reason the sighted reports from these tests differ from the blind results of the same tests is the controls: knowledge and levels.

They remove non-aural stimuli and differences that might be caused by differing SPL. The reported differences then vanish, nearly every time.

Is there any evidence that adding yet more controls brings the blind results more into line with the sighted? (same test, sighted then blind).

I'd like to see it!
 
Although I understand your points, unless you decide to read elsewhere about controls and the statistical nature of listening tests, we will go nowhere debating sighted versus blind tests - we are speaking different dialects, and I do not know how to make it simpler.

Both types of listening need control mechanisms - and as noted before, audiophile challenges are a good example of how they should not be carried out.

These debates chase their tails because we do not move outside repetition, Wikipedia and forum culture. There are excellently written standards and articles by known audio writers, even by known audio designers, but very few care about them, perhaps because they are not freely or easily available.

Well we have something like the ITU BS1116 recommendations. Available here for all to see:

Interesting in that it has the following right near the beginning:

This Recommendation is intended for use in the assessment of systems which introduce impairments so small as to be undetectable without rigorous control of the experimental conditions and appropriate statistical analysis. If used for systems that introduce relatively large and easily detectable impairments, it leads to excessive expenditure of time and effort and may also lead to less reliable results than a simpler test. This Recommendation forms the base reference for the other Recommendations, which may contain additional special conditions or relaxations of the requirements included in this Recommendation.


So microstrip, what is the equivalent how-to for sighted listening?

The consensus among audiophiles seems to be that long-term listening is the way to go. That alone introduces many variables simply beyond controlling - a good many of which immediately become controlled by doing the simplest short-term blind comparison.

Now I would add level matching, but so many think it not important. I would add listening to the same music or signal. Many times I have seen audiophiles listen to a couple of songs with one piece of gear, switch gear, listen to different songs, and proclaim the differences observed. I am not trying to paint with an overly broad brush - poorly done sighted or blind comparisons don't invalidate the methods in total.

But can you give us a summary of these sighted methodologies, the excellently written standards you mention?

Stating my position clearly: when weighing the evidence and reasoning through the matter, sighted listening seems an inferior method of evaluation for discerning differences compared to blind. I cannot imagine what sighted methodology would make it comparable. It is part of the different dialects you mention. Sighted listeners hold their beliefs without playing by the rules of evidence the blind comparisons are built upon. You imply an understanding of statistical controls would allow sighted tests a place. I would like to see what method makes that possible. I sure hope you have something more statistically sound than "one million audiophiles speaking with their pocketbooks can't all be wrong".
 
If used for systems that introduce relatively large and easily detectable impairments, it leads to excessive expenditure of time and effort and may also lead to less reliable results than a simpler test.
Can anyone tell me how this sentence applies? In what circumstances would this test be unreliable for gross or easily detected impairments? It seems counter-intuitive.
 
The other interesting question is: if ArnyK's audio files are actually shown to be flawed, then what does that say about the sensitivity/accuracy of the ABX test itself, or how it's run? These files have been around for 10 years or more, I believe, & nobody reported positive results until recently. Remember, the original files seem to be doubly flawed - level differences of 0.2 dB & timing differences of 10 ms - both of which are reported as being within the audible range.

Winer's files have similarly been around for 10 years, & only recently have positive results been reported - & reported by many, most of whom are not trained listeners.

So why all these false negatives? What is wrong with the running of ABX?

Further questions that occur to me are:
- which results are correct: the many previous null results, or the recent positive ABX results?
- why did nobody return positive results before now, when now there is more than one positive result?
- would a more carefully controlled ABX test (including the files) have avoided all this?
- what other biases are not being accounted for in these blind ABX tests that have given 100% null (false negative) results up to now? (the power calculation sketched after this list shows one way real differences produce nulls)
- what overlooked bias is stronger than the level & timing differences that are in these files & are known to be audible?
- what bias has been removed that now allows people to easily hear the differences in Winer's 20-pass files, whereas before they weren't audible?
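
On the false-negatives question, a quick power calculation makes the point concrete. This sketch (plain Python, illustrative numbers) asks: if a listener genuinely hears the difference on 70% of trials, how often does a 16-trial ABX at the usual 5% significance level actually catch it?

```python
from math import comb

def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, alpha = 16, 0.05
# smallest score that is significant under pure guessing (p = 0.5)
k_crit = next(k for k in range(n + 1) if binom_tail(k, n, 0.5) <= alpha)
power = binom_tail(k_crit, n, 0.7)  # listener's true hit rate is 70%
print(k_crit, power)  # 12, ~0.45
```

So even a real, repeatable 70% detection rate returns a "null" more often than not at this test length - a mechanism for false negatives that has nothing to do with the files.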
 

Arny's files were the right level, or pretty darn close. The AVS files were mismatched in volume. Arny's files were different in time by one sample or two, I forget offhand. I corrected that before I ABX'd them. So at least for me that wasn't the reason.

As for those not being detected previously, I have a few likely ideas. For me, I didn't have good-quality computer playback when those files first came out, and being able to do comparisons that way is easier. Back when those files first appeared it would have meant burning to CD-R and listening. Now you likely have 100 times as many people who can drag and drop and give it a go just that easily. Further, even fractional seconds of delay in switching corrupt results for difficult-to-discern differences. With CD-R, near-instant switching isn't possible; with Foobar it is. I couldn't have ABX'd the files without that. Even Arny's original files aren't an easy thing to pick out. So none of that really goes against what is known about ABX testing conditions and their effects on results.

The only part of your questions I can agree with is:

- would a more carefully controlled experiment (including files) have avoided all this?

I think the way forward, if you wish to call it that, isn't to disavow the benefits and ways blind tests are better, but rather to properly criticize them when they are done wrongly and increase education about the main things that must be done correctly. The recent AVS files being 0.2 dB different in level, in this day and age, is hugely incompetent. The other thing, perhaps not fully appreciated for very fine differences, is the need for short segments and instantaneous switching. Switching delays of half a second are enough to corrupt otherwise solid results. None of this is news for genuine psychoacoustic work, but it sometimes is in casual forum-based comparisons.
 
OK, you have me confused - there are two sets of files from AVS: one is Arny's (jangling keys), the other is Scott's. There is a third ABX test that Amir passed, & that was Winer's files. Could we try to keep these clearly named to avoid confusion? You tested Arny's jangling keys files, & I thought you came up with a timing difference between the original & resampled file caused by the resampler?

So, ArnyK's tracks had a level difference (0.2 dB, I believe?) but were only about half a sample - 13 µs - mismatched when I tested them. Firstly, 0.2 dB is claimed to be audible. Secondly, I believe you were saying that the differences were explainable by old vs new resamplers & possibly different types of dither? Are you saying the files are audibly flawed or not? If they are audibly flawed, then why were they not picked up before now? If they are not audibly flawed, then what is the problem with different dither, 0.2 dB level differences or timing differences?
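
For anyone wanting to check such offsets themselves, a minimal sketch (assuming numpy and soundfile; filenames hypothetical): it estimates the whole-sample offset between two files by cross-correlation and reports the overall level difference. Sub-sample offsets like the ~13 µs figure need interpolation on top of this.

```python
import numpy as np
import soundfile as sf  # assumed available

x, fs = sf.read("original.wav")   # hypothetical filenames
y, _ = sf.read("resampled.wav")
if x.ndim > 1:                    # use left channels if stereo (both files alike)
    x, y = x[:, 0], y[:, 0]
m = min(len(x), len(y))
x, y = x[:m], y[:m]

# whole-sample offset via cross-correlation of a short excerpt
# (a short window keeps the O(n^2) correlation quick)
n = min(m, 1 << 14)
corr = np.correlate(x[:n], y[:n], mode="full")
offset = int(corr.argmax()) - (n - 1)
print(f"offset: {offset} samples ({offset / fs * 1e6:.1f} us)")

rms = lambda s: np.sqrt(np.mean(np.square(s)))
print(f"level difference: {20 * np.log10(rms(x) / rms(y)):+.3f} dB")
```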

The 10 ms difference is in Scott's set of files - three pairs in all? But these are not long-standing files.

Winer's files are audibly different, but am I right in thinking that nobody until recently reported hearing differences between the different loopback passes vs the original?

Mostly agreed - better controls are required if people are going to use blind tests as evidence of whatever. If those controls are not in place then they are anecdotal reports, just the same as sighted listening reports - no better, no worse!

But I also think it might be wise to ponder how these tests are now returning positives when in the past they didn't. I don't think your PC explanation is correct - in fact it contradicts what ArnyK himself maintains, namely that playback IMD is responsible for the audible differentiation. Older PCs & soundcards would have suffered IMD to a greater extent, and in greater numbers, than modern PCs & soundcards, & therefore one would expect to have seen more positive results then than now!
 

But can you give us a summary of these sighted methodologies, the excellently written standards you mention?

Stating my position clearly: when weighing the evidence and reasoning through the matter, sighted listening seems an inferior method of evaluation for discerning differences compared to blind. I cannot imagine what sighted methodology would make it comparable. It is part of the different dialects you mention. Sighted listeners hold their beliefs without playing by the rules of evidence the blind comparisons are built upon. You imply an understanding of statistical controls would allow sighted tests a place. I would like to see what method makes that possible. I sure hope you have something more statistically sound than "one million audiophiles speaking with their pocketbooks can't all be wrong".

The standards are for blind listening - sorry, I cannot repeat myself extensively in every post. Sighted listening is discussed less formally in many other articles I referred to. Unfortunately I am away from my office, otherwise I could give you some references to articles I own. And no, I would never refer to audiophiles and pocketbooks. I would point to the audio designers and system assemblers who design great electronics and cables, and voice great systems, using sighted listening.
 

About Ethan's files I have had nothing to say, and I have not dealt with them.

Scotty's AVS files were recent and had a timing shift, a sub-sample timing shift, and a 0.2 dB level difference. Arny's files were shifted by one whole sample at 96 kHz; there was no level difference. It appears the residual, which is barely discernible, was due to older resampling artifacts. Once I resampled with Sox in Audacity I could not find a difference blind, and the residuals between the files were at a very, very low level. I didn't bother with Scotty's after initially ABXing the first one and then finding the level difference. If resampled with Sox, I could not discern those either.

Mostly agreed - better controls are required if people are going to use blind tests as evidence of whatever. If those controls are not in place then they are anecdotal reports, just the same as sighted listening reports - no better, no worse!

I most definitely do not agree with your irrational false dichotomy that blind tests without full controls are equivalent to anecdotal reports. There is no equivalency there. If done with flawed files like the Scotty files, they aren't anecdotal, they are misleading. In the case of Arny's files, they aren't flawed except that they point out barely audible artifacts of resampling. Being on the edge of inaudibility, long switching times could explain why most people failed ABX tests with them in the past. With Foobar that problem is handled. Still, the difference is barely discernible at all. I don't see a contradiction there. One isn't wrong to conclude that in most cases they are audibly equivalent, and that under the best conditions they are barely audible as different. Increasingly controlled blind tests are like filters, with increasingly finer resolution and ability to discriminate. That is why your false dichotomy is misleading.

As for my explanation of Arny's files being more audible: I think one can't escape that a couple of orders of magnitude or more people are set up to play back digital files through a computer than was the case when Arny first made his files. In the beginning, more than a decade ago, maybe a couple of hundred people listened to them, either over very poor quality sound cards or via CD-Rs on good systems. In the last 4 years, the number of those with a good quality method of playing directly from a computer over the best home systems, as well as high-quality headphone users, has literally exploded. Many thousands or tens of thousands may have listened to his files. Whatever percentage could have heard a difference, the total number is greatly magnified, and the chances the files are tried over a good, up-to-date playback chain are as well.
 
Just to add, esldude:
Amir also resampled the keys; it would be good if he contributed to the resampling side of this.
Regarding hi-res, like you I also think that most early tests, even back before 2008/09, are flawed in some way due to how hi-res was actually handled, whether in studios, software or DACs.
True native digital hi-res from studio to DAC output has really only been consistent in the last few years, and still there is probably a digital issue with around 20% to 25% of albums reviewed and measured by HiFi News each month - and that is ignoring that some DACs still downsample/decimate on the USB interface, as a quick hardware example.

Cheers
Orb
 

Mostly agreed - better controls are required if people are going to use blind tests as evidence of whatever. If those controls are not in place then they are anecdotal reports, just the same as sighted listening reports - no better, no worse!

I most definitely do not agree with your irrational false dichotomy that blind tests without full controls are equivalent to anecdotal reports. There is no equivalency there.
You mean they are better than sighted, anecdotal reports or worse?
If done with flawed files like the Scotty files, they aren't anecdotal, they are misleading.
Sure, & so would sighted, anecdotal reports of these flawed files have been misleading - so in your example, they are equivalent.
In the case of Arny's files, they aren't flawed except that they point out barely audible artifacts of resampling. Being on the edge of inaudibility, long switching times could explain why most people failed ABX tests with them in the past. With Foobar that problem is handled. Still, the difference is barely discernible at all. I don't see a contradiction there. One isn't wrong to conclude that in most cases they are audibly equivalent, and that under the best conditions they are barely audible as different. Increasingly controlled blind tests are like filters, with increasingly finer resolution and ability to discriminate.
Agreed - the more controls in place, the more sensitive the test.
That is why your false dichotomy is misleading.
The reason Arny gave for rejecting the positive results was that he reckoned IMD (from the high-res files) was the tell that let people discriminate between the files. This would have been more prevalent on older soundcards in the past & would have been pretty obviously audible even without Foobar ABXing (was that created in 2009?)

As for my explanation of Arny's files being more audible: I think one can't escape that a couple of orders of magnitude or more people are set up to play back digital files through a computer than was the case when Arny first made his files. In the beginning, more than a decade ago, maybe a couple of hundred people listened to them, either over very poor quality sound cards or via CD-Rs on good systems.
Yes, again - poor quality soundcards would have produced very noticeable IMD when playing the high-res files & been easily identified.
In the last 4 years, the number of those with a good quality method of playing directly from a computer over the best home systems, as well as high-quality headphone users, has literally exploded. Many thousands or tens of thousands may have listened to his files. Whatever percentage could have heard a difference, the total number is greatly magnified, and the chances the files are tried over a good, up-to-date playback chain are as well.
Tenuous, at best!
 
Esldude,
just quickly, and wondering if I have missed something: did you download the recent music files with regard to the 0.2 dB gain? From what I can tell in the thread, this was corrected:
http://www.avsforum.com/forum/91-au...-aix-high-resolution-audio-test-take-2-a.html

Regarding the keys, heck, good luck working out which version of those you have, as I am sure they have been redone at least 3 times; maybe ping Amir for the latest set, and for what he did regarding his own downsampling/decimation to validate the music and keys.
The music files still have a time offset problem, I think, but this only matters with certain listening approaches - not the way Amir and a few others listen and suggested listening, though for some others it would be an issue, as shown by Zilch.

Thanks
Orb
 

I did get the new AVS files in Scotty's thread. Other than the level fix everything else appears the same.

It is strange to me to be so hard-headed about using something that leaves additional artifacts from the resampling. Use Sox, use Audacity (which uses the Sox resampler), or use the Sox Foobar plugin for the conversion. The resulting file has zero timing issues, zero level issues, and an extremely low residual difference from the original below 20 kHz. Bam - one step, no muss, no fuss, and you have an excellent file to compare without the additional issues between the files.
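
A sketch of that workflow end to end (assuming the SoX command-line tool plus numpy/soundfile; filenames hypothetical): round-trip a 96 kHz file through 44.1 kHz with SoX, then subtract it from the original to inspect the residual, i.e. the "null". Note the raw difference also contains whatever lived above ~22 kHz, which downsampling removes by design, so a fairer in-band comparison would low-pass both files first.

```python
import subprocess
import numpy as np
import soundfile as sf  # assumed available

# Round-trip with SoX: 96 kHz -> 44.1 kHz -> back to 96 kHz
subprocess.run(["sox", "keys_96k.wav", "-r", "44100", "keys_44k.wav"], check=True)
subprocess.run(["sox", "keys_44k.wav", "-r", "96000", "keys_back_96k.wav"], check=True)

x, fs = sf.read("keys_96k.wav")
y, _ = sf.read("keys_back_96k.wav")
m = min(len(x), len(y))
residual = x[:m] - y[:m]  # per the post above, SoX keeps timing and level intact

peak_db = 20 * np.log10(np.max(np.abs(residual)) + 1e-12)
rms_db = 20 * np.log10(np.sqrt(np.mean(residual**2)) + 1e-12)
print(f"residual peak: {peak_db:.1f} dBFS, rms: {rms_db:.1f} dBFS")
```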

I never did ABX the level corrected files. Once I saw the first ones and the other issues I just used a resampler I knew to be good. And the files from that are indistinguishable by me from the original.

As for the jangling keys file, somewhere along the way Arny provided the originals with the appended IMD test tones. I also made my own jangling keys files. When resampled in Audacity both are audibly equivalent to me.

The issue Zilch posted about arises when switching between A and B by skipping, with the sound continuing (if I understood correctly). At least for myself, I have found playing complete but short segments of 5 seconds or less the most revealing.
 
They (including Scott) mention SoX, and it seems some long-term members did not want it used, while others tested with it *shrug*.
Those following, please appreciate there are 3 Scott threads: one for discussion, one for the first-attempt files, then one for the 2nd-attempt files.
Unfortunately the IMD test tones with Arny have appeared more than once as well, fun fun fun :)
Ok, not fun :D

Amir, mind providing some input as well, please? I think you did the test with resampled files, like esldude, that had no time offset/gain issues.
Thanks
Orb
 

Well, I don't know why the complaints against Sox. I know there are other good resamplers too, but with Sox you do the resampling and get time and level dead-on, and an excellent null below 20 kHz, with no effort. Whatever they did use does not do that. I think iZotope is recognized as perhaps the best; I don't have it, so I can't say what it does in use, other than that testing of it shows very small artifacts as well.
 
Amir, mind providing some input as well, please? I think you did the test with resampled files, like esldude, that had no time offset/gain issues.
Thanks
Orb
Arny's files never had sync problems. But yes, I did resample them using the latest version of Audition CC and I could still tell them apart.

============

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/24 20:27:41

File A: C:\Users\Amir\Music\Arnys Filter Test\keys jangling amir-converted 4416 2496.wav
File B: C:\Users\Amir\Music\Arnys Filter Test\keys jangling full band 2496.wav

20:27:41 : Test started.
20:28:07 : 00/01 100.0%
20:28:25 : 00/02 100.0%
20:28:55 : 01/03 87.5%
20:29:02 : 02/04 68.8%
20:29:12 : 03/05 50.0%
20:29:20 : 04/06 34.4%
20:29:27 : 05/07 22.7%
20:29:36 : 06/08 14.5%
20:29:44 : 07/09 9.0%
20:29:55 : 08/10 5.5%
20:30:00 : 09/11 3.3%
20:30:07 : 10/12 1.9%
20:30:16 : 11/13 1.1%
20:30:22 : 12/14 0.6%
20:30:29 : 13/15 0.4%
20:30:36 : 14/16 0.2%
20:30:41 : 15/17 0.1%
20:30:53 : 16/18 0.1%
20:31:03 : 17/19 0.0%
20:31:07 : Test finished.

----------
Total: 17/19 (0.0%)
 
Well, I don't know why the complaints against Sox. I know there are other good resamplers too, but with Sox you do the resampling and get time and level dead-on, and an excellent null below 20 kHz, with no effort. Whatever they did use does not do that. I think iZotope is recognized as perhaps the best; I don't have it, so I can't say what it does in use, other than that testing of it shows very small artifacts as well.
What they used is part of the workflow that professionals use to produce the music we get. Sox, iZotope, etc. are not commonly used by pros, so they are not part of the music chain we receive.

What is fascinating is that after so many years, it is only now, through this testing, that we realize the files are not level matched, time sync'ed, etc. How come we assumed they were and challenged people to tell them apart?

Even with the flaws, hardly anyone has posted ABX tests of the Scott/Mark files. This shows that trained listeners do exist and can outperform other listeners. Therefore, tests that did not use trained listeners are not reliable.
 
Arny's files never had sync problems. But yes, I did resample them using latest version of Audition CC and I could still tell them apart.
I should have emphasised better that the sync/time offset was specific to the Scott files - and thanks for the heads-up on what you did. How is dither handled in the downsampling/decimation with Audition CC, and did you ever get a chance to test this further?
Cheers
Orb
 
