Measurements & the stereo illusion

I can't go there with this sentence as written. While yes, we don't need our heads locked in a vice to recognise a soundstage, to evaluate one you do need your head locked.

OK, maybe the wording was imprecise. What I mean is that we seem to be able to sense a change in sound stage without our heads being fixed in a vice, because we use this "purposive motion" to change the angle of incidence of the wave or envelope at our two ears & analyse its new presentation, thus giving us a better fix on location. I don't know what processes are involved here, but it is a different process from how we perceive the location of a moving object by analysing the sound from that object.

My simple reasoning is that if this weren't the case we would be very confused by the real world & the fox would never get its prey :)
 
There is a lot to it. For example, the brain also uses our eyes to pinpoint a sound's location, as our ears can be fooled when used by themselves. I was at a small club, sitting at a front table, listening to a three- or four-piece unamplified band. I closed my eyes and localized the drum as being in one spot, but when I opened them I was off by quite a bit due to reflections from the room walls. So what? Well, it still sounded OK and there was still an image; it just happened not to be physically precise. In nearly all recorded stereo, the soundstage is made up for you anyway. Even if the components and the room skew the soundstage (and they do), and a different component skews the stage a bit differently, what has been lost? It's artificial to start with. Even a modest system can throw a soundstage. In fact, IMO a single-driver speaker pair excels at it, because the speakers, along with the room, are the biggest problem, not so much the playback electronics, but yeah, they are part of the story.

Yes, but I'm not talking about exactly locating a particular instrument in a soundfield & then locating it precisely in a slightly different position in a new soundfield. I'm talking more about the holistic impression of the soundfield changing perceptibly along the spectrum from diffuse to solid (a probable reason being that the locations of elements of the sound stage have become more precise). The more solid sound stage is the result of better timing & probably better level reproduction (as they often go hand in hand) - a lower noise floor helps too, I suspect.

Yes, sound stages artificially created in the recording process are probably more unreal & lacking in natural cues, so sound stage issues may be more difficult to detect with them, while with those created by good microphone techniques & recording the issues may well be easier to detect, but that is really just a distraction from the point. I believe we can perceive a change in sound stage, such as a deeper, more 3D-like stage, with better replay systems. I know room, speakers & electronics all come into play, but the measurements just show timing differences in the source electronics to start with & I'm focussing on these as the only measurements we currently have.
 
We seem to be able to do the job of hearing the illusion of sound stage from stereo given reasonable reproduction & room. So if your premise were correct we would not be able to do this except in very limited, controlled circumstances, yet we regularly hear sound stage produced by our replay systems despite less than optimal conditions, no?


Hello jkeny

You are looking for an audible change with a time variation/delay of 5x10^-6 s. In a real room you are looking at milliseconds, 10^-3 s. There is a big difference between the two. I wouldn't be surprised if 5x10^-6 s was less time than it takes for the sound to propagate from the voice coil of a transducer, in say a large woofer as an example, to the edge of the cone. If you look at the impulse response of speakers, they typically measure 1 to 3 or so milliseconds.

Look at the Precedence/Haas effect experiment using a stereo pair of loudspeakers. Using equal sound levels, the amount of delay added to shift the image was between .6 and 1 milliseconds. That would shift the image to the hard left or right from the phantom center image. The sound would shift away from the delayed loudspeaker. Obviously the shift started before the total time of .6-1 millisecond, but look at the time magnitude in the experiment vs what you are proposing. You are looking for a noticeable shift with a delay of 5x10^-6 s?? To say that there is a significant difference in the delay times is an understatement.

Rob:)
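For readers who want to put a number on the gap between the two time scales in the post above, here is a minimal sketch. It only compares the figures quoted in the thread (5 microseconds versus 0.6 to 1 millisecond); the constant and function names are made up for illustration and nothing here says which figure matters for audibility.

```python
import math

JND_ICTD_S = 5e-6                # 5 microseconds, the figure under discussion
HAAS_SHIFT_S = (0.6e-3, 1.0e-3)  # 0.6 to 1 ms, the range quoted for a full image shift


def delay_ratio(larger_s: float, smaller_s: float) -> float:
    """How many times larger one delay is than the other."""
    return larger_s / smaller_s


def orders_of_magnitude(larger_s: float, smaller_s: float) -> float:
    """The same comparison expressed in powers of ten."""
    return math.log10(delay_ratio(larger_s, smaller_s))


for shift_s in HAAS_SHIFT_S:
    print(
        f"{shift_s * 1e3:.1f} ms vs 5 us: "
        f"{delay_ratio(shift_s, JND_ICTD_S):.0f}x, "
        f"{orders_of_magnitude(shift_s, JND_ICTD_S):.2f} orders of magnitude"
    )
```

On these numbers the gap works out to roughly 120 to 200 times, a little over two orders of magnitude.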
 
Hello jkeny

You are looking for an audible change with a time variation/delay of 5x10^-6 s. In a real room you are looking at milliseconds, 10^-3 s.
Do you mean that it takes milliseconds for the sound to reach our ears? So what?
There is a big difference between the two.
Are you saying that because it takes milliseconds for the sound to reach us that we won't notice a change in microseconds between the L & right sound reaching us? If so then you disagree with JJ & the limits of ICTD
I wouldn't be surprised if 5x10^-6 s was less time than it takes for the sound to propagate from the voice coil of a transducer, in say a large woofer as an example, to the edge of the cone. If you look at the impulse response of speakers, they typically measure 1 to 3 or so milliseconds.
Again, I think you have the wrong end of the stick - we are talking about ICTD which involves two ears, by definition, not about sensing a difference of 5 microseconds in the sound reaching ONE ear!

Look at the Precedence/Haas effect experiment using a stereo pair of loudspeakers. Using equal sound levels, the amount of delay added to shift the image was between .6 and 1 milliseconds. That would shift the image to the hard left or right from the phantom center image. The sound would shift away from the delayed loudspeaker. Obviously the shift started before the total time of .6-1 millisecond, but look at the time magnitude in the experiment vs what you are proposing. You are looking for a noticeable shift with a delay of 5x10^-6 s?? To say that there is a significant difference in the delay times is an understatement.

Rob:)
But JJ already stated that he stood over 5 microseconds as a just noticeable ICTD difference, so I'm not sure who you are disagreeing with, me or JJ?
 
So if we apply some maths (& correct me if I'm wrong) - the speed of sound in air is about 343 metres/s, therefore in 1 microsecond it travels 0.3cm.
I think you are off by a factor of 10. 343*100/1000000 = .03 cm. That is one third of a millimeter. I suspect our current speakers move back and forth that much already :).
 
Yes, thanks, I knew I would mess up somewhere. But multiply this figure by 10 for the just audible ICTD of 10 microseconds & we get to 0.3 cm - do you think they move that much? :)
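The arithmetic in this exchange is easy to check. A small sketch, assuming the 343 m/s figure used above (the function name is just for illustration), converting an interchannel delay into the distance sound travels in that time:

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # figure used in the thread (air, roughly room temperature)


def path_length_cm(delay_s: float, speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Distance sound travels during delay_s, in centimetres."""
    return speed_m_per_s * delay_s * 100.0


# Delays mentioned in the thread: 1, 5 and 10 microseconds, plus the 0.6 to 1 ms range.
for label, delay_s in [
    ("1 us", 1e-6),
    ("5 us", 5e-6),
    ("10 us", 10e-6),
    ("0.6 ms", 0.6e-3),
    ("1 ms", 1e-3),
]:
    print(f"{label:>7}: {path_length_cm(delay_s):.4f} cm")
```

This agrees with the correction: about 0.034 cm (a third of a millimetre) per microsecond, and about 0.34 cm for 10 microseconds.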
 
Are you saying that because it takes milliseconds for the sound to reach us that we won't notice a change in microseconds between the L & right sound reaching us? If so then you disagree with JJ & the limits of ICTD

Hello jkeny

You are talking about Inter Channel Time Difference, yes?

inter-channel level & timing (ICLD & ICTD)

Did you look up the Precedence Effect/Haas, that's what I was talking about.

Again, I think you have the wrong end of the stick - we are talking about ICTD which involves two ears, by definition, not about sensing a difference of 5 microseconds in the sound reaching ONE ear!

How did you get "sound reaching ONE ear" out of the Precedence Effect/Haas experiment where you use a stereo pair of speakers and use delay in one channel to shift the phantom image??

When they did the experiment for the Haas effect they used a stereo pair of speakers, looking to see how much delay in one channel would shift the phantom image. The actual numbers, .6-1 millisecond, are roughly two orders of magnitude (over a hundred times) greater than what you propose.

If you want to see, why don't you just repeat the original experiment using your numbers in a real room, not using headphones in some contrived experiment? 5 microseconds may be the correct number under experimental conditions using headphones, but not necessarily what we can hear in our rooms at home listening to our systems.

But JJ already stated that he stood over 5 microseconds as a just noticeable ICTD difference, so I'm not sure who you are disagreeing with, me or JJ?

So maybe I disagree with both of you. JJ in the sense that I don't think a 5 microsecond time shift between channels will noticeably shift the image here in the real world, listening to our systems at home. There is way too much going on in the room in real time to hear it.

If it is not audible under the same conditions we use to listen to music then it doesn't matter.

Rob:)
 
The 5 microsecond number is for good headphones in a quiet room with trained listeners. Your mileage may vary.
 
Hello JJ

Thanks for the clarification.

Rob:)
 
Hello jkeny

Did you look up the Precedence Effect/Haas, that's what I was talking about.

How did you get "sound reaching ONE ear" out of the Precedence Effect/Haas experiment where you use a stereo pair of speakers and use delay in one channel to shift the phantom image??

When they did the experiment for the Haas effect they used a stereo pair of speakers, looking to see how much delay in one channel would shift the phantom image. The actual numbers, .6-1 millisecond, are roughly two orders of magnitude (over a hundred times) greater than what you propose.
The link I gave already demonstrates that 0.22 milliseconds will audibly shift the image. It may not be the minimum timing change that will be noticeable. But it's also more complicated than this, as the recording of an audio event & its replay will involve both time & level changes between the speakers to create the stereo illusion. What I'm asking about is what shifts in these characteristics have been investigated in relation to sound stage improvements. JJ has stated that 5 microseconds is the just audible timing shift in ICTD & 0.2dB in ICLD. I presume these were measured independently? When both a timing shift & a level shift are combined, what do the values change to then?
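To put the ICTD and ICLD figures quoted above on a common footing, here is a rough sketch that converts a level difference in dB to an amplitude ratio and a time difference to a number of samples. The 44.1 kHz sample rate is an assumption (it is not mentioned in the thread), the names are illustrative, and the sketch does not answer the question of how the two thresholds interact when combined.

```python
def db_to_amplitude_ratio(level_db: float) -> float:
    """Amplitude ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_db / 20.0)


def delay_in_samples(delay_s: float, sample_rate_hz: float) -> float:
    """Length of a time delay expressed in (possibly fractional) samples."""
    return delay_s * sample_rate_hz


SAMPLE_RATE_HZ = 44_100.0  # assumption: CD sample rate, not stated in the thread

ICLD_DB = 0.2           # just audible level difference quoted from JJ
ICTD_S = 5e-6           # just audible time difference quoted from JJ
DEMO_SHIFT_S = 0.22e-3  # delay from the linked demonstration

print(f"{ICLD_DB} dB is an amplitude ratio of {db_to_amplitude_ratio(ICLD_DB):.4f}")
print(f"5 us is {delay_in_samples(ICTD_S, SAMPLE_RATE_HZ):.4f} samples")
print(f"0.22 ms is {delay_in_samples(DEMO_SHIFT_S, SAMPLE_RATE_HZ):.3f} samples")
```

At that assumed rate, 5 microseconds is well under one sample period, so applying such a shift in software would need a fractional-delay (interpolating) filter rather than a whole-sample offset.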

Is speaker listening going to be less sensitive in this area? I don't know if this is necessarily the case - yes room reflections, etc come into play but HRTF come into play with headphones. Also sound stage is a much more realistic illusion with speakers & I would contend that it's easier to notice a change in a realistic sound stage.

If you want to see, why don't you just repeat the original experiment using your numbers in a real room, not using headphones in some contrived experiment? 5 microseconds may be the correct number under experimental conditions using headphones, but not necessarily what we can hear in our rooms at home listening to our systems.
Yes but see above comments


So maybe I disagree with both of you. JJ in the sense that I don't think a 5 microsecond time shift between channels will noticeably shift the image here in the real world, listening to our systems at home. There is way too much going on in the room in real time to hear it.

If it is not audible under the same conditions we use to listen to music then it doesn't matter.

Rob:)

Maybe you are correct & maybe not?
 
Hello jkeny

The link I gave already demonstrates that 0.22 milliseconds will audibly shift the image.

Thanks, it was fun to try. Thanks for posting it.

When both a timing shift & level shift are combined what do the values change to then?

That certainly would be interesting. I have been using Toole's book as a reference; I will take a look to see if there are any studies referenced.

Maybe you are correct & maybe not?

That's right and there is only one way to find out. Unless of course someone has already done it and it's buried in a journal somewhere.

Rob
 
I just found this guy's page while looking at "Position of Phantom Source" & he has a downloadable program: http://www.ohl.to/about-audio/audio-softwares/pops-position-of-phantom-source

He also references some papers http://hauptmikrofon.de/HW/Wittek_thesis_201207.pdf "Perceptual differences between wavefield synthesis and stereophony" 2007
 
Great find John! This is more or less what I've been looking for!
 
In a quick read of that Ph.D thesis (& it's well worth reading) which deals with comparing a single line array of speakers (Wave Field Synthesis, WFS) with two speaker Stereophonic & the various aspects of the reproduced sound space (particularly with respect to the creation of a realistic acoustic space), I have extracted a couple of quotes (of course I need to read the whole paper again in depth to evaluate all that is in there but these extracts leapt out at me as relevant to the discussion on this thread)

First, the perception models & processing mechanisms:
"Two different approaches to explaining stereophonic perception were introduced, more precisely the theory of ‘summing localisation’ (e.g. Blauert, 1997 and the ‘association model’ by Theile, 1980). The summing localisation theory assumes a physical synthesis of the loudspeaker signals in order to create a substitute source that physically resembles a real source regarding the essential localisation cues."
According to the results of the experiments in that paper, the 'Summing Localisation' model breaks down in two areas:
- " the missing explanation for the suppression of perception of comb filtering in stereophonic hearing."
- "the perceived phantom source direction, in particular in the case of interchannel time differences, cannot be predicted nor explained sufficiently by summing localisation"

"The association model can offer an explanation for these phenomena, because it assumes the presence of two different processing mechanisms. The first stage is able to separately locate the individual loudspeaker signals by a comparison of the ear signals with a known pattern. The second stage will fuse their coherent signals after a location dependent inverse filtering. Hence, the ear signals are not directly evaluated for localisation and sound colour perception."

The idea that room reflections might somehow invalidate or lessen the whole idea of small interchannel differences being a source of sound stage (& in this case sound colour or timbre) seems to have been addressed already in some papers - the concept being put forward that reflections help to decolour the sound, i.e. make it more accurately represent the timbre of the sound. "The hypothesis of a binaural decolouration was mentioned in the literature in the context of early reflections (Salomons, 1995; Brüggen, 2001a, 2001b). An investigation incorporating stereophonic reproduction in that context could give rise to similar results regarding the binaural advantage in sound colour perception." (Sound colour & localisation seem to be strongly related to one another.) "Further investigations that may have to balance different alternatives should bear in mind the general priority which is given to sound colour in spatial perception (Rumsey et al., 2005)."
"As discussed in this thesis, in principle no disadvantages can be identified in stereophony for the creation of a spatial sound field exhibiting accurate depth and distance as well as accurate properties of directional imaging and sound colour reproduction"

Also from that paper - pages 42 to 48 discuss ILD & ITD and how close a stereo signal panned with one or the other comes to a real world sound stage. Then on page 47: "For frequencies above ca. 3 kHz, the ITDs are congruent with the interchannel time differences due to the influence of head shadowing which prevents the summing of the loudspeaker signals. This means that time-panning (Δtmax ≈ 1 ms) can also create high frequency ITDs which are larger than in reality (Δtmax ~ 600 µs). In the case of combined time- and level panning, a smaller interchannel time difference is created, and therefore, a close to natural high-frequency ITD can be produced"
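As a sanity check on the "~ 600 µs in reality" figure in that quote, here is a minimal sketch using Woodworth's spherical-head approximation for the natural interaural time difference, ITD = (a/c)(θ + sin θ). This formula is not taken from the thesis, and the 8.75 cm head radius is an assumed typical value; the function name is illustrative.

```python
import math


def woodworth_itd_s(
    azimuth_deg: float,
    head_radius_m: float = 0.0875,  # assumption: typical adult head radius
    speed_m_per_s: float = 343.0,   # speed of sound figure used in the thread
) -> float:
    """Interaural time difference for a distant source, rigid spherical head.

    Valid for azimuths from 0 (straight ahead) to 90 degrees (directly to one side).
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_m_per_s) * (theta + math.sin(theta))


for azimuth in (0, 30, 60, 90):
    print(f"{azimuth:>2} deg: {woodworth_itd_s(azimuth) * 1e6:.0f} us")
```

With these assumptions the largest natural ITD comes out at around 650 microseconds, the same order as the figure quoted, and well short of the 1 ms that pure time-panning can produce.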
 
Here's a paper that might help Tim (Ponk) as he seems to have an issue with how people describe what they hear in this hobby - seems it has always been a problem "EVALUATING SPATIAL ATTRIBUTES OF REPRODUCED AUDIO EVENTS USING A GRAPHICAL ASSESSMENT LANGUAGE – UNDERSTANDING DIFFERENCES IN LISTENER DEPICTIONS" http://epubs.surrey.ac.uk/516/1/fulltext.pdf

Of course, if the hearing of such spatial attributes is denied as being "real" then I guess no language or definition of terms is necessary or will suffice.
Perhaps reading these & other papers may reduce the usual derision & snide remarks that are seen on this forum when anyone sticks their head above the parapet & tries to describe what they hear.

As I said before in this thread to Steve, I was admonished in the past by him for my "Style" of posting & yet he allows these types of derisory posts without comment - is this a balanced approach?
 

Really, John, the only time I have a problem with the way audiophiles describe what they hear is when they seem to be purposely avoiding using terms that are well-established, well-studied, measurable and verifiable and, instead, substituting fuzzy, often misappropriated words that could mean almost anything to claim performance breakthroughs or superior results. And they almost always make, or at least strongly imply, that claim.

If you experience a huge expansion of the sound stage, that makes your music more musical, more real, that brings the instruments and voices into your listening room so palpably that you can almost taste them....and you haven't changed, or even moved, the speakers, treated the room, changed a grossly underpowered amp for one up to the task, mentioned the recording or referred to a single measurable performance parameter that has changed; if instead, what you've done to achieve all that unmeasurable drama is install a new digital player or change the power supply to your already well-isolated DAC, yes, I'm going to ask for something to back that up. I might even have a little fun with it.

Tell me you think something changed and it sounds good to you, but honestly, you're not really sure of what it is because you can't seem to measure it? I'll leave you alone.

Derision? I guess that depends on where you sit. From here it looks like a sense of humor and a grip on reality.

Tim
 
Easy boys....errrr....men. Let's stay clear of the green ink shall we?
 
....

Tell me you think something changed and it sounds good to you, but honestly, you're not really sure of what it is because you can't seem to measure it?

Tim
Here's the problem in a nutshell that seems to plague all these discussions - I describe what is heard as a more solid, 3 dimensional sound stage (& others corroborate this same listening experience) & yet only a measurement is considered valid. That's why I said it would be good if the people who have a tendency to do this read the papers linked, to get some idea of what is involved in making the measurements they are demanding as proof. Look at the test methodologies in that Ph.D thesis regarding sound localisation & ask yourself, do you really expect people engaged in this hobby to go to these lengths? What would happen if they didn't? They would produce half-arsed tests which showed nothing. Similarly, settling for half-arsed DBT tests (as is often suggested in these discussions) is more of a fantasy posing as science (pseudo-science) than the evidence coming from the correlated descriptions of many people using many different systems (of course mass delusion/suggestibility usually gets mentioned to explain this).

A good point was recently raised on another forum - how many valid, formal DBT results have been found wanting after longer-term usage by a larger body of users - this can't be mass-delusion, can it?
 
Let me keep this impersonal, John.

Measurements are not the only thing that is valid. People's opinions are valid. And when people hear things that don't show up in any existing metrics, if they tell me what they think they hear and why they like it, I'm cool. But very often (like almost always), they seem to be compelled to describe what they hear as superior, sometimes even in quasi-technical terms. And when I don't hear it and ask if they've measured any changes that might point to this evasive reality I don't hear, they typically question the resolution of my system and the quality of my hearing.

That, I have a problem with.

Your opinion is that...as an example - Amarra sounds better than iTunes? Good. Amarra is clearly, dramatically, obviously superior to iTunes and if I can't hear that it's my problem, in spite of the fact that my position is the one that is supported by the theory, the measurements and simple logic? Bad. Especially if you're the guy selling Amarra.

It's not hard.

it would be good if the people who had a tendency to do this, read the papers linked to get some idea of what is involved in making the measurements they are demanding as proof. Look at the test methodologies in that Ph.D paper regarding sound localisation & ask yourself, do you really expect people engaged in this hobby to go to these lengths?

No, but if electronic components play any part in this "more solid, more 3 dimensional sound stage," A) I would expect there to be measurable changes in the output of those components if they are creating such a dramatic change, B) the measurements in that study you linked would be pretty useless in differentiating the component's impact from everything else involved anyway, and C) while I certainly do not expect people in the hobby to go to such lengths, or care about much more than having fun, I expect companies in the industry to substantiate their claims. And no, that's not personal either. I don't think you've made any claims for your products in this thread.

Tim
 
Yes, I have tried to keep it impersonal, Tim. The point I made in the last post was about being asked for measurements to "validate" my listening results - really nothing about you/your system or what you are hearing/not hearing. So let's leave your personal listening aside.

Your opinion is that...as an example - Amarra sounds better than iTunes? Good. Amarra is clearly, dramatically, obviously superior to iTunes and if I can't hear that it's my problem, in spite of the fact that my position is the one that is supported by the theory, the measurements and simple logic? Bad. Especially if you're the guy selling Amarra.

It's not hard.

Tim
Let's leave the selling issue out of it too - it has nothing to do with this & could be deemed personal.

I borrowed a Mac Air not so long ago & while I had it I downloaded, Fidelia, Amarra & Audirvana.
I could hear differences between these 3 programs: Audirvana sounded too detailed & typically digital; Fidelia sounded too soft & actually missed detail; Amarra was in the middle & I preferred it. I wasn't spoken to by any vendor of these programs, don't sell them myself or really have any interest in the Mac or its software - I just decided to do the test. Was it blind? No! Could it be the result of bias? I had no obvious bias one way or the other, & to ensure I eliminated ALL POSSIBLE biases that might influence the results would require a formally organised DBT, not just closing my eyes.

I report what I hear, not what theory tells me to hear (if you read that paper you would see that theory is really not sufficient to explain what is heard - that's one of the reasons why I suggested you read it). Even if you just read the bits I quoted you would realise this! So calling on theory as a defence for your position is mistaken.
 
