Auditory Scene Analysis (ASA) explained

jkeny

I have mentioned this a number of times before & maybe even done a thread on it before but it bears repeating as its importance cannot be overestimated when it comes to understanding our connection to the sound of our playback systems.

I started a thread on ASA in 2016 on ASR (which Amir obviously didn't read any part of) & started with this post:

Yes, this hobby & the goal of audio reproduction is the creation of an illusion - an illusion that gives us enough audible cues to satisfy our ever-vigilant auditory processing. Much like we watch television, videos or movies which offer us enough believability to allow us to forget their limitations & become engaged with the content - this is where the emotional connection then begins. If there isn't enough believability we are constantly aware of the medium & its portrayal - when we are bored by the sound, it's a sure indication that this believability is missing or less convincing.

Let's forget about the psychophysical aspect of the ear mechanism & focus on Auditory Scene Analysis or ASA. This is the area of study, first started by Bregman in 1990 or so, which is concerned with how we make sense of the vibrations of the eardrums & create an auditory scene from these signals. Much like we create a visual scene from the impact of photons on the rod & cone cells in the eyes, we create an auditory scene from the two streams of electrical signals coming from the two ears.

Now, when you think about it, this is a highly complex & interesting problem that the brain has to try to solve - to continuously create a fully realistic, moving auditory scene that maps the auditory objects in that scene & follows their changes through time. In other words, close your eyes & listen - you will hear & be able to locate all the sonic objects around you, including the size of the room, etc. just from the electrical signals being generated in the ears. Think about it - this is the equivalent of sitting at a corner of a swimming pool & being able to use only the waves hitting this corner to sense how many people are in the pool, where they are & where they're moving to, what they're doing, how big the pool is, etc.

The idea of ASA seems to have its genesis in trying to answer the question that the "Cocktail Party Effect" gives rise to - how do we follow one conversation among all the other conversations & noise at a party? The audio signals from all sources are hitting the ear at the same time as the audio signal from the conversation, so how do we isolate & associate the signals that belong to just the followed conversation from among all the rest, i.e. how do we form an auditory object & follow it in the face of changing signals & changing surrounding auditory signals?

How the brain does this is being teased out in ASA & other areas of sound research. The auditory processing happens whether we are listening to the real world or to the signals from our speakers which are attempting to create an illusion of an audio performance or audio event.

We perceptually ascertain audio objects in what we hear through the processing that the brain performs on the signal. The perception of these audio objects occurs because we seem to cross correlate particular signal markers which we associate with that particular audio object - spatial location, timbre, temporal coherence, & amplitude all seem to play a role - let's call these some of our perceptual factors for identifying this object. So, the interplay & relationship between these factors are the rules or schema or models that ASA studies. If these rules are adhered to in the audio playback system then we have a believable illusion - the more the rules are diverged from, the less believable the illusion.
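
To make the "cross correlate" idea a bit more concrete, here is a minimal sketch of just one such cue: estimating the arrival-time difference between the two ear signals by cross-correlation. This is only an illustration of the general technique, not a claim about how the brain actually does it; the function name estimate_delay, the 48 kHz sample rate, the 500 Hz tone burst & the 12-sample delay are all made-up values for the example.

```python
import math


def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) at which `right` best lines up with `left`.

    A positive result means the right-ear signal arrives later than the left.
    Brute-force cross-correlation; fine for short illustrative signals.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += sample * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test signal: a decaying 500 Hz tone burst sampled at 48 kHz.
SAMPLE_RATE = 48_000
burst = [math.sin(2 * math.pi * 500 * n / SAMPLE_RATE) * math.exp(-n / 200)
         for n in range(480)]
pad = [0.0] * 40
true_delay = 12  # samples; an assumed value chosen for the example

left = pad + burst + pad
right = pad + [0.0] * true_delay + burst + pad[true_delay:]

lag = estimate_delay(left, right, max_lag=40)
print(f"estimated delay: {lag} samples ({1e6 * lag / SAMPLE_RATE:.0f} microseconds)")
```

The search range (max_lag) is kept shorter than one period of the tone so the peak is unambiguous; a real localisation model would combine a cue like this with level, timbre & other factors.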

Now, one thing about digital audio - because it is based on mathematics & is almost infinitely adjustable, it has so many new ways to diverge from these rules & introduce new audio anomalies that we have never heard in the real world - things like digital filter ringing spring to mind. When we encounter a new audio anomaly that we haven't met before we tend to be subconsciously confused as we have no biological model to fit it to & we are not consciously aware of what is wrong, just that we want to turn off the playback or are bored by the sound & our attention drifts. I suspect that this occurs more often than we would like to admit & may well be where the disagreements arise from - between those that intuitively (or explicitly) know this & those who believe that measurements tell us everything?

This part is something I wrote before which continues from the thoughts above (so forgive some of the repetition to what's above):

Yes, it's already been stated here but is worth repeating - what we hear is a construct of our brain processing. Fundamentally, there is not enough data in the signals that are picked up by the two ears to fully construct the auditory scene - we need to use all sorts of pattern-matching, extrapolation, experience of the behaviour of sounds in the real world (biological models of sound), sight, etc. to generate the fairly robust auditory scene that we continuously do.

One of the important points that comes from the research is that we are continually processing the auditory data & updating our best-guess auditory scene by decomposing & analysing the auditory signal stream & comparing it to already stored auditory models of the world.

People who interpret psychoacoustics as being the illusional part of hearing & what makes it untrustworthy are completely missing this fundamental point - psychoacoustics is what allows us to make sense of the jumble of pressure waves impinging on our eardrums. It's what allows us to pick out the auditory objects, such as the bassoon in the orchestra, & be able to follow its musical line through a performance or be able to switch to listening to the string section.

Stereo reproduction is itself a trick - a trick that uses some learned knowledge about psychoacoustics to present an acceptable illusion of a real auditory scene. However, not knowing the full rules/techniques that our brains use in psychoacoustics somewhat hampers this goal of realistic audio reproduction. As a result, we can find that small discoveries are stumbled upon which audibly improve matters in a small way but we have no clear explanation yet for how they are working at the psychoacoustic level.

Without this knowledge of psychoacoustic rules, we are stumbling around using unsophisticated measurements &, I believe, incorrect concepts about the limits of audibility. A lot of the improvements that I hear reported in audio are about increased realism, increased clarity, etc. - in other words they are no longer about frequency/amplitude improvements - they are improvements in other factors which our psychoacoustic rules are picking up on & we are perceiving as more realistic. Or, maybe they are small changes in freq/amplitude that currently are dismissed as inaudible but further knowledge about psychoacoustic workings may well reveal them to be audible when part of the dynamics of music & not when tested in a lab with simple tones?
 
Trying somehow to indulge you but all I am reading here is speculation. Others may have a different take and will participate. I am all ears but am lost. This is supposed to be about perception and it is no different from the photons hitting our eyeballs or chemicals eliciting a smell through electrical signals ... IOW perceptual, thus psychophysical ... Kind of hard to forget.
By all means do continue and find ways to show us how truly ignorant some are for not understanding this but ... eh ... I am too ignorant to fully grasp its extraordinary importance to digital music, which no one hears by the way, since our ears cannot, yet, make anything out of a purely digital signal, but I know you'll have the last word...

Now sitting and waiting for you to explain more about ASA to us.
 
Some of this might be termed "speculation"; I would say it's extrapolating from what is known, based on observation. Moving it from empirical observations to "established scientific fact" (as vague as that is in the area of psychoacoustics) is what I assume ASA is.
 
If the method is only observational and not based on anything objective then we fall into the gray area of "scientism" ... How is that different from what we, for example, see or smell or feel or emote on? And notice you use the term "assume"; I would suppose ... "suppose". ;)
 
Frantz, based on your inferences should we assume then that you will remove the following from your "signature line"? ..... ;)

"Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius -- and a lot of courage -- to move in the opposite direction."
— E. F. Schumacher
 
Frantz, thanks for showing an interest & I will try to explain as best I can anything you have questions about, if you want to ask specific questions.

It's difficult to write good explanations & convey the essence of a subject & if I haven't achieved this then please point out where I am vague or lacking in clarity & I will try to elucidate it.

All I would ask from you is that you genuinely try to understand what I write with an attitude of discovery rather than an attitude of dismissal & how to score points.

In essence, ASA tries to explain how we make sense of the signals that hit the eardrums - in other words it is the study of how we hear what we hear. This is already further advanced for visual perception than it is for auditory perception, but again Visual Scene Analysis is the study of how we see what we see - it isn't a moving image of the scene somewhere in our brain - it is an analysis of the nerve signals arriving in the brain via the optic nerve - it's the processing of different elements of the signal, in different parts of the brain, & the amalgamation of these various processing strands into a final thing we call the perception of a visual scene. When I say different elements, I mean elemental stuff - edges, for example, are one strand of processing - the signals are analysed & all signals that signify edges of objects are extracted & processed as a separate stream in a different part of the brain to the rest of the signal elements. The same happens for colour, the same for movement - this is a very simplistic explanation but it gets across the essence of what it's about.

Exactly the same occurs in auditory processing - separate strands of the signal are separated out & analysed in different parts of the brain & finally brought together to form what we call our auditory perception, i.e. what we hear - a moving auditory scene, just like we experience a moving visual scene.

All of this applies whether we are listening to sounds in the real world or to sounds emanating from our replay system.

Hope this helps some?
 
Yes, ASA is an area of scientific research & there are many scientific, peer reviewed papers published. Do you want me to link to any?

The field of study first started with Al Bregman in the 90's - his website at McGill University: http://webpages.mcgill.ca/staff/Group2/abregm1/web/
 
I didn't make myself clear.... What has that (ASA) got to do with Digital Audio? This stretch is all yours I suppose? Not part of the study itself? The ASA studies you linked us to? Care to show us a link as to its relevance (ASA's) to Digital Audio? And its shortcomings thereof?
 
Steve

I frankly don't see what part the removal of complexity plays in ASA. Does it remove complexity? From your vantage point, do you think ASA adds or removes complexity? Would you be able to tell me its relevance to digital audio?
Skepticism is not a sin. It is healthy. I have shown that I can change my stance when proven wrong... By all means do prove me wrong in my assertions here on the relevance of ASA to digital audio and I will. In the absence of that, we are left with speculation...
 
ASA is relevant to all audio & everything we 'hear' - sounds from the real world, from analogue or digital playback.

If I understand you correctly you want a peer reviewed paper which deals with ASA & digital audio reproduction & shows where there might be some shortcomings with digital audio which aren't part of analogue reproduction (and also shortcomings with analogue reproduction which aren't found in digital audio)?

It's not just digital audio that is reporting these types of improvements that ASA can explain - many reports of improvements in analogue audio are the same as the audible improvements reported in digital audio. What I was saying was that the type of reports being heard are more about the gestalt & believability of the playback, the illusion of realism, the solidity of the soundstage - all of these characteristics are explainable with ASA.

Now has anybody applied their research to this hobby or to audio replay in general? No - the focus of ASA is teasing out how the sense of hearing works, & studying audio playback systems vs ASA would be the least productive way of doing this & not even considered.

There was a paper which caused a bit of a stir in audiophile circles a while back, "Human hearing beats the Fourier uncertainty principle", which is explained when knowledge of ASA is considered.


Anyway, Frantz, if this paper you demand isn't produced then you consider all I say is a stretch & irrelevant? Fine, this thread will not satisfy you, then!
 
Frantz

I'm not a smart engineer like you to prove anyone wrong but you have been in and out of this thread on many occasions by your own volition. All I see John doing is asking skeptics to think outside the box. Speculation...... perhaps, but maybe not. I'm not a neuroscientist but then again neither are you. Beyond that I've been reading this with interest. I have nothing to contribute but I'm not sure you do either. Curiosity has always piqued my interest, Frantz. Without it we would likely be back in the dark ages. Yet I have no idea whether there is fact or fantasy in ASA. Yet I'm an interested reader, without casting aspersions or doubts.

When I had just graduated med school there was a young doctor who had some wild thoughts which were beyond mainstream medicine as we were taught. He was frowned on, shunned, laughed at and even sanctioned by the medical board. He kept doing what he did and challenged people to his way of thinking. Many years later he was proved correct. Is there an analogy in this story? That's up to you to judge and hence my comment re your signature line. If you believe what you put in your signature you might just think a little about either doing what John is asking or, if you can't do that, then that signature imo is contrary to your beliefs.
 
From a parallel thread where John indirectly references the book "Auditory Neuroscience: Making Sense Out of Sound", which, as best as I can figure out, he has not read:

[Book excerpt image not preserved]

So we learn that ASA is a concept but not fully embraced by the research community. Indeed the author goes on to explain how the test that John provided in hearing a tone within noise does not at all need ASA to be explained:

[Book excerpt image not preserved]

[....]

[Book excerpt images not preserved]

In other words, the basic science of psychoacoustics is relevant and powerfully explains a lot of what we hear.

That is not to say the concepts behind ASA don't have applicability. The cognitive part of our brain does get involved in interpreting sound. Here is the author again on the effect of vision on what we hear:

[Book excerpt image not preserved]

In case you have not seen this effect, here it is:

[Embedded video not preserved]

And of course the whole notion of placebo and hearing elasticity is all a function of the brain as opposed to the ear.

Cognitive aspects of our hearing do explain, for example, why we don't hear room reflections as "echoes" or why we can hear the signature of a speaker through the same reflections in any room.

What it cannot be used for is random assertions that this and that imagined distortion is an audible problem. Such claims need to be demonstrated first using controlled testing. Proper references provided. Measurements to back it, etc. Otherwise it is just random chatter in a forum which I ignore, this thread notwithstanding at the moment.
 
Steve

Not a problem with your approach. I for one am quite at ease with the fact that we don't know it all and that there are aspects of things that are not easily explainable by current knowledge. So I hear you (pun intended) and feel comfortable with your viewpoint. As long as it is admitted that this is speculative, then I am fine. I can live with it. Coloring it as scientific is what I object to.
Additionally, how can we move forward if all that we do is accept without discussing? Discussing means coming up with different and contrarian viewpoints. Then JKenny and others will be forced to come up with better ways to frame their points, to convince.

The Schumacher quote I feel strongly about and try to abide by, with (alas) too often spectacular failings. It is about the removal of complexity, not a negation of skepticism, of questioning and of discussing. Far from that.
 
It seems to me that the people who have promised to ignore this thread as random chatter within a forum are the ones with the most comments.

As I said, I am not smart enough to make any comments here about cause and effect, but frankly I don't consider any of the participants in this thread to be smart enough to comment with any degree of certainty about ASA either, and that includes IMO John himself. Speculation to some means curiosity for others, but it is still an unproved entity, hence the deliberate posturing by people who are unable to think beyond
 
Surely. The real problem is when the main cause of skepticism is either ignorance of fundamentals or myopia. Not our case fortunately ... :)

Could well be microstrip. In the absence of perfect knowledge we are left with .. again ... speculation ... The questioning in itself is healthy unless you just want us to accept anything and everything. You are not preaching that.. Are you? :)
 
The questioning in itself is healthy

Frantz, why is it that, based on your prior posts in this thread, I don't get that feeling from you? I think you have been rather dogmatic in just the opposite direction. I do agree that one cannot call this science without peer review. John says there have been peer reviewed papers.
 
I am out of this discussion, Steve; perhaps a PM later.

I don't see dogma in my posts. Just questioning and great skepticism.
 
No, I am not. But we must accept that this forum turns around a hobby that chose stereo reproduction as its center of interest. And concerning high-end stereo, systematic skepticism has never proved to be a good way of getting an excellent system.
 
Ok, I think I see what you are asking - why do I mention digital audio as a different case to analogue audio & what has ASA got to do with this? Bear with my explanation until the end.

How we analyse audio signals isn't fully formed at birth - the mechanism is there (ears, cochlea, auditory nerve, brain mechanisms, etc.) but the analysis has to be learned. Just the same as speech isn't fully formed at birth but has to be learned - the mechanisms are there but learning is involved.

We learn from the sounds in the real world. We build our analysis on the behaviour of these sounds. For instance we all now know that a big instrument will produce a series of harmonics resulting in a perception of deeper sound. We all now know that distant sounds have their HF attenuated. We all know that sound behaves in certain ways based on our experiences/examples of real world sounds. We didn't always know this, we learned it through constantly encountering the same experience of sound behaviour in the world & imprinting it in our analytic engine (brain).

We do the same with all our senses - we learn & imprint the examples encountered in the real world.

So when we first visited a hall of mirrors it was fascinating because we were seeing things in an unusual & funny way. We found them funny because we could understand how the distortion from the usual way of perceiving objects occurred - it wasn't confusing/disorientating.

Now if we couldn't understand the relationship between what we were perceiving & the image we are used to seeing, we might just be disoriented - for example, if this presented image was the result of digital processing. For instance, if what we saw was slightly pixelated, & we were encountering it for the first time, we might sense something is wrong but not know what, as we don't have a reference for it - we have never encountered it before in our experience of the real world.

Now back to audio - analogue audio can only be wrong in a limited way - it is constrained by the mechanical & electrical behaviour of the devices which it uses to record & replay the signal - microphones, mic amps, amps, tape machines, vinyl cutting machine, turntables, needle, arm, etc. The errors introduced in this process are constrained by physics.

My point about digital audio is that it is the result of mathematical manipulation & signal processing - it is not constrained by the parameters of the physical world - subtle errors which haven't been encountered before & we have no reference for can be encountered. We can sense something is wrong but yet not know what it is - we have no frame of reference.

Well we do have a frame of reference - it's how we have learned to analyse audio from birth & this is what ASA is about. With digital signal processing there are so many more ways in which the signal can be changed (purposely or in error) than were possible when we were relying on electro-mechanical means for recording/reproduction.

There, simple isn't it :) ?
 