Auditory Scene Analysis (ASA) explained

Status
Not open for further replies.
Frantz

I'm not a smart engineer like you to prove anyone wrong, but you have been in and out of this thread on many occasions of your own volition. All I see John doing is asking skeptics to think outside the box. Speculation...... perhaps, but maybe not. I'm not a neuroscientist, but then again neither are you. Beyond that, I've been reading this with interest. I have nothing to contribute, but I'm not sure you do either. Curiosity has always piqued my interest, Frantz. Without it we would likely be back in the dark ages. Yet I have no idea whether there is fact or fantasy in ASA. I'm simply an interested reader, casting neither aspersions nor doubts.

When I had just graduated med school there was a young doctor who had some wild thoughts which were beyond mainstream medicine as we were taught it. He was frowned on, shunned, laughed at and even sanctioned by the medical board. He kept doing what he did and challenged people to his way of thinking. Many years later he was proved correct. Is there an analogy in this story? That's up to you to judge, and hence my comment re your signature line. If you believe what you put in your signature you might just think a little about either doing what John is asking or, if you can't do that, then that signature imo is contrary to your beliefs.

Steve, ASA isn't fantasy - it's a well respected & active area of research & not some idea of mine

What I'm doing is relating how the findings of ASA explain the reports of realism, soundstage clicking into place, air around instruments, etc - it's all about how the brain's processing of the sound better matches the internal model of hearing that we all have - it really is that simple.

What is difficult is teasing out this model - it's not fully unraveled yet
 
(...) The Schumacher quote, I feel strongly about and try to abide by, with (alas) too often spectacular failings: It is about the removal of complexity, not a negation of skepticism, of questioning and of discussing. Far from that.

Curiously, the Schumacher quote, written by the well-known author of "Small is beautiful", can be considered an abetment, even an endorsement, of boutique high-end audio and the empirical developments of small manufacturers. :) The quote is not centered on removal of complexity at all, but on the small scale - the human scale. Anyway, Schumacher is an economist; I consider the quote meaningless in this thread...
 
From parallel thread where John indirectly references the book, "Auditory Neuroscience: Making Sense Out of Sound" which as best as I can figure out, he has not read:

...snip

So we learn that ASA is a concept but not one fully embraced by the research community. Indeed the author goes on to explain how the test that John provided, hearing a tone within noise, does not need ASA at all to be explained:
....snip
In other words, the basic science of psychoacoustics is relevant and powerfully explains a lot of what we hear.

That is not to say the concepts behind ASA don't have applicability. The cognitive part of our brain does get involved in interpreting sound. Here is the author again on the effect of vision on what we hear:

...snip

And of course the whole notion of placebo and hearing elasticity is all a function of the brain as opposed to ear.

Cognitive aspects of our hearing do explain, for example, why we don't hear room reflections as "echoes" or why we can hear the signature of a speaker through the same reflections in any room.

What it cannot be used for is random assertions that this and that imagined distortion is an audible problem.
You were doing so well up to this point. And yes, we use all our senses when perceiving something - mainly because any one sense doesn't provide enough information in its signal stream to come to a conclusive judgement about what's being sensed (remember we are perceiving the world every second we are awake so quick judgement is essential).

"Random assertions", "imagined distortions", "audible problem" - could you use any more of a loaded sentence? I'm not even going to dissect this - it's getting boring to hear your same hand-waving assertions over & over.
Such claims need to be demonstrated first using controlled testing. Proper references provided. Measurements to back it, etc. Otherwise it is just random chatter in a forum which I ignore, this thread notwithstanding at the moment.
Sure, Amir, keep telling yourself this if it is the comfort blanket that you need.
 
Steve

Not a problem with your approach. I for one am quite at ease with the fact that we don't know it all and that there are aspects of things that are not easily explainable by current knowledge. So I hear you (pun intended) and feel comfortable with your viewpoint. As long as it is admitted that this is speculative, then I am fine. I can live with it. Coloring it as scientific is what I object to.
Additionally, how can we move forward if all that we do is accept without discussing? Discussing means coming up with different and contrarian viewpoints. Then JKenny and others will be forced to come up with better ways to frame their points, to convince.

The Schumacher quote, I feel strongly about and try to abide by, with (alas) too often spectacular failings: It is about the removal of complexity, not a negation of skepticism, of questioning and of discussing. Far from that.

You know, I agree with you, Frantz - I find challenges help me to solidify my thinking on a topic & also to express it in a way that better communicates the concepts.

But only if the challenges are specific & logical & show that the person is actually engaged in trying to understand what's being discussed.

Hand waving & generalised objections are not of much value for moving a thread forward or for my honing of my thoughts
 
Could well be microstrip. In the absence of perfect knowledge we are left with .. again ... speculation ... The questioning in itself is healthy unless you just want us to accept anything and everything. You are not preaching that.. Are you? :)

Frantz, I'm not asking you to accept anything but I am asking you to engage in a meaningful way that shows you have an interest in the subject & have read & follow the posts.
 
Curiously, the Schumacher quote, written by the well-known author of "Small is beautiful", can be considered an abetment, even an endorsement, of boutique high-end audio and the empirical developments of small manufacturers. :) The quote is not centered on removal of complexity at all, but on the small scale - the human scale. Anyway, Schumacher is an economist; I consider the quote meaningless in this thread...

OK
 
Ok, I think I see what you are asking - why do I mention digital audio as a different case to analogue audio & what has ASA got to do with this? Bear with my explanation until the end

How we analyse audio signals isn't fully formed at birth - the mechanism is there (ears, cochlea, auditory nerve, brain mechanisms, etc.) but the analysis has to be learned. Just the same as speech isn't fully formed at birth but has to be learned - the mechanisms are there but learning is involved.

We learn from the sounds in the real world. We build our analysis on the behaviour of these sounds. For instance, we all now know that a big instrument will produce a series of harmonics resulting in a perception of deeper sound. We all now know that distant sounds have their HF attenuated. We all know that sound behaves in certain ways based on our experiences/examples of real world sounds. We didn't always know this; we learned it through constantly encountering the same experience of sound behaviour in the world & imprinting it in our analytic engine (brain).

We do the same with all our senses - we learn & imprint the examples encountered in the real world.

So when we first visited a hall of mirrors it was fascinating because we were seeing things in an unusual & funny way. We found them funny because we could understand how the distortion from the usual way of perceiving objects occurred - it wasn't confusing/disorientating.

Now if we couldn't understand the relationship between what we were perceiving & the image we are used to seeing, we might just be disoriented - for example, if the presented image was the result of digital processing. For instance, if what we saw was slightly pixelated, & we were encountering it for the first time, we might sense something is wrong but not know what, as we don't have a reference for it - we have never encountered it before in our experience of the real world.

Now back to audio - analogue audio can only be wrong in a limited way - it is constrained by the mechanical & electrical behaviour of the devices which it uses to record & replay the signal - microphones, mic amps, amps, tape machines, vinyl cutting machine, turntables, needle, arm, etc. The errors introduced in this process are constrained by physics

My point about digital audio is that it is the result of mathematical manipulation & signal processing - it is not constrained by the parameters of the physical world - subtle errors which haven't been encountered before & we have no reference for can be encountered. We can sense something is wrong but yet not know what it is - we have no frame of reference.

Well, we do have a frame of reference - it's how we have learned to analyse audio from birth & this is what ASA is about. With digital signal processing there are so many more ways in which the signal can be changed (purposely or in error) than were possible when we were relying on electro-mechanical means for recording/reproduction.

There, simple isn't it :) ?
Against my better judgement and I will live to regret it :D

John, what do you mean that Digital is not bound by the physical world? It is.
You don't hear Digital. We only hear analog, and what we hear has lower and upper limits, and digital won't change that.
Why is it that, in the absence of knowledge, most people cannot discern digitally manipulated sounds from analogue, even to the point that some claim (wrongly) to be able to derive the quality of a system from a YouTube clip, a rather crude digital construct when compared to the SOTA DACs and high-res we have here?

In the end, anything we hear when reproducing music in the here and now comes through electromechanical devices. I read somewhere in a high-end audio forum that an Edison cylinder had a special sound. We had one at home and several 78s and frankly ... :rolleyes: :rolleyes: :rolleyes: ... We are not yet at the point where sounds go directly to our brains through electrodes on our skin or through a purely electrical machine-to-human interface, and even then I doubt it would remain in digital form. At this point in time the interface remains electromechanical and analog. What we discuss here on WBF is an analog, electromechanical interface! Electromechanical devices. That is what our TT, DACs, cartridge, DAC, phono preamp, DAC, R2R, DAC, preamp, SET, amp, SS, and finally speakers :) contribute to: making soundwaves out of electricity, for our enjoyment or not :)

Another thing: you and perhaps most people may not have been disoriented or annoyed when visiting a hall of mirrors. OTOH, I have seen people panic in a hall of mirrors. So your experience is not a rule. Then again, we are speculating... We can continue, but I fail to see the relevance of ASA to Digital Audio which many here find perfectly acceptable and able to convey all the emotions they need from music, we even have some new converts. I say to them: Welcome!
 
Against my better judgement and I will live to regret it :D

John, what do you mean that Digital is not bound by the physical world? It is.
You don't hear Digital. We only hear analog, and what we hear has lower and upper limits, and digital won't change that.
What I meant was that prior to digital audio we were using electro-mechanical means of recording & reproduction - which are bound by the physical world. I thought I had explained this clearly?

Why is it that, in the absence of knowledge, most people cannot discern digitally manipulated sounds from analogue, even to the point that some claim (wrongly) to be able to derive the quality of a system from a YouTube clip, a rather crude digital construct when compared to the SOTA DACs and high-res we have here?

In the end, anything we hear when reproducing music in the here and now comes through electromechanical devices. I read somewhere in a high-end audio forum that an Edison cylinder had a special sound. We had one at home and several 78s and frankly ... :rolleyes: :rolleyes: :rolleyes: ... We are not yet at the point where sounds go directly to our brains through electrodes on our skin or through a purely electrical machine-to-human interface, and even then I doubt it would remain in digital form. At this point in time the interface remains electromechanical and analog. What we discuss here on WBF is an analog, electromechanical interface! Electromechanical devices. That is what our TT, DACs, cartridge, DAC, phono preamp, DAC, R2R, DAC, preamp, SET, amp, SS, and finally speakers :) contribute to: making soundwaves out of electricity, for our enjoyment or not :)
Hmmm. you are either purposefully pretending to not understand my posts or you really don't - I can't decide which yet?

Another thing: you and perhaps most people may not have been disoriented or annoyed when visiting a hall of mirrors. OTOH, I have seen people panic in a hall of mirrors. So your experience is not a rule. Then again, we are speculating... We can continue, but I fail to see the relevance of ASA to Digital Audio which many here find perfectly acceptable and able to convey all the emotions they need from music, we even have some new converts. I say to them: Welcome!
Right, I've given you the benefit of the doubt but decided you are not here to actually add anything other than disagreement with me, smoke & confusion - you certainly show no signs of listening to or understanding anything I posted.

I won't be replying to you, Frantz unless you show an intention of engaging in a meaningful way with the thread topic

Actually, I'll deal with the highlighted text, "I fail to see the relevance of ASA to Digital Audio". Is that not what I just tried to explain in a post in answer to this very same question of yours? So what's the point in coming back with exactly the same question as if I hadn't said anything? This is why I find your attitude here to be disingenuous - you don't wish to engage with what I said, you misunderstand it or don't take the time to read it & just post the same question again.
"which many here find perfectly acceptable and able to convey all the emotions they need from music"
What is it about this way of thinking, shown here & by Amir also - a binary way of looking at things, which I suspect is just a debating tactic? I already explained that people can find their system sounds great & couldn't be better, then they discover that with some addition it sounds even better, more realistic, etc. Does this mean that they found their system unacceptable prior to its improvement? No, it doesn't. So using the argument that "many here find perfectly acceptable and able to convey all the emotions they need from music" is meaningless as far as this is concerned.
 
I don't see any new converts at all Frantz. I see people interested in discussing an interesting topic. Does that make me a convert? Definitely not. Your welcome comment I saw as somewhat acrimonious

Steve

It bears re-reading my post; I will simply copy part of it here:
"... to Digital Audio which many here find perfectly acceptable and able to convey all the emotions they need from music, we even have some new converts."
I believe microstrip to be one. He can correct me. What is acrimonious about it?
My post was against my better judgement. I knew it the second I replied to John's post. Out for good!
 
Steve

It bears re-reading my post; I will simply copy part of it here. I believe microstrip to be one. He can correct me. What is acrimonious about it?
My post was against my better judgement. I knew it the second I replied to John's post. Out for good!
Funnily enough, almost every time you reply to a post of mine you say this & then proceed to show that all you are interested in is posting some sort of criticism of me - you never show any interest in the actual topic being discussed. It's usually a hit & run technique & frankly boringly transparent, Frantz.

Good bye & try to restrain yourself before your next reply to one of my posts & stop yourself in your usual attempts at denigrating me
 
Ok, one thing that is overlooked in this is that music composers have an inherent sense of how to use ASA to direct attention within the musical lines of a piece - this is without knowing ASA; they have just developed these techniques over time & practice.
Here's one paper which touches on this "The effects of rhythm and melody on auditory stream segregation"
You will have to join researchgate (free) to read the full paper

More on Music perception & a survey of the type of sounds used in perceptual testing. An interesting finding that the sound envelopes used in auditory research may well be skewing the results.
"SURVEYING THE TEMPORAL STRUCTURE OF SOUNDS USED IN MUSIC PERCEPTION"

"Recent work from our lab illustrates amplitude envelope's crucial role in both perceptual (Schutz, 2009) and cognitive (Schutz & Stefanucci, 2010) processing. Consequently, we surveyed the amplitude envelopes of sounds used in Music Perception, categorizing them as either flat (i.e., trapezoidal shape), percussive (aka "damped" or "decaying"), other, or undefined."

And finally, welcome to Spatio-Operational Spectral Synthesis, https://ccrma.stanford.edu/~mburtner/SOS.html - which is an exploration of Bregman's auditory streams as a means of being artistically creative!
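
For anyone who wants to picture the "flat" versus "percussive" envelope categories from the survey quoted above, here is a minimal sketch. It assumes a linear-ramp trapezoid for the flat case and a plain exponential decay for the percussive case; the function names, lengths and decay constant are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def flat_envelope(n_samples, ramp):
    """Trapezoid: linear rise over `ramp` samples, steady level, linear fall."""
    env = []
    for i in range(n_samples):
        rise = (i + 1) / ramp            # ramps up at the start
        fall = (n_samples - i) / ramp    # ramps down at the end
        env.append(min(1.0, rise, fall))
    return env

def percussive_envelope(n_samples, decay):
    """Immediate onset followed by exponential decay (assumed shape)."""
    return [math.exp(-i / decay) for i in range(n_samples)]

# Hypothetical parameters, chosen only to make the two shapes easy to compare.
flat = flat_envelope(1000, ramp=50)
perc = percussive_envelope(1000, decay=150.0)
```

The flat tone sits at full level for most of its length, while the percussive one starts decaying from the first sample, which is the distinction the survey is drawing.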
 
(...) I believe microstrip to be one. He can correct me. (...)

Surely I will. I would never refer to digital in the absolute terms you did. :) Anyway, I have been listening to digital with great pleasure since I owned the Forsell CD combo - probably the most "emotional" CD player I have owned - in the late 90's! But that is completely off the subject of this thread.
 
What's also good for me about questions & challenges on a thread like this is that it also makes me renew some of my reading & sometimes results in discovering some new reading, like this 2017 paper which looks very interesting as it deals with both auditory & visual scene analysis: "Auditory and visual scene analysis: an overview"

I think it's worthwhile to copy the first two paragraphs of the intro here even just for another voice & perspective:
Imagine you are walking on a big busy square. Cars are crossing, pedestrians are walking past and towards you, someone rings their bicycle bell to warn you that they want to pass, you hear people chatting, a taxi driver is shouting and the bell of the nearby school is indicating that school just finished. Meanwhile you note a beautiful coloured tree with its leaves turning orange, because autumn is setting in, and you start to think about your next holiday destination. Our brain is very well equipped to rapidly convert such a mixture of sensory inputs—both visual and auditory—into coherent scenes so as to perceive meaningful objects and guide navigation [1,2], and also to imagine visual and auditory scenes and distinguish them from ‘real’ scenes.

The task of analysing a mixture of sounds so as to arrive at percepts corresponding to the individual sound sources was termed ‘auditory scene analysis’ by Bregman [3]. The task is also known as the ‘cocktail party problem’ [4], which refers to the ability to follow one conversation when many people are talking at the same time. The auditory system has to determine whether a sequence of sounds all came from a single source, and should be perceived as a single ‘stream’ or whether there were multiple sources [5]. In the latter case, each sound in the sequence has to be allocated to its appropriate source and multiple streams should be heard. Similarly, in vision, the visual system has to partition a visual scene into one or more objects and a background, determining which elements in the scene ‘belong’ to which object or to the background. Visual scene analysis research was initially impelled by a compelling idea of Marr [6]. He postulated that the purpose of the visual system is to provide a representation of what is present in the outside world. Although the sensation of seeing complex scenes is seemingly effortless and occurs extremely rapidly, the sensory input is highly complex and dynamic. It takes only a few hundred milliseconds to activate a large cascade of different brain regions, each performing a different transformation of the sensory input [7]. The underlying neural mechanisms of these complex spatio-temporal processes pose major conceptual and methodological challenges for researchers in cognitive neuroscience [8,9].
 
John,
Thanks for the background on ASA. You have mentioned it a few times over the last year or so and it's good at last to get some understanding of what is proposed in this. The slide show was really interesting. I liked the linking to visual scene analysis (VSA) because this is where it kind of all hits a wall for me. The ASA theory itself has some great ideas about correlating sound with meaning and understanding. I don't believe that any of this however is an end within itself and while it does a nice job of packaging up these parcels of understanding that our processing of sound can provide it is still in itself for me kind of a limited understanding of perception.

I suggest this with no hubris, but rather because, as stated in Bregman's abstract, the theory itself does nothing to actually explain the essential states of perception. So correct me if I misunderstand, but this is a theory that is anything but an end in itself.

What we are talking about is not ASA but rather just sound perception. ASA is a theory but it is not the outcome. Auditory scene analysis itself strikes me as an incomplete idea that is perhaps a phase rather than a cycle, because human perception is not just a state of analysis. Surely sound perception is an element within human perception, which is an element within consciousness, which is part of the human experience. This is all too compartmentalised to be a destination in understanding. So we get a glimpse of one of the fragments by observing how audio data is packeted, along with its metadata, but no explanation of how any of this translates to meaning, and no complete or real sense of its absorption back into the whole of the process.

I think the ideas here are fascinating and important but in the end just such a small fragment of understanding within any more complete human understanding of perception. Bregman says in his video address that music is not natural sound but an artifact. I do wonder if the complex layering of natural sounds makes the resultant music in any way unnatural if the earliest of human nature is to communicate through music. The parts are natural, the instinct is natural; holistically, music itself is utterly natural.

I appreciate the scale of thinking in Bregman's thinking but it also strikes me as essentially incomplete. Not a destination in understanding through our experience of sound but rather a fantastic snapshot of how we parcel the sonic landscape up as a step towards ultimately digesting our sensory journey so it can manifest as meaning.
 
Thanks Tao
I hope I have not painted ASA as something more than it is - a theory of how auditory perception works at a deeper level than the psychophysical study of the mechanics of the ear. To me it represents the next layer of the onion in understanding the working of our perceptions.

It is not meant as a study of or explanation for consciousness - that is a far, far more complex question & one that will be extant for decades if not millennia.

You are correct ASA is not the outcome - the outcome is how we respond to, react to the sounds we are perceiving.

How I relate ASA to this hobby is that for me, it explains a lot of the rift between what is heard & what is currently measured - it suggests to me that our auditory perception is a pattern matching analytic technique which we use to make sense of the ever changing sounds from what surrounds us. That might be our playback systems just as much as it might be the bird singing outside my window, at the moment.

What it gave me an insight into (& continues to do) is that 1st order thinking in perception is mistaken - it's often the perceptual effects of some issue that we hear rather than the issue itself. For instance, auditory streams (in the sense of ASA) & their importance to what we are perceiving is a crucial understanding to take away from ASA. And it's not as simple as it first appears. You may think a stream is just us following the bassoon musical line in an orchestra, but what ASA is teasing out is how we are able to follow this musical line - what are the characteristics that we pick out from the nerve impulse signals which tell us to group just those particular elements together as the bassoon line (or the massed violins musical line). And it's not a passive analysis; we are constantly predicting what comes next based on our knowledge & experience of how these sounds behave in the world. If the nerve signals (bottom up) don't match our prediction (top down) we quickly formulate a new behaviour model & continue. So this top-down, bottom-up comparison is an ongoing & continual process & we are never in a state of complete certitude about what we are hearing - that's the way it works - its origins as an alert system are still evident & fully operational.

The implications of this for our hobby are that we are dealing with sound patterns & the closer the sound pattern matches our model of the real world the more realistic we will perceive the playback. So if the fade of a triangle is foreshortened we are aware of this - could be on the recording, could be a dampened triangle but if we hear the same recording on another playback system & it is not foreshortened, we need to consider this & decide which is correct.

If we find we are fatigued by one playback system & not by another, it may well be because our auditory perception is constantly undergoing resets in the prediction that I mentioned above. We are not consciously aware of these resets, they happen below our conscious mind but we may be experiencing the effects of such issues in the sound.

What I find ASA brings to my thinking on audio playback systems is the very concrete notion that it is auditory perception that is the final judge of a system & not simplistic measurements.

I have been promoting the idea for a long time now, that the test signals used for measurements (& indeed for auditory testing) are too simplistic & don't address the whole pattern matching strength that is at the core of auditory processing. I find that such testing is misinformed & misguided & that's also what ASA has opened my eyes to.

I believe there's enough in all of that to be happy with the narrow focus of ASA research & to leave the questions of consciousness for another millennium.
 
What's also good for me about questions & challenges on a thread like this is that it also makes me renew some of my reading & sometimes results in discovering some new reading, like this 2017 paper which looks very interesting as it deals with both auditory & visual scene analysis: "Auditory and visual scene analysis: an overview"

I think it's worthwhile to copy the first two paragraphs of the intro here even just for another voice & perspective:
Actually the more meaningful part comes in the article itself. Let's review that and see if we a) already know about it and b) agree with it.

[Image: excerpt from the paper]


This should be a fascinating insight to people who say they "trust their ears." The process, as correctly stated, is bi-directional. What you think influences what you think you are hearing. And more so, it can even change what you are hearing!

Let's remember the simple exercise I mentioned in another thread where you play the same file over and over again. And how doing so results in "hearing" differences in detail even though nothing has changed as far as what the ear is picking up.

Of course this describes the placebo effect where our expectation of difference in an outcome results in us concluding the same. The "goals of the individual" in hearing a difference is met. "Prior knowledge" that we have changed something so therefore the sound must be different, comes into play.

Does audio science already know this? Of course. This is why we perform controlled experiments where we attempt to remove the influence of these factors. We don't tell you if something has changed. And with it, we neuter the brain's influence over your hearing perception to conclude that something "must be different."

Is it important to know why such bias and variation exists? No. We have empirical data that has told us this for decades. It is why we have controls and rigor in our testing for audio differences.
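
As a minimal sketch of how a blind comparison of this kind is commonly scored, assuming an ABX-style test with a fixed number of trials (the post does not name a protocol, and the function name and the 12-of-16 figures below are illustrative only): the question is how likely it is to do at least that well by pure guessing.

```python
from math import comb

def guess_probability(correct, trials):
    """One-sided binomial tail: chance of getting at least `correct`
    answers right out of `trials` when every answer is a 50/50 guess."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

# Hypothetical result: a listener identifies X correctly in 12 of 16 trials.
p = guess_probability(12, 16)
print(f"12 of 16 correct by guessing: p = {p:.4f}")
```

With these assumed numbers the probability comes out a little under 4%, so a listener who is only guessing would rarely score that well; fewer correct answers quickly become indistinguishable from chance.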

The paper goes on:

[Images: three further excerpts from the paper]


All confirming what I just explained. Notice the last line: that we perceive differences even when nothing ("the environment") has changed.

In summary, audio science and what is investigated here are completely consistent. We can investigate the "why" in this type of research but the "what" is already known: that humans are full of preconceptions, multi-sensory inputs and willing imagination. If you want to make an audio assessment that is durable across listeners and represents what the device is doing, you need to follow what is already part and parcel of audio science. That is, reduce as much as you can the reverse connection between the brain and your ears. What is presented in this paper (and the book I referenced) simply reinforces what we know as a matter of science, but unfortunately choose to ignore in our audio hobby.
 
How I relate ASA to this hobby is that for me, it explains a lot of the rift between what is heard & what is currently measured - it suggests to me that our auditory perception is a pattern matching analytic technique which we use to make sense of the ever changing sounds from what surrounds us. That might be our playback systems just as much as it might be the bird singing outside my window, at the moment.
Except that (outside of room acoustics) this research in no way, shape or form supports your assumption. Indeed, all the research involves sounds that are measurable, and then we examine what is audible.

In acoustics where we ignore that we have two ears and proceed to use one microphone to measure things this does come true. But for upstream products, nothing remotely in the research touches on this whatsoever.

What it does say, as I have repeatedly quoted from your own references, is that what we think we "hear" is actually a product of multiple senses and a creative brain. That is the part you want to learn and, with it, go back to your experiments and correct their major flaws. Once you do that, you will see that measurements predict why there is or is not an audible difference. There is no conflict there.
 