Wrong, because there’s a problem. Instruments are measuring physical attributes... frequency, voltage, impedance, noise, distortion, etc. Our ears don’t measure anything. They are simply detectors.
This is something of a red herring, because the disagreement is largely semantic. If by "our ears" you mean "our hearing", then our hearing mechanism certainly "measures" the quantity and characteristics of sound. Yes, our brain is involved, of course, but it's also involved in interpreting measurements from the devices we create.
Do instruments have anything like these algorithms?
Yes. Some do. Complex software is now part and parcel of many instruments used to analyze sound.
But, again, instruments aren't detached out there in the world operating on their own; the point is we are part of the process - we process the information from instruments.
How? By finding correlations between what the instruments detect, and what we experience.
Take a compressor. You can dial the values up and down while playing a piece of music, observe how you perceive the change in sound quality, and then understand the relation between the instrument's (the compressor's) values and what you hear. This isn't magic.
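To make the compressor point concrete, here is a minimal sketch of the static input/output curve of a hard-knee downward compressor. The function names and the default threshold and ratio are my own illustrative choices, not taken from any particular unit or plug-in, and real compressors add attack/release behaviour that this ignores.

```python
def compressor_output_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static curve of a hard-knee downward compressor (all levels in dB).

    threshold_db and ratio are hypothetical defaults chosen for illustration.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1")
    if level_db <= threshold_db:
        # Below the threshold the signal passes through unchanged.
        return level_db
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio


def gain_reduction_db(level_db, threshold_db=-20.0, ratio=4.0):
    """How many dB the compressor takes off at a given input level."""
    return level_db - compressor_output_db(level_db, threshold_db, ratio)
```

With these assumed settings, a peak at -8 dB comes out at -17 dB, i.e. 9 dB of gain reduction. That number is what you are changing when you turn the knobs, and it is the thing you learn to correlate with what you hear.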
Sometimes we can do this successfully... for coloration, frequency accuracy would suffice, harshness could perhaps be derived from IMD or jitter values, but what of the multitude of other qualities? How do we measure the ability to portray rhythmic drive, timing, soundstage or emotion? We can’t.
You have acknowledged we can measure what we hear in some instances, but now seek to raise examples that "can't" be measured. I think your examples are problematic in that they are somewhat mushy to begin with, and mash together various subjects. Now, I'm not saying that everything you just described HAS already been demarcated scientifically. But that's different from your apparent claim that none of it has been correlated with measurements, or that it CAN'T be measured.
It appears you are making the leap from
"I don't know how these things can be measured" to presuming therefore "No One Knows How To Measure These Things."
Take soundstage. In fact, there is quite a lot known, in technical terms, about how a soundstage is created. The size and acoustic characteristics of a hall are measurable, as are the sound characteristics of, say, a symphony placed somewhere specifically within that hall.
Engineers choose microphones for known (measurable) characteristics such as pickup patterns, frequency response, etc., which together with placement (also measurable) help them achieve the soundstage they wish to capture - or manipulate. The same goes for mixing. Pan something more to the left, and that's where it will appear in the soundstage. Decrease its amplitude and add certain types of processing, e.g. reverb, and you can push it further into the apparent distance. None of this is done supernaturally - it is done by altering technical values that are measurable.
I manipulate sound and soundscapes all day long (right now I'm creating a scene that takes place in a large interior train station in the early 1900s). I'm manipulating the scale of the sound, placement of sources, etc. just as I want via various technical manipulations, including reverb plug-ins, volume changes, frequency alterations, stereo widener plug-ins, you name it. Again...not magic...if the values I'm changing to achieve this weren't measurable in the first place, this technology wouldn't exist!
The same goes for surround sound: its apparent scale and imaging clearly follow measurable and predictable parameters in our perception. Otherwise all the various sound-enhancing modes in surround receivers, or even Dolby Atmos, Auro, etc., couldn't have been designed in the first place!
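As a rough illustration of the panning point, here is a sketch of a constant-power (sine/cosine) pan law, one common way a pan control maps to left/right channel gains. The function names and the -1..+1 pan range are my own assumptions; individual consoles and DAWs use different pan laws.

```python
import math


def constant_power_pan(pan):
    """Map pan in [-1.0 (hard left), +1.0 (hard right)] to (left_gain, right_gain).

    Sine/cosine law: left**2 + right**2 stays at 1, so total power is constant.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # 0 (hard left) .. pi/2 (hard right)
    return math.cos(angle), math.sin(angle)


def interchannel_level_difference_db(pan):
    """Right-minus-left level difference in dB; negative means louder on the left."""
    if pan == -1.0:
        return -math.inf  # right channel is silent
    if pan == 1.0:
        return math.inf   # left channel is silent
    left, right = constant_power_pan(pan)
    return 20.0 * math.log10(right / left)
```

The level difference between the channels is an ordinary measurable number, and it is one of the main cues that moves the phantom image between the speakers.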
Then there is playback: Again, there are many things known about what types of measurable phenomena contribute to image specificity, spaciousness, etc. That's why, for instance, in John Atkinson's Stereophile speaker measurement sections you'll see information such as the lateral response graphs, with JA telling you:
"But note in fig.5 the evenness of the contour lines, something that always correlates with stable, precise stereo imaging."
Various forms of reflections in a room - sidewall, rear, ceiling, floor, etc. - are known to affect perceived imaging and spaciousness. Floyd Toole documents all sorts of these effects in his literature. He also shows how the science correlates many speaker measurements to soundstaging, in the sense that certain measurable characteristics, e.g. flat amplitude response, even dispersion, and low resonance (in drivers or cabinets), predict that a speaker will "disappear" as a sound source better, so you won't get some instruments and sounds clustering into the speaker.
Now, does this mean that the absolutely exact, precise experience you have of a soundstage from a certain speaker in a certain room is describable from measurements to the precision you'd want? Maybe, maybe not. But asking for total precision is one thing; the claim that nothing about it is measurable and predictable via measurements is another.
As to things like "rhythmic drive, timing, soundstage or emotion"....that all depends on how precise you can actually be about those. One can't measure mushy ideas. But then, if an idea is mushy to begin with, we can't say it's a specific "thing" that can't be measured to begin with. Even audiophiles seem confused about what things like "PRAT" are or if it's even a reasonable concept (I've seen many debates).
As to what might create the sense of "rhythmic drive/timing": if we take that to mean, say, a snappy sense of pace vs a more sluggish, slow-sounding pace, then if two different music signals alter these perceptions, the difference between them would be measurable. Some candidates are things like bass frequency extension, along with measurable attributes of the sound in the room such as frequency evenness, the existence of room nodes in the bass region, ringing in the bass region, etc. There are plenty of instances in which removing bulges, bloat and ringing in the bass region has been perceived as clearing up "sluggish" bass and improving the sense of "quickness and timing" in the bass. Room correction software, whether included in subwoofers or in full-blown software systems, calculates and corrects such issues.
Throwing in terms like "emotion" muddies the waters further. "Emotion" in terms of the listener obviously isn't encoded in the musical signal, so that would be the wrong place to look for it. But in principle "emotional response" can be measured. It's done all the time in the biological, behavioral and social sciences (usually via self-report, among other methods). You could in principle study people's emotional responses to music by altering some parameters - either in the source, or by changing speakers or whatever - to see if there is a correlation to be found.
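To show that bass-region room problems are predictable and not mysterious, here is a small sketch that estimates the axial resonance frequencies (usually called room modes) along one dimension of an idealized rectangular room, using f = n * c / (2 * L). The function name, the 343 m/s speed of sound and the 200 Hz cutoff are my own assumptions. Actual room correction products work from microphone measurements, so treat this purely as an illustration of the underlying physics.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C


def axial_mode_frequencies(dimension_m, max_hz=200.0):
    """Axial mode frequencies (Hz) along one room dimension, up to max_hz.

    Idealized rigid-walled rectangular room: f_n = n * c / (2 * L).
    """
    if dimension_m <= 0:
        raise ValueError("dimension_m must be positive")
    modes = []
    n = 1
    while True:
        freq = n * SPEED_OF_SOUND_M_S / (2.0 * dimension_m)
        if freq > max_hz:
            break
        modes.append(freq)
        n += 1
    return modes
```

For a hypothetical 5 m room length this puts the first axial modes at about 34.3, 68.6 and 102.9 Hz, which is exactly the range where bass bulges and ringing tend to show up in in-room measurements.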
Anyway....much of that is one giant distraction from the fact that we'd want to first have a way of being sure someone is perceiving a real thing in the first place! Blind testing is a good way of aiding confidence in this respect.
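For what it's worth, here is a minimal sketch of how the result of a blind ABX-style test is usually scored: a one-sided binomial test against pure guessing (50% chance per trial). The function name is mine, and the 16-trial figure below is just a hypothetical example, not data from any actual test.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of `trials`
    by guessing alone (each trial is a 50/50 coin flip)."""
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Count every outcome at least as good as the observed score.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials
```

For example, 12 correct out of 16 gives a p-value of about 0.038, so a guesser would do that well less than 4% of the time. A score of 8 out of 16 is what guessing looks like.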
Cheers.