Hi Guys
Personally I think you guys are making your job of analysis vastly more complicated than necessary.
I think you will get much further by examining your hearing and not the “stereo image” on a recording.
For now forget the recording process entirely.
When a person stands in front of you and talks, you can identify the location aurally with your eyes closed.
Your ear's pinna response imposes angle-dependent comb filtering and EQ on the pressure reaching your eardrums.
This looks awful in an “in ear” measurement, BUT instead of hearing any of that as flaws and comb filtering, your hearing uses it to determine the height of the source.
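To put rough numbers on that comb filtering, here is a minimal Python sketch. It assumes the simplest possible model, the direct sound plus a single delayed, attenuated reflection. The delay values and `reflection_gain` are made-up illustrations, not pinna measurements. The point is only that a small change in delay moves the notches, which is the kind of cue the ear can use.

```python
import cmath
import math


def comb_magnitude_db(freq_hz, delay_s, reflection_gain=0.5):
    """Level (dB) of a direct sound summed with one delayed, attenuated copy."""
    h = 1 + reflection_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20 * math.log10(abs(h))


def first_notch_hz(delay_s):
    """Lowest frequency at which the delayed copy arrives in anti-phase."""
    return 1.0 / (2.0 * delay_s)


# Hypothetical reflection delays (microseconds), one per source angle.
for delay_us in (60, 80, 100):
    delay_s = delay_us * 1e-6
    notch = first_notch_hz(delay_s)
    depth = comb_magnitude_db(notch, delay_s)
    print(f"delay {delay_us} us: first notch near {notch:.0f} Hz, {depth:.1f} dB")
```

A shorter delay pushes the first notch higher in frequency; the depth of the notch depends only on the assumed reflection gain.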
When you wear headphones, you have neutralized the pinna response part of spatial detection, so the image floats over your head or in your head.
If you take a set of headphones and feed the SAME signal to both sides, you get the SAME mono phantom image one seeks with stereo.
This does not have to be a recording or have ANY processing; it is the identical nature of the R and L signals which makes the phantom appear to be between them.
With loudspeakers, one can produce the same mono phantom image by feeding an identical signal to each speaker.
The more identical the signals reaching the R and L ear, the more “real” the phantom image can be.
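One rough way to put a number on “how identical” is the normalized correlation between the two ear signals. The sketch below is only an illustration: the correlation measure is my choice of proxy, and the uncorrelated noise added to each side is a hypothetical stand-in for anything that differs between the ears.

```python
import math
import random


def correlation(left, right):
    """Normalized zero-lag correlation of two equal-length signals."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den


random.seed(0)
mono = [random.gauss(0.0, 1.0) for _ in range(5000)]

# Add independent "extra" signal to each side (levels are assumptions).
for extra in (0.0, 0.5, 1.0):
    left = [m + extra * random.gauss(0.0, 1.0) for m in mono]
    right = [m + extra * random.gauss(0.0, 1.0) for m in mono]
    print(f"extra level {extra}: correlation {correlation(left, right):.2f}")
```

With no extra signal the correlation is 1; it falls as the independent part grows, which matches the idea that anything not common to both ears weakens the phantom.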
With loudspeakers one also has some amount of IACT, or interaural crosstalk: the sound which arrives from the Right speaker but wraps around to the Left ear, and vice versa.
This corruption reduces the realism of the image because an actual source has none of that cross leakage.
A HUGE effect, but one hardly ever addressed, is that essentially all loudspeakers radiate a complex pattern. This allows you to hear where a single speaker is with your eyes closed, and also to hear how far away it is.
There are speakers whose physical depth is VERY hard to identify when only one is on; these tend to produce a much stronger stereo image when two are used.
When a speaker shouts its identity, the ear can easily hear the R and L speaker as the source which obviously competes with / defeats the desired phantom image.
It is possible to have a mono phantom image so compelling that people would rather believe it is a center channel speaker than the program material.
I forget who, but someone here had posted a picture of a small full range driver on a large baffle. This is one kind of speaker that would produce a “simple” radiation pattern, and so is likely to be hard to localize in depth and to produce a strong stereo image.
Keep in mind, you are trying to reduce all of the extra signals as much as possible; this is like a noise problem.
For example, in a room, normal loudspeakers image much more poorly than they can outdoors (by that I mean the image that can be created between the two sources).
The reason is that in a room there are many delayed reflections which all sound like the original signal but compete with the direct signal, which is the one that creates the image.
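To get a feel for how close behind the direct sound such a reflection arrives, here is a small Python sketch of a single floor bounce using a mirror-image source. The geometry (3 m listening distance, speaker and ears at 1 m) and the speed of sound are assumptions for illustration, and floor absorption is ignored, so the level is a worst case.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly room-temperature air


def floor_reflection(distance_m, source_height_m, ear_height_m):
    """Return (delay in ms, level in dB re direct) of a floor bounce.

    Uses a mirror-image source below the floor and 1/r spreading only.
    """
    direct = math.hypot(distance_m, source_height_m - ear_height_m)
    reflected = math.hypot(distance_m, source_height_m + ear_height_m)
    delay_ms = (reflected - direct) / SPEED_OF_SOUND_M_S * 1000.0
    level_db = 20.0 * math.log10(direct / reflected)
    return delay_ms, level_db


# Hypothetical setup: speaker and ears both 1 m high, 3 m apart.
delay_ms, level_db = floor_reflection(3.0, 1.0, 1.0)
print(f"floor bounce arrives {delay_ms:.2f} ms late at {level_db:.1f} dB")
```

In this made-up layout the bounce is under two milliseconds late and less than 2 dB down, so it is a strong competitor to the direct signal, and a real room has many such paths.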
Another problem is that loudspeakers spread out signals in time. Feed a broadband impulse to a normal multiway loudspeaker and one finds that instead of the entire spectrum being reproduced at one instant, it normally emerges with the highs first and the lows last. Even examining an individual driver like a woofer with time delay spectrometry, one finds that the lone driver produces the HF end first and the LF end last.
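In the multiway case, one well-understood contributor is the crossover itself. As a sketch under an assumption of mine (no crossover type is named here): the summed output of a fourth-order Linkwitz-Riley crossover behaves as a second-order all-pass with Q = 1/sqrt(2), flat in amplitude but with far more group delay at and below the crossover frequency than well above it. The 1 kHz crossover frequency is hypothetical, and this does not model the time behavior of a lone driver.

```python
import math


def allpass_group_delay_s(freq_hz, crossover_hz, q=1 / math.sqrt(2)):
    """Group delay (s) of a second-order all-pass centred on crossover_hz."""
    w = 2 * math.pi * freq_hz
    w0 = 2 * math.pi * crossover_hz
    a = w0 / q
    return 2 * a * (w0**2 + w**2) / ((w0**2 - w**2) ** 2 + (a * w) ** 2)


# Hypothetical 1 kHz crossover: compare delay across the band.
for f in (100, 500, 1000, 5000, 10000):
    delay_ms = allpass_group_delay_s(f, 1000.0) * 1000.0
    print(f"{f:>6} Hz: {delay_ms:.3f} ms")
```

The low end comes out a fraction of a millisecond behind the top end in this model, the same “highs first, lows last” ordering, even though the amplitude response is flat.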
Obviously, any distortion, be it in time, harmonic content, or amplitude, along with any and all signals which arrive after the direct path, corrupts the signal. These things are what let your ears decide “this is not real” instead of “that really sounds like Diana Krall floating in front of me.”
Now, how one captures or re-creates that same real seeming image is an entirely separate can of worms (I think). There are ways to capture a live image, but none do what I am looking to do; the object here is to extend this into a 360 degree image with an “over head” image as well.
A couple of weeks ago I posted a link to a recording I made using a microphone invention I am working on. I believe I have found something new and useful, but it is still a work in progress. Try this link with headphones, and keep in mind this is the “forward” facing portion of the stereo image.
Also, like the last one I posted, I have to apologize for the program material. It is not exactly music, but it is something I am very familiar with and so a valuable reference (and possibly entertaining if you like trains). Fwiw, there is NO compression and NO spatial processing of any kind; this is the output from the detector.
Like the approach with the loudspeakers, I tried to fix the acoustic problem at the origin instead of using compensations. See what you think / hear.
http://www.danleysoundlabs.com/TrainStart.wav
Tom Danley
Danley Sound Labs