It's rather a given that a phantom image will collapse at too close a distance, the same way a visual one will. We see in stereo too. Are we not going a bit overboard by wanting to enjoy the stereo effect from everywhere, including places we venture into only out of curiosity and not for practical use?
I agree with Tom in the sense that there is a lack of "energy" with any phantom image, especially in a direct comparison with a mono recording from a single speaker aimed squarely at you. That said, multichannel has its own problems. Setting aside the need to process two-channel sound and the infinite things that can go wrong in implementation, physically the typical multichannel system is configured with a handicapped center channel. This leads to its own SPL imbalances relative to the L and R at different frequencies. We also now get two phantom images instead of one: between L and C, and between C and R. Gains in "energy" can come at the cost of coherence if implemented poorly, something I think we are all familiar with from the HTIBs haphazardly set up in most homes.
As far as two-channel goes, there are so many ways to skin the cat that it can be a nightmare if one doesn't know exactly what one's personal objectives are, "presentation" being a common one among audio enthusiasts. On one extreme we have the purely direct sound camp, typically using constant directivity or near-field listening, and on the other extreme the disappearing speaker camp, typically midfield with loudspeakers that have flatter off-axis response. Curiously, one could look at the FR or spectral display of a song or album and get a pretty close averaged measured response using an RTA with either approach. That the latter will typically cost more, due to its higher dependency on the room itself to achieve the response, is I guess to be expected, but it should not bear in any way on the approach's validity.
Mr. Danley makes some very, very logical observations with regard to polar patterns, not just on a channel-by-channel basis but even on a driver-by-driver one. If one were to listen to and measure two systems with similar FR, one with a small neat stage and the other a Spector-like wall of sound, the measurements would say they sound similar but would not account for the obvious differences in presentation, even if the tonal balance is similar. Measure not from the listening position but, let's say, 6 ft up and 3 ft left of center at the speaker plane, where there's little localization with the "neat" system and a fair amount of energy from the "wall of sound" system, and there you will almost surely find one big contributor to the differences in image size and intensity: the summing of the loudspeakers as dictated by their dispersion patterns and aided by boundary reinforcement.
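To put rough numbers on that thought experiment, here is a minimal free-field sketch in Python. Everything in it is my own assumption for illustration: the speaker spacing, the listening distance, the test frequencies, and above all the first-order pattern a + (1 - a)cos(theta) standing in for real polar data (a = 1.0 for the wide system, a = 0.5 for the narrow one). It ignores boundary reinforcement completely, so treat it as a picture of the dispersion part of the argument only, not a prediction for any real room.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s
FT = 0.3048             # metres per foot

def directivity(cos_theta, a):
    """Hypothetical first-order pattern: a=1.0 is omni, a=0.5 is cardioid."""
    return abs(a + (1.0 - a) * cos_theta)

def pressure_at(point, speakers, freq, a):
    """Complex sum of both speakers at `point` (free field, 1/r falloff)."""
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND
    total = 0j
    for pos, axis in speakers:
        d = [p - s for p, s in zip(point, pos)]
        r = math.sqrt(sum(c * c for c in d))
        cos_theta = sum(c * u for c, u in zip(d, axis)) / r
        total += directivity(cos_theta, a) / r * cmath.exp(-1j * k * r)
    return total

def level_db(point, speakers, freqs, a):
    """Power-averaged level over `freqs` in dB (arbitrary reference)."""
    power = sum(abs(pressure_at(point, speakers, f, a)) ** 2 for f in freqs)
    return 10.0 * math.log10(power / len(freqs))

# Assumed layout: speakers 8 ft apart, 3 ft high, firing straight ahead (+y).
forward = (0.0, 1.0, 0.0)
speakers = [((-4 * FT, 0.0, 3 * FT), forward), ((4 * FT, 0.0, 3 * FT), forward)]
listening_pos = (0.0, 8 * FT, 3 * FT)
off_axis_pos = (-3 * FT, 0.0, 6 * FT)  # 6 ft up, 3 ft left of center, speaker plane
freqs = [250, 500, 1000, 2000, 4000]

# Wide minus narrow, in dB, at each measurement point.
on_axis_diff = (level_db(listening_pos, speakers, freqs, 1.0)
                - level_db(listening_pos, speakers, freqs, 0.5))
off_axis_diff = (level_db(off_axis_pos, speakers, freqs, 1.0)
                 - level_db(off_axis_pos, speakers, freqs, 0.5))
print(f"listening position: {on_axis_diff:.2f} dB, speaker plane: {off_axis_diff:.2f} dB")
```

With these assumed patterns the two systems land within half a dB of each other at the listening position, while the wide one puts about 6 dB more energy at the speaker-plane point. Real speakers have frequency-dependent polars, so the actual gap will vary with frequency, but the direction of the effect is the point.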
To be able to manipulate the phantom image we have to know what it is made of. It's not magic. It's air molecules at varying levels of excitation in a given space at a given time. It's invisible to the eye, but it is definitely there. It is PHYSICALLY there. From there we work our way back to how to excite them exactly the way we want to, and to do that we need to take into account both electrical and kinetic influences. Hopefully someone interested enough to study the matter in detail will actually plot all of this out, to the point of modeling how air molecules react when acted upon by the two original impulses as well as by multiple generations of reflected energy from the same. Until then we work with what we have...
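For anyone wanting a starting point, the usual first step toward that kind of model is an image-source calculation. The sketch below is a bare-bones version under my own assumptions: a plain rectangular room, a single frequency-independent absorption figure for every wall, one speaker at a time, and only the first generation of reflections. The room size, positions, and absorption value are made up for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def first_order_images(src, room):
    """Mirror `src` across each of the six walls of a box room.

    The room spans (0, 0, 0) to `room` = (width, depth, height) in metres.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(src)
            img[axis] = 2.0 * wall - src[axis]
            images.append(tuple(img))
    return images

def arrivals(src, listener, room, absorption=0.3):
    """Sorted (delay_ms, relative_amplitude) for direct sound + first reflections."""
    # Pressure reflection factor derived from an assumed energy absorption.
    refl_gain = math.sqrt(1.0 - absorption)
    paths = [(src, 1.0)] + [(img, refl_gain) for img in first_order_images(src, room)]
    result = []
    for pos, gain in paths:
        r = math.dist(pos, listener)
        result.append((1000.0 * r / SPEED_OF_SOUND, gain / r))
    return sorted(result)

# Assumed example: 5 x 6 x 2.5 m room, left speaker and a centered listener.
room = (5.0, 6.0, 2.5)
left_speaker = (1.5, 1.0, 1.0)
listener = (2.5, 4.0, 1.0)

left_arrivals = arrivals(left_speaker, listener, room)
for delay_ms, amp in left_arrivals:
    print(f"{delay_ms:6.2f} ms  amplitude {amp:.3f}")
```

Run it once per speaker and you have the two original impulses plus one generation of reflections each. Later generations come from mirroring the images again, which is where the bookkeeping starts to grow quickly.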