... looking to get a feeling of "Being there" rather than experience of "They are here in the room with me"
In the listening room there is, in effect, a "competition" between the acoustic signature of the venue on the recording (whether real or engineered or both), and the acoustic signature of the room you are listening in. Let's call these the "First Venue" and the "Second Venue". When the First Venue cues dominate (and all else is good), "you are there." When the Second Venue cues dominate (and all else is good), "they are here." In general, getting the First Venue cues to dominate is easier said than done, as the Second Venue cues naturally tend to dominate in most home listening rooms.
It is something that I have been involved with for years. It can be achieved most easily by moving from stereo to multichannel.
Even though I'm a two-channel guy, Kal is absolutely correct. The dedicated side/rear channels are designed to present the First Venue's acoustic signature more effectively than two channel is designed to.
But some two-channel systems still do a pretty good job of it. This is something I have been involved with for years, and it calls for the First Venue's cues to be effectively presented, while the Second Venue's cues are minimized. We'll come back to this in a minute.
Kal, I HOPE you will go into a bit more detail about the how's and why's of a good multichannel system in this context. Not that I'm necessarily thinking of "changing religions"... but in audio, sometimes it's good to have an "open canon".
I assume it's the room/speaker interaction, as a larger transducer or a different type such as an omnidirectional (MBL) could provide that experience.
You are on the right track! In my experience good polydirectional (credit to the late great Richard Shahinian for the term) systems - dipoles, bipoles, omnis, etc. - tend to do a good job of presenting the First Venue cues effectively, assuming proper set-up. A good "conventional" system can also do a good job of presenting the First Venue cues effectively, but in addition to proper set-up, room acoustic treatment tends to play a larger role than with polydirectionals, at least in my opinion... speaking of which, consider everything that follows to simply be my opinion.
The end result will be inevitably recording-dependent, but let's give 'em all the best chance we reasonably can, by effectively presenting the First Venue cues while disrupting the Second Venue cues.
In order for the First Venue cues to be effectively presented, they need to be strong enough for us to hear them; they need to be easily recognizable by the ear; they need to arrive from many different directions; and they need to not die away too quickly.
The First Venue cues are of course included in the direct sound, but that means they arrive from the same direction as the first-arrival sound, which is arguably the worst possible direction for reflections to come from. Fortunately they are also included in the reflected energy in the room. The ear/brain system can pick out those First Venue ambience cues from the reflections in the listening room based on their spectral content, and connect them to the appropriate first-arrival sounds. Timbre is also enriched along the way.
First Venue cues "strong enough for us to hear them" means that we need a fair amount of reverberant energy, which implies wide-pattern or polydirectional speakers and/or a room that is not overdamped. The latter helps ensure that they "don't die away too quickly". And the wide/polydirectional pattern + ideally a lot of diffusion = the First Venue cues "arrive from many different directions."
In order for the First Venue cues to be "easily recognizable by the ear", they must be spectrally correct. This implies that the spectral balance of the off-axis energy is similar to the spectral balance of the first-arrival sound, AND that the room doesn't over-absorb the short wavelengths (high frequencies) and correspondingly degrade the spectral balance of the reverberant energy. Of course we want to avoid slap echo, so there's a balance we're looking for, and in general diffusion serves that goal better than absorption.
But imo this is only HALF the battle.
The other half is, we want to weaken and/or disrupt the "Second Venue" cues - that is, the inherent acoustic signature of the listening room.
Undesirable Second Venue "small room signature" is strongly conveyed by the earliest reflections, and the earlier the reflection, the stronger that effect. So we want to avoid early reflections as much as possible; and/or diffuse them such that they are not strong and distinct ("specular"); and/or absorb them uniformly. The latter cannot be accomplished by a few inches of foam, which soaks up the short wavelengths but has little effect on longer ones, and thereby screws up the spectral balance of the reflections. We want the reflections to decay fairly slowly (though not too slowly), as quick decay is another source of "small room signature", which is another reason to use something other than absorption to address the early reflections, where possible.
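To put rough numbers on the foam point: a common rule of thumb (an assumption on my part here, not a precise law) is that a porous absorber mounted flat against a wall stops doing much once its thickness is less than about a quarter of the wavelength. The little sketch below just does that arithmetic. The 343 m/s speed of sound, the quarter-wavelength cutoff, and the function name are all illustrative choices, not measured data for any real product.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
INCH_M = 0.0254             # one inch in meters


def quarter_wave_cutoff_hz(thickness_m):
    """Frequency whose quarter wavelength equals the absorber thickness.

    Rule of thumb only: below this frequency, a porous absorber of this
    thickness (flat on a wall) is assumed to absorb comparatively little.
    """
    return SPEED_OF_SOUND_M_S / (4.0 * thickness_m)


for inches in (1, 2, 4):
    cutoff = quarter_wave_cutoff_hz(inches * INCH_M)
    print(f"{inches} in of foam: rule-of-thumb cutoff near {cutoff:.0f} Hz")
```

By this estimate two inches of foam is only doing much above roughly 1.7 kHz, and even four inches runs out of steam somewhere below 1 kHz, so what is left of the reflection is spectrally lopsided.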
If we can impose a significant delay on the strong onset of reflections, and push that inrush of reflections back in time somewhat, we can disrupt the "small room signature" cues by introducing contradictory "somewhat larger room" cues. An example of this would be putting those MBLs you mentioned far away from all of the walls, such that it takes a while for the reflections off the walls to reach the listening area. This relatively late-onset inrush of reverberant energy contradicts the normal "small room signature" cues we would otherwise get. So we end up with relatively indistinct Second Venue cues, which makes it more likely that our effectively-presented First Venue cues will dominate. Thus MBLs and Maggies and such are capable of doing "you are there" quite well with proper set-up, and we easily hear the different "there's" from one recording to the next, which indicates the First Venue is indeed dominant, rather than an enhanced (by the longer reflection paths) Second Venue. With more conventional speakers the same principles apply, including: Minimize the early reflections (via diffusion or angled reflectors or whatever) while cultivating the late ones.
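For a feel of how much delay distance from a wall buys you, here is a minimal sketch using the textbook image-source idea: mirror the speaker across the wall, and the reflected path length is the straight line from that mirror image to the listener. It assumes one flat wall, a single specular bounce, and 343 m/s for the speed of sound. The positions and the function name are hypothetical, made up purely for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature


def wall_reflection_delay_ms(speaker, listener):
    """Delay of one specular reflection off the wall x = 0, relative to the direct sound.

    speaker and listener are (x, y) positions in meters, both with x > 0.
    The image source is the speaker mirrored across the wall, at (-x, y).
    """
    sx, sy = speaker
    lx, ly = listener
    direct_m = math.hypot(lx - sx, ly - sy)
    reflected_m = math.hypot(lx + sx, ly - sy)  # distance from image source to listener
    return (reflected_m - direct_m) / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical layouts: the listener sits 3 m in front of the speaker in both cases.
near_wall = wall_reflection_delay_ms(speaker=(0.5, 0.0), listener=(3.5, 0.0))
far_from_wall = wall_reflection_delay_ms(speaker=(1.5, 0.0), listener=(4.5, 0.0))
print(f"speaker 0.5 m from wall: reflection arrives {near_wall:.1f} ms late")
print(f"speaker 1.5 m from wall: reflection arrives {far_from_wall:.1f} ms late")
```

In this simplified picture, pulling the speaker from 0.5 m to 1.5 m out from the wall behind it moves that reflection from about 3 ms to almost 9 ms after the direct sound. Real rooms have many surfaces and many bounces, so treat this as a ballpark for one reflection only.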
Compared with all the cues we'd get in the actual venue, even the best stereo system presents us with a poverty of First Venue cues. The ear takes in all of these different and often contradictory cues and constructs a "best fit" impression of the acoustic space we are in. If we have effectively presented the First Venue cues while minimizing the Second Venue cues, with a good recording that "best fit" may well end up being a reasonable facsimile of the acoustic space of the recording (again, whether real or engineered or both).
I'm not saying this is the ONLY thing that goes into a "you are there"-capable system, but it's arguably one of the things. And, note that a professional acoustician can make a small room behave like a much larger and better space.
Imo, ime, ymmv, etc.