What is a "good" speaker? Here are a few possibilities suggested by Robert Greene on his forum:
1 A good speaker is one that in a plausible and obtainable
room situation--placement and room acoustics--makes recorded music
sound like the real musical event recorded if the real
musical event is recorded by some plausible and identified
microphone technique.
2 A good speaker is one that in a plausibly obtainable placement
and room condition makes a lot of commercial recordings
sound good to me, good in the sense that I enjoy and am undisturbed
by the sound.
3 A good speaker (or speaker/room combination)
is one that is good in the sense of item 2 but for most
people, not just for me personally.
4 A good speaker is one that makes commercial recordings sound
the way they are supposed to.
5 A good speaker is one that looks good with some
set of usual measurements, whichever set I have decided
or someone has decided are relevant and important.
6 A good speaker is one that a large number of audio people
can be convinced is good.
7 A good speaker is one that operates correctly according
to some theoretical model of what a speaker ought to do.
The problem with 1., as I see it, is that, even with acoustic music, very few recordings are made this way. Most such recordings are either multi-miked or at least made like Mercury and Telarc, with three widely separated omnidirectional microphones. Since humans have only two ears, separated by about six inches, we have no way of knowing how recordings made with more than two microphones, much less more than two widely separated microphones, ought to sound in terms of the space captured or even the tonality. This means that only recordings made with a single pair of quasi-coincident microphones (e.g., Blumlein, M-S, X-Y, ORTF arrays) should be used to determine the "goodness" of the speakers. My wild estimate is that this limits you to well under a tenth of one percent of all commercially available recordings of acoustic music.
Now you may argue that if you can get those few recordings to sound close to the live event, all other recordings will also sound as they "should" sound. This assumption, while it may be reasonable, has problems. One is that the stereo playback paradigm for the quasi-coincidentally miked recordings must have the speakers subtend a separation angle of at least 90 degrees at the listening position. From experience I know that with this much separation other types of recordings--the vast majority of them--sound too "stretched out" and frequently have unnatural-sounding "holes" in the staging. For most recordings, the engineers assume a subtended separation of around 60 degrees, and that reduced separation is needed to avoid such spatial problems.
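To put the 60 versus 90 degree difference in concrete terms, here is a rough sketch of the geometry. It assumes a symmetric setup with the listener on the center line, and it measures listening distance perpendicular to the line joining the two speakers. The function name and the 2.5 m distance are just illustrative choices of mine, not anything from a standard.

```python
import math

def speaker_spacing(listening_distance_m, subtended_angle_deg):
    """Center-to-center speaker spacing that subtends the given angle.

    Assumes a symmetric layout: the listener sits on the center line,
    listening_distance_m away from the line joining the two speakers.
    """
    half_angle = math.radians(subtended_angle_deg) / 2.0
    return 2.0 * listening_distance_m * math.tan(half_angle)

# Illustrative listening distance only (hypothetical value).
for angle in (60, 90):
    spacing = speaker_spacing(2.5, angle)
    print(f"{angle} degrees at 2.5 m: speakers {spacing:.2f} m apart")
```

Under those assumptions, going from 60 to 90 degrees at the same listening distance means moving the speakers from roughly 1.15 times the listening distance apart to 2 times the listening distance apart, which is a big change in a real room.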
If possible, a "good" speaker for the home listener--as opposed to an equipment reviewer--should be one that in a plausible and obtainable room situation--placement and room acoustics--makes as much commercially recorded music sound like the real musical event as possible. At least for those who enjoy and are not disturbed by the sound of a real musical event, that would make the "good" home speaker a source of maximum enjoyment of recorded acoustic music. I'd argue for combining goals 1, 2, and 3, in other words.
Whether this goal is attainable may depend on how much weight you want to put on strict accuracy to what was recorded versus musical enjoyment of much of the recorded repertoire. It may be a form of the old accuracy vs. musicality debate. I, for one, am willing to give up a bit of accuracy as to the reproduction of the 0.1% of commercial recordings made with a pair of quasi-coincident microphones if that's necessary in order to gain more enjoyment from the other 99.9% of commercial recordings of acoustic music.
One obvious concession would be to design the default frequency response of the speakers to roll off the highs and boost the bass a bit, since most recordings of acoustic music sound bass-light and treble-heavy through most speakers that are adjusted to measure ruler flat at the listening position. It's better to allow home listeners to EQ the 0.1% of properly balanced recordings than to require them to EQ the 99.9% which fit this pattern.
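One simple way to picture such a default response is a straight-line "tilt" on a log-frequency axis: flat at some pivot frequency, rising below it and falling above it. The sketch below is only an illustration of that idea. The 1 kHz pivot and the 1 dB per octave slope are hypothetical numbers I picked for the example, not a recommended target.

```python
import math

def tilt_target_db(freq_hz, pivot_hz=1000.0, slope_db_per_octave=-1.0):
    """Target level in dB relative to flat for a simple tilt curve.

    pivot_hz and slope_db_per_octave are hypothetical example values.
    A negative slope boosts below the pivot and cuts above it.
    """
    octaves_from_pivot = math.log2(freq_hz / pivot_hz)
    return slope_db_per_octave * octaves_from_pivot

# Print the target at a few spot frequencies across the audio band.
for f in (20, 100, 1000, 10000, 20000):
    print(f"{f:>6} Hz: {tilt_target_db(f):+.2f} dB")
```

With these example values the curve sits a few dB up in the deep bass and a few dB down at the top of the treble. A listener with a properly balanced recording could undo it by applying the opposite slope.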
Goal 4. is unknowable. We don't and can't know for most recordings what the engineers heard during the final mastering sessions, and we likely don't even know the equipment chain used. I'm assuming that "sound the way they are supposed to" means sound the way the mastering engineer wanted them to sound. There may also have been an assumption that recordings will sound different on most home systems than they do on the mastering system, but different in a consumer-preferable way, and will thus "sound the way they are supposed to."
Goal 5. has some promise. I think that the folks who emphasize the spinorama method are using this. There are even websites which rate speakers based on their conformance to someone's idea of an ideal spinorama data set. One could quibble about how much high-frequency roll-off there should be, at what angle, and starting at what frequency. I would add time coherence as evidenced in a step response on the design axis, plus low distortion of various types at all relevant frequencies and SPLs. I suspect that a data set could be agreed upon which would result in goodness impeded only by listening room problems, which could be addressed via speaker/listener placement, EQ, and room treatments.
Note: in a small room like mine, "airless" speakers are quite rare. This may not be so in larger rooms. In my small room, the problem is quite the opposite. The proximity of room surfaces tends to produce overly live/echoey/bright/brittle/nasty sound from most recordings with most speakers without absorptive or at least diffusive room treatment. Dispersion can be fairly narrow and still sound "room filling" in a small room. The D&D 8c speakers have the closest to ideal dispersion I've yet heard for my small room. Even with bare walls, the sound is well focused and well balanced tonally; it's alive and yet not splashy/bright/annoying in any way, even at high SPLs. It's "awesome," actually, and nowhere "dull" or "airless" in the room or even down the hall. Yes, to me they sound yet better with dispersive room treatment and better yet again with absorptive room treatment at the speaker end of the room, but the differences are not nearly so vast as with most speakers. I could easily live with these speakers with bare walls in this small room.
Goal 6 seems too subjective and too disconnected from the live acoustic music paradigm. Most audio people these days really don't know what live acoustic music sounds like unamplified in a good hall and couldn't care less, since that's not their music of choice.
Goal 7 might work if most designers agreed on a theory. But there is little agreement about theory at either the recording or playback end. The lack of standard recording or playback paradigms is what's behind all this uncertainty as to what constitutes a "good" speaker. I think it all comes back to Audio's Circle of Confusion. The lack of such standards creates an inherent ambiguity as to goodness, since recording quality has to be judged through playback hardware. See Audio Musings by Sean Olive: Audio's Circle of Confusion, from which I quote a part here:
Audio’s “Circle of Confusion” is a term . . . that describes the confusion that exists within the audio recording and reproduction chain due to the lack of a standardized, calibrated monitoring environment. Today, the circle of confusion remains the single largest obstacle in advancing the quality of audio recording and reproduction.
The circle of confusion is . . . Music recordings are made with (1) microphones that are selected, processed, and mixed by (2) listening through professional loudspeakers, which are designed by (3) listening to recordings, which are (1) made with microphones that are selected, processed, and mixed by (2) listening through professional monitors...... you get the idea. Both the creation of the art (the recording) and its reproduction (the loudspeakers and room) are trapped in an interdependent circular relationship where the quality of one is dependent on the quality of the other. Since the playback chain and room through which recordings are monitored are not standardized, the quality of recordings remains highly variable.