The reviewer's reviewing system.

Hell, I reviewed a Z-sys product for TNT audio back in '98 and Z-sys used a line out of it in their ad blurb... I'm a pro :)
 
What you say is true, Jack, but it is the answer to a different question: namely, the routinely stated point that audiophiles have different tastes than the general public. As you say, research shows that, just like other specialized groups of listeners, their preference *overall* is no different. A speaker that does well with trained listeners also does well with them.

Michael asked a different question. He asked about *listening skills*, not preference for loudspeakers. For that, we can go directly to the source of the research:

Differences in Performance and Preference of Trained versus Untrained Listeners in Loudspeaker Tests: A Case Study
Sean E. Olive, AES Fellow

First this graph:

[Image: i-txMH42K-XL.png]

So clearly nothing here is a matter of scale. Trained listeners have far better skills here. So that there is no implication of me changing the research again, here are the original words from the paper:

"The performance of the trained panel is significantly better than the performance of any other category of listener. They are about three times better than the best group of audio retailers, five times better than the reviewers, and 27 times better than the students. The combination of training and experience in controlled listening tests clearly has a positive effect on a listener’s performance. The students’ poor performance is likely due to the student’s lack of training and professional experience in the field of audio. The reviewers’ performance is somewhat of a surprise given that they are all paid to audition and review products for various audiophile magazines. In terms of listening performance, they are about equal to the marketing and sales people, who are well below the performance of the audio retailers and trained listeners."


So there is no manipulation of research. It is very clear they did poorly. But how did this data come about? The answer is in the previous paragraph, where a statistical analysis is performed on the results of each group. The analysis shows the consistency with which different groups rate the same product. The same loudspeaker is presented multiple times in the study. An objective instrument would rate it the same every time. Humans are not that consistent, but ideally they would be close to the instrument. The research shows that without training, groups such as audio reviewers are highly inconsistent:

"To examine listener performance in view of occupation more clearly, the mean listener FL values were plotted as a function of occupation for both tests (see Fig. 8). In the four-way tests the listener performance of the different categories based on the mean FL values from highest to lowest was trained listeners (94.36), audio retailers (34.57), and audio reviewers (18.16)."
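As a quick sanity check, the "about three times" and "five times" figures can be compared against the mean FL values in the passage just quoted. This is only a minimal sketch: the three numbers come from the quote, while the dictionary and function names are made up for illustration. The excerpts do not say exactly which values the paper divided (it speaks of the "best group" of retailers), and no student FL value is quoted here, so the "27 times" figure cannot be checked this way.

```python
# Mean FL values for the four-way test, as quoted from the paper above.
mean_fl = {
    "trained listeners": 94.36,
    "audio retailers": 34.57,
    "audio reviewers": 18.16,
}


def ratio_to_trained(group: str) -> float:
    """How many times higher the trained panel's mean FL is than `group`'s."""
    return mean_fl["trained listeners"] / mean_fl[group]


for group in ("audio retailers", "audio reviewers"):
    print(f"trained vs {group}: {ratio_to_trained(group):.1f}x")
# prints:
# trained vs audio retailers: 2.7x
# trained vs audio reviewers: 5.2x
```

Both ratios land close to the rounded figures in the paper's text.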


In other words, they cannot be counted on to tell the "truth" in a single trial the way trained listeners can. You have to test them over and over again and then look at the overall sum. As a group, they simply lack the ability to spot a problem and consistently point it out in every comparison to other loudspeakers.

This is the data, Jack. Unfortunately it is not so easy to understand and parse out of the sea of research over a 30-year period. One has to read every bit of it, over and over again, to get a consistent view. Or you can trust that I am not trying to screw you when I summarize it :).

Hi Amir,

Thanks for posting this but the above is of no scientific value whatsoever. Please would you provide "n" for each group and, critically, a standard deviation around the mean for each of the 3 groups. Or send a link to this "case study" so that I can critically appraise the scientific validity of this experiment.

Many thanks,

Bill
 
Dallas I haven't reached that chapter yet but will read it intently as I have 2 subs sitting waiting on me to "integrate" them into my room. This is a slow read for me as my brain is older and the sponge qualities have hardened somewhat.
 
It is all in the paper. I provided the link to the AES site where you can purchase it ($7 I think). The vast majority of the paper is statistical analysis, by the way.

Quickly answering your question, the total "N" was 268: 215 were audio retailers, 14 university students, 21 marketing/sales, 6 audio reviewers, and 12 trained listeners. ANOVA analysis is shown in Table 4 for each category discussed (subject, content, session, loudspeaker, etc., and combinations of each). Saving you the expense of buying the article, here is the summary from the text:

"In both tests there was a highly significant difference in preference between the different loudspeakers; F(3, 258) = 231.0, p < 0.0001 for the four-way test, and F(2, 292) = 149.2, p < 0.0001 for the three-way test. A Scheffé post-hoc test performed at a significance level of 0.05 showed a significant difference in the means between all pairs of loudspeakers in both tests.

Other main effects that were statistically significant in both tests were listening group; F(15, 86) = 5.2, p < 0.0001 for the four-way test and F(19, 146) = 4.2, p < 0.0001 in the three way test.

Program was statistically significant in both tests; F(3, 258) = 4.698, p = 0.0033 for the four-way tests and F(3, 438) = 9.99, p < 0.0001 in the three-way tests. Details on the main effects and interactions are discussed in the following sections."


The paper, by the way, is from J. Audio Eng. Soc., Vol. 51, No. 9, 2003 September. The significant part is the "J.", meaning it is published in the Journal and hence is peer reviewed, which requires this type of statistical analysis.
 
Dallas I haven't reached that chapter yet but will read it intently as I have 2 subs sitting waiting on me to "integrate" them into my room. This is a slow read for me as my brain is older and the sponge qualities have hardened somewhat.
The book is a tough read. It is more of a reference than one continuous book to read. I must have gone back to it 20+ times. If you want a distilled version, you can read my articles :)

http://www.madronadigital.com/Library/BassOptimization.html

http://www.madronadigital.com/Library/Computer Optimization of Acoustics.html

The latter has a ton of measurements from different rooms showing the effect of optimizations.
 
I think there's a misinterpretation of the so-called Haas effect. Haas studied sound localization. There is no acoustical research which suggests that humans totally exclude sound outside a certain window. In fact, Toole points out this common misunderstanding in his book. According to Toole, the listener still combines timbral effects from late-arriving energy well after the so-called Haas window. IOW, everything still matters. :)

Not on my part, Michael. In fact I'm in agreement with you here. Outside the window the reflections are interpreted as reflections, and that was my point, not exclusion. In this instance the sound reflected off the windows was not only late in arrival; the distance was great, so the reflections were weaker. In addition, because the rear of the room is very diffuse, third reflections were likewise weak, and the radiation pattern of the speakers has little HF rearward anyway.
 
Amir,

No accusations of fudging. Looking only at the trial results, I draw no conclusions, only the observation that in terms of preference the group was consistent with the trained listeners, who scored best, and the retailers, who scored second best, while other groups had differences in order. They can't be that bad then; at least these 6 weren't. As to why, let's leave it to Dr. Toole and Sean to sort out the interpretation of the particular group's data. It seems Dr. Toole differs from Sean, based on his statement in his lecture. It is Sean's paper, however, so I suppose his word carries more weight on this matter. His word is that the study "suggests" and is not fully definitive, as is the case with most if not all real papers. Can we agree to leave it at that?

Many times we take these conclusions to mean more than the authors suggest. This to me is dangerous. It becomes something not intended by the study: a tool for profiling. That is something we should all be careful about. I for one can't make much of a sample of 6. Imagine my saying that Kal, whom I've never met or even seen a picture of, has no ears, based on 6 guys that I also know nothing about. Make that 300 and maybe I could. Just kidding, Kal. :)
 
Thanks Amir. I am familiar with the "J" nomenclature as I publish statistical manuscripts in evidence-based medicine :)

I would have to order it to evaluate it properly. It seems strange to have such massive imbalances in the arms, but I guess that was for pragmatic reasons. Btw, just because something is published in a peer-reviewed journal does not make it good quality.
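To put the imbalance between the arms in numbers, here is a minimal sketch that works out each group's share of the total from the counts quoted earlier in the thread (268 listeners in all). The counts are as posted; the dictionary and function names are just for illustration.

```python
# Group sizes as quoted earlier in the thread (total N = 268).
group_sizes = {
    "audio retailers": 215,
    "university students": 14,
    "marketing/sales": 21,
    "audio reviewers": 6,
    "trained listeners": 12,
}

total = sum(group_sizes.values())


def share(group: str) -> float:
    """Fraction of all listeners that belong to `group`."""
    return group_sizes[group] / total


print(f"total N = {total}")
for group, n in group_sizes.items():
    print(f"{group}: n = {n} ({share(group):.1%})")
```

Retailers make up roughly four fifths of the panel, while the reviewers are 6 of 268.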
 
IMO it's more important for the reader to know the components involved rather than some arbitrary A/B/C class rating that has no consensus. The information is valuable for better context only if the reader is familiar with the associated equipment; otherwise it's not going to be helpful either way.

david

Surely. For me, reviews are mostly informative and entertaining. IMHO ratings and stars are meaningless. However, for some good reviewers, if we know the reviewer's system, room and preferences fairly well, we can get an approximate idea of the performance of the component.

IMHO many people expect too much from an audio review.
 
