Before a “null” finding can be accepted as valid, it must be shown empirically and statistically that any failure to find differences among systems is not due to experimental insensitivity arising from poor choices of audio material or from any other weakness of the experiment. In the extreme case where several or all systems are found to be fully transparent, it may be necessary to program special trials with low or medium anchors for the explicit purpose of examining subject expertise (see Appendix 1).
These anchors must be known (e.g. from previous research) to be detectable by expert listeners but not by inexpert listeners. They are introduced as test items to check not only listener expertise but also the sensitivity of all other aspects of the experimental situation.
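How the anchor trials are embedded among the ordinary test items is left to the experimenter. The following is a minimal illustrative sketch of one way to interleave a small number of anchor trials unpredictably among the items under test; the language (Python), item labels and counts are assumptions made purely for illustration.

    import random

    def build_trial_list(test_items, anchor_items, seed=None):
        """Shuffle anchor trials unpredictably among the ordinary test trials."""
        rng = random.Random(seed)
        trials = [("test", item) for item in test_items]
        trials += [("anchor", item) for item in anchor_items]
        rng.shuffle(trials)  # anchors end up at unpredictable positions
        return trials

    # Hypothetical example: four coded items under test plus two known anchors
    for kind, item in build_trial_list(["itemA", "itemB", "itemC", "itemD"],
                                       ["anchor_low", "anchor_medium"], seed=1):
        print(kind, item)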
If these anchors, whether embedded unpredictably among the apparently transparent items or presented in a separate test, are correctly identified by all listeners in a standard test method (§ 3 of this Annex), applying the statistical considerations outlined in Appendix 1, this may be taken as evidence that the listeners’ expertise was acceptable and that there were no sensitivity problems in other aspects of the experimental situation. In that case, findings of apparent transparency by these listeners are evidence of “true transparency” for items or systems where those listeners cannot differentiate coded from uncoded versions.
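The statistical criterion itself is specified in Appendix 1. Purely as an illustrative sketch, assuming a two-alternative task, a chance level of 0.5, 16 anchor trials per listener and a significance level of 0.05 (all hypothetical values), a one-sided binomial test of each listener’s anchor identifications against chance could look as follows.

    from math import comb

    def binomial_p_value(correct, trials, chance=0.5):
        """One-sided probability of at least `correct` successes out of `trials`
        anchor identifications if the listener were merely guessing."""
        return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
                   for k in range(correct, trials + 1))

    # Hypothetical listener: 14 of 16 anchor trials identified correctly
    p = binomial_p_value(correct=14, trials=16, chance=0.5)
    print("p = %.4f ->" % p,
          "anchors detected above chance" if p < 0.05 else "cannot exclude guessing")

A result below the chosen significance level supports the interpretation that the listener can in fact hear the anchor degradations, and hence that the test situation was sensitive enough for apparent transparency to be read as true transparency.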
On the other hand, if any listeners fail to identify these anchors correctly, this suggests that those listeners lacked sufficient expertise, or that there were sensitivity flaws in the experimental situation, or both. In that case, the apparent transparency of the systems cannot be properly interpreted, and the experiment will need to be run again with new listeners replacing those who failed this additional test, and with any other changes that may increase experimental sensitivity.
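Continuing the illustrative sketch above, with the same assumed binomial criterion, the per-listener check could be used to flag which panel members would need to be replaced before the test is repeated; the listener identifiers and scores below are hypothetical.

    from math import comb

    def binomial_p_value(correct, trials, chance=0.5):
        # Same illustrative one-sided test against chance as in the sketch above.
        return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
                   for k in range(correct, trials + 1))

    ALPHA = 0.05  # assumed significance level

    # Hypothetical anchor scores per listener: (correct identifications, trials)
    anchor_scores = {"listener_01": (15, 16),
                     "listener_02": (9, 16),
                     "listener_03": (16, 16)}

    to_replace = [name for name, (correct, trials) in anchor_scores.items()
                  if binomial_p_value(correct, trials) >= ALPHA]
    print("Listeners to replace before re-running the experiment:", to_replace)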