Yes, weaknesses is a better term
Fair enough.
Again, you're confusing ABX with the usual forum blind A/B tests
I'm not so sure that the former is much better than the latter.
Yes, exactly - sure, they can, with difficulty, but they are seldom catered for & this is why the tests are flawed
Vital's A/B tests seemed to cater for everything so I wouldn't call them flawed.
Sure, that's one item on the checklist covered, perhaps?
Yes, it seems like good, sensible practice, to me.
Sighted testing is considered to be just anecdotes. So too are the blind tests without the necessary controls/factors catered for - just anecdotes. Trying to elevate them above sighted tests is just a religious belief & does a disservice to well-organised blind testing.
Sighted testing is oft put forward as evidence though (think of our friend Mark, for example), not as the anecdote it should be seen as.
I disagree that only strictly controlled double-blind ABX testing is worthy of elevation above sighted testing as a means for discerning potential audible differences. Removing as many biases as possible is best, but removing some is better, the main one being the knowledge of what component or file is being tested at any given time, followed closely by level matching, and so on.
So, tempering this with our knowledge of just how poor audio memory is, I'd rate it like this - from worst to best -
Long-term sighted, non-level-matched comparison, and any other form of long-term testing.
Short-term sighted, non-level-matched.
Short-term sighted, level-matched.
Short-term blind, level-matched A/B or ABX.