Aural memory of a real system's sound, flawed as it may be, is still far superior to comparing system videos. I don't think there is even a contest.
Besides, even if you concede that aural memory shifts the parameters of comparison to some extent, the shift caused by unequal recording conditions between systems in different rooms is much more severe.
The only time you can (hopefully) avoid differing recording conditions is when documenting component exchanges in the same system, with the same recording equipment, the same microphone position, precisely equalized recording volume, etc. And even then the recordings can give you only an exceedingly crude idea.