Yes, exactly. Instead of just toggling the low-order bit as the J-Test does, we need to induce rate changes and see how the ASRC tracks the input and what artifacts it generates at those rate changes. Tracking errors here, as you explain, create distortions.
But won't rate changes happen anyway during the J-Test, & won't that rate-change distortion show on the graph?
I presume the rate change occurs naturally anyway, as a result of drift between the calculated average speed of the input clock & the output clock?
Here's my simplistic understanding of the workings of ASRCs & why they are not the solution to jitter & indeed introduce their own problems:
A portion of the incoming digital stream gets stored in a small buffer (not the whole song) & gets clocked out of this buffer by a new local clock. You can never say the two clocks are the same; there will always be some difference between them. So whether you are clocking out at a completely different sample rate (usually called upsampling, to 192 kHz or whatever) or at the same nominal clock speed, the following applies.
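To make that drift concrete, here's a toy sketch in Python (all figures hypothetical) of what happens to a fixed-size buffer sitting between two free-running clocks:

```python
# Toy sketch (hypothetical figures): a fixed buffer between two
# free-running clocks must eventually over- or under-fill.
f_in = 44100.5    # actual input clock in Hz - never exactly nominal
f_out = 44100.0   # local output clock in Hz

drift = 0.0       # net samples accumulated in the buffer
for second in range(1, 11):
    drift += f_in - f_out
    print(f"after {second:2d} s: buffer drift = {drift:+.1f} samples")
# The drift grows without bound, so samples cannot simply be copied
# through - the converter must resample to absorb the difference.
```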
The output amplitude of each outgoing sample has to be recalculated, based on an algorithm applied to the input sample & its surrounding samples. Why? Because the output sample is shifted in time & is therefore not the same slice in time through the waveform, so the amplitude at this new point in time will be different to the original. This requires an operation on the sample called interpolation, i.e. the new amplitude is calculated from the sample & its surrounding samples. Calculating the new amplitude requires an interpolation ratio.
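A minimal sketch of that interpolation step, using simple linear interpolation between two neighbouring samples. Real ASRCs use much longer polyphase filters, but the principle of deriving the new amplitude at a fractional time position from surrounding samples is the same:

```python
import math

def interpolate(samples, t):
    """Amplitude at fractional index t via linear interpolation.
    (Real ASRCs use long polyphase filters; this is the simplest
    stand-in for the idea.)"""
    i = int(t)
    frac = t - i
    return samples[i] * (1.0 - frac) + samples[i + 1] * frac

# A 1 kHz sine sampled at 44.1 kHz; an output sample that lands
# 0.3 of the way between input samples 100 and 101 gets an
# amplitude computed from both neighbours.
samples = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(200)]
print(interpolate(samples, 100.3))
```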
Calculating the interpolation ratio is where the problem arises: what is the input clock speed? The original clock fluctuates, so the clock speed has to be calculated as an average over a number of samples. This average figure may match the instantaneous clock of some samples or of none, certainly not all or even many. So, just to be mathematically accurate about all this, the ratio is incorrect & we will have amplitude errors on the output as a result: every sample whose actual clock speed does not exactly match the average will produce an amplitude error on the output. Will this error be small enough to be inaudible?
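Here's a rough sketch of that averaging problem (the jitter figure and window length are made up): the ASRC can only estimate an average input period over a window, & any sample whose instantaneous period deviates from that average is placed at a slightly wrong time position, hence interpolated to a slightly wrong amplitude:

```python
import random

NOMINAL = 1.0 / 44100    # nominal input sample period in seconds
JITTER = 5e-9            # made-up RMS clock jitter in seconds

# Arrival times of input samples as seen by the ASRC's local clock.
arrivals = [0.0]
for _ in range(1000):
    arrivals.append(arrivals[-1] + NOMINAL + random.gauss(0, JITTER))

# The ASRC can only estimate the input rate as an average over a window.
window = arrivals[-257:]
avg_period = (window[-1] - window[0]) / (len(window) - 1)

# Any sample whose actual period differs from the average gets a
# slightly wrong time position, hence a slightly wrong amplitude.
worst = max(abs((window[i + 1] - window[i]) - avg_period)
            for i in range(len(window) - 1))
print(f"average period {avg_period:.6e} s, worst deviation {worst:.1e} s")
```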
The second problem arises because we have to keep a check on the buffer: if it is filling up too quickly, our average clock calculation has to be re-done, because the new samples are arriving at a faster average clock rate. This means the interpolation ratio is recalculated & will be different to the ratio that applied to the last block of samples. Will this change in ratio cause a jump in distortion at the transition? It is speculated that it will, & experience suggests that there is a limiting threshold below which ASRCs do not improve the sound.
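And a sketch of that buffer-feedback loop (gain, block size & drift are all invented figures): the fill error drives discrete re-derivations of the ratio, & each re-derivation is a step the interpolator has to follow:

```python
# Sketch of the buffer-feedback servo (all figures hypothetical).
TARGET = 512.0    # desired buffer fill in samples
BLOCK = 4096      # samples consumed per control interval

true_ratio = 1.0001   # input actually running 100 ppm fast
ratio = 1.0           # the ASRC's current estimate
fill = TARGET

for block in range(8):
    fill += BLOCK * (true_ratio - ratio)   # mismatch piles up in buffer
    error = fill - TARGET
    ratio += 2e-7 * error                  # gentle proportional correction
    print(f"block {block}: fill {fill:8.3f}, ratio {ratio:.8f}")
# Each print shows a new ratio: a discrete jump at a block boundary,
# exactly the kind of transition questioned above.
```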