We'll have to agree to disagree on this one, Amir. For the theory, look at something like
Discrete-Time Signal Processing by Oppenheim and Schafer (pretty sure that's the one -- mine's at work). I do believe ideal quantization generates only odd-order terms, but of course there are other things at play, e.g. the relationship of the clock to the signal, that cause every bin to fill in even in an ideal system. That's why the usual white-noise approximations are valid and standardly used. (Standardly, is that a word? I be an engineer, not a grammar person...) To me, noise and distortion arise from different things in the circuit and have a very different impact on the output. I do not treat them the same. Maybe that's one difference between us low-brow hairy-knuckled engineers and the high-brow scientist types, but it's in me blood.
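A quick numerical sketch of the odd-order claim, if anyone wants to check it at home (all the parameters here are just illustrative choices of mine, not from the book): coherently sample a sine, run it through an ideal midtread quantizer, and compare the power in the odd vs. even harmonics.

```python
import numpy as np

N = 4096                 # record length
cycles = 127             # prime cycle count -> coherent sampling, no leakage
bits = 6                 # coarse quantizer so the harmonics are easy to see
x = np.sin(2 * np.pi * cycles * np.arange(N) / N)

step = 2.0 / (1 << bits)            # quantizer step for a +/-1 full-scale input
xq = step * np.round(x / step)      # ideal midtread quantization

X = np.abs(np.fft.rfft(xq))

def harmonic_power(h):
    """Power in harmonic h of the fundamental, folded back into 0..N/2."""
    b = (h * cycles) % N
    return X[min(b, N - b)] ** 2

odd = sum(harmonic_power(h) for h in (3, 5, 7, 9))
even = sum(harmonic_power(h) for h in (2, 4, 6, 8))
print(f"odd/even harmonic power ratio: {odd / max(even, 1e-300):.2e}")
```

The even harmonics come out at the numerical noise floor, because the midtread quantizer's error is an odd function of the input.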
Truncation to 16 bits is not the same as sampling with a 16-bit system; truncation will add distortion. In that paper, I think dither is primarily masking the truncation errors, and you need a goodly amount of energy to do that, thus the relatively high noise floor. Dither normally only needs to be a few LSBs, raising the noise floor by only a few dB, unless there is a lot of nonlinearity. At least, that's the way it works in the systems I have worked with (audio and RF, though we play other games in the RF world to decorrelate the spurs and not raise in-band noise). If I ever built a 16-bit converter with spurs that high (only about -40 dBFS, about what you might get out of a 5- or 6-bit converter), I'd be fired, or at least tied to my desk until I fixed it. Maybe I should run some plots to show the differences, hmmm... In any event, the paper is an interesting and useful look at the impact of dither, but the test case is unrealistic. I suspect something else is going on...
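Roughly the plots I have in mind (signal level, record length, and dither amount are all just illustrative): requantize a small sine to 16 bits straight, and again with about an LSB of TPDF dither, then compare the worst error line and the total error power.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 32768
cycles = 997                              # coherent sampling, no window needed
lsb = 2.0 / (1 << 16)                     # 16-bit step on a +/-1 scale
x = 8.3 * lsb * np.sin(2 * np.pi * cycles * np.arange(N) / N)  # small sine

plain = lsb * np.round(x / lsb)           # straight 16-bit requantization
tpdf = rng.uniform(-0.5, 0.5, N) + rng.uniform(-0.5, 0.5, N)   # +/-1 LSB TPDF
dithered = lsb * np.round((x + tpdf * lsb) / lsb)

def worst_line(y):
    """Biggest spectral line (power) in the error signal y - x."""
    return (np.abs(np.fft.rfft(y - x)) ** 2).max()

spur_plain = worst_line(plain)
spur_dith = worst_line(dithered)
print(f"worst error line, plain/dithered: {spur_plain / spur_dith:.1f}x")
```

The undithered error piles up in discrete harmonic lines (spurs); the dithered error is spread flat, so the worst line drops by a lot while the total error power rises a few dB -- exactly the trade I was describing.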
Also, the impact of noise decorrelation (dither) is a bit different in delta-sigma designs (which I assume is the architecture, given the rising noise in the response you plotted) than in a conventional converter. The early 1-bit, low-order loops were very prone to tones, and dither was (and is, for that matter, in any such loops) required to suppress tones in the output of the delta-sigma modulator (ADC or DAC). Technology and techniques have advanced so that modern architectures are more complex, higher-order, and often multi-bit, so dither is less critical for stability and tones, though it is often still added to produce a more pleasing noise floor.
To address your first point last, we agree on that one! That's why I am (usually) careful to distinguish between random and deterministic jitter. And, though I have yet to really research it in depth, I still feel deterministic jitter is a far worse culprit than random jitter for us (no matter what frequency we operate at).
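A little sketch of why I see it that way (jitter amounts and frequencies below are illustrative, not any measured system): sample a sine with random jitter and with single-tone (deterministic) jitter of the same rms. The random jitter just lifts the noise floor; the deterministic jitter concentrates the same energy into discrete sidebands, i.e. spurs.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 8192
k0 = 997                                  # signal bin (coherent sampling)
n = np.arange(N)
jrms = 5e-3                               # jitter rms, in sample periods

rand_j = rng.normal(0.0, jrms, N)                          # random jitter
det_j = np.sqrt(2) * jrms * np.sin(2 * np.pi * 37 * n / N) # single-tone jitter

def spectrum(jitter):
    """Power spectrum of a sine sampled at the jittered instants."""
    return np.abs(np.fft.rfft(np.sin(2 * np.pi * k0 * (n + jitter) / N))) ** 2

def worst_spur(P):
    """Biggest line excluding the fundamental bin."""
    P = P.copy()
    P[k0] = 0.0
    return P.max()

spur_det = worst_spur(spectrum(det_j))
spur_rand = worst_spur(spectrum(rand_j))
print(f"worst spur, deterministic/random jitter: {spur_det / spur_rand:.1f}x")
```

Same rms timing error, wildly different spectra: the deterministic case puts sidebands at the signal frequency plus/minus the jitter frequency, which is why it is the worse culprit for us.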
Back to practicing - Don