I will give a brief answer on the filter issue. Maybe Don can expand and/or write a new article on it.
A quick intro. Sampling theory only works if you filter out any and all components above half the sampling frequency. Any components left there create "aliasing," which in simple language means extra frequencies that should not be there. In a textbook, we can draw filters that are absolute. In the case of CD playback at 44.1 kHz, that filter can be a straight up-and-down step function at 22.05 kHz. In reality, we can never make such a filter. Remember that creating a perfect square wave requires infinite bandwidth and hence is not doable; a filter with that kind of cutoff is an impossibility for the same reason.
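To make the "extra frequencies" concrete, here is a minimal sketch of where an unfiltered component lands after sampling. The function name alias_frequency and the example tones are my own illustration, not anything from a particular DAC or library.

```python
def alias_frequency(f_hz, fs_hz):
    """Return the frequency at which a tone at f_hz shows up after
    sampling at fs_hz (hypothetical helper, for illustration only)."""
    folded = f_hz % fs_hz              # sampling cannot tell f from f + k*fs
    if folded > fs_hz / 2:             # above half the sampling rate: folds back
        folded = fs_hz - folded
    return folded


fs = 44_100                            # CD sampling rate in Hz
for tone in (10_000, 23_000, 25_000):
    print(tone, "Hz appears at", alias_frequency(tone, fs), "Hz")
```

A 10 kHz tone is below half the sampling rate and stays put, while the 23 kHz and 25 kHz tones fold back to 21.1 kHz and 19.1 kHz respectively.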
If we take into account that we only need 20 kHz out of that 22 kHz worth of bandwidth, we can construct a filter that starts rolling off at 20 kHz. That makes the filter more doable but still not quite possible. Such a filter, in the typical configuration found in IC DACs, still doesn't have zero output at 22 kHz. Some amount is likely to exist. If we further assume that we can't hear those components since they are above 20 kHz, it seems that all is well. Alas, that is not meant to be.
When you play sound in the above system, the "aliasing components" above 20 kHz mix with the high frequencies in your music. When you mix two signals, you get the original signals plus their sum and difference. If there is a component at 23 kHz, it will mix with something at 19 kHz and create a new component at the difference of the two, which is 4 kHz! So it is now possible to hear such a distortion since its frequency is much lower. This distortion is called Aliasing Intermodulation Distortion, or AID for short.
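The 23 kHz and 19 kHz arithmetic can be checked in a few lines. This is only a sketch: it assumes "mixing" means the two tones get multiplied together (as happens in a nonlinear stage), and mix_products is a made-up helper name.

```python
import math


def mix_products(f1_hz, f2_hz):
    """Difference and sum frequencies produced when two tones multiply
    (hypothetical helper, assuming mixing behaves like multiplication)."""
    return abs(f1_hz - f2_hz), f1_hz + f2_hz


diff_hz, sum_hz = mix_products(23_000, 19_000)
print(diff_hz, sum_hz)

# Spot-check the identity behind it at an arbitrary instant t:
# sin(a) * sin(b) = 0.5 * (cos(a - b) - cos(a + b))
t = 1.234e-4
lhs = math.sin(2 * math.pi * 23_000 * t) * math.sin(2 * math.pi * 19_000 * t)
rhs = 0.5 * (math.cos(2 * math.pi * diff_hz * t)
             - math.cos(2 * math.pi * sum_hz * t))
print(abs(lhs - rhs) < 1e-9)
```

The difference term is the 4 kHz component that lands squarely in the audible band; the 42 kHz sum term is far above it.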
You can minimize AID by starting the filter roll-off earlier and sacrificing some of your 20 kHz bandwidth.
In addition to AID, there is an issue of "ringing." The sharper a filter, the more its response varies in amplitude in the spectrum it is not supposed to touch.
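One way to see ringing is in the time domain: feed a filter a step and watch whether the output overshoots. The sketch below is my own toy comparison, not how any actual DAC filter is designed. It assumes a truncated-sinc FIR as the "sharp" filter and a Gaussian FIR as a gentle one; the tap count, cutoff, and sigma are arbitrary.

```python
import math


def sinc_taps(num_taps, cutoff):
    """Truncated ideal low-pass (sharp cutoff). cutoff is in cycles/sample."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = 2 * cutoff * (n - mid)
        taps.append(1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x))
    total = sum(taps)
    return [t / total for t in taps]    # normalize to unity gain at DC


def gaussian_taps(num_taps, sigma):
    """Gaussian low-pass (gentle roll-off). All taps are positive."""
    mid = (num_taps - 1) / 2
    taps = [math.exp(-0.5 * ((n - mid) / sigma) ** 2) for n in range(num_taps)]
    total = sum(taps)
    return [t / total for t in taps]


def step_overshoot(taps):
    """How far the step response peaks above its final value of 1.0."""
    running, peak = 0.0, 0.0
    for t in taps:
        running += t                    # step response is the running sum of taps
        peak = max(peak, running)
    return max(0.0, peak - 1.0)


sharp = step_overshoot(sinc_taps(101, 0.25))
gentle = step_overshoot(gaussian_taps(101, 2.0))
print(f"sharp filter overshoot:  {sharp:.3f}")
print(f"gentle filter overshoot: {gentle:.3f}")
```

The sharp filter overshoots by several percent and then oscillates around the final value, while the Gaussian one rises smoothly with no overshoot. The price is that the Gaussian filter's slope in the frequency domain is much shallower.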
Now let's assume that you have a 96 kHz sampling rate but still belong to the school that says we do not hear anything above 20 kHz. You can build a DAC with dual filters. The first one can be a special type of filter which does not ring. Such a filter will not have the slope that we need to cut off all the aliasing components (i.e. those above the 48 kHz bandwidth of 96 kHz sampling). But since we don't care about preserving all of that extra bandwidth, we can cascade a sharper filter which starts way later than 20 kHz but still manages to cut off everything at 48 kHz. Put another way, this system has a response that is flat to 20 kHz, then gradually rolls off toward zero and, at some point, sharply goes to zero (or very close to it).
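Here is a rough numerical sketch of that dual-filter response. Everything in it is assumed for illustration: I model the gentle stage as a Gaussian magnitude response and the sharp stage as a 12th-order Butterworth magnitude response, both with a 40 kHz corner. Real designs will use different filters and numbers. The only hard fact used is that the magnitudes of cascaded filters multiply.

```python
import math


def gaussian_mag(f_hz, f3db_hz):
    """Gentle stage: Gaussian magnitude response, -3 dB at f3db_hz."""
    return math.exp(-0.5 * math.log(2) * (f_hz / f3db_hz) ** 2)


def butterworth_mag(f_hz, fc_hz, order):
    """Sharp stage: Butterworth low-pass magnitude response."""
    return 1.0 / math.sqrt(1.0 + (f_hz / fc_hz) ** (2 * order))


def cascade_mag(f_hz):
    """Magnitudes of filters in series multiply (illustrative corners)."""
    return gaussian_mag(f_hz, 40_000) * butterworth_mag(f_hz, 40_000, 12)


for f in (10_000, 20_000, 30_000, 40_000, 48_000):
    level_db = 20 * math.log10(cascade_mag(f))
    print(f"{f:>6} Hz: {level_db:6.1f} dB")
```

With these made-up numbers the response is down by less than 1 dB at 20 kHz, slides gently through 30 kHz, and then drops steeply past 40 kHz, which is the gentle-then-sharp shape described above.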
Think of it as a car with two brakes. One that is gentle but not super effective. And one that is very effective but shakes your car like crazy. If you use the gentle brake most of the way and only at the last minute use the strong brake, you get the benefits of both. Same is true here.
Alas, when someone makes a chip DAC (what is used in typical AVRs, mass market CD players, etc.), they try to comply with the "spec." The spec for 96 kHz sampling says it must have a flat response (or close to it) up to 48 kHz. So they attempt to preserve the full bandwidth and put the same sharp cutoff filter at the end, which causes ringing. Discrete/dedicated DACs sport custom filters that deploy the above techniques and hence can minimize the above factors.
Note that there is a trade-off. As Don and I have noted in the past, when you run a DAC faster, it simply cannot do its job as well. The system will have more noise and accuracy drops. So there is no free lunch here and the world is full of trade-offs of this sort.