Jack,
But excessively simple, IMHO. There is more to it than Xmax and Xmin. When you establish the dynamic range of a concert, people talk of the ratio between the maximum level and the noise level of the room. But the noise of the room is useful information and part of the recording, so it must be encoded: it "needs some bits", though I do not know how many. The classical definition of dynamic range compares the maximum level of undistorted signal with the noise from the electronics, which behaves randomly and does not represent any useful information. IMHO we cannot go directly from one definition to the other. That is why some people say that at least 18 or 20 bits are needed for subjective quality listening at typical sound levels; adding two extra bits for implementation losses makes it 20 or 22.
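For reference, here is a rough sketch of what those bit counts mean under the classical definition only. It uses the textbook figure for an ideal uniform quantizer driven by a full-scale sine (about 6.02 dB per bit plus 1.76 dB), which comes to roughly 98 dB at 16 bits. The function name is made up for illustration, and the calculation says nothing about how many bits the room noise itself needs.

```python
import math

def ideal_dynamic_range_db(bits: int) -> float:
    """Theoretical SNR in dB of an ideal N-bit uniform quantizer.

    Assumes a full-scale sine input and uniformly distributed
    quantization noise (the usual 6.02*N + 1.76 dB approximation).
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

# Bit depths mentioned in the post: 16 (CD), 18/20, and 20/22 with two extra bits
for bits in (16, 18, 20, 22):
    print(f"{bits} bits: {ideal_dynamic_range_db(bits):.1f} dB")
```

Each extra pair of bits adds about 12 dB, so the two bits allowed for implementation losses buy back roughly that much margin.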
The preference for a higher number of bits or, in Mark's case, for DSD with its higher dynamic range, is a clear subjective indication that the low levels are not clearly reproduced in 44.1/16. Our brain prefers the representation that has fewer errors.