Are there differences in sound between these modes? If not, why must there be 2 modes just for gapless playback?
Yes, they are different now and will evolve differently.
Previously, discovery mode would download (cache) and then play tracks one by one, so each track was separated from the next by the time it took to download that next track. Concert mode would download all tracks in the queue, so you had a longer wait before playback started but less between tracks. I’ll explain later why there’s still a gap even in concert mode.
Right now discovery mode still employs the same scheme, but concert mode now starts playback as soon as the first track is downloaded and preloads the subsequent track(s) during playback. The goal is to achieve this without a loss in sound quality, which can be verified by performing A/B comparisons between the 2 playback modes.
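To make the three schemes above concrete, here’s a rough Python sketch of the timing only. It is not XDMS code: the function names, the one-track-ahead preload, and the example numbers are all assumptions for illustration, and it ignores the processing step described further down.

```python
# Timing sketch only; not XDMS code. All names and numbers are hypothetical.
# download_s[i] = seconds to download track i, duration_s[i] = its playing time.
# Both lists are assumed to be non-empty and of equal length.

def discovery(download_s, duration_s):
    """Download a track, play it, then download the next one."""
    startup = download_s[0]
    gaps = list(download_s[1:])  # each later track waits for its own download
    return startup, gaps

def concert_full_cache(download_s, duration_s):
    """Earlier concert mode: download the whole queue, then play."""
    startup = sum(download_s)
    gaps = [0] * (len(download_s) - 1)  # no download wait between tracks
    return startup, gaps

def concert_preload(download_s, duration_s):
    """Current concert mode: start after track 1, preload the next track.

    Assumption: the download of track i+1 starts when track i starts playing,
    so a gap only appears if that download outlasts the track being played.
    """
    startup = download_s[0]
    gaps = [max(0, nxt - cur) for nxt, cur in zip(download_s[1:], duration_s)]
    return startup, gaps

downloads = [4, 5, 6]        # made-up download times in seconds
durations = [180, 200, 240]  # made-up track lengths in seconds

for scheme in (discovery, concert_full_cache, concert_preload):
    startup, gaps = scheme(downloads, durations)
    print(f"{scheme.__name__}: startup wait {startup}s, gaps {gaps}")
```

With these numbers, preloading keeps the short startup wait of discovery mode while removing the download gaps between tracks. On a slow connection (say the second track takes 300 seconds to download), the preload sketch would show a 120-second gap before track 2, which is where full caching still has the edge.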
In the future, we’ll either make concert mode truly gapless without any degradation in sound quality, in which case discovery mode can be removed, changed into a non-caching playback mode, or changed into a playback mode with a small buffer, like Roon, for fast music discovery / less critical listening. Or perhaps we’ll manage to make this type of playback sound identical to full track caching, in which case the only advantage of concert mode would be stable playback on unstable internet connections.
It could also eventually lead to just having a cached/non-cached checkbox instead of 2 playback modes.
But we cannot know how this is going to end up, as there’s a lot of coding and development involved in each step, and we do want to be able to verify the sound quality implications of each step against the previous one. Although this makes it a much longer road to travel, it’s the one that gives us the best night’s rest, knowing we left no stone unturned.
Now, to circle back to why there’s that gap in the first place, even in the initial concert mode playback scheme: there’s more to it than just downloading / caching the track. The track is also processed by an engine we still have to think up a fancy name for, which was an @EuroDriver morning shower idea on why bit-identical files can sound different when downloaded from different sources, locations or media. This has been a controversial topic of discussion since the early days of digital file playback, starting with different methods of CD ripping, where CD rips were considered to sound different depending on the ripping software used, the ripping drive used, the power supplies used while ripping, etc. Nowadays this has more or less moved to local file playback versus playback from a NAS or streaming from online sources, where you can even have a preference for Tidal or Qobuz. This engine/process is a vital part of XDMS and not only provides a significant boost in sound quality but also aims to make the actual “source of bits” irrelevant.