One of the perennial debates often encountered here is whether WAV sounds better than FLAC on Naim streamers. The general consensus seems to be that any perceived differences are explained by the extra processing required to decode FLAC, and the resulting increase in digital noise which could be picked up by sensitive analogue components nearby.
A more contentious subset of this debate is whether FLAC transcoded on the fly to WAV on the uPnP server sounds different...?
Personally I can't tell them apart, but I was interested in whether there was any difference in the network traffic in these two scenarios so decided to do a little test.
I use a Shuttle barebone PC with an i5 processor and 4GB RAM, running Windows 7 with Asset as the server. This is connected via a Netgear GS116 switch to my NDS, all using CAT6 cables. They are about 15m apart with 3 solid brick walls in between. The music files are on a QNAP 459 Pro NAS next to the server.
I picked a relatively short 2 minute track, Intro off the XX album, and converted this from FLAC to WAV using dBpoweramp music converter. I put this and the original FLAC file into a new directory and edited the metadata to change the artist field to 'Test Material' so I could distinguish them from the original album. I forced a rescan on Asset, then checked I could see the new tracks on nStream. All as expected so far.
For the network capture I used WireShark running on the server and captured all the packets flowing between the server and the NDS while the 2 tracks were played.
For the sake of simplicity I ignored all the uPnP discovery, and the content look-up between nServe and the Server and only considered the actual delivery of the music stream. It starts with an HTTP Get from the NDS, specifying the uPnP URL of the required music file. For the FLAC file transcoded to WAV the URL format is a path with forced.wav at the end, for the native WAV the path ends with a unique file ID, also of type .wav. The URL path is not the path to the file on the NAS, it is a uPnP 'shorthand' using alphanumeric tokens to represent the required file.
The response from the server is an HTTP 206, followed by the required file, which is streamed back to the NDS. I looked closely at the WAV headers and they are different, but closer inspection showed that this was due to the metadata tags being included in the direct WAV but not in the transcoded WAV. The data chunk looks identical: exactly the same length, and containing exactly the same bytes at the various offsets I checked. I have not done an exhaustive comparison yet, but a close 'eyeball' inspection over 30 mins could not find any difference.
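For anyone wanting to go beyond the eyeball check, here is a minimal Python sketch of how the two WAV files could be compared on their audio alone. It walks the RIFF chunks, ignores everything except the 'data' chunk, and compares length plus a SHA-256 digest. The function names are made up for illustration, and it assumes both streams have been saved to disk as complete files with a correct chunk size in the header; it is not what was used in the test above.

```python
import hashlib
import struct

def riff_chunks(blob):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE byte string."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(blob):
        # Each chunk starts with a 4-byte ID and a little-endian 32-bit size.
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        yield chunk_id, blob[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length

def data_digest(blob):
    """Return (length, SHA-256 hex digest) of the 'data' chunk only."""
    for chunk_id, payload in riff_chunks(blob):
        if chunk_id == b"data":
            return len(payload), hashlib.sha256(payload).hexdigest()
    raise ValueError("no data chunk found")

def same_audio(path_a, path_b):
    """True if both files carry identical PCM, whatever their metadata chunks say."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return data_digest(fa.read()) == data_digest(fb.read())
```

Because only the 'data' chunk is hashed, extra tag chunks in the native WAV do not affect the result.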
Looking at the network exchange it is almost identical as well -
On the initial handshake the Max Segment Size is set to 1460 (no jumbo frames). The initial receive window from the NDS is 8192, which reduces to 6732 after 3 segments have been exchanged then remains constant. It's just textbook TCP, 1460 length segments, getting ACKed, final one has the FIN bit set. I can see absolutely no material differences, other than the presumably redundant transfer of a bunch of metadata at the end of the 'real' WAV which I guess the NDS just discards.
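To put those numbers in perspective, the arithmetic can be sketched in a few lines of Python. The MSS and window values are the ones from the capture; the 16-bit/44.1kHz stereo format and the 120 second duration are assumptions for illustration only, not measurements of the actual track.

```python
import math

MSS = 1460          # max segment size seen in the handshake
RECV_WINDOW = 6732  # steady-state receive window advertised by the NDS

def pcm_bytes_per_second(sample_rate=44100, bits=16, channels=2):
    """Raw PCM data rate; the defaults assume CD-quality audio."""
    return sample_rate * (bits // 8) * channels

def segment_count(payload_bytes, mss=MSS):
    """Number of TCP segments needed if every segment but the last is full."""
    return math.ceil(payload_bytes / mss)

def full_segments_in_flight(window=RECV_WINDOW, mss=MSS):
    """How many full-size segments fit inside the advertised receive window."""
    return window // mss

# Hypothetical 120 second track at CD quality.
track_bytes = pcm_bytes_per_second() * 120
print(track_bytes, segment_count(track_bytes), full_segments_in_flight())
```

Under those assumptions the PCM runs at 176400 bytes per second, a 2 minute track needs roughly 14,500 segments, and the 6732 byte window allows the server at most 4 full segments outstanding before it must wait for an ACK. None of that depends on whether the WAV was native or transcoded.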
So, no further forward in working out why some people hear a difference....
Posted on: 18 October 2012 by DaveBk
Thanks for the comments.
Simon, re the uPnP index, yes, I guessed this was the issue from your earlier post. If nStream remembers this, any material changes to the content between scans could screw it up. I doubt if uPnP required this to be persistent as it assumes the directory will be searched each time.
Jan, in my test all the conversion is done on the server, Simon was responding to a question from Foxman. We can rationalise how FLAC might sound different if it is decoded in the streamer, but if this happens on a server 15m away it's hard to understand how this could impact the sound. I was just curious to see if the network traffic was different enough to put additional processing demands on the streamer.
Posted on: 18 October 2012 by Hook
Nice job Dave! This proves that Asset transcoding works as advertised.
Like you said, if some folks are hearing differences, it cannot have anything to do with the bitstreams Asset is delivering. Perhaps other UPnP servers are not as transparent or reliable, or perhaps it comes down to mains fluctuations or some other temporal events affecting the comparisons...who knows.
From my perspective, there is nothing left to investigate. We know that Naim WAV rips and those produced by matching against the Accuraterip database are the same. And now we know that on-the-fly transcoding is reliable, so no reason for FLAC users to fret over whether or not to convert files. Job done, and topic closed IMO. Hoorah!
Thanks a lot for doing what you said you were going to do (an increasingly rare quality these days)!
ATB.
Hook
Posted on: 19 October 2012 by Jan-Erik Nordoen
Originally Posted by DaveBk:
Jan, in my test all the conversion is done on the server, Simon was responding to a question from Foxman. We can rationalise how FLAC might sound different if it is decoded in the streamer, but if this happens on a server 15m away it's hard to understand how this could impact the sound. I was just curious to see if the network traffic was different enough to put additional processing demands on the streamer.
Hi Dave,
Just to set my mind at rest, should I read this as meaning that you measured differences in network congestion ? (for example, queues of data packets building up quickly, persisting and slow in dissipating in one case, but not in the other).
Obviously, it's not my field, but I do want to understand it better,
Thanks,
Jan
Posted on: 19 October 2012 by Guido Fawkes
Interesting report ....
The only way I can see the data differing at the receiving end is in the metadata. As the NDS mainly wants the PCM, if anything would it not sound better with the transcoded WAV, where all the garbage, so to speak, has been removed before the data arrives? That saves it the job of discarding what it does not need.
I'm still surprised there is any difference between FLAC and WAV given the way the NDS appears to work, and if it is transcoded then surely it just sees it all as WAV. Does the NDS play PCM from its buffer, by which time the original format is long forgotten, having been dealt with in an optically isolated part of the design?
Linn could not measure any difference between processing FLAC and WAV, but different machine so results may be different.
Still good work Dave and interesting results ... conclusion I'd draw is if you can transcode then do, as it certainly will not sound worse if you do and may improve the performance of some players. So it seems a no lose strategy.
Posted on: 21 October 2012 by Hook
Hi Graham -
I agree that uncompressed FLAC offers the best of both worlds -- the sound quality of WAV, and the portability and ease of tagging of FLAC. If I were starting from scratch today, I would either use dbpoweramp and rip to uncompressed FLAC, or I would get a UnitiServe and let Naim manage the WAV ripping.
I started ripping to FLAC level 0 (minimal compression) several years back, and had recently been considering a batch conversion to either WAV, AIFF or uncompressed FLAC. But now that Dave has proven that Asset's on-the-fly transcoding is transparent, I no longer see the need.
And now that Mr. Spoon has finished porting Asset to MacOS, I hope he will turn his attention to Linux. Would be nice to be able to install Asset directly on my QNAP NAS.
ATB.
Hook
Posted on: 21 October 2012 by AMA
Thanks Dave, your efforts are highly appreciated.
I also don't hear any difference between WAV and FLAC, transcoded on the fly by Asset uPnP.
You clearly prove that both WAV and transcoded FLAC result in the same data transmission through the network so that data content and traffic load could not be a source of sonic variation.
I can still leave room for an explanation of why some people may hear a difference.
It may well be that decompressing the FLAC on the PC increases mains contamination due to the heavily increased CPU load.
If the audio system is not properly isolated from the PC mains line (which could be the case even if the system and PC are in different rooms) then it may cause a negative feedback and people start hearing the difference.
In this case I advise those who hear a difference to try the following experiment: play a WAV from the PC and at the same time run any decompression in the background (unrelated to the WAV being played). If they hear a deterioration -- this is it. That would be a good reason to reconsider the audio system mains wiring.
Posted on: 22 October 2012 by DaveBk
AMA,
My server is on a separate ring powered via a 1kVA UPS. The NDS is on a dedicated radial on a different consumer unit, so they are about as separate as possible for a domestic installation. Perhaps this is why I can't hear any difference?
There are so many variables that in theory could change the outcome, so my tests are only really valid under similar conditions. I'm lucky in that I can keep the computing side physically and electrically separated, other folk have different circumstances and may have different results.
I still want to find a way of exhaustively comparing the 2 data chunks 'on the wire' to make sure Asset is bit perfect, and network independent. My tests so far indicate this is the case, but I'd like to be absolutely sure.
Dave.
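One possible route to the exhaustive 'on the wire' comparison Dave mentions, sketched in Python. It assumes the server-to-NDS half of each TCP stream has already been saved from the capture as raw bytes, that each holds a single HTTP response, and that the body is not sent with chunked transfer encoding. Those are all assumptions, and the function names are hypothetical. The first function strips the HTTP headers; the second reports where two byte strings first differ.

```python
def split_http_response(raw):
    """Split one raw HTTP response into (status_line, headers, body)."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no end of HTTP headers found")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body

def first_difference(a, b):
    """Return the offset of the first differing byte, or None if a == b."""
    if a == b:
        return None
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i
    return limit  # one input is a prefix of the other
```

The bodies recovered this way are still complete WAV files, so the metadata differences already noted would show up as a mismatch; the comparison is only meaningful once it is restricted to the data chunk of each body.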