What is High Resolution Audio?
Posted by: Simon-in-Suffolk on 14 January 2016
I was reading a thought-provoking article by Bob Stuart at the AES on what constitutes high resolution audio. Here is my summary of his article.
Firstly he makes a good point: in hi-res audio we tend to borrow many metaphors and adjectives from the visual world, such as focus, transparency and definition. He maintains that hi-res audio should instead be natural, resembling real life. Sounds should have clear depth and positioning and separate readily into perceptual streams, particularly where environmental effects cause multiple arrivals at our ears, providing a temporal resolution of sound structures that is akin to spatial resolution in vision. I agree with this observation.
Stuart then goes on to define hi-res audio without falling into the trap of limiting it to the narrow digital-audio definitions of Nyquist sample frequency and bit depth. So in the analogue world, assuming high definition equates to natural sounding, he refers to research by JW Oppenheimer and others showing that hearing is not bound by pitch perception (approx 18 kHz); our brains appear to exploit population coding and can approach a temporal resolution of 8 µs. This implies a Gaussian bandwidth of about 44 kHz.
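As a rough sanity check on how 8 µs and roughly 44 kHz can be related, here is a minimal Python sketch. It assumes the 8 µs figure is read as the 10-90% rise time of a Gaussian low-pass response; that reading is an assumption made for illustration, not necessarily the derivation used in the paper, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def gaussian_rise_time_constant():
    """Return k such that rise_time * bandwidth = k for a Gaussian low-pass.

    rise_time is the 10-90% step-response rise time and bandwidth is the
    -3 dB point. The step response of a Gaussian filter is a normal CDF,
    so the 10-90% rise time is 2 * z90 * sigma_t.
    """
    z90 = NormalDist().inv_cdf(0.9)  # about 1.2816
    # -3 dB frequency of a Gaussian filter: sqrt(ln 2) / (2 * pi * sigma_t)
    return 2 * z90 * math.sqrt(math.log(2)) / (2 * math.pi)

def gaussian_bandwidth_hz(rise_time_s):
    """-3 dB bandwidth of a Gaussian low-pass with the given 10-90% rise time."""
    return gaussian_rise_time_constant() / rise_time_s

if __name__ == "__main__":
    k = gaussian_rise_time_constant()
    bw = gaussian_bandwidth_hz(8e-6)  # ASSUMPTION: 8 us treated as a rise time
    print(f"k = {k:.3f}, bandwidth = {bw / 1e3:.1f} kHz")
```

Under that assumption the constant comes out close to the familiar 0.35 rule of thumb, and the bandwidth lands in the low 40s of kHz, the same ballpark as the 44 kHz quoted above.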
Next Stuart introduces encoding and replay systems, still all in analogue. He suggests that in a typical recording and replay environment it could be appropriate to consider 8 cascaded stages. Each stage narrows the overall bandwidth of the chain, so to protect a bandwidth of 44 kHz through 8 cascaded interfaces, each stage should more likely have a bandwidth of 100 kHz.
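The cascade argument can be sketched numerically. The snippet below assumes every stage is an identical Gaussian low-pass, in which case the bandwidths combine in quadrature and the overall bandwidth shrinks by the square root of the number of stages. Real stages will not be ideal Gaussians, so this is only an order-of-magnitude illustration with hypothetical function names, not the calculation from the paper.

```python
import math

def cascaded_bandwidth_hz(stage_bw_hz, n_stages):
    """Overall -3 dB bandwidth of n identical Gaussian stages in series.

    ASSUMPTION: identical Gaussian responses, so
    1 / BW_total^2 = n / BW_stage^2.
    """
    return stage_bw_hz / math.sqrt(n_stages)

def per_stage_bandwidth_hz(total_bw_hz, n_stages):
    """Per-stage bandwidth needed so the whole chain keeps total_bw_hz."""
    return total_bw_hz * math.sqrt(n_stages)

if __name__ == "__main__":
    needed = per_stage_bandwidth_hz(44e3, 8)
    print(f"per-stage bandwidth for 44 kHz overall: {needed / 1e3:.0f} kHz")
    overall = cascaded_bandwidth_hz(100e3, 8)
    print(f"overall bandwidth of 8 x 100 kHz stages: {overall / 1e3:.0f} kHz")
```

With this idealised model, eight stages call for somewhat more than 100 kHz each (around 124 kHz), which is the same order as the figure quoted above.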
Now any of these stages could be digital, and then Nyquist and bit depth come into play. So here we can see a sample rate of at least 96 kHz, and up to 200 kHz (192 kHz in practice) given the cascaded bandwidth issues, to fully capture 8 µs spatial awareness. But referencing papers by M.S. Lewicki, and Jackson, Capp and Stuart, he states that the evidence shows that, with current digital sampling technology and hardware relying on decimation and interpolation, the gains above a 96 kHz sample rate are very much diminished, and that 96 kHz would preserve most of the spectral content.
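To connect the analogue bandwidths to the sample rates mentioned here, a small Python sketch of the Nyquist relationship follows. It only applies the textbook rule that the sample rate must be at least twice the bandwidth; it ignores anti-alias filter transition bands, and the function names are illustrative.

```python
def nyquist_bandwidth_hz(sample_rate_hz):
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2

def min_sample_rate_hz(bandwidth_hz):
    """Lowest sample rate that can represent the given bandwidth."""
    return 2 * bandwidth_hz

if __name__ == "__main__":
    for rate in (44_100, 96_000, 192_000):
        print(f"{rate} Hz sampling -> {nyquist_bandwidth_hz(rate):.0f} Hz bandwidth")
    for bw in (44_000, 100_000):
        print(f"{bw} Hz bandwidth -> {min_sample_rate_hz(bw)} Hz minimum sample rate")
```

A 44 kHz bandwidth needs at least 88 kHz sampling, for which 96 kHz is the nearest common rate, and a 100 kHz stage needs 200 kHz, for which 192 kHz is the closest standard rate.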
Now Stuart talks about the bit depth of sampled audio. Increasing the bit depth above 16 bits gives diminishing returns, and P.B. Fellgett's research into the thermal noise limit of a microphone shows that the fundamental limit of a microphone can be bounded by a 17.5 bit, 192 kHz LPCM channel. There is therefore often little justification for using more than 18 bits.
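To put the bit depths in more familiar terms, here is a short Python sketch using the standard textbook approximation for the signal-to-quantisation-noise ratio of an ideal N-bit LPCM channel, about 6.02 N + 1.76 dB. It assumes a full-scale sine wave and noise measured across the whole Nyquist band, and it ignores dither and noise shaping, so treat the numbers as indicative only.

```python
import math

def lpcm_dynamic_range_db(bits):
    """Approximate SQNR of an ideal N-bit LPCM channel for a full-scale sine.

    ASSUMPTION: ideal uniform quantiser, no dither or noise shaping.
    Equivalent to the familiar 6.02 * N + 1.76 dB rule.
    """
    return 20 * math.log10(2) * bits + 10 * math.log10(1.5)

if __name__ == "__main__":
    for bits in (16, 17.5, 18, 24):
        print(f"{bits} bits -> {lpcm_dynamic_range_db(bits):.1f} dB")
```

On this approximation 16 bits gives about 98 dB, 17.5 bits about 107 dB, 18 bits about 110 dB and 24 bits about 146 dB.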
Therefore, to summarise, Stuart effectively suggests that hi-res audio should capture the 8 µs temporal resolution as closely as possible and offer at least 18 bit LPCM dynamic range. (So with digital audio, at least a 96 kHz sample rate with an 18 bit sample length - or more likely the excessive 24 bit.)
Stuart then talks about air temperature, air flow and the shape of recording and listening rooms, as these will modify the audio, but that's another story.
So food for thought, and nice to have a view that is more from a science/engineering perspective rather than the marketing room. I recommend this paper to AES members on this forum rather than relying on my paraphrasing:
J. Audio Eng. Soc., Vol. 63, No. 10, October 2015.
Simon