Thursday, June 30, 2016

Digital Audio Part 3 - Digitalization

Audio digitalization
Digitalization is the process of turning a type of information or signal into numbers so that it can be stored. The numbers have to be binary, and in the case of audio the conversion is performed by a device called an analogue-to-digital converter, or ADC, which converts voltage values into binary numbers.

Each sample, once taken, is passed through the ADC and stored as a binary number.

In the figure below we use 3-digit binary numbers. Since each binary digit is called a bit (from binary digit), these are 3-bit numbers. It is easy to see that there are 8 (= 2^3) 3-bit numbers: 000, 001, 010, 011, 100, 101, 110, 111. To represent the various voltage values our samples can take, we divide the signal's range of variation into 8 levels and approximate each sample to the nearest level below it.

In the central part of the figure below, we can compare the exact samples (empty dots) with the digitalized samples (filled dots). The comparison shows that the maximum error is one division, that is, one step of the least significant bit. The reconstructed waveform differs considerably from the original because 3 bits is a very low resolution.
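The rounding-down rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original text: the signal range is taken as -1 to +1, and the function name `quantize_floor` is hypothetical.

```python
import math

def quantize_floor(value, n_bits=3, v_min=-1.0, v_max=1.0):
    """Return (code, level): the n-bit code and the voltage it stands for.

    The range v_min..v_max is an assumption made for this sketch.
    """
    levels = 2 ** n_bits                  # 8 levels for 3 bits
    step = (v_max - v_min) / levels       # size of one division
    code = int(math.floor((value - v_min) / step))
    code = max(0, min(levels - 1, code))  # keep v_max inside the top level
    return code, v_min + code * step

# Sample one cycle of a sine wave at 14.7 samples per cycle, as in the figure.
samples_per_cycle = 14.7
exact = [math.sin(2 * math.pi * k / samples_per_cycle) for k in range(15)]
quantized = [quantize_floor(x) for x in exact]

for x, (code, level) in zip(exact, quantized):
    print(f"{x:+.3f} -> {code:03b} ({level:+.3f})")

# Every error is smaller than one division (0.25 for this range).
errors = [x - level for x, (_, level) in zip(exact, quantized)]
print("largest error:", max(errors))
```

Raising `n_bits` shrinks the division and therefore the largest error, which is the effect the rest of the article quantifies.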

Figure (Audio digitalization): Effect of the sampling and digitalization process applied to a sine wave. The resolution is 3 bits and the sample rate is 14.7 times the wave's frequency. In the central panel, the empty dots represent the exact samples and the filled dots represent the digitalized samples. The bottom panel shows the reconstructed signal.

In the previous example we arbitrarily adopted a 3-bit resolution. The result, as we could see, was poor, because the reconstructed waveform is very distorted. It would be useful to have a more systematic criterion for selecting the required resolution.

The problem is similar to deciding how many decimal digits are required to express, to the nearest millimeter, the length of objects shorter than 1 m. We would need 3 decimal digits, because such objects measure between 0 and 999 mm. If we required a precision of tenths of a millimeter, we would need 4 digits, because the objects could measure between 0 and 9,999 tenths of a mm.
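The same counting argument works for binary digits. The short sketch below (the function names are hypothetical, chosen only for illustration) counts how many decimal digits or bits are needed to write every value from 0 up to a given number of distinct values.

```python
def decimal_digits_needed(n_values):
    """Decimal digits needed to write every integer from 0 to n_values - 1."""
    return len(str(n_values - 1))

def bits_needed(n_values):
    """Binary digits needed to write every integer from 0 to n_values - 1."""
    return max(1, (n_values - 1).bit_length())

print(decimal_digits_needed(1000))   # lengths from 0 to 999 mm
print(decimal_digits_needed(10000))  # lengths from 0 to 9,999 tenths of a mm
print(bits_needed(8))                # the 8 levels of the 3-bit example
```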

In the world of audio, the criterion that determines the "precision" is the signal-to-noise ratio. Let's analyze the example in the figure above from that point of view. Setting aside any noise the signal itself might contain, one side effect of digitalization is the appearance of an error that can be treated as noise. This noise is known as digitalization noise (more commonly called quantization noise). Under this interpretation, the maximum peak-to-peak value of the signal is proportional to 8, and the maximum peak-to-peak value of the noise is proportional to 1. Thus the signal-to-noise ratio is 8/1 = 8, which expressed in dB is:

$$\left.\frac{S}{N}\right|_{\text{dB}} = 20 \log_{10} 8 \approx 18\ \text{dB}$$

Considering that high-fidelity audio today works with signal-to-noise ratios of over 96 dB, it is clear why a 3-bit resolution is totally insufficient.

Let's suppose now that we increase the resolution to 4 bits. Because we now have 16 possible values instead of 8, the signal-to-noise ratio in dB would be:

$$\left.\frac{S}{N}\right|_{\text{dB}} = 20 \log_{10} 16 \approx 24\ \text{dB}$$

We can see that the ratio increased by 6 dB. This can be interpreted as follows: the signal's amplitude did not change, but when we doubled the number of levels, each level shrank to half its size, so the digitalization noise was halved as well. The signal-to-noise ratio therefore doubled, and a doubling is equivalent to a 6 dB increase. If we now add just 1 more bit, taking the resolution to 5 bits, the noise is halved again, so the signal-to-noise ratio gains another 6 dB.
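The 6 dB step per bit is easy to check numerically. The sketch below follows the simplified peak-to-peak model used in this article (signal proportional to the number of levels, noise proportional to one level); the function name `snr_db` is just an illustrative choice.

```python
import math

def snr_db(n_bits):
    """Signal-to-noise ratio in dB for 2**n_bits levels against 1 level of noise."""
    return 20 * math.log10(2 ** n_bits)

previous = None
for n in (3, 4, 5):
    current = snr_db(n)
    gain = "" if previous is None else f" (+{current - previous:.2f} dB)"
    print(f"{n} bits: {current:.2f} dB{gain}")
    previous = current
```

Each extra bit adds 20·log10(2), or roughly 6.02 dB, which the article rounds to 6 dB.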

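We can obtain a more general expression for the signal-to-noise ratio. If we adopt a resolution of n bits, where n is any positive integer, there are 2^n levels and the result is:

$$\left.\frac{S}{N}\right|_{\text{dB}} = 20 \log_{10} 2^{n} = 20\, n \log_{10} 2 \approx 6n\ \text{dB}$$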

Applying this formula to the standard 16-bit resolution used in the most popular digital audio storage formats gives a signal-to-noise ratio of 96 dB. Under normal conditions, this signal-to-noise ratio is enough to create impressive dynamic contrasts. Indeed, we rarely listen to music at levels above 110 dB (which is quite deafening and not recommended at all). If we subtract 96 dB from this value, we obtain 14 dB, a sound level that probably few people have ever "heard", because even at night, in a very quiet, isolated room, it is normally difficult to get sound pressure levels below 20 dB.
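The general expression can also be turned around to ask how many bits a target signal-to-noise ratio requires, or where the digitalization noise sits for a given playback level. The following is a minimal sketch using the same simplified model; the helper names are hypothetical, and it uses the unrounded 6.02 dB per bit, so the results differ slightly from the rounded figures in the text.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB gained per bit

def bits_for_snr(target_db):
    """Smallest whole number of bits whose ratio reaches target_db."""
    return math.ceil(target_db / DB_PER_BIT)

def noise_floor_db(peak_level_db, n_bits):
    """Level of the digitalization noise when the signal peaks at peak_level_db."""
    return peak_level_db - n_bits * DB_PER_BIT

print(bits_for_snr(96))                   # bits needed for a 96 dB ratio
print(round(noise_floor_db(110, 16), 1))  # noise level for music peaking at 110 dB
```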

It is worth noting that even if a system works with 16-bit digital audio formats, its signal-to-noise ratio will not necessarily be 96 dB. Every device contains a variety of analogue components that generate some noise of their own, and this noise is added to the digitalization noise. Low-cost equipment uses low-quality electronics, so its circuitry is particularly noisy and the signal-to-noise ratio is much lower than 96 dB.
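To get a feel for how analogue noise erodes the theoretical figure, the sketch below combines several noise sources into one overall ratio. It rests on an assumption the article does not state: the noise sources are uncorrelated, so their powers add, and each ratio is measured against the same signal level. The function name and the 80 dB analogue figure are made up for illustration.

```python
import math

def combined_snr_db(*snr_values_db):
    """Overall ratio in dB when independent noise sources add in power.

    Assumes uncorrelated noise, all ratios referred to the same signal level.
    """
    noise_power = sum(10 ** (-snr / 10) for snr in snr_values_db)
    return -10 * math.log10(noise_power)

# 96 dB from 16-bit digitalization plus a hypothetical 80 dB analogue stage.
print(round(combined_snr_db(96, 80), 2))

# Two equal noise sources cost about 3 dB.
print(round(combined_snr_db(96, 96), 2))
```

Under this assumption the noisiest stage dominates: the combined ratio ends up just below the worst individual figure.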

Bibliography: Federico Miyara (2003) Acústica y sistemas de sonido. UNR Editora
ISBN 950-673-196-9
