Sunday, June 26, 2016

Digital Audio - Part 2 Binary Numeration, Sampling & Sample rate.

In mathematics and digital electronics, a binary number is a number expressed in the binary numeral system or base-2 numeral system which represents numeric values using two different symbols: typically 0 (zero) and 1 (one). The base-2 system is a positional notation with a radix of 2. Because of its straightforward implementation in digital electronic circuitry using logic gates, the binary system is used internally by almost all modern computers and computer-based devices.  Each digit is referred to as a bit.

Binary Numeration System


Since all digital systems are based on binary numeration, we'll briefly review it before describing the basic sampling and digitization processes of sound. In decimal numeration (the system we habitually use), ten symbols are used (the digits 0, 1, 2, ..., 9), arranged in a positional system to represent successive amounts. This means that each new digit added to the left is worth 10 times more than the one to its right, for example:

27 = 2 × 10 + 7

306 = 3 × 10^2 + 0 × 10 + 6

In binary numeration, only two symbols are used (the digits 0 and 1). They are also arranged in a positional system, but now each new digit is worth only 2 times more than the one to its right. For example:

100101 = (1 × 2^5) + (0 × 2^4) + (0 × 2^3) + (1 × 2^2) + (0 × 2^1) + (1 × 2^0) = 37
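The positional expansion can be sketched in a few lines of Python. This is only an illustration; the function name `binary_to_decimal` is made up for this example and is not part of any library.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up digit * 2**position, counting positions from the right."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * 2 ** position
    return total


print(binary_to_decimal("100101"))  # 32 + 4 + 1 = 37
```

Python's built-in `int("100101", 2)` gives the same result; the loop is written out only to mirror the expansion digit by digit.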

The table below shows the conversion from decimal to binary for the numbers 0 to 15.

Decimal / Binary
Decimal numbers and their binary equivalents
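For reference, the same 0 to 15 conversion can be generated with a short Python sketch. The helper name `binary_table` and the fixed width of four bits are choices made for this example.

```python
def binary_table(limit: int = 15, width: int = 4) -> list[tuple[int, str]]:
    """Return (decimal, binary string) pairs for 0..limit, zero-padded to `width` bits."""
    return [(n, format(n, f"0{width}b")) for n in range(limit + 1)]


print("Decimal  Binary")
for number, bits in binary_table():
    print(f"{number:>7}  {bits}")
```

Four bits are enough here because 15 is the largest value that fits in four binary digits (1111).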


The reason we use binary numbers is that, electrically, it's much easier to encode 0s and 1s. We just need a high voltage level (5 V) to get a 1 and a very low voltage level (0 V) to get a 0. This makes the representation extremely insensitive to noise: the signal remains effectively recoverable even in the presence of 2 V of noise, which corresponds to a signal-to-noise ratio as low as 20 log (5/2) ≈ 8 dB (a figure that would be unacceptable in an analogue system).
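A rough Python sketch of this noise immunity follows. It rests on two assumptions that are not in the text: the 2 V figure is treated as the peak amplitude of the noise, and the receiver decides between 0 and 1 with a threshold at the midpoint (2.5 V). All names are made up for the example.

```python
import math
import random

HIGH_V = 5.0       # level used for a 1 (from the text)
LOW_V = 0.0        # level used for a 0 (from the text)
THRESHOLD_V = 2.5  # assumption: decision threshold halfway between the levels


def snr_db(signal_v: float, noise_v: float) -> float:
    """Signal-to-noise ratio in dB for two voltage amplitudes."""
    return 20 * math.log10(signal_v / noise_v)


def transmit(bits: list[int], noise_peak_v: float, rng: random.Random) -> list[float]:
    """Turn bits into voltages and add uniform noise within +/- noise_peak_v."""
    return [
        (HIGH_V if bit else LOW_V) + rng.uniform(-noise_peak_v, noise_peak_v)
        for bit in bits
    ]


def decode(voltages: list[float]) -> list[int]:
    """Recover bits by comparing each voltage with the threshold."""
    return [1 if v > THRESHOLD_V else 0 for v in voltages]


rng = random.Random(0)
sent = [rng.randint(0, 1) for _ in range(1000)]
received = decode(transmit(sent, noise_peak_v=2.0, rng=rng))

print(round(snr_db(HIGH_V, 2.0), 2))  # about 7.96 dB
print(received == sent)               # True: every bit survives the noise
```

With noise limited to 2 V, a 1 never falls below 3 V and a 0 never rises above 2 V, so the threshold always separates them.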

Sampling


Let's turn now to the concept of sampling. Acoustic signals (and therefore the electrical signals that represent them) vary continuously, which means that any interval, however small, contains infinitely many different values. However, the auditory message does not require that much information. First, the human ear has limited discrimination in time; second, it has limited amplitude discrimination, so it cannot distinguish values that are very close in time and differ very little in amplitude. Besides being unnecessary, handling that much information would be inconvenient and in practice impossible. This is where the concept of sampling arises. To sample a signal means to replace the original signal with a series of samples taken at regular intervals. The frequency at which the samples are taken is called the sample rate, Sr, and the time between samples is the sample period, ST. The two are related by:

ST = 1 / Sr
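As a minimal sketch of the idea, the following Python snippet takes samples of a sine wave at regular intervals, using a sample period equal to the reciprocal of the sample rate. The function names and the 1 kHz / 8 kHz figures are illustrative choices, not values from the text.

```python
import math


def sample_period(sample_rate_hz: float) -> float:
    """Time between consecutive samples, in seconds."""
    return 1.0 / sample_rate_hz


def sample_sine(freq_hz: float, sample_rate_hz: float, n_samples: int) -> list[float]:
    """Evaluate a unit-amplitude sine at the instants 0, T, 2T, ... (T = sample period)."""
    period = sample_period(sample_rate_hz)
    return [math.sin(2 * math.pi * freq_hz * n * period) for n in range(n_samples)]


# A 1 kHz sine sampled at 8 kHz yields 8 samples per cycle.
for n, value in enumerate(sample_sine(1_000, 8_000, 8)):
    print(n, round(value, 3))
```

The continuous wave is replaced by a finite list of numbers, one per sample period.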



Sample Rate


It's intuitively evident that the sample rate must be high enough, because that's the only way to capture a sufficient degree of detail, which in turn means the sound will be reproduced more faithfully to the original. There is one criterion that must be fulfilled throughout the whole sampling process: the sample rate must be higher than twice the maximum frequency present in the original signal, that is:


Sr > 2 × fmax

This follows from the sampling theorem, which states that a sampled signal can only be fully recovered if it was sampled in accordance with the criterion above. The frequency Sr/2 is called the Nyquist frequency.
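The criterion is easy to check numerically. The sketch below uses hypothetical helper names; the example values are the ones that appear elsewhere in this article (44.1 kHz with a 20 kHz audio band, and 40 kHz with a 35 kHz component).

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Half the sample rate."""
    return sample_rate_hz / 2


def satisfies_sampling_criterion(sample_rate_hz: float, max_freq_hz: float) -> bool:
    """True when the sample rate is strictly more than twice the highest frequency."""
    return sample_rate_hz > 2 * max_freq_hz


print(nyquist_frequency(44_100))                     # 22050.0
print(satisfies_sampling_criterion(44_100, 20_000))  # True
print(satisfies_sampling_criterion(40_000, 35_000))  # False
```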


The effect of sampling on a sine wave. The sample rate in this case is 14.7 times higher than the wave’s frequency.

It is important to understand that the maximum frequency in the previous formula is not just the highest frequency of interest, but the highest frequency actually present in the signal being sampled, even if that frequency comes from high-frequency noise polluting the signal. If the criterion is not fulfilled, additional frequency components will appear in the useful band when we try to recover the signal. To see this, suppose we sample an audio signal at a rate of 40 kHz, and an (inaudible) 35 kHz noise component is superimposed on the signal; this situation is illustrated in the figure below. As a consequence of sampling and the subsequent signal reconstruction, we get a frequency of 5 kHz (the difference between 40 kHz and 35 kHz) that wasn't originally present in the signal. This frequency, which takes the place of the original 35 kHz component, is called an alias frequency because it is an alias of the former. Note in particular that the original frequency (35 kHz) produced no audible sensation, whereas the new frequency is not only audible but also close to the region of maximum sensitivity of the human ear, so it is perceived as a noticeable and irritating whistle.
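The 35 kHz example can be reproduced with a short Python sketch. The function names are invented for this illustration. Cosines are used so that the two sets of samples coincide exactly; with sines the alias would appear with inverted sign.

```python
import math


def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Return the frequency between 0 and sample_rate_hz / 2 onto which freq_hz folds."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def cosine_samples(freq_hz: float, sample_rate_hz: float, n_samples: int) -> list[float]:
    """Samples of a unit-amplitude cosine taken at the given sample rate."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


print(alias_frequency(35_000, 40_000))  # 5000

# Sampled at 40 kHz, a 35 kHz cosine and a 5 kHz cosine give the same numbers.
original = cosine_samples(35_000, 40_000, 16)
alias = cosine_samples(5_000, 40_000, 16)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, alias)))  # True
```

Once the samples are taken, nothing in them distinguishes the two frequencies, which is why the alias has to be prevented before sampling rather than removed afterwards.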


Alias frequencies
The effect of sampling at a rate lower than twice the maximum frequency contained in the signal. A 35 kHz signal is sampled at 40 kHz; when we try to reconstruct it, an alias frequency of 5 kHz appears.

The previous example shows that if we intend to reconstruct the signal properly after sampling, it is essential to eliminate, before sampling, all spurious frequencies that lie above the audio spectrum, that is, above 20 kHz. A lowpass filter with a very steep slope beyond the cutoff frequency (96 dB/octave or higher), called the antialiasing filter, is used for this purpose.
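To get a feel for what a slope in dB/octave means, here is a deliberately idealized Python sketch. It assumes a filter with no attenuation up to the cutoff and a constant straight-line slope above it; real antialiasing filters do not follow this curve exactly, and the function name is made up for the example.

```python
import math


def asymptotic_attenuation_db(
    freq_hz: float,
    cutoff_hz: float,
    slope_db_per_octave: float = 96.0,
) -> float:
    """Attenuation of an idealized lowpass: 0 dB up to the cutoff, then a constant slope."""
    if freq_hz <= cutoff_hz:
        return 0.0
    octaves_above_cutoff = math.log2(freq_hz / cutoff_hz)
    return slope_db_per_octave * octaves_above_cutoff


print(asymptotic_attenuation_db(40_000, 20_000))  # one octave above cutoff: 96.0
print(asymptotic_attenuation_db(10_000, 20_000))  # inside the passband: 0.0
```

An octave is a doubling of frequency, so the attenuation grows with the base-2 logarithm of the ratio between the frequency and the cutoff.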

The choice of 44.1 kHz as the standard sampling frequency for digital audio was made precisely because of this problem and the consequent need for an antialiasing filter. If we impose a maximum frequency of 20 kHz for high-quality audio, then, since the filter slope is steep but not infinitely steep, spurious signals can only be considered reduced to insignificant levels above about 22 kHz, as can be seen in the illustration below. For this reason, a sample rate slightly higher than twice that value (44.1 kHz) was chosen. The exact value of 44.1 kHz, rather than 44 kHz, dates from the early days of digital recording on video tape, where it was chosen to make the two standards compatible.

A downside of antialiasing filters is their great complexity and the fact that they are far from harmless to the signal inside the passband (in this case, the audio band). Even if the filter affects the amplitude of the signal in that band only imperceptibly, it will appreciably affect the phase, which can alter the stereo image.


Antialiasing filter
Frequency response of an antialiasing filter used for high quality digital audio.

Bibliography: Federico Miyara (2003) Acústica y sistemas de sonido. UNR Editora
ISBN 950-673-196-9
