
The Nyquist Frequency

Aliasing

Sampling converts a continuous analog signal into a discrete digital signal. Intuitively, if the sample rate is too low, the digital signal will not be a good representation of the analog signal. If a sound wave oscillates at 10,000 Hz, then sampling at 5,000 Hz is insufficient because the samples completely miss half the cycles of the analog signal. With a sample rate of 5,000 Hz, it is impossible to faithfully record frequencies greater than or equal to 5,000 Hz. Those frequencies appear as noise in the audio recording.

However, there is another issue that occurs when sampling sound frequencies that are smaller than the sample rate. Consider the waves shown in Module 1. The blue wave has frequency 4.5 Hz, the red wave has frequency 1.5 Hz, and the sample rate is 6.0 Hz. The samples line up perfectly where the blue and red waves intersect. This means that both waves appear the same after sampling. This phenomenon is known as aliasing.

Module 1
Aliasing with positive and negative frequencies.
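The coincidence of the samples can be checked numerically. The sketch below is only an illustration: the phase of the waves in Module 1 is not stated, so cosine waves are assumed, and the helper name `sample_cosine` is made up for this example.

```python
import math

SAMPLE_RATE = 6.0  # Hz, the sample rate quoted for Module 1


def sample_cosine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Sample cos(2*pi*f*t) at the times t = n / sample_rate (cosine phase is an assumption)."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


blue = sample_cosine(4.5, 12)  # the 4.5 Hz wave
red = sample_cosine(1.5, 12)   # the 1.5 Hz wave

# The two sample sequences agree up to floating-point rounding.
same = all(math.isclose(b, r, abs_tol=1e-9) for b, r in zip(blue, red))
print(same)  # True
```

Once sampled, nothing in the data distinguishes the 4.5 Hz wave from the 1.5 Hz wave.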

The practical effect of aliasing is that high frequencies come through as low frequencies in the sampled signal. These aliased frequencies essentially become noise in the audio. The frequency beyond which aliasing occurs is called the Nyquist frequency, which is equal to half of the sample rate. It is also called the folding frequency since the higher frequencies "fold" into the lower frequencies as noise. In Module 1, the Nyquist frequency is 3.0 Hz, half of the sample rate 6.0 Hz.

One interesting property of aliasing is that the specific frequencies that are aliases of each other are symmetric about the Nyquist frequency. For the waves in Module 1, the frequencies of 1.5 Hz and 4.5 Hz are symmetric about the Nyquist frequency 3.0 Hz.
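The folding can be written as a small function. This is a sketch under the assumption that only the magnitude of the apparent frequency matters (sign and phase are ignored); the name `apparent_frequency` is hypothetical.

```python
def apparent_frequency(freq_hz, sample_rate):
    """Fold a frequency into the range [0, sample_rate / 2].

    Only the magnitude of the apparent frequency is returned;
    sign and phase are deliberately ignored in this sketch.
    """
    remainder = freq_hz % sample_rate
    return min(remainder, sample_rate - remainder)


sample_rate = 6.0
nyquist = sample_rate / 2  # 3.0 Hz

print(apparent_frequency(4.5, sample_rate))  # 1.5
print(apparent_frequency(1.5, sample_rate))  # 1.5

# 1.5 Hz and 4.5 Hz sit the same distance from the Nyquist frequency.
print(nyquist - 1.5 == 4.5 - nyquist)  # True
```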

Negative Frequencies

Two waves with the same frequency may or may not alias each other depending on the direction in which the wave is traveling. The "direction" a wave is traveling is determined by the sign of the amplitude, positive or negative. It turns out that a wave with a negative amplitude is the same as a wave with a positive amplitude and a negative frequency because of the following identity.

$$\sin(-x) = -\sin(x)$$

Module 2 shows how a single negative-frequency wave aliases to multiple positive-frequency waves.

Module 2
Aliasing with positive and negative frequencies.
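A numerical sketch of the same idea follows. The frequencies in Module 2 are not given in the text, so the values here (a -1.5 Hz sine wave sampled at 6 Hz) are assumptions chosen to match Module 1, and `sample_sine` is a made-up helper.

```python
import math

SAMPLE_RATE = 6.0  # Hz (assumed, to match Module 1)


def sample_sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Sample sin(2*pi*f*t) at the times t = n / sample_rate."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def same_samples(a, b):
    """Compare two sample sequences up to floating-point rounding."""
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))


negative = sample_sine(-1.5, 12)

# Compare the negative-frequency wave with several positive-frequency waves.
results = {
    freq: same_samples(negative, sample_sine(freq, 12))
    for freq in (1.5, 4.5, 10.5, 16.5)
}
print(results)  # {1.5: False, 4.5: True, 10.5: True, 16.5: True}
```

The +1.5 Hz sine does not match the -1.5 Hz sine, because its samples have the opposite sign, while 4.5 Hz, 10.5 Hz, and 16.5 Hz all produce identical samples.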

Intuition Behind the Nyquist Frequency

Let's say we have an analog clock with an "hour" hand that is pointing at 12 o'clock, which we will call 0 for simplicity. In one rotation around the clock (in 12 hours), we "sample" once per hour. In one period, we collect 12 samples.

$$\text{Sample Times} = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11$$

Now let's say there are two periodic events that occur: eating every 5 hours and sleeping every 17 hours.

$$\text{Eating Times} = 0, 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7, 0$$

$$\text{Sleeping Times} = 0, 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7, 0$$

The eating times and sleeping times are exactly the same. Eating and sleeping are aliases of each other. Let's add another periodic event, gaming, that occurs every 7 hours.

$$\text{Gaming Times} = 0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5, 0$$

The samples are the same, but in reverse. If we had sampled traveling backwards, the samples would have been the same. When traveling backwards, the wave can be described as having a negative frequency.

$$\text{Gaming Times Backwards} = 0, 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7, 0$$
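The clock sequences above can be reproduced with modular arithmetic. This is a minimal sketch; the function name `clock_positions` and the choice of 13 readings (one full cycle plus the return to 0) are illustrative assumptions.

```python
HOURS_ON_CLOCK = 12


def clock_positions(step_hours, count=13):
    """Clock readings of an event that repeats every step_hours hours.

    A negative step models the same event observed while traveling backwards.
    """
    return [(k * step_hours) % HOURS_ON_CLOCK for k in range(count)]


eating = clock_positions(5)
sleeping = clock_positions(17)
gaming = clock_positions(7)
gaming_backwards = clock_positions(-7)

print(eating)                      # [0, 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7, 0]
print(eating == sleeping)          # True
print(gaming == eating[::-1])      # True
print(gaming_backwards == eating)  # True
```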

Under what conditions does aliasing occur? Our three aliases so far have frequencies of -7 Hz, 5 Hz, and 17 Hz. Two signed frequencies $f_1$ and $f_2$ are aliases of each other if their difference is a multiple of the sample rate. When the two frequencies have opposite signs, this is the same as saying that their magnitudes sum to a multiple of the sample rate.

$$
\begin{aligned}
f_1 - f_2 &\equiv 0 \pmod{\text{samplerate}} \\
|f_1| + |f_2| &\equiv 0 \pmod{\text{samplerate}} \quad \text{(opposite signs only)}
\end{aligned}
$$

In these equations, "mod" refers to the modulo operation, which gives the remainder after division. It is sometimes written as an operator with the percent symbol %. All pairs of frequencies so far satisfy the first condition, and the pairs with opposite signs also satisfy the second. The pair 17 and 5 has the same sign, so only the first condition applies to it.

$$
\begin{aligned}
17 - 5 \quad &= 12 &&\equiv 0 \pmod{12} \\
5 - (-7) \quad &= 12 &&\equiv 0 \pmod{12} \\
|5| + |-7| \quad &= 12 &&\equiv 0 \pmod{12} \\
(-7) - 17 \quad &= -24 &&\equiv 0 \pmod{12} \\
|-7| + |17| \quad &= 24 &&\equiv 0 \pmod{12} \\
\end{aligned}
$$
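The difference test is easy to express with the % operator. The sketch below assumes signed integer frequencies and checks only whether the difference is a multiple of the sample rate; `are_aliases` is a hypothetical name.

```python
def are_aliases(f1, f2, sample_rate):
    """True when two signed integer frequencies differ by a multiple of the sample rate."""
    return (f1 - f2) % sample_rate == 0


SAMPLE_RATE = 12

print(are_aliases(17, 5, SAMPLE_RATE))   # True
print(are_aliases(5, -7, SAMPLE_RATE))   # True
print(are_aliases(-7, 17, SAMPLE_RATE))  # True
print(are_aliases(5, 7, SAMPLE_RATE))    # False: 7 matches 5 only when run backwards, as -7
```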

This means that the negative frequencies of -6, -7, -8, -9, -10, and -11 alias to the positive frequencies of 6, 5, 4, 3, 2, and 1. With a sample rate of 12, the Nyquist frequency is 6, so the set of frequencies that can be unambiguously captured is 0, 1, 2, 3, 4, and 5. Only the frequencies less than the Nyquist frequency can be unambiguously captured.

The Nyquist-Shannon Sampling Theorem

The Nyquist-Shannon sampling theorem states the following:

If a function $x(t)$ contains no frequencies higher than $B$ hertz, then it can be completely determined from its ordinates at a sequence of points spaced less than $1/(2B)$ seconds apart.

The Nyquist frequency is a direct consequence of this theorem. The proof of the theorem in the general case relies on the fact that $x(t)$ can be decomposed into a summation of sines and cosines involving complex numbers $a + bi$, where $i = \sqrt{-1}$. Another way to describe the theorem is that it defines the number of samples required to interpolate a sum of sine and cosine waves of different frequencies.
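One way to make "completely determined" concrete is the Whittaker-Shannon interpolation formula, quoted here as a standard result rather than something derived in this article. The symbols $T$ (the spacing between samples) and $x[n] = x(nT)$ (the samples themselves) are introduced only for this illustration.

```latex
x(t) = \sum_{n=-\infty}^{\infty} x[n] \, \operatorname{sinc}\!\left(\frac{t - nT}{T}\right),
\qquad
\operatorname{sinc}(u) = \frac{\sin(\pi u)}{\pi u},
\qquad
T < \frac{1}{2B}
```

Each sample contributes one shifted sinc pulse, and the sum of those pulses recovers the original signal at every time $t$, not only at the sample times.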

Choosing a Sample Rate

There are two major audio distortions that occur as the sample rate decreases. One distortion is that higher frequencies are no longer present, i.e. the audio becomes lower-pitched overall. The other distortion is that those higher frequencies are "folded" into the lower frequencies, resulting in more noise. Module 3 shows the effects of decreasing the sample rate on audio from a song.

Module 3
A song with a sample rate of 32000 Hz.
A song with a sample rate of 22050 Hz.
A song with a sample rate of 16000 Hz. This is where audio quality starts becoming noticeably worse.
A song with a sample rate of 11025 Hz.
A song with a sample rate of 8000 Hz.

The typical human can hear sound frequencies between 20 Hz and 20,000 Hz. (The upper limit is probably closer to 18,000 Hz in reality, but "20 to 20" is easier to remember.) Since most modern microphones have a sample rate of 48,000 Hz, the Nyquist frequency is 24,000 Hz, which is above the maximum 20,000 Hz that humans can hear.
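The same arithmetic can be applied to each sample rate mentioned in this section. A small sketch, assuming the 20,000 Hz upper hearing limit quoted above; the names `nyquist_frequency` and `UPPER_HEARING_LIMIT_HZ` are illustrative.

```python
UPPER_HEARING_LIMIT_HZ = 20_000  # the "20 to 20" upper limit quoted in the text


def nyquist_frequency(sample_rate_hz):
    """The Nyquist frequency is half of the sample rate."""
    return sample_rate_hz / 2


for rate in (48_000, 32_000, 22_050, 16_000, 11_025, 8_000):
    nyquist = nyquist_frequency(rate)
    covers_hearing = nyquist >= UPPER_HEARING_LIMIT_HZ
    print(f"{rate} Hz -> Nyquist {nyquist} Hz, covers full hearing range: {covers_hearing}")
```

Of the rates listed, only 48,000 Hz places the Nyquist frequency above the upper hearing limit.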

However, for real-world audio applications the sample rate is chosen to cut corners as much as possible without affecting the user experience. Processing audio can be computationally expensive, especially when using machine learning in real-time applications. The following table shows various applications and the sample rates used for each one.

| Application | Sample Rate | Reason |
| --- | --- | --- |
| Landline Phone Calls | 8,000 Hz | Human voice almost never exceeds 4,000 Hz |
| Voice Activity Detection | 8,000 Hz | Accuracy doesn't improve with higher sample rates |
| Speech-to-Text Transcription | 16,000 Hz | Accuracy doesn't improve with higher sample rates |
| Bose QC 45 Microphones | 16,000 Hz | Bluetooth effectively limits the bit rate of microphone audio |
| Compact Disc Audio | 44,100 Hz | PAL and NTSC video were used to store audio |
| Most Modern Microphones | 48,000 Hz | Allow resampling to lower rates for various reasons |
Copyright © 2024 Audio Internals