Decomposing sound into frequencies

From: 3blue1brown

The Fourier transform is a fundamental mathematical idea, and the video takes an animated approach to explaining it [00:00:07]. The primary goal of the video is to introduce the topic [00:00:16]. The central example is decomposing a sound into its frequencies [00:00:32], but the idea extends well beyond sound and frequency into many areas of math and physics [00:00:34].

Sound as a Time-Varying Signal

A pure sound, like an A at 440 beats per second (440 Hz), causes air pressure to oscillate up and down around its equilibrium, forming a wave [00:00:50]. That is, the pressure completes 440 oscillations each second [00:01:06]. A lower-pitched note, like a D, has the same wave structure but with fewer beats per second [00:01:09].

When multiple notes are played simultaneously, the resulting pressure versus time graph is the sum of the individual notes’ pressure differences [00:01:22]. This sum creates a more complex, wave-ish graph that is not a pure sine wave [00:01:41]. As more notes are added, the wave becomes increasingly complicated [00:01:48]. A microphone records this final sum of air pressure over time [00:02:03]. The central question is how to take such a complex signal and decompose it back into its pure constituent frequencies [00:02:10].
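As a rough illustration of that sum, the sketch below adds two pure tones sample by sample. The 294 Hz value for D, the unit amplitudes, the 8000 Hz sample rate, and the function names are assumptions chosen for the example, not values from the video.

```python
import math

def pure_tone(freq_hz, t):
    """Pressure deviation from equilibrium for a pure tone at time t (seconds)."""
    return math.sin(2 * math.pi * freq_hz * t)

def mixed_signal(freqs_hz, t):
    """Sum of pure tones: what a microphone would record at time t."""
    return sum(pure_tone(f, t) for f in freqs_hz)

# Assumed values: A at 440 Hz, D at roughly 294 Hz, 10 ms sampled at 8000 Hz.
sample_rate = 8000
times = [n / sample_rate for n in range(80)]
chord = [mixed_signal([440.0, 294.0], t) for t in times]
```

Plotting `chord` against `times` should give a wave that is no longer a pure sine, which is the signal the rest of the section tries to pull apart.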

The Winding Machine Concept

The general strategy for decomposing frequencies is to build a mathematical machine that treats signals with a specific frequency differently from others [00:02:29].

Wrapping the Graph Around a Circle

Initial Signal: Consider a pure signal, for example, 3 beats per second, over a finite time interval (e.g., 0 to 4.5 seconds) [00:02:40].
Rotating Vector: The key idea is to “wrap” this graph around a circle [00:02:55]. This is visualized by imagining a rotating vector whose length at each point in time corresponds to the height of the graph at that time [00:03:07]. High points on the graph result in a greater distance from the origin, while low points are closer to the origin [00:03:14].
Winding Frequency: The speed at which this vector rotates around the circle is called the winding frequency [00:03:42]. This is distinct from the signal’s own frequency [00:03:35]. Adjusting the winding frequency changes the appearance of the wound-up graph [00:04:03].
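The three steps above can be sketched as a function that turns each sample of the signal into a point in the plane. The example signal (a 3 beats per second cosine shifted up by 1), the 0.01 second step, and the names `g` and `wind` are assumptions for illustration; the clockwise direction follows the convention used later in the text.

```python
import math

def g(t):
    # Assumed example signal: 3 beats per second, shifted up so it is never negative.
    return 1 + math.cos(2 * math.pi * 3 * t)

def wind(signal, winding_freq, times):
    """Return (x, y) points of the graph wrapped around the origin.

    The vector rotates clockwise at winding_freq cycles per second, and its
    length at time t is signal(t).
    """
    points = []
    for t in times:
        angle = -2 * math.pi * winding_freq * t  # negative sign: clockwise
        r = signal(t)
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points

times = [n * 0.01 for n in range(451)]  # 0 to 4.5 seconds
wound = wind(g, 0.5, times)             # one full turn every 2 seconds
```

Changing the second argument of `wind` changes only how fast the vector rotates, not the signal itself, which is the distinction between winding frequency and signal frequency.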

Center of Mass Analysis

When the winding frequency matches the frequency of the signal (e.g., 3 beats per second), something special occurs: all the high points of the signal align on one side of the circle, and all the low points align on the opposite side [00:04:41].

To quantify this, imagine the wound-up graph as a metal wire with mass [00:04:59]. The center of mass of this wire will wobble as the winding frequency changes [00:05:08]. For most winding frequencies, peaks and valleys are spread out, keeping the center of mass close to the origin [00:05:16]. However, when the winding frequency matches the signal’s frequency, the center of mass shifts significantly, for instance, unusually far to the right [00:05:26].

A plot tracking the x-coordinate of this center of mass against different winding frequencies will show a spike at the signal’s frequency [00:05:42]. If the original signal is shifted up (oscillates around a positive value), there will also be a large spike around zero winding frequency, corresponding to the overall shift [00:07:07].
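A minimal numerical sketch of that plot, assuming the same shifted 3 beats per second signal on the interval from 0 to 4.5 seconds and approximating the center of mass by averaging evenly spaced samples (the step size and function names are hypothetical):

```python
import math

def g(t):
    # Assumed example signal: 3 beats per second, shifted up by 1.
    return 1 + math.cos(2 * math.pi * 3 * t)

def center_of_mass_x(signal, winding_freq, duration=4.5, dt=0.001):
    """Average x-coordinate of the wound-up graph for one winding frequency."""
    n = round(duration / dt)
    total = 0.0
    for k in range(n):
        t = k * dt
        # x-coordinate of the wound point: signal height times cos of the winding angle
        total += signal(t) * math.cos(2 * math.pi * winding_freq * t)
    return total / n

for f in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(f, round(center_of_mass_x(g, f), 3))
```

With these assumed values the printed x-coordinate should be close to 1 at winding frequency 0 (the upward shift), close to 0.5 at 3 (the signal's frequency), and close to 0 at the other frequencies.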

Decomposing Multiple Frequencies

This “almost Fourier transform” becomes particularly powerful when dealing with signals composed of multiple frequencies [00:08:43]. If a signal is a sum of two pure frequencies (e.g., 2 beats per second and 3 beats per second), applying the winding machine produces spikes at both 2 and 3 cycles per second [00:09:12]. This works because the machine is linear: the transform of a sum of signals is the sum of their individual transforms [00:09:36]. Since the transform of a pure frequency is close to zero everywhere except for a spike at that frequency, adding two pure frequencies gives a transform graph with peaks at exactly those two frequencies [00:10:11]. In this way the machine unmixes the original frequencies from their jumbled sum [00:10:29].
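The following sketch checks both claims numerically for a sum of a 2 and a 3 beats per second cosine. The 10 second interval, the step size, and all function names are assumptions made for the example.

```python
import math

def almost_transform(signal, winding_freq, duration=10.0, dt=0.001):
    """x-coordinate of the center of mass of the wound-up graph."""
    n = round(duration / dt)
    return sum(
        signal(k * dt) * math.cos(2 * math.pi * winding_freq * k * dt)
        for k in range(n)
    ) / n

def two_hz(t):
    return math.cos(2 * math.pi * 2 * t)

def three_hz(t):
    return math.cos(2 * math.pi * 3 * t)

def mixed(t):
    # The jumbled sum that a microphone would record.
    return two_hz(t) + three_hz(t)

# Peaks at the two constituent frequencies, roughly zero in between.
peaks = {f: almost_transform(mixed, f) for f in (2.0, 2.5, 3.0)}

# Linearity: transform of the sum equals the sum of the transforms.
f = 1.7
difference = almost_transform(mixed, f) - (
    almost_transform(two_hz, f) + almost_transform(three_hz, f)
)
```

Here `peaks` should hold values near 0.5 at 2.0 and 3.0 and near 0 at 2.5, and `difference` should be zero up to floating-point rounding.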

Practical Application: Sound Editing

One practical application of this concept is in sound editing [00:10:40]. If a recording has an unwanted high-pitch frequency, its Fourier transform will show a distinct spike at that high frequency [00:11:05]. By “smushing down” this spike in the frequency domain, one effectively filters out that high frequency from the sound [00:11:11]. An inverse Fourier transform can then convert this modified frequency representation back into a time-domain signal, yielding the original recording without the annoying pitch [00:11:21].
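A small sketch of this filtering idea, using a naive discrete Fourier transform (DFT) in place of the continuous transform discussed here. The 64-sample recording, the 3 and 20 cycles per second tones, and the function names are all assumptions; real audio tools would use a fast Fourier transform on far more samples.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform: one complex value per frequency bin."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))
        for k in range(n)
    ]

def inverse_dft(spectrum):
    """Naive inverse DFT: back from frequency bins to time samples."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * m / n) for k, c in enumerate(spectrum)) / n
        for m in range(n)
    ]

n = 64  # assumed: 64 samples covering one second
low = [math.cos(2 * math.pi * 3 * m / n) for m in range(n)]          # wanted tone
high = [0.5 * math.cos(2 * math.pi * 20 * m / n) for m in range(n)]  # unwanted pitch
recording = [a + b for a, b in zip(low, high)]

spectrum = dft(recording)
# "Smush down" the spike: a real signal's DFT has a mirrored bin at n - k,
# so both bins for the 20 cycles per second tone are set to zero.
for k in (20, n - 20):
    spectrum[k] = 0
cleaned = [c.real for c in inverse_dft(spectrum)]
```

In this idealized case `cleaned` matches `low` up to rounding error, because the unwanted tone falls exactly on one frequency bin.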

The Actual Fourier Transform

The “almost Fourier transform” described above tracks only the x-coordinate of the center of mass [00:11:55]. However, the center of mass is a point in two dimensions, so it also has a y-coordinate [00:12:02].

Complex Numbers and Rotation

In mathematics, particularly when dealing with two-dimensional concepts like rotation, it is elegant to think of them in the complex plane [00:12:05]. The center of mass then becomes a complex number with both a real (x) and an imaginary (y) part [00:12:12].

Euler’s formula provides a concise way to describe winding and rotation using complex numbers: e^(iθ) is the point on the unit circle at angle θ, measured in radians [00:12:32]. A rotation at 1 cycle per second is e^(2πit), where t is time; a rotation at f cycles per second is e^(2πift); and adding a negative sign, e^(-2πift), makes the rotation clockwise, which is the convention used for the Fourier transform [00:12:54].

Multiplying this exponential expression by the signal function g(t) (representing intensity versus time) causes the rotating vector to be scaled up and down according to the signal’s value, effectively drawing the wound-up graph [00:14:12]. This small expression elegantly encapsulates the idea of winding a graph around a circle with a variable winding frequency [00:14:31].
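In code, the expression g(t)·e^(-2πift) is a single complex multiplication. The sketch below assumes the same shifted 3 beats per second example signal; the name `wound_point` is hypothetical.

```python
import cmath
import math

def g(t):
    # Assumed example signal: 3 beats per second, shifted up by 1.
    return 1 + math.cos(2 * math.pi * 3 * t)

def wound_point(signal, f, t):
    """Point of the wound-up graph at time t for winding frequency f."""
    return signal(t) * cmath.exp(-2j * math.pi * f * t)

# At 1 cycle per second, t = 0.25 is a quarter turn clockwise from the
# positive real axis, so the point lies on the negative imaginary axis.
point = wound_point(g, 1.0, 0.25)
x, y = point.real, point.imag
```

The real and imaginary parts of `point` are the x- and y-coordinates of the rotating vector's tip.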

The Integral Formulation

To capture the center of mass of this wound-up graph, one can sample many points of the wound-up graph, add them together as complex numbers, and divide by the number of points [00:14:47]. In the limit of ever finer sampling, this average becomes an integral divided by the length of the time interval [00:15:14].

The actual Fourier transform is defined as this integral, without dividing out by the time interval [00:15:51]. This means it scales the center of mass by the length of the time interval considered [00:16:06]. Consequently, if a certain frequency persists for a longer time, the magnitude of the Fourier transform at that frequency increases [00:16:25]. For other frequencies, longer time intervals allow the wound-up graph to balance itself around the circle, cancelling out contributions [00:16:56].
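The effect of dropping the division can be seen in a short numerical sketch that approximates the integral with a Riemann sum. The 3 beats per second cosine, the step size, and the function names are assumptions for illustration.

```python
import cmath
import math

def g(t):
    # Assumed example signal: a pure 3 beats per second cosine.
    return math.cos(2 * math.pi * 3 * t)

def fourier_integral(signal, f, duration, dt=0.001):
    """Riemann-sum approximation of the integral of signal(t) * e^(-2*pi*i*f*t)
    over the interval from 0 to duration."""
    n = round(duration / dt)
    return sum(
        signal(k * dt) * cmath.exp(-2j * math.pi * f * k * dt) for k in range(n)
    ) * dt

def center_of_mass(signal, f, duration, dt=0.001):
    """The same integral divided by the interval length."""
    return fourier_integral(signal, f, duration, dt) / duration

short = fourier_integral(g, 3.0, 2.0)     # matching frequency, 2 seconds
longer = fourier_integral(g, 3.0, 4.0)    # matching frequency, 4 seconds
off_peak = fourier_integral(g, 2.0, 4.0)  # non-matching frequency
```

Under these assumptions `longer` should be about twice `short` (roughly 2 versus 1), while `center_of_mass(g, 3.0, ...)` stays near 0.5 for both durations and `off_peak` stays near 0.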

The common notation for the Fourier transform of a function g(t) is ĝ(f) (g-hat of f) [00:17:31]. This new function ĝ(f) takes in a frequency (f, the winding frequency) and outputs a complex number [00:17:22]. This complex number represents the strength of that given frequency in the original signal, with its real part being the x-coordinate and its imaginary part being the y-coordinate [00:17:38].

The theoretical definition of the Fourier transform often uses integration bounds from negative infinity to infinity, meaning it considers the limit as the time interval grows infinitely large [00:18:36].
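Written out, the pieces described in this section combine as follows. The first line is the center of mass over a finite interval, the second drops the division, and the third is the standard definition with infinite bounds; the interval endpoints t_1 and t_2 are symbols introduced here for illustration.

```latex
\begin{align*}
\text{center of mass:} \quad & \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} g(t)\, e^{-2\pi i f t}\, dt \\
\text{finite interval:} \quad & \int_{t_1}^{t_2} g(t)\, e^{-2\pi i f t}\, dt \\
\text{Fourier transform:} \quad & \hat{g}(f) = \int_{-\infty}^{\infty} g(t)\, e^{-2\pi i f t}\, dt
\end{align*}
```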

For a visual representation, see the video’s animation of the “winding machine” used to derive the Fourier transform [00:00:04].

Extension Beyond Sound

The Fourier transform extends to many other areas of mathematics and physics, beyond merely extracting frequencies from signals [00:19:01].
