Fourier Transform

Fourier Transform (FT) is one of the most important tools in signal processing (and really in computer science and mathematics in general); it enables us to express and manipulate any signal (such as a sound or picture) in terms of the frequencies it is composed of (rather than in terms of individual samples). It is so important because frequencies (basically sine waves) are actually THE important thing in signals: they allow us to detect things (voices, visual objects, chemical elements, ...), compress signals and modify them in useful ways (e.g. filter out noise in a specific frequency band, enhance specific frequency bands, ...). There also exists an algorithm called Fast Fourier Transform (FFT) which computes the same result for sampled signals but much faster (roughly N * log(N) operations instead of N^2 for N samples). For newcomers FT is typically not easy to understand, it takes time to wrap one's head around it, so don't worry if you feel dumb reading this.

What does FT do? It transforms the input signal from time (also space) domain -- the usual representation as an "array of samples captured in time" -- to frequency domain -- an "array of frequency amounts". There is also an inverse Fourier Transform that does the opposite (transforms the signal from frequencies back to time samples). The time and frequency representations are EQUIVALENT in that either one can be used to represent any signal, neither is superior to the other -- it turns out that any signal can be expressed as a sum of sine waves of different frequencies and so any signal can also be decomposed into them. In the frequency domain we can however do two important things that are hard to do in time domain: analyze what frequencies are present (which can help e.g. in voice recognition, spectral analysis, earthquake detection, music etc.) and also MODIFY them (a typical example is a music equalizer or a compression that removes or quantizes some frequencies); if we modify the frequencies, we may use the inverse FT to get a "normal" (time representation) signal back. Some things are also easier to do in the frequency domain, for example convolution becomes mere multiplication of the corresponding frequency values.
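To make the above more concrete, here is a minimal sketch of the discrete version of FT (DFT) and its inverse in plain Python. It is the naive, slow way of computing it (not FFT), and the names `dft`, `idft`, `signal` and `present` are only made up for this illustration. The example builds a signal out of two sine waves, finds which frequencies are present, removes one of them and transforms back.

```python
import cmath
import math

def dft(samples):
    """Naive discrete FT: time samples -> complex "frequency amounts"."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse discrete FT: "frequency amounts" -> time samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

# Signal made of two sine waves: 3 cycles (amplitude 1) and 5 cycles (amplitude 0.5).
N = 32
signal = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 5 * t / N)
    for t in range(N)
]

spectrum = dft(signal)

# ANALYZE: which frequencies are present? For a real signal the upper half
# of the spectrum mirrors the lower half, so checking the lower half is enough.
present = [k for k in range(N // 2) if abs(spectrum[k]) > 1e-6]

# MODIFY: remove the 5 cycle component (and its mirror image at N - 5).
spectrum[5] = 0
spectrum[N - 5] = 0

# Back to time domain: only the 3 cycle sine wave should remain.
filtered = [value.real for value in idft(spectrum)]
```

The absolute value of each complex number in `spectrum` says how much of the given frequency is present, its angle says how the wave is shifted (its phase).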

FT is actually just one of many so called integral transforms that are all quite similar -- they always transform the signal to some other domain and back, they use a similar equation but usually with a different function. Other integral transforms are for example discrete cosine transform (DCT) or wavelet transform. DCT is actually a bit simpler than FT, so if you are having a hard time with FT, go check out DCT.
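As a rough illustration of "similar equation, different function", here is a sketch of one common variant of DCT (usually called DCT-II) together with its inverse. It has the same shape as a naive discrete FT, but uses a plain real cosine instead of a complex wave, so no complex numbers are needed. The scaling of the coefficients differs between sources; the one used here is just one possible choice and the function names are made up for this example.

```python
import math

def dct(samples):
    """DCT-II without normalization: samples -> amounts of cosine waves."""
    n = len(samples)
    return [
        sum(samples[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        for k in range(n)
    ]

def idct(coefficients):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coefficients)
    return [
        coefficients[0] / n
        + 2 / n * sum(
            coefficients[k] * math.cos(math.pi / n * (t + 0.5) * k)
            for k in range(1, n)
        )
        for t in range(n)
    ]

samples = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
coefficients = dct(samples)    # all real numbers
restored = idct(coefficients)  # same as samples, up to rounding errors
```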

If you know linear algebra, this may help you understand what FT really does: imagine the signal we work with is a POINT (we can also say a vector) in a many-dimensional space; if for example we have a recorded sound that has 1000 samples, it is really a 1000 dimensional vector, a point in 1000 dimensional space, expressed as an "array" of 1000 numbers (vector components). FT does nothing more than transform from one vector basis ("system of coordinates", "frame of reference") to another basis; i.e. by default the signal is expressed in time domain (our usual vector basis), the numbers in the sound "array" are such because we are viewing them from the time "frame of reference" -- FT will NOT do anything to the signal itself (it is a vector/point in space, which will stay where it is, the recorded sound itself will not change), it will merely express this same point/vector from a different "point of view"/"frame of reference" (set of basis vectors) -- that of frequencies. That's basically how all the integral transforms work, they just have to ensure the basis they are transforming to is orthogonal (i.e. kind of legit, "usable") of course.
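The linear algebra view can be checked with a few lines of code. The following sketch (discrete FT on 8 samples, all names made up for the illustration) builds the frequency basis vectors, shows they are orthogonal to each other, computes the coordinates of a signal in that basis with plain dot products and then rebuilds the very same signal from those coordinates.

```python
import cmath
import math

N = 8  # number of samples, i.e. dimension of the vector space

def basis_vector(k):
    """k-th frequency basis vector: a complex wave making k cycles over N samples."""
    return [cmath.exp(2j * math.pi * k * t / N) for t in range(N)]

def dot(a, b):
    """Dot product of complex vectors (the second vector gets conjugated)."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

basis = [basis_vector(k) for k in range(N)]

# Orthogonality: dot product of two DIFFERENT basis vectors is 0,
# dot product of a basis vector with itself is N.
gram = [[dot(basis[i], basis[j]) for j in range(N)] for i in range(N)]

# Some signal in the usual time basis.
signal = [0.0, 1.0, 0.0, -1.0, 2.0, 0.5, -0.5, 3.0]

# Coordinates of the SAME point in the frequency basis (this is the
# discrete FT, here divided by N because the basis vectors have length sqrt(N)).
coordinates = [dot(signal, basis[k]) / N for k in range(N)]

# Summing the basis vectors scaled by the coordinates gives the signal back.
rebuilt = [sum(coordinates[k] * basis[k][t] for k in range(N)) for t in range(N)]
```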

As if it wasn't hard enough, there are quite a few different types of FT. For starters FT can be done in any number of dimensions: 1D FT (e.g. for sounds) is of course the simplest, but there also exists 2D FT (used e.g. for pictures), 3D FT etc.
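For the discrete case a 2D FT can be computed just with the 1D one: transform every row, then transform every column of the result. Here is a small sketch of that (naive and slow, names made up for this example) on a tiny 4x4 "picture" with vertical stripes.

```python
import cmath
import math

def dft(samples):
    """Naive 1D discrete FT."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def dft2d(image):
    """2D discrete FT: 1D transform of each row, then of each column."""
    rows = [dft(row) for row in image]
    columns = [dft(list(column)) for column in zip(*rows)]
    # columns[x][y] -> result[y][x], so that the layout matches the input
    return [list(row) for row in zip(*columns)]

# Vertical stripes: brightness changes only horizontally (along x).
image = [[1, 0, 1, 0] for _ in range(4)]

spectrum = dft2d(image)
```

In this example only `spectrum[0][0]` (the overall brightness) and `spectrum[0][2]` (2 cycles horizontally, 0 cycles vertically) are non-zero, which matches the picture: the stripes repeat twice from left to right and nothing changes from top to bottom.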

TODO: more, code, picture etc.