Short-time Fourier transform

The short-time Fourier transform (STFT) is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time.[1]

In practice, the procedure for computing STFTs is to divide a longer time signal into shorter segments of equal length and then compute the Fourier transform separately on each shorter segment.

The changing spectra are then usually plotted as a function of time. The resulting plot is known as a spectrogram or waterfall plot, and is commonly used in software-defined radio (SDR) spectrum displays.

Full bandwidth displays covering the whole range of an SDR commonly use fast Fourier transforms (FFTs) with 2^24 points on desktop computers.

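In the continuous-time case, the function to be transformed is multiplied by a window function that is nonzero for only a short period of time. The Fourier transform (a one-dimensional function) of the resulting signal is taken, and the window is then slid along the time axis to the end of the signal, resulting in a two-dimensional representation of the signal. Mathematically, this is written as

$$X(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, e^{-i\omega t}\, dt,$$

where w(t) is the window function, x(t) is the signal to be transformed, and X(τ, ω) is essentially the Fourier transform of x(t)w(t − τ), a complex function representing the phase and magnitude of the signal over time and frequency.

Phase unwrapping is often applied along the time axis τ, the frequency axis ω, or both, to suppress any jump discontinuity of the phase result of the STFT.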

In the discrete time case, the data to be transformed could be broken up into chunks or frames (which usually overlap each other, to reduce artifacts at the boundary).

Each chunk is Fourier transformed, and the complex result is added to a matrix, which records magnitude and phase for each point in time and frequency.

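This can be expressed as

$$X(m, \omega) = \sum_{n=-\infty}^{\infty} x[n]\, w[n-m]\, e^{-i\omega n},$$

with signal x[n] and window w[n]. In this case, m is discrete and ω is continuous, but in most typical applications the STFT is performed on a computer using the fast Fourier transform, so both variables are discrete and quantized.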
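The framing procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the Hann window, the frame length of 64, the hop of 32 samples, and the 1 kHz test tone are all assumptions chosen for the example, and a naive DFT stands in for the FFT so that the block needs only the standard library.

```python
import cmath
import math


def hann(length):
    """Periodic Hann window of the given length (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / length) for n in range(length)]


def dft(frame):
    """Naive O(N^2) DFT; a practical implementation would use an FFT."""
    n_pts = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def stft(signal, frame_len, hop):
    """Return one spectrum per frame (rows = time, columns = frequency).

    The phase of each row is referenced to the start of its own frame,
    not to the start of the whole signal.
    """
    window = hann(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        rows.append(dft(frame))
    return rows


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]

matrix = stft(tone, frame_len=64, hop=32)  # frames overlap by 50 %
spectrogram = [[abs(c) ** 2 for c in row] for row in matrix]

# Strongest bin among the non-redundant half of the first frame.
peak_bin = max(range(32), key=lambda k: spectrogram[0][k])
print(len(matrix), "frames; peak at", peak_bin * fs / 64, "Hz")
```

With these assumed parameters the bins are 125 Hz apart, so the tone falls exactly on bin 8.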

The magnitude squared of the STFT yields the spectrogram representation of the power spectral density of the function:

$$\operatorname{spectrogram}\{x(t)\}(\tau, \omega) \equiv |X(\tau, \omega)|^2,$$

where X(τ, ω) is the STFT of x(t).

See also the modified discrete cosine transform (MDCT), which is also a Fourier-related transform that uses overlapping windows.

If only a small number of ω are desired, or if the STFT is desired to be evaluated for every shift m of the window, then the STFT may be more efficiently evaluated using a sliding DFT algorithm.
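As a rough illustration of the sliding DFT idea, the sketch below tracks a single frequency bin across every window shift. It assumes a rectangular window of N samples and uses the standard recurrence X_{m+1}[k] = (X_m[k] − x[m] + x[m+N]) e^{j2πk/N}; the function names and the test signal are hypothetical.

```python
import cmath
import math


def direct_bin(signal, start, n_pts, k):
    """Bin k of the n_pts-point DFT of signal[start:start + n_pts]."""
    return sum(
        signal[start + n] * cmath.exp(-2j * math.pi * k * n / n_pts)
        for n in range(n_pts)
    )


def sliding_dft(signal, n_pts, k):
    """Track bin k for every window shift with O(1) work per shift."""
    twiddle = cmath.exp(2j * math.pi * k / n_pts)
    current = direct_bin(signal, 0, n_pts, k)  # one direct evaluation to start
    values = [current]
    for m in range(len(signal) - n_pts):
        # Drop the oldest sample, add the newest, then rotate the phase.
        current = (current - signal[m] + signal[m + n_pts]) * twiddle
        values.append(current)
    return values


# Hypothetical test signal.
samples = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(40)]
tracked = sliding_dft(samples, n_pts=16, k=3)

# Compare against a direct evaluation at every shift.
max_error = max(
    abs(tracked[m] - direct_bin(samples, m, 16, 3)) for m in range(len(tracked))
)
```

Each update costs a constant number of operations per bin, compared with N operations for a direct evaluation. In long-running use the recurrence accumulates rounding error, so practical implementations often re-seed it periodically.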

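Given the width and definition of the window function w(t), we initially require the area of the window function to be scaled so that

$$\int_{-\infty}^{\infty} w(\tau)\, d\tau = 1.$$

It easily follows that

$$\int_{-\infty}^{\infty} w(t-\tau)\, d\tau = 1 \quad \forall\, t$$

and

$$x(t) = x(t) \int_{-\infty}^{\infty} w(t-\tau)\, d\tau = \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, d\tau.$$

The continuous Fourier transform is

$$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-i\omega t}\, dt.$$

Substituting x(t) from above:

$$X(\omega) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, d\tau \right] e^{-i\omega t}\, dt = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, e^{-i\omega t}\, d\tau\, dt.$$

Swapping order of integration:

$$X(\omega) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, e^{-i\omega t}\, dt \right] d\tau = \int_{-\infty}^{\infty} X(\tau, \omega)\, d\tau,$$

where X(τ, ω) is the STFT of x(t) with the window centered at τ. So the Fourier transform can be seen as a sort of phase-coherent sum of all of the STFTs of x(t).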
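The "phase-coherent sum" statement can be checked numerically in a discrete, circular setting. The sketch below is an analogue under stated assumptions, not the continuous derivation itself: the window is a hypothetical short triangle scaled so its samples sum to 1 (the discrete counterpart of unit area), and shifts wrap around a 16-sample signal.

```python
import cmath
import math

n_pts = 16
signal = [math.sin(0.7 * n) + 0.1 * n for n in range(n_pts)]

# Hypothetical window: a short triangle whose samples sum to 1.
raw = [1.0, 2.0, 3.0, 2.0, 1.0]
window = [v / sum(raw) for v in raw] + [0.0] * (n_pts - len(raw))


def dft_bin(values, k):
    """Bin k of the n_pts-point DFT of `values`."""
    return sum(
        values[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts)
    )


def stft_bin(shift, k):
    """Bin k of the signal times the window circularly shifted by `shift`."""
    windowed = [signal[n] * window[(n - shift) % n_pts] for n in range(n_pts)]
    return dft_bin(windowed, k)


# Summing the STFT over every shift should reproduce the plain DFT.
max_error = max(
    abs(sum(stft_bin(m, k) for m in range(n_pts)) - dft_bin(signal, k))
    for k in range(n_pts)
)
```

The error stays at rounding level because, for every sample index n, the shifted windows sum to 1, exactly as the unit-area condition requires in the continuous case.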

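An alternative definition of the inverse transform, valid only in the vicinity of τ (where w(t − τ) ≠ 0), is:

$$x(t) = \frac{1}{w(t-\tau)}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\tau, \omega)\, e^{i\omega t}\, d\omega$$

In general, the window function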

The STFT has a fixed resolution: the width of the window determines whether the transform gives good frequency resolution or good time resolution, but not both. This is one of the reasons for the creation of the wavelet transform and multiresolution analysis, which can give good time resolution for high-frequency events and good frequency resolution for low-frequency events, the combination best suited for many real signals.

This trade-off is related to the Heisenberg uncertainty principle, but not directly; see Gabor limit for discussion.

The bound set by the uncertainty principle (the best simultaneous resolution in both time and frequency) is reached with a Gaussian window function (or mask function), because the Gaussian minimizes the time-bandwidth product in the Fourier uncertainty relation.
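As a short worked check, take a Gaussian window with width parameter s (a symbol introduced here for illustration) and measure the spreads σ_t and σ_f as the standard deviations of |g(t)|² and |G(f)|², with f in hertz. Under that convention the product of the two spreads equals the Gabor limit 1/(4π) for every s:

```latex
g(t) = e^{-t^2/(2s^2)}
  \;\Rightarrow\; |g(t)|^2 = e^{-t^2/s^2},
  \qquad \sigma_t = \frac{s}{\sqrt{2}}

G(f) = \int_{-\infty}^{\infty} g(t)\, e^{-2\pi i f t}\, dt
     = s\sqrt{2\pi}\; e^{-2\pi^2 s^2 f^2}
  \;\Rightarrow\; |G(f)|^2 \propto e^{-4\pi^2 s^2 f^2},
  \qquad \sigma_f = \frac{1}{2\sqrt{2}\,\pi s}

\sigma_t\, \sigma_f
  = \frac{s}{\sqrt{2}} \cdot \frac{1}{2\sqrt{2}\,\pi s}
  = \frac{1}{4\pi}
```

Widening the window (larger s) lowers σ_f and raises σ_t by the same factor, so the product never drops below this value.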

One can consider the STFT over a range of window sizes, computing a two-dimensional (time and frequency) representation for each size, as illustrated in the example below.

However, this is no longer a strictly time-frequency representation – the kernel is not constant over the entire signal.

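When the original function is:

$$X(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t-\tau)\, e^{-i\omega t}\, dt$$

We can have a simple example:

w(t) = 1 for |t| ≤ B
w(t) = 0 otherwise
B = window half-width

Now the original function of the short-time Fourier transform can be changed to

$$X(\tau, \omega) = \int_{\tau-B}^{\tau+B} x(t)\, e^{-i\omega t}\, dt$$

Another example: Using the following sample signal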

Taking the Fourier transform of a window of N samples, acquired at sampling rate fs, produces N complex coefficients.

Of these coefficients only half are useful (the last N/2 being the complex conjugates of the first N/2 in reverse order, as this is a real-valued signal).
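The conjugate symmetry mentioned above is easy to confirm on a small case. The following sketch uses an arbitrary, hypothetical 8-sample real signal and a naive DFT:

```python
import cmath
import math

n_pts = 8
# Arbitrary real-valued samples, chosen only for illustration.
real_signal = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, -2.0]

spectrum = [
    sum(
        real_signal[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
        for n in range(n_pts)
    )
    for k in range(n_pts)
]

# For real input, bin N - k is the complex conjugate of bin k.
symmetry_error = max(
    abs(spectrum[n_pts - k] - spectrum[k].conjugate()) for k in range(1, n_pts)
)
```

The symmetry error is at rounding level, so bins 0 through N/2 carry all the information. This is why FFT libraries commonly offer a real-input variant that returns only those bins.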

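These coefficients are spaced fs/N Hz apart, so increasing the frequency resolution means reducing fs/N. There are only two variables, but decreasing fs (and keeping N constant) will cause the window size to increase, since there are now fewer samples per unit time. Increasing N instead also lengthens the window.

So any attempt to increase the frequency resolution causes a larger window size and therefore a reduction in time resolution, and vice versa.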
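A small arithmetic sketch makes the trade-off concrete. It relies on the standard DFT relations (bin spacing fs/N, window duration N/fs); the function name and the numbers are hypothetical.

```python
def window_tradeoff(fs, n_samples):
    """Return (bin spacing in Hz, window duration in seconds)."""
    return fs / n_samples, n_samples / fs


base = window_tradeoff(8000, 256)       # starting point
halved_fs = window_tradeoff(4000, 256)  # lower fs, same N
doubled_n = window_tradeoff(8000, 512)  # same fs, larger N

for label, (spacing, duration) in [
    ("base", base),
    ("halved fs", halved_fs),
    ("doubled N", doubled_n),
]:
    print(f"{label}: {spacing} Hz per bin, {duration * 1000} ms window")
```

Both changes halve the bin spacing and double the window duration. The product of the two is always 1, so finer frequency resolution is paid for with coarser time resolution.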

[4][5] Given a time window that is Τ seconds long, the minimum frequency that can be resolved is 1/Τ Hz.

The Rayleigh frequency is an important consideration in applications of the short-time Fourier transform (STFT), as well as any other method of harmonic analysis on a signal of finite record-length.[6][7]

STFTs as well as standard Fourier transforms and other tools are frequently used to analyze music.

The height of each bar (augmented by color) represents the amplitude of the frequencies within that band.

The depth dimension represents time, with each new bar being a separate, distinct transform.

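c. Nyquist criterion (avoiding the aliasing effect):

$$\Delta_t < \frac{1}{2\Omega},$$

where Δ_t is the sampling interval and Ω is the bandwidth of the windowed signal.

d. Only for implementing the rectangular-STFT

Rectangular window imposes the constraint

$$w((n-p)\Delta_t) = 1 \quad \text{for } |n-p| \le Q,$$

where the window spans 2Q + 1 samples. Substitution into the discretized STFT sum, with frequency step Δ_f chosen so that Δ_t Δ_f = 1/N, gives:

$$X(n\Delta_t, m\Delta_f) = \sum_{p=n-Q}^{n+Q} x(p\Delta_t)\, e^{-j 2\pi p m / N}\, \Delta_t$$

Change of variable n-1 for n:

$$X((n-1)\Delta_t, m\Delta_f) = \sum_{p=n-1-Q}^{n-1+Q} x(p\Delta_t)\, e^{-j 2\pi p m / N}\, \Delta_t$$

Calculate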