Time–frequency analysis for music signals

The short-time Fourier transform (STFT), the Gabor transform (GT) and the Wigner distribution function (WDF) are well-known time–frequency methods that are useful for analyzing music signals such as notes played on a piano, a flute or a guitar.

Music is a type of sound that contains a set of stable frequencies over a given period of time.

In music theory, pitch represents the perceived fundamental frequency of a sound.

The short-time Fourier transform is a basic type of time–frequency analysis.

If there is a continuous signal x(t), we can compute the short-time Fourier transform by

$$X(t, f) = \int_{-\infty}^{\infty} w(t - \tau)\, x(\tau)\, e^{-j 2\pi f \tau}\, d\tau$$

where w(t) is a window function.
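A discrete approximation of the short-time Fourier transform of x(t) with a window w(t) can be sketched as follows: slide the window along the sampled signal, multiply, and take an FFT of each frame. This is a minimal sketch, not a reference implementation. The function name `stft`, the Hann window, and the parameters `win_len` and `hop` are illustrative assumptions that do not come from the text.

```python
import numpy as np

def stft(x, fs, win_len=1024, hop=256):
    """Windowed FFT of a sampled signal x (hypothetical helper).

    Assumptions: Hann window, frames that fit entirely inside x.
    The phase of each frame is measured from the frame start rather than
    from t = 0, so only the magnitudes match the continuous definition directly.
    """
    w = np.hanning(win_len)                          # window function w(t)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * w for i in range(n_frames)]
    )
    X = np.fft.rfft(frames, axis=1)                  # shape: (n_frames, win_len // 2 + 1)
    times = (np.arange(n_frames) * hop + win_len / 2) / fs   # frame centers in seconds
    freqs = np.fft.rfftfreq(win_len, d=1 / fs)               # bin frequencies in Hz
    return times, freqs, X
```

A longer window gives finer frequency resolution (the bin spacing is fs / win_len) at the cost of coarser time resolution, which is the usual trade-off when choosing w(t).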

The discrete short-time Fourier transform is subject to some constraints.

Figure 1 shows the waveform of an audio file "" with a 44100 Hz sampling frequency.

Observe that from t = 0 to 0.5 seconds, a chord consisting of three notes (C-E-G) is played.

The spectrogram is the squared magnitude of the STFT, a time-varying spectral representation.
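To illustrate the spectrogram on a signal like the one described, the sketch below synthesizes a half-second C-E-G chord at 44100 Hz and computes the squared magnitude of a windowed FFT. Since the audio file itself is not available here, the chord is an assumption: pure sine tones at C4, E4 and G4 in equal temperament. The window length, hop size and the simple peak picking are also illustrative choices.

```python
import numpy as np

fs = 44100                                   # sampling frequency stated in the text
t = np.arange(int(0.5 * fs)) / fs            # 0 to 0.5 seconds
# Assumption: fourth-octave notes, equal temperament with A4 = 440 Hz.
chord = {"C4": 261.63, "E4": 329.63, "G4": 392.00}
x = sum(np.sin(2 * np.pi * f * t) for f in chord.values())

win_len, hop = 8192, 2048                    # illustrative analysis parameters
w = np.hanning(win_len)
n_frames = 1 + (len(x) - win_len) // hop
frames = np.stack([x[i * hop : i * hop + win_len] * w for i in range(n_frames)])
spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # spectrogram: |STFT|^2
freqs = np.fft.rfftfreq(win_len, d=1 / fs)

# Pick the three strongest local maxima of the time-averaged spectrum.
mean_spec = spec.mean(axis=0)
is_peak = (mean_spec[1:-1] > mean_spec[:-2]) & (mean_spec[1:-1] > mean_spec[2:])
peak_bins = np.where(is_peak)[0] + 1
top3 = peak_bins[np.argsort(mean_spec[peak_bins])[-3:]]
detected = np.sort(freqs[top3])              # estimated note frequencies in Hz
```

With these settings the frequency bins are about 5.4 Hz apart, so each entry of `detected` can only land within a few hertz of the corresponding note frequency.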

Because human pitch perception is approximately logarithmic in frequency, we should describe frequency on a logarithmic scale that matches human hearing.
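One common way to put frequency on a logarithmic scale is to convert it to a semitone number, so that each octave (a doubling of frequency) spans 12 equal steps. The sketch below assumes equal temperament, a reference of A4 = 440 Hz, and the MIDI convention that A4 is note 69. None of these conventions are stated in the text, and the function names are hypothetical.

```python
import math

def hz_to_midi(f_hz, a4_hz=440.0):
    """Map a frequency in Hz to a (fractional) MIDI note number.

    Assumptions: equal temperament, A4 = a4_hz, MIDI note 69 = A4.
    """
    return 69 + 12 * math.log2(f_hz / a4_hz)

def midi_to_hz(p, a4_hz=440.0):
    """Inverse mapping: MIDI note number to frequency in Hz."""
    return a4_hz * 2 ** ((p - 69) / 12)
```

On this scale, equal musical intervals correspond to equal distances: the step from 220 Hz to 440 Hz and the step from 440 Hz to 880 Hz are both exactly 12 semitones.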

The Wigner distribution function can also be used to analyze music signals.

The advantage of the Wigner distribution function is the high clarity of its output; however, it is computationally expensive and suffers from a cross-term problem, so it is better suited to analyzing signals that do not contain more than one frequency at the same time.
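The cross-term problem can be seen in a small numerical experiment. The sketch below computes a discrete Wigner distribution of a signal made of two complex tones and inspects one time slice. The function name `wigner_ville`, the use of complex exponentials instead of real notes, and the tone frequencies are illustrative assumptions. The nested loop over time samples also hints at why the method is expensive.

```python
import numpy as np

def wigner_ville(z):
    """Discrete Wigner distribution of a complex signal z (hypothetical helper).

    Row n, column k holds the value at time sample n and
    frequency k * fs / (2 * N), where N = len(z).
    """
    N = len(z)
    W = np.zeros((N, N))
    for n in range(N):
        L = min(n, N - 1 - n)                 # largest lag that stays inside the signal
        m = np.arange(-L, L + 1)
        kernel = np.zeros(N, dtype=complex)
        kernel[m % N] = z[n + m] * np.conj(z[n - m])
        W[n] = np.fft.fft(kernel).real        # kernel is conjugate-symmetric, so the FFT is real
    return W

fs = 256
N = 256
t = np.arange(N) / fs
f1, f2 = 32.0, 96.0                           # two tones present at the same time
z = np.exp(2j * np.pi * f1 * t) + np.exp(2j * np.pi * f2 * t)

W = wigner_ville(z)
freqs = np.arange(N) * fs / (2 * N)           # frequency axis in Hz
row = W[N // 2]                               # time slice at the middle of the signal
strongest = freqs[np.argmax(np.abs(row))]     # frequency of the largest component
```

In this slice the largest component sits midway between the two tones, at (f1 + f2) / 2, even though the signal has no energy there. This spurious component is the cross-term; it oscillates along the time axis at the difference frequency f2 - f1, which is why it can dominate at some instants and vanish at others.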

Fig. 1 Waveform of the audio file ""
Fig. 2 Gabor transform of ""
Fig. 3 Spectrogram of ""