A Savitzky–Golay filter achieves smoothing, in a process known as convolution, by fitting successive sub-sets of adjacent data points with a low-degree polynomial by the method of linear least squares.
When the data points are equally spaced, an analytical solution to the least-squares equations can be found, in the form of a single set of "convolution coefficients" that can be applied to all data sub-sets to give estimates of the smoothed signal (or derivatives of the smoothed signal) at the central point of each sub-set.
The method, based on established mathematical procedures,[1][2] was popularized by Abraham Savitzky and Marcel J. E. Golay, who published tables of convolution coefficients for various polynomials and sub-set sizes in 1964.
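As a concrete illustration, the following is a minimal sketch using SciPy's implementation of the method; the signal, window length, and polynomial order are arbitrary example choices.

```python
# A minimal sketch, assuming SciPy is available.
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 100)           # equally spaced data points
y = np.sin(x) + rng.normal(0, 0.1, x.size)   # signal plus uncorrelated noise

# Fit a cubic polynomial to each 11-point sub-set; keep the central value.
y_smooth = savgol_filter(y, window_length=11, polyorder=3)

# The same machinery yields the smoothed first derivative dY/dx.
dx = x[1] - x[0]
dy_dx = savgol_filter(y, window_length=11, polyorder=3, deriv=1, delta=dx)
```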
There are numerous applications of smoothing, such as avoiding the propagation of noise through an algorithm chain, or simply making the data appear less noisy than they really are.
The moving average corresponds to the simplest case: each sub-set of the data set is fitted with a straight horizontal line, as opposed to a higher-order polynomial. The moving average is often used for a quick technical analysis of financial data, such as stock prices, returns, or trading volumes.
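A quick check of this equivalence, assuming SciPy's savgol_coeffs for the tabulated coefficients:

```python
# Sketch: a moving average equals a Savitzky-Golay filter of degree 0 or 1.
import numpy as np
from scipy.signal import savgol_coeffs

m = 5
print(savgol_coeffs(m, polyorder=0))   # [0.2 0.2 0.2 0.2 0.2]
print(savgol_coeffs(m, polyorder=1))   # identical: a least-squares line's
                                       # central value is the window mean
print(np.ones(m) / m)                  # the ordinary moving-average kernel
```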
When the data points are equally spaced, an analytical solution to the least-squares equations can be found, as follows. A polynomial will be fitted by linear least squares to a set of m (an odd number) adjacent data points, each separated by an interval h. Firstly, a change of variable is made,

z = \frac{x - \bar{x}}{h},

where \bar{x} is the value of the central point of the sub-set, so that z takes the values \frac{1-m}{2}, \ldots, 0, \ldots, \frac{m-1}{2}. The polynomial, of degree k, is

Y = a_0 + a_1 z + a_2 z^2 + \cdots + a_k z^k,

and its coefficients are obtained from the least-squares normal equations,

\mathbf{a} = \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1} \mathbf{J}^\mathsf{T}\mathbf{y},

where the i-th row of the Jacobian matrix \mathbf{J} has the values 1, z_i, z_i^2, \ldots, z_i^k.
The convolution coefficients follow directly: they are the elements of the matrix

\mathbf{C} = \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1} \mathbf{J}^\mathsf{T}.

In general, the smoothed value at position j is the weighted sum

Y_j = \sum_{i=(1-m)/2}^{(m-1)/2} C_i\, y_{j+i}.

Tables of convolution coefficients, calculated in the same way for m up to 25, were published for the Savitzky–Golay smoothing filter in 1964.[3][5] The value at the central point, z = 0, is obtained from a single set of coefficients: a_0 for smoothing, a_1 for the 1st derivative, and so on.
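The coefficient calculation can be reproduced numerically from these normal equations; a minimal sketch, assuming m = 5, a cubic polynomial, and SciPy for the comparison values:

```python
# Sketch: derive C = (J^T J)^(-1) J^T and compare row a_0 (the smoothing
# coefficients) with SciPy's tabulated values.
import numpy as np
from scipy.signal import savgol_coeffs

m, k = 5, 3
z = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)   # z = -2, -1, 0, 1, 2
J = np.vander(z, k + 1, increasing=True)         # J[i, j] = z_i ** j
C = np.linalg.inv(J.T @ J) @ J.T                 # row j: coefficients for a_j

print(C[0])                            # (-3, 12, 17, 12, -3) / 35
print(savgol_coeffs(m, k, use='dot'))  # the same set, from SciPy
```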
The summations in the matrix \mathbf{J}^\mathsf{T}\mathbf{J} can be evaluated in closed form, so that algebraic formulae can be derived for the convolution coefficients.[15] Savitzky–Golay filters are most commonly used to obtain the smoothed or derivative value at the central point, z = 0, using a single set of convolution coefficients.
For example, with a quadratic polynomial,

\mathbf{J}^\mathsf{T}\mathbf{J} = \begin{pmatrix} m & \sum z_i & \sum z_i^2 \\ \sum z_i & \sum z_i^2 & \sum z_i^3 \\ \sum z_i^2 & \sum z_i^3 & \sum z_i^4 \end{pmatrix}.

An explicit expression for the inverse of this matrix can be obtained using Cramer's rule; because the values of z are placed symmetrically about zero, the sums of odd powers vanish, which simplifies the inverse considerably.
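This route yields, for instance, the closed-form smoothing coefficients for a quadratic (or cubic) polynomial at the central point:

C_{0i} = \frac{\left(3m^2 - 7 - 20 i^2\right)/4}{m\left(m^2 - 4\right)/3}, \qquad \frac{1-m}{2} \le i \le \frac{m-1}{2}.

For m = 5 this reduces to the set (−3, 12, 17, 12, −3)/35.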
The method extends to two-dimensional data laid out on a rectangular kernel of m × n nodes. The trick is to transform the rectangular kernel into a single row by a simple ordering of the indices of the nodes. Whereas the one-dimensional filter coefficients are found by fitting a polynomial in the subsidiary variable z to a set of m data points, the two-dimensional coefficients are found by fitting a polynomial in subsidiary variables v and w to the values at the m × n kernel nodes.
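A sketch of this construction (the kernel size, polynomial degree, and row-major ordering below are example choices, not the only valid ones):

```python
# Sketch: two-dimensional smoothing coefficients from a polynomial in (v, w)
# fitted to the m x n kernel nodes; the kernel is flattened into a single
# row of nodes, as described above.
import numpy as np

m, n, degree = 5, 5, 2
v = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)
w = np.arange(-(n - 1) // 2, (n - 1) // 2 + 1)
V, W = np.meshgrid(v, w, indexing="ij")
vv, ww = V.ravel(), W.ravel()          # single-row ordering of the node indices

# One design-matrix column per monomial v^p * w^q with p + q <= degree.
terms = [(p, q) for p in range(degree + 1) for q in range(degree + 1 - p)]
J = np.column_stack([vv**p * ww**q for p, q in terms])

C = np.linalg.pinv(J)                  # (J^T J)^(-1) J^T, one row per monomial
smooth_kernel = C[0].reshape(m, n)     # a_00 row = smoothing coefficients
print(smooth_kernel)
```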
Nikitas and Pappa-Louisi showed that the quality of smoothing may vary significantly depending on the form of the polynomial used.
The idea of two-dimensional convolution coefficients can be extended to higher spatial dimensions in a straightforward manner,[17][21] by arranging the multidimensional distribution of kernel nodes in a single row.
Insufficient precision causes the floating-point truncation errors to become comparable to the magnitudes of some elements of C, which, in turn, severely degrades the accuracy of the computed coefficients and renders them useless. Once the values of C obtained in two consecutive iterations agree in their significant digits up to a pre-specified distance, convergence is assumed to have been reached.
If the distance is sufficiently large, the computation yields a highly accurate C. PCCC employs rational-number calculations, using the GNU Multiple Precision Arithmetic Library, and yields a fully accurate C in rational-number format.[24]
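The exact-arithmetic idea can be illustrated independently of PCCC; the following is a minimal one-dimensional sketch using Python's built-in Fraction type as a stand-in for GMP rationals:

```python
# Sketch of the exact-rational idea only (not PCCC itself): compute
# C = (J^T J)^(-1) J^T with arbitrary-precision Fractions, so no
# floating-point truncation occurs.
from fractions import Fraction

def savgol_coeffs_exact(m, k):
    half = (m - 1) // 2
    z = range(-half, half + 1)
    J = [[Fraction(zi) ** j for j in range(k + 1)] for zi in z]   # m x (k+1)
    A = [[sum(J[i][p] * J[i][q] for i in range(m)) for q in range(k + 1)]
         for p in range(k + 1)]                                   # J^T J
    R = [[J[i][p] for i in range(m)] for p in range(k + 1)]       # J^T
    # Gauss-Jordan elimination on [A | R], exact in rational arithmetic.
    for col in range(k + 1):
        piv = next(r for r in range(col, k + 1) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        R[col], R[piv] = R[piv], R[col]
        d = A[col][col]
        A[col] = [a / d for a in A[col]]
        R[col] = [r / d for r in R[col]]
        for r in range(k + 1):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
                R[r] = [x - f * y for x, y in zip(R[r], R[col])]
    return R                             # row j: exact coefficients for a_j

print(savgol_coeffs_exact(5, 3)[0])      # -3/35, 12/35, 17/35, 12/35, -3/35
```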
Chandra Shekhar has also laid out a mathematical framework that describes the use of C calculated on unity-spaced kernel nodes to perform filtering and partial differentiation (of various orders) on non-uniformly spaced kernel nodes,[21] allowing the use of C provided in the aforementioned database. Although this method yields approximate results only, they are acceptable in most engineering applications, provided that the non-uniformity of the kernel nodes is weak.
Both the extent of the distortion and the S/N (signal-to-noise ratio) improvement decrease as the degree of the polynomial increases, and increase as the width, m, of the convolution function increases. For example, if the noise in all data points is uncorrelated and has a constant standard deviation, σ, the standard deviation of the noise will be decreased by convolution with an m-point smoothing function to[26][note 5]

\sigma \sqrt{\frac{1}{m}} (polynomial degree 0 or 1, i.e. the moving average),

\sigma \sqrt{\frac{3(3m^2 - 7)}{4m(m^2 - 4)}} (polynomial degree 2 or 3).
Although the moving average gives better noise reduction, it is unsuitable for smoothing data that has curvature over m points.
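These factors are simply √(Σ C_i²), the factor by which a linear filter scales uncorrelated noise; a quick numerical check for m = 5:

```python
# Sketch: verify the noise-reduction factors sqrt(sum C_i^2) for m = 5.
import numpy as np
from scipy.signal import savgol_coeffs

m = 5
c_ma = np.ones(m) / m              # moving average (degree 0 or 1)
c_sg = savgol_coeffs(m, 3)         # quadratic/cubic smoothing
print(np.sqrt(np.sum(c_ma**2)))    # 0.447 = sqrt(1/5)
print(np.sqrt(np.sum(c_sg**2)))    # 0.697 = sqrt(3*(3*25 - 7)/(4*5*(25 - 4)))
```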
The optimal choice of polynomial order and number of convolution coefficients will be a compromise between noise reduction and distortion.[28] One way to mitigate distortion and improve noise removal is to use a filter of smaller width and perform more than one convolution with it.[30] The noise reduction formulae given above do not apply to repeated passes, because the correlation between calculated data points increases with each pass.
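Two passes of the same filter are equivalent to a single pass with the kernel convolved with itself, a general property of convolution; a sketch of the check:

```python
# Sketch: two passes of a kernel equal one pass of the self-convolved kernel,
# away from the ends of the data.
import numpy as np
from scipy.signal import savgol_coeffs

c = savgol_coeffs(5, 3)            # 5-point quadratic/cubic smoother
c2 = np.convolve(c, c)             # equivalent 9-point kernel

rng = np.random.default_rng(1)
y = rng.normal(size=50)
twice = np.convolve(np.convolve(y, c, mode="same"), c, mode="same")
once = np.convolve(y, c2, mode="same")
print(np.allclose(twice[8:-8], once[8:-8]))   # True on interior points
```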
At very low angles, the Fourier transform of the convolution function is almost flat, meaning that low-frequency components of the data will be virtually unchanged by the smoothing operation.[31] Some high-frequency noise components are attenuated more than others, as shown by the undulations in the Fourier transform at large angles.
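This behaviour can be checked numerically from the coefficients; a sketch using scipy.signal.freqz, with the window size and degree as example values:

```python
# Sketch: frequency response of a 9-point quadratic/cubic smoothing filter;
# the abscissa w (radians per sample) plays the role of the "angle" above.
import numpy as np
from scipy.signal import savgol_coeffs, freqz

c = savgol_coeffs(9, 3)
w, h = freqz(c, worN=512)
print(abs(h[0]))     # ~1.0: low-frequency components pass almost unchanged
print(abs(h[-1]))    # well below 1: high-frequency components attenuated
```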
With repeated passes, the noise reduction is a little less than would be obtained with one pass of a 5-point moving average, which, under the same conditions, would result in the smoothed points having the smaller standard deviation of 0.45σ (= σ√(1/5)).
Throughout, the spacing between data points is assumed to be constant, h. Examples of the use of the convolution coefficients, with a cubic polynomial and a window size, m, of 5 points, are as follows.

Smoothing: Y_j = \frac{1}{35}\left(-3 y_{j-2} + 12 y_{j-1} + 17 y_j + 12 y_{j+1} - 3 y_{j+2}\right)

1st derivative: Y_j' = \frac{1}{12h}\left(y_{j-2} - 8 y_{j-1} + 8 y_{j+1} - y_{j+2}\right)

2nd derivative: Y_j'' = \frac{1}{7h^2}\left(2 y_{j-2} - y_{j-1} - 2 y_j - y_{j+1} + 2 y_{j+2}\right)
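A sketch applying the tabulated smoothing coefficients by direct convolution (interior points only; the ends need separate treatment):

```python
# Sketch: apply the tabulated 5-point cubic smoothing coefficients by direct
# convolution and compare with SciPy on the interior points.
import numpy as np
from scipy.signal import savgol_filter

c = np.array([-3, 12, 17, 12, -3]) / 35
rng = np.random.default_rng(2)
y = np.sin(np.linspace(0, 3, 40)) + rng.normal(0, 0.05, 40)

manual = np.convolve(y, c, mode="valid")      # central points only
reference = savgol_filter(y, 5, 3)
print(np.allclose(manual, reference[2:-2]))   # True away from the ends
```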