Its creator, Jeffrey Uhlmann, explained that "unscented" was an arbitrary name that he adopted to avoid it being referred to as the “Uhlmann filter.”[1]

Many filtering and control methods represent estimates of the state of a system in the form of a mean vector and an associated error covariance matrix.
A covariance that is zero implies that there is no uncertainty or error and that the position of the object is exactly what is specified by the mean vector.
The mean and covariance representation only gives the first two moments of an underlying, but otherwise unknown, probability distribution.
The mean and covariance representation of uncertainty is mathematically convenient because any linear transformation $T$ can be applied to a mean vector $m$ and covariance matrix $M$ as $Tm$ and $TMT^{\mathsf{T}}$.
This is because a spuriously small covariance implies less uncertainty and leads the filter to place more weight (confidence) than is justified in the accuracy of the mean.
Returning to the example above, when the covariance is zero it is trivial to determine the location of the object after it moves according to an arbitrary nonlinear function: the function is simply applied to the mean vector.
The earliest approximation was to linearize the nonlinear function and apply the resulting Jacobian matrix to the given mean and covariance.
This is the basis of the extended Kalman filter (EKF), and although it was known to yield poor results in many circumstances, there was no practical alternative for many decades.
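The linearization step described above can be sketched as follows. This is a minimal illustration, not the formulation of any particular EKF implementation: the polar-to-Cartesian function, its Jacobian, the helper names, and the numeric values are all assumptions chosen for the example. The mean is mapped through the function itself, and the covariance is mapped as $J M J^{\mathsf{T}}$, where $J$ is the Jacobian evaluated at the mean.

```python
import math

def polar_to_cartesian(state):
    """Hypothetical nonlinear function: (range, bearing) -> (x, y)."""
    r, theta = state
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(state):
    """Jacobian of polar_to_cartesian evaluated at the given state."""
    r, theta = state
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def linearized_transform(f, jac, mean, cov):
    """First-order propagation: mean -> f(mean), cov -> J cov J^T."""
    J = jac(mean)
    return f(mean), matmul(matmul(J, cov), transpose(J))

# Illustrative values only (not taken from the article).
mean = [1.0, math.pi / 2]
cov = [[0.01, 0.0], [0.0, 0.25]]
lin_mean, lin_cov = linearized_transform(polar_to_cartesian, jacobian, mean, cov)
```

With these assumed values the linearized mean lies at a distance of exactly 1 from the origin, however large the bearing uncertainty is, which hints at why the approximation can be poor.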
In 1994, Jeffrey Uhlmann noted that the EKF takes a nonlinear function and partial distribution information (in the form of a mean and covariance estimate) about the state of a system, but applies an approximation to the known function rather than to the imprecisely known probability distribution.
He suggested that a better approach would be to use the exact nonlinear function applied to an approximating probability distribution.
The motivation for this approach is given in his doctoral dissertation, where the term unscented transform was first defined:[2]

Consider the following intuition: With a fixed number of parameters it should be easier to approximate a given distribution than it is to approximate an arbitrary nonlinear function/transformation.
More generally, the application of a given nonlinear transformation to a discrete distribution of points, computed so as to capture a set of known statistics of an unknown distribution, is referred to as an unscented transformation.

In other words, the given mean and covariance information can be exactly encoded in a set of points, referred to as sigma points, which, if treated as elements of a discrete probability distribution, has mean and covariance equal to the given mean and covariance.
The principal advantage of the approach is that the nonlinear function is fully exploited, as opposed to the EKF, which replaces it with a linear one.
Eliminating the need for linearization also provides advantages independent of any improvement in estimation quality.
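The idea of encoding a mean and covariance in sigma points and pushing them through the exact nonlinear function can be sketched as below. This is one possible construction under stated assumptions, not the definitive algorithm: it uses a symmetric set of $2n$ equally weighted points placed at the mean plus and minus $\sqrt{n}$ times the columns of a Cholesky factor of the covariance. The polar-to-Cartesian function, the helper names, and the numeric values are hypothetical.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def sigma_points(mean, cov):
    """Symmetric set of 2n equally weighted points with the given mean and covariance."""
    n = len(mean)
    L = cholesky(cov)
    scale = math.sqrt(n)
    points = []
    for j in range(n):
        col = [scale * L[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])
    return points

def mean_and_cov(points):
    """Mean and covariance of an equally weighted discrete distribution."""
    k, n = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / k for i in range(n)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / k
            for j in range(n)] for i in range(n)]
    return mean, cov

def unscented_transform(f, mean, cov):
    """Apply f to every sigma point, then summarize the transformed points."""
    return mean_and_cov([f(p) for p in sigma_points(mean, cov)])

def polar_to_cartesian(state):
    """Hypothetical nonlinear function: (range, bearing) -> (x, y)."""
    r, theta = state
    return [r * math.cos(theta), r * math.sin(theta)]

# Illustrative values only (not taken from the article).
mean = [1.0, math.pi / 2]
cov = [[0.01, 0.0], [0.0, 0.25]]
ut_mean, ut_cov = unscented_transform(polar_to_cartesian, mean, cov)
```

No Jacobian is needed here: the function is only ever evaluated at points. For these assumed values the transformed mean falls short of distance 1 from the origin, an effect of the bearing uncertainty that a first-order linearization about the mean cannot represent.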
Since the seminal work of Uhlmann, many different sets of sigma points have been proposed in the literature.
$n+1$ sigma points are necessary and sufficient to define a discrete distribution having a given mean and covariance in $n$ dimensions.
Consider the vertices of an equilateral triangle centered on the origin in two dimensions: It can be verified that the above set of points has mean
A similar canonical set of sigma points can be generated in any number of dimensions
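A three-point set of this kind can be checked numerically. The sketch below is illustrative only: the triangle coordinates are an assumed choice (scaled so that the equally weighted points have zero mean and identity covariance), since the coordinates are not shown in the text above, and the helper names and target values are hypothetical. The points are then mapped to an arbitrary mean $m$ and covariance $M$ as $m + L s_i$, where $L L^{\mathsf{T}} = M$.

```python
import math

# Assumed vertices of an equilateral triangle centered on the origin,
# scaled so the equally weighted set has identity covariance.
SIMPLEX_2D = [
    [0.0, math.sqrt(2.0)],
    [-math.sqrt(1.5), -math.sqrt(0.5)],
    [math.sqrt(1.5), -math.sqrt(0.5)],
]

def mean_and_cov(points):
    """Mean and covariance of an equally weighted discrete distribution."""
    k, n = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / k for i in range(n)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / k
            for j in range(n)] for i in range(n)]
    return mean, cov

def cholesky_2x2(cov):
    """Lower-triangular factor of a 2x2 symmetric positive definite matrix."""
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    l00 = math.sqrt(a)
    l10 = b / l00
    l11 = math.sqrt(c - l10 * l10)
    return [[l00, 0.0], [l10, l11]]

def simplex_sigma_points(mean, cov):
    """Map the canonical triangle to three points with the given mean and covariance."""
    L = cholesky_2x2(cov)
    return [[mean[0] + L[0][0] * s[0],
             mean[1] + L[1][0] * s[0] + L[1][1] * s[1]] for s in SIMPLEX_2D]

unit_mean, unit_cov = mean_and_cov(SIMPLEX_2D)

# Illustrative target values only.
target_mean = [1.0, 2.0]
target_cov = [[4.0, 1.0], [1.0, 2.0]]
mapped_mean, mapped_cov = mean_and_cov(simplex_sigma_points(target_mean, target_cov))
```

Any matrix square root of the covariance would serve in place of the Cholesky factor; the Cholesky factor is used here only because it is simple to compute.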
[4][5] The unscented transform is defined for the application of a given function to any partial characterization of an otherwise unknown distribution, but its most common use is for the case in which only the mean and covariance are given.
This gives: This can be compared to the linearized mean and covariance: The absolute difference between the UT and linearized estimates in this case is relatively small, but in filtering applications the cumulative effect of small errors can lead to unrecoverable divergence of the estimate.
The effect of the errors is exacerbated when the covariance is underestimated because this causes the filter to be overconfident in the accuracy of the mean.
In this example there is no way to determine the absolute accuracy of the UT and linearized estimates without ground truth in the form of the actual probability distribution associated with the original estimate and the mean and covariance of that distribution after application of the nonlinear transformation (e.g., as determined analytically or through numerical integration).
Returning to the example, the minimal symmetric set of sigma points can be obtained from the covariance matrix
This motivates adding the square of this difference to the UT covariance to guard against underestimating the actual error in the mean.
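One reading of this adjustment is sketched below. It assumes that the difference in question is the one between the UT and linearized mean estimates, and that its "square" is the outer product of the difference vector with itself; the function name and all numeric values are hypothetical.

```python
def inflate_covariance(ut_cov, ut_mean, lin_mean):
    """Add the outer product of (ut_mean - lin_mean) to the UT covariance."""
    d = [u - l for u, l in zip(ut_mean, lin_mean)]
    n = len(d)
    return [[ut_cov[i][j] + d[i] * d[j] for j in range(n)] for i in range(n)]

# Illustrative values only (not results from the article).
ut_mean = [0.0, 0.88]
lin_mean = [0.0, 1.0]
ut_cov = [[1.0, 0.0], [0.0, 1.0]]
inflated = inflate_covariance(ut_cov, ut_mean, lin_mean)
```

Because an outer product is positive semidefinite, the adjusted covariance is never smaller than the original in any direction, which is the conservative behavior the text asks for.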
In other words, there is no choice of distribution with a given mean and covariance that is superior to that provided by the set of sigma points; therefore, the unscented transform is trivially optimal.
This general statement of optimality is of course useless for making any quantitative statements about the performance of the UT, e.g., compared to linearization; consequently Uhlmann, Julier, and others have performed analyses under various assumptions about the characteristics of the distribution and/or the form of the nonlinear transformation function.
For example, if the function is differentiable, which is essential for linearization, these analyses validate the expected and empirically corroborated superiority of the unscented transform.
[12] [13] Uhlmann and Simon Julier published several papers showing that the use of the unscented transformation in a Kalman filter, which is referred to as the unscented Kalman filter (UKF), provides significant performance improvements over the EKF in a variety of applications.
[14][4][6] Julier and Uhlmann published papers using a particular parameterized form of the unscented transform in the context of the UKF which used negative weights to capture assumed distribution information.
Julier has subsequently described parameterized forms which do not use negative weights and also are not subject to those issues.