A well-known application has been automatic speech recognition, to cope with different speaking speeds.
Although DTW measures a distance-like quantity between two given sequences, it does not guarantee that the triangle inequality holds.
This example illustrates the implementation of the dynamic time warping algorithm when the two sequences s and t are strings of discrete symbols.
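The quadratic-time dynamic programming recurrence can be sketched as follows (a minimal illustration rather than a reference implementation; the 0/1 symbol cost is one common choice for discrete symbols):

```python
def dtw_distance(s, t, cost=lambda a, b: 0 if a == b else 1):
    """DTW between two sequences of discrete symbols.

    dtw[i][j] holds the DTW distance between s[:i] and t[:j];
    the default cost charges 1 for a symbol mismatch, 0 otherwise.
    """
    n, m = len(s), len(t)
    INF = float("inf")
    dtw = [[INF] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cost(s[i - 1], t[j - 1])
            dtw[i][j] = d + min(dtw[i - 1][j],      # insertion
                                dtw[i][j - 1],      # deletion
                                dtw[i - 1][j - 1])  # match
    return dtw[n][m]
```

Note that, unlike edit distance, repeated symbols can be matched at no cost: `dtw_distance("aab", "ab")` is 0, because both occurrences of `a` align with the single `a`.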
We can easily modify the above algorithm to add a locality constraint.
In order to make the algorithm work, the window parameter w must be adapted so that |n − m| ≤ w, where n and m are the lengths of the two sequences.
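A windowed variant can be sketched as below (an illustrative version assuming a numeric cost |a − b|). Widening w to at least |n − m| guarantees that a complete warping path from corner to corner exists:

```python
def dtw_windowed(s, t, w, cost=lambda a, b: abs(a - b)):
    """DTW restricted to a band of radius w around the diagonal."""
    n, m = len(s), len(t)
    w = max(w, abs(n - m))  # without this, the final cell may be unreachable
    INF = float("inf")
    dtw = [[INF] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0
    for i in range(1, n + 1):
        # Only cells with |i - j| <= w are filled in; the rest stay infinite.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = cost(s[i - 1], t[j - 1])
            dtw[i][j] = d + min(dtw[i - 1][j],
                                dtw[i][j - 1],
                                dtw[i - 1][j - 1])
    return dtw[n][m]
```

The band reduces the work from O(nm) to roughly O(nw), at the price of excluding alignments that stray more than w cells from the diagonal.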
The DTW algorithm produces a discrete matching between existing elements of one series and another.
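That element-to-element matching can be made explicit by backtracking through the cost matrix. The sketch below (illustrative, using an absolute-difference cost) returns the warping path as a list of index pairs:

```python
def dtw_path(s, t):
    """Recover the warping path: pairs (i, j) matching s[i] with t[j]."""
    n, m = len(s), len(t)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from (n, m), always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while (i, j) != (1, 1):
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j - 1), (i - 1, j), (i, j - 1)],
                   key=lambda p: D[p[0]][p[1]])
    path.append((0, 0))
    return path[::-1]
```

Because the path may pair one element with several elements of the other series, it is a matching of existing samples only; no interpolated values are created.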
The 50-year-old quadratic time bound was broken in 2016: an algorithm due to Gold and Sharir enables computing DTW in O(N²/log log N) time.
Despite this improvement, it was shown that a strongly subquadratic running time of the form O(N^(2−ε)) for some ε > 0 cannot exist unless the Strong Exponential Time Hypothesis fails.
Fast techniques for computing DTW include PrunedDTW,[5] SparseDTW,[6] FastDTW,[7] and the MultiscaleDTW.
[8][9] A common task, retrieval of similar time series, can be accelerated by using lower bounds such as LB_Keogh,[10] LB_Improved,[11] or LB_Petitjean.
[12] However, the Early Abandon and Pruned DTW algorithm reduces the degree of acceleration that lower bounding provides and sometimes renders it ineffective.
[13] Subsequently, the LB_Enhanced bound was developed, which is always tighter than LB_Keogh while also being more efficient to compute.
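As an illustration of the lower-bounding idea, here is a minimal sketch of an LB_Keogh-style envelope bound. It assumes equal-length numeric series, absolute differences, and a Sakoe–Chiba band of radius r; published formulations often use squared differences instead:

```python
def lb_keogh(q, c, r):
    """Cheap lower bound on DTW(q, c) under a band of radius r.

    Each element of c that falls outside the upper/lower envelope of q
    contributes its distance to the envelope; elements inside contribute 0.
    """
    total = 0.0
    for i, ci in enumerate(c):
        lo = max(0, i - r)
        hi = min(len(q), i + r + 1)
        upper = max(q[lo:hi])  # upper envelope of q over the band
        lower = min(q[lo:hi])  # lower envelope of q over the band
        if ci > upper:
            total += ci - upper
        elif ci < lower:
            total += lower - ci
    return total
```

Because the bound is O(n) while DTW is quadratic, a retrieval system can discard a candidate whenever `lb_keogh(q, c, r)` already exceeds the best DTW distance found so far, without ever computing the full DTW.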
For more than two sequences, the problem is related to that of multiple sequence alignment and requires heuristics.
DBA[16] is currently a reference method to average a set of sequences consistently with DTW.
COMASA[17] efficiently randomizes the search for the average sequence, using DBA as a local optimization process.
A nearest-neighbour classifier can achieve state-of-the-art performance when using dynamic time warping as a distance measure.
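A minimal sketch of such a classifier (the training data here is hypothetical, and plain DTW with an absolute-difference cost stands in for whichever distance variant a real system would use):

```python
def dtw(s, t):
    """Plain DTW distance with absolute-difference cost."""
    n, m = len(s), len(t)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def nn_classify(train, query):
    """1-NN: return the label of the training series nearest under DTW."""
    return min(train, key=lambda ex: dtw(ex[0], query))[1]

# Hypothetical labelled series: a flat signal and one with a spike.
train = [([1, 1, 1], "flat"), ([1, 5, 1], "spike")]
```

Despite its simplicity, 1-NN with a DTW distance remains a strong baseline precisely because the warping absorbs temporal misalignment between the query and each training series.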
[19] The windows that classical DTW uses to constrain alignments introduce a step function: any amount of warping is allowed within the window and none beyond it. In contrast, ADTW employs an additive penalty that is incurred each time the path is warped.
ADTW significantly outperforms DTW with windowing when applied as a nearest neighbor classifier on a set of benchmark time series classification tasks.
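The additive-penalty idea can be sketched as follows. This is an illustrative variant in which every off-diagonal (warping) step adds a constant penalty w, not the exact formulation of the cited ADTW paper:

```python
def adtw(s, t, w):
    """DTW variant where each off-diagonal step pays an additive penalty w.

    With w = 0 this reduces to plain DTW; as w grows, warping becomes
    increasingly expensive instead of being cut off abruptly by a window.
    """
    n, m = len(s), len(t)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            D[i][j] = min(D[i - 1][j - 1] + d,      # diagonal: no penalty
                          D[i - 1][j] + d + w,      # warp: penalized
                          D[i][j - 1] + d + w)      # warp: penalized
    return D[n][m]
```

Unlike a hard window, the penalty grows smoothly with the amount of warping, which is the contrast with the step function described above.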
By viewing the observed samples as smooth functions, one can utilize continuous mathematics for analyzing data.
Roughness penalty terms for the warping functions may be added, e.g., by constraining the size of their curvature.
This approach has been successfully applied to analyze patterns and variability of speech movements.
[24][25][26] DTW and related warping methods are typically used as pre- or post-processing steps in data analyses.
If the observed sequences contain both random variation in their values and shape and random temporal misalignment, the warping may overfit to noise, leading to biased results.
[27] In human movement analysis, simultaneous nonlinear mixed-effects modeling has been shown to produce superior results compared to DTW.
[28] Due to different speaking rates, a non-linear fluctuation occurs along the time axis of a speech pattern, which needs to be eliminated.
Moreover, if the warping function is allowed to take any possible value, very little distinction can be made between words belonging to different categories. So, to enhance the distinction between words belonging to different categories, restrictions were imposed on the slope of the warping function.
Dynamic time warping is used in finance and econometrics to assess the quality of the prediction versus real-world data.