Data augmentation

[1][2] Data augmentation has important applications in Bayesian analysis,[3] and the technique is widely used in machine learning to reduce overfitting,[4] which is achieved by training models on several slightly modified copies of existing data.

Synthetic Minority Over-sampling Technique (SMOTE) is a method used to address imbalanced datasets in machine learning.

In such datasets, the number of samples in different classes varies significantly, leading to biased model performance.
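The core idea behind SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be illustrated with a minimal sketch. The function name `smote_like`, its parameters, and the toy data are assumptions made for illustration; this is not a reference implementation of the published algorithm.

```python
import math
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a
    minority-class sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        t = rng.random()
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like(minority, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class.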

[5] When convolutional neural networks grew larger in the mid-1990s, there was a lack of data to use, especially considering that part of the overall dataset had to be set aside for later testing.

It was proposed to perturb existing data with affine transformations to create new examples with the same labels;[6] these were complemented by so-called elastic distortions in 2003,[7] and the technique was widely used as of the 2010s.

[9] Data augmentation has become fundamental in image classification, enriching training dataset diversity to improve model generalization and performance.

The evolution of this practice has introduced a broad spectrum of techniques, including geometric transformations, color space adjustments, and noise injection.

[10] Geometric transformations alter the spatial properties of images to simulate different perspectives, orientations, and scales.
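As a rough illustration, a few simple geometric transformations can be written for an image stored as a nested list of pixel values. The helper names and the tiny example image are assumptions chosen for this sketch; image libraries provide far more complete versions of such operations.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def translate_right(image, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(image[0])
    shift = max(0, min(shift, width))
    return [[fill] * shift + row[: width - shift] for row in image]


# Hypothetical 2x3 grayscale image.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

# Each transformed copy keeps the label of the original image.
augmented = [
    flip_horizontal(image),
    rotate_90_clockwise(image),
    translate_right(image, 1),
]
```

Whether a given transformation preserves the label depends on the task: a horizontal flip is harmless for most object photographs but can change the meaning of a handwritten character.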

Injecting noise into images simulates real-world imperfections, teaching models to ignore irrelevant variations.
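A minimal sketch of noise injection, assuming pixel intensities scaled to the range [0, 1] and additive Gaussian noise, is given here. The function name, the default `sigma`, and the example image are illustrative choices rather than recommended settings.

```python
import random


def add_gaussian_noise(image, sigma=0.05, seed=None):
    """Return a copy of `image` with zero-mean Gaussian noise added to every
    pixel, clipped back to the assumed [0, 1] intensity range."""
    rng = random.Random(seed)
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


# Hypothetical 2x3 grayscale image with intensities in [0, 1].
image = [
    [0.0, 0.5, 1.0],
    [0.25, 0.5, 0.75],
]
noisy = add_gaussian_noise(image, sigma=0.1, seed=0)
```

Drawing fresh noise each time a sample is used means the model rarely sees exactly the same input twice.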

The applications of robotic control and augmentation in disabled and able-bodied subjects still rely mainly on subject-specific analyses.

[12] A common approach is to generate synthetic signals by re-arranging components of real data.
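One simple way to re-arrange components of real data is to cut recorded trials of the same class into time segments and assemble new trials from segments of different recordings. The sketch below assumes equal-length, single-channel trials and uses hypothetical names and toy values; it illustrates the general idea only and is not the specific method of any cited work.

```python
import random


def recombine_segments(trials, n_segments, n_new, seed=0):
    """Build synthetic trials by filling each time segment with the
    corresponding segment of a randomly chosen real trial of the same class."""
    length = len(trials[0])
    if any(len(t) != length for t in trials):
        raise ValueError("all trials must have the same length")
    rng = random.Random(seed)
    # Segment boundaries, e.g. [0, 2, 4, 6] for length 6 and 3 segments.
    bounds = [i * length // n_segments for i in range(n_segments + 1)]
    synthetic = []
    for _ in range(n_new):
        new_trial = []
        for start, stop in zip(bounds, bounds[1:]):
            donor = rng.choice(trials)
            new_trial.extend(donor[start:stop])
        synthetic.append(new_trial)
    return synthetic


# Hypothetical single-channel trials from one class (value = 10 * trial + time index).
trials = [[10 * k + t for t in range(6)] for k in range(3)]
synthetic_trials = recombine_segments(trials, n_segments=3, n_new=4)
```

Each synthetic trial keeps the temporal ordering of the segments, so class-specific timing is preserved while the combination of segments is new.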

Lotte[13] proposed a method of "Artificial Trial Generation Based on Analogy" where three data examples

This approach was shown to improve the performance of a Linear Discriminant Analysis classifier on three different datasets.

Tsinganos et al.[15] studied the approaches of magnitude warping, wavelet decomposition, and synthetic surface EMG models (generative approaches) for hand gesture recognition, finding classification performance increases of up to 16% when augmented data were introduced during training.
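Magnitude warping, the first of these approaches, scales a signal by a smooth random curve that stays close to 1. The following sketch assumes a single-channel signal and uses linear interpolation between a few random knots to keep the code dependency-free, whereas smoother curves such as cubic splines are a common choice; the function name, parameters, and values are illustrative and are not taken from the cited study.

```python
import random


def magnitude_warp(signal, n_knots=4, sigma=0.2, seed=None):
    """Multiply `signal` by a smooth random curve that varies around 1."""
    if n_knots < 2:
        raise ValueError("n_knots must be at least 2")
    rng = random.Random(seed)
    # Random scaling factors at evenly spaced knots along the signal.
    knots = [rng.gauss(1.0, sigma) for _ in range(n_knots)]
    last = len(signal) - 1
    warped = []
    for i, value in enumerate(signal):
        # Position of sample i on the knot axis, in [0, n_knots - 1].
        pos = i / last * (n_knots - 1) if last > 0 else 0.0
        left = min(int(pos), n_knots - 2)
        frac = pos - left
        # Linear interpolation between neighbouring knots (an assumption here).
        scale = knots[left] + frac * (knots[left + 1] - knots[left])
        warped.append(value * scale)
    return warped


# Hypothetical short signal window.
signal = [0.0, 0.4, 0.9, 0.3, -0.5, -0.8, -0.2, 0.1]
warped = magnitude_warp(signal, n_knots=4, sigma=0.2, seed=0)
```

With `sigma=0` every knot equals 1 and the signal is returned unchanged, which is a convenient sanity check.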

Predicting mechanical signals with the help of data augmentation supports technological innovation in areas such as new energy dispatch, 5G communications, and robotic control engineering.