Feature (machine learning)

In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a data set.

[1] Choosing informative, discriminating, and independent features is crucial to produce effective algorithms for pattern recognition, classification, and regression tasks.

Features are usually numeric, but other types such as strings and graphs are used in syntactic pattern recognition, after some pre-processing step such as one-hot encoding.

In speech recognition, features for recognizing phonemes can include noise ratios, length of sounds, relative power, filter matches and many others.

Feature vectors are often combined with weights using a dot product in order to construct a linear predictor function that is used to determine a score for making a prediction.

Feature construction has long been considered a powerful tool for increasing both accuracy and understanding of structure, particularly in high-dimensional problems.

[6] The initial set of raw features can be redundant and large enough that estimation and optimization is made difficult or ineffective.

It requires the experimentation of multiple possibilities and the combination of automated techniques with the intuition and knowledge of the domain expert.