Relief (feature selection)

[1][2] It was originally designed for application to binary classification problems with discrete or numerical features.

Alternatively, these scores may be applied as feature weights to guide downstream modeling.

[6] Their strengths are that they are not dependent on heuristics, they run in low-order polynomial time, and they are noise-tolerant and robust to feature interactions, as well as being applicable for binary or continuous data; however, it does not discriminate between redundant features, and low numbers of training instances fool the algorithm.

Relief was also described as generalizable to multinomial classification by decomposition into a number of binary problems.

Rather than use Kira and Rendell's proposed decomposition of a multinomial classification into a number of binomial problems, ReliefF searches for k near misses from each different class and averages their contributions for updating W, weighted with the prior probability of each class.

[6] They include methods for improving (1) the core Relief algorithm concept, (2) iterative approaches for scalability, (3) adaptations to different data types, (4) strategies for computational efficiency, or (5) some combination of these goals.

Utilized an iterative `evaporative' removal of lowest quality features using ReliefF scores in association with mutual information.

[20] Results suggest improved power to detect 2-way epistatic interactions over ReliefF.

[22] SWRF* extends the SURF* algorithm adopting sigmoid weighting to take distance from the threshold into account.

[23] MultiSURF*[24] extends the SURF*[21] algorithm adapting the near/far neighborhood boundaries based on the average and standard deviation of distances from the target instance to all others.

MultiSURF* uses the standard deviation to define a dead-band zone where 'middle-distance' instances do not contribute to scoring.

Evidence suggests MultiSURF to be a well rounded option, able to detect 2-way and 3-way interactions, as well as simple univariate associations.

STIR [26][27] reformulates and slightly adjusts the original Relief formula by incorporating sample variance of the nearest neighbor distances into the attribute importance estimation.

This variance permits the calculation of statistical significance of features and adjustment for multiple testing of Relief-based scores.

Relief algorithm: Selection of nearest hit, and nearest miss instance neighbors prior to scoring.