Dilution (neural networks)

Dilution and dropout (also called DropConnect[1]) are regularization techniques for reducing overfitting in artificial neural networks by preventing complex co-adaptations on training data.

These techniques are also sometimes referred to as random pruning of weights, but pruning is usually a non-recurring, one-way operation, whereas dilution and dropout are applied repeatedly throughout training.

Output from a layer of linear nodes in an artificial neural net can be described as

\( y_i = \sum_j w_{ij} x_j \)   (1)

where \( y_i \) is the output from node \( i \), \( w_{ij} \) is the weight of the connection from input node \( j \) to node \( i \), and \( x_j \) is the input from node \( j \). This can be written in vector notation as

\( \mathbf{y} = \mathbf{W}\mathbf{x} \)   (2)

where \( \mathbf{W} \) is the weight matrix, \( \mathbf{x} \) is the input vector, and \( \mathbf{y} \) is the output vector. Equations (1) and (2) are used in the subsequent sections.
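As a concrete illustration of equations (1) and (2), the following NumPy sketch computes the output of one linear layer; the layer sizes and random values are illustrative assumptions, not taken from the article.

```python
import numpy as np

# Equation (2): output of a layer of linear nodes, y = W x.
rng = np.random.default_rng(0)

n_inputs, n_outputs = 4, 3                    # illustrative layer sizes
W = rng.normal(size=(n_outputs, n_inputs))    # weight matrix, w_ij = weight from input j to node i
x = rng.normal(size=n_inputs)                 # input vector

y = W @ x                                     # equation (2), vector form
y_check = np.array([sum(W[i, j] * x[j] for j in range(n_inputs))
                    for i in range(n_outputs)])   # equation (1), element by element
assert np.allclose(y, y_check)
```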

During weak dilution, the finite fraction of removed connections (the weights) is small, giving rise to only a small uncertainty; this edge case can be solved exactly with mean field theory.[6][7] The effect on the weights can be described as

\( \hat{w}_{ij} = c_{ij} w_{ij} \)   (3)

where \( \hat{w}_{ij} \) is the diluted weight, \( w_{ij} \) is the real weight before dilution, and \( c_{ij} \) is a random variable that implements the dilution: it is 0 (connection removed) with the dilution probability and 1 (connection kept) otherwise.
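A minimal sketch of weak dilution per equation (3), assuming each mask entry \( c_{ij} \) is drawn independently per weight; the 5% removal probability is an illustrative stand-in for the "small finite fraction" of removed connections, not a value from the article.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(size=(3, 4))        # real weights w_ij before dilution
x = rng.normal(size=4)
p_remove = 0.05                    # weak dilution: only a small fraction of connections removed

# c_ij is 0 with probability p_remove (connection removed) and 1 otherwise, per equation (3).
C = (rng.random(size=W.shape) >= p_remove).astype(float)
W_diluted = C * W                  # element-wise: \hat{w}_ij = c_ij * w_ij

y = W_diluted @ x                  # layer output computed with the diluted weights
```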

When the dilution is strong, the finite fraction of removed connections (the weights) is large, giving rise to a large uncertainty.

Dropout is a special case of the previous weight equation (3), in which the mask removes a whole row of the weight matrix rather than individual random weights:

\( \hat{w}_{ij} = c_i w_{ij} \)   (4)

where \( c_i \) is drawn once per node \( i \), so that \( c_i = 0 \) removes the entire \( i \)-th row of \( \mathbf{W} \), i.e. the whole node. Because dropout removes a whole row from the weight matrix, the assumptions underlying weak dilution, and hence the use of mean field theory, are not applicable.
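A minimal sketch of dropout as the row-wise special case of the dilution mask, assuming one mask value per node as in equation (4); the 50% drop probability is a common illustrative choice, not something prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(2)

W = rng.normal(size=(3, 4))        # weight matrix of the layer
x = rng.normal(size=4)
p_drop = 0.5                       # illustrative dropout probability

# One mask value per node: c_i = 0 removes the entire i-th row of W, i.e. the whole node.
c = (rng.random(size=W.shape[0]) >= p_drop).astype(float)
W_dropped = c[:, None] * W         # row-wise: \hat{w}_ij = c_i * w_ij

y = W_dropped @ x
assert np.allclose(y, c * (W @ x)) # zeroing rows of W is the same as zeroing the node outputs
```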

Although there have been earlier examples of randomly removing connections between neurons in a neural network to improve models,[3] this technique was first introduced under the name dropout by Geoffrey Hinton et al. in 2012.

Figure: on the left, a fully connected neural network with two hidden layers; on the right, the same network after applying dropout.