The free energy principle is a theoretical framework suggesting that the brain reduces surprise or uncertainty by making predictions based on internal models and updating them using sensory input.
It highlights the brain's objective of aligning its internal model with the external world to improve prediction accuracy.[2] It establishes that the dynamics of physical systems minimise a quantity known as surprisal (the negative log probability of an outcome) or, equivalently, its variational upper bound, called free energy.
The free energy principle is based on the Bayesian idea of the brain as an “inference engine.” Under the free energy principle, systems pursue paths of least surprise or, equivalently, minimise the difference between predictions based on their model of the world and their sensations and associated perceptions.
In a 2018 interview, Friston explained what it entails for the free energy principle not to be subject to falsification:[5] “I think it is useful to make a fundamental distinction at this point—that we can appeal to later.”[6]
The notion that self-organising biological systems – like a cell or brain – can be understood as minimising variational free energy is based upon Helmholtz’s work on unconscious inference[7] and subsequent treatments in psychology[8] and machine learning.
Variational free energy is defined in relation to a variational density – an approximate posterior over hidden causes – and a probabilistic generative model that produces predicted observations from hypothesised causes.
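In a standard formulation (with notation assumed here for exposition, not taken verbatim from the cited sources), for sensory outcomes \( o \), hidden causes \( s \), a generative model \( p(o,s) \) and a variational density \( q(s) \):

\[
F[q,o] \;=\; \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o,s)\right] \;=\; -\ln p(o) \;+\; D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s\mid o)\right] \;\geq\; -\ln p(o),
\]

so minimising \( F \) with respect to \( q \) simultaneously upper-bounds surprisal \( -\ln p(o) \) and drives the variational density toward the true posterior \( p(s\mid o) \).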
This means that if a system acts to minimise free energy, it will implicitly place an upper bound on the entropy of the outcomes – or sensory states – it samples.
Negative free energy is formally equivalent to the evidence lower bound, which is commonly used in machine learning to train generative models, such as variational autoencoders.
Bayes' rule characterises the probabilistically optimal inversion of such a causal model, but applying it is typically computationally intractable, leading to the use of approximate methods.
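The bound can be made concrete with a toy conjugate-Gaussian model, where the evidence is available in closed form and the ELBO can be checked against it directly (a minimal illustrative sketch; the model and numbers are hypothetical, not drawn from the cited literature):

```python
import numpy as np

# Toy conjugate-Gaussian model (hypothetical, for illustration only):
#   prior:      s ~ N(0, 1)
#   likelihood: o | s ~ N(s, 1)
# The exact evidence is o ~ N(0, 2), so the bound can be checked directly.

def log_evidence(o):
    return -0.5 * np.log(2 * np.pi * 2.0) - o**2 / 4.0

def neg_free_energy(o, m, v):
    """ELBO (negative variational free energy) for q(s) = N(m, v)."""
    e_log_lik   = -0.5 * np.log(2 * np.pi) - 0.5 * ((o - m)**2 + v)
    e_log_prior = -0.5 * np.log(2 * np.pi) - 0.5 * (m**2 + v)
    entropy_q   = 0.5 * (np.log(2 * np.pi * v) + 1.0)
    return e_log_lik + e_log_prior + entropy_q

o = 1.3
print(log_evidence(o))                     # exact log evidence: about -1.688
print(neg_free_energy(o, m=o / 2, v=0.5))  # exact posterior N(o/2, 1/2): tight
print(neg_free_energy(o, m=0.0, v=1.0))    # any other q: strictly looser bound
```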
This dual optimisation of perception and action is characteristic of active inference, and the free energy principle is the hypothesis that all systems that perceive and act can be characterised in this way.
This upper bound on the entropy of sensory states resists a natural tendency to disorder, of the sort associated with the second law of thermodynamics and the fluctuation theorem.
Free energy minimisation is also used in Bayesian model selection, where free energy can be usefully decomposed into complexity and accuracy: models with minimum free energy provide an accurate explanation of the data under complexity costs (cf. Occam's razor and more formal treatments of computational costs[33]).
Here, complexity is the divergence between the variational density and prior beliefs about hidden states (i.e., the effective degrees of freedom used to explain the data).
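In symbols (a standard decomposition that follows directly from the definition of free energy above):

\[
F \;=\; \underbrace{D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s)\right]}_{\text{complexity}} \;-\; \underbrace{\mathbb{E}_{q(s)}\!\left[\ln p(o\mid s)\right]}_{\text{accuracy}},
\]

so minimising free energy maximises the accuracy of the explanation while penalising departures from prior beliefs.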
If sensory perturbations are suspended for a suitably long period of time, complexity is minimised, because accuracy can be neglected.[36][12] Free energy minimisation provides a useful way to formulate normative (Bayes-optimal) models of neuronal inference and learning under uncertainty[37] and therefore subscribes to the Bayesian brain hypothesis.[38]
The neuronal processes described by free energy minimisation depend on the nature of the hidden states, which can comprise time-dependent variables, time-invariant parameters, and the precision (inverse variance or temperature) of random fluctuations.
Free energy minimisation formalises the notion of unconscious inference in perception[7][9] and provides a normative (Bayesian) theory of neuronal processing.
The associated process theory of neuronal dynamics is based on minimising free energy through gradient descent.
This corresponds to generalised Bayesian filtering (where \( \tilde{\mu} \) denotes internal states in generalised coordinates of motion and \( D \) is a derivative matrix operator):[39]

\[
\dot{\tilde{\mu}} \;=\; D\tilde{\mu} \;-\; \partial_{\mu} F(s,\mu)\big|_{\mu = \tilde{\mu}}
\]

Usually, the generative models that define free energy are non-linear and hierarchical (like cortical hierarchies in the brain).
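A minimal sketch of perception as gradient descent on free energy, assuming a single-level Gaussian model with a hypothetical non-linear mapping g (the scheme of [39] additionally uses generalised coordinates and hierarchical models):

```python
# Minimal predictive-coding sketch (hypothetical single-level model; the
# scheme in [39] uses hierarchies and generalised coordinates of motion).
# A scalar cause mu explains an observation o through g(mu) = mu**2, with
# a Gaussian prior N(eta, sigma_p2) and Gaussian observation noise.

eta, sigma_p2 = 1.0, 1.0   # prior mean and variance of the hidden cause
sigma_o2 = 0.1             # observation-noise variance (precision = 1/var)
o = 2.0                    # the sensory sample to be explained

g  = lambda mu: mu**2      # non-linear generative mapping
dg = lambda mu: 2 * mu     # its derivative

def free_energy(mu):
    eps_o = o - g(mu)      # sensory prediction error
    eps_p = mu - eta       # prior prediction error
    return eps_o**2 / (2 * sigma_o2) + eps_p**2 / (2 * sigma_p2)

mu, lr = eta, 0.01         # perception starts at the prior mean
for _ in range(500):
    eps_o, eps_p = o - g(mu), mu - eta
    # Gradient descent on F: a balance of precision-weighted errors
    mu += lr * (eps_o * dg(mu) / sigma_o2 - eps_p / sigma_p2)

print(mu, free_energy(mu)) # mu settles between eta and sqrt(o)
```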
In neuronally plausible implementations of predictive coding,[41] optimising the precision of prediction errors corresponds to optimising the excitability of superficial pyramidal cells and has been interpreted in terms of attentional gain.
A notable feature of this model is the reformulation of the free energy function solely in terms of prediction errors during task performance.
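Under Gaussian assumptions, such a formulation takes a generic form (a standard expression, not the specific equation of the model in question): up to additive constants, free energy is a sum of precision-weighted squared prediction errors,

\[
F \;\approx\; \tfrac{1}{2}\sum_i \pi_i\,\varepsilon_i^{2} \;-\; \tfrac{1}{2}\sum_i \ln \pi_i,
\]

where the \( \varepsilon_i \) are sensory and prior prediction errors and the \( \pi_i \) are their precisions (inverse variances).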
The model has also been shown to predict EEG and fMRI data from human experiments with high accuracy.
In the same vein, Yahya et al. applied the free energy principle to propose a computational model for template matching in covert selective visual attention that relies largely on the selective attention for identification model (SAIM).[46] According to this study, the total free energy of the whole state-space is obtained by inserting top-down signals into the original neural networks, yielding a dynamical system comprising both feed-forward and backward prediction errors.
When gradient descent is applied to action, \( \dot{a} = -\partial_a F(s,\tilde{\mu}) \), motor control can be understood in terms of classical reflex arcs that are engaged by descending (corticospinal) predictions.
This provides a formalism that generalises the equilibrium point solution (to the degrees of freedom problem[47]) to movement trajectories.
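A toy illustration of this reading of motor control, assuming a fixed descending prediction, a unit reflex gain \( \partial o/\partial a = 1 \), and action set proportionally to the free-energy gradient (an overdamped form of \( \dot{a} = -\partial_a F \)); all names and numbers are hypothetical:

```python
# Toy reflex-arc illustration (hypothetical; not the model of [47]): a fixed
# descending prediction mu specifies where a limb "should" be, and action
# moves the limb until proprioception matches the prediction. Action is set
# proportionally to the free-energy gradient (an overdamped form of
# a_dot = -dF/da), with the reflex gain do/da assumed to equal 1.

mu = 1.0          # descending (corticospinal) prediction of limb position
sigma_o2 = 0.05   # sensory-noise variance: sensation is trusted strongly
x = 0.0           # true limb position in the world
dt = 0.01

for _ in range(1000):
    o = x                     # proprioceptive sensation of position
    eps_o = o - mu            # sensory prediction error
    a = -eps_o / sigma_o2     # -dF/da = -(eps_o / sigma_o2) * do/da
    x += a * dt               # acting changes the world, hence sensation

print(round(x, 3))            # -> 1.0: the movement fulfils the prediction
```

Here the descending prediction plays the role that a cost or goal function would play in optimal control, which is the correspondence described next.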
Active inference is related to optimal control by replacing value or cost-to-go functions with prior beliefs about state transitions or flow.[54] Active inference has been used to address a range of issues in cognitive neuroscience, brain function and neuropsychiatry, including action observation,[55] mirror neurons,[56] saccades and visual search,[57][58] eye movements,[59] sleep,[60] illusions,[61] attention,[44] action selection,[52] consciousness,[62][63] hysteria[64] and psychosis.