For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election.
The widespread availability of Markov chain Monte Carlo methods, however, has made this less of a concern.
The simplest and oldest rule for determining a non-informative prior is the principle of indifference, which assigns equal probabilities to all possibilities.
Some attempts have been made at finding a priori probabilities, i.e. probability distributions in some sense logically required by the nature of one's state of uncertainty. These are a subject of philosophical controversy, with Bayesians roughly divided into two schools: "objective Bayesians", who believe such priors exist in many useful situations, and "subjective Bayesians", who believe that in practice priors usually represent subjective judgements of opinion that cannot be rigorously justified (Williamson 2010).
Perhaps the strongest arguments for objective Bayesianism were given by Edwin T. Jaynes, based mainly on the consequences of symmetries and on the principle of maximum entropy.
As an example of an a priori prior, due to Jaynes (2003), consider a situation in which one knows a ball has been hidden under one of three cups, A, B, or C, but no other information is available about its location.
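The indifference prior for the three cups can be sketched numerically. The observation step below (lifting cup A and finding no ball) is a hypothetical addition for illustration, not part of the original example:

```python
# Principle of indifference: with no other information, each cup is
# equally likely to hide the ball.
cups = ["A", "B", "C"]
prior = {c: 1 / 3 for c in cups}

# Hypothetical observation (not in the original example): lifting cup A
# reveals no ball, so we condition on "not A" and renormalize.
posterior = {c: p for c, p in prior.items() if c != "A"}
total = sum(posterior.values())
posterior = {c: p / total for c, p in posterior.items()}
print(posterior)  # {'B': 0.5, 'C': 0.5}
```

The surviving hypotheses inherit equal probability because the prior treated them symmetrically.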
The example Jaynes gives[12] is of finding a chemical in a lab and asking whether it will dissolve in water in repeated experiments.
Priors proportional to the Haar measure can be constructed if the parameter space X carries a natural group structure that leaves our Bayesian state of knowledge invariant.
For example, in physics we might expect that an experiment will give the same results regardless of our choice of the origin of a coordinate system.
Similarly, some measurements are naturally invariant to the choice of an arbitrary scale (e.g., the physical results should be the same whether centimeters or inches are used).
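The standard scale-invariant choice is the log-uniform prior p(x) ∝ 1/x, whose mass over an interval depends only on the ratio of its endpoints. A minimal numerical sketch (the cm-to-inch conversion factor is just an illustrative rescaling):

```python
import math

def mass_log_uniform(a, b):
    """Unnormalized mass of the scale-invariant prior p(x) ~ 1/x on [a, b]."""
    return math.log(b / a)

# Equal mass per decade: [1, 10] and [10, 100] carry the same weight.
m1 = mass_log_uniform(1.0, 10.0)
m2 = mass_log_uniform(10.0, 100.0)

# Changing units (e.g. centimeters -> inches, divide by 2.54) rescales
# both endpoints, leaving every interval's mass unchanged.
k = 1 / 2.54
m1_inches = mass_log_uniform(1.0 * k, 10.0 * k)
print(m1, m2, m1_inches)  # each value is ln(10), about 2.3026
```

Because only the ratio b/a enters, any rescaling of the axis drops out, which is exactly the invariance described above.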
The principle of minimum cross-entropy generalizes MAXENT to the case of "updating" an arbitrary prior distribution with suitable constraints in the maximum-entropy sense.
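A minimal sketch of such an update, assuming a mean-value constraint: minimizing the cross-entropy (KL divergence) to a prior q subject to a fixed mean yields an exponentially tilted distribution p_i ∝ q_i·exp(λ·x_i), with λ fixed by the constraint. The die outcomes and target mean below are illustrative choices, not from the text:

```python
import math

def min_cross_entropy_update(q, x, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Update prior q on outcomes x under a mean constraint.
    The minimizer of KL(p || q) with a fixed mean is the exponential tilt
    p_i ~ q_i * exp(lam * x_i); lam is found by bisection, using the fact
    that the tilted mean is increasing in lam."""
    def tilted_mean(lam):
        w = [qi * math.exp(lam * xi) for qi, xi in zip(q, x)]
        z = sum(w)
        return sum(wi * xi for wi, xi in zip(w, x)) / z
    for _ in range(iters):
        mid = (lo + hi) / 2
        if tilted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [qi * math.exp(lam * xi) for qi, xi in zip(q, x)]
    z = sum(w)
    return [wi / z for wi in w]

# Uniform prior on a six-sided die, updated to have mean 4.5
# (an illustrative constraint in the spirit of the "Brandeis dice" problem).
faces = [1, 2, 3, 4, 5, 6]
p = min_cross_entropy_update([1 / 6] * 6, faces, 4.5)
```

With a uniform prior q this reduces to ordinary MAXENT, which is the sense in which minimum cross-entropy generalizes it.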
In the limiting case where the sample size tends to infinity, the Bernstein–von Mises theorem states that the posterior distribution becomes asymptotically normal and, under mild conditions (a prior that is continuous and positive near the true parameter value), independent of the initial prior.
Indeed, the very idea goes against the philosophy of Bayesian inference in which 'true' values of parameters are replaced by prior and posterior distributions.
This is a quasi-KL divergence ("quasi" in the sense that the square root of the Fisher information may be the kernel of an improper distribution).
Due to the minus sign, we need to minimise this in order to maximise the KL divergence with which we started.
This in turn occurs when the prior distribution is proportional to the square root of the Fisher information of the likelihood function.
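For a Bernoulli likelihood this prescription (the Jeffreys prior) can be checked numerically: the Fisher information is I(p) = 1/(p(1 − p)), so the prior is proportional to p^(−1/2)(1 − p)^(−1/2), the Beta(1/2, 1/2) density, whose normalizing constant is π. A sketch:

```python
import math

def fisher_info_bernoulli(p):
    # Fisher information of a single Bernoulli trial: I(p) = 1 / (p (1 - p))
    return 1.0 / (p * (1.0 - p))

def jeffreys_unnormalized(p):
    # Jeffreys prior: proportional to the square root of the Fisher information
    return math.sqrt(fisher_info_bernoulli(p))

# Midpoint-rule integration on (0, 1); despite the endpoint singularities
# the integral converges, to pi (the Beta(1/2, 1/2) normalizer).
n = 200_000
z = sum(jeffreys_unnormalized((i + 0.5) / n) for i in range(n)) / n
print(z)  # close to pi, about 3.14
```

The endpoint singularities are integrable, which is why this particular square-root-of-Fisher-information prior is proper even though the information itself diverges at p = 0 and p = 1.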
Another issue of importance is that if an uninformative prior is to be used routinely, i.e., with many different data sets, it should have good frequentist properties.
For example, one would want any decision rule based on the posterior distribution to be admissible under the adopted loss function.
Writing Bayes' theorem as P(A_i | B) = P(B | A_i) P(A_i) / Σ_j P(B | A_j) P(A_j), it is clear that the same result would be obtained if all the prior probabilities P(A_i) and P(A_j) were multiplied by a given constant; the same would be true for a continuous random variable.
Taking this idea further, in many cases the sum or integral of the prior values may not even need to be finite to get sensible answers for the posterior probabilities.
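This invariance is easy to check numerically; in the sketch below the prior weights and likelihoods are illustrative numbers, not drawn from the text:

```python
def posterior(prior_weights, likelihoods):
    """Posterior via Bayes' theorem. The prior weights need not sum to 1
    (or even come from a normalizable prior), as long as the products
    with the likelihood are summable."""
    joint = [w * l for w, l in zip(prior_weights, likelihoods)]
    z = sum(joint)
    return [j / z for j in joint]

likelihoods = [0.7, 0.2, 0.1]                 # P(B | A_i), illustrative
p1 = posterior([0.5, 0.3, 0.2], likelihoods)  # normalized prior
p2 = posterior([50, 30, 20], likelihoods)     # same prior scaled by 100
same = all(abs(a - b) < 1e-12 for a, b in zip(p1, p2))
print(same)  # True: the scale cancels in the normalization
```

The overall constant cancels between numerator and denominator, which is why only the relative prior weights matter, and why an improper prior can still yield a perfectly proper posterior.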
Many authors (Lindley, 1973; De Groot, 1937; Kass and Wasserman, 1996) warn against the danger of over-interpreting those priors since they are not probability densities.
In classical statistical mechanics one assumes that the ranges of the position and momentum coordinates of the individual gas elements (atoms or molecules) are finite in the phase space spanned by these coordinates, and the a priori probability is then taken proportional to the phase space volume element.
An important consequence is a result known as Liouville's theorem, i.e. the time independence of this phase space volume element and thus of the a priori probability.
A time dependence of this quantity would imply known information about the dynamics of the system, and hence would not be an a priori probability.
If one considers a huge number of replicas of this system, one obtains what is called a microcanonical ensemble.
This fundamental postulate therefore allows us to equate the a priori probability to the degeneracy of a system, i.e. to the number of different states with the same energy.
Consider the rotational energy E of a diatomic molecule with moment of inertia I in spherical polar coordinates θ, φ, i.e. E = (1/(2I))(p_θ² + p_φ²/sin²θ).
Thus in quantum mechanics the a priori probability is effectively a measure of the degeneracy, i.e. the number of states having the same energy.
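For the rigid diatomic rotor just discussed, the quantum levels are E_J = B·J(J + 1) with B = ħ²/(2I), and each level contains g_J = 2J + 1 states of the same energy (the magnetic substates m = −J, …, J). A minimal sketch of this counting, with B set to an arbitrary illustrative unit:

```python
# Degeneracy of rotational level J: the 2J + 1 states m = -J .. J
# all share the energy E_J = B * J * (J + 1).
def degeneracy(J):
    return 2 * J + 1

B = 1.0  # rotational constant hbar^2 / (2 I), arbitrary illustrative units
levels = [(J, B * J * (J + 1), degeneracy(J)) for J in range(4)]
for J, E, g in levels:
    print(f"J={J}  E={E:4.1f}  g={g}")
```

The a priori weight of level J in this quantum setting is simply the count g_J of equal-energy states.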
These functions are derived for (1) a system in dynamic equilibrium (i.e. under steady, uniform conditions) with (2) a total (and huge) number of particles N = Σ_i n_i, where n_i is the number of particles occupying states of energy ε_i.
In the case of fermions, like electrons, obeying the Pauli principle (only one particle per state or none allowed), one therefore has 0 ≤ f_i ≤ 1 for the mean occupation f_i of each single-particle state.
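A short numerical sketch of this bound using the Fermi–Dirac occupation f(ε) = 1/(e^((ε−μ)/kT) + 1); the energies, chemical potential μ, and temperature kT below are illustrative values, not taken from the text:

```python
import math

def fermi_dirac(eps, mu, kT):
    """Mean occupation of a single-particle state at energy eps.
    The Pauli principle caps occupation at one particle per state,
    so 0 <= f <= 1 always holds."""
    return 1.0 / (math.exp((eps - mu) / kT) + 1.0)

# Illustrative values in arbitrary energy units.
mu, kT = 1.0, 0.1
occupations = [fermi_dirac(e, mu, kT) for e in (0.5, 1.0, 1.5)]
print(occupations)  # near 1 below mu, exactly 0.5 at mu, near 0 above mu
```

Unlike the Boltzmann or Bose–Einstein cases, the "+1" in the denominator keeps the occupation below one for every energy, which is the Pauli restriction expressed quantitatively.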