Total variation

In mathematics, the total variation identifies several slightly different concepts, related to the (local or global) structure of the codomain of a function or a measure.

For a real-valued continuous function f, defined on an interval [a, b] ⊂ R, its total variation on the interval of definition is a measure of the one-dimensional arclength of the curve with parametric equation x ↦ f(x), for x ∈ [a, b].

The concept of total variation for functions of one real variable was first introduced by Camille Jordan in the paper (Jordan 1881).

He used the new concept in order to prove a convergence theorem for Fourier series of discontinuous periodic functions whose variation is bounded.

The extension of the concept to functions of more than one variable, however, is not simple, for various reasons.

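The total variation of a real-valued (or more generally complex-valued) function $f$, defined on an interval $[a, b] \subset \mathbb{R}$, is the quantity

$$V_a^b(f) = \sup_{P} \sum_{i=0}^{n_P - 1} |f(x_{i+1}) - f(x_i)|,$$

where the supremum runs over the set of all partitions $P = \{x_0, \dots, x_{n_P}\}$ of the given interval, with $a = x_0 < x_1 < \dots < x_{n_P} = b$.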
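The partition sums in the one-variable definition can be evaluated numerically. The following is a minimal sketch, not a definitive implementation: the helper names `variation_on_partition` and `uniform_partition` are illustrative, and uniform partitions are an assumption made for simplicity. Each partition sum is a lower bound for the supremum.

```python
import math


def variation_on_partition(f, points):
    """Sum of |f(x_{i+1}) - f(x_i)| over consecutive points of one partition."""
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))


def uniform_partition(a, b, n):
    """Return n + 1 equally spaced points from a to b inclusive (an assumed choice)."""
    return [a + (b - a) * i / n for i in range(n + 1)]


# sin on [0, 2*pi] rises by 1, falls by 2, then rises by 1: total variation 4.
coarse = variation_on_partition(math.sin, uniform_partition(0.0, 2 * math.pi, 3))
fine = variation_on_partition(math.sin, uniform_partition(0.0, 2 * math.pi, 1000))

print(coarse)  # a lower bound, roughly 3.46
print(fine)    # close to 4, since this partition contains both extrema
```

The coarse partition misses the extrema of the sine and therefore underestimates the variation, while any partition containing the points where the function changes direction already attains the supremum.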

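Let $\Omega$ be an open subset of $\mathbb{R}^n$. Given a function $f$ belonging to $L^1(\Omega)$, the total variation of $f$ in $\Omega$ is defined as

$$V(f, \Omega) := \sup\left\{ \int_\Omega f(x)\, \operatorname{div} \varphi(x)\, dx : \varphi \in C_c^1(\Omega, \mathbb{R}^n),\ \|\varphi\|_{L^\infty(\Omega)} \le 1 \right\},$$

where $C_c^1(\Omega, \mathbb{R}^n)$ is the set of continuously differentiable vector functions with compact support contained in $\Omega$, and $\|\cdot\|_{L^\infty(\Omega)}$ is the essential supremum norm. This definition does not require that the domain $\Omega \subseteq \mathbb{R}^n$ of the given function be a bounded set.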

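Consider a signed measure $\mu$ on a measurable space $(X, \Sigma)$. Two set functions, called the upper variation $\overline{W}(\mu, \cdot)$ and the lower variation $\underline{W}(\mu, \cdot)$, are defined by

$$\overline{W}(\mu, E) = \sup\{\mu(A) : A \in \Sigma,\ A \subset E\}, \qquad \underline{W}(\mu, E) = \inf\{\mu(A) : A \in \Sigma,\ A \subset E\}, \qquad E \in \Sigma,$$

so that $\overline{W}(\mu, E) \ge 0 \ge \underline{W}(\mu, E)$. The variation (also called absolute variation) of the signed measure $\mu$ is the set function

$$|\mu|(E) = \overline{W}(\mu, E) + |\underline{W}(\mu, E)|, \qquad E \in \Sigma,$$

and its total variation is defined as the value of this measure on the whole space of definition, i.e. $\|\mu\| = |\mu|(X)$.

Saks (1937, p. 11) uses upper and lower variations to prove the Hahn–Jordan decomposition: according to his version of this theorem, the upper and lower variation are respectively a non-negative and a non-positive measure. In more modern notation, setting $\mu^+ = \overline{W}(\mu, \cdot)$ and $\mu^- = -\underline{W}(\mu, \cdot)$ gives two non-negative measures with $\mu = \mu^+ - \mu^-$ and $|\mu| = \mu^+ + \mu^-$.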

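If the measure $\mu$ is complex-valued, i.e. is a complex measure, its upper and lower variation cannot be defined and the Hahn–Jordan decomposition theorem can only be applied to its real and imaginary parts. However, it is possible to follow Rudin (1966, pp. 137–139) and define the total variation of the complex-valued measure $\mu$ as follows. The variation of $\mu$ is the set function

$$|\mu|(E) = \sup_{\pi} \sum_{A \in \pi} |\mu(A)|$$

for every measurable set $E$, where the supremum is taken over all partitions $\pi$ of $E$ into a countable number of disjoint measurable subsets.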

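The variation so defined is a positive measure and coincides with the one defined through upper and lower variations when $\mu$ is a signed measure: its total variation is defined as above. This definition works also if $\mu$ is a vector measure: the variation is then defined by the following formula

$$|\mu|(E) = \sup_{\pi} \sum_{A \in \pi} \|\mu(A)\|$$

for every measurable set $E$, where the supremum is as above. This definition is slightly more general than the one given by Rudin (1966, p. 138), since it requires considering only finite partitions of the space $X$: this implies that it can also be used to define the total variation of finitely additive measures.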
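On a finite set every signed measure is a table of point masses, so the upper and lower variations can be found by brute force over all subsets. The sketch below assumes this finite setting and takes the upper variation of a set to be the supremum of the measure over its subsets and the lower variation to be the infimum; the function names and the sample weights are hypothetical.

```python
from itertools import combinations


def measure(weights, subset):
    """Value of the signed measure on a subset, as a sum of point masses."""
    return sum(weights[x] for x in subset)


def subsets(points):
    """Yield every subset of a finite collection, including the empty set."""
    pts = list(points)
    for r in range(len(pts) + 1):
        yield from combinations(pts, r)


def upper_variation(weights, E):
    """Supremum of the measure over all subsets of E (non-negative)."""
    return max(measure(weights, A) for A in subsets(E))


def lower_variation(weights, E):
    """Infimum of the measure over all subsets of E (non-positive)."""
    return min(measure(weights, A) for A in subsets(E))


def variation(weights, E):
    """Upper variation plus the absolute value of the lower variation."""
    return upper_variation(weights, E) + abs(lower_variation(weights, E))


# Illustrative signed measure on a four-point space.
mu = {"a": 0.5, "b": -1.25, "c": 2.0, "d": -0.25}
X = set(mu)

print(upper_variation(mu, X))  # 2.5, attained on {a, c}
print(lower_variation(mu, X))  # -1.5, attained on {b, d}
print(variation(mu, X))        # 4.0, the total variation of mu
```

In this example the result agrees with the partition formula: splitting the space into singletons gives the sum of the absolute values of the point masses, which is also 4.0.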

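The total variation of any probability measure is exactly one, so it is not interesting as a means of investigating the properties of such measures. However, when $\mu$ and $\nu$ are probability measures, the total variation distance of probability measures can be defined as $\|\mu - \nu\|$, where the norm is the total variation norm of signed measures. Using the property that $(\mu - \nu)(X) = 0$, we eventually arrive at the equivalent definition

$$\|\mu - \nu\| = |\mu - \nu|(X) = 2 \sup\{|\mu(A) - \nu(A)| : A \in \Sigma\},$$

and its values are non-trivial. The factor $2$ above is usually dropped (as is the convention in the article total variation distance of probability measures).

For a categorical distribution it is possible to write the total variation distance as

$$\delta(\mu, \nu) = \sum_x |\mu(x) - \nu(x)|.$$

It may also be normalized to values in $[0, 1]$ by halving the previous definition:

$$\delta(\mu, \nu) = \frac{1}{2} \sum_x |\mu(x) - \nu(x)|.$$

The total variation of a differentiable function can be expressed as an integral involving the function, rather than as the supremum appearing in the definitions above.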

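Theorem 1. The total variation of a differentiable function $f$, defined on an interval $[a, b] \subset \mathbb{R}$, has the following expression if $f'$ is Riemann integrable:

$$V_a^b(f) = \int_a^b |f'(x)|\, dx.$$

If $f$ is differentiable and monotonic, this simplifies to $V_a^b(f) = |f(a) - f(b)|$. If $[a, b]$ can be decomposed into subintervals $[a, a_1], [a_1, a_2], \dots, [a_N, b]$, with $a < a_1 < \dots < a_N < b$, on each of which $f$ is monotonic, then the total variation of $f$ over $[a, b]$ can be written as the sum of local variations on those subintervals:

$$V_a^b(f) = V_a^{a_1}(f) + V_{a_1}^{a_2}(f) + \dots + V_{a_N}^b(f) = |f(a) - f(a_1)| + |f(a_1) - f(a_2)| + \dots + |f(a_N) - f(b)|.$$

Theorem 2. Given a $C^1(\overline{\Omega})$ function $f$ defined on a bounded open set $\Omega \subseteq \mathbb{R}^n$, with $\partial\Omega$ of class $C^1$, the total variation of $f$ has the following expression:

$$V(f, \Omega) = \int_\Omega |\nabla f(x)|\, dx.$$

Proof. The first step is a lemma that follows from the Gauss–Ostrogradsky theorem: under the conditions of the theorem,

$$\int_\Omega f \operatorname{div} \varphi = -\int_\Omega \nabla f \cdot \varphi$$

for every $\varphi \in C_c^1(\Omega, \mathbb{R}^n)$. Indeed, applying the divergence theorem to the field $f\varphi$ gives $\int_\Omega \operatorname{div}(f\varphi) = \int_{\partial\Omega} f\, \varphi \cdot n = 0$, because $\varphi$ vanishes on the boundary of $\Omega$ by definition, and $\operatorname{div}(f\varphi) = \nabla f \cdot \varphi + f \operatorname{div} \varphi$.

Under the conditions of the theorem, from the lemma we have:

$$\int_\Omega f \operatorname{div} \varphi = -\int_\Omega \varphi \cdot \nabla f \le \int_\Omega |\varphi|\, |\nabla f| \le \int_\Omega |\nabla f|;$$

in the last part $\varphi$ could be omitted, because by definition its essential supremum is at most one. Conversely, take a sequence $\varphi_N \in C_c^1(\Omega, \mathbb{R}^n)$ with $\|\varphi_N\|_{L^\infty(\Omega)} \le 1$ that converges pointwise to $-\nabla f / |\nabla f|$ where $\nabla f \ne 0$ and to $0$ elsewhere; such a sequence can be obtained by cutting off and smoothing this field. Now again substituting into the lemma, and using dominated convergence:

$$\lim_{N \to \infty} \int_\Omega f \operatorname{div} \varphi_N = \lim_{N \to \infty} \left( -\int_\Omega \nabla f \cdot \varphi_N \right) = \int_\Omega |\nabla f|.$$

This means we have a convergent sequence of values $\int_\Omega f \operatorname{div} \varphi_N$ that tends to $\int_\Omega |\nabla f|$, which together with the upper bound shows that the supremum equals $\int_\Omega |\nabla f|$.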

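The total variation is a norm defined on the space of measures of bounded variation. The space of countably additive measures on a σ-algebra of sets is a Banach space, called the ca space, relative to this norm. It is contained in the larger Banach space, called the ba space, consisting of finitely additive (as opposed to countably additive) measures, with the same norm.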

The distance function associated to the norm gives rise to the total variation distance between two measures μ and ν.
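For probability distributions on a finite set, this distance reduces to a sum of absolute differences of the point probabilities. The sketch below is illustrative only: distributions are assumed to be dictionaries mapping outcomes to probabilities, `tv_distance` and `max_event_gap` are hypothetical names, and the `normalized` flag selects the convention in which the factor 2 is dropped.

```python
from itertools import combinations


def tv_distance(p, q, normalized=True):
    """Total variation distance between two categorical distributions.

    normalized=True returns half the sum of absolute differences (values in
    [0, 1]); normalized=False returns the full total variation norm of p - q.
    """
    support = set(p) | set(q)
    l1 = sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)
    return 0.5 * l1 if normalized else l1


def max_event_gap(p, q):
    """Largest |p(A) - q(A)| over all events A, found by brute force."""
    support = sorted(set(p) | set(q))
    best = 0.0
    for r in range(len(support) + 1):
        for event in combinations(support, r):
            gap = abs(sum(p.get(x, 0.0) - q.get(x, 0.0) for x in event))
            best = max(best, gap)
    return best


fair = {"H": 0.5, "T": 0.5}
biased = {"H": 0.75, "T": 0.25}

print(tv_distance(fair, biased))                    # 0.25
print(tv_distance(fair, biased, normalized=False))  # 0.5
print(max_event_gap(fair, biased))                  # 0.25
```

The brute-force search over events is exponential in the number of outcomes and is included only to show that the normalized distance equals the largest difference in probability that the two distributions assign to a single event.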

For finite measures on R, the link between the total variation of a measure μ and the total variation of a function, as described above, goes as follows.

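Given such a μ, define a function $\varphi \colon \mathbb{R} \to \mathbb{R}$ by $\varphi(t) = \mu((-\infty, t])$. Then, the total variation of the signed measure μ is equal to the total variation, in the above sense, of the function $\varphi$.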
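The correspondence between a measure and its cumulative function can be checked by hand for a purely atomic signed measure. In the sketch below, the measure is assumed to be a finite set of point masses stored as a dictionary from location to weight, the function is taken to be the mass of the half-line up to and including t, and all names and values are illustrative.

```python
def cumulative(atoms, t):
    """Mass of the half-line (-inf, t] for a measure given as {location: weight}."""
    return sum(w for x, w in atoms.items() if x <= t)


# Illustrative signed measure with three atoms on the real line.
atoms = {-1.0: 0.5, 0.0: -2.0, 1.5: 0.75}

# A partition that contains every atom, so every jump of the step function is seen.
points = [-2.0, -1.0, 0.0, 1.5, 3.0]

variation_of_function = sum(
    abs(cumulative(atoms, b) - cumulative(atoms, a))
    for a, b in zip(points, points[1:])
)
variation_of_measure = sum(abs(w) for w in atoms.values())

print(variation_of_function)  # 3.25
print(variation_of_measure)   # 3.25
```

The cumulative function here is a step function whose jumps are exactly the atom weights, so the partition sum collects the absolute value of each weight once.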

As a functional, total variation finds applications in several branches of mathematics and engineering, such as optimal control, numerical analysis, and calculus of variations, where the solution to a certain problem has to minimize its value.

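As an example, use of the total variation functional is common in the following two kinds of problems: the numerical analysis of differential equations, where total variation diminishing schemes are used to compute approximate solutions, and image denoising, where total variation denoising is used to reduce noise in images.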
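As one illustration of total variation used as a quantity to be minimized, a gray-scale image can be assigned a discrete total variation by summing absolute differences between neighbouring pixels. The sketch below assumes the anisotropic discretization (horizontal and vertical differences added separately), which is only one of several possible choices; the function name and the sample images are hypothetical.

```python
def anisotropic_tv(image):
    """Discrete anisotropic total variation of a 2D grid of numbers.

    Adds |u[i+1][j] - u[i][j]| over vertical neighbours and
    |u[i][j+1] - u[i][j]| over horizontal neighbours.
    """
    rows, cols = len(image), len(image[0])
    vertical = sum(
        abs(image[i + 1][j] - image[i][j])
        for i in range(rows - 1)
        for j in range(cols)
    )
    horizontal = sum(
        abs(image[i][j + 1] - image[i][j])
        for i in range(rows)
        for j in range(cols - 1)
    )
    return vertical + horizontal


# A clean vertical edge, and the same image with one corrupted pixel.
clean = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
noisy = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

print(anisotropic_tv(clean))  # 3
print(anisotropic_tv(noisy))  # 5
```

The single corrupted pixel raises the value from 3 to 5 while the sharp edge itself contributes only 3, which suggests why penalizing total variation tends to remove isolated noise without blurring edges.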