Power law

Especially probability bounds, the suspected cause of typical bending and/or flattening phenomena in the high- and low-frequency graphical segments, are parametrically absent in the standard model.

In fact, there are many ways to generate finite amounts of data that mimic this signature behavior, but, in their asymptotic limit, are not true power laws.

[citation needed] Thus, accurately fitting and validating power-law models is an active area of research in statistics; see below.

; most identified power laws in nature have exponents such that the mean is well-defined but the variance is not, implying they are capable of black swan behavior.

On the one hand, this makes it incorrect to apply traditional statistics that are based on variance and standard deviation (such as regression analysis).

[2] The equivalence of power laws with a particular scaling exponent can have a deeper origin in the dynamical processes that generate the power-law relation.

Diverse systems with the same critical exponents—that is, which display identical scaling behaviour as they approach criticality—can be shown, via renormalization group theory, to share the same fundamental dynamics.

For instance, the behavior of water and CO2 at their boiling points fall in the same universality class because they have identical critical exponents.

Scientific interest in power-law relations stems partly from the ease with which certain general classes of mechanisms generate them.

[15] The demonstration of a power-law relation in some data can point to specific kinds of mechanisms that might underlie the natural phenomenon in question, and can indicate a deep connection with other, seemingly unrelated systems;[16] see also universality above.

The ubiquity of power-law relations in physics is partly due to dimensional constraints, while in complex systems, power laws are often thought to be signatures of hierarchy or of specific stochastic processes.

, which can represent uncertainty in the observed values (perhaps measurement or sampling errors) or provide a simple way for observations to deviate from the power-law function (perhaps for stochastic reasons): Mathematically, a strict power law cannot be a probability distribution, but a distribution that is a truncated power function is possible:

[10] More than a hundred power-law distributions have been identified in physics (e.g. sandpile avalanches), biology (e.g. species extinction and body mass), and the social sciences (e.g. city sizes and income).

If the resultant scatterplot suggests that the plotted points asymptotically converge to a straight line, then a power-law distribution should be suspected.

[57] On the other hand, in its version for identifying power-law probability distributions, the mean residual life plot consists of first log-transforming the data, and then plotting the average of those log-transformed data that are higher than the i-th order statistic versus the i-th order statistic, for i = 1, ..., n, where n is the size of the random sample.

If the resultant scatterplot suggests that the plotted points tend to stabilize about a horizontal straight line, then a power-law distribution should be suspected.

If the points in the plot tend to converge to a straight line for large numbers in the x axis, then the researcher concludes that the distribution has a power-law tail.

In general, power-law distributions are plotted on doubly logarithmic axes, which emphasizes the upper tail region.

[10][69] The survival function, on the other hand, is more robust to (but not without) such biases in the data and preserves the linear signature on doubly logarithmic axes.

Though a survival function representation is favored over that of the pdf while fitting a power law to the data with the linear least square method, it is not devoid of mathematical inaccuracy.

There are many ways of estimating the value of the scaling exponent for a power-law tail, however not all of them yield unbiased and consistent answers.

Alternative methods are often based on making a linear regression on either the log–log probability, the log–log cumulative distribution function, or on log-binned data, but these approaches should be avoided as they can all lead to highly biased estimates of the scaling exponent.

[10] Further, this comprehensive review article provides usable code (Matlab, Python, R and C++) for estimation and testing routines for power-law distributions.

Another method for the estimation of the power-law exponent, which does not assume independent and identically distributed (iid) data, uses the minimization of the Kolmogorov–Smirnov statistic,

Use of cumulative frequency has some advantages, e.g. it allows one to put on the same diagram data gathered from sample lines of different lengths at different scales (e.g. from outcrop and from microscope).

[citation needed] For example, Gibrat's law about proportional growth processes produce distributions that are lognormal, although their log–log plots look linear over a limited range.

[73] Stumpf & Porter (2012) proposed plotting the empirical cumulative distribution function in the log-log domain and claimed that a candidate power-law should cover at least two orders of magnitude.

As a solution to this problem, Diaz[57] proposed a graphical methodology based on random samples that allow visually discerning between different types of tail behavior.

However, Stumpf & Porter (2012) claimed the need for both a statistical and a theoretical background in order to support a power-law in the underlying mechanism driving the data generating process.

As such, the validation of power-law claims remains a very active field of research in many areas of modern science.