In science and engineering, a log–log graph or log–log plot is a two-dimensional graph of numerical data that uses logarithmic scales on both the horizontal and vertical axes.
To find the slope of the plot, two points are selected on the x-axis, say x1 and x2. The corresponding values F(x1) and F(x2) are read off the line, and the slope m is the ratio of the change in log F to the change in log x:
\[ m = \frac{\log F(x_2) - \log F(x_1)}{\log x_2 - \log x_1} = \frac{\log\bigl(F(x_2)/F(x_1)\bigr)}{\log\bigl(x_2/x_1\bigr)}. \]
The formula also accommodates negative slopes, as can be seen from the following property of the logarithm:
\[ \log(x_1/x_2) = -\log(x_2/x_1). \]
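As a purely illustrative example (the numbers are chosen here only for convenience), a line passing through (2, 8) and (8, 512) has slope m = log(512/8)/log(8/2) = log 64/log 4 = 3.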
The above procedure is now reversed to find the form of the function F(x) using its (assumed) known log–log plot.
To find the function F, pick some fixed point (x0, F0), where F0 is shorthand for F(x0), somewhere on the straight line in the above graph, and further some other arbitrary point (x1, F1) on the same graph. Applying the slope formula to these two points gives
\[ m = \frac{\log(F_1/F_0)}{\log(x_1/x_0)}, \]
and solving for the value of the line at an arbitrary x yields
\[ F(x) = F_0\left(\frac{x}{x_0}\right)^{m} = \text{constant}\cdot x^{m}. \]
In other words, F is proportional to x raised to the power of the slope of the straight line on its log–log graph.
Specifically, a straight line on a log–log plot containing points (x0, F0) and (x1, F1) corresponds to the function:
\[ F(x) = F_0\left(\frac{x}{x_0}\right)^{\log(F_1/F_0)/\log(x_1/x_0)}. \]
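Continuing the illustrative numbers above, the line through (2, 8) and (8, 512) has slope 3, so F(x) = 8(x/2)^3 = x^3, which indeed passes through both points.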
Since the integration runs over a definite interval (two defined endpoints x0 and x1), the area A under the plot takes the form
\[ A = \int_{x_0}^{x_1} F(x)\,dx = \int_{x_0}^{x_1} \text{constant}\cdot x^{m}\,dx = \frac{\text{constant}}{m+1}\left(x_1^{\,m+1} - x_0^{\,m+1}\right), \qquad m \neq -1. \]
Rearranging the original equation to express the constant as F0/x0^m and plugging in the fixed-point values, it is found that
\[ A = \frac{F_0}{x_0^{\,m}\,(m+1)}\left(x_1^{\,m+1} - x_0^{\,m+1}\right), \qquad m \neq -1, \]
while for m = −1 the integral instead gives A = F0 x0 ln(x1/x0).
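For the illustrative line F(x) = x^3 between x0 = 2 and x1 = 8, this gives A = (8^4 − 2^4)/4 = (4096 − 16)/4 = 1020.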
Log–log plots are often used for visualizing log–log linear regression models with (roughly) log-normal or log-logistic errors. This model is useful when dealing with data that grows or decays as a power of the independent variable, while the spread of the errors increases as the independent variable grows (i.e., heteroscedastic error).
As above, in a log-log linear model the relationship between the variables is expressed as a power law.
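As a minimal sketch (the parameter values, noise level, and use of NumPy's polyfit are illustrative assumptions, not taken from the article), such a model can be fitted by ordinary least squares on the logged variables:

```python
import numpy as np

# Illustrative (assumed) parameters: y = a * x**b with multiplicative log-normal noise.
rng = np.random.default_rng(42)
a, b = 2.0, 0.5
x = np.linspace(1.0, 100.0, 500)
y = a * x**b * rng.lognormal(mean=0.0, sigma=0.3, size=x.size)

# Taking logs turns the power law into a straight line with (roughly) normal errors:
#   log y = log a + b * log x + eps
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
a_hat, b_hat = np.exp(intercept), slope
print(f"estimated a ~ {a_hat:.2f}, estimated b ~ {b_hat:.2f}")
```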
The left plot in Figure 1, titled 'Concave Line with Log-Normal Noise', displays a scatter plot of the observed data (y) against the independent variable (x).
This plot illustrates a dataset with a power-law relationship between the variables, represented by a concave line.
The transformation from the left plot to the right plot in Figure 1 also demonstrates the effect of the log transformation on the distribution of noise in the data.
In the left plot, the noise appears to follow a log-normal distribution, which is right-skewed and can be difficult to work with.
In the right plot, after the log transformation, the noise appears to follow a normal distribution, which is easier to reason about and model.
This normalization of noise is further analyzed in Figure 2, which presents a line plot of three error metrics (Mean Absolute Error - MAE, Root Mean Square Error - RMSE, and Mean Absolute Logarithmic Error - MALE) calculated over a sliding window of size 28 on the x-axis.
Each error metric is represented by a different color, with a smoothed line overlaying the corresponding raw line (because the data are simulated, the raw error estimates are somewhat noisy).
These error metrics provide a measure of the noise as it varies across different x values.
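The following sketch shows one way such windowed metrics could be computed; the helper name and the definition of MALE as the mean absolute difference of the logs are assumptions made for illustration, with the window size 28 taken from the description above:

```python
import numpy as np

def windowed_errors(x, y_true, y_pred, window=28):
    """Hypothetical helper: MAE, RMSE and MALE over a sliding window along x."""
    order = np.argsort(x)
    y_true, y_pred = np.asarray(y_true)[order], np.asarray(y_pred)[order]
    mae, rmse, male = [], [], []
    for i in range(len(y_true) - window + 1):
        t, p = y_true[i:i + window], y_pred[i:i + window]
        mae.append(np.mean(np.abs(t - p)))
        rmse.append(np.sqrt(np.mean((t - p) ** 2)))
        male.append(np.mean(np.abs(np.log(t) - np.log(p))))  # assumed MALE definition
    return np.array(mae), np.array(rmse), np.array(male)
```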
Log-log linear models are widely used in various fields, including economics, biology, and physics, where many phenomena exhibit power-law behavior.
They are also useful in regression analysis when dealing with heteroscedastic data, as the log transformation can help to stabilize the variance.
One economic example is the estimation of a money demand function of the multiplicative form M = A R^b Y^c U, where M is the real quantity of money held by the public, R is the rate of return on an alternative, higher-yielding asset in excess of that on money, Y is the public's real income, U is an error term assumed to be lognormally distributed, A is a scale parameter to be estimated, and b and c are elasticity parameters to be estimated.
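Taking logarithms of this specification makes it linear in the parameters, and hence estimable by linear regression on a log–log scale:
\[ \log M = \log A + b \log R + c \log Y + \log U, \]
where log U is normally distributed because U is lognormal.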
Another economic example is the estimation of a firm's Cobb–Douglas production function, which is the right side of the equation
\[ Q = A N^{\alpha} K^{\beta} U, \]
in which Q is the quantity of output that can be produced per month, N is the number of hours of labor employed in production per month, K is the number of hours of physical capital utilized per month, U is an error term assumed to be lognormally distributed, and A, α, and β are parameters to be estimated.
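Again, taking logarithms yields a relation that is linear in the parameters, so α and β (the output elasticities of labor and capital) can be estimated by linear regression:
\[ \log Q = \log A + \alpha \log N + \beta \log K + \log U. \]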
However, going in the other direction – observing that data appear as an approximate line on a log–log scale and concluding that the data follow a power law – is not always valid.[2] In fact, many other functional forms appear approximately linear on the log–log scale, and simply evaluating the goodness of fit of a linear regression on logged data using the coefficient of determination (R2) may be invalid, because the assumptions of the linear regression model, such as Gaussian errors, may not be satisfied. In addition, tests of fit of the log–log form may exhibit low statistical power, as such tests may have a low likelihood of rejecting power laws in the presence of other true functional forms.
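As an illustration of this pitfall (the particular curve and constants below are invented for the sketch), a function whose logarithm is quadratic in log x – the shape of a log-normal density – can give a nearly perfect linear fit on logged data over a limited range even though it is not a power law:

```python
import numpy as np

x = np.logspace(0, 2, 200)                   # 1 to 100, evenly spaced in log x
y = np.exp(-(np.log(x) - 10.0) ** 2 / 20.0)  # log y is quadratic in log x: not a power law

logx, logy = np.log(x), np.log(y)
slope, intercept = np.polyfit(logx, logy, 1)
resid = logy - (slope * logx + intercept)
r2 = 1.0 - resid.var() / logy.var()
print(f"slope = {slope:.2f}, R^2 = {r2:.3f}")  # R^2 is close to 1 despite the curvature
```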
While simple log–log plots may be instructive in detecting possible power laws, and have been used as far back as Pareto in the 1890s, validation as a power law requires more sophisticated statistics.[2] These graphs are also extremely useful when data are gathered by varying the control variable along an exponential function, in which case the control variable x is more naturally represented on a log scale, so that the data points are evenly spaced rather than compressed at the low end.
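For instance, sampling the control variable with NumPy's logspace – e.g., numpy.logspace(0, 3, 10), which returns ten points from 1 to 1000 evenly spaced in log10 x – produces data points that are evenly spread along a logarithmic axis.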
In chemical kinetics, the general form of the dependence of the reaction rate on concentration takes the form of a power law (law of mass action), so a log-log plot is useful for estimating the reaction parameters from experiment.
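For example, for a single reactant a rate law of the form
\[ v = k[\mathrm{A}]^{n} \quad\Longrightarrow\quad \log v = \log k + n \log[\mathrm{A}], \]
so plotting the measured rate against concentration on log–log axes gives the reaction order n as the slope and log k as the intercept; with several reactants, the same idea applies to each concentration varied in turn.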