Probability bounds analysis is used to make arithmetic and logical calculations with p-boxes.
The bounds may have almost any shape, including step functions, so long as they are monotonically increasing and do not cross each other.
P-boxes are specified by left and right bounds on the distribution function (or, equivalently, the survival function) of a quantity and, optionally, additional information constraining the quantity's mean and variance to specified intervals, and specified constraints on its distributional shape (family, unimodality, symmetry, etc.).
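As an illustration of the basic requirements on the two bounds, the following is a minimal sketch that assumes a p-box tabulated on a shared grid of x-values. The function name `is_valid_pbox` and the tabulated representation are illustrative assumptions, not part of any particular p-box library. It checks only the left and right edges, not the optional moment or shape constraints.

```python
def is_valid_pbox(xs, cdf_upper, cdf_lower):
    """Check a p-box tabulated on a shared, sorted grid xs.

    cdf_upper is the left edge (upper bound on the CDF) and
    cdf_lower is the right edge (lower bound on the CDF).
    Hypothetical helper for illustration only.
    """
    if not (len(xs) == len(cdf_upper) == len(cdf_lower)):
        return False
    if any(b <= a for a, b in zip(xs, xs[1:])):
        return False  # the grid must be strictly increasing
    for bound in (cdf_upper, cdf_lower):
        if any(not 0.0 <= p <= 1.0 for p in bound):
            return False  # probabilities must lie in [0, 1]
        if any(b < a for a, b in zip(bound, bound[1:])):
            return False  # each bound must be nondecreasing
    # The edges must not cross: the upper bound is never below the lower one.
    return all(u >= l for u, l in zip(cdf_upper, cdf_lower))
```

Step functions pass this check as long as each tabulated bound is nondecreasing and the two bounds do not cross.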
Calculations with p-boxes, unlike those with credal sets, are often quite efficient, and algorithms for all standard mathematical functions are known.
Even when these ancillary constraints are vacuous, there may still be nontrivial bounds on the mean and variance that can be inferred from the left and right edges of the p-box.
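The simplest such inference concerns the mean. Every distribution inside the p-box lies between the two edges, so its mean lies between the mean of the left-edge distribution and the mean of the right-edge distribution. The sketch below assumes each edge is a step function described by n equally weighted quantiles; the function name and this discretization are illustrative assumptions. Bounds on the variance require more work and are not shown.

```python
def mean_bounds(left_q, right_q):
    """Bounds on the mean implied by the edges of a p-box.

    left_q and right_q hold n equally weighted quantiles of the left
    and right edges, paired by probability level. Hypothetical helper.
    """
    n = len(left_q)
    if n == 0 or n != len(right_q):
        raise ValueError("need two non-empty quantile lists of equal length")
    if any(l > r for l, r in zip(left_q, right_q)):
        raise ValueError("left edge must not exceed right edge at any level")
    # Mean of the left-edge distribution bounds the mean from below,
    # mean of the right-edge distribution bounds it from above.
    return sum(left_q) / n, sum(right_q) / n
```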
These define distribution-free p-boxes because they make no assumption whatever about the family or shape of the uncertain distribution.
[6] When all members of a population can be measured, or when random sample data are abundant, analysts often use an empirical distribution to summarize the values.
When those data have non-negligible measurement uncertainty represented by interval ranges about each sample value, an empirical distribution may be generalized to a p-box.
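One way to sketch this generalization: the left edge is the empirical distribution of the interval lower endpoints, and the right edge is the empirical distribution of the upper endpoints. The function name `empirical_pbox` and the representation of samples as `(lo, hi)` pairs are assumptions made for this example.

```python
from bisect import bisect_right

def empirical_pbox(intervals):
    """Return (cdf_upper, cdf_lower) functions for interval-valued samples.

    intervals is a non-empty list of (lo, hi) pairs with lo <= hi.
    Hypothetical helper for illustration only.
    """
    if not intervals or any(lo > hi for lo, hi in intervals):
        raise ValueError("need non-empty intervals with lo <= hi")
    n = len(intervals)
    los = sorted(lo for lo, _ in intervals)
    his = sorted(hi for _, hi in intervals)

    def cdf_upper(x):
        # Left edge: fraction of samples that could be <= x.
        return bisect_right(los, x) / n

    def cdf_lower(x):
        # Right edge: fraction of samples that are surely <= x.
        return bisect_right(his, x) / n

    return cdf_upper, cdf_lower
```

When every interval has zero width, the two edges coincide and the result reduces to the ordinary empirical distribution function.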
Interval measurements can also be used to generalize distributional estimates based on the method of matching moments or maximum likelihood, which make shape assumptions such as normality or lognormality.
There may be uncertainty about the shape of a probability distribution because the sample size of the empirical data characterizing it is small.
Several methods in traditional statistics have been proposed to account for this sampling uncertainty about the distribution shape, including Kolmogorov–Smirnov[9] and similar[10] confidence bands, which are distribution-free in the sense that they make no assumption about the shape of the underlying distribution.
There are related confidence-band methods that do make assumptions about the shape or family of the underlying distribution, which can often result in tighter confidence bands.
A confidence band about a distribution function is sometimes used as a p-box even though it represents statistical rather than rigorous or sure bounds.
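A distribution-free band of this kind can be sketched as follows. Note the substitution: instead of exact Kolmogorov–Smirnov critical values, this example uses the Dvoretzky–Kiefer–Wolfowitz inequality, which gives a closed-form half-width that is somewhat conservative for small samples. The function name `dkw_band` is an assumption for this example.

```python
import math
from bisect import bisect_right

def dkw_band(sample, alpha=0.05):
    """Distribution-free (1 - alpha) confidence band around the ECDF.

    Uses the Dvoretzky-Kiefer-Wolfowitz half-width, not exact
    Kolmogorov-Smirnov critical values. Hypothetical helper.
    """
    data = sorted(sample)
    n = len(data)
    if n == 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need a non-empty sample and 0 < alpha < 1")
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

    def ecdf(x):
        return bisect_right(data, x) / n

    def cdf_upper(x):
        # Left edge of the band, clipped to a valid probability.
        return min(1.0, ecdf(x) + eps)

    def cdf_lower(x):
        # Right edge of the band, clipped to a valid probability.
        return max(0.0, ecdf(x) - eps)

    return cdf_upper, cdf_lower, eps
```

The band narrows in proportion to one over the square root of the sample size, which is why small samples yield wide p-boxes.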
[18][19][16] Confidence boxes (c-boxes) characterize the inferential uncertainty about the estimate in the form of a collection of focal intervals (or sets), each with associated confidence (probability) mass.
This collection can be depicted as a p-box and can project the confidence interpretation through probability bounds analysis.
C-boxes can be computed in a variety of ways directly from random sample data.
There are confidence boxes for both parametric problems where the family of the underlying distribution from which the data were randomly generated is known (including normal, lognormal, exponential, Bernoulli, binomial, Poisson), and nonparametric problems in which the shape of the underlying distribution is unknown.
[20] Confidence boxes account for the uncertainty about a parameter that comes from the inference from observations, including the effect of small sample size, but also potentially the effects of imprecision in the data and of demographic uncertainty, which arises from trying to characterize a continuous parameter from discrete data observations.
They are analogous to Bayesian posterior distributions in that they characterize the inferential uncertainty about statistical parameters estimated from sparse or imprecise sample data, but they can have a purely frequentist interpretation that makes them useful in engineering because they offer a guarantee of statistical performance through repeated use.
In the case of the Bernoulli or binomial rate parameter, the c-box is mathematically equivalent to Walley's imprecise beta model[22][23] with the parameter s=1, which is a special case of the imprecise Dirichlet process, a central idea in robust Bayes analysis.
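To make the binomial case concrete: under the imprecise beta model with s=1, the posterior after k successes in n trials ranges over beta(k + t, n - k + 1 - t) for t between 0 and 1, so the envelope is bounded by beta(k, n - k + 1) on the left and beta(k + 1, n - k) on the right. This closed form is stated here from that model rather than taken from the text above, and should be read as an assumption of the example. The sketch evaluates both bounding distribution functions using the identity between beta distribution functions with integer parameters and binomial tail sums, which avoids any external library. The function names are illustrative.

```python
from math import comb

def binom_tail(n, p, j):
    """P(X >= j) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i)
               for i in range(max(j, 0), n + 1))

def binomial_cbox_cdf(k, n, p):
    """CDF bounds at p for the rate c-box after k successes in n trials.

    Assumes the bounds beta(k, n-k+1) and beta(k+1, n-k); for integer
    parameters each beta CDF equals a binomial tail probability.
    Hypothetical helper for illustration only.
    """
    if not (0 <= k <= n and n >= 1 and 0.0 <= p <= 1.0):
        raise ValueError("need 0 <= k <= n, n >= 1 and 0 <= p <= 1")
    upper = binom_tail(n, p, k)      # CDF of beta(k, n-k+1) at p
    lower = binom_tail(n, p, k + 1)  # CDF of beta(k+1, n-k) at p
    return upper, lower
```

When k is 0 the left bound degenerates to a point mass at zero, and when k equals n the right bound degenerates to a point mass at one; the tail sums handle both cases without special treatment.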
Since that time, formulas and algorithms for sums have been generalized and extended to differences, products, quotients and other binary and unary functions under various dependence assumptions.
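As one example of such an operation, the sketch below adds two p-boxes under the assumption of independence. It assumes each p-box has been discretized into a list of equally weighted focal intervals and forms every pairwise interval sum, in the style of a Cartesian-product convolution. The function names and the discretized representation are assumptions for this example; other dependence assumptions, including making no assumption at all, need different algorithms that are not shown.

```python
from bisect import bisect_right
from itertools import product

def add_independent(a, b):
    """Sum two discretized p-boxes assumed to be independent.

    a and b are lists of (lo, hi) focal intervals, each list carrying
    equal probability mass per interval. The result lists all pairwise
    interval sums, each with mass 1 / (len(a) * len(b)).
    Hypothetical helper for illustration only.
    """
    return [(alo + blo, ahi + bhi)
            for (alo, ahi), (blo, bhi) in product(a, b)]

def cdf_bounds(focal, x):
    """Upper and lower CDF bounds at x for equal-mass focal intervals."""
    n = len(focal)
    los = sorted(lo for lo, _ in focal)
    his = sorted(hi for _, hi in focal)
    return bisect_right(los, x) / n, bisect_right(his, x) / n
```

In practice the number of focal intervals grows multiplicatively with each operation, so implementations typically condense the result back to a fixed number of intervals, a step omitted here.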
Precise probability distributions and intervals are special cases of p-boxes, as are real values and integers.
Mathematically, a probability distribution F is the degenerate p-box {F, F, E(F), V(F), F}, where E and V denote the expectation and variance operators.
The p-box of an interval looks like a rectangular box whose upper and lower bounds jump from zero to one at the endpoints of the interval.
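The rectangular shape can be written down directly. In this minimal sketch, with the hypothetical function name `interval_pbox`, the left edge jumps from zero to one at the lower endpoint and the right edge jumps at the upper endpoint. A real value c is then the special case where both endpoints equal c.

```python
def interval_pbox(a, b):
    """Return (cdf_upper, cdf_lower) functions for the interval [a, b].

    Hypothetical helper for illustration only.
    """
    if a > b:
        raise ValueError("need a <= b")

    def cdf_upper(x):
        # Left edge: all mass could already sit at the lower endpoint.
        return 1.0 if x >= a else 0.0

    def cdf_lower(x):
        # Right edge: all mass could sit as far out as the upper endpoint.
        return 1.0 if x >= b else 0.0

    return cdf_upper, cdf_lower
```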
P-boxes and probability bounds analysis have been used in many applications spanning many disciplines in engineering and environmental science.
No internal structure.
To achieve computational efficiency, p-boxes lose information compared to more complex Dempster–Shafer structures or credal sets.
Some critics of p-boxes argue that precisely specified probability distributions are sufficient to characterize uncertainty of all kinds.
Under this criticism, users of p-boxes have simply not made the requisite effort to identify the appropriate precisely specified distribution functions.