Data envelopment analysis

Data envelopment analysis (DEA) is a nonparametric method in operations research and economics for the estimation of production frontiers.[1]

DEA has been applied in a wide range of fields, including international banking, economic sustainability, police department operations, and logistical applications.[2][3][4] Additionally, DEA has been used to assess the performance of natural language processing models, and it has found other applications within machine learning.[5][6][7]

DEA is used to empirically measure the productive efficiency of decision-making units (DMUs).

Although DEA has a strong link to production theory in economics, the method is also used for benchmarking in operations management, whereby a set of measures is selected to benchmark the performance of manufacturing and service operations.[8]

In benchmarking, the efficient DMUs, as defined by DEA, may not necessarily form a “production frontier”, but rather lead to a “best-practice frontier.”[1][9]: 243–285

In contrast to parametric methods that require the ex-ante specification of a production or cost function, non-parametric approaches compare feasible input and output combinations based on the available data only.[10]

DEA, one of the most commonly used non-parametric methods, owes its name to the way it envelops the dataset's efficient DMUs: the empirically observed, most efficient DMUs constitute the production frontier against which all DMUs are compared.

DEA's popularity stems from its relative lack of assumptions, its ability to benchmark multi-dimensional inputs and outputs, and its computational ease: although its task is to calculate efficiency ratios, it can be expressed as a linear program.[11]

Building on the ideas of Farrell,[12] the 1978 work "Measuring the efficiency of decision-making units" by Charnes, Cooper & Rhodes[1] applied linear programming to estimate, for the first time, an empirical production-technology frontier.

Starting with the CCR model, named after Charnes, Cooper, and Rhodes,[1] many extensions to DEA have been proposed in the literature.

They range from adapting implicit model assumptions, such as input and output orientation, distinguishing technical and allocative efficiency,[13] adding limited disposability[14] of inputs/outputs, or varying returns to scale,[15] to techniques that utilize DEA results and extend them for more sophisticated analyses, such as stochastic DEA[16] or cross-efficiency analysis.[17]

In a one-input, one-output scenario, efficiency is merely the ratio of output to input, and comparing several entities/DMUs based on it is trivial.
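The single-ratio case can be illustrated with a minimal sketch. The three units and their numbers below are hypothetical, chosen only to show the arithmetic; each unit's ratio is divided by the best observed ratio so that the best performer scores 1.

```python
from fractions import Fraction

# Hypothetical units: name -> (input consumed, output produced).
units = {"A": (10, 50), "B": (20, 80), "C": (5, 30)}

# Absolute efficiency: output per unit of input.
ratios = {name: Fraction(out, inp) for name, (inp, out) in units.items()}

# Relative efficiency: each ratio divided by the best observed ratio,
# so the best unit scores 1 and every other unit scores below 1.
best = max(ratios.values())
relative = {name: ratio / best for name, ratio in ratios.items()}

for name in units:
    print(name, ratios[name], relative[name])
```

With one input and one output the ranking follows directly from sorting the ratios, so no optimization is needed.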

However, when adding more inputs or outputs the efficiency computation becomes more complex.

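Charnes, Cooper, and Rhodes (1978)[1] in their basic DEA model (the CCR) define the objective function to find the efficiency $\theta_j$ of $DMU_j$ as

$$\max \theta_j = \frac{\sum_{m=1}^{M} y_m^j u_m^j}{\sum_{n=1}^{N} x_n^j v_n^j},$$

where the $M$ known outputs $y_1^j, \dots, y_M^j$ of $DMU_j$ are multiplied by their respective weights $u_1^j, \dots, u_M^j$ and divided by the $N$ inputs $x_1^j, \dots, x_N^j$ multiplied by their respective weights $v_1^j, \dots, v_N^j$.

The efficiency score $\theta_j$ is maximized under the constraints that, using those weights on each $DMU_k$, $k = 1, \dots, K$, no efficiency score exceeds one:

$$\frac{\sum_{m=1}^{M} y_m^k u_m^j}{\sum_{n=1}^{N} x_n^k v_n^j} \leq 1, \qquad k = 1, \dots, K,$$

and all inputs, outputs and weights have to be non-negative.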
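To solve this ratio problem with linear programming, the weighted input of the evaluated unit is commonly fixed to one, so that only its weighted output remains to be maximized. The following is a sketch of that linearized (multiplier) form. The notation is an assumption chosen for illustration: $y_m^k$ and $x_n^k$ denote the $m$-th output and $n$-th input of unit $k$, $u_m^j$ and $v_n^j$ the output and input weights chosen for the evaluated unit $j$, and $K$ the number of units.

```latex
\begin{aligned}
\max_{u^j,\, v^j} \quad & \sum_{m=1}^{M} y_m^j u_m^j \\
\text{subject to} \quad & \sum_{n=1}^{N} x_n^j v_n^j = 1, \\
& \sum_{m=1}^{M} y_m^k u_m^j - \sum_{n=1}^{N} x_n^k v_n^j \leq 0, \qquad k = 1, \dots, K, \\
& u_m^j \geq 0, \quad v_n^j \geq 0 .
\end{aligned}
```

Each ratio constraint becomes linear once it is multiplied through by its (positive) denominator, and the optimal objective value equals the efficiency score of unit $j$.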

Because this optimization problem's dimensionality is equal to the sum of its inputs and outputs, it is crucial to select the smallest set of inputs and outputs that collectively and accurately capture the process being characterized.

Because the production frontier is enveloped empirically, several guidelines exist on the minimum number of DMUs required for the analysis to have good discriminatory power, given homogeneity of the sample.

This minimum number of DMUs varies between twice the sum of the numbers of inputs and outputs, $2(M+N)$, and twice their product, $2MN$, for $M$ outputs and $N$ inputs.

Some advantages of the DEA approach are:

Some of the disadvantages of DEA are:

Assume that we have the following data:

To calculate the efficiency of unit 1, we define the objective function (OF) as the ratio of unit 1's weighted outputs to its weighted inputs. This ratio is maximized subject to (ST) the efficiency of every unit, computed with the same weights, not being larger than 1, and to the non-negativity of all weights.

A fraction with decision variables in the numerator and denominator is nonlinear.
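A small program of this kind can be solved exactly once the weighted input of the evaluated unit is fixed to 1, which makes the objective and constraints linear. The sketch below uses hypothetical data (three units, two inputs, one output; the numbers are illustrative and are not the data of the example above) and hypothetical helper names `solve_linear` and `ccr_efficiency`. It is not a production LP solver: it simply enumerates the vertices of the feasible region, which is only practical for tiny problems and assumes strictly positive inputs and outputs so that the region is bounded.

```python
from fractions import Fraction
from itertools import combinations


def solve_linear(a, b):
    """Solve the square system a z = b exactly; return None if it is singular."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def ccr_efficiency(inputs, outputs, o):
    """Efficiency of unit o in the linearized CCR program, by vertex enumeration."""
    n_out, n_in = len(outputs[0]), len(inputs[0])
    n = n_out + n_in  # variables: output weights u, then input weights v
    # Normalization: the weighted input of unit o equals 1.
    eq = [0] * n_out + list(inputs[o])
    # For every unit k: weighted output - weighted input <= 0.
    ineqs = [list(y) + [-x for x in xs] for xs, y in zip(inputs, outputs)]
    # Non-negativity of every weight, written as -z_i <= 0.
    ineqs += [[-1 if i == j else 0 for j in range(n)] for i in range(n)]
    best = None
    # A vertex makes n - 1 inequalities tight in addition to the equality.
    for tight in combinations(ineqs, n - 1):
        z = solve_linear([eq] + list(tight), [1] + [0] * (n - 1))
        if z is None:
            continue
        if all(sum(c * v for c, v in zip(row, z)) <= 0 for row in ineqs):
            score = sum(y * u for y, u in zip(outputs[o], z[:n_out]))
            if best is None or score > best:
                best = score
    return best


# Hypothetical data: inputs[k] and outputs[k] belong to unit k.
inputs = [(2, 5), (4, 2), (8, 12)]
outputs = [(1,), (1,), (2,)]
scores = [ccr_efficiency(inputs, outputs, o) for o in range(len(inputs))]
print([str(s) for s in scores])  # ['1', '1', '2/3']
```

In this hypothetical data set the first two units lie on the frontier, while the third could produce its output with two thirds of its inputs if it operated like a mix of the other two.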

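The new formulation fixes the weighted inputs of unit 1 (the denominator) to 1 and maximizes its weighted outputs (the numerator), with the efficiency constraints on all units rewritten in linear form.

A desire to improve upon DEA by reducing its disadvantages or strengthening its advantages has been a major driver of developments in the recent literature.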

Currently, the most frequently used DEA-based method for obtaining unique efficiency rankings is called "cross-efficiency."

Originally developed by Sexton et al. in 1986,[17] it has found widespread application ever since Doyle and Green's 1994 publication.[18]

Cross-efficiency is based on the original DEA results, but implements a secondary objective where each DMU peer-appraises all other DMUs with its own factor weights.

This approach avoids DEA's disadvantages of having multiple efficient DMUs and potentially non-unique weights.
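The peer-appraisal step can be sketched as follows. Everything here is an illustrative assumption: the data are hypothetical, the weights are taken as given (one optimal weight set per DMU from a prior DEA run, not computed here), and each DMU's cross-efficiency score is taken to be the mean of the scores it receives from all DMUs, including its own self-appraisal.

```python
from fractions import Fraction as F

# Hypothetical data: three DMUs with two inputs and one output each.
inputs = [(2, 5), (4, 2), (8, 12)]
outputs = [(1,), (1,), (2,)]

# Assumed result of a prior DEA run: for each DMU, one optimal set of
# (output weights u, input weights v). Such weights need not be unique.
weights = [
    ((F(1),), (F(1, 2), F(0))),
    ((F(1),), (F(0), F(1, 2))),
    ((F(1, 3),), (F(1, 16), F(1, 24))),
]


def appraise(u, v, x, y):
    """Efficiency of a DMU with inputs x and outputs y under weights (u, v)."""
    weighted_output = sum(a * b for a, b in zip(u, y))
    weighted_input = sum(a * b for a, b in zip(v, x))
    return weighted_output / weighted_input


# cross[d][k]: efficiency of DMU k when rated with the weights of DMU d.
cross = [
    [appraise(u, v, x, y) for x, y in zip(inputs, outputs)]
    for u, v in weights
]

# Cross-efficiency score of DMU k: mean of column k (self-appraisal included).
n = len(inputs)
cross_eff = [sum(cross[d][k] for d in range(n)) / n for k in range(n)]

# Rank DMUs from highest to lowest cross-efficiency score.
ranking = sorted(range(n), key=lambda k: cross_eff[k], reverse=True)
print([str(s) for s in cross_eff], ranking)
```

In this hypothetical case the first two DMUs are both efficient on their own weights (the diagonal of `cross`), yet their averaged peer scores differ, which is what yields a strict ranking. Because the result depends on which optimal weights each DMU contributes, the choice among alternative optimal weights matters.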