Industrial processes, for example chemical or thermodynamic processes in chemical plants, refineries, oil or gas production sites, or power plants, are often represented by two fundamental means: models that express the general structure of the processes, and measurement data that reflect the state of the process at a given point in time. Models can have different levels of detail, for example one can incorporate simple mass or compound conservation balances, or more advanced thermodynamic models including energy conservation laws.
For ease in deriving and implementing an optimal estimation solution, and based on the argument that errors are the sum of many factors (so that the central limit theorem has some effect), data reconciliation assumes these errors are normally distributed.
Other sources of errors when calculating plant balances include process faults such as leaks, unmodeled heat losses, incorrect physical properties or other physical parameters used in equations, and incorrect structure such as unmodeled bypass lines.
Additional dynamic errors arise when measurements and samples are not taken at the same time, especially lab analyses.
The normal practice of using time averages for the data input partly reduces the dynamic problems.
The result is that, in practice, data reconciliation is mainly making adjustments to correct systematic errors like biases.
PDR started in the early 1960s with applications aiming at closing material balances in production processes where raw measurements were available for all variables.[3] In the late 1960s and 1970s unmeasured variables were taken into account in the data reconciliation process,[4][5] and PDR also became more mature by considering general nonlinear equation systems coming from thermodynamic models.[6][7][8] Quasi-steady-state dynamics for filtering and simultaneous parameter estimation over time were introduced in 1977 by Stanley and Mah.
In other words, one wants to minimize the overall correction (measured in the least-squares sense) that is needed in order to satisfy the system constraints.
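Written with generic symbols that are assumptions of this sketch rather than taken from the text above ($y_i$ for a raw measurement, $y_i^*$ for its reconciled value, $\sigma_i$ for its standard deviation, $u$ for the unmeasured variables, and $F$ for the process model equations), a common way to state this minimization is

$$\min_{y^*,\,u}\ \sum_{i=1}^{n}\left(\frac{y_i^* - y_i}{\sigma_i}\right)^2 \quad \text{subject to}\quad F(y^*, u) = 0,$$

so that each squared correction is weighted by the inverse variance of the corresponding measurement, and less accurate instruments are allowed larger adjustments.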
Data reconciliation relies strongly on the concept of redundancy to correct the measurements as little as possible in order to satisfy the process constraints.
Rigorous definitions of observability, calculability, and redundancy, along with criteria for determining them, were established by Stanley and Mah[10] for these cases with set constraints such as algebraic equations and inequalities.
Next, we illustrate some special cases: Topological redundancy is intimately linked with the degrees of freedom of a mathematical model, i.e. the minimum number of pieces of information (i.e. measurements) required in order to calculate all of the system variables.
Simple counts of variables, equations, and measurements are inadequate for many systems, breaking down for several reasons: (a) Portions of a system might have redundancy, while others do not, and some portions might not even be possible to calculate, and (b) Nonlinearities can lead to different conclusions at different operating points.
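As a minimal sketch of such a structural check (the small network, the stream indices, and the use of NumPy are illustrative assumptions, not taken from the source), one can inspect ranks of the constraint matrix instead of merely counting equations and measurements: the unmeasured part is calculable only if its column block has full rank, and the degree of redundancy is the rank lost when those columns are removed.

```python
import numpy as np

# Incidence-style mass balance constraints A @ x = 0 for a small
# hypothetical network (rows = node balances, columns = streams).
# Stream order: x1, x2, x3, x4 (this layout is an illustrative assumption).
A = np.array([
    [ 1, -1, -1,  0],   # node 1: x1 = x2 + x3
    [ 0,  1,  0, -1],   # node 2: x2 = x4
], dtype=float)

measured   = [0, 1, 3]   # indices of measured streams (x1, x2, x4)
unmeasured = [2]         # indices of unmeasured streams (x3)

A_u = A[:, unmeasured]

n_u      = len(unmeasured)
rank_A   = np.linalg.matrix_rank(A)
rank_A_u = np.linalg.matrix_rank(A_u)

observable = (rank_A_u == n_u)    # every unmeasured stream calculable?
redundancy = rank_A - rank_A_u    # independent checks left on the measured data

print(f"unmeasured part observable: {observable}")
print(f"degree of redundancy:       {redundancy}")
```

For the linear, steady-state mass-balance case this rank test captures situation (a) above; nonlinear models (situation (b)) additionally require evaluating such ranks at the operating point in question, for example on the Jacobian of the constraints.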
In 1981, observability and redundancy criteria were proven for these sorts of flow networks involving only mass and energy balance constraints.[12] After combining all the plant inputs and outputs into an "environment node", loss of observability corresponds to cycles of unmeasured streams.
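A minimal sketch of that criterion for a purely mass-flow network (the unit names, streams, and the use of the networkx library are illustrative assumptions, not from the source): the unmeasured flows are all calculable exactly when the subgraph formed by the unmeasured streams, with the environment lumped into one node, contains no cycle.

```python
import networkx as nx

# Process units plus a single "environment" node that lumps together
# all plant inputs and outputs; edges are streams, flagged as measured or not.
streams = [
    ("environment", "unit_A", {"measured": True}),
    ("unit_A",      "unit_B", {"measured": False}),
    ("unit_B",      "environment", {"measured": True}),
    ("unit_A",      "environment", {"measured": False}),
]

G = nx.MultiGraph()
G.add_edges_from(streams)

# Sub(multi)graph containing only the unmeasured streams.
U = nx.MultiGraph()
U.add_edges_from((u, v, d) for u, v, d in streams if not d["measured"])

# A multigraph is acyclic (a forest) iff  #edges == #nodes - #components.
acyclic = (U.number_of_edges()
           == U.number_of_nodes() - nx.number_connected_components(U))

print("all unmeasured flows observable:", acyclic)
```

Parallel unmeasured streams between the same two nodes also form a cycle, which is why a multigraph is used here.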
On the one hand, the reconciliation corrects the measured values and increases their accuracy and precision; on the other hand, the data reconciliation problem presented above also includes unmeasured variables.
Based on information redundancy, estimates for these unmeasured variables can be calculated along with their accuracies.
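A minimal numerical sketch of this combined reconciliation and estimation (the two-unit flow line, the numbers, and the use of scipy are illustrative assumptions, not from the source):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical two-unit line: F1 -> unit A -> F2 -> unit B -> F3.
# F1 and F3 are measured, F2 is unmeasured; mass balances force F1 = F2 = F3.
y_meas = np.array([101.0, 99.0])   # measured values of F1, F3
sigma  = np.array([1.0, 1.0])      # their standard deviations

def objective(x):
    F1, F2, F3 = x
    # Weighted least-squares penalty only for the measured variables.
    return ((F1 - y_meas[0]) / sigma[0])**2 + ((F3 - y_meas[1]) / sigma[1])**2

constraints = [
    {"type": "eq", "fun": lambda x: x[0] - x[1]},   # unit A balance: F1 = F2
    {"type": "eq", "fun": lambda x: x[1] - x[2]},   # unit B balance: F2 = F3
]

x0  = np.array([101.0, 100.0, 99.0])                # initial guess
res = minimize(objective, x0, method="SLSQP", constraints=constraints)

F1_rec, F2_est, F3_rec = res.x
print(f"reconciled F1 = {F1_rec:.2f}, estimated F2 = {F2_est:.2f}, "
      f"reconciled F3 = {F3_rec:.2f}")   # all roughly 100.0
```

Here the two redundant measurements are pulled together to about 100, and the unmeasured intermediate flow is estimated at the same value; dedicated PDR tools additionally propagate the measurement uncertainties to the reconciled and estimated values.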
There are several ways of data filtering, for example taking the average of several measured values over a well-defined time period.
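A trivial sketch of such time averaging (the window length and the data are arbitrary assumptions):

```python
import numpy as np

raw = np.array([100.4, 99.1, 101.8, 100.2, 98.9, 100.6])  # raw samples of one tag
window = 3                                                  # assumed averaging window
filtered = np.convolve(raw, np.ones(window) / window, mode="valid")  # moving average
print(filtered)   # smoothed values passed on to the reconciliation
```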
When no gross errors are present, the value of the objective function at the reconciled solution approximately follows a chi-square distribution. Comparing it with a critical value of the chi-square distribution (e.g. the 95th percentile for a 95% confidence level) therefore gives an indication of whether a gross error exists: if the objective function value exceeds the critical value, there is reason to believe that at least one gross error is present in the set of measurements. The chi-square test gives only a rough indication about the existence of gross errors, but it is easy to conduct: one only has to compare the value of the objective function with the critical value of the chi-square distribution.
The individual test compares each penalty term in the objective function with the critical values of the normal distribution.
If the i-th penalty term is outside the 95% confidence interval of the normal distribution, then there is reason to believe that this measurement has a gross error.
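A minimal sketch of both tests (the objective value, the degree of redundancy used as degrees of freedom, and the measurement data are illustrative assumptions; scipy.stats supplies the critical values):

```python
import numpy as np
from scipy.stats import chi2, norm

alpha = 0.05

# --- Global (chi-square) test ----------------------------------------------
obj_value = 9.3        # objective function value after reconciliation (assumed)
dof       = 3          # degree of redundancy of the system (assumed)
chi2_crit = chi2.ppf(1 - alpha, dof)
print("gross error suspected (global test):", obj_value > chi2_crit)

# --- Individual test --------------------------------------------------------
y_meas = np.array([101.0, 99.0, 250.0])   # measured values (assumed)
y_rec  = np.array([100.0, 100.0, 245.0])  # reconciled values (assumed)
sigma  = np.array([1.0, 1.0, 2.0])        # standard deviations (assumed)

# Signed penalty terms (before squaring); ~standard normal if no gross error.
penalties = (y_rec - y_meas) / sigma
z_crit    = norm.ppf(1 - alpha / 2)       # two-sided 95% critical value (~1.96)
suspect   = np.abs(penalties) > z_crit
print("measurements with suspected gross error:", np.where(suspect)[0])
```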
When adding thermodynamic constraints such as energy balances to the model, its scope and the level of redundancy increase.
Including energy balances means adding equations to the system, which results in a higher level of redundancy (provided that enough measurements are available, or equivalently, not too many variables are unmeasured).
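As a schematic illustration, not taken from the source and assuming constant heat capacities and no heat loss: for a heat exchanger with one hot and one cold stream, with all four flows and all four temperatures measured, the two stream mass balances alone give two independent consistency checks; adding the steady-state energy balance

$$\dot m_h\, c_{p,h}\,\bigl(T_{h,\mathrm{in}} - T_{h,\mathrm{out}}\bigr) \;=\; \dot m_c\, c_{p,c}\,\bigl(T_{c,\mathrm{out}} - T_{c,\mathrm{in}}\bigr)$$

raises the degree of redundancy from two to three, whereas if one of the temperatures were unmeasured, that extra equation would instead be consumed to estimate it and the redundancy would stay at two.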
After the reconciliation, statistical tests can be applied that indicate whether or not a gross error exists somewhere in the set of measurements. It is important to note that the remediation of gross errors reduces the quality of the reconciliation: either the redundancy decreases (elimination) or the uncertainty of the measured data increases (relaxation). Therefore, it can only be applied when the initial level of redundancy is high enough to ensure that the data reconciliation can still be done (see Section 2 of [11]).
PDR finds application mainly in industry sectors where measurements are either inaccurate or non-existent, for example in the upstream sector where flow meters are difficult or expensive to position (see [13]), or where accurate data is of high importance, for example for safety reasons in nuclear power plants (see [14]).
Another field of application is performance and process monitoring (see [15]) in oil refining or in the chemical industry.
As PDR makes it possible to calculate reliable estimates even for unmeasured variables, the German Engineering Society (VDI Gesellschaft Energie und Umwelt) has accepted the technology of PDR as a means to replace expensive sensors in the nuclear power industry (see VDI norm 2048[11]).