Root cause analysis

Many techniques can be used for this purpose, ranging from good design practices to detailed analysis of problems that have already occurred, followed by actions to make sure they never recur.

The real root cause could be a design issue if there is no filter to prevent the metal scrap from getting into the system.

Alternatively, if the system has a filter that was blocked due to a lack of routine inspection, then the real root cause is a maintenance issue.

Compare this with an investigation that does not find the root cause: replacing the fuse, the bearing, or the lubrication pump will probably allow the machine to go back into operation for a while.
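The difference between fixing a symptom and finding the root cause can be sketched as walking a chain of causes until no deeper cause is known. The following is a minimal illustration only: the cause map, its wording, and the function name `trace_root_cause` are hypothetical, assumed here to fit the machine example rather than taken from any documented investigation.

```python
# Hypothetical causal chain for the machine example. Each key is an observed
# problem and each value is its assumed immediate cause.
CAUSED_BY = {
    "fuse blew": "bearing seized",
    "bearing seized": "lubrication pump failed",
    "lubrication pump failed": "metal scrap entered the pump",
    "metal scrap entered the pump": "no filter on the pump inlet",
}


def trace_root_cause(symptom, caused_by):
    """Follow cause links from a symptom until no deeper cause is known."""
    chain = [symptom]
    seen = {symptom}
    while chain[-1] in caused_by:
        cause = caused_by[chain[-1]]
        if cause in seen:
            # Guard against a circular cause map, which would loop forever.
            raise ValueError(f"circular causal chain at {cause!r}")
        chain.append(cause)
        seen.add(cause)
    return chain


chain = trace_root_cause("fuse blew", CAUSED_BY)
root = chain[-1]
print(" <- ".join(chain))
```

Stopping at any intermediate entry in the chain (the fuse, the bearing, or the pump) corresponds to a repair that restores operation only temporarily; only the last entry is a candidate root cause, and in practice it still has to be confirmed by evidence.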

[7][8] As an unrelated example of the conclusions that can be drawn in the absence of a cost/benefit analysis, consider the tradeoff between some claimed benefits of population decline: in the short term there will be fewer payers into pension/retirement systems, whereas halting the population decline will require higher taxes to cover the cost of building more schools.

In aircraft accident analyses, for example, the conclusions of the investigation and the root causes that are identified must be backed up by documented evidence.

The next step is to trigger long-term corrective actions to address the root cause identified during RCA, and make sure that the problem does not resurface.

Instead, a mixture of debugging, event-based detection, and monitoring systems (where the services are individually modelled) normally supports the analysis.

Training and supporting tools, such as simulation or in-depth runbooks for all expected scenarios, do not exist; instead, they are created after the fact based on issues seen as 'worthy'.

As a result, the analysis is often limited to those things that have monitoring/observation interfaces, rather than covering the actual planned/seen function with a focus on verification of inputs and outputs.

In the domains of health and safety, RCA is routinely used in medicine (diagnosis) and epidemiology (e.g., to identify the source of an infectious disease), where causal inference methods often require both clinical and statistical expertise to make sense of the complexities of the processes.

[13] In the manufacture of medical devices,[14] pharmaceuticals,[15] food,[16] and dietary supplements,[17] root cause analysis is a regulatory requirement.

Without delving into the idiosyncrasies of specific problems, several general conditions can make RCA more difficult than it may appear at first sight.

In telecommunications, for instance, distributed monitoring systems typically manage between a million and a billion events per day.
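To give a sense of scale, the daily volumes above can be converted into average per-second rates. This is a small arithmetic sketch; the function name `events_per_second` is illustrative, and the calculation assumes events are spread evenly over the day, which real traffic rarely is.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def events_per_second(events_per_day):
    """Average event rate, assuming events are spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


low = events_per_second(1_000_000)        # one million events per day
high = events_per_second(1_000_000_000)   # one billion events per day
print(f"{low:.1f} to {high:,.0f} events per second")
# prints "11.6 to 11,574 events per second"
```

Even at the lower end, this is more than ten events every second on average, which is why manual inspection of individual events is not a practical basis for analysis at this scale.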

Switching vendors may have been due to management's desire to save money, and a failure to consult with engineering staff on the implications of the change for maintenance procedures.

Example of a root cause analysis method