Staging (data)

Some staging area architectures, however, are designed to hold data for extended periods for archival or troubleshooting purposes.

[3] In performing this function, the staging area acts as a large "bucket" in which data from multiple source systems can be temporarily placed for further processing.

[6] The staging area and the ETL processes it supports are often designed to minimize contention within source systems.

The former method takes advantage of technical efficiencies such as data streaming technologies, reduced overhead from minimizing the need to break and re-establish connections to source systems, and optimized concurrency lock management on multi-user source systems.
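The efficiencies listed above can be pictured with a minimal sketch. It assumes a single source connection that stays open while rows are streamed in batches into a staging table, so the source is queried once and released quickly. The table names (orders, stg_orders), the batch size, and the use of in-memory SQLite databases are illustrative assumptions, not part of any particular ETL product.

```python
import sqlite3


def extract_to_staging(source, staging, batch_size=500):
    """Stream rows from the source into staging over one open connection."""
    staging.execute(
        "CREATE TABLE IF NOT EXISTS stg_orders (id INTEGER, amount REAL)"
    )
    # One query against the source; rows are fetched in batches rather
    # than reconnecting or re-querying for each chunk.
    cursor = source.execute("SELECT id, amount FROM orders")
    copied = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        staging.executemany("INSERT INTO stg_orders VALUES (?, ?)", batch)
        copied += len(batch)
    staging.commit()
    return copied


# Hypothetical source system with 1200 rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1200)],
)
source.commit()

staging = sqlite3.connect(":memory:")
copied = extract_to_staging(source, staging)
source.close()  # the source is released before any transformation starts
```

Any transformation work would then run against stg_orders only, which keeps long-running processing off the source system.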

The ETL process that uses the staging area can implement business logic to identify and handle "invalid" data.
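As a rough illustration of such business logic, the sketch below checks each staged row against a few rules and separates valid rows from rejected ones, recording the reason for each rejection. The row layout (customer_id, amount) and the rules themselves are hypothetical; real validation rules depend on the business context.

```python
from dataclasses import dataclass


@dataclass
class StagedRow:
    customer_id: str
    amount: str  # raw text as received from the source system


def validate(row):
    """Return a list of rule violations; an empty list means the row is valid."""
    problems = []
    if not row.customer_id.strip():
        problems.append("missing customer_id")
    try:
        if float(row.amount) < 0:
            problems.append("negative amount")
    except ValueError:
        problems.append("amount is not numeric")
    return problems


def partition(rows):
    """Split staged rows into valid rows and (row, problems) rejects."""
    valid, rejected = [], []
    for row in rows:
        problems = validate(row)
        if problems:
            rejected.append((row, problems))
        else:
            valid.append(row)
    return valid, rejected


# Hypothetical staged data containing three kinds of invalid rows.
staged = [
    StagedRow("C001", "19.99"),
    StagedRow("", "5.00"),
    StagedRow("C003", "abc"),
    StagedRow("C004", "-2"),
]
valid, rejected = partition(staged)
```

Keeping the rejected rows together with their reasons, rather than discarding them, lets a later step correct, report, or quarantine them.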

In this scenario, the staging area can be used to maintain historical records during the load process, or it can be used to push data into a target archive structure.
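Pushing staged data into an archive structure might look like the following sketch, which copies staged rows into an archive table tagged with a load identifier and timestamp, then clears the staging table in the same transaction. The table names, the load_id convention, and the choice of SQLite are assumptions made only for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def archive_staged_rows(conn, load_id):
    """Copy staged rows into the archive, then empty the staging table."""
    archived_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back if any statement fails
        moved = conn.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
        conn.execute(
            "INSERT INTO archive_orders (id, amount, load_id, archived_at) "
            "SELECT id, amount, ?, ? FROM stg_orders",
            (load_id, archived_at),
        )
        conn.execute("DELETE FROM stg_orders")
    return moved


# Hypothetical staging and archive tables in one in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (id INTEGER, amount REAL)")
conn.execute(
    "CREATE TABLE archive_orders "
    "(id INTEGER, amount REAL, load_id TEXT, archived_at TEXT)"
)
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0)],
)
conn.commit()

moved = archive_staged_rows(conn, "load-001")
```

Tagging each archived row with its load makes it possible to trace a historical record back to the run that produced it.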

Additionally, data may be maintained within the staging area for extended periods to support technical troubleshooting of the ETL process.