Dimension (data warehouse)

A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions.

In a data warehouse, dimensions provide structured labeling information to otherwise unordered numeric measures.

A common data warehouse example involves sales as the measure, with customer and product as dimensions.

These dimension can be classified in types:[3] A conformed dimension is a set of data attributes that have been physically referenced in multiple database tables using the same key value to refer to the same structure, attributes, domain values, definitions and concepts.

Most important, the row headers produced in two different answer sets from the same conformed dimension(s) must be able to match perfectly.'

At the most basic level, conformed dimensions mean exactly the same thing with every possible fact table to which they are joined.

By creating an abstract dimension, these flags and indicators are removed from the fact table while placing them into a useful dimensional framework.

The nature of these attributes is usually text or various flags, e.g. non-generic comments or just simple yes/no or true/false indicators.

To limit the size of the tables, multiple junk dimensions might be appropriate in other situations depending on the correlation between various indicators.

Junk dimensions are also appropriate for placing attributes like non-generic comments from the fact table.

Such attributes might consist of data from an optional comment field when a customer places an order and as a result will probably be blank in many cases.

Degenerate dimensions often play an integral role in the fact table's primary key.

The date dimension can include other attributes like the week of the year, or flags representing work days, holidays, etc.

For example, using the National Information Exchange Model (NIEM) the data element name would be "PersonGenderCode" and the enumerated values might be "male", "female" and "unknown".

The goal of a dimension table is to create standardized, conformed dimensions that can be shared across the enterprise's data warehouse environment, and enable joining to multiple fact tables representing various business processes.

Strategies for dealing with this kind of change are divided into three categories: Source:[12] Since many fact tables in a data warehouse are time series of observations, one or more date dimensions are often needed.

One of the reasons to have date dimensions is to place calendar knowledge in the data warehouse instead of hard-coded in an application.

If a high amount of detail is needed it is usually a good idea to split date and time into two or more separate dimensions.

A dimension table in an OLAP cube with a star schema