This approach is quite efficient in practice, because the modest amount of attribute metadata for a given class or form that a user chooses to work with can readily be cached in memory.
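For illustration, the following minimal Python sketch (the attribute_metadata table and its column names are hypothetical) shows one way such per-class metadata might be fetched once and then held in memory:

```python
# Minimal sketch: cache per-class attribute metadata in memory so the
# metadata tables are read only once per class the user opens.
# Table and column names are hypothetical.
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribute_metadata (class_id INTEGER, attr_name TEXT, data_type TEXT);
INSERT INTO attribute_metadata VALUES (1, 'sodium', 'float'), (1, 'potassium', 'float');
""")

@lru_cache(maxsize=128)
def attributes_for_class(class_id: int) -> tuple:
    """Fetch a class's attribute definitions once; later calls hit the cache."""
    rows = conn.execute(
        "SELECT attr_name, data_type FROM attribute_metadata WHERE class_id = ?",
        (class_id,),
    ).fetchall()
    return tuple(rows)          # immutable, safe to hold in the cache

print(attributes_for_class(1))  # reads the database
print(attributes_for_class(1))  # served from memory
```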
(Note that while Codd's landmark paper was published in 1970, its heavily mathematical tone had the unfortunate effect of limiting its accessibility to non-computer scientists, and consequently of delaying the model's acceptance in IT and software-vendor circles. The contribution of Christopher J. Date, Codd's colleague at IBM, in translating these ideas into accessible language, accompanied by simple examples that illustrated their power, cannot be overstated.)
EAV/CR is best characterized by its very detailed metadata, which is rich enough to support the automatic generation of browsing interfaces for individual classes without having to write class-by-class user-interface code.
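As an illustration of the idea, the following Python sketch (the metadata structure and the class name lab_test are invented for the example) generates an HTML entry form purely from attribute metadata, with no class-specific user-interface code:

```python
# Minimal sketch of metadata-driven interface generation: an HTML entry form
# is produced from attribute definitions alone. The metadata layout and
# class/attribute names here are hypothetical.
ATTRIBUTE_METADATA = {
    "lab_test": [
        {"name": "test_name", "label": "Test name", "type": "text"},
        {"name": "result",    "label": "Result",    "type": "number"},
        {"name": "drawn_at",  "label": "Drawn at",  "type": "datetime-local"},
    ],
}

def render_form(class_name: str) -> str:
    """Build a plain HTML form for any class described in the metadata."""
    fields = []
    for attr in ATTRIBUTE_METADATA[class_name]:
        fields.append(
            f'<label>{attr["label"]}'
            f' <input name="{attr["name"]}" type="{attr["type"]}"></label>'
        )
    return f'<form id="{class_name}">' + "\n".join(fields) + "</form>"

print(render_form("lab_test"))
```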
In the words of Prof. Dr. Daniel Masys (formerly Chair of Vanderbilt University's Medical Informatics Department), the challenges of working with EAV stem from the fact that in an EAV database, the "physical schema" (the way data are stored) is radically different from the "logical schema" – the way users, and many software applications such as statistics packages, regard it: as conventional rows and columns for individual classes.
Because an EAV table conceptually mixes apples, oranges, grapefruit, and chop suey, any analysis of the data using standard off-the-shelf software in most cases requires converting subsets of it into columnar form.
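The following Python sketch illustrates such a conversion, often called pivoting, for a handful of invented attributes; the entity, attribute, and value names are placeholders:

```python
# Minimal sketch of "pivoting": a subset of EAV rows for one class is turned
# into the conventional one-row-per-entity, one-column-per-attribute form
# that statistics packages expect. Attribute names are illustrative only.
eav_rows = [
    ("patient_1", "sodium", 140), ("patient_1", "potassium", 4.1),
    ("patient_2", "sodium", 137), ("patient_2", "glucose", 5.4),
]

def pivot(rows):
    """Group (entity, attribute, value) triples into one record per entity."""
    table = {}
    for entity, attribute, value in rows:
        table.setdefault(entity, {})[attribute] = value
    return table

print(pivot(eav_rows))
# {'patient_1': {'sodium': 140, 'potassium': 4.1},
#  'patient_2': {'sodium': 137, 'glucose': 5.4}}
# Attributes never recorded for an entity (sparse data) simply stay absent.
```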
Such a tradeoff is generally worthwhile, because in the typical mixed schema of production systems, the data in conventional relational tables can also benefit from functionality such as automatic interface generation.
The correctness of the metadata contents, in terms of the intended system behavior, is critical. Ensuring it means that, when creating an EAV system, considerable design effort must go into building user interfaces for metadata editing that can be used by people on the team who know the problem domain (e.g., clinical medicine) but are not necessarily programmers.
(Historically, one of the main reasons why the pre-relational TMR system failed to be adopted at sites other than its home institution was that all metadata was stored in a single file with a non-intuitive structure.)
In EMRs that rely on an RDBMS, such as Cerner, which use an EAV approach for their clinical-data subschema, the vast majority of tables in the schema are in fact traditionally modeled, with attributes represented as individual columns rather than as rows.
Commercial electronic health record systems (EHRs) use row-modeling for classes of data such as diagnoses, surgical procedures performed, and laboratory test results, which are segregated into separate tables. In each table, the "entity" is a composite of the patient ID and the date/time the diagnosis was made (or the surgery or lab test performed); the attribute is a foreign key into a specially designated lookup table that contains a controlled vocabulary (e.g., ICD-10 for diagnoses, Current Procedural Terminology for surgical procedures); the remaining columns form a set of value attributes.
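The following SQLite sketch, run from Python, shows such a row-modeled diagnosis table with its composite entity and a foreign key into a controlled-vocabulary lookup table; the table, column, and code names are illustrative rather than those of any particular EHR:

```python
# Minimal sketch of a row-modeled clinical table: the "entity" is the
# combination of patient ID and timestamp, and the "attribute" is a foreign
# key into a controlled-vocabulary lookup table. Names are illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE icd10_code (
    code_id INTEGER PRIMARY KEY,
    code    TEXT NOT NULL,      -- e.g. 'E11.9'
    title   TEXT NOT NULL
);
CREATE TABLE diagnosis (
    patient_id  INTEGER NOT NULL,
    recorded_at TEXT    NOT NULL,   -- date/time the diagnosis was made
    code_id     INTEGER NOT NULL REFERENCES icd10_code(code_id),
    certainty   TEXT,               -- example "value" attributes
    comments    TEXT,
    PRIMARY KEY (patient_id, recorded_at, code_id)
);
""")
conn.execute("INSERT INTO icd10_code VALUES (1, 'E11.9', 'Type 2 diabetes mellitus')")
conn.execute("INSERT INTO diagnosis VALUES (42, '2024-05-01T09:30', 1, 'confirmed', NULL)")
```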
The typical case for using the EAV model is for highly sparse, heterogeneous attributes, such as clinical parameters in electronic medical records (EMRs), as stated above.
"Dynamic" means that new classes and attributes need to be continually defined and altered to represent an evolving data model.
This scenario can occur in rapidly evolving scientific fields as well as in ontology development, especially during the prototyping and iterative refinement phases.
While the creation of new tables and columns to represent a new category of data is not especially labor-intensive, the programming of Web-based interfaces that support browsing or basic editing with type- and range-based validation is.
Note that an EAV data model is not essential here, but the system designer may consider it an acceptable alternative to creating, say, sixty or more tables containing a total of not more than two thousand rows.
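As a sketch of how a generic framework can supply the type- and range-based validation mentioned above from attribute metadata alone (the attribute definitions below are invented), a single routine can serve every class instead of requiring per-class interface code:

```python
# Minimal sketch of metadata-driven validation: type and range checks come
# from attribute definitions, so one generic routine serves every class.
# The metadata layout and attribute names are hypothetical.
ATTRS = {
    "body_temperature": {"type": float, "min": 30.0, "max": 45.0},
    "visit_date":       {"type": str},   # could be tightened to a date type
}

def validate(attribute: str, raw: str):
    """Coerce a submitted string to the declared type and check its range."""
    meta = ATTRS[attribute]
    value = meta["type"](raw)             # raises ValueError if ill-typed
    if "min" in meta and not (meta["min"] <= value <= meta["max"]):
        raise ValueError(f"{attribute} out of range: {value}")
    return value

print(validate("body_temperature", "37.2"))   # 37.2
```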
In the dynamic-attribute scenario, it is worth noting that Resource Description Framework (RDF) is being employed as the underpinning of Semantic-Web-related ontology work.
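The connection is easy to see: an RDF statement is itself a subject-predicate-object triple, which maps directly onto entity, attribute, and value. A toy illustration in Python (the URIs below are placeholders, not a real ontology):

```python
# Each RDF triple (subject, predicate, object) corresponds to an EAV row
# (entity, attribute, value). The URIs are invented for the example.
triples = [
    ("http://example.org/patient/42", "http://example.org/attr/diagnosis", "E11.9"),
    ("http://example.org/patient/42", "http://example.org/attr/age",       67),
]
for subject, predicate, obj in triples:
    print(f"<{subject}> <{predicate}> {obj!r} .")
```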
Generic frameworks that utilize attribute and attribute-grouping metadata address the former but not the latter limitation; their use is more or less mandated for mixed schemas that combine conventional relational and EAV data, where the error quotient can be very significant.
Also, when used as the basis for clinical-data query systems, EAV implementations do not necessarily shield the user from having to specify the class of an object of interest.
Building an application that has to manage data gets extremely complicated when using EAV models, because of the extent of infrastructure that has to be developed in terms of metadata tables and application-framework code.
The fact is that modeling sparse data attributes robustly is a hard database-application-design problem no matter which storage approach is used.
Several other approaches, such as the nested set model, exist for representing tree-structured data (whether XML, JSON, or another format) in a relational database.
XML is preferable to EAV for arbitrarily hierarchical data that is relatively modest in volume for a single entity: it is not intended to scale up to the multi-gigabyte level with respect to data-manipulation performance.
When such a scenario holds, the use of datatype-specific attribute–value tables that can be indexed by entity, by attribute, and by value and manipulated through simple SQL statements is vastly more scalable than the use of an XML tree structure.
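The following SQLite sketch (names are illustrative) shows datatype-specific attribute-value tables of this kind, indexed so that entity-, attribute-, and value-based queries remain simple SQL statements:

```python
# Minimal sketch of datatype-specific attribute-value tables: one table per
# value type, each indexable by entity, attribute, and value. Names are
# illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eav_int  (entity_id INTEGER, attr_id INTEGER, value INTEGER);
CREATE TABLE eav_real (entity_id INTEGER, attr_id INTEGER, value REAL);
CREATE TABLE eav_text (entity_id INTEGER, attr_id INTEGER, value TEXT);

CREATE INDEX idx_int_entity ON eav_int (entity_id, attr_id);
CREATE INDEX idx_int_value  ON eav_int (attr_id, value);   -- value-based search
""")
# A query such as "all entities whose attribute 7 exceeds 100" stays a
# simple, index-supported SQL statement:
conn.execute("SELECT entity_id FROM eav_int WHERE attr_id = 7 AND value > 100")
```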
An alternative approach to managing the various problems encountered with EAV-structured data is to employ a graph database.
The issue of table joins is addressed by providing graph-specific query languages, such as Apache TinkerPop,[29] or the OpenCog atomspace pattern matcher.
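As a toy illustration of the idea, in plain Python rather than TinkerPop's or OpenCog's actual APIs, related records are reached by following labelled edges directly instead of joining tables; the node and edge names are invented:

```python
# Toy graph-style traversal: edges are followed directly, so no table joins
# are needed. This is an illustration only, not any graph DBMS's real API.
graph = {
    "patient:42":      {"has_diagnosis": ["diagnosis:e11_9"]},
    "diagnosis:e11_9": {"coded_as": ["icd10:E11.9"]},
}

def out(node: str, edge: str):
    """Follow one labelled edge from a node to its neighbours."""
    return graph.get(node, {}).get(edge, [])

for dx in out("patient:42", "has_diagnosis"):
    print(dx, "->", out(dx, "coded_as"))
```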
However, this approach to modeling sparse attributes has several limitations: rival DBMSs have, notably, chosen not to borrow this idea for their own engines. Many cloud computing vendors offer data stores based on the EAV model, where an arbitrary number of attributes can be associated with a given entity.
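As a hedged sketch of such a store, the following uses Amazon DynamoDB via boto3; the table name, key schema, and attribute names are assumptions for illustration, and a configured AWS environment is presumed:

```python
# Sketch only: storing an entity with an arbitrary attribute set in a
# schemaless cloud item store. The table "clinical_findings", its key schema,
# and the attribute names are assumed for illustration.
import boto3

table = boto3.resource("dynamodb").Table("clinical_findings")
# Only the key attributes are fixed by the table definition; every other
# attribute is optional and per-item, much as in an EAV row set.
table.put_item(Item={
    "patient_id": "42",                 # partition key (assumed)
    "recorded_at": "2024-05-01T09:30",  # sort key (assumed)
    "sodium_mmol_l": 140,
    "smoking_status": "former",
})
```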
Therefore, in 2010, Microsoft launched a premium offering, SQL Server Azure, a cloud-accessible, fully fledged relational engine that allows porting of existing database applications with only modest changes.