Subject indexing

Subject indexing is the act of describing or classifying a document by index terms, keywords, or other symbols in order to indicate what the document is about, to summarize its content or to increase its findability.

Examples of academic indexing services are Zentralblatt MATH, Chemical Abstracts and PubMed.

Automatic indexing follows set processes: it analyzes the frequencies of word patterns and compares the results with other documents in order to assign each document to subject categories.
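The frequency comparison described above can be illustrated with a minimal sketch. It assumes plain word counts and cosine similarity as the comparison, which is only one of several possible techniques; the function names and the two reference texts are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_category(document, reference_documents):
    """Return the best-matching category and the score for every category."""
    doc_tf = term_frequencies(document)
    scores = {
        category: cosine_similarity(doc_tf, term_frequencies(text))
        for category, text in reference_documents.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical, hand-written reference texts standing in for documents
# that have already been assigned to a subject category.
REFERENCE_DOCUMENTS = {
    "medicine": "insulin glucose diabetes patient blood treatment",
    "mathematics": "theorem proof algebra integer equation group",
}

category, scores = assign_category(
    "Blood glucose levels in diabetes patients respond to insulin treatment.",
    REFERENCE_DOCUMENTS,
)
print(category)  # medicine
```

Because the sketch matches exact word forms only, "patients" does not count as a match for "patient"; a production system would normally add stemming and a weighting scheme on top of raw counts.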

Human indexers focus their attention on certain parts of the document such as the title, abstract, summary and conclusions, as analyzing the full text in depth is costly and time-consuming.

These experts understand controlled vocabularies and are able to find information that cannot be located by full text search.

The cost of the expert analysis needed to create subject indexing is not easily compared with the cost of the hardware, software and labor needed to produce a comparable set of full-text, fully searchable materials.

Extraction indexing involves taking words directly from the document.

For example, the term glucose is likely to occur frequently in any document related to diabetes.
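As a rough illustration of extraction indexing, the sketch below takes candidate terms directly from a text and ranks them by how often they occur. The stop list, the `limit` parameter and the sample passage are assumptions made for the example, not part of any particular indexing system.

```python
import re
from collections import Counter

# A tiny illustrative stop list; real systems use much longer ones.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "in", "is", "to",
    "by", "for", "with", "that", "on", "are",
}


def extract_index_terms(text, limit=3):
    """Return the most frequent non-stop words found in the text itself."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(limit)]


sample = (
    "Diabetes is characterised by raised blood glucose. "
    "Monitoring glucose helps patients with diabetes manage the condition, "
    "and glucose targets are set by the clinician."
)

terms = extract_index_terms(sample, limit=2)
print(terms)  # ['glucose', 'diabetes']
```

In this passage "glucose" ranks first, which matches the expectation that the word is common in writing about diabetes.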

Another problem with automated extraction is that it does not recognize when a concept is discussed but is not identified in the text by an indexable keyword.

[5] Since this process is based on simple string matching and involves no intellectual analysis, the resulting product is more appropriately known as a concordance than an index.
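To make the distinction concrete, here is a minimal sketch of a concordance built by string matching alone. The function name and the three-line sample text are hypothetical; locations are recorded as line numbers purely for simplicity.

```python
import re
from collections import defaultdict


def build_concordance(text):
    """Map each word to the line numbers on which it literally occurs."""
    concordance = defaultdict(list)
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Use a set so a word repeated on one line is listed once.
        for word in set(re.findall(r"[a-z]+", line.lower())):
            concordance[word].append(line_number)
    return dict(concordance)


text = "Glucose is a sugar.\nInsulin regulates glucose.\nSugar intake varies."
concordance = build_concordance(text)

print(concordance["glucose"])     # [1, 2]
print(concordance["sugar"])       # [1, 3]
print("diabetes" in concordance)  # False
```

The last lookup shows the limitation: the sample text has no entry for "diabetes" because the word never appears in it, whatever the text may be about.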

Controlled vocabularies do not completely remove inconsistencies as two indexers may still interpret the subject differently.

[2] The final phase of indexing is to present the entries in a systematic order.

Greater exhaustivity gives higher recall, that is, a greater likelihood that all the relevant articles are retrieved; however, this comes at the expense of precision.
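The trade-off can be shown with a small worked example using the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that are retrieved. The document identifiers and the two result sets below are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are truly relevant to a query.
relevant = {"d1", "d2", "d3", "d4"}

# Selective indexing: few terms per document, so fewer documents match.
selective_result = {"d1", "d2"}

# Exhaustive indexing: many terms per document, so more documents match,
# including some that mention the topic only in passing.
exhaustive_result = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

print(precision_recall(selective_result, relevant))   # (1.0, 0.5)
print(precision_recall(exhaustive_result, relevant))  # (0.5, 1.0)
```

In this invented case the exhaustive index retrieves every relevant document, but half of what it returns is not relevant.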

"In order to achieve good consistent indexing, the indexer must have a thorough appreciation of the structure of the subject and the nature of the contribution that the document is making to the advancement of knowledge" (Rowley & Farrow, 2000,[16] p. 99).