Data (/ˈdeɪtə/ DAY-tə, US also /ˈdætə/ DAT-ə) are a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally.
[3] Data are commonly used in scientific research, economics, and virtually every other form of human organizational activity.
Data are collected using techniques such as measurement, observation, query, or analysis, and are typically represented as numbers or characters that may be further processed.
The stock of insights and intelligence that accumulates over time from the synthesis of data into information can then be described as knowledge.
[4][5] Data, as a general concept, refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing.
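As a rough illustration of information being "coded in some form suitable for better usage or processing", the sketch below represents a short piece of text first as numeric code points and then as bytes. The choice of UTF-8 and the variable names are assumptions made for this example only; many other encodings would serve equally well.

```python
# A minimal sketch: the same information in three representations.
text = "datum"

# Each character mapped to its Unicode code point (an integer).
code_points = [ord(ch) for ch in text]

# The text encoded as a byte sequence (UTF-8 chosen for illustration).
encoded = text.encode("utf-8")

# The same bytes written out as groups of eight binary digits.
bits = " ".join(f"{byte:08b}" for byte in encoded)

# Decoding recovers the original text, so the coding loses nothing.
decoded = encoded.decode("utf-8")

print(code_points)
print(bits)
print(decoded == text)
```

The coded form is what a machine stores and processes; the meaning only returns once the symbols are interpreted again.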
Working with very large (and growing) datasets using traditional data analysis methods and computing is difficult, even impossible.
The Latin word data is the plural of datum, "(thing) given," and the neuter past participle of dare, "to give".
This usage is common in everyday language and in technical and scientific fields such as software development and computer science.
[7] Data, information, knowledge, and wisdom are closely related concepts, but each has its own role in relation to the others, and each term has its own meaning.
For example, the entry in a database specifying the height of Mount Everest is a datum that communicates a precisely measured value.
This measurement may be included in a book along with other data on Mount Everest to describe the mountain in a manner useful for those who wish to decide on the best method to climb it.
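The distinction can be sketched in code. The example below is only an illustration: the `Datum` class, its field names, and the `describe` function are hypothetical, and the height of 8,848.86 m is the figure from the 2020 China-Nepal survey, used here as sample input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """A single recorded value with the minimum needed to interpret it."""
    subject: str
    attribute: str
    value: float
    unit: str


def describe(datum: Datum) -> str:
    """Place a datum in a sentence a reader can act on."""
    return f"The {datum.attribute} of {datum.subject} is {datum.value:,.2f} {datum.unit}."


# Illustrative value only (2020 survey figure); not taken from the article.
everest_height = Datum("Mount Everest", "height", 8848.86, "m")

print(describe(everest_height))
```

On its own, the stored number is a datum; placed in context alongside other facts about the mountain, it becomes information useful to a reader.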
[10] Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation.
In the 2010s, computers were widely used to collect, sort, and process data in disciplines ranging from marketing and the analysis of citizens' use of social services to scientific research.
[14] This kind of data can come from a variety of sources, including subscriptions, preference centers, quizzes, surveys, pop-up forms, and interactive digital experiences.
[20] The latter offers an articulate method of collecting, classifying, and analyzing data using five possible angles of analysis (at least three) to maximize the research's objectivity and permit as complete an understanding as possible of the phenomena under investigation: qualitative and quantitative methods, literature reviews (including scholarly articles), interviews with experts, and computer simulation.
Scientific publishers and libraries have been struggling with this problem for a few decades, and there is still no satisfactory solution for the long-term storage of data over centuries or even for eternity.
[21] Similarly, a survey of 100 datasets in Dryad found that more than half lacked the details needed to reproduce the research results of these studies.
Peter Checkland introduced the term capta (from the Latin capere, "to take") to distinguish between the immense number of possible data and the subset of them to which attention is oriented.
[24] Johanna Drucker has argued that since the humanities affirm knowledge production as "situated, partial, and constitutive," using data may introduce assumptions that are counterproductive, for example that phenomena are discrete or are observer-independent.
[25] The term capta, which emphasizes the act of observation as constitutive, is offered as an alternative to data for visual representations in the humanities.