Validity (statistics)

Validity is the extent to which a concept, conclusion, or measurement is well-founded and likely corresponds accurately to the real world.

In logic, validity refers to the property of an argument whereby if the premises are true then the truth of the conclusion follows by necessity.

This is why "scientific or statistical validity" is a claim that is qualified as being either strong or weak in nature; it is never necessarily or certainly true.

Validity is important because it can help determine what types of tests to use, and help ensure that researchers use methods that are not only ethical and cost-effective but also truly measure the ideas or constructs in question.

For example, does an IQ questionnaire have items covering all areas of intelligence discussed in the scientific literature?

[8] Before the final administration of questionnaires, the researcher should check the validity of items against each of the constructs or variables and modify the measurement instruments accordingly on the basis of the opinions of subject-matter experts (SMEs).

A test has content validity built into it by careful selection of which items to include (Anastasi & Urbina, 1997).

Items are chosen so that they comply with the test specification, which is drawn up through a thorough examination of the subject domain.
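Checking items against a test specification can be illustrated with a minimal sketch. The domain names and item labels below are hypothetical and not drawn from any real instrument; the sketch only shows the bookkeeping of comparing an item pool with a specification, not a substitute for expert judgment of content validity.

```python
from collections import Counter

# Hypothetical test specification: the domains the test is meant to cover.
specification = {"verbal", "numerical", "spatial", "memory"}

# Hypothetical item pool: each item is tagged with the domain it targets.
items = {
    "item_1": "verbal",
    "item_2": "numerical",
    "item_3": "verbal",
    "item_4": "spatial",
}


def coverage_report(specification, items):
    """Count items per specified domain and list domains with no items."""
    counts = Counter(
        domain for domain in items.values() if domain in specification
    )
    uncovered = sorted(specification - set(counts))
    return dict(counts), uncovered


counts, uncovered = coverage_report(specification, items)
print(counts)      # number of items found for each covered domain
print(uncovered)   # specified domains that no item addresses
```

In this illustrative pool, a domain that appears in the specification but has no items is flagged, which would prompt the test developer to write or select further items.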

Face validity is a starting point, but it should never be assumed to establish validity for any given purpose, as the "experts" have been wrong before: the Malleus Maleficarum (Hammer of Witches) had no support for its conclusions other than the self-imagined competence of two "experts" in "witchcraft detection", yet it was used as a "test" to condemn and burn at the stake tens of thousands of men and women as "witches".

High correlation between ex-ante predicted and ex-post actual outcomes is the strongest proof of validity.
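The correlation between predicted and actual outcomes can be computed in several ways; the following is a minimal sketch that assumes the Pearson product-moment coefficient is the measure of interest, which the text does not specify. The function name `pearson_r` and the scores are illustrative and not taken from any study.

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_a)


# Hypothetical data: scores predicted before the fact (ex ante) and
# the outcomes observed afterwards (ex post) for five participants.
predicted = [52, 61, 70, 75, 88]
actual = [50, 65, 68, 80, 85]

r = pearson_r(predicted, actual)
print(round(r, 3))
```

A coefficient close to 1 indicates that the predictions track the observed outcomes closely; how large a value counts as "high" depends on the field and the purpose of the test.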

A major factor in this is whether the study sample (e.g. the research participants) is representative of the general population along relevant dimensions.

This issue is closely related to external validity but covers the question of to what degree experimental findings mirror what can be observed in the real world (ecology being the science of the interaction between an organism and its environment).

To be ecologically valid, the methods, materials and setting of a study must approximate the real-life situation that is under investigation.

But sometimes ethical or methodological restrictions prevent you from conducting an experiment (e.g. how does isolation influence a child's cognitive functioning?).

At first glance, internal and external validity seem to contradict each other: to get an experimental design you have to control for all interfering variables.

On the other hand, with observational research you cannot control for interfering variables (low internal validity), but you can measure in the natural (ecological) environment, at the place where behavior normally occurs.

The question of whether results from a particular study generalize to other people, places or times arises only when one follows an inductivist research strategy.

Furthermore, conflating research goals with validity concerns can lead to the mutual-internal-validity problem, where theories are able to explain only phenomena in artificial laboratory settings but not the real world.

In this context:[15]

Robins and Guze proposed in 1970 what were to become influential formal criteria for establishing the validity of psychiatric diagnoses.

Kendler in 1980 distinguished between:[15]

Nancy Andreasen (1995) listed several additional validators (molecular genetics and molecular biology, neurochemistry, neuroanatomy, neurophysiology, and cognitive neuroscience) that are all potentially capable of linking symptoms and diagnoses to their neural substrates.

On this basis, he argues that a Robins and Guze criterion of "runs in the family" is inadequately specific because most human psychological and physical traits would qualify; for example, an arbitrary syndrome comprising a mixture of "height over 6 ft, red hair, and a large nose" will be found to "run in families" and be "hereditary", but this should not be considered evidence that it is a disorder.

Perri and Lichtenwald (2010) provide a starting point for a discussion about a wide range of reliability and validity topics in their analysis of a wrongful murder conviction.