Z-factor

The Z-factor is a measure of statistical effect size.

It has been proposed for use in high-throughput screening (HTS), where it is also known as Z-prime,[1] to judge whether the response in a particular assay is large enough to warrant further attention.

In HTS, experimenters often compare a large number (hundreds of thousands to tens of millions) of single measurements of unknown samples to positive and negative control samples.

The particular choice of experimental conditions and measurements is called an assay.

Large screens are expensive in time and resources, so before starting one, smaller test (or pilot) screens are used to assess the quality of an assay and to predict whether it would be useful in a high-throughput setting.

The Z-factor is an attempt to quantify the suitability of a particular assay for use in a full-scale HTS.

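The Z-factor is defined in terms of four parameters: the means ($\mu$) and standard deviations ($\sigma$) of both the samples (s) and the controls (c).

Given these values ($\mu_s$, $\sigma_s$ and $\mu_c$, $\sigma_c$), the Z-factor is defined as:

$$\text{Z-factor} = 1 - \frac{3(\sigma_s + \sigma_c)}{|\mu_s - \mu_c|}$$

For assays of agonist/activation type, the control (c) data ($\mu_c$, $\sigma_c$) in the equation are replaced by the positive control (p) data ($\mu_p$, $\sigma_p$), which represent maximal activated signal; for assays of antagonist/inhibition type, the control (c) data ($\mu_c$, $\sigma_c$) are replaced by the negative control (n) data ($\mu_n$, $\sigma_n$), which represent minimal signal.

In practice, the Z-factor is estimated from the sample means and sample standard deviations:

$$\text{Estimated Z-factor} = 1 - \frac{3(\hat{\sigma}_s + \hat{\sigma}_c)}{|\hat{\mu}_s - \hat{\mu}_c|}$$

The Z'-factor (Z-prime factor) is defined in terms of four parameters: the means ($\mu$) and standard deviations ($\sigma$) of the positive (p) and negative (n) controls ($\mu_p$, $\sigma_p$ and $\mu_n$, $\sigma_n$):

$$\text{Z'-factor} = 1 - \frac{3(\sigma_p + \sigma_n)}{|\mu_p - \mu_n|}$$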

The Z-factor defines a characteristic parameter of the capability of hit identification for each given assay.
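As a minimal sketch of how the estimate is computed in practice, the Python snippet below evaluates the formula 1 − 3(σ₁ + σ₂)/|μ₁ − μ₂| from two lists of readings using sample means and sample standard deviations. The function name `z_factor` and the plate readings are hypothetical, invented purely for illustration; passing positive and negative controls gives a Z'-factor, while passing samples and one control group gives a Z-factor.

```python
from statistics import mean, stdev


def z_factor(group_a, group_b):
    """Estimate 1 - 3*(sd_a + sd_b) / |mean_a - mean_b| from two lists of readings.

    stdev() is the sample (n - 1) standard deviation.
    """
    separation = abs(mean(group_a) - mean(group_b))
    if separation == 0:
        raise ValueError("group means are identical; the Z-factor is undefined")
    return 1 - 3 * (stdev(group_a) + stdev(group_b)) / separation


# Hypothetical pilot-plate readings in arbitrary units (not real data).
positive_controls = [98.0, 102.0, 100.0, 101.0, 99.0]
negative_controls = [10.0, 12.0, 9.0, 11.0, 8.0]

z_prime = z_factor(positive_controls, negative_controls)
print(f"Z' = {z_prime:.3f}")
```

Because the denominator is the absolute difference of the means, the order of the two arguments does not matter.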

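The following categorization of HTS assay quality by the value of the Z-factor is a modification of Table 1 shown in Zhang et al. (1999);[2] note that the Z-factor cannot exceed one.

Z-factor of 1.0: an ideal assay. The Z-factor approaches 1 only with a very large dynamic range and very small standard deviations, and never actually reaches it.

Z-factor between 0.5 and 1.0: an excellent assay.

Z-factor between 0 and 0.5: a marginal assay.

Z-factor less than 0: too much overlap between the positive and negative controls for the assay to be useful.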

Note that by the standards of many types of experiments, a zero Z-factor would suggest a large effect size, rather than a borderline useless result as suggested above.
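To make the scale concrete, the definition can be worked through under the simplifying assumption (made here for illustration only) that both groups share the same standard deviation $\sigma$, so that $\sigma_p + \sigma_n = 2\sigma$:

$$Z = 0 \iff |\mu_p - \mu_n| = 3(\sigma_p + \sigma_n) = 6\sigma$$

$$Z = 0.5 \iff |\mu_p - \mu_n| = 6(\sigma_p + \sigma_n) = 12\sigma$$

Under that assumption, a zero Z-factor already corresponds to group means six standard deviations apart, and a Z-factor of 0.5 to a separation of twelve standard deviations.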

Extreme conservatism is used in high-throughput screening because of the large number of tests performed.

The constant factor 3 in the definition of the Z-factor is motivated by the normal distribution, for which more than 99% of values occur within three standard deviations of the mean.
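The "more than 99%" figure can be checked directly from the normal distribution. The short sketch below uses the standard identity P(|X − μ| < kσ) = erf(k/√2); the helper name `normal_coverage` is ours and is not part of any library.

```python
from math import erf, sqrt


def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))


print(f"{normal_coverage(3):.4f}")  # prints 0.9973
```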

If the data follow a strongly non-normal distribution, the reference points (e.g. the meaning of a negative value) may be misleading.

Another issue is that the usual estimates of the mean and standard deviation are not robust; accordingly, many users in the high-throughput screening community prefer the "Robust Z-prime", which substitutes the median for the mean and the median absolute deviation for the standard deviation.[3]

Extreme values (outliers) in either the positive or negative controls can adversely affect the Z-factor, potentially leading to an apparently unfavorable Z-factor even when the assay would perform well in actual screening.[4]

In addition, the application of the single Z-factor-based criterion to two or more positive controls with different strengths in the same assay will lead to misleading results.[6]

A recently proposed statistical parameter, strictly standardized mean difference (SSMD), can address these issues.
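The two alternatives mentioned above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a reference implementation: the function names are hypothetical; the median absolute deviation is scaled by 1.4826 so that it is comparable to a standard deviation for normal data, which is a common convention but is not stated in the text; and the SSMD is computed with the usual form for two independent groups, the difference of the means divided by the square root of the sum of the variances. The readings are invented, with one deliberate outlier in the positive controls.

```python
from math import sqrt
from statistics import mean, median, variance


def scaled_mad(values):
    """Median absolute deviation, scaled by 1.4826 (assumed convention, see text)."""
    centre = median(values)
    return 1.4826 * median([abs(v - centre) for v in values])


def robust_z_prime(positive, negative):
    """Z'-factor with medians in place of means and scaled MADs in place of SDs."""
    separation = abs(median(positive) - median(negative))
    if separation == 0:
        raise ValueError("group medians are identical; the statistic is undefined")
    return 1 - 3 * (scaled_mad(positive) + scaled_mad(negative)) / separation


def ssmd(positive, negative):
    """SSMD for two independent groups, using sample variances."""
    return (mean(positive) - mean(negative)) / sqrt(variance(positive) + variance(negative))


# Hypothetical readings (not real data); 60.0 is a deliberate outlier.
positive_controls = [98.0, 102.0, 100.0, 101.0, 99.0, 60.0]
negative_controls = [10.0, 12.0, 9.0, 11.0, 8.0]

print(f"robust Z' = {robust_z_prime(positive_controls, negative_controls):.3f}")
print(f"SSMD      = {ssmd(positive_controls, negative_controls):.2f}")
```

With these invented numbers the single outlier barely moves the median-based statistic, whereas a mean-and-standard-deviation Z'-factor computed on the same data would fall well below 0.5.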