Mean opinion score

The mean opinion score (MOS) is the arithmetic mean over all individual "values on a predefined scale that a subject assigns to his opinion of the performance of a system quality".
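As a minimal sketch of the arithmetic mean described above, the following Python snippet computes a MOS from a list of individual ratings. The function name `mean_opinion_score`, the sample ratings, and the 5-point scale (1 = bad, 5 = excellent) are illustrative assumptions, not taken from any particular standard or dataset.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the arithmetic mean of the individual ratings.

    `ratings` is a sequence of numeric scores, one per subject,
    all given on the same predefined scale.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(ratings)


# Hypothetical ratings from eight subjects on an assumed 5-point scale
# (1 = bad, 5 = excellent); the values are made up for illustration.
ratings = [4, 5, 3, 4, 4, 2, 5, 4]

mos = mean_opinion_score(ratings)
print(f"MOS = {mos:.3f}")
```

The result is a single scalar on the same scale as the individual ratings, which is what makes the MOS easy to report but also what the criticism of its use is aimed at.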

MOS is a commonly used measure for video, audio, and audiovisual quality evaluation, but is not restricted to those modalities.

In general, there is an ongoing debate on the usefulness of the MOS to quantify Quality of Experience in a single scalar value.

Because ratings on a categorical scale are ordinal, it is, strictly speaking, mathematically incorrect to calculate a mean over individual ratings in order to obtain the central tendency; the median should be used instead.

It has been shown that for categorical rating scales (such as ACR), the individual items are not perceived as equidistant by subjects.
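To illustrate how the choice of central tendency can matter, the sketch below compares the mean and the median for one set of ratings. The sample data and the helper name `central_tendency` are hypothetical; the snippet uses `statistics.median_low`, which always returns a value that actually occurs in the data, so the result stays on the rating scale even for an even number of ratings.

```python
from statistics import mean, median_low


def central_tendency(ratings):
    """Return the mean and the (low) median of a list of ratings."""
    return {
        "mean": mean(ratings),          # treats the scale as interval data
        "median": median_low(ratings),  # only relies on the ordering of items
    }


# Hypothetical, polarized ratings on an assumed 5-point scale:
# most subjects rate high, two rate very low.
ratings = [5, 5, 5, 4, 4, 1, 1]

summary = central_tendency(ratings)
print(f"mean   = {summary['mean']:.3f}")
print(f"median = {summary['median']}")
```

For this made-up sample the mean falls between two scale items, while the median is one of the items subjects could actually choose. Whether that difference matters in practice depends on the test and is part of the debate described above.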

MOS values gathered from different contexts and test designs therefore should not be directly compared.

Specifically, P.800.2 says: "it is not meaningful to directly compare MOS values produced from separate experiments, unless those experiments were explicitly designed to be compared, and even then the data should be statistically analysed to ensure that such a comparison is valid."

MOS historically originates from subjective measurements where listeners would sit in a "quiet room" and score a telephone call quality as they perceived it.

This kind of test methodology had been in use in the telephony industry for decades and was standardized in Recommendation ITU-T P.800.

As a result, minimum noticeable MOS differences determined using analytical methods such as in [8] may change over time.