[5] POLQA avoids weaknesses of the current P.862 model and has been extended to handle higher-bandwidth audio signals.
POLQA also targets the assessment of speech signals recorded acoustically by an artificial head with mouth and ear simulators.
In May 2010, ITU-T selected candidate models from three companies: OPTICOM, SwissQual / Rohde & Schwarz, and TNO (Netherlands Organisation for Applied Scientific Research).
In essence, the signals are analysed in the frequency domain, in critical bands, after masking functions have been applied.
Finally, the accumulated distortions in the speech file are mapped onto a quality scale from 1 to 5, as is usual for MOS tests.
POLQA results principally model mean opinion scores (MOS) that cover a scale from 1 (bad) to 5 (excellent).
The inputs to the algorithm are two waveforms represented by two data vectors containing 16 bit PCM samples.
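As an illustration of this input format, the following minimal sketch loads a mono 16-bit PCM WAV file into a vector of signed integers using only the Python standard library. It is not part of POLQA; the function names `read_pcm16` and `make_wav_bytes` are hypothetical, and the choice of the WAV container is an assumption, since the text only specifies 16 bit PCM samples.

```python
import array
import io
import struct
import sys
import wave


def read_pcm16(source):
    """Return (sample_rate, samples) for a mono 16-bit PCM WAV file.

    `source` is a file name or a binary file-like object (assumed WAV).
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        if wav.getnchannels() != 1:
            raise ValueError("expected a mono signal")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h")      # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()          # WAV sample data is little-endian
    return rate, samples


def make_wav_bytes(samples, rate):
    """Hypothetical helper: encode integers as a mono 16-bit PCM WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()
```

In this sketch the reference and the degraded signal would each be loaded with one call to `read_pcm16`, giving the two data vectors that serve as input.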
Cognitive aspects are, however, important when human beings are asked to score the quality of what they perceive.
This conversion is performed by correcting the Disturbance Density values for a number of specific situations. Two further indicators, one for spectral flatness and one for level variations, are also calculated in this step.
The perceptual model starts by scaling the reference signal to an ideal average active speech level of approximately -26 dBov.
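The level scaling can be sketched as follows. This is a simplified illustration, not the POLQA procedure: it measures the plain RMS level over the whole signal, whereas an active speech level would be measured over speech-active segments only. The names `level_dbov` and `scale_to_level` are hypothetical, and 0 dBov is assumed to correspond to the overload point of 16-bit PCM.

```python
import math

FULL_SCALE = 32768.0   # overload point of 16-bit PCM (assumed 0 dBov reference)
TARGET_DBOV = -26.0    # target level named in the text


def level_dbov(samples):
    """RMS level of the whole signal in dB relative to the overload point."""
    if not samples:
        raise ValueError("empty signal")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / FULL_SCALE ** 2)


def scale_to_level(samples, target_dbov=TARGET_DBOV):
    """Return a scaled copy of the signal (as floats) at the target level."""
    current = level_dbov(samples)
    if current == float("-inf"):
        return [0.0] * len(samples)   # silence cannot be scaled to a level
    gain = 10.0 ** ((target_dbov - current) / 20.0)
    return [s * gain for s in samples]
```

With this convention a full-scale sine wave measures about -3 dBov, so a target of -26 dBov leaves ample headroom for speech peaks.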
Subsequently, small pitch shifts in the degraded signal are eliminated (frequency dewarping).
Next, the spectra are transformed to a psychoacoustically motivated pitch scale by combining individual spectral lines (FFT bins) into so-called critical bands.
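The grouping of FFT bins into critical bands can be sketched as below. This is an illustration under stated assumptions rather than the pitch scale POLQA actually uses: it takes the Zwicker and Terhardt approximation of the Bark scale, uses bands one Bark wide, and sums bin powers without any weighting. The function names are hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Zwicker and Terhardt approximation of the Bark scale (an assumption)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def group_into_bands(power_spectrum, sample_rate, band_width_bark=1.0):
    """Sum the power of all FFT bins that fall into the same band.

    `power_spectrum` holds the power of bins 0..N/2 of an N-point FFT,
    so the last bin lies at the Nyquist frequency.
    """
    n_bins = len(power_spectrum)
    if n_bins < 2:
        raise ValueError("need at least two spectral bins")
    nyquist = sample_rate / 2.0
    bin_spacing = nyquist / (n_bins - 1)
    n_bands = int(hz_to_bark(nyquist) / band_width_bark) + 1
    bands = [0.0] * n_bands
    for k, power in enumerate(power_spectrum):
        band = int(hz_to_bark(k * bin_spacing) / band_width_bark)
        band = min(band, n_bands - 1)   # guard against rounding at Nyquist
        bands[band] += power
    return bands
```

Because the Bark scale is compressed at high frequencies, the upper bands collect many more FFT bins than the lower ones, which mirrors the coarser frequency resolution of hearing in that range.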
For each frame of each signal, the result is a head-internal representation that indicates roughly how loud each frequency component would be perceived.
The reference signal is then idealized further by filtering out excessive timbre and low-level stationary noise.
At the same time, linear frequency distortions and stationary noise are partially removed from the degraded signal.
POLQA has also been used to investigate the impact of tone language and non-native listening on speech quality measurement.