Speech tempo

While most people seem to believe that they can judge how quickly someone is speaking, subjective judgements and opinions are generally not accepted as scientific evidence for statements about speech tempo. John Laver has written that analyzing tempo can be "dangerously open to subjective bias ... listeners' judgements rapidly begin to lose objectivity when the utterance concerned comes either from an unfamiliar accent or ... from an unfamiliar language".

[3] The problem with measuring tempo in sounds per second is that the researcher must be clear as to whether the "sounds" being counted are phonemes or physically observable phonetic units (sometimes called "phones").

As an example, the utterance 'Don't forget to record it' might in slow, careful speech be pronounced /dəʊnt fəget tə rɪkɔːd ɪt/, with 19 phonemes, each of which is phonetically realized.

If we count only units that can be observed and measured, it is clear that at faster speeds of utterance the number of sounds produced per second does not necessarily increase, since sounds that are realized in careful speech may be reduced or dropped altogether in rapid speech.
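The difference between the two ways of counting can be made concrete with a small calculation. The following is a minimal sketch, not a measurement procedure taken from the literature: the function name units_per_second is hypothetical, and the two durations and the fast-speech phone count are assumed values chosen only for illustration. Only the figure of 19 phonemes comes from the example above.

```python
def units_per_second(unit_count: int, duration_s: float) -> float:
    """Return a tempo measure: counted units divided by utterance duration."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return unit_count / duration_s


# 19 is the phoneme count given in the text for the careful pronunciation.
PHONEME_COUNT = 19

# Everything below is a hypothetical value chosen for illustration only.
SLOW_DURATION_S = 2.0   # assumed duration of the slow, careful rendition
FAST_DURATION_S = 1.5   # assumed duration of a faster rendition
FAST_PHONE_COUNT = 14   # assumed number of phones still observable when fast

# In careful speech every phoneme is realized, so both counts agree.
slow_rate = units_per_second(PHONEME_COUNT, SLOW_DURATION_S)

# Counting phonemes: the count stays fixed, so a shorter duration raises the rate.
fast_phoneme_rate = units_per_second(PHONEME_COUNT, FAST_DURATION_S)

# Counting phones: fewer units are realized, so the rate need not rise at all.
fast_phone_rate = units_per_second(FAST_PHONE_COUNT, FAST_DURATION_S)

print(f"slow, careful speech:   {slow_rate:.2f} units/s")
print(f"fast speech (phonemes): {fast_phoneme_rate:.2f} units/s")
print(f"fast speech (phones):   {fast_phone_rate:.2f} units/s")
```

With these assumed figures the phoneme-based rate rises in the faster rendition while the phone-based rate does not, which is why a study needs to state which unit it counts.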

[8] His system, which uses terms mostly borrowed from musical usage, allows for simple variation in tempo away from the norm: monosyllables may be pronounced as "clipped", "drawled" or "held", and polysyllabic utterances may be spoken "allegro", "allegrissimo", "lento" or "lentissimo".

He cites from his corpus-based analysis instances of increased tempo in speakers' self-corrections of speech errors, and in the citing of embedded material in the form of titles and names, e.g. "I'm sorry, but we won't be able to start 'So you think you know what's happening' for a few moments" and "This is the 'I'll show you a picture and you tell me what it is' technique" (where the embedded material, set here in single quotation marks, is spoken at a faster tempo).

[11] The study by Kowal et al. referred to above, which compared story-telling with speaking in an interview, looked at English, Finnish, French, German and Spanish.

[12] On the perception of tempo differences between languages, Vaane used spoken Dutch, English, French, Spanish and Arabic produced at three different rates, and found that untrained and phonetically trained listeners performed equally well at judging the rate of speaking in both familiar and unfamiliar languages.