Prosody (linguistics)

In linguistics, prosody (/ˈprɒsədi, ˈprɒz-/)[1][2] is the study of elements of speech, including intonation, stress, rhythm and loudness, that occur simultaneously with individual phonetic segments: vowels and consonants.

[5] Prosodic features are suprasegmental, since they are properties of units of speech that are defined over groups of sounds rather than single segments.

[6] When talking about prosodic features, it is important to distinguish between the personal characteristics that belong to an individual's voice (for example, their habitual pitch range, intonation patterns, etc.) and the independently variable prosodic features that are used contrastively to communicate meaning (for example, the use of changes in pitch to indicate the difference between statements and questions).

The defining features of prosody that display the nuanced emotions of an individual differ across languages and cultures.

English makes use of changes in key; shifting one's intonation into the higher or lower part of one's pitch range is believed to be meaningful in certain contexts.

Prosody has a number of perceptually significant functions in English and other languages, contributing to the recognition and comprehension of speech.

[16] It is believed that prosody assists listeners in parsing continuous speech and in the recognition of words, providing cues to syntactic structure, grammatical boundaries and sentence type.

Boundaries between intonation units are often associated with grammatical or syntactic boundaries; these are marked by such prosodic features as pauses and slowing of tempo, as well as "pitch reset" where the speaker's pitch level returns to the level typical of the onset of a new intonation unit.
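As a rough illustration of how two of these cues might be checked on a pitch track, the following sketch looks for pauses and tests for pitch reset across each one. Everything here is an assumption made for the example: the function name `find_boundaries`, the 10 ms frame step, the 200 ms pause threshold, the three-frame comparison window and the sample values are hypothetical, and slowing of tempo is not modelled.

```python
def find_boundaries(f0, frame_ms=10, min_pause_ms=200, window=3):
    """Return (start_frame, pitch_reset) for each sufficiently long pause.

    f0 holds one value per frame in Hz; 0 marks silence or an unvoiced frame.
    pitch_reset is True when the mean F0 just after the pause is higher than
    the mean F0 just before it. All thresholds are illustrative assumptions.
    """
    min_frames = min_pause_ms // frame_ms
    boundaries = []
    i, n = 0, len(f0)
    while i < n:
        if f0[i] > 0:
            i += 1
            continue
        start = i
        while i < n and f0[i] <= 0:
            i += 1
        # Ignore leading or trailing silence and pauses that are too short.
        if start == 0 or i == n or i - start < min_frames:
            continue
        before = [x for x in f0[max(0, start - window):start] if x > 0]
        after = [x for x in f0[i:i + window] if x > 0]
        reset = sum(after) / len(after) > sum(before) / len(before)
        boundaries.append((start, reset))
    return boundaries


# Hypothetical pitch track: falling pitch, a 250 ms pause, then a higher
# restart, followed by a 50 ms gap that is too short to count as a pause.
track = [220, 200, 180, 160] + [0] * 25 + [230, 225, 215] + [0] * 5 + [210, 200]
boundaries = find_boundaries(track)
print(boundaries)  # [(4, True)]
```

Real boundary detection would combine several cues and would normally use a pitch tracker rather than hand-written values; the sketch only shows the shape of the comparison.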

A well-known example is the ambiguous sentence "I never said she stole my money", which takes on seven different meanings depending on which of its seven words is vocally highlighted.
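The seven readings can be enumerated mechanically. The short sketch below marks the highlighted word with asterisks, which is only a notational convention chosen for this example; the function name `stress_variants` is likewise hypothetical.

```python
SENTENCE = "I never said she stole my money"


def stress_variants(sentence):
    """Return one variant per word, with the stressed word wrapped in asterisks."""
    words = sentence.split()
    variants = []
    for i in range(len(words)):
        marked = [f"*{w}*" if j == i else w for j, w in enumerate(words)]
        variants.append(" ".join(marked))
    return variants


variants = stress_variants(SENTENCE)
for v in variants:
    print(v)
```

Each printed line corresponds to one placement of stress, for instance "I never said she \*stole\* my money" (implying that the money was obtained some other way).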

However, when the speaker varies their speech intentionally, for example to indicate sarcasm, this usually involves the use of prosodic features.

The most useful prosodic feature in detecting sarcasm is a reduction in the mean fundamental frequency relative to other speech for humor, neutrality, or sincerity.
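To make the comparison concrete, the sketch below computes the change in mean fundamental frequency (F0) of one utterance relative to a baseline utterance. The F0 values are invented for illustration, the names `mean_f0` and `relative_f0_change` are hypothetical, and the convention that 0 marks an unvoiced frame is an assumption of this example rather than something stated in the text.

```python
from statistics import mean


def mean_f0(contour_hz):
    """Mean fundamental frequency over voiced frames (unvoiced frames are 0)."""
    voiced = [f for f in contour_hz if f > 0]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    return mean(voiced)


def relative_f0_change(utterance_hz, baseline_hz):
    """Fractional change in mean F0 relative to a baseline; negative means lower."""
    base = mean_f0(baseline_hz)
    return (mean_f0(utterance_hz) - base) / base


# Hypothetical F0 tracks in Hz, one value per frame; 0 marks an unvoiced frame.
sincere = [200, 210, 0, 220, 210]
sarcastic = [180, 0, 190, 185, 201]

change = relative_f0_change(sarcastic, sincere)
print(f"{change:+.1%}")  # -10.0%
```

A negative value indicates a lower mean F0 than the baseline. Whether a given reduction actually signals sarcasm is an empirical question that this sketch does not address.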

[27] Emotional prosody was considered by Charles Darwin in The Descent of Man to predate the evolution of human language: "Even monkeys express strong feelings in different tones – anger and impatience by low, – fear and pain by high notes."

[29] In typical conversation (with no actor's voice involved), the recognition of emotion may be quite low, on the order of 50%, hampering the complex interrelationship function of speech advocated by some authors.

[30] However, even if emotional expression through prosody cannot always be consciously recognized, tone of voice may continue to have subconscious effects in conversation.

A study by Marc D. Pell revealed that 600 ms of prosodic information is necessary for listeners to be able to identify the affective tone of the utterance.

Adults, especially caregivers, speaking to young children tend to imitate childlike speech by using higher and more variable pitch, as well as exaggerated stress.

These prosodic characteristics are thought to assist children in acquiring phonemes, segmenting words, and recognizing phrasal boundaries.

Aprosody is often accompanied by the inability to properly utilize variations in speech, particularly with deficits in the ability to accurately modulate pitch, loudness, intonation, and rhythm of word formation.

[36] Phrasal prosody refers to the rhythm and tempo of phrases, often in an artistic setting such as music or poetry, but not always.

They are also seen to struggle with the identification and discrimination of semantically neutral sentences with varying tones of happiness, sadness, anger, and indifference, exemplifying the importance of prosody in language comprehension and production.

Producing these nonverbal elements requires intact motor areas of the face, mouth, tongue, and throat.

Damage to Brodmann areas 44/45, specifically in the right hemisphere, produces motor aprosodia, with the nonverbal elements of speech being disturbed (facial expression, tone, rhythm of voice).

The right Brodmann area 22 aids in the interpretation of prosody, and damage causes sensory aprosodia, with the patient unable to comprehend changes in voice and body language.

Visualization of the prosody of a male voice saying "speech prosody": pitch in ribbon height, and periodic energy in ribbon width and darkness.