A pitch contour may span multiple sounds at many pitches, and can relate the frequency function at one point in time to the frequency function at a later point.
One of the primary challenges in speech synthesis technology, particularly for non-tonal languages, is to create a natural-sounding pitch contour for the utterance as a whole.
Pure tones have a clear pitch, but complex sounds such as speech and music typically have intense peaks at many different frequencies.
Nevertheless, by establishing a fixed reference point in the frequency function of a complex sound, and then observing the movement of this reference point as the function translates, one can generate a meaningful pitch contour consistent with human experience.
When a person speaks a sentence involving multiple [e] sounds, the peaks will shift within these ranges, and the movement of the peaks between two instances establishes the difference in their values on the pitch contour.
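The reference-point idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that the reference point is the strongest spectral peak inside a fixed search band, and that the sound is analysed in short consecutive frames. The function names (`dominant_frequency`, `pitch_contour`, `synth`), the 8 kHz sample rate, the 80 to 400 Hz band, and the 5 Hz search step are all hypothetical choices, not taken from the text. Real speech usually calls for a more robust estimator, since a harmonic can be stronger than the fundamental.

```python
import math


def dominant_frequency(frame, sample_rate, f_min=80.0, f_max=400.0, step=5.0):
    """Return the frequency in [f_min, f_max] whose spectral power is largest.

    This peak serves as the (assumed) reference point for one frame.
    """
    best_freq, best_power = f_min, -1.0
    freq = f_min
    while freq <= f_max:
        # Correlate the frame with a cosine and a sine at the candidate frequency.
        w = 2.0 * math.pi * freq / sample_rate
        re = sum(x * math.cos(w * i) for i, x in enumerate(frame))
        im = sum(x * math.sin(w * i) for i, x in enumerate(frame))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
        freq += step
    return best_freq


def pitch_contour(signal, sample_rate, frame_len):
    """Track the reference peak over consecutive non-overlapping frames."""
    return [
        dominant_frequency(signal[start:start + frame_len], sample_rate)
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def synth(f0, sample_rate, n):
    """Synthetic complex tone: a fundamental plus two weaker harmonics."""
    return [
        math.sin(2 * math.pi * f0 * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
        + 0.25 * math.sin(2 * math.pi * 3 * f0 * i / sample_rate)
        for i in range(n)
    ]


if __name__ == "__main__":
    sr = 8000
    # Three 0.1 s segments whose spectrum shifts upward from one to the next.
    signal = synth(120, sr, 800) + synth(150, sr, 800) + synth(180, sr, 800)
    contour = pitch_contour(signal, sr, 800)
    # Movement of the reference point between consecutive frames.
    steps = [b - a for a, b in zip(contour, contour[1:])]
    print(contour)  # [120.0, 150.0, 180.0]
    print(steps)    # [30.0, 30.0]
```

Each frame contains peaks at several frequencies, yet following one chosen peak from frame to frame gives a single rising line, and the differences between consecutive values correspond to the movement of that peak between instances.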