[1] Fant built on the work of Tsutomu Chiba and Masato Kajiyama, who first showed the relationship between a vowel's acoustic properties and the shape of the vocal tract.
Using models of the vocal tract built from X-ray photographs, they were able to predict the formant frequencies of different vowels, linking vocal tract shape to acoustic output.
Voiced sounds (e.g., vowels) have at least one source, a mostly periodic glottal excitation that can be approximated by an impulse train in the time domain and by harmonics in the frequency domain, and a filter that depends on, for example, tongue position and lip protrusion.
[3] On the other hand, fricatives, such as [s] and [f], have at least one source of turbulent noise, produced at a constriction in the oral cavity or pharynx.
[4] The filter is the rest of the vocal tract, which can change shape through manipulation of the pharynx, mouth, and nasal cavity.
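The source and filter roles described above can be illustrated with a minimal sketch, offered as an illustration under stated assumptions rather than as Fant's or Chiba and Kajiyama's actual method. It assumes an idealized impulse train as the voiced source, Gaussian white noise as the turbulent source, and a cascade of two-pole resonators as the vocal tract filter. The sampling rate, formant frequencies, bandwidths, and all function names are hypothetical values chosen for the example.

```python
import cmath
import math
import random

FS = 8000   # sampling rate in Hz (assumed for this sketch)
N = 800     # number of samples, i.e. 0.1 s of signal


def impulse_train(f0, n=N, fs=FS):
    """Idealized voiced source: one unit impulse per pitch period."""
    period = round(fs / f0)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def noise_source(n=N, seed=0):
    """Idealized turbulent source: Gaussian white noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def resonator(x, fc, bw, fs=FS):
    """Two-pole resonator centered near fc (Hz) with bandwidth bw (Hz)."""
    r = math.exp(-math.pi * bw / fs)            # pole radius from bandwidth
    a1 = 2.0 * r * math.cos(2.0 * math.pi * fc / fs)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s + a1 * y1 + a2 * y2             # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        y.append(out)
        y1, y2 = out, y1
    return y


def vocal_tract(x, formants):
    """Filter: pass the source through one resonator per (fc, bw) pair."""
    for fc, bw in formants:
        x = resonator(x, fc, bw)
    return x


def magnitude_at(x, freq, fs=FS):
    """Magnitude of the DFT of x evaluated at a single frequency in Hz."""
    w = -2j * math.pi * freq / fs
    return abs(sum(s * cmath.exp(w * i) for i, s in enumerate(x)))


# Hypothetical formant (frequency, bandwidth) pairs, chosen for illustration.
FORMANTS = [(700, 100), (1200, 100), (2600, 150)]

source = impulse_train(100)                     # 100 Hz fundamental
vowel = vocal_tract(source, FORMANTS)           # voiced source + filter
hiss = vocal_tract(noise_source(), [(3000, 500)])  # noise source + filter

for f in (400, 700):
    print(f, round(magnitude_at(source, f), 2), round(magnitude_at(vowel, f), 2))
```

In this sketch the source has equal energy at every harmonic of 100 Hz, while the filtered signal is stronger at the harmonic nearest the first assumed formant (700 Hz) than at 400 Hz. Changing `FORMANTS` while keeping `source` fixed changes the spectral envelope without changing the harmonic spacing, which is the separation the model describes.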