Additive synthesis

[1][2] The timbre of musical instruments can be considered in the light of Fourier theory to consist of multiple harmonic or inharmonic partials or overtones.

[4] In other words, the fundamental frequency alone is responsible for the pitch of the note, while the overtones define the timbre of the sound.

Additive synthesis aims to exploit this property of sound in order to construct timbre from the ground up.

As a result, only a finite number of sinusoidal terms with frequencies that lie within the audible range are modeled in additive synthesis.
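This finite-sum construction can be sketched in a few lines. The sample rate, fundamental, partial count, and the 1/k amplitude rolloff (which approximates a sawtooth) are illustrative choices, not part of any particular synthesizer:

```python
import numpy as np

# A minimal additive-synthesis sketch: build one second of a sawtooth-like
# tone by summing a finite number of harmonic partials.
SAMPLE_RATE = 44100
F0 = 220.0              # fundamental frequency in Hz (determines the pitch)
N_PARTIALS = 40         # partials kept; all must stay below the Nyquist limit

t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
signal = np.zeros_like(t)
for k in range(1, N_PARTIALS + 1):
    if k * F0 >= SAMPLE_RATE / 2:        # skip partials above Nyquist
        break
    # Sawtooth-style amplitudes fall off as 1/k; the set of partial
    # amplitudes, not the fundamental alone, determines the timbre.
    signal += (1.0 / k) * np.sin(2 * np.pi * k * F0 * t)

signal /= np.max(np.abs(signal))         # normalize to [-1, 1]
```

Changing the amplitude pattern (for instance, keeping only odd harmonics) changes the timbre while the pitch stays at 220 Hz.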

The Fourier series of a periodic function $y(t)$ with fundamental frequency $f_0$ is mathematically expressed as

$$y(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\left[a_k \cos(2\pi k f_0 t) - b_k \sin(2\pi k f_0 t)\right]$$

where $a_k$ and $b_k$ are the Fourier coefficients of the $k$-th harmonic partial. Being inaudible, the DC component $a_0/2$, along with all partials whose frequencies lie above the audible range, is omitted from the synthesized sum.

In the general case, the instantaneous frequency of a sinusoid is the derivative, with respect to time, of the argument of the sine or cosine function.
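Equivalently, the phase is the time integral of the instantaneous frequency, so a frequency envelope can be turned into a waveform by accumulating phase. A short sketch, with arbitrary illustrative sample rate and sweep endpoints:

```python
import numpy as np

SAMPLE_RATE = 8000
n = SAMPLE_RATE                       # one second of samples
freq = np.linspace(200.0, 400.0, n)   # instantaneous frequency sweep in Hz

# Discrete integration: phase[i] = 2*pi * sum(freq[:i+1]) / SAMPLE_RATE
phase = 2 * np.pi * np.cumsum(freq) / SAMPLE_RATE
chirp = np.sin(phase)

# Inverting the relation: differentiating the phase argument and dividing
# by 2*pi recovers the instantaneous frequency.
recovered = np.diff(phase) * SAMPLE_RATE / (2 * np.pi)
```

The same phase-accumulation loop is how a time-varying oscillator bank tracks each partial's frequency envelope without phase discontinuities.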

By careful consideration of the DFT frequency-domain representation it is also possible to efficiently synthesize sinusoids of arbitrary frequencies using a series of overlapping frames and the inverse fast Fourier transform.
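A much-simplified, single-frame sketch of the idea: a sinusoid whose frequency falls exactly on a DFT bin can be generated by writing a complex amplitude into a spectrum frame and taking the inverse FFT. Real systems use overlapping windowed frames to reach arbitrary frequencies; the frame length and bin choices here are illustrative.

```python
import numpy as np

FRAME = 1024
spectrum = np.zeros(FRAME // 2 + 1, dtype=complex)

# Bin index k corresponds to frequency k * fs / FRAME. With NumPy's irfft
# conventions, a real-amplitude-a cosine at bin k (0 < k < FRAME/2) needs
# a spectral value of a * FRAME / 2.
spectrum[10] = 0.5 * FRAME / 2       # partial at bin 10, amplitude 0.5
spectrum[25] = 0.25 * FRAME / 2      # partial at bin 25, amplitude 0.25

frame = np.fft.irfft(spectrum, n=FRAME)  # one frame of the summed sinusoids
```

One inverse FFT thus produces a whole frame of many summed partials at once, which is the efficiency win over running one oscillator per partial per sample.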

One method of decomposing a sound into time-varying sinusoidal partials is short-time Fourier transform (STFT)-based McAulay-Quatieri analysis.
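The analysis front end can be sketched as per-frame spectral peak picking. This toy version is only the first stage of such an analysis, omitting the peak-matching and phase-interpolation steps; the `frame_peaks` helper and all parameter values are illustrative, not from the cited method:

```python
import numpy as np

def frame_peaks(signal, frame_len=1024, hop=256, sample_rate=44100, n_peaks=5):
    """Return, per frame, the frequencies of the strongest spectral peaks."""
    window = np.hanning(frame_len)
    peaks_per_frame = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        mag = np.abs(np.fft.rfft(signal[start:start + frame_len] * window))
        # local maxima: bins larger than both neighbors
        locs = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:]))[0] + 1
        locs = locs[np.argsort(mag[locs])[::-1]][:n_peaks]  # strongest first
        freqs = locs * sample_rate / frame_len              # bin index -> Hz
        peaks_per_frame.append(sorted(freqs))
    return peaks_per_frame

# A steady two-partial tone should yield two stable peaks in every frame.
sr = 44100
t = np.arange(sr // 4) / sr
tone = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
peaks = frame_peaks(tone, n_peaks=2, sample_rate=sr)
```

Peak frequencies here are quantized to the bin spacing (about 43 Hz at this frame length); full analysis systems refine them by interpolating around each peak.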

[17][18] By modifying the sum of sinusoids representation, timbral alterations can be made prior to resynthesis.
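A sketch of the modify-then-resynthesize idea, not any cited system's actual processing: once a sound is reduced to a list of (frequency, amplitude) partials, simple edits to that list change the timbre on resynthesis. Here the same illustrative partial set is resynthesized unchanged and with its even harmonics removed, which gives the hollow, odd-harmonic spectrum associated with clarinet-like tones:

```python
import numpy as np

def resynthesize(partials, duration=0.5, sample_rate=44100):
    """Sum a list of (frequency_hz, amplitude) pairs into a waveform."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    out = np.zeros_like(t)
    for freq, amp in partials:
        out += amp * np.sin(2 * np.pi * freq * t)
    return out

f0 = 220.0
partials = [(k * f0, 1.0 / k) for k in range(1, 9)]  # illustrative spectrum

original = resynthesize(partials)
# Timbral alteration before resynthesis: drop the even-numbered harmonics.
odd_only = resynthesize([(f, a) for f, a in partials if round(f / f0) % 2 == 1])
```

The same partial-list representation supports other alterations, such as scaling all frequencies for pitch transposition or reshaping the amplitude envelope of each partial over time.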

[21] Software that implements additive analysis/resynthesis includes: SPEAR,[22] LEMUR, LORIS,[23] SMSTools,[24] and ARSS.

[25] The New England Digital Synclavier had a resynthesis feature in which samples could be analyzed and converted into "timbre frames" that were part of its additive synthesis engine.

The vocal synthesizer Vocaloid has also been implemented on the basis of additive analysis/resynthesis: its spectral voice model, called the Excitation plus Resonances (EpR) model,[26][27] is an extension of Spectral Modeling Synthesis (SMS), and its diphone concatenative synthesis is processed with the spectral peak processing (SPP)[28] technique, similar to a modified phase-locked vocoder[29] (an improved phase vocoder for formant processing).

[30] Using these techniques, spectral components (formants) consisting of purely harmonic partials can be transformed into the desired form for sound modeling, and the sequence of short samples (diphones or phonemes) constituting the desired phrase can be smoothly connected by interpolating matched partials and formant peaks in the inserted transition regions between samples.

In linguistics research, harmonic additive synthesis was used in the 1950s to play back modified and synthetic speech spectrograms.

[31] Later, in the early 1980s, listening tests were carried out on synthetic speech stripped of acoustic cues to assess the significance of those cues.

Time-varying formant frequencies and amplitudes derived by linear predictive coding were synthesized additively as pure tone whistles.

[32][33] The composite sinusoidal modeling (CSM)[34][35] used in the singing-speech synthesis feature of the Yamaha CX5M (1984) is also known to employ a similar approach, which was developed independently during 1966–1979.

[38][39][40][41] Harmonic analysis was discovered by Joseph Fourier,[42] who published an extensive treatise of his research in the context of heat transfer in 1822.

Around 1876,[44] William Thomson (later ennobled as Lord Kelvin) constructed a mechanical tide predictor.

The resulting Fourier coefficients were input into the synthesizer, which then used a system of cords and pulleys to generate and sum harmonic sinusoidal partials for prediction of future tides.

[47] The synthesizer drew a graph of the combination waveform, which was used chiefly for visual validation of the analysis.

The line of work was greatly advanced by Hermann von Helmholtz, who published eight years' worth of research in 1863.

[48] Helmholtz believed that the psychological perception of tone color is subject to learning, while hearing in the sensory sense is purely physiological.

[47] Helmholtz agreed with the finding of Ernst Chladni from 1787 that certain sound sources have inharmonic vibration modes.

[50] For harmonic synthesis, Koenig also built a large apparatus based on his wave siren.

It was pneumatic, utilized cut-out tonewheels, and was criticized for the low purity of its partial tones.

[44] In 1938, with significant new supporting evidence,[51] it was reported on the pages of Popular Science Monthly that the human vocal cords function like a fire siren to produce a harmonic-rich tone, which is then filtered by the vocal tract to produce different vowel tones.

[53] In a 1940 Institute of Radio Engineers meeting, the head field engineer of Hammond elaborated on the company's new Novachord as having a "subtractive system" in contrast to the original Hammond organ in which "the final tones were built up by combining sound waves".

[54] Alan Douglas used the qualifiers additive and subtractive to describe different types of electronic organs in a 1948 paper presented to the Royal Musical Association.

[57] The following is a timeline of historically and technologically notable analog and digital synthesizers and devices implementing additive synthesis.

Schematic diagram of additive synthesis. The inputs to the bank of oscillators are the partials' frequencies and amplitudes.
Sinusoidal analysis/synthesis system for Sinusoidal Modeling (based on McAulay & Quatieri 1988, p. 161)[16]