Ambisonic data exchange formats

Data exchange formats for Ambisonics have undergone radical changes since the early days of four-track magnetic tape.

Researchers working on very high-order systems found no straightforward way to extend the traditional formats to suit their needs.

Furthermore, there was no widely accepted formulation of spherical harmonics for acoustics, so one was borrowed from chemistry, quantum mechanics, computer graphics, or other fields, each of which had subtly different conventions.

This led to an unfortunate proliferation of mutually incompatible ad hoc formats and much head-scratching.

This page attempts to document the different existing formats, their rationales and history, for the terminally curious and those unfortunate enough to have to deal with them in detail.

For successful exchange of Ambisonic material, the sender and receiver have to agree on the ordering of the components, their normalisation or weighting, and the relative polarity of the harmonics.

Since it is possible to omit parts of the spherical harmonic multipole expansion for content that has non-uniform, direction-dependent resolution (known as mixed-order), it might also be necessary to define how to deal with missing components.

In the case of transmission "by wire", be it an actual digital multichannel link or any number of virtual patchcords within an audio processing engine, these properties must be explicitly matched on both ends, since there is usually no provision for metadata exchange and parameter negotiation.

The first is the Furse-Malham higher-order format, which is an extension of traditional B-Format; the second is the more modern SN3D, in ACN channel order.

For higher orders, this precedent becomes awkward, because spherical harmonics are most intuitively arranged in symmetric fashion around the single z-rotationally symmetric member m=0 of each order, with the horizontal sine terms m<0 to the left, and the cosine terms m>0 to the right (see illustration).

The Furse-Malham higher-order channels begin with their z-rotationally symmetric member and then jump outward right and left (see table), with the horizontal components at the end.

[note 1] He implied yet another channel ordering, subsequently developed into an explicit proposal called SID, for Single Index Designation,[4] which was adopted by a number of researchers.

For future higher-order systems, adoption of the Ambisonic Channel Number (ACN)[5] has reached wide consensus.
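As a minimal sketch of how ACN indexing works: ACN is commonly defined as l² + l + m for the component of order l and index m, so that each order occupies a contiguous block with its sine terms (m<0) first and its cosine terms (m>0) last. The function names below are hypothetical and chosen for illustration only; the cited proposal [5] remains the authoritative definition.

```python
import math


def acn_index(l, m):
    """Return the channel number of the component (l, m), assuming ACN = l*l + l + m."""
    if l < 0 or abs(m) > l:
        raise ValueError("need l >= 0 and -l <= m <= l")
    return l * l + l + m


def acn_to_order_index(acn):
    """Invert acn_index: recover (l, m) from a channel number."""
    if acn < 0:
        raise ValueError("channel number must be non-negative")
    l = math.isqrt(acn)       # order: largest l with l*l <= acn
    m = acn - l * l - l       # position relative to the m=0 member
    return l, m


# List the (l, m) pairs of a second-order signal set in channel order.
for channel in range(9):
    print(channel, acn_to_order_index(channel))
```

Because the mapping is a closed-form expression in both directions, no lookup table is needed however high the order goes.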

For successful reconstruction of the sound field, it is important to agree on a normalisation method for the spherical harmonic components.

The following approaches are common. The maxN scheme by Daniel normalises each component so that it never exceeds a gain of 1.0 for a panned monophonic source.

It has significant engineering advantages in that it restricts the maximum levels a panned mono source will generate in some of the higher-order channels.

SN3D was originally introduced into Ambisonic use by Daniel, who notes: "High degree of generality - the encoding coefficients are recursively computable, and the first-order components are unity vectors in their respective directions of incidence".

As N3D and SN3D differ only by scaling factors, care is needed when working with both, as it may not be obvious on first listening if an error has been made, particularly on a system with a small number of speakers.
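To illustrate the scaling relationship, the sketch below converts between the two normalisations using the commonly stated factor of sqrt(2l + 1) per order l (N3D = SN3D × sqrt(2l + 1)). It assumes the channels of each frame are in ACN order, so that the order of a channel is the integer square root of its index. The function names are hypothetical.

```python
import math


def sn3d_to_n3d_gain(l):
    """Gain that turns an SN3D component of order l into its N3D equivalent."""
    return math.sqrt(2 * l + 1)


def sn3d_to_n3d(frame):
    """Rescale one frame of samples (one per channel, ACN order assumed)."""
    return [s * sn3d_to_n3d_gain(math.isqrt(acn)) for acn, s in enumerate(frame)]


def n3d_to_sn3d(frame):
    """Inverse of sn3d_to_n3d, under the same ACN-order assumption."""
    return [s / sn3d_to_n3d_gain(math.isqrt(acn)) for acn, s in enumerate(frame)]


# The order-0 channel is unchanged; first-order channels gain sqrt(3), about 4.8 dB.
print(sn3d_to_n3d([1.0, 1.0, 1.0, 1.0]))
```

Since the zeroth-order channel is identical in both schemes and the higher orders differ by only a few decibels, a mismatch produces a tonally plausible but spatially wrong result, which is why it is easy to miss by ear.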

This has practical advantages for fixed-point media in the common situation where sources are concentrated on the horizontal plane, but the normalisation is somewhat arbitrary and its assumptions do not hold for strongly diffuse soundfields and sound scenes with strong elevated sources.

A third complication arises from the quantum mechanical formulation of spherical harmonics, which was adopted by some Ambisonics researchers.

This formulation includes a factor of (-1)^m, a convention called Condon–Shortley phase, which will invert the relative polarity of every other component within a given Ambisonic order.

The presence of Condon–Shortley phase in parts of the signal chain usually manifests itself in erratic panning behaviour and increasing apparent source width when going to higher orders, which can be somewhat difficult to diagnose and much harder to eliminate.
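A minimal sketch of how such a mismatch might be corrected, assuming the phase appears as a factor of (-1)^m on each component and that the channels are in ACN order: flipping the polarity of every component with odd |m| adds or removes the phase. The function name is hypothetical, and whether a given tool actually applies the phase in this way has to be checked against its documentation.

```python
import math


def toggle_condon_shortley(frame):
    """Flip the polarity of every component with odd |m| (ACN order assumed).

    Applying the function twice returns the original frame, so the same
    routine both adds and removes the phase.
    """
    out = []
    for acn, sample in enumerate(frame):
        l = math.isqrt(acn)        # order of this channel
        m = acn - l * l - l        # index within the order, -l..l
        sign = -1.0 if abs(m) % 2 else 1.0
        out.append(sign * sample)
    return out


# In a first-order frame, the two horizontal channels (m = -1 and m = +1) are inverted.
print(toggle_condon_shortley([1.0, 1.0, 1.0, 1.0]))
```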

Polarity is generally only a concern when trying to reconcile theoretical formulations of the spherical harmonics from other academic disciplines.

For both of these encodings, the equations can be expressed directly, without separate normalisation or conversion factors, and there is no ambiguity around ordering.

For file-based storage and transmission, additional properties need to be defined, such as the base file format and, if desired, accompanying metadata.

From its parent, it inherits a maximum file size of 4 GB, which is a serious limitation for live recording in higher orders.

This makes it possible to identify traditional #H#P mixed-order content by the number of channels present, as per the following table:[15]

The free and open-source C library libsndfile has included .amb support since 2007.

The basic format of AmbiX mandates a complete full-sphere signal set, the order of which can be uniquely and trivially deduced from the number of channels.
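The deduction can be sketched as follows: a complete full-sphere set of order N has (N + 1)² channels, so the order is the square root of the channel count minus one, and any count that is not a perfect square cannot be a complete set. The function name is hypothetical.

```python
import math


def full_sphere_order(num_channels):
    """Return the Ambisonic order of a complete full-sphere signal set."""
    if num_channels < 1:
        raise ValueError("need at least one channel")
    root = math.isqrt(num_channels)
    if root * root != num_channels:
        raise ValueError(f"{num_channels} channels is not a complete full-sphere set")
    return root - 1


# 4 channels -> order 1, 16 channels -> order 3, 36 channels -> order 5.
for count in (4, 16, 36):
    print(count, full_sphere_order(count))
```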

Additionally, the header now contains an adaptor matrix of coefficients, which needs to be applied to the data streams before they can be played back.
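As a generic illustration of what applying a matrix of coefficients to the data streams means, the sketch below multiplies one frame of stored samples by a matrix to obtain the playback channels. The function name, the example matrix, and its dimensions are all hypothetical; the actual header layout and matrix conventions are defined by the AmbiX specification and are not reproduced here.

```python
def apply_adaptor_matrix(matrix, frame):
    """Multiply one frame of stored samples by a matrix of coefficients.

    matrix: list of rows, one row per output channel
    frame:  list of samples, one per stored channel
    """
    if any(len(row) != len(frame) for row in matrix):
        raise ValueError("every matrix row needs one coefficient per stored channel")
    return [sum(c * s for c, s in zip(row, frame)) for row in matrix]


# Hypothetical example: three stored channels expanded to a four-channel
# first-order set, with the third output channel left silent.
example_matrix = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
print(apply_adaptor_matrix(example_matrix, [0.5, 0.25, -0.5]))
```

In a real decoder the same multiplication would be applied to every frame of the file before any further processing.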

AmbiX was originally proposed at the Ambisonic Symposium 2011, building upon previous work by Travis[17] and Chapman et al.[5]

Spherical Harmonics up to Ambisonic order 5 as commonly displayed, sorted by increasing Ambisonic Channel Number (ACN), aligned for symmetry.