A common example application of ICA is the "cocktail party problem": separating one person's speech from sample data consisting of several people talking simultaneously in a noisy room.
When the statistical independence assumption is correct, blind ICA separation of a mixed signal gives very good results.
Note that a filtered and delayed signal is a copy of a dependent component, and thus the statistical independence assumption is not violated.
When there are an equal number of observations and source signals, the mixing matrix is square; if it is also invertible, the sources can be recovered (up to scaling and permutation) by applying its inverse.
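To make this concrete, here is a minimal sketch (our illustration; it assumes NumPy and scikit-learn's FastICA, and the signals, seed, and matrix values are arbitrary) that mixes two synthetic sources with a square 2×2 matrix and blindly recovers them:

```python
# A minimal sketch of blind separation with a square mixing matrix.
# NumPy / scikit-learn usage is illustrative; all values are arbitrary.
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                # source 1: sinusoid
s2 = np.sign(np.sin(3 * t))       # source 2: square wave
S = np.c_[s1, s2]                 # shape (n_samples, n_sources)

A = np.array([[1.0, 0.5],         # square (2 x 2) mixing matrix
              [0.4, 1.0]])
X = S @ A.T                       # observed mixtures, one row per sample

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)      # blindly recovered sources
```

Because blind separation is only defined up to permutation and scaling, the columns of S_hat may be reordered and rescaled relative to S.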
ICA rests on two assumptions:
1. The source signals are independent of each other.
2. The values in each source signal have non-Gaussian distributions.

Mixing source signals has three effects:
1. Independence: while the source signals are independent, their mixtures are not, since the mixtures share the same sources.
2. Normality: by the central limit theorem, the distribution of a sum of independent random variables with finite variance tends toward a Gaussian distribution; loosely, a sum of two independent random variables usually has a distribution closer to Gaussian than either original variable.
3. Complexity: the temporal complexity of any signal mixture is greater than that of its simplest constituent source signal.

Those principles contribute to the basic establishment of ICA.
We may choose one of many ways to define a proxy for independence, and this choice governs the form of the ICA algorithm.
The non-Gaussianity family of ICA algorithms, motivated by the central limit theorem, uses kurtosis and negentropy.[8]
Typical algorithms for ICA use centering (subtracting the mean to create a zero-mean signal), whitening (usually with the eigenvalue decomposition), and dimensionality reduction as preprocessing steps to simplify the problem for the actual iterative algorithm.
Whitening and dimension reduction can be achieved with principal component analysis or singular value decomposition.
Whitening ensures that all dimensions are treated equally a priori before the algorithm is run.
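As an illustration, the preprocessing chain might look like the following NumPy sketch (variable names and the eigenvalue threshold are our choices, not a canonical implementation); it centers the data, then whitens via the eigenvalue decomposition of the covariance matrix, dropping near-zero eigenvalues for dimensionality reduction:

```python
import numpy as np

def preprocess(X):
    """Center and whiten X of shape (n_samples, n_features).
    After whitening, the retained dimensions have identity covariance."""
    Xc = X - X.mean(axis=0)                  # centering: zero mean
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalue decomposition
    # Dimensionality reduction: drop directions with near-zero variance.
    keep = eigvals > 1e-10
    D_inv_sqrt = np.diag(1.0 / np.sqrt(eigvals[keep]))
    Z = Xc @ eigvecs[:, keep] @ D_inv_sqrt   # whitened data
    return Z
```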
Well-known algorithms for ICA include infomax, FastICA, JADE, and kernel-independent component analysis, among others.
A special variant of ICA is binary ICA, in which both the signal sources and the observations are binary. The problem was shown to have applications in many domains, including medical diagnosis, multi-cluster assignment, network tomography, and internet resource management.
The above problem can be heuristically solved[10] by assuming the variables are continuous and running FastICA on the binary observation data to obtain a real-valued estimate of the mixing matrix, then rounding its entries to recover the binary values.[citation needed]
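A rough sketch of this heuristic (our illustration only; the use of scikit-learn's FastICA and the particular thresholding rule are assumptions, not the method of the cited work):

```python
import numpy as np
from sklearn.decomposition import FastICA

def binary_ica_heuristic(X_bin, n_sources):
    """Treat binary observations as continuous, run FastICA, then
    round the real-valued mixing estimate back to binary values."""
    ica = FastICA(n_components=n_sources, random_state=0)
    ica.fit(X_bin.astype(float))       # pretend the data are continuous
    A_real = ica.mixing_               # real-valued mixing matrix estimate
    # Hypothetical rounding rule: large-magnitude entries become 1.
    A_bin = (np.abs(A_real) > 0.5 * np.abs(A_real).max()).astype(int)
    return A_bin
```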
Another method is to use dynamic programming: recursively breaking the observation matrix into sub-matrices and running the inference algorithm on these sub-matrices.
Experimental results[11] show that this approach is accurate under moderate noise levels.
The Generalized Binary ICA framework[12] introduces a broader problem formulation which does not require any knowledge of the generative model.
In other words, this method attempts to decompose a source into its independent components (as far as possible, and without losing any information) with no prior assumption about the way it was generated.
We can use kurtosis to recover multiple source signals by finding the correct weight vectors with projection pursuit.
The goal of projection pursuit is to maximize the kurtosis, making the extracted signal as non-normal as possible.
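A minimal gradient-ascent sketch of kurtosis-based projection pursuit on whitened data (the step size, iteration count, and single-component form are our assumptions):

```python
import numpy as np

def projection_pursuit_kurtosis(Z, n_iter=200, lr=0.1):
    """Find a unit weight vector w maximizing the kurtosis of w^T z
    on whitened data Z of shape (n_samples, n_features)."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(Z.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = Z @ w                                # projected signal
        # For whitened data E[y^2] = 1, so kurt(y) = E[y^4] - 3;
        # the gradient of E[y^4] w.r.t. w is 4 E[z y^3].
        grad = 4 * (Z * y[:, None] ** 3).mean(axis=0)
        w += lr * grad                           # ascend the kurtosis
        w /= np.linalg.norm(w)                   # stay on the unit sphere
    return w                                     # extracted signal: Z @ w
```

This form ascends the kurtosis and so targets super-Gaussian sources; for sub-Gaussian sources one would instead descend (maximize the magnitude of the negative kurtosis).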
Negentropy is defined as $J(x) = S(y) - S(x)$, where $y$ is a Gaussian random variable with the same covariance matrix as $x$ and $S(\cdot)$ denotes differential entropy. An approximation for negentropy is $J(x) \approx \frac{1}{12}\,E[x^3]^2 + \frac{1}{48}\,\operatorname{kurt}(x)^2$. A proof can be found in the original papers of Comon;[16][8] it has been reproduced in the book Independent Component Analysis by Aapo Hyvärinen, Juha Karhunen, and Erkki Oja.[17] This approximation also suffers from the same problem as kurtosis: sensitivity to outliers.
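For instance, the moment-based approximation above can be evaluated directly; the following sketch assumes the signal has already been standardized to zero mean and unit variance:

```python
import numpy as np

def negentropy_approx(x):
    """Approximate negentropy J(x) ~ (1/12) E[x^3]^2 + (1/48) kurt(x)^2
    for a standardized (zero-mean, unit-variance) signal x."""
    skew_term = np.mean(x ** 3) ** 2 / 12.0
    kurt = np.mean(x ** 4) - 3.0       # excess kurtosis
    return skew_term + kurt ** 2 / 48.0
```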
As in the projection pursuit setting, we can use a gradient descent method to find the optimal unmixing matrix.
Maximum likelihood estimation (MLE) is a standard statistical tool for finding the parameter values (e.g. the unmixing matrix $\mathbf{W}$) that provide the best fit of some data (e.g. the extracted signals $y$) to a given model (e.g. the assumed joint probability density function (pdf) $p_s$ of the source signals).
Using ML ICA, the objective is to find an unmixing matrix that yields extracted signals $y = \mathbf{W}x$ with a joint pdf as similar as possible to the joint pdf $p_s$ of the hypothesized source signals.
If the estimate of $\mathbf{W}$ is far from the correct parameter values, then a low probability of the observed data would be expected.
Using MLE, we call the probability of the observed data for a given set of model parameter values (e.g., a pdf $p_s$ and a matrix $\mathbf{W}$) the likelihood of the model parameter values given the observed data.
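As an illustrative sketch of such a gradient-based likelihood maximization, a natural-gradient (infomax-style) update with a logistic source model can be written as follows; the nonlinearity, step size, and initialization here are our assumptions, not a prescribed implementation:

```python
import numpy as np

def ml_ica(X, n_iter=500, lr=0.01):
    """Gradient ascent on the ICA log-likelihood with a logistic
    source model (infomax-style natural gradient).
    X has shape (n_samples, n_features) and should be centered."""
    n, d = X.shape
    W = np.eye(d)                              # initial unmixing matrix
    for _ in range(n_iter):
        Y = X @ W.T                            # extracted signals y = W x
        phi = 1.0 - 2.0 / (1.0 + np.exp(-Y))   # score function, -tanh(y/2)
        # Natural-gradient ascent step: dW = (I + E[phi(y) y^T]) W
        W += lr * (np.eye(d) + (phi.T @ Y) / n) @ W
    return W
```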
The early general framework for independent component analysis was introduced by Jeanny Hérault and Bernard Ans in 1984,[21] further developed by Christian Jutten in 1985 and 1986,[2][22][23] and refined by Pierre Comon in 1991,[16] who popularized it in his 1994 paper.
A widely used algorithm, including in industrial applications, is FastICA, developed by Hyvärinen and Oja,[26] which uses negentropy as a cost function, a choice already proposed seven years earlier by Pierre Comon in this context.
Sepp Hochreiter and Jürgen Schmidhuber showed how to obtain non-linear ICA or source separation as a by-product of regularization (1999).
For instance, ICA has been applied to discover discussion topics in a bag-of-words representation of news mailing list archives.