This is a paper draft that is probably more worthy of being a blog post, so I decided to blogize it. Here we go:

Abstract:

It is well known that Independent Component Analysis (ICA) is not identifiable when more than one source is Gaussian (Comon, 1994). However, the question of *how* identifiable it is is rarely asked. In this paper, we show a generalization of the identifiability result: the number of degrees of freedom in the observational equivalence class of ICA is k-1, where k is the number of Gaussian sources.

However, since the sources are latent, k is unknown and must be estimated. We frame this as a problem of "hyper-equatorial regression" (i.e., finding the hyper-great-circle that best fits the resampled points on the hypersphere), which we solve by running PCA and discarding the components beyond the knee of the eigenspectrum.

Having estimated the Gaussian subspace, we can then project it out of the data, leaving a fully identifiable ICA problem.

We conclude by presenting a simple modification of the FastICA algorithm that returns only the identifiable components, exiting early once no further non-Gaussian components can be found. This is efficient because it avoids the bootstrap and clustering steps.

Independent component analysis is a statistical technique for estimating the mixing matrix A in equations of the form

**x** = A**s**,

where **x** is observed and **s** and A are not. **s** is usually called the "sources".

The matrix A can be identified up to scaling and column-permutations as long as the observed distribution is a linear, invertible mixture of independent components, at most one of which is Gaussian.
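As a quick sanity check of this identifiability claim, here is a minimal sketch using NumPy and scikit-learn's FastICA (the specific sources and mixing matrix are made up for illustration): with two non-Gaussian (uniform) sources, the estimated mixing matrix matches the true one up to scaling and column permutation.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 20000

# Two non-Gaussian (uniform) sources, so A is identifiable
# up to scaling and column permutation.
s = rng.uniform(-1.0, 1.0, size=(n, 2))
A = np.array([[2.0, 1.0], [1.0, 3.0]])    # illustrative mixing matrix
x = s @ A.T                               # observed mixtures x = A s

ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
ica.fit(x)
A_hat = ica.mixing_

def unit_cols(M):
    # Normalize columns so we can compare directions only,
    # ignoring the scaling indeterminacy.
    return M / np.linalg.norm(M, axis=0)

# Absolute cosine similarities between true and estimated columns:
# close to a permutation matrix when A is recovered.
C = np.abs(unit_cols(A).T @ unit_cols(A_hat))
print(np.round(C, 2))
```

Each row and column of `C` should have one entry near 1 and the rest near 0, reflecting recovery up to permutation and sign.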

This note is concerned with the unidentifiability that results when more than one source is Gaussian, in which case, rather than a single mixing matrix, there is a proper subspace of mixing matrices, all of which lead to the observed joint distribution in the large-sample limit.

The observational equivalence class of ICA (modulo scaling and permutation) is the set of mixing matrices obtained by applying an invertible linear transformation to the Gaussian components while keeping the non-Gaussian components fixed. If we order the components so that the Gaussian ones come first, the observational equivalence class of any mixing matrix A can be written as the set {MA | M is the identity matrix except for a k-by-k block in the top left, which is a general invertible matrix}.
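To make the equivalence class concrete, here is a small sketch (all names and numbers are illustrative) that mixes the same Gaussian-first-ordered sources through A and through A·M, where M is the identity except for a rotation in the top-left Gaussian block. A rotation is a simple member of the invertible block above, chosen because it visibly preserves the observed distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
k, d, n = 2, 3, 100000

# Sources ordered Gaussian-first: k Gaussian, the rest uniform
# (rescaled to unit variance).
s = np.column_stack([rng.standard_normal((n, k)),
                     rng.uniform(-np.sqrt(3), np.sqrt(3), (n, d - k))])

A = rng.standard_normal((d, d))

# M: identity except an orthogonal k-by-k block in the top left.
# An orthogonal block preserves the distribution of the iid
# Gaussian block exactly, so A and A @ M are observationally
# equivalent mixing matrices.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
M = np.eye(d)
M[:k, :k] = R

x1 = s @ A.T            # observed under A
x2 = s @ (A @ M).T      # observed under A @ M

# Population covariances are identical (A M M^T A^T = A A^T);
# the sample versions agree up to sampling noise.
print(np.allclose(np.cov(x1.T), np.cov(x2.T), atol=0.1))
```

The same argument fails for non-Gaussian components: rotating a pair of independent uniform sources changes their joint distribution, which is exactly why those components remain identifiable.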

If we have 2 Gaussians among a large number of sources, ICA is far from useless: all but 2 of the components can be identified, and the 2 unidentifiable ones correspond to the two Gaussian sources. The observational equivalence class has one degree of freedom, as can be seen in (b) below.

Now we illustrate the behavior of the ICA estimate on an example with 3 sources, by resampling and plotting the directions of the resulting components on the unit sphere:

As we can see in Fig (b), the two components that lie on the circle can rotate around an axis. The next figure provides a 3D plot of this.
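The resampling experiment behind these figures can be sketched as follows (assuming a simple bootstrap over rows and scikit-learn's FastICA; the details here are illustrative, not necessarily the exact procedure used). The non-Gaussian component's direction should form a tight cluster, while the two Gaussian components smear along a great circle:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
n = 5000

# 3 sources: two Gaussian, one non-Gaussian (uniform).
s = np.column_stack([rng.standard_normal((n, 2)),
                     rng.uniform(-1.0, 1.0, n)])
A = rng.standard_normal((3, 3))
x = s @ A.T

directions = []
for b in range(30):
    xb = x[rng.integers(0, n, n)]          # bootstrap resample of rows
    ica = FastICA(n_components=3, whiten="unit-variance",
                  random_state=b, max_iter=500, tol=1e-3)
    ica.fit(xb)
    # Normalize each estimated mixing column to a unit direction.
    cols = ica.mixing_ / np.linalg.norm(ica.mixing_, axis=0)
    directions.append(cols.T)              # one unit direction per row

D = np.vstack(directions)                  # 90 points on the unit sphere
D = D * np.sign(D[:, [0]])                 # fold antipodes (sign is arbitrary)
```

Plotting the rows of `D` in 3D (e.g. with matplotlib's `scatter`) reproduces the tight-cluster-plus-ring picture described above.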

ToDo: implement proper clustering, modify FastICA, write a proper report

Having run a clustering procedure to eliminate the tight clusters, we use PCA to find the k-plane around which the remaining points on the ring lie. This gives us the Gaussian subspace.
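A minimal sketch of this step, assuming the ring points have already been isolated (the helper name `gaussian_subspace` and the knee rule, largest consecutive eigenvalue drop, are illustrative choices, not the paper's exact procedure):

```python
import numpy as np

def gaussian_subspace(ring_points, k=None):
    """Fit the k-plane through the origin best containing the ring points.

    ring_points: (m, d) unit vectors remaining after the tight
        clusters were removed.
    k: subspace dimension; if None, estimated from the knee of the
        eigenspectrum (largest drop between consecutive eigenvalues).
    Returns (basis, eigvals): basis columns span the Gaussian subspace.
    """
    # Directions come in +/- pairs, so use the second-moment matrix
    # rather than a mean-centred covariance.
    C = ring_points.T @ ring_points / len(ring_points)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]           # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if k is None:
        k = int(np.argmax(eigvals[:-1] - eigvals[1:])) + 1
    return eigvecs[:, :k], eigvals

# Example: points near the great circle spanned by e1 and e2.
rng = np.random.default_rng(3)
t = rng.uniform(0, 2 * np.pi, 200)
ring = np.column_stack([np.cos(t), np.sin(t),
                        0.02 * rng.standard_normal(200)])
ring /= np.linalg.norm(ring, axis=1, keepdims=True)

B, ev = gaussian_subspace(ring)
print(B.shape[1])   # estimated k = 2
```

The top eigenvectors span the plane of the ring, i.e. the Gaussian subspace; projecting the data onto the orthogonal complement then leaves only the identifiable, non-Gaussian directions.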

Thanks to Ives Macedo for discussions.