
# Application of Information Theory to Blind Source Separation


## Introduction

The principles of information theory can be applied to the blind source separation problem. We will briefly state the problem, then develop steps toward its solution.

## Background and some preliminary results

We consider first the case of adapting a processing function g which operates on a scalar X via Y = g(X), in order to maximize the mutual information between X and Y. That is, we assume that g(X) = g(X; w, w_0) for some parameters w and w_0, which are to be chosen to maximize I(X; Y). We assume that g is a deterministic function. We have

I(X; Y) = H(Y) - H(Y|X).

But since g is deterministic, H(Y|X) = H(g(X)|X) = 0, so the mutual information is maximized when H(Y) is maximized. (Actually, if we are dealing with differential entropy, this identity may not hold; but we will take derivatives, and in any event H(Y|X) does not depend on the parameters.) Now, assuming the range of g is restricted to [0, 1] (a reasonable assumption), what form should g ideally have? The CDF of X. Draw a picture. Recall the rule for transformations of random variables:

f_Y(y) = f_X(x) / |dy/dx|.

If g(x) = F_X(x), then dy/dx = f_X(x), and we get f_Y(y) = 1 (fill in some details): the output is uniform, the maximum-entropy density on [0, 1].
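This can be checked numerically: pushing samples of X through their own CDF should give a uniform output, with mean 1/2 and variance 1/12. A minimal sketch; the Gaussian source with mean 1 and standard deviation 2 is an arbitrary illustrative choice.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# Source X: Gaussian with mean 1 and standard deviation 2 (arbitrary example).
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=200_000)

# Pass X through its own CDF, g = F_X.
F = np.vectorize(lambda t: 0.5 * (1.0 + erf((t - mu) / (sigma * sqrt(2.0)))))
y = F(x)

# Y = F_X(X) should be Uniform(0, 1): mean near 1/2, variance near 1/12.
print(y.mean(), y.var())
```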

Taking the entropy of the transformed variable,

H(Y) = -E[ln f_Y(Y)] = E[ln |dy/dx|] - E[ln f_X(X)].

But f_X(x) does not depend on our parameters, so we can ignore it. Of course, we may not know the pdf of X, and may not have the flexibility to choose g to match it. However, what is frequently done is to assume a particular functional form, and just fit its parameters. Take the logistic function

y = g(x) = 1 / (1 + e^{-(w x + w_0)}).

Then an adaptive scheme is to take a stochastic gradient ascent step on the entropy,

Δw ∝ ∂/∂w ln |dy/dx|,   Δw_0 ∝ ∂/∂w_0 ln |dy/dx|,

dropping the expectation and using the instantaneous gradient.

As examined in the HW, for the logistic function dy/dx = w y(1 - y), and we find

∂/∂w ln |dy/dx| = 1/w + x(1 - 2y).

Similarly, we find

∂/∂w_0 ln |dy/dx| = 1 - 2y.

We define by this means a weight update rule:

Δw = η ( 1/w + x(1 - 2y) ),   Δw_0 = η (1 - 2y),

where η is a small step size.
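The scalar rule can be sketched as a small simulation. This is only an illustration: it assumes the logistic nonlinearity and the updates Δw = η(1/w + x(1 - 2y)), Δw_0 = η(1 - 2y); the step size and the zero-mean Gaussian input are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def g(t):
    # logistic nonlinearity y = 1 / (1 + e^{-t})
    return 1.0 / (1.0 + np.exp(-t))

# Online adaptation of y = g(w x + w0); step size eta and the input
# distribution (zero-mean Gaussian, std 2) are illustrative.
w, w0, eta = 1.0, 0.0, 0.01
for x in rng.normal(0.0, 2.0, size=20_000):
    y = g(w * x + w0)
    w += eta * (1.0 / w + x * (1.0 - 2.0 * y))
    w0 += eta * (1.0 - 2.0 * y)

# The adapted g(w x + w0) approximates the CDF of X, so the output
# should be roughly uniform on (0, 1) (variance near 1/12).
y_test = g(w * rng.normal(0.0, 2.0, size=100_000) + w0)
print(w, w0, y_test.var())
```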

The effect of this learning rule is to drive Y to be as uniform as possible, given the form of g. We can generalize this to N inputs and N outputs. Suppose we take

y = g(Wx + w_0),

where the function g is applied element-by-element (expand out). Then

I(X; Y) = H(Y) - H(Y|X) = H(Y).

We want to determine W and w_0 to maximize the joint entropy of the output, H(Y); W is a matrix, w_0 is a vector. We have the pdf transformation equation

f_Y(y) = f_X(x) / |J|,

where J is the Jacobian of the transformation,

J = det( ∂y_i / ∂x_j ).

Then, as before, we find

H(Y) = E[ln |J|] - E[ln f_X(X)],

where the second term does not depend upon the parameters. Then we adapt by stochastic gradient ascent,

ΔW ∝ ∂/∂W ln |J|,   Δw_0 ∝ ∂/∂w_0 ln |J|.

As explored in the homework,

ΔW = η ( [W^T]^{-1} + (1 - 2y) x^T )   (1)
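The matrix gradient behind (1) can be sanity-checked numerically: for y = g(Wx + w_0) with the logistic g, we have |J| = |det W| ∏ y_i(1 - y_i), and the closed-form gradient of ln |J| with respect to W should agree with finite differences at any point. A small sketch; the matrix, bias, and sample below are arbitrary.

```python
import numpy as np

def g(t):
    return 1.0 / (1.0 + np.exp(-t))  # logistic nonlinearity

def log_abs_jac(W, w0, x):
    # ln |J| for y = g(W x + w0):  |J| = |det W| * prod_i y_i (1 - y_i)
    y = g(W @ x + w0)
    return np.log(abs(np.linalg.det(W))) + np.sum(np.log(y * (1.0 - y)))

# Arbitrary test point
W = np.array([[1.2, 0.3],
              [-0.5, 0.9]])
w0 = np.array([0.1, -0.2])
x = np.array([0.7, -1.1])

# Closed-form gradient, as in (1):  [W^T]^{-1} + (1 - 2y) x^T
y = g(W @ x + w0)
grad = np.linalg.inv(W.T) + np.outer(1.0 - 2.0 * y, x)

# Central finite differences in each entry of W
eps, num = 1e-6, np.zeros_like(W)
for i in range(2):
    for j in range(2):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        num[i, j] = (log_abs_jac(Wp, w0, x) - log_abs_jac(Wm, w0, x)) / (2 * eps)

print(np.max(np.abs(grad - num)))  # agreement to roughly 1e-8 or better
```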

and similarly,

Δw_0 = η (1 - 2y).
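These updates can be run end-to-end on an artificial two-source mixture. This is a sketch, not the lecture's code: the Laplacian (super-Gaussian) sources, the mixing matrix, and the batch size, step size, and epoch count are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def g(t):
    return 1.0 / (1.0 + np.exp(-t))  # logistic nonlinearity

# Two independent super-Gaussian (Laplacian) sources, linearly mixed
# by an "unknown" matrix A; only x = A s is observed.
n = 20_000
s = rng.laplace(size=(2, n))
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
x = A @ s

# Gradient ascent on the joint output entropy, using batch averages of
#   dW = [W^T]^{-1} + (1 - 2y) x^T,   dw0 = 1 - 2y.
W, w0, eta, B = np.eye(2), np.zeros((2, 1)), 0.02, 200
for epoch in range(300):
    for k in range(0, n, B):
        xb = x[:, k:k + B]
        y = g(W @ xb + w0)
        W += eta * (np.linalg.inv(W.T) + (1.0 - 2.0 * y) @ xb.T / B)
        w0 += eta * np.mean(1.0 - 2.0 * y, axis=1, keepdims=True)

# If separation worked, W A is close to a scaled permutation matrix:
# one dominant entry per row, in different columns.
print(np.round(W @ A, 2))
```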

Copyright 2008, Todd Moon. (2006, May 15). Application of Information Theory to Blind Source Separation. Retrieved January 07, 2011, from Free Online Course Materials — USU OpenCourseWare Web site: http://ocw.usu.edu/Electrical_and_Computer_Engineering/Information_Theory/lecture4_1.htm. This work is licensed under a Creative Commons License.