INFORMATION THEORY

We consider a random variable X over a (discrete) space X. The probability distribution of X, p(X), can be viewed as a finite measure over X. Indeed

∑_x p(x) = 1

The entropy (or information measure) of a subset A of X is

H(A) = - ∑_{x ∈ A} p(x) ln p(x)
In particular H(X), obtained by taking A to be the whole space, is the entropy of the distribution of the random variable X.
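
A minimal sketch in Python of this definition (not part of the original notes; the helper name entropy is an illustrative choice):

import math

def entropy(p):
    # Entropy, in nats, of an iterable of probabilities summing to 1;
    # zero-probability outcomes contribute nothing.
    return -sum(px * math.log(px) for px in p if px > 0.0)

print(entropy([0.5, 0.5]))   # fair coin: ln 2 ~ 0.693
print(entropy([0.9, 0.1]))   # biased coin: ~0.325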

For two random variables X and Y we can construct the joint entropy H(X,Y), the conditional entropy H(X|Y) = E_Y[ H(X|Y=y) ], and the information measure (mutual information) I(X;Y), where

H(X|Y=y) = - ∑_x p(x|y) ln p(x|y)
I(X;Y) = H(X) - E_Y[ H(X|Y=y) ]
       = H(X) + H(Y) - H(X,Y)

Notice that the information measure is symmetric in X and Y. Furthermore, if X and Y are independent, so that p(x|y) = p(x) and p(x,y) = p(x)p(y), the conditional entropy of X given Y is equal to the entropy of X and the information measure I(X;Y) = 0. Also, for independent variables the joint entropy is the sum of the entropies of the two variables.
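
A sketch of these two-variable quantities, assuming the joint distribution is given as a Python dict {(x, y): p(x, y)} (the helper names are illustrative, not from the original notes):

import math
from collections import defaultdict

def H(p):
    # Entropy, in nats, of a dict mapping outcomes to probabilities.
    return -sum(v * math.log(v) for v in p.values() if v > 0.0)

def marginal(pxy, axis):
    # Marginal of the joint distribution along coordinate 0 (X) or 1 (Y).
    m = defaultdict(float)
    for k, v in pxy.items():
        m[k[axis]] += v
    return dict(m)

def conditional_entropy(pxy):
    # H(X|Y) = H(X,Y) - H(Y)
    return H(pxy) - H(marginal(pxy, 1))

def information(pxy):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return H(marginal(pxy, 0)) + H(marginal(pxy, 1)) - H(pxy)

# Dependent pair: Y is a noisy copy of X, so I(X;Y) > 0.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
print(information(pxy), conditional_entropy(pxy))

# Independent pair: p(x,y) = p(x) p(y), so I(X;Y) = 0 (up to rounding)
# and H(X,Y) = H(X) + H(Y).
p_ind = {(x, y): px * py for x, px in ((0, 0.5), (1, 0.5))
                         for y, py in ((0, 0.3), (1, 0.7))}
print(information(p_ind))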




The entropy and information measure are amenable to a measure-theoretic point of view. To any random variable we can associate a set. Next we construct the sigma-algebra generated by these sets (i.e., we also consider all possible unions, intersections, and complements). For two random variables this is rather small, with eight elements. For N random variables it grows considerably.
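
To get a feel for how fast this grows, here is a small sketch of mine (not from the original notes), under the convention that the universal set is the union of the associated sets, as in Yeung's construction: the field generated by N sets then has 2^N - 1 atoms and therefore 2^(2^N - 1) elements.

from itertools import product

def count_atoms(n):
    # Atoms are the intersections of each set or its complement;
    # the all-complements cell is empty when the universal set is the union.
    return sum(1 for signs in product((True, False), repeat=n) if any(signs))

for n in (2, 3, 4, 5):
    atoms = count_atoms(n)
    print(n, atoms, 2 ** atoms)   # N, number of atoms, size of the field
# prints: 2 3 8 / 3 7 128 / 4 15 32768 / 5 31 2147483648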

Finally we define a measure over this sigma-algebra, by

m(X) = H(X)
m(X ∪ Y) = H(X,Y)
m(X ∩ Y) = I(X;Y)
m(X - Y) = m(X ∩ Y^c) = H(X|Y)
This point of view allows a graphical representation, via Venn diagrams, of many important information-theoretic relations. For example, the conditional information measure corresponds to m(X ∩ Y ∩ Z^c) and can be expressed as
I(X;Y|Z) = H(X|Z) - H(X|Y,Z)
         = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
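
As a numerical sanity check (a sketch of mine, not part of the original notes), the identity can be verified on a toy joint distribution p(x, y, z), comparing the joint-entropy expression with the expectation that defines I(X;Y|Z) directly:

import math
import itertools
from collections import defaultdict

def H(p):
    # Entropy, in nats, of a dict mapping outcomes to probabilities.
    return -sum(v * math.log(v) for v in p.values() if v > 0.0)

def marginal(p, axes):
    # Marginal distribution over the given coordinate positions.
    m = defaultdict(float)
    for k, v in p.items():
        m[tuple(k[a] for a in axes)] += v
    return dict(m)

# Toy chain: Z is a fair bit, Y a noisy copy of Z, X a noisy copy of Y.
p = {}
for x, y, z in itertools.product((0, 1), repeat=3):
    p[(x, y, z)] = 0.5 * (0.9 if y == z else 0.1) * (0.8 if x == y else 0.2)

# Joint-entropy form: H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z).
rhs = H(marginal(p, (0, 2))) + H(marginal(p, (1, 2))) - H(p) - H(marginal(p, (2,)))

# Direct form: I(X;Y|Z) = sum p(x,y,z) ln[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ].
pz, pxz, pyz = marginal(p, (2,)), marginal(p, (0, 2)), marginal(p, (1, 2))
lhs = sum(v * math.log(v * pz[(z,)] / (pxz[(x, z)] * pyz[(y, z)]))
          for (x, y, z), v in p.items() if v > 0.0)

print(lhs, rhs)   # the two agree up to floating-point error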



C.E. Shannon, "A mathematical theory of communication", Bell Syst. Tech. J. 27, 1948, 379-423.
R.W. Yeung, "A new outlook on Shannon's information measures", IEEE Trans. Inform. Theory 37, 1991, 466-474.


Marco Corvi