mutual information calculation for binary sequence
How do I calculate the mutual information between two binary sequences? For example: x = randn(1,100) > 0.5; y = rand(1,100) > 0.2;
Answers (2)
Roger Stafford
on 26 Jan 2014 (edited)
If you mean the theoretical mutual information of the two random variables you have defined, it would of course be zero, if we assume that MATLAB's 'rand' and 'randn' generate mutually independent sequences. That is because the joint probabilities will always be the product of the respective marginal probabilities, rendering the logarithm factor identically zero.
On the other hand, if you want an approximation based on two sample binary sequences you have obtained, then simply count the four cases and make the necessary summation as per the definition of mutual information (e.g., as in http://en.wikipedia.org/wiki/Mutual_information).
n = 100;
J = [sum(~x & ~y), sum(~x & y); sum(x & ~y), sum(x & y)]/n;  % 2x2 joint distribution
T = J.*log2(J./(sum(J,2)*sum(J,1)));  % Assuming you use "bit" units
T(J == 0) = 0;  % convention: 0*log2(0) = 0 (avoids NaN when a cell count is zero)
MI = sum(T(:));
In your 'rand'/'randn' example with only 100 values, 'MI' can be expected to deviate appreciably from zero, but as n is made larger, you should see it trend towards zero.
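For readers following along outside MATLAB, here is a minimal sketch of the same count-and-sum estimate in Python with NumPy (the function name and seeding are my own, not from the thread):

```python
import numpy as np

def mutual_information_bits(x, y):
    """Plug-in estimate of mutual information (in bits) for two binary sequences."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    # 2x2 joint distribution, mirroring Roger's matrix J
    J = np.array([[np.sum(~x & ~y), np.sum(~x & y)],
                  [np.sum(x & ~y),  np.sum(x & y)]], dtype=float) / x.size
    px = J.sum(axis=1, keepdims=True)   # marginal distribution of x
    py = J.sum(axis=0, keepdims=True)   # marginal distribution of y
    nz = J > 0                          # convention: 0 * log 0 = 0
    return float(np.sum(J[nz] * np.log2(J[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
x = rng.standard_normal(100) > 0.5
y = rng.random(100) > 0.2
mi = mutual_information_bits(x, y)   # small but typically nonzero at n = 100
```

Because the estimate is a KL divergence between the empirical joint and the product of its own marginals, it is always nonnegative; for two identical sequences it equals the entropy of the sequence.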
3 Comments
Roger Stafford
on 27 Jan 2014
As I have said, using sampled sequences of random variables can only yield an approximation of an underlying statistical quantity. If the process is not stationary, it would require a larger sample to accurately reveal such statistics. This applies to all statistical quantities such as mean values, covariance, skewness, etc., as well as the mutual information you are seeking. I do not know how long your samples need to be for this purpose in binary phase-shift keying.
With that said, the code I gave you in the second paragraph would still be valid for binary random variables even if they are not mutually independent. The amount by which the mutual information differs from zero is one kind of measure of the degree to which they are dependent.
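That claim is easy to check numerically. The Python sketch below (my own construction, using the same count-based formula) passes a uniform binary x through a symmetric bit-flipping channel: with flip probability 0.5 the output is independent of the input and the estimate sits near zero, and it climbs toward 1 bit as the flip probability shrinks:

```python
import numpy as np

def mi_bits(x, y):
    """Count-based mutual information estimate (bits) for binary arrays."""
    J = np.array([[np.mean(~x & ~y), np.mean(~x & y)],
                  [np.mean(x & ~y),  np.mean(x & y)]])
    px = J.sum(axis=1, keepdims=True)
    py = J.sum(axis=0, keepdims=True)
    nz = J > 0                          # convention: 0 * log 0 = 0
    return float(np.sum(J[nz] * np.log2(J[nz] / (px @ py)[nz])))

rng = np.random.default_rng(1)
n = 100_000
x = rng.random(n) > 0.5
results = {}
for p_flip in (0.5, 0.25, 0.05):
    y = x ^ (rng.random(n) < p_flip)    # y copies x except where a flip occurs
    results[p_flip] = mi_bits(x, y)
# results[0.5] is near 0 (independent); results[0.05] is near 1 - H(0.05) ~ 0.71 bits
```

This matches the theoretical value for a binary symmetric channel with uniform input, I = 1 - H(p_flip) bits, where H is the binary entropy function.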
Youssef Khmou
on 27 Jan 2014 (edited)
Hi, I think Roger gave the clue. Anyway, mutual information is based on entropy, which can be computed in any logarithmic base: in physics the natural (Napierian) logarithm is used (e.g., Gibbs and von Neumann entropy), while in communications base 2 is used. The simplest formula for the mutual information between two Gaussian variables is log(sqrt(1/(1-r^2))), where r is the correlation coefficient between x and y:
N = 1000;
x = randn(N,1);
y = randn(N,1);
C = corrcoef([x y]);
r = C(1,2);                    % or C(2,1)
Ixy = log(sqrt(1/(1-r^2)));    % natural log, so the result is in nats
Note: Ixy = 0 when r = 0, and it grows without bound as |r| approaches 1.
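A quick numerical check of that Gaussian formula (a Python sketch; the construction of y with a chosen true correlation rho is my own): the value is near zero for uncorrelated samples and increases steeply as |r| approaches 1, since log(sqrt(1/(1-r^2))) diverges there:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1000
x = rng.standard_normal(N)
vals = {}
for rho in (0.0, 0.9, 0.999):
    # y has true correlation rho with x by construction
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(N)
    r = np.corrcoef(x, y)[0, 1]
    vals[rho] = np.log(np.sqrt(1 / (1 - r**2)))   # MI in nats, same formula
# vals[0.0] is near 0; vals grow steeply as rho -> 1, with no upper bound
```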
2 Comments
Roger Stafford
on 28 Jan 2014
Anamika has not said the two variables are Gaussian - in fact, since they presumably have only discrete values they can't be Gaussian - and apparently wants a general method that would apply to any (binary?) bivariate distribution.
Youssef Khmou
on 28 Jan 2014
I think the PDFs should be known a priori; otherwise, using the function 'entropy' is valid...