How to make this code faster?

2 views (last 30 days)
MJ
MJ on 15 Oct 2014
Commented: Geoff Hayes on 15 Oct 2014
I have coded a formulation using for loops but it takes forever when my matrices are very large. Can anyone help me make this code more efficient without for loops?
I'm coding the formulations in page two (in the lower half page) of this document: http://www-stat.wharton.upenn.edu/~tcai/paper/Testing-2-Covariance-Matrices.pdf
And here are my codes:
if true
% Input: X, Y
p=size(X,2);
n1=size(X,1); n2=size(Y,1);
sigma1=cov(X); sigma2=cov(Y);
for i=1:p
for j=1:p
for k=1:n1
A(k,:)=((X(k,i)-mean(X(:,i))).*(X(k,j)-mean(X(:,j)))-sigma1(i,j)).^2;
end
Teta_bar1(i,j)=1./n1 .* sum(A);
for k=1:n2
B(k,:)=((Y(k,i)-mean(Y(:,i))).*(Y(k,j)-mean(Y(:,j)))-sigma2(i,j)).^2;
end
Teta_bar2(i,j)=1./n2 .* sum(B);
M(i,j)=((sigma1(i,j)-sigma2(i,j)).^2)./(Teta_bar1(i,j)/n1+Teta_bar2(i,j)/n2);
end
end
end

Accepted Answer

Geoff Hayes
Geoff Hayes on 15 Oct 2014
MJ - what does it takes forever when my matrices are very large mean? Seconds, minutes, hours? Please provide an example of a matrix (perhaps just the dimensions) along with the time it takes to evaluate the above.
Before looking at vectorizing your code, you may want to profile your code to see where the bottlenecks are.
A couple things that you may want to do (for your optimization) is to allocate or pre-size matrices before you start to use them. For example, M appears to be a pxp matrix, so why not size it as such before you start to use it
p=size(X,2);
n1=size(X,1);
n2=size(Y,1);
sigma1=cov(X);
sigma2=cov(Y);
% pre-size matrices
M = zeros(p,p);
Teta_bar1 = zeros(p,p);
Teta_bar2 = zeros(p,p);
A = ??
B = ??
Look also at your inner loops that iterate over k
for k=1:n1
A(k,:)=((X(k,i)-mean(X(:,i))).*(X(k,j)-mean(X(:,j)))-sigma1(i,j)).^2;
end
Note that you calculate
mean(X(:,i))
on each iteration of k even though this average does not depend on k at all. It doesn't even depend upon j, so the code is calculating the same average p*n1 times when it only needs to do this once. Similarly, the
mean(X(:,j)
is calculated n1 times when it is only needed once. Try changing this to
for i=1:p
meanXi = mean(X(:,i)); % <--- calculate the mean
for j=1:p
meanXj = mean(X(:,j)); % <--- calculate the mean
for k=1:n1
A(k,:)=((X(k,i)-meanXi).*(X(k,j)-meanXj)-sigma1(i,j)).^2;
end
You can apply the same logic to the second inner loop that iterates over k to avoid the duplicate mean calculations.
  4 Comments
MJ
MJ on 15 Oct 2014
Wonderful!! Just applied your instructions and got the results in ONLY 3 seconds! You made my day, sir! :) Thanks a lot!
Geoff Hayes
Geoff Hayes on 15 Oct 2014
Glad it worked out!

Sign in to comment.

More Answers (0)

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!