Got Questions? Get Answers.
Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
"Vectorize" code with nested for-loops

Subject: "Vectorize" code with nested for-loops

From: drgz

Date: 9 Aug, 2010 16:12:05

Message: 1 of 10

Hello,

quick question about vectorization of code.

Is there any way to simplify this with some clever functions I most likely don't know about? I'm attempting to model a nonlinear system by a Volterra series, but as I get to the higher nonlinear orders, the nested for-loops gets quite big in my brute-force attempt :)

So hopefully I'm lucky and this can be solved in a more elegant manner. If not, then I can live with that :)

Below I've pasted the part of code for the 9th order part of the model.

for n1 = 0:M(5)-1
    xn1 = delay_signal(x,n1);
    for n2 = n1:M(5)-1
        xn2 = delay_signal(x,n2);
        for n3 = n2:M(5)-1
            xn3 = delay_signal(x,n3);
            for n4 = n3:M(5)-1
                xn4 = delay_signal(x,n4);
                for n5 = n4:M(5)-1
                    xn5 = delay_signal(x,n5);
                    for n6 = 0:M(5)-1
                        xn6 = delay_signal(x,n6);
                        for n7 = n6:M(5)-1
                            xn7 = delay_signal(x,n7);
                            for n8 = n7:M(5)-1
                                xn8 = delay_signal(x,n8);
                                for n9 = n8:M(5)-1
                                    xn9 = delay_signal(x,n9);
                                    x9 = xn1.*xn2.*xn3.*xn4.*xn5.* ...
                                         conj(xn6).*conj(xn7).* ...
                                         conj(xn8).*conj(xn9);
                                    % Check if x9 exist in X9 matrix
                                    if checkExistence(x9,X9)
                                        continue;
                                    else
                                        n = n+1;
                                        X9(:,n) = x9;
                                    end
                                end
                            end
                        end
                    end
                end
            end
        end
    end
end

Here the different variables are:
- x: input "signal"
- M(): max. delay
- n: index for columns in X9 matrix
- X9: matrix holding the different signal products

- checkExistence: short function for checking whether or not the instantaneous vector already exist in the matrix (I assume there might be a built-in function for this as well in MATLAB?)
- delay_signal: just delay the x-vector by the given number of samples

Best regards,

D.

Subject: "Vectorize" code with nested for-loops

From: Jan Simon

Date: 9 Aug, 2010 16:56:04

Message: 2 of 10

Dear drgz,

> for n1 = 0:M(5)-1

Do not calculate "M(5) - 1" again and again. Assign it to an extra variable:
  M5_1 = M(5) - 1;
  for n1 = 0:M5_1
    ... etc
 
> for n6 = 0:M(5)-1
Ar you sure this is not:
                       for n6 = n5:M(5)-1
???

> for n9 = n8:M(5)-1
> xn9 = delay_signal(x,n9);
> x9 = xn1.*xn2.*xn3.*xn4.*xn5.* ...
> conj(xn6).*conj(xn7).* ...
> conj(xn8).*conj(xn9);

Do not calculate CONJ(xn6) etc. again and again and again. Store it after calculating it once in the corresponding loops.
Even xn1.*xn2.*...*conj(xn8) does not change between iterations of the n9 loop - so calculate it before the loop!

> if checkExistence(x9,X9)
> continue;
> else
> n = n+1;
> X9(:,n) = x9;
> end

Omit the unneeded CONTINUE. I hope, that you have preallocated X9? Otherwise the repeated growing is *very* slow in Matlab.
Because checkExistence is called very often, it is worth to post the code you are using. If X9 gets very big, a sorting might be helpful together with a binary search.

In general: Move as much calculations out of the inner loops as possible!

Kind regards, Jan

Subject: "Vectorize" code with nested for-loops

From: drgz

Date: 9 Aug, 2010 17:29:06

Message: 3 of 10

"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message
> Do not calculate "M(5) - 1" again and again. Assign it to an extra variable:
> M5_1 = M(5) - 1;
> for n1 = 0:M5_1
> ... etc

Done! Thanks for the tip!

"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message
> > for n6 = 0:M(5)-1
> Ar you sure this is not:
> for n6 = n5:M(5)-1
> ???

If I've understood the algorithm correct, for the Pth order kernel the first (P+1)/2 signal products should be "grouped" together or what to call it. So for the 9th order, n_i = n_(i-1):M_P-1 up to and including the 5th signal product, and then the rest of the signal products should be conjugated and "grouped" together (I can't think of any good word for describing this, but I hope you understand). Here in my explanation n_i is the number of delayed samples and M_P the max memory order.


> > for n9 = n8:M(5)-1
> > xn9 = delay_signal(x,n9);
> > x9 = xn1.*xn2.*xn3.*xn4.*xn5.* ...
> > conj(xn6).*conj(xn7).* ...
> > conj(xn8).*conj(xn9);
>
> Do not calculate CONJ(xn6) etc. again and again and again. Store it after calculating it once in the corresponding loops.
> Even xn1.*xn2.*...*conj(xn8) does not change between iterations of the n9 loop - so calculate it before the loop!

That makes sense as well, will change the code to take this into consideration.

>
> > if checkExistence(x9,X9)
> > continue;
> > else
> > n = n+1;
> > X9(:,n) = x9;
> > end
>
> Omit the unneeded CONTINUE. I hope, that you have preallocated X9? Otherwise the repeated growing is *very* slow in Matlab.
> Because checkExistence is called very often, it is worth to post the code you are using. If X9 gets very big, a sorting might be helpful together with a binary search.
>

I changed this to (check... basically returns a 1 if exist, or 0 if not)

if ~checkExistence(x9,X9,n)
 n = n+1;
 X9(:,n) = x9;
end

now. X9 is preallocated yes (otherwise this would take forever for large values of M ;)). As I try to not use too many coefficients in the modeling, the matrices for the different kernels don't get too large (at most N-by-50 I would assume). However, the last K-n columns (K = num. of columns in X9) will always be zero-vectors until the columns have been filled up, so I now only pass the columns of the X9 matrix that actually got non-zero column vectors. But if I can improve the code, hit me. I'm always open for improvements :)

The code for the other function is (after changing it to be compatible with the changes mentioned above)

function y = checkExistence(xn,Xn,n)

for m = 1:n
    if xn == Xn(:,m)
        y = 1;
        return;
    end
end
y = 0;
end

By only passing the first n columns of Xn I assume this will speed up the code slightly as it now doesn't need to check all the columns.


> In general: Move as much calculations out of the inner loops as possible!

I should have known that, but sometimes I need a kick in the ass to do the obvious!

Thanks for the comments!

Subject: "Vectorize" code with nested for-loops

From: Jan Simon

Date: 9 Aug, 2010 19:48:05

Message: 4 of 10

Dear drgz,

> for m = 1:n
> if xn == Xn(:,m)
> y = 1;
> return;
> end
> end
> y = 0;
> end
>
> By only passing the first n columns of Xn I assume this will speed up the code slightly as it now doesn't need to check all the columns.

This check might be faster, if the values of X9 have a large distribution: Check the complete vector only, if the first element matchs. And I assume ISEQUAL to be faster than comparing the complete vector and use the implicite ALL from the IF command:
  q = strfind(xn(1), Xn(1, :)); % Works for DOUBLEs also
  for m = q
     if isequal(xn, Xn(:,m))
         y = 1;
         return;
     end
 end
   y = 0;
 end
But the speed depends on the power of Matlab's JIT compiler.
Please start the profiler. If this function is a bottleneck, it is easy to create a Mex-function for this task.

> I should have known that, but sometimes I need a kick in the ass to do the obvious!

Sorry, I'd never kick you. I'm pazifist.
But I like to squeeze Matlab source code.

Kind regards, Jan

Subject: "Vectorize" code with nested for-loops

From: Dragan

Date: 9 Aug, 2010 20:47:06

Message: 5 of 10

"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message <i3pm1l$hai$1@fred.mathworks.com>...
> This check might be faster, if the values of X9 have a large distribution: Check the complete vector only, if the first element matchs. And I assume ISEQUAL to be faster than comparing the complete vector and use the implicite ALL from the IF command:
> q = strfind(xn(1), Xn(1, :)); % Works for DOUBLEs also
> for m = q
> if isequal(xn, Xn(:,m))
> y = 1;
> return;
> end
> end
> y = 0;
> end
> But the speed depends on the power of Matlab's JIT compiler.
> Please start the profiler. If this function is a bottleneck, it is easy to create a Mex-function for this task.
>
> > I should have known that, but sometimes I need a kick in the ass to do the obvious!
>
> Sorry, I'd never kick you. I'm pazifist.
> But I like to squeeze Matlab source code.
>
> Kind regards, Jan

Thanks for the help Jan! That piece of code did indeed speed up the code!

With original code: http://folk.ntnu.no/mitrevsk/temp/before_change.PNG
With your code: http://folk.ntnu.no/mitrevsk/temp/after_change.PNG

For this simulation I used a greater memory depth order than I probably will need; for all the kernels combined, the size of the resulting matrix ended up at almost 16500-by-500.

I also did some more simulations with my complete implementation of the model class (I though that doing this OO could be nice for some reason), and it seems like the estimation of the coefficients that is the most time-consuming operation together with the QR factorization if/when the matrix is rank deficient, which isn't a big surprise when the size of the data matrix grows big.

So as for now I think the help you've provided me with will do the trick for my code. Again thanks a lot for your time and help, it is really appreciated!

Best regards,
D.

Subject: "Vectorize" code with nested for-loops

From: Andy

Date: 9 Aug, 2010 20:59:05

Message: 6 of 10

Are you only using checkExistence here?:

if ~checkExistence(x9,X9,n)
 n = n+1;
 X9(:,n) = x9;
end

If so, it might save a lot of time in function-calling overhead to just inline your check:

if ~isequal(xn,Xn(:,1:n))
 n = n+1;
 X9(:,n) = x9;
end

Subject: "Vectorize" code with nested for-loops

From: Dragan

Date: 10 Aug, 2010 07:42:04

Message: 7 of 10

"Andy " <myfakeemailaddress@gmail.com> wrote in message <i3pq6p$d7a$1@fred.mathworks.com>...
> Are you only using checkExistence here?:
>
> if ~checkExistence(x9,X9,n)
> n = n+1;
> X9(:,n) = x9;
> end
>
> If so, it might save a lot of time in function-calling overhead to just inline your check:
>
> if ~isequal(xn,Xn(:,1:n))
> n = n+1;
> X9(:,n) = x9;
> end

I could try that as well and see how it compares to the suggestion from Jan, thanks for the tip.

Best regards,

D.

Subject: "Vectorize" code with nested for-loops

From: Jan Simon

Date: 10 Aug, 2010 15:10:22

Message: 8 of 10

Dear Andy,

> if ~isequal(xn,Xn(:,1:n))

While "xn" is a vector, "Xn(:, 1:n)" is a matrix. So you have to include a loop in addition.
If the matrix gets really large, apply the STRFIND method in one or more further levels.

Kind regards, Jan

Subject: "Vectorize" code with nested for-loops

From: Andy

Date: 10 Aug, 2010 15:25:20

Message: 9 of 10

"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message <i3rq4u$q9j$1@fred.mathworks.com>...
> Dear Andy,
>
> > if ~isequal(xn,Xn(:,1:n))
>
> While "xn" is a vector, "Xn(:, 1:n)" is a matrix. So you have to include a loop in addition.
> If the matrix gets really large, apply the STRFIND method in one or more further levels.
>
> Kind regards, Jan

Sorry, I missed that. There is probably a way to inline the check using repmat or arrayfun, but then you incur the overhead penalties that we were trying to avoid anyway. (Although perhaps builtin functions will be faster. It may be worth a try.)

Subject: "Vectorize" code with nested for-loops

From: Dragan

Date: 10 Aug, 2010 16:18:04

Message: 10 of 10

"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message <i3rq4u$q9j$1@fred.mathworks.com>...
> Dear Andy,
>
> > if ~isequal(xn,Xn(:,1:n))
>
> While "xn" is a vector, "Xn(:, 1:n)" is a matrix. So you have to include a loop in addition.
> If the matrix gets really large, apply the STRFIND method in one or more further levels.
>
> Kind regards, Jan

That was my conclusion when I tried this; since I had to add an additional loop this did not improve the speed compared to your suggestion. As for now I'll stick to your code and see if I (maybe) can improve the other bottlenecks in the rest of the code. I'm not sure if its worth trying to port the QR factorization (or solving the normal equations) to C++/MEX or not; I mean, the built-in code (QR or mldivide) is quite fast IMO, but it's definitely the part of the code that takes the longest time.

And now that I'm writing here, maybe I could ask for some more tips. I would assume the code could be simplified even further if I could get it on the form as shown in http://folk.ntnu.no/mitrevsk/temp/kron_prod.PNG (just a snip from an article that is referring to a book by V. John Mathews).

Would it be correct to use

X5 = zeros(Nx,min(size(X1))*min(size(X3)));

for n = 1:Nx
      X5(n,:) = kron(X1(n,:),X3(n,:));
end

and so on for the higher order terms, to compose the different input vectors? And if so, any way to exchange the loop with some fancy function?

Tags for this Thread

No tags are associated with this thread.

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us