Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
How to avoid for loops in 3 vector function.

Subject: How to avoid for loops in 3 vector function.

From: Lan

Date: 2 May, 2011 16:26:06

Message: 1 of 14

I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.

V0, M0, Y are n x 1 vectors (n=2000)

for i=1:length(V0)
    v=V0(i);
    b = (-3:0.01:3);
    l = length(b);
    vec0 = 2*Y - ones(n,1);% n x 1 vector
    mV0=repmat(V0,1,l); % n x n matrix
    t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
    vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
    vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
    sn0=vec0'*(vec1.*vec2);
    sn=1/(n*h)*sn0; % 1 x l array
    p = find(sn>=max(sn));
    betaH(i)=b(p(1));
end

This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?

Subject: How to avoid for loops in 3 vector function.

From: Florin Neacsu

Date: 2 May, 2011 18:11:08

Message: 2 of 14

"Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
>
> V0, M0, Y are n x 1 vectors (n=2000)
>
> for i=1:length(V0)
> v=V0(i);
> b = (-3:0.01:3);
> l = length(b);
> vec0 = 2*Y - ones(n,1);% n x 1 vector
> mV0=repmat(V0,1,l); % n x n matrix
> t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> sn0=vec0'*(vec1.*vec2);
> sn=1/(n*h)*sn0; % 1 x l array
> p = find(sn>=max(sn));
> betaH(i)=b(p(1));
> end
>
> This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?

Hello,

There are many operations you could pull out of the for loop :


> b = (-3:0.01:3);
> l = length(b);
> vec0 = 2*Y - ones(n,1);% n x 1 vector
> mV0=repmat(V0,1,l); % n x n matrix

and also

>t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix

only uses the v, so you could do (ouside the for loop) something like

>t_temp=repmat(M0, 1, l)+)+repmat(b,n,1);
>my_ones=ones(n,l);

and inside the loop

>t=t_temp+v*my_ones;

Try changing these lines and see if it gets fast enough for your needs.

Regards,
Florin

Subject: How to avoid for loops in 3 vector function.

From: Lan

Date: 2 May, 2011 22:41:04

Message: 3 of 14

Thanks! I did what you suggested, put most of the code outside the loop and create a t_ temp. But the code still runs very slowly.... Is there another way to avoid the loop completely?

"Florin Neacsu" wrote in message <ipms3s$mcs$1@fred.mathworks.com>...
> "Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> > I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
> >
> > V0, M0, Y are n x 1 vectors (n=2000)
> >
> > for i=1:length(V0)
> > v=V0(i);
> > b = (-3:0.01:3);
> > l = length(b);
> > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > mV0=repmat(V0,1,l); % n x n matrix
> > t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> > vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> > vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> > sn0=vec0'*(vec1.*vec2);
> > sn=1/(n*h)*sn0; % 1 x l array
> > p = find(sn>=max(sn));
> > betaH(i)=b(p(1));
> > end
> >
> > This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?
>
> Hello,
>
> There are many operations you could pull out of the for loop :
>
>
> > b = (-3:0.01:3);
> > l = length(b);
> > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > mV0=repmat(V0,1,l); % n x n matrix
>
> and also
>
> >t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
>
> only uses the v, so you could do (ouside the for loop) something like
>
> >t_temp=repmat(M0, 1, l)+)+repmat(b,n,1);
> >my_ones=ones(n,l);
>
> and inside the loop
>
> >t=t_temp+v*my_ones;
>
> Try changing these lines and see if it gets fast enough for your needs.
>
> Regards,
> Florin

Subject: How to avoid for loops in 3 vector function.

From: Florin Neacsu

Date: 2 May, 2011 23:31:05

Message: 4 of 14

"Lan" wrote in message <ipnbu0$ku2$1@fred.mathworks.com>...
> Thanks! I did what you suggested, put most of the code outside the loop and create a t_ temp. But the code still runs very slowly.... Is there another way to avoid the loop completely?
>
> "Florin Neacsu" wrote in message <ipms3s$mcs$1@fred.mathworks.com>...
> > "Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> > > I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
> > >
> > > V0, M0, Y are n x 1 vectors (n=2000)
> > >
> > > for i=1:length(V0)
> > > v=V0(i);
> > > b = (-3:0.01:3);
> > > l = length(b);
> > > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > > mV0=repmat(V0,1,l); % n x n matrix
> > > t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> > > vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> > > vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> > > sn0=vec0'*(vec1.*vec2);
> > > sn=1/(n*h)*sn0; % 1 x l array
> > > p = find(sn>=max(sn));
> > > betaH(i)=b(p(1));
> > > end
> > >
> > > This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?
> >
> > Hello,
> >
> > There are many operations you could pull out of the for loop :
> >
> >
> > > b = (-3:0.01:3);
> > > l = length(b);
> > > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > > mV0=repmat(V0,1,l); % n x n matrix
> >
> > and also
> >
> > >t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> >
> > only uses the v, so you could do (ouside the for loop) something like
> >
> > >t_temp=repmat(M0, 1, l)+)+repmat(b,n,1);
> > >my_ones=ones(n,l);
> >
> > and inside the loop
> >
> > >t=t_temp+v*my_ones;
> >
> > Try changing these lines and see if it gets fast enough for your needs.
> >
> > Regards,
> > Florin

Hi,

Have you tried using profiler to identify the botlenecks ?

Another solution would be to tell us what you are trying to do and maybe someone could suggest a different approach.

Regards,
Florin

Subject: How to avoid for loops in 3 vector function.

From: Roger Stafford

Date: 3 May, 2011 18:22:21

Message: 5 of 14

"Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
>
> V0, M0, Y are n x 1 vectors (n=2000)
>
> for i=1:length(V0)
> v=V0(i);
> b = (-3:0.01:3);
> l = length(b);
> vec0 = 2*Y - ones(n,1);% n x 1 vector
> mV0=repmat(V0,1,l); % n x n matrix
> t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> sn0=vec0'*(vec1.*vec2);
> sn=1/(n*h)*sn0; % 1 x l array
> p = find(sn>=max(sn));
> betaH(i)=b(p(1));
> end
>
> This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?
- - - - - - - - - - -
  You have defined vector b as the 601 discrete elements of the vector -3:0.01:3, which makes me think that in the ideal model you are working on, you would very likely prefer to consider the entire infinitude of all points within the interval -3<=b<=3 but have settled for the discrete case only because of practical computational considerations. If that guess is correct, it should be possible to revert back to this ideal continuum model and at the same time greatly increase the efficiency of your computation, by working with interval bounds in b rather than predefined discrete values.

  Your max(sn) is derived from the maximum possible value of 2*Y(j)-1 but restricted to some subset of possible index values j depending on which element v = V0(i) has previously been selected. For each v = V0(i) your vec2 defines an interval about v that all V0(j) must fall in to be considered in finding the maximum of sn, which puts a restriction on j. For each corresponding M0(j) with j satisfying this restriction, it is possible to define an interval also depending on v that b must lie outside of (assuming h is a positive scalar). If these b intervals also overlap the interval -3<=b<=3 anywhere, then the corresponding 2*Y(j)-1 can be considered in finding a maximum for sn.

  What I have just described is an order(n^2) algorithm which it should be quite easy to implement, and I would be most surprised if it weren't vastly superior in speed to your present O(n^2*L) algorithm (where L = 601.) There should be no need to individually consider the numerous discrete values of b you have defined when the problem is really one of computing b interval endpoints appropriately.

  If this idea appeals to you, I am sure that one of us cssm contributors can help you implement it in matlab code if you like.

Roger Stafford

Subject: How to avoid for loops in 3 vector function.

From: Roger Stafford

Date: 3 May, 2011 19:49:05

Message: 6 of 14

"Roger Stafford" wrote in message <ipph4t$8g9$1@fred.mathworks.com>...
> "Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> > I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
> >
> > V0, M0, Y are n x 1 vectors (n=2000)
> >
> > for i=1:length(V0)
> > v=V0(i);
> > b = (-3:0.01:3);
> > l = length(b);
> > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > mV0=repmat(V0,1,l); % n x n matrix
> > t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> > vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> > vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> > sn0=vec0'*(vec1.*vec2);
> > sn=1/(n*h)*sn0; % 1 x l array
> > p = find(sn>=max(sn));
> > betaH(i)=b(p(1));
> > end
> >
> > This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?
> - - - - - - - - - - -
> You have defined vector b as the 601 discrete elements of the vector -3:0.01:3, which makes me think that in the ideal model you are working on, you would very likely prefer to consider the entire infinitude of all points within the interval -3<=b<=3 but have settled for the discrete case only because of practical computational considerations. If that guess is correct, it should be possible to revert back to this ideal continuum model and at the same time greatly increase the efficiency of your computation, by working with interval bounds in b rather than predefined discrete values.
> .......
- - - - - - - -
  I withdraw that suggestion I made earlier in this thread. In my haste I misinterpreted the vec2 vector as being entirely logical with only 0 and 1 values. Also I didn't interpret sn0 correctly. I am wondering if your 'h' is somehow related to the intervals between values in vector 'b'.

  I add my voice to Florin's and urge you to describe your problem to us in more descriptive terminology if we are to effectively help you. It is difficult to get a good grasp on the problem from the abstract quantities you have defined.

Roger Stafford

Subject: How to avoid for loops in 3 vector function.

From: Lan

Date: 4 May, 2011 11:10:05

Message: 7 of 14

"Roger Stafford" wrote in message <ippm7h$8au$1@fred.mathworks.com>...
> "Roger Stafford" wrote in message <ipph4t$8g9$1@fred.mathworks.com>...
> > "Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> > > I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
> > >
> > > V0, M0, Y are n x 1 vectors (n=2000)
> > >
> > > for i=1:length(V0)
> > > v=V0(i);
> > > b = (-3:0.01:3);
> > > l = length(b);
> > > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > > mV0=repmat(V0,1,l); % n x n matrix
> > > t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> > > vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> > > vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> > > sn0=vec0'*(vec1.*vec2);
> > > sn=1/(n*h)*sn0; % 1 x l array
> > > p = find(sn>=max(sn));
> > > betaH(i)=b(p(1));
> > > end
> > >
> > > This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?
> > - - - - - - - - - - -
> > You have defined vector b as the 601 discrete elements of the vector -3:0.01:3, which makes me think that in the ideal model you are working on, you would very likely prefer to consider the entire infinitude of all points within the interval -3<=b<=3 but have settled for the discrete case only because of practical computational considerations. If that guess is correct, it should be possible to revert back to this ideal continuum model and at the same time greatly increase the efficiency of your computation, by working with interval bounds in b rather than predefined discrete values.
> > .......
> - - - - - - - -
> I withdraw that suggestion I made earlier in this thread. In my haste I misinterpreted the vec2 vector as being entirely logical with only 0 and 1 values. Also I didn't interpret sn0 correctly. I am wondering if your 'h' is somehow related to the intervals between values in vector 'b'.
>
> I add my voice to Florin's and urge you to describe your problem to us in more descriptive terminology if we are to effectively help you. It is difficult to get a good grasp on the problem from the abstract quantities you have defined.
>
> Roger Stafford


Thanks for all advice. I will try to describe the whole problem as clearly as possible.

This is the problem of simulating the binary response model. My binary response model is Y(j) = ((x0(j)+b0(j))>0). Remark: this is a scalar example , for the future, I would like to model Y(j)= (x(j)*b(j)>0) ,0 <j<=n, where x(j), and b(j) are vectors of dimension d. and * is a scalar product. But for now let's just focus on the concrete scalar example.

I am first generating multivariate random variables b0, z0, and x0.
x0 is endogenous, z0 is an instrument. The goal is to estimate b from the sample, by maximizing Sn function, defined later on, locally.

mu = [0 0 0];
SIGMA = [1 0.7 0; 0.7 1 0.6; 0 0.6 1];
n=2000;
M = mvnrnd(mu,SIGMA,n);
b0 = M(:,1);
x0 = M(:,2);
z0 = M(:,3);

We have a relation: x0 = M0(z) + V0(x, z). z is not correlated with b.

I compute M0

for n=1:length(z0)
    X=-100:2/100:100;
    Y=X.*normpdf(X,(0.6*z0(n)),0.64);
    M0(n) = trapz(X,Y);
end

V0 = x0-M0;

Y = (x0+b0 > 0);
betaH = zeros(n,1);

b = (-3:0.1:3);
l = length(b);
vec0 = 2*Y - ones(n,1);% n x 1 array
mV0=repmat(V0,1,l); % n x n matrix
t=zeros(n,l);
vec1=zeros(n,l);
vec2=zeros(n,l);
t_temp=repmat(M0, 1, l)+repmat(b,n,1);
my_ones=ones(n,l);


for i=1:length(V0)
    v=V0(i);
    t=t_temp+v*my_ones;
    % t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
    vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
    vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
    sn0=vec0'*(vec1.*vec2);
    Sn=1/(n*h)*sn0; % 1 x l array
    p = find(Sn>=max(sn));
    betaH(i)=b(p(1));
end


vec2 represents a matrix of integrated kernel function evaluated at every b.
vec1 is a kernel function.

Maximizing Sn(v,b) over b for a given v, for every v, will yield a relation betaH(v). averaging the calculated betaH will give an estimate of 'true' b. For Now, I would like to be able to plot betaH vs V0.

I hope the above made sense somehow. Let me know if something is confusing.

Lan

Subject: How to avoid for loops in 3 vector function.

From: Florin Neacsu

Date: 4 May, 2011 20:40:25

Message: 8 of 14

"Lan" wrote in message <iprc6c$67r$1@fred.mathworks.com>...
> "Roger Stafford" wrote in message <ippm7h$8au$1@fred.mathworks.com>...
> > "Roger Stafford" wrote in message <ipph4t$8g9$1@fred.mathworks.com>...
> > > "Lan" wrote in message <ipmluu$l21$1@fred.mathworks.com>...
> > > > I have a problem with a function Sn(M0, V0, v, b). M0, V0 are 2000x1 given vectors. I want to maximize Sn over b, for each different v belonging to V0.
> > > >
> > > > V0, M0, Y are n x 1 vectors (n=2000)
> > > >
> > > > for i=1:length(V0)
> > > > v=V0(i);
> > > > b = (-3:0.01:3);
> > > > l = length(b);
> > > > vec0 = 2*Y - ones(n,1);% n x 1 vector
> > > > mV0=repmat(V0,1,l); % n x n matrix
> > > > t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> > > > vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> > > > vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> > > > sn0=vec0'*(vec1.*vec2);
> > > > sn=1/(n*h)*sn0; % 1 x l array
> > > > p = find(sn>=max(sn));
> > > > betaH(i)=b(p(1));
> > > > end
> > > >
> > > > This runs very slowly. How do I vectorize the loop? Or is there any other way to make the function run faster?
> > > - - - - - - - - - - -
> > > You have defined vector b as the 601 discrete elements of the vector -3:0.01:3, which makes me think that in the ideal model you are working on, you would very likely prefer to consider the entire infinitude of all points within the interval -3<=b<=3 but have settled for the discrete case only because of practical computational considerations. If that guess is correct, it should be possible to revert back to this ideal continuum model and at the same time greatly increase the efficiency of your computation, by working with interval bounds in b rather than predefined discrete values.
> > > .......
> > - - - - - - - -
> > I withdraw that suggestion I made earlier in this thread. In my haste I misinterpreted the vec2 vector as being entirely logical with only 0 and 1 values. Also I didn't interpret sn0 correctly. I am wondering if your 'h' is somehow related to the intervals between values in vector 'b'.
> >
> > I add my voice to Florin's and urge you to describe your problem to us in more descriptive terminology if we are to effectively help you. It is difficult to get a good grasp on the problem from the abstract quantities you have defined.
> >
> > Roger Stafford
>
>
> Thanks for all advice. I will try to describe the whole problem as clearly as possible.
>
> This is the problem of simulating the binary response model. My binary response model is Y(j) = ((x0(j)+b0(j))>0). Remark: this is a scalar example , for the future, I would like to model Y(j)= (x(j)*b(j)>0) ,0 <j<=n, where x(j), and b(j) are vectors of dimension d. and * is a scalar product. But for now let's just focus on the concrete scalar example.
>
> I am first generating multivariate random variables b0, z0, and x0.
> x0 is endogenous, z0 is an instrument. The goal is to estimate b from the sample, by maximizing Sn function, defined later on, locally.
>
> mu = [0 0 0];
> SIGMA = [1 0.7 0; 0.7 1 0.6; 0 0.6 1];
> n=2000;
> M = mvnrnd(mu,SIGMA,n);
> b0 = M(:,1);
> x0 = M(:,2);
> z0 = M(:,3);
>
> We have a relation: x0 = M0(z) + V0(x, z). z is not correlated with b.
>
> I compute M0
>
> for n=1:length(z0)
> X=-100:2/100:100;
> Y=X.*normpdf(X,(0.6*z0(n)),0.64);
> M0(n) = trapz(X,Y);
> end
>
> V0 = x0-M0;
>
> Y = (x0+b0 > 0);
> betaH = zeros(n,1);
>
> b = (-3:0.1:3);
> l = length(b);
> vec0 = 2*Y - ones(n,1);% n x 1 array
> mV0=repmat(V0,1,l); % n x n matrix
> t=zeros(n,l);
> vec1=zeros(n,l);
> vec2=zeros(n,l);
> t_temp=repmat(M0, 1, l)+repmat(b,n,1);
> my_ones=ones(n,l);
>
>
> for i=1:length(V0)
> v=V0(i);
> t=t_temp+v*my_ones;
> % t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> sn0=vec0'*(vec1.*vec2);
> Sn=1/(n*h)*sn0; % 1 x l array
> p = find(Sn>=max(sn));
> betaH(i)=b(p(1));
> end
>
>
> vec2 represents a matrix of integrated kernel function evaluated at every b.
> vec1 is a kernel function.
>
> Maximizing Sn(v,b) over b for a given v, for every v, will yield a relation betaH(v). averaging the calculated betaH will give an estimate of 'true' b. For Now, I would like to be able to plot betaH vs V0.
>
> I hope the above made sense somehow. Let me know if something is confusing.
>
> Lan

Hi,

I do not have any suggestions on different approaches. Not sure I even understand what you are doing...(it's not because of your description)

Anyway, I was able to reduce the runtime by 20%. Try doing the next changes in your code :

before the loop (for n=1:length(z0) )

%edit
M0=zeros(1,length(z0));
X=-100:2/100:100;
%/edit
you don't need to create X on each iteration and pre-allocating is always a good idea, especially if you are considering working with even bigger matrices.

in the second for loop (for i=1:length(V0))

    %edit
    v_it=v*my_ones;
    t=t_temp+v_it;
    t_h=t./h; %this operation takes a rather long time, so just do it once
    tresh_inf_temp=(abs(t_h) <= 0.5);
    tresh_sup_temp=((t_h) > 0.5);
    t_shift=t_h+0.5*my_ones;

    vec1=t_shift.*tresh_inf_temp +tresh_sup_temp; % n x l matrix
    %/edit

HTH,
Florin

Subject: How to avoid for loops in 3 vector function.

From: Florin Neacsu

Date: 4 May, 2011 22:14:04

Message: 9 of 14

Hi,

You could win a couple more seconds by doing

tresh_inf_temp=double((abs(t_h) <= 0.5));
tresh_sup_temp=double(((t_h) > 0.5));

Regards,
Florin

Subject: How to avoid for loops in 3 vector function.

From: Roger Stafford

Date: 4 May, 2011 22:27:06

Message: 10 of 14

"Lan" wrote in message <iprc6c$67r$1@fred.mathworks.com>...
> Thanks for all advice. I will try to describe the whole problem as clearly as possible.
>
> This is the problem of simulating the binary response model. My binary response model is Y(j) = ((x0(j)+b0(j))>0). Remark: this is a scalar example , for the future, I would like to model Y(j)= (x(j)*b(j)>0) ,0 <j<=n, where x(j), and b(j) are vectors of dimension d. and * is a scalar product. But for now let's just focus on the concrete scalar example.
>
> I am first generating multivariate random variables b0, z0, and x0.
> x0 is endogenous, z0 is an instrument. The goal is to estimate b from the sample, by maximizing Sn function, defined later on, locally.
>
> mu = [0 0 0];
> SIGMA = [1 0.7 0; 0.7 1 0.6; 0 0.6 1];
> n=2000;
> M = mvnrnd(mu,SIGMA,n);
> b0 = M(:,1);
> x0 = M(:,2);
> z0 = M(:,3);
>
> We have a relation: x0 = M0(z) + V0(x, z). z is not correlated with b.
>
> I compute M0
>
> for n=1:length(z0)
> X=-100:2/100:100;
> Y=X.*normpdf(X,(0.6*z0(n)),0.64);
> M0(n) = trapz(X,Y);
> end
>
> V0 = x0-M0;
>
> Y = (x0+b0 > 0);
> betaH = zeros(n,1);
>
> b = (-3:0.1:3);
> l = length(b);
> vec0 = 2*Y - ones(n,1);% n x 1 array
> mV0=repmat(V0,1,l); % n x n matrix
> t=zeros(n,l);
> vec1=zeros(n,l);
> vec2=zeros(n,l);
> t_temp=repmat(M0, 1, l)+repmat(b,n,1);
> my_ones=ones(n,l);
>
>
> for i=1:length(V0)
> v=V0(i);
> t=t_temp+v*my_ones;
> % t=repmat(M0, 1, l)+v*ones(n, l)+repmat(b,n,1); % n x l matrix
> vec1=(t./h+0.5*ones(n,l)).*(abs(t./h) <= 0.5) +((t./h) > 0.5); % n x l matrix
> vec2=(abs((mV0 - v*ones(n,l))/h)<=0.5); % n x l matrix
> sn0=vec0'*(vec1.*vec2);
> Sn=1/(n*h)*sn0; % 1 x l array
> p = find(Sn>=max(sn));
> betaH(i)=b(p(1));
> end
>
>
> vec2 represents a matrix of integrated kernel function evaluated at every b.
> vec1 is a kernel function.
>
> Maximizing Sn(v,b) over b for a given v, for every v, will yield a relation betaH(v). averaging the calculated betaH will give an estimate of 'true' b. For Now, I would like to be able to plot betaH vs V0.
>
> I hope the above made sense somehow. Let me know if something is confusing.
>
> Lan
- - - - - - - - -
  Some observations about your code.

1. In computing M0, neither your for loop using 'trapz' nor the vectorized version I suggested in the other thread (307148) are needed here. It is evident that the limits -100 and +100 are equivalent to -inf and +inf to all practical purposes and all you are doing is demonstrating that the mean value of a normal function with specified mean .6*z0 is indeed .6*z0. You might as well just write M0 = 0.6*z0 and be done with it.

2. You haven't made any statements about the quantity 'h' (which I have assumed is a single small scalar.) If it is sufficiently small relative to 1, the condition

  abs((mV0 - v*ones(n,l))/h)<=0.5

applied in forming 'vec2' should usually be true for only a small fraction of the 2000 total. That fact should be utilized before you undertake the horrendous computation of sn0. You can do f = find(abs(V0-v)<=h/2) and then apply f to the rows of M0 and vec0 to get much smaller arrays for forming sn0. As it stands, 2000*61=122000 different times you are doing two additions and two multiplications each time you compute sn0, and you are computing it 2000 times! Most of these would give zero values. It is important to cut down that number of floating point operations.

3. Since all elements of vec0 are either +1 or -1, perhaps you can take advantage of that fact by performing the appropriate subtraction of two sums instead of the multiplication involved in sn0.

4. You don't need to divide sn0 by n*h if you only use it locate its maximum. Just do

 [~,p] = max(sn0);
 betaH(i) = b(p); % p will be the first max in case of duplicates

5. A more general comment: You are using a Monte Carlo method to simulate your "binary response model" assuming an underlying jointly normal distribution of three variables. Unfortunately 2000 is, in my opinion, a rather small number to use for this purpose. If you divide the three-dimensional space of these variables into a 3D mesh of little cubes, each cube would on average receive only twelve or so "hits". That is rather small to base an analysis on. Even so, that has led you to a very long computation process. I would think it would be better to carry out a direct analysis of your three-variable normal distributions combined with the convolution kernels you have defined and solve the statistics of the model directly. It might lead to some frightening multiple integrals, but that is the sort of thing that matlab should be very good at, particularly in the realm of multivariate
normal distribution functions. In my opinion Monte Carlo methods should be used only in case of desperation and when one has extremely large amounts of computer time available for getting valid simulations.

Roger Stafford

Subject: How to avoid for loops in 3 vector function.

From: Roger Stafford

Date: 5 May, 2011 05:46:04

Message: 11 of 14

"Roger Stafford" wrote in message <ipsjrq$ebg$1@fred.mathworks.com>...
> Some observations about your code.
>
> 1. In computing M0, neither your for loop using 'trapz' nor the vectorized version I suggested in the other thread (307148) are needed here. It is evident that the limits -100 and +100 are equivalent to -inf and +inf to all practical purposes and all you are doing is demonstrating that the mean value of a normal function with specified mean .6*z0 is indeed .6*z0. You might as well just write M0 = 0.6*z0 and be done with it.
>
> 2. You haven't made any statements about the quantity 'h' (which I have assumed is a single small scalar.) If it is sufficiently small relative to 1, the condition
>
> abs((mV0 - v*ones(n,l))/h)<=0.5
>
> applied in forming 'vec2' should usually be true for only a small fraction of the 2000 total. That fact should be utilized before you undertake the horrendous computation of sn0. You can do f = find(abs(V0-v)<=h/2) and then apply f to the rows of M0 and vec0 to get much smaller arrays for forming sn0. As it stands, 2000*61=122000 different times you are doing two additions and two multiplications each time you compute sn0, and you are computing it 2000 times! Most of these would give zero values. It is important to cut down that number of floating point operations.
>
> 3. Since all elements of vec0 are either +1 or -1, perhaps you can take advantage of that fact by performing the appropriate subtraction of two sums instead of the multiplication involved in sn0.
>
> 4. You don't need to divide sn0 by n*h if you only use it locate its maximum. Just do
>
> [~,p] = max(sn0);
> betaH(i) = b(p); % p will be the first max in case of duplicates
>
> 5. A more general comment: You are using a Monte Carlo method to simulate your "binary response model" assuming an underlying jointly normal distribution of three variables. Unfortunately 2000 is, in my opinion, a rather small number to use for this purpose. If you divide the three-dimensional space of these variables into a 3D mesh of little cubes, each cube would on average receive only twelve or so "hits". That is rather small to base an analysis on. Even so, that has led you to a very long computation process. I would think it would be better to carry out a direct analysis of your three-variable normal distributions combined with the convolution kernels you have defined and solve the statistics of the model directly. It might lead to some frightening multiple integrals, but that is the sort of thing that matlab should be very good at, particularly in the realm of multivariate

> normal distribution functions. In my opinion Monte Carlo methods should be used only in case of desperation and when one has extremely large amounts of computer time available for getting valid simulations.
>
> Roger Stafford
- - - - - - - - -
  Another thought about your code has occurred to me rather belatedly. Let it be called point #6. The only thing that changes in the 2000 passes through that second for loop is the value v obtained from V0(i). However, the result would be the same if the v used were equal to zero and the b values in b = -3:0.1:3 have V0(i) added to them. In other words a single pass through the loop with v = 0 will serve to establish a curve Sn as a discrete function of b. Using other values than v = 0 could be determined from this one curve by an opposite shift of that amount along the b scale followed by, say, interpolation, rather than going through the massive computation again. The only problem with this is that it would cause a shift off the -3 to +3 lower and upper limits, but this could be corrected by extending this range on this one pass sufficiently far to essentially encompass all such
likely shifts. Also you could afford finer steps in b, making the interpolation easier and more accurate, on such a single pass.

  The fact that this is true looks very fortunate from the point of view of cutting down your computation time. If valid it ought to make a huge difference in cpu time. However, it does raise the question of whether your algorithm as stated is actually computing what you intend. I have assumed that your aim is to simulate, via a Monte Carlo procedure, a triple integral of the given trivariate normal density function of b0, x0,and z0 multiplied by some kernel which is a function of these variables and of a parameter b, which parameter you would eventually optimize. I do not quite see how your method of setting the offset v = V0(i) to be added to b for each different i quite ties in with such a goal.

Roger Stafford

Subject: How to avoid for loops in 3 vector function.

From: Roger Stafford

Date: 5 May, 2011 06:00:22

Message: 12 of 14

"Roger Stafford" wrote in message <iptdis$4b3$1@fred.mathworks.com>...
> ..... Using other values than v = 0 could be determined from this one curve by an opposite shift of that amount along the b scale followed by, say, interpolation, rather than going through the massive computation again. ........
- - - - - - - -
  Where I said "an opposite shift of that amount" I meant to say "shifts of those amounts".

Roger Stafford

Subject: How to avoid for loops in 3 vector function.

From: Roger Stafford

Date: 5 May, 2011 10:22:04

Message: 13 of 14

"Roger Stafford" wrote in message <iptdis$4b3$1@fred.mathworks.com>...
> Another thought about your code has occurred to me rather belatedly. Let it be called point #6. The only thing that changes in the 2000 passes through that second for loop is the value v obtained from V0(i). ........
- - - - - - -
  Please forget about point #6. I was temporarily forgetting about the constraint imposed by vec2 (even though that was the subject of my point #2 which I believe is still valid.) That makes every b curve different as i varies, not just a shift of a single curve.

Roger Stafford

Subject: How to avoid for loops in 3 vector function.

From: Lan

Date: 8 May, 2011 22:14:05

Message: 14 of 14

"Roger Stafford" wrote in message <ipttoc$3aq$1@fred.mathworks.com>...
> "Roger Stafford" wrote in message <iptdis$4b3$1@fred.mathworks.com>...
> > Another thought about your code has occurred to me rather belatedly. Let it be called point #6. The only thing that changes in the 2000 passes through that second for loop is the value v obtained from V0(i). ........
> - - - - - - -
> Please forget about point #6. I was temporarily forgetting about the constraint imposed by vec2 (even though that was the subject of my point #2 which I believe is still valid.) That makes every b curve different as i varies, not just a shift of a single curve.
>
> Roger Stafford

Thank you so much for all advice. It greatly improved the code's efficiency. I have only one question. Does MATLAB has any free global optimization routines? I'm interested since, my b in other models I will simulate, b will be a vector, so it will involve multi-dimensional optimization...

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us