MATLAB Newsgroup

Last week, I endeavored to see how much of an improvement "Tony's trick" offered over the more brute force calculation.

For those in the know, Tony's trick amounts to replacing code such as

N=1e6;
h=4; % this is just a scalar
V=h*ones(N,1); % this fills the V vector

with the supposedly faster:

N=1e6;
h=4; % this is just a scalar
V=h(ones(N,1)); % this fills the V vector

The subtle difference is that V is built via indexing instead of performing the multiplication h*1, N times. Surprisingly, I've noted a stark failure of Tony's trick as the dimension gets higher. This examination was conducted in R14. I'm looking for an explanation (an error in my code is a perfectly acceptable explanation) for why Tony's trick is no longer working.
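As a quick sanity check, the two constructions can be confirmed to produce identical vectors, so any timing difference is purely about how the fill is performed (a sketch; the variable names are just illustrative):

```matlab
N = 1e6;
h = 4;
V1 = h*ones(N,1);  % brute force: multiply a ones vector by the scalar
V2 = h(ones(N,1)); % Tony's trick: index the scalar N times
isequal(V1, V2)    % the two vectors should be identical
```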

Further, Drea Thomas (a MathWorks employee) posted the following:

So, why use Tony's trick? It is usually faster. The second method performs unnecessary matrix multiplication. This technique is most useful when you have large matrices and the operation is inside your inner-most loop (where speed is critical). Let's test this hypothesis.

a = rand(1,100);
I = ones(100,1);
tic;V=I*a;toc
elapsed_time = 0.0315
tic;V=a(I,:);toc
elapsed_time = 0.0115

There are two issues with her code. One is that she doesn't empty V before the second fill, so there's an obvious memory allocation bias. The second problem, which is considerably more damning, is that these results do not hold for R14 on two different local machines; i.e., the tic-toc time is FASTER for the matrix multiply.

Below is the code for both tests: one for the vector fill and one for Drea's matrix construction. It compares the two, clears V after every fill, and outputs a histogram of the results.

The vector comparison code:

%This code compares the speed of replicating a vector via h*ones(m,1) versus
% h(ones(m,1)). The algorithm fills and clears a vector b, N times, and
% plots the histogram results.
close all
clear

N = 1000;          % the number of trials
m = 1e7;           % the vector length
h = 4;             % assignment of a test number
time = zeros(N,2); % preallocation of memory for the resulting toc times

disp('-------h*ones(m,1) Analysis-------')
for ncnt = 1:N
    clear b             % clear the b vector at the beginning of every loop
    tic
    b = h*ones(m,1);    % populate b
    time(ncnt,1) = toc; % store the toc time
end
disp('done.')

disp('-------h(ones(m,1)) Analysis-------')
for ncnt = 1:N
    clear b
    tic
    b = h(ones(m,1));
    time(ncnt,2) = toc;
end
disp('done.')

% plotting normalized histograms
subplot(2,1,1)
[n,r] = hist(time(:,1),40);      % histogram
bar(r, 1./(r(2)-r(1))*n/sum(n)); % normalized histogram
title('using h*ones(m,1)')
xlabel('tic-toc time')

subplot(2,1,2)
[n,r] = hist(time(:,2),40);
bar(r, 1./(r(2)-r(1))*n/sum(n));
title('using h(ones(m,1))')
xlabel('tic-toc time')
% end of file

And the matrix code:

% this code is directly from drea thomas at mathworks, I just repeat the
% calculation and create a histogram.
close all
clear
clc

N = 1000;          % number of trials
m = 1e3;           % size of vector
time = zeros(N,2); % preallocate memory

for p = 1:N        % this is the comparison loop
    a = rand(1,m); % generate a random row vector a
    I = ones(m,1); % generate a ones column vector I
    clear V        % clear the V variable to avoid a memory allocation bias
    tic; V = I*a;    time(p,1) = toc; % tic-toc the time to create V
    clear V        % clear the V variable to avoid a memory allocation bias
    tic; V = a(I,:); time(p,2) = toc; % tic-toc the time to create V
end

% plotting normalized histograms
subplot(2,1,1)
[n,r] = hist(time(:,1),40);      % histogram
bar(r, 1./(r(2)-r(1))*n/sum(n)); % normalized histogram
title('using V=I*a')
xlabel('tic-toc time')
xlim([0 mean(time(:,1))*10])
grid on

subplot(2,1,2)
[n,r] = hist(time(:,2),40);
bar(r, 1./(r(2)-r(1))*n/sum(n));
title('using V=a(I,:)')
xlabel('tic-toc time')
xlim([0 mean(time(:,1))*10])
grid on
% end of file

In the vector version I see a HUGE speed advantage for the matrix multiply. The matrix version shows Tony's trick offering a "slight" advantage for very small dimensions (m<100), but for anything bigger, the matrix multiply starts to pull away.

I encourage readers to run the code themselves (10-15 minutes) and try to come up with an explanation. Maybe an issue with the tic-toc timing? I don't know, but this is certainly something that needs to be addressed.

Thanks all,

Bronson

On Sep 27, 5:28 am, "Bronson " <Bargyle13REMOVET...@gsb.columbia.edu> wrote:

> [original post snipped]

Please explain why you think this is so important, and why you say "this is certainly something that needs to be addressed".

I'm afraid I'm a rather prosaic sort of chap. I tend to use the best/fastest method I can find and move on, without worrying about the details of why it's the best. I've found this way makes money, whereas trying to figure out how many angels can dance on the head of a pin is time-wasting and unproductive.

Who is Tony?

Here is something you might play with and include in your analysis (call it Bruno's trick if you like):

V = zeros(N,1,'double');
V(:) = h;
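For comparison, here is a rough sketch that times Bruno's fill alongside the two earlier methods on one vector size (the sizes and variable names are just illustrative; tic/toc resolution may be coarse for small N):

```matlab
N = 1e6; h = 4;
t = zeros(1,3);
clear V; tic; V = h*ones(N,1);          t(1) = toc; % multiply method
clear V; tic; V = h(ones(N,1));         t(2) = toc; % Tony's trick
clear V; tic; V = zeros(N,1); V(:) = h; t(3) = toc; % Bruno's trick
rel_times = t./min(t) % relative times; 1 marks the fastest method
```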

I must admit I don't have the courage to go over your code. All I know is that a computer is a more or less deterministic machine. If we find something "strange" that doesn't fall within our expectations, we have likely made a wrong assumption that led to that expectation. As my colleague often reminds me: the problem is usually somewhere between the keyboard and the chair.

Bruno

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <h9nfg0$418$1@fred.mathworks.com>...

> Who is Tony?

This has been around a long time. See these for background:

http://blogs.mathworks.com/loren/2006/03/01/more-on-expansion-arrayfun/

http://blogs.mathworks.com/loren/2006/02/22/scalar-expansion-and-more-take-2/

I also have not run the OP's code, though I will probably do so later (with the method Bruno showed, as this was my first thought as well). But, given that Tony's trick is very old (pre-MATLAB 5?), I am surprised that Bronson is surprised that it may not be the current fastest approach. When Tony's trick was discovered, FOR loops were unquestionably slower than equivalent vectorized code, for example. A lot has changed since then, and this may just be one of many things that no longer hold from the early days. Only a careful analysis will tell.

Bronson wrote:

> [original post snipped]
>
> Further Drea Thomas (mathworks employee) posted the following:
> [Drea's example snipped]

>

> There are two issues with her code. One is that she doesn't empty V

> before the second fill, so there's an obvious memory allocation bias.

> The second problem, which is extensively more damning, is that these

> results have not held for R14 on two different local machines, i.e.

> the tic-toc time is FASTER for the matrix multiply.

>

> [rest of explanation and code samples snipped]

Hi Bronson,

Drea (a guy, by the way) hasn't worked at MathWorks for a very long time - 13 years or so? I don't know how you came across this post, but it's VERY old. MATLAB has undergone several generations of substantial changes since then.

---

Steve Eddins

http://blogs.mathworks.com/steve/

OK, so I finally got around to looking this over. Here is a link to the OP's reading material, dating from 1995:

http://www.ee.columbia.edu/~marios/matlab/Vectorization.pdf

This is a good study in how things change. Drea gives 3 main tips for performance throughout the document, and it is interesting to look at how they work out today.

1. Drea discusses negating certain elements of a matrix, subject to a certain criterion. He offers a double FOR loop and the use of the FIND function. I tested on two of my machines, both running 2007b, and found the FOR loop to be faster with a 3000-by-3000 matrix. In addition, I wonder why Drea used FIND instead of logical indexing? Maybe it didn't work back then. Here is the code, with a typical rel_times = [1, 2.44, 1.25]:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk1()
% Indexing example.
S = 3000;               % Size of matrix.
rand('twister',5340);   % Compare the same matrices.
M = round(rand(S)*100); % Different results for, say, *10: the FOR loop is even better.
T = zeros(1,3);         % for timings
tic
for ii = 1:S
    for jj = 1:S
        if (M(ii,jj) > 4)
            M(ii,jj) = -M(ii,jj);
        end
    end
end
T(1) = toc;
rand('twister',5340);
M = round(rand(S)*100);
tic
ind = find(M > 4);
M(ind) = -M(ind);
T(2) = toc;
rand('twister',5340);
M = round(rand(S)*100);
tic
ind = M > 4;
M(ind) = -M(ind);
T(3) = toc;
rel_times = T./min(T) % display the relative times.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2. Drea then discusses the removal of whitespace from a string. He offers a WHILE loop, FINDSTR, and the use of the FILTER function. His conclusion is that FINDSTR is slower than FILTER. Not on my machine. Here is the code, with typical rel_times = [1663.6 3.75 1]

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk2()
% Remove extra whitespace. (The test string contains a double space so
% that each method actually has something to remove.)
V = 'I really hate extra  white space. ';
V = repmat(V,1,500);
T = zeros(1,3);
tic
len = length(V);
ii = 1;
while (ii < len)
    if (V(ii) == ' ' && V(ii+1) == ' ')
        for jj = ii:(len-1)
            V(jj) = V(jj+1);
        end
        V(len) = 0; % Or something that will shorten the vector
        len = len - 1;
    else
        ii = ii + 1;
    end
end
V = char(V); % The modern setstr
T(1) = toc;
V = 'I really hate extra  white space. ';
V = repmat(V,1,500);
tic
ind = find(filter([1 1],2,V==' ')==1);
V(ind) = [];
T(2) = toc;
V = 'I really hate extra  white space. ';
V = repmat(V,1,500);
tic
ind = findstr(V,'  ');
V(ind) = [];
T(3) = toc;
rel_times = T./min(T)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

3. Finally, Drea discusses subtracting the column mean from each column of a matrix. This is of course tailor-made for bsxfun, which I know didn't exist in 1995. Again he concludes that indexing is faster than a FOR loop. Not on my machine. Even bsxfun barely beats Drea's FOR loop solution. Here is the code, with typical rel_times = [1.05 7.39 1]:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk3()
% Subtract the column mean from a matrix.
S = 2000;
M = rand(S);
V = mean(M);
T = zeros(1,3);
tic
for ii = 1:S
    M(:,ii) = M(:,ii) - V(ii);
end
T(1) = toc;
M = rand(S);
V = mean(M);
tic
M = M - V(ones(S,1,'int8'),:);
T(2) = toc;
M = rand(S);
V = mean(M);
tic
M = bsxfun(@minus,M,V);
T(3) = toc;
rel_times = T./min(T)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Which at last brings me to Tony's trick. I found that Tony's trick is indeed slower for scalar expansion, such as:

h = 3;
V = h(ones(3))

But the story is more complex when expanding a vector. For instance, when expanding a column vector, Tony's trick is faster than the multiplication method shown by Drea. When expanding a row vector, as Bronson did, Tony's trick is not as fast as the multiplication method. So I tested whether permuting the row vector, coupled with Tony's trick on a colon-indexed column vector, could beat the multiplication method. The result is that this is faster than the multiplication method up to a size of about 1200; then the multiplication method starts to pull away on my machine. Here is the code with typical

rel_times =
    1.9636    1.0000
    1.1837    1.0000

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk4()
% REPMAT a vector
S = 1000;
a = rand(S,1); % A column vector
T = zeros(2,2);
tic
V = a*ones(1,S);
T(1,1) = toc;
clear V
tic
V = a(:,ones(S,1,'int8'));
T(1,2) = toc;
a = rand(1,S); % A row vector.
tic
V = ones(S,1)*a;
T(2,1) = toc;
clear V
tic
V = a(:); % Make a column vector, since we know this is fast.
V = permute(V(:,ones(S,1,'int8')),[2 1]); % take the transpose.
T(2,2) = toc;
rel_times = bsxfun(@rdivide,T,min(T,[],2))
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

So if Drea were writing this document today, some things would definitely need to be changed. As for folks like Bronson: beware of reading 14-year-old MATLAB performance tips!

Dear all,

The point is rather simple, and it appears that most readers have missed it. 14-year-old documentation or not, unless you've been living in a barn, Tony's trick is touted as ALWAYS being faster for this sort of situation. I'm looking for an explanation for why it apparently fails, not for examples of it failing (though I appreciate the solidarity of results with my own analysis). I really don't see the relevance of a memory preallocation game. As for folks like Matt, I haven't said anything about fastest; it's an issue of faster. And though I do appreciate an annotated summary of Drea's previous work, responses that actually address my question would be much appreciated.

"Bronson " <Bargyle13REMOVETHIS@gsb.columbia.edu> wrote in message <hbqhhn$1vj$1@fred.mathworks.com>...

> Dear all,

> The point is rather simple, and it appears that most of the readers have missed it. [...] Tony's trick is touted as ALWAYS being faster for this sort of situation. I'm looking for an explanation for why this apparently fails, not examples of it failing. [...]

==============

Well, I for one am finding that Tony's trick always fails, both for N small (100) and N large (5e7), with straight multiplication being about twice as fast. So this could indeed be a matter of architecture, or of upgrades to MATLAB's code over time.

To my mind, though, a smartly implemented matrix multiplication should be faster than Tony's trick, because indexing operations always require bounds checking and type checking to make sure that each index into h is legitimate.

Conversely, when faced with the expression h*ones(N,1), the code simply knows to loop sequentially through the vector ones(N,1). Furthermore, if the code is smart enough to pre-check whether one of the operands in the multiplication is 1 (the code 14 years ago may not have done so), it could bypass the usual labor required in a multiplication.
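One rough way to probe that last hypothesis is to time the same multiplication against an all-ones operand and against an arbitrary-valued operand of the same size; if the code special-cased a ones operand, the first should be measurably faster. A sketch (my own, with illustrative sizes):

```matlab
m = 1e3;
a = rand(1,m);
I = ones(m,1);     % all-ones operand
J = 1 + rand(m,1); % arbitrary-valued operand of the same size
tic; V = I*a; t_ones = toc;
clear V
tic; V = J*a; t_rand = toc;
[t_ones t_rand] % similar times would suggest no special-casing of a ones operand
```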

"Bronson " <Bargyle13REMOVETHIS@gsb.columbia.edu> wrote in message news:hbqhhn$1vj$1@fred.mathworks.com...

> Dear all,
> The point is rather simple, and it appears that most of the readers have missed it. 14 year old documentation or not, unless you've been living in a barn, Tony's trick is touted as ALWAYS being faster for this sort of situation.

By whom? When I search the entire MathWorks website, I find 12 references to "Tony's trick" -- 10 in postings on Loren's blog (which doesn't discuss the performance of Tony's trick), 1 in a technical support document (which also doesn't mention performance), and this thread in the MATLAB Central Newsreader. If there's some reference that says that ANY routine is ALWAYS faster than another, please let us know, as it's probably not true and we shouldn't be saying it (unless the former routine is a subroutine of the latter).

> I'm looking for an explanation for why this apparently fails, not examples of it failing (though, I appreciate the solidarity of results with my own analysis). I really don't see the relevancy of a memory preallocation game. As for folks like Matt, I haven't said anything about fastest; it's an issue of faster. And though I do appreciate an annotated summary of Drea's previous work, responses that actually respond to my question would be much appreciated.

As Steve Eddins said, MATLAB has gone through many changes since Drea wrote the document you referred to. Off the top of my head, some probably didn't impact the performance of Tony's trick, or if they did, the impact was probably minor:

* the introduction of the MATLAB Desktop
* the introduction of both the old object-oriented class system and the new OO class system
* maybe even the introduction of Handle Graphics, if that happened after 1995
* etc.

Some of those changes, though, probably did affect how much of an impact Tony's trick had. A few examples:

* the introduction of N-D and non-double arrays and the modifications to the source code to support them (it predates my tenure at The MathWorks, but I believe they were introduced in MATLAB 5, released in 1997)
* the switch to LAPACK instead of LINPACK (MATLAB 6.0, release R12): http://www.mathworks.com/company/newsletters/news_notes/clevescorner/winter2000.cleve.html
* the change (in MATLAB 6.5, release R13) of logical from an attribute to a class and sparse from a class to an attribute
* the introduction of the JIT accelerator (again, I believe, in R13)
* years and years of bug fixes and new features
* etc.

Any or all of those could have affected the relative performance of Tony's trick, the matrix multiplication alternative, etc., and I think it would be difficult or impossible to determine exactly which change had the most impact.

--

Steve Lord

slord@mathworks.com

comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ

Thanks Bronson

This thread you started was a very interesting read.

Abhinav

"Bronson " <Bargyle13REMOVETHIS@gsb.columbia.edu> wrote in message <h9lfek$5uf$1@fred.mathworks.com>...

> [original post snipped]
