Ok, so I finally got around to looking this over. Here is a link to the OP's reading matter, dating to 1995:
http://www.ee.columbia.edu/~marios/matlab/Vectorization.pdf
This is a good study in how things change. Drea gives three main performance tips in the document, and it is interesting to see how they hold up today.
1. Drea discusses negating the elements of a matrix that meet a certain criterion. He offers a double FOR loop and the FIND function. I tested on two of my machines, both running 2007b, and found the FOR loop to be faster with a 3000-by-3000 matrix. I also wonder why Drea used FIND instead of logical indexing. Maybe it didn't work back then. Here is the code (FOR loop, FIND, logical indexing, in that order), with a typical rel_times = [1, 2.44, 1.25]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk1()
% Indexing example.
S = 3000; % Size of matrix.
rand('twister',5340); % Compare the same matrices.
M = round(rand(S)*100); % Different results for, say *10: For loop is even better.
T = zeros(1,3); % for timings
tic
for ii=1:S,
    for jj=1:S,
        if (M(ii,jj) > 4)
            M(ii,jj) = -M(ii,jj); % Negate the elements that meet the criterion.
        end
    end
end
T(1) = toc;
rand('twister',5340);
M = round(rand(S)*100);
tic
ind = find(M > 4);
M(ind) = -M(ind);
T(2) = toc;
rand('twister',5340);
M = round(rand(S)*100);
tic
ind = M > 4;
M(ind) = -M(ind);
T(3) = toc;
rel_times = T./min(T) % display the relative times.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
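For anyone who wants to convince themselves that the three approaches do the same job before timing them, here is a minimal sketch on a small made-up matrix. The matrix A and the names B1, B2, B3 and mask are just for illustration and are not from Drea's document.

```matlab
A = [1 5 3; 9 2 7; 4 4 8];   % Small made-up test matrix.

% 1. Double FOR loop.
B1 = A;
for ii = 1:size(A,1)
    for jj = 1:size(A,2)
        if B1(ii,jj) > 4
            B1(ii,jj) = -B1(ii,jj);
        end
    end
end

% 2. FIND returns the linear indices of the matching elements.
B2 = A;
ind = find(A > 4);
B2(ind) = -B2(ind);

% 3. Logical indexing uses the logical mask directly, no FIND needed.
B3 = A;
mask = A > 4;
B3(mask) = -B3(mask);

same = isequal(B1,B2,B3)     % True if all three results agree.
```

Only the elements greater than 4 change sign, so any difference between the methods is purely one of speed.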
2. Drea then discusses the removal of extra whitespace from a string. He offers a WHILE loop, FINDSTR, and the FILTER function. His conclusion is that FINDSTR is slower than FILTER. Not on my machine. Here is the code (WHILE loop, FILTER, FINDSTR, in that order), with typical rel_times = [1663.6 3.75 1]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk2()
% Remove extra whitespace
V = 'I  really  hate  extra  white  space.  '; % Note the doubled spaces.
V = repmat(V,1,500);
T = zeros(1,3);
tic
len = length(V);
ii = 1;
while (ii<len),
    if (V(ii) == ' ' && V(ii+1) == ' ')
        for jj = ii:(len-1), % Shift the rest of the string left by one.
            V(jj) = V(jj+1);
        end
        V(len)=0; % Or something that will shorten the vector
        len=len-1;
    else
        ii=ii+1;
    end
end
V = char(V); % The modern setstr
T(1) = toc;
V = 'I  really  hate  extra  white  space.  '; % Note the doubled spaces.
V = repmat(V,1,500);
tic
ind = find(filter([1 1],2,V==' ')==1);
V(ind)= [];
T(2) = toc;
V = 'I  really  hate  extra  white  space.  '; % Note the doubled spaces.
V = repmat(V,1,500);
tic
ind = findstr(V,'  '); % The pattern is two spaces, so only extra spaces are removed.
V(ind)= [];
T(3) = toc;
rel_times = T./min(T)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
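The FILTER approach is a little cryptic, so here is a minimal sketch of why it marks the same characters as a search for two consecutive spaces. The test string is made up, and I use STRFIND here rather than FINDSTR (my substitution, not Drea's); BLANKS builds the runs of spaces so the count is unambiguous.

```matlab
s = ['a' blanks(2) 'b' blanks(3) 'c'];   % Made-up string: 'a', 2 spaces, 'b', 3 spaces, 'c'.

% FILTER approach: filter([1 1],2,x) averages each flag with the one
% before it, so the result is exactly 1 only where a space directly
% follows another space.
s1 = s;
s1(filter([1 1],2,s1==' ')==1) = [];

% Pattern search: STRFIND returns the start of every (overlapping)
% pair of spaces, i.e. the first space of each pair.
s2 = s;
s2(strfind(s2,blanks(2))) = [];

same = strcmp(s1,s2)                      % Both collapse each run to one space.
```

The two methods delete different members of each run (FILTER drops the later spaces, the pattern search drops the earlier ones), but the resulting string is the same.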
3. Finally, Drea discusses subtracting the column mean from each column in a matrix. This is of course tailor-made for bsxfun, which I know didn't exist in 1995. Again he concludes that indexing is faster than a FOR loop. Not on my machine. Even bsxfun barely beats Drea's FOR loop solution. Here is the code (FOR loop, indexing, bsxfun, in that order), with typical rel_times = [1.05 7.39 1]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk3()
% Subtract the column mean from a matrix.
S = 2000;
M = rand(S);
V = mean(M);
T = zeros(1,3);
tic
for ii=1:S,
    M(:,ii) = M(:,ii) - V(ii);
end
T(1) = toc;
M = rand(S);
V = mean(M);
tic
M = M - V(ones(S,1,'int8'),:);
T(2) = toc;
M = rand(S);
V = mean(M);
tic
M = bsxfun(@minus,M,V);
T(3) = toc;
rel_times = T./min(T)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
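Here is a minimal sketch, on a small made-up matrix, that checks the three ways of subtracting the column mean against each other. The names A, mu, C1, C2 and C3 are my own and only for illustration.

```matlab
A = magic(4);                          % Small made-up test matrix.
mu = mean(A);                          % 1-by-4 row vector of column means.

% 1. FOR loop over the columns.
C1 = A;
for ii = 1:size(A,2)
    C1(:,ii) = C1(:,ii) - mu(ii);
end

% 2. Indexing: replicate the row of means down the rows, then subtract.
C2 = A - mu(ones(size(A,1),1),:);

% 3. bsxfun expands mu along the rows without building the copy explicitly.
C3 = bsxfun(@minus,A,mu);

same = isequal(C1,C2,C3)               % True if all three results agree.
max_abs_col_mean = max(abs(mean(C3)))  % Should be zero up to roundoff.
```

The indexing version has to build a full-size copy of the means before subtracting, which is one plausible reason it comes out slowest in the timings above.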
Which at last brings me to Tony's trick. I found that Tony's trick is indeed slower for scalar expansion, such as:
h = 3;
V = h(ones(3))
But the story is more complex when expanding a vector. When expanding a column vector, Tony's trick is faster than the multiplication method shown by Drea. When expanding a row vector, as Bronson did, Tony's trick is not as fast as the multiplication method. So I tested whether applying Tony's trick to the row vector reshaped into a column with a(:), then permuting the result, could beat the multiplication method. The result is that this is faster than the multiplication method up to a size of about 1200, after which the multiplication method starts to pull away on my machine. Here is the code (first row: column vector, second row: row vector; first column: multiplication, second column: Tony's trick), with typical
rel_times =
1.9636 1
1.1837 1
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [] = dreas_desk4()
% REPMAT a vector
S = 1000;
a = rand(S,1); % A column vector
T = zeros(2,2);
tic
V = a*ones(1,S);
T(1,1) = toc;
clear V
tic
V = a(:,ones(S,1,'int8'));
T(1,2) = toc;
a = rand(1,S); % A row vector.
tic
V = ones(S,1)*a;
T(2,1) = toc;
clear V
tic
V = a(:); % Make a column vector, since we know this is fast.
V = permute(V(:,ones(S,1,'int8')),[2 1]); % take the transpose.
T(2,2) = toc;
rel_times = bsxfun(@rdivide,T,min(T,[],2))
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
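To make clear what each of these expansions actually builds, here is a minimal sketch on tiny made-up vectors, checked against REPMAT. The names c, r, n and the result variables are only for illustration.

```matlab
n = 4;                      % Number of copies to make.
c = (1:3)';                 % Made-up column vector.
r = 1:3;                    % Made-up row vector.

% Column vector: both methods give a 3-by-4 matrix of repeated columns.
colMult = c*ones(1,n);      % Multiplication method.
colTony = c(:,ones(n,1));   % Tony's trick: index column 1 n times.

% Row vector: both methods give a 4-by-3 matrix of repeated rows.
rowMult = ones(n,1)*r;      % Multiplication method.
tmp = r(:);                 % Make a column vector first ...
rowTony = permute(tmp(:,ones(n,1)),[2 1]);  % ... expand it, then transpose.

okCol = isequal(colMult,colTony,repmat(c,1,n))
okRow = isequal(rowMult,rowTony,repmat(r,n,1))
```

Since all of these produce identical matrices, the choice between them comes down to timing on your own machine and MATLAB version.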
So if Drea were writing this document today, some things would definitely need to be changed. As for folks like Bronson, beware of reading 14-year-old MATLAB performance tips!
