From: Peter Boettcher <boettcher@ll.mit.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: how to insert/delete rows in a matrix without copy the rest of matrix
References: <f87kg3$9ri$1@fred.mathworks.com>
Message-ID: <muymyka6olz.fsf@G99-Boettcher.llan.ll.mit.edu>
Organization: MIT Lincoln Laboratory
Date: Tue, 22 Jul 2008 09:21:44 -0400



"Jens Christiansen" <jenschristiansen@gmail.com> writes:

> "Chenyang " <john.doe.nospam@mathworks.com> wrote in message
> <f87kg3$9ri$1@fred.mathworks.com>...
>> MATLAB is very slow to add/delete a row in a matrix.
>> It always tries to copy the whole matrix,
>> e.g., 
>> 
>> a = rand(5000);
>> tic
>> a(:,1) = [];
>> toc
>> tic
>> a = [a(:,1),a];
>> toc
>> Elapsed time is 0.447241 seconds.
>> Elapsed time is 0.556416 seconds.
>> 
>> I need to insert/delete a row in a large matrix.
>> How can I make it fast?
>> 
>
>
> One fast way of doing this is to first fill the rows with
> zeros, and then delete them using the 'any' function as follows:
>
> tic
> x = rand(50000,10);
> for i = 50000:-100:1
>     x(i,:) = [];
> end
> toc
>
>>>Elapsed time is 4.813505 seconds.
>
> tic
> x = rand(50000,10);
> for i = 50000:-100:1
>     x(i,:) = 0;
> end
> x(~any(x,2),:) = [];
> toc
>
>>>Elapsed time is 0.058070 seconds.
>
> I cannot give a technical explanation for why the second
> method is so much faster, but I would like to hear one if
> anyone knows... I guess it has to do with the logical
> indexing. My own problem was to delete very many irregularly
> spaced rows from a matrix of approximately the dimensions above, and
> this method does the job very quickly.

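I believe it has to do with memory reallocation and copying.  Think of
deleting rows much like adding rows: if you don't preallocate, then each
time you add elements, MATLAB is forced to reallocate memory at the new,
larger size and copy all the data over.  Each x(i,:) = [] in your first
loop presumably does the same in reverse: it resizes the array and
copies what remains, 500 times in total.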
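For the "adding rows" side of the analogy, here is a minimal sketch
comparing a growing array with a preallocated one.  The size n and the
row contents are arbitrary values picked for illustration, and the
timings will depend on your MATLAB release and machine, so none are
quoted here.

```matlab
n = 20000;                     % arbitrary size, chosen for illustration

% Growing: every append changes the array size, which may force
% MATLAB to reallocate and copy the data accumulated so far.
tic
grown = zeros(0, 2);
for k = 1:n
    grown(end+1, :) = [k, 2*k];
end
toc

% Preallocated: the array is sized once, the loop only fills it in.
tic
pre = zeros(n, 2);
for k = 1:n
    pre(k, :) = [k, 2*k];
end
toc

% Both approaches build the same matrix.
assert(isequal(grown, pre));
```

Only the allocation pattern differs between the two loops; the
resulting matrices are identical.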

Your trick is like "preallocation" for deleting.  In the loop you only
compute and save (in some manner) the rows to be deleted.  Then the
actual delete happens at once, which means only a single reallocation.
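One way to "save the rows to be deleted" without overwriting any data
is a logical mask.  This is only a sketch: the name toDelete and the
every-100th-row criterion are assumptions for illustration, standing in
for whatever test your real loop applies.  Unlike zero-filling, it
cannot accidentally remove a row that legitimately contains all zeros.

```matlab
x = rand(50000, 10);
xRef = x;                          % untouched copy, used only for the check below

% Record which rows to drop; x itself is not resized inside the loop.
toDelete = false(size(x, 1), 1);
for i = 50000:-100:1               % placeholder criterion: every 100th row
    toDelete(i) = true;
end

% One delete, hence one resize of x.
x(toDelete, :) = [];

% Same result as selecting the kept rows directly.
assert(isequal(x, xRef(~toDelete, :)));
```

If the rows are irregularly spaced, only the line that sets toDelete(i)
needs to change; the single delete at the end stays the same.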

My guess is that in this case, a "vectorized" delete would be even
faster.  You really shouldn't time "rand", as that isn't part of the
delete process.

x1 = rand(50000,10);
x2 = rand(50000,10);

tic
for i = 50000:-100:1
    x1(i,:) = 0;
end
x1(~any(x1,2),:) = [];
toc


tic
rows = 50000:-100:1;
x2(rows,:) = [];
toc


Elapsed time is 0.009575 seconds.
Elapsed time is 0.006921 seconds.




-Peter