Path: news.mathworks.com!not-for-mail
From: "Steven_Lord" <slord@mathworks.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: very large array
Date: Mon, 24 Jun 2013 10:19:40 -0400
Organization: MathWorks
Lines: 48
Message-ID: <kq9khs$kvi$1@newscl01ah.mathworks.com>
References: <kq9hdf$bsa$1@newscl01ah.mathworks.com>
NNTP-Posting-Host: ah-slord.dhcp.mathworks.com
Mime-Version: 1.0
Content-Type: text/plain;
	format=flowed;
	charset="utf-8";
	reply-type=response
Content-Transfer-Encoding: 7bit
X-Trace: newscl01ah.mathworks.com 1372083580 21490 172.28.9.169 (24 Jun 2013 14:19:40 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Mon, 24 Jun 2013 14:19:40 +0000 (UTC)
In-Reply-To: <kq9hdf$bsa$1@newscl01ah.mathworks.com>
X-Priority: 3
X-MSMail-Priority: Normal
Importance: Normal
X-Newsreader: Microsoft Windows Live Mail 14.0.8089.726
X-MimeOLE: Produced By Microsoft MimeOLE V14.0.8089.726
Xref: news.mathworks.com comp.soft-sys.matlab:798100



"Lorenzo Quadri" <quadrilo_sub_r@gmail.com> wrote in message 
news:kq9hdf$bsa$1@newscl01ah.mathworks.com...
> Hi I'm a newbie in matlab, I have a very large array (circa 400 million 
> rows and 7 columns of uint8 type), I've to delete about 100 million 
> elements i tried this kind of operation but it's very very slow.
>
> for i=1:length(dati)
>    if (( int ( sum(dati(i,:) ))<355) & range(dati(i,:))>20)
>        dati(i,:) = [];
>    end
> end

Not only will this be slow, it will also throw an error. Suppose you have a 
10-row array:

xv = (1:10).';
X = [xv, xv.^2]
size(X)

and you delete one row, you now have a 9 row array:

X(3, :) = []
size(X)

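In your code, length(dati) is NOT re-evaluated each time the loop body 
executes; it is evaluated once, when the loop STARTS executing. Thus you'd 
walk off the end of the array if any of the rows are deleted. Each deletion 
also shifts the remaining rows up by one, so the row immediately after a 
deleted row is never tested.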
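
If you want to see that failure on something small first, here is a minimal 
sketch. The test vector and the "greater than 5" condition are made up for 
illustration; they are not your data or your criteria.

```matlab
% Illustration only: delete rows inside a FOR loop whose range was
% computed before any rows were removed.
X = (1:10).';              % hypothetical 10-by-1 test data
n = size(X, 1);            % evaluated once; n stays 10 for the whole loop
try
    for i = 1:n
        if X(i, 1) > 5     % hypothetical deletion condition
            X(i, :) = [];  % X shrinks, but the loop still runs up to n
        end
    end
catch err
    % Reached once i is larger than the number of rows still in X
    fprintf('Loop stopped at i = %d: %s\n', i, err.message);
end
disp(X.')                  % rows that survived, shown as a row vector
```

With this data the loop should stop early with an indexing error, and some 
values greater than 5 should survive because each deletion moved them into 
an index the loop had already passed.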

So you want to eliminate rows whose sum is less than 355 and whose maximum 
and minimum elements are more than 20 apart? Use logical indexing on the 
whole array at once rather than working row-by-row. Operate along each row 
by passing 2 as the dimension input argument to SUM and RANGE.

rowsums = sum(dati, 2, 'double');
rowranges = range(dati, 2);
dati(rowsums < 355 & rowranges > 20, :) = [];

While you could do this all on one line, I broke the two conditions out so 
you could experiment with a smaller dati to prove to yourself that it works 
and that you understand what the code is doing.
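
As one way to run that experiment, here is a sketch on a hand-made 5-by-7 
array. The values are invented for the test, and toDelete is just a name I 
picked for the logical mask. Since RANGE is part of Statistics Toolbox, 
this version computes the same quantity with MAX and MIN in case you do not 
have that toolbox.

```matlab
% Hypothetical test data (values chosen by hand):
%   row 1: sum 280, range 60  -> both conditions true, should be deleted
%   row 2: sum 350, range  0  -> kept (range is not > 20)
%   row 3: sum 700, range  0  -> kept
%   row 4: sum 345, range 20  -> kept (range is not > 20)
%   row 5: sum 550, range 80  -> kept (sum is not < 355)
dati = uint8([ 10  20  30  40  50  60  70
               50  50  50  50  50  50  50
              100 100 100 100 100 100 100
               40  45  50  55  60  50  45
               10  90  90  90  90  90  90]);

rowsums   = sum(dati, 2, 'double');               % one sum per row, as double
rowranges = max(dati, [], 2) - min(dati, [], 2);  % toolbox-free range(dati, 2)
toDelete  = rowsums < 355 & rowranges > 20;       % logical mask, one entry per row

dati(toDelete, :) = [];                           % remove all flagged rows at once
disp(size(dati))
```

Only the first row satisfies both conditions, so four rows should remain. 
On the full array, keep in mind that rowsums holds one double per row, which 
is roughly 3.2 GB for 400 million rows.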

-- 
Steve Lord
slord@mathworks.com
To contact Technical Support use the Contact Us link on 
http://www.mathworks.com