Thread Subject: find function runs slowly -- optimize?

Subject: find function runs slowly -- optimize?

From: Chad

Date: 24 Nov, 2009 14:21:20

Message: 1 of 3

I'm working on calculating a 4-D histogram, and I'm trying to optimize my code to run as fast as possible.

First, I define xx, yy, zz, and maxis to be the histogram edge values, and preallocate specdata=zeros([numel(xx) numel(yy) numel(zz) numel(maxis) ],'single');

My actual code segment is:

fid = fopen( [pathname filename],'r','ieee-be');
%Waitbar
wait_handle=waitbar(0,'Gridding data');
flag=1;
ions=0;
num_loops=0;
while flag==1
    
    [buffer,count]=fread(fid,[4,read_max],'single'); %Read 1e5 ions
                                                        %read_max=1e5
    for inner_loop=1:round(count/4)
        i=find(buffer(1,inner_loop)>=xx, 1, 'last' ); %Grid into x
        j=find(buffer(2,inner_loop)>=yy, 1, 'last' );% into y
        k=find(buffer(3,inner_loop)>=zz, 1, 'last' );% z
        l=find(buffer(4,inner_loop)>=maxis, 1, 'last' );% m/q
        specdata(i,j,k,l)=specdata(i,j,k,l)+1;%Add to specdata (subscripts address one element; [i,j,k,l] would be four linear indices)
    end
    ions=ions+(count/4);
    num_loops=num_loops+1;
    waitbar(ions/num_ions,wait_handle);
    %Check for end-of-file
    if count < read_max*4
        flag=0; %Loop finished
    end
end
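
A side note on indexing, since it affects what the increment line counts: in MATLAB, A([i,j,k,l]) treats the four numbers as linear indices and touches up to four separate elements, while A(i,j,k,l) addresses the single element at those subscripts. The small sketch below uses a made-up 2x3x2x2 array (not the real specdata) to show the difference.

```matlab
% Toy array and subscripts, chosen only for illustration.
A = zeros(2, 3, 2, 2);
i = 2; j = 3; k = 1; l = 2;

% Bracketed form: linear indices 2, 3, 1 and 2 are incremented.
B = A;
B([i, j, k, l]) = B([i, j, k, l]) + 1;

% Comma-separated form: only the element at (2,3,1,2) is incremented.
C = A;
C(i, j, k, l) = C(i, j, k, l) + 1;

disp(nnz(B));   % number of elements changed by the bracketed form
disp(nnz(C));   % number of elements changed by the subscript form
```

For a histogram, the subscript form is the one that puts each ion into a single 4-D bin.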

For a datafile of 1e6 4-element datapoints, this takes ~40 seconds. According to the profiler, about half of that time is spent on the line that increments specdata, and the rest is spread across the four find() lines.

I'm planning to move from this 1e6-point test dataset to real datasets of 1e9+ points, which would take this code more than 10 hours (about 40,000 seconds, assuming the run time scales linearly).

I've already optimized as well as I know, using the tricks in http://www.mathworks.com/company/newsletters/news_notes/june07/patterns.html

Has anyone thought about how to grid data into an N-D histogram efficiently? Any suggestions on how to speed this up? I'm a poor materials scientist learning computer science sink-or-swim style!
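
One possible direction, sketched here under assumptions rather than taken from the thread: bin each whole buffer at once with histc and accumulate the counts with accumarray, instead of calling find() once per ion. The function name grid4d is hypothetical. The sketch assumes each edge vector is sorted in ascending order, and it drops points that fall below the first edge or are NaN. Appending Inf to the edges keeps values above the last edge in the last bin, which is what find(value>=edges,1,'last') does.

```matlab
function counts = grid4d(buffer, xx, yy, zz, maxis)
% GRID4D  Hypothetical helper: bin all columns of a 4-by-N buffer at once.
% Returns a numel(xx)-by-numel(yy)-by-numel(zz)-by-numel(maxis) count array.
edges = {xx, yy, zz, maxis};
n     = size(buffer, 2);
subs  = zeros(n, 4);
sz    = zeros(1, 4);
for d = 1:4
    e     = double(edges{d}(:));
    sz(d) = numel(e);
    % With Inf appended, bin k means e(k) <= value < e(k+1), and the
    % last real bin collects everything from e(end) upward.
    [~, bin]   = histc(double(buffer(d, :)), [e; Inf]);
    subs(:, d) = min(bin(:), sz(d));   % a value of exactly +Inf lands one bin too far
end
keep   = all(subs > 0, 2);             % bin 0: below the first edge, or NaN
counts = accumarray(subs(keep, :), 1, sz);
end
```

Inside the read loop this would be used roughly as specdata = specdata + single(grid4d(buffer, xx, yy, zz, maxis)); replacing the inner for loop. Whether it is actually faster on a given machine and MATLAB release would need to be checked with the profiler.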

Thanks for the help.

Subject: find function runs slowly -- optimize?

From: Bruno Luong

Date: 24 Nov, 2009 19:34:20

Message: 2 of 3

"Chad " <parishcm@ornl.gov> wrote in message <hegq50$97p$1@fred.mathworks.com>...

>
> Has anyone thought about how to grid data into an N-D histogram efficiently?
>

Why reinvent the wheel?
http://www.mathworks.com/matlabcentral/fileexchange/23897-n-dimensional-histogram

Bruno

Subject: find function runs slowly -- optimize?

From: Chad

Date: 24 Nov, 2009 20:26:19

Message: 3 of 3

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message
>
> Why reinvent the wheel?
> http://www.mathworks.com/matlabcentral/fileexchange/23897-n-dimensional-histogram
>
> Bruno

Thank you, I will look into this.
