Thread Subject:
Very large binary file with long processing time

Subject: Very large binary file with long processing time

From: Ashwin

Date: 26 May, 2011 16:21:04

Message: 1 of 4

Hi all,
I have a program that reads a large binary file (approx. 700 MB). The FOR loop pasted below takes approximately 11 minutes, but I would expect it to finish in a few seconds. Can someone advise on how to speed it up? Thanks.

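for i = 1:npara
    % Progress report every 100 blocks
    if mod(i,100) == 0
        fprintf(1, 'i=%d (of %d)\n', i, npara);
        eval('java.lang.System.out.flush');   % flush Java's stdout
    end
    idx = (i-1)*blk;                          % offset of this block in amp
    if isempty(coder.target)
        % Running in MATLAB: skip 56 bytes from the current position,
        % then read blk samples of 8 bytes each as raw uint8
        fseek(fid, 56, 'cof');
        bb = fread(fid, blk*8, 'uint8=>uint8');
    else
        % Generated code: call the C library routines directly
        coder.ceval(' fseek', fid, int32(56), coder.opaque('int', 'SEEK_CUR'));
        coder.ceval(' fread', coder.ref(bb), int32(1), int32(numel(bb)), fid);
    end
    for j = 1:blk
        % Bytes 1-4 of the sample hold the real part, bytes 5-8 the imaginary part
        a1 = bb(j*8-7);
        a2 = bb(j*8-6);
        a3 = bb(j*8-5);
        a4 = bb(j*8-4);
        b1 = bb(j*8-3);
        b2 = bb(j*8-2);
        b3 = bb(j*8-1);
        b4 = bb(j*8);
        % Swap the two bytes within each 16-bit half, then reinterpret as int32
        x1 = typecast([a2 a1 a4 a3], 'int32');
        y1 = typecast([b2 b1 b4 b3], 'int32');
        amp(idx+j) = complex(double(x1)/ref,double(y1)/ref);
    end
end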
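
One possible way to remove the inner j loop, offered as a sketch rather than a tested fix for this particular file: reshape each block so that every column holds one 8-byte sample, reorder the bytes with a single indexing operation, and call typecast once per block. The small blk and the synthetic bb below are assumptions made for the demo. The loop version is kept only to check that both approaches give the same result. The row order [2 1 4 3] mirrors the byte order used in the original code and assumes the same host endianness.

```matlab
blk = 4;                                  % demo size (the thread uses 4096)
ref = 1073709056;                         % scale factor quoted in the thread
bb  = uint8(mod((1:blk*8)' * 37, 256));   % synthetic bytes standing in for fread output

% Reference: per-sample loop, same byte order as the original code.
ampLoop = complex(zeros(1, blk));
for j = 1:blk
    x1 = typecast([bb(j*8-6) bb(j*8-7) bb(j*8-4) bb(j*8-5)], 'int32');
    y1 = typecast([bb(j*8-2) bb(j*8-3) bb(j*8) bb(j*8-1)], 'int32');
    ampLoop(j) = complex(double(x1)/ref, double(y1)/ref);
end

% Vectorized: one column per 8-byte sample, rows reordered in one step.
cols = reshape(bb, 8, blk);
re   = typecast(reshape(cols([2 1 4 3], :), 1, []), 'int32');
im   = typecast(reshape(cols([6 5 8 7], :), 1, []), 'int32');
ampVec = complex(double(re)/ref, double(im)/ref);

disp(isequal(ampLoop, ampVec));
```

Inside the real loop, the result would be written to amp(idx+1:idx+blk) instead of a fresh vector, so each block costs two typecast calls rather than 2*blk of them.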

From: Florin Neacsu

Date: 26 May, 2011 18:24:04

Message: 2 of 4

Hi,

You haven't given enough information to get an exact answer. I can suggest two things:

your vector amp should be preallocated (if it is not already)
use the profiler to spot bottlenecks.
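
Both suggestions can be sketched together as follows. This is a minimal illustration, not the poster's program: the size n and the loop body are placeholders chosen for the demo. The array is allocated once before the loop so it never has to grow, and the profiler is switched on around the code of interest.

```matlab
n = 1e5;                          % placeholder size for the demo

profile on                        % start collecting timing data

amp = complex(zeros(1, n));       % preallocate once, before the loop
for k = 1:n
    amp(k) = complex(k, -k);      % placeholder for the real per-sample work
end

profile off                       % stop collecting
info = profile('info');           % timing data as a struct

% In MATLAB, "profile viewer" opens the interactive report, which lists
% the time spent per function and per line.
```

Without the preallocation line, amp would be enlarged on every assignment, which tends to dominate the run time for very long vectors.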

Regards,
Florin

From: Ashwin

Date: 26 May, 2011 19:37:02

Message: 3 of 4

Hi Florin,

I did try to preallocate the amp array; I just didn't paste that line here. I preallocated it as a complex array, like this: amp = complex(zeros(1, ampSize)); where ampSize is 98013184. However, when I preallocate amp, the program crashes right at that line.
If I don't preallocate amp, the program runs, but takes a very long time.

Some helpful values:

npara=23929
ampSize=98013184
blk=4096
size(bb)=1 x 32768
ref=1073709056
size(x1)=1 x 1
size(y1)=1 x 1

Regards,
Ashwin

From: Florin Neacsu

Date: 26 May, 2011 21:52:05

Message: 4 of 4

Well, that is not surprising. You need more than 1 GB of contiguous memory. I'm not an expert, but depending on your OS you can probably find a way to make MATLAB use that amount (you should check this), if it is physically available.
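
The arithmetic behind that estimate can be checked with a few lines. This is a rough sketch that uses the ampSize quoted in the thread and the standard element sizes (16 bytes per complex double, 8 bytes per complex single). Storing amp in single precision is shown only as one option and is an assumption, not something proposed in the thread: it halves the memory, but single keeps about 24 bits of precision, so large int32 values would be rounded.

```matlab
ampSize = 98013184;                     % element count quoted in the thread

bytesDouble = ampSize * 16;             % complex double: 8 bytes real + 8 bytes imaginary
bytesSingle = ampSize * 8;              % complex single: 4 bytes real + 4 bytes imaginary

fprintf('complex double: %.2f GiB\n', bytesDouble / 2^30);
fprintf('complex single: %.2f GiB\n', bytesSingle / 2^30);

% Small-scale illustration of a single-precision preallocation.
nDemo    = 1000;                        % demo size only
ampSmall = complex(zeros(1, nDemo, 'single'));
```

Note also that complex(zeros(1, ampSize)) builds the real zeros array before the complex one, so the peak demand during preallocation may be higher than the final array size alone.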

Look for threads on the forum dealing with memory and large matrices. I am sure it's a common topic.

Florin
