Thread Subject: loading in large data...

Subject: loading in large data...

From: jay vaughan

Date: 06 May, 2008 01:42:03

Message: 1 of 4

Hi,

I am having some trouble optimizing the loading & handling
of large files (movies, in a kind of weird format). Any
comment on the following?

1) My code for loading the data (see below) was slow, and it
didn't scale linearly with the number of iterations of the
loop as I had thought it would. Any ideas on how to speed it up?

num_iterations = [10 30 100 300]; % # of frames loaded
time_measured = [0.098 0.37 2.18 14.3]; % time in seconds

2) Loading in the data as uint8. I think I want to work with
the data as uint8 not double since the data is 8 bit anyway
and uint8 is 8x more compact than double. Does this seem
like a good strategy?

Below is my code to load the data. The details of the file
format are after the code in case it is helpful.

[fid,msga]=fopen(filename,'r','ieee-le');

% first find xy dimensions
x_dim = fread(fid,1,'uint16');
y_dim = fread(fid,1,'uint16');

% loop through frames reading the data, reading first the
% frame number then the frame data
frame_num = fread(fid,1,'uint16');
mov = uint8( fread(fid,[x_dim, y_dim],'uint8') );
for k = 2:100;
    frame_num = fread(fid,1,'uint16');
    frame = fread(fid,[x_dim, y_dim],'uint8');
    mov = cat(3,mov,uint8(frame));
end

fclose(fid);


The movie format is as follows. First, 4 bytes indicate
the x dimension size and y dimension size (2 bytes each).
Then two bytes indicate a frame number, followed by a single
byte for each of x_dim*y_dim pixels to form a frame. The
frame number, frame data structure is repeated until the end
of the movie.

Subject: Re: loading in large data...

From: John D'Errico

Date: 06 May, 2008 02:36:03

Message: 2 of 4

"jay vaughan" <jvaughan5.nospam@gmail.com> wrote in message
<fvod1b$di3$1@fred.mathworks.com>...
> Hi,
>
> I am having some trouble optimizing the loading & handling
> of large files (movies, in a kind of weird format). Any
> comment on the following?
>
> 1) My code for loading the data (see below) was slow, and it
> didn't scale linearly with the number of iterations of the
> loop as I had thought it would. Any ideas on how to speed it up?

It's funny. But I'd never have expected code
that does not pre-allocate an array, then
repeatedly concatenates data to the array
to behave in a linear fashion.

Your test does exactly what you should
NEVER do. When you append data to an
array, Matlab must dynamically re-allocate
that array with every iteration of the loop.
This is a trivial thing when you do it once,
or on a tiny array of numbers, just a few
times.

When you do that same operation on a
large array, and then do it several hundred
times, it takes a long time. If the array is
large enough, this can potentially cause
seriously bad fragmentation of your
memory. It can force Matlab to begin
swapping large blocks of data in and out
of virtual memory. The point is, this is
NOT at all a process that is linear in the
time required.

John
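
The cost John describes can be checked with a small timing comparison. In the timings from the original post, going from 100 to 300 frames raised the load time from 2.18 s to 14.3 s, about 6.6 times longer for 3 times the data. The sketch below is only an illustration: the frame size and frame count are assumed values, the frame is a block of zeros rather than data read from disk, and the actual timings will depend on the machine and MATLAB version.

```matlab
% Compare growing an array with cat against filling a preallocated one.
% x_dim, y_dim and n_frames are illustrative values, not from the thread.
x_dim = 128; y_dim = 128; n_frames = 200;
frame = zeros(x_dim, y_dim, 'uint8');   % stand-in for one frame read from disk

% Growing: each cat builds a new array holding all earlier frames plus
% the new one, so the total work is about 1 + 2 + ... + n = n*(n+1)/2
% frame copies.
tic;
grown = frame;
for k = 2:n_frames
    grown = cat(3, grown, frame);
end
t_grow = toc;

% Preallocated: memory is reserved once and each frame is written once.
tic;
pre = zeros(x_dim, y_dim, n_frames, 'uint8');
for k = 1:n_frames
    pre(:,:,k) = frame;
end
t_pre = toc;

fprintf('grow: %.3f s, preallocate: %.3f s\n', t_grow, t_pre);
```

Both loops produce the same array; only the amount of copying along the way differs.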

Subject: Re: loading in large data...

From: Peter Boettcher

Date: 06 May, 2008 12:47:45

Message: 3 of 4

"jay vaughan" <jvaughan5.nospam@gmail.com> writes:

> Hi,
>
> I am having some trouble optimizing the loading & handling
> of large files (movies, in a kind of weird format). Any
> comment on the following?
>
> 1) My code for loading the data (see below) was slow, and it
> didn't scale linearly with the number of iterations of the
> loop as I had thought it would. Any ideas on how to speed it up?
>
> num_iterations = [10 30 100 300]; % # of frames loaded
> time_measured = [0.098 0.37 2.18 14.3]; % time in seconds
>
> 2) Loading in the data as uint8. I think I want to work with
> the data as uint8 not double since the data is 8 bit anyway
> and uint8 is 8x more compact than double. Does this seem
> like a good strategy?
>
> Below is my code to load the data. The details of the file
> format are after the code in case it is helpful.
>
> [fid,msga]=fopen(filename,'r','ieee-le');
>
> % first find xy dimensions
> x_dim = fread(fid,1,'uint16');
> y_dim = fread(fid,1,'uint16');
>
> % loop through frames reading the data, reading first the
> % frame number then the frame data
> frame_num = fread(fid,1,'uint16');
> mov = uint8( fread(fid,[x_dim, y_dim],'uint8') );

Instead of letting fread convert to double and then uint8() convert back to
uint8, specify the fread format as '*uint8' to keep the data in its
original format.
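
A quick way to see the difference between the two precision strings is to read the same bytes both ways and compare the output classes. This is a minimal sketch; the temporary file and the three sample byte values are made up for the demonstration.

```matlab
% Write a few bytes to a temporary file, then read them back two ways.
tmp = [tempname '.bin'];
fid = fopen(tmp, 'w');
fwrite(fid, uint8([0 127 255]), 'uint8');
fclose(fid);

fid = fopen(tmp, 'r');
a = fread(fid, 3, 'uint8');     % default: values are returned as double
frewind(fid);                   % go back to the start of the file
b = fread(fid, 3, '*uint8');    % '*uint8' is shorthand for 'uint8=>uint8'
fclose(fid);
delete(tmp);

disp(class(a));
disp(class(b));
```

The values are identical in both cases; only the storage class of the result changes, which is what saves the extra conversion and the temporary double array.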

> for k = 2:100;
> frame_num = fread(fid,1,'uint16');
> frame = fread(fid,[x_dim, y_dim],'uint8');
> mov = cat(3,mov,uint8(frame));

As John says, dynamically expanding the array like this requires a
memory allocation and memcpy for each frame.

> end
>
> fclose(fid);

Instead, preallocate your array then fill it in:

mov = zeros(x_dim, y_dim, 100, 'uint8');  % allocate all 100 frames once

for k=1:100
  frame_num = fread(fid, 1, 'uint16');               % 2-byte frame number
  mov(:,:,k) = fread(fid, [x_dim y_dim], '*uint8');  % read frame directly as uint8
end



-Peter
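
The preallocated loop above uses a fixed count of 100 frames. Since the format described in the original post has a 4-byte header followed by fixed-size records (2 bytes of frame number plus x_dim*y_dim pixel bytes), the frame count can probably be derived from the file size instead. The sketch below assumes that layout holds exactly and that the file has no trailer; it first writes a tiny synthetic movie so the example runs on its own. The file name, dimensions, and pixel values are all hypothetical.

```matlab
% --- Build a small synthetic movie in the format described in the thread ---
filename = [tempname '.dat'];            % hypothetical file name
x_dim = 4; y_dim = 3; n_written = 5;     % illustrative sizes
fid = fopen(filename, 'w', 'ieee-le');
fwrite(fid, [x_dim y_dim], 'uint16');    % 4-byte header: x and y dimensions
for k = 1:n_written
    fwrite(fid, k, 'uint16');                       % 2-byte frame number
    fwrite(fid, k * ones(x_dim, y_dim), 'uint8');   % one byte per pixel
end
fclose(fid);

% --- Load it with preallocation and direct uint8 reads ---
[fid, msg] = fopen(filename, 'r', 'ieee-le');
if fid < 0
    error('Could not open %s: %s', filename, msg);
end
x_dim = fread(fid, 1, 'uint16');
y_dim = fread(fid, 1, 'uint16');

% Frame count from file size: (total bytes - header) / bytes per record.
info = dir(filename);
bytes_per_frame = 2 + x_dim * y_dim;
n_frames = floor((info.bytes - 4) / bytes_per_frame);

mov = zeros(x_dim, y_dim, n_frames, 'uint8');   % allocate once
frame_num = zeros(n_frames, 1);                 % keep every frame number
for k = 1:n_frames
    frame_num(k) = fread(fid, 1, 'uint16');
    mov(:,:,k) = fread(fid, [x_dim y_dim], '*uint8');
end
fclose(fid);
delete(filename);
```

Storing frame_num as a vector, rather than overwriting a scalar on each pass, keeps the frame numbers available afterwards, for example to check for dropped frames.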

Subject: Re: loading in large data...

From: jay vaughan

Date: 07 May, 2008 04:07:02

Message: 4 of 4

Peter and John,

Thanks for the replies. Preallocation helped, as did keeping
the frame in uint8.

Now I am working on how to display the movie frames quickly
in a GUI when controlled with arrow keys. We'll see how that
goes...


J
