<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/168781</link>
    <title>MATLAB Central Newsreader - loading in large data...</title>
    <description>Feed for thread: loading in large data...</description>
    <language>en-us</language>
    <copyright>&#169; 1994-2008 by The MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>The MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Tue, 06 May 2008 01:42:03 -0400</pubDate>
      <title>loading in large data...</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/168781#430469</link>
      <author>jay vaughan</author>
      <description>Hi,&lt;br&gt;
&lt;br&gt;
I am having some trouble optimizing the loading &amp; handling&lt;br&gt;
of large files (movies, in a kind of weird format). Any&lt;br&gt;
comment on the following?&lt;br&gt;
&lt;br&gt;
1) My code for loading the data (see below) was slow, and it&lt;br&gt;
didn't scale linearly with the number of iterations of the&lt;br&gt;
loop as I had thought it would. Any ideas on how to speed it up?&lt;br&gt;
&lt;br&gt;
num_iterations = [10 30 100 300]; % # of frames loaded&lt;br&gt;
time_measured = [0.098 0.37 2.18 14.3]; % time in seconds&lt;br&gt;
&lt;br&gt;
2) Loading in the data as uint8. I think I want to work with&lt;br&gt;
the data as uint8 not double since the data is 8 bit anyway&lt;br&gt;
and uint8 is 8x more compact than double. Does this seem&lt;br&gt;
like a good strategy?&lt;br&gt;
&lt;br&gt;
Below is my code to load the data. The details of the file&lt;br&gt;
format are after the code in case it is helpful.&lt;br&gt;
&lt;br&gt;
[fid,msga]=fopen(filename,'r','ieee-le');&lt;br&gt;
&lt;br&gt;
% first find xy dimensions&lt;br&gt;
x_dim = fread(fid,1,'uint16');&lt;br&gt;
y_dim = fread(fid,1,'uint16');&lt;br&gt;
&lt;br&gt;
% loop through frames reading the data, reading first the &lt;br&gt;
% frame number then the frame data&lt;br&gt;
frame_num = fread(fid,1,'uint16');&lt;br&gt;
mov = uint8( fread(fid,[x_dim, y_dim],'uint8') );&lt;br&gt;
for k = 2:100;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;frame_num = fread(fid,1,'uint16');&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;frame = fread(fid,[x_dim, y_dim],'uint8');&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;mov = cat(3,mov,uint8(frame));&lt;br&gt;
end&lt;br&gt;
&lt;br&gt;
fclose(fid);&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
The movie format is as follows. First, 4 bytes indicate&lt;br&gt;
the x dimension size and y dimension size (2 bytes each).&lt;br&gt;
Then two bytes indicate a frame number, followed by a single&lt;br&gt;
byte for each of x_dim*y_dim pixels to form a frame. The&lt;br&gt;
frame number, frame data structure is repeated until the end&lt;br&gt;
of the movie. &lt;br&gt;
</description>
    </item>
    <item>
      <pubDate>Tue, 06 May 2008 02:36:03 -0400</pubDate>
      <title>Re: loading in large data...</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/168781#430471</link>
      <author>John D'Errico</author>
      <description>"jay vaughan" &amp;lt;jvaughan5.nospam@gmail.com&amp;gt; wrote in message &lt;br&gt;
&amp;lt;fvod1b$di3$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; Hi,&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I am having some trouble optimizing the loading &amp; handling&lt;br&gt;
&amp;gt; of large files (movies, in a kind of weird format). Any&lt;br&gt;
&amp;gt; comment on the following?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; 1) My code for loading the data (see below) was slow, and it&lt;br&gt;
&amp;gt; didn't scale linearly with the number of iterations of the&lt;br&gt;
&amp;gt; loop as I had thought it would. Any ideas on how to speed it up?&lt;br&gt;
&lt;br&gt;
It's funny. But I'd never have expected code&lt;br&gt;
that does not pre-allocate an array, then&lt;br&gt;
repeatedly concatenates data to the array&lt;br&gt;
to behave in a linear fashion.&lt;br&gt;
&lt;br&gt;
Your test does exactly what you should&lt;br&gt;
NEVER do. When you append data to an&lt;br&gt;
array, Matlab must dynamically re-allocate&lt;br&gt;
that array with every iteration of the loop.&lt;br&gt;
This is a trivial thing when you do it once,&lt;br&gt;
or on a tiny array of numbers, just a few&lt;br&gt;
times.&lt;br&gt;
&lt;br&gt;
When you do that same operation on a&lt;br&gt;
large array, and then do it several hundred&lt;br&gt;
times, it takes much time. If the array is&lt;br&gt;
large enough, this can potentially cause&lt;br&gt;
seriously bad fragmentation of your&lt;br&gt;
memory. It can force Matlab to begin&lt;br&gt;
swapping large blocks of data in and out&lt;br&gt;
of virtual memory. The point is, this is&lt;br&gt;
NOT at all a process that is linear in the&lt;br&gt;
time required.&lt;br&gt;
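&lt;br&gt;
A rough way to see why the growth is worse than linear, assuming each&lt;br&gt;
cat() call copies the whole existing array plus the new frame (an&lt;br&gt;
assumption about the copying cost, not a measurement), is to count the&lt;br&gt;
frame-sized copies for the loop lengths in the original post:&lt;br&gt;
&lt;br&gt;
```matlab
% Count frame-sized copies when an array is grown with cat() in a loop.
% Assumption: iteration k copies the k-1 frames already stored plus the new one.
num_iterations = [10 30 100 300];      % loop lengths from the original post
copies_growing = zeros(size(num_iterations));
for i = 1:numel(num_iterations)
    n = num_iterations(i);
    copies_growing(i) = sum(1:n);      % 1 + 2 + ... + n = n*(n+1)/2
end
copies_prealloc = num_iterations;      % with preallocation: one frame written per iteration
disp([num_iterations; copies_growing; copies_prealloc]);
```
&lt;br&gt;
Under that assumption the copy count grows roughly with the square of&lt;br&gt;
the number of frames, while a preallocated array needs one write per&lt;br&gt;
frame. The count ignores disk reads and allocator behavior, so it is&lt;br&gt;
only a model of the trend, not a prediction of the measured times.&lt;br&gt;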
&lt;br&gt;
John&lt;br&gt;
</description>
    </item>
    <item>
      <pubDate>Tue, 06 May 2008 12:47:45 -0400</pubDate>
      <title>Re: loading in large data...</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/168781#430544</link>
      <author>Peter Boettcher</author>
      <description>"jay vaughan" &amp;lt;jvaughan5.nospam@gmail.com&amp;gt; writes:&lt;br&gt;
&lt;br&gt;
&amp;gt; Hi,&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; I am having some trouble optimizing the loading &amp; handling&lt;br&gt;
&amp;gt; of large files (movies, in a kind of weird format). Any&lt;br&gt;
&amp;gt; comment on the following?&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; 1) My code for loading the data (see below) was slow, and it&lt;br&gt;
&amp;gt; didn't scale linearly with the number of iterations of the&lt;br&gt;
&amp;gt; loop as I had thought it would. Any ideas on how to speed it up?&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; num_iterations = [10 30 100 300]; % # of frames loaded&lt;br&gt;
&amp;gt; time_measured = [0.098 0.37 2.18 14.3]; % time in seconds&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; 2) Loading in the data as uint8. I think I want to work with&lt;br&gt;
&amp;gt; the data as uint8 not double since the data is 8 bit anyway&lt;br&gt;
&amp;gt; and uint8 is 8x more compact than double. Does this seem&lt;br&gt;
&amp;gt; like a good strategy?&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; Below is my code to load the data. The details of the file&lt;br&gt;
&amp;gt; format are after the code in case it is helpful.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; [fid,msga]=fopen(filename,'r','ieee-le');&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; % first find xy dimensions&lt;br&gt;
&amp;gt; x_dim = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt; y_dim = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; % loop through frames reading the data, reading first the &lt;br&gt;
&amp;gt; % frame number then the frame data&lt;br&gt;
&amp;gt; frame_num = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt; mov = uint8( fread(fid,[x_dim, y_dim],'uint8') );&lt;br&gt;
&lt;br&gt;
Instead of fread converting to double, then uint8() converting back to&lt;br&gt;
uint8, specify the fread format as '*uint8', to keep the data in its&lt;br&gt;
original format.&lt;br&gt;
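&lt;br&gt;
A small sketch of the difference, using a throwaway temporary file&lt;br&gt;
(the file name and the four test bytes are made up for the demo):&lt;br&gt;
&lt;br&gt;
```matlab
% Write four bytes to a temporary file, then read them back two ways.
filename = [tempname '.bin'];          % hypothetical scratch file
fid = fopen(filename, 'w');
fwrite(fid, uint8([1 2 3 4]), 'uint8');
fclose(fid);

fid = fopen(filename, 'r');
a = fread(fid, 4, 'uint8');            % default output class is double
frewind(fid);                          % go back to the start of the file
b = fread(fid, 4, '*uint8');           % leading * keeps the source class
fclose(fid);
delete(filename);

disp(class(a));                        % class of the default read
disp(class(b));                        % class of the '*uint8' read
```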
&lt;br&gt;
&amp;gt; for k = 2:100;&lt;br&gt;
&amp;gt;     frame_num = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt;     frame = fread(fid,[x_dim, y_dim],'uint8');&lt;br&gt;
&amp;gt;     mov = cat(3,mov,uint8(frame));&lt;br&gt;
&lt;br&gt;
As John says, dynamically expanding the array like this requires a&lt;br&gt;
memory allocation and memcpy for each frame.  &lt;br&gt;
&lt;br&gt;
&amp;gt; end&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; fclose(fid);&lt;br&gt;
&lt;br&gt;
Instead, preallocate your array then fill it in:&lt;br&gt;
&lt;br&gt;
mov = zeros(x_dim, y_dim, 100, 'uint8');&lt;br&gt;
&lt;br&gt;
for k=1:100&lt;br&gt;
&amp;nbsp;&amp;nbsp;frame_num = fread(fid, 1, 'uint16');&lt;br&gt;
&amp;nbsp;&amp;nbsp;mov(:,:,k) = fread(fid, [x_dim y_dim], '*uint8');&lt;br&gt;
end&lt;br&gt;
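&lt;br&gt;
For anyone who wants to try this without the original movie file, here&lt;br&gt;
is a self-contained sketch. The temporary file, the tiny frame size, and&lt;br&gt;
the way the frame count is derived from the file size are assumptions&lt;br&gt;
for the demo, based on the file layout described in the original post.&lt;br&gt;
&lt;br&gt;
```matlab
% Write a tiny test movie in the layout described in the thread.
x_dim = 4; y_dim = 3; n_written = 5;   % made-up sizes for the demo
filename = [tempname '.bin'];          % hypothetical scratch file
fid = fopen(filename, 'w', 'ieee-le');
fwrite(fid, [x_dim y_dim], 'uint16');  % 4-byte header: x and y dimensions
for k = 1:n_written
    fwrite(fid, k, 'uint16');                      % 2-byte frame number
    fwrite(fid, k * ones(x_dim, y_dim), 'uint8');  % frame filled with the value k
end
fclose(fid);

% Read it back into a preallocated uint8 array.
fid = fopen(filename, 'r', 'ieee-le');
x_dim = fread(fid, 1, 'uint16');
y_dim = fread(fid, 1, 'uint16');
info = dir(filename);
bytes_per_frame = 2 + x_dim * y_dim;   % frame number plus one byte per pixel
n_frames = floor((info.bytes - 4) / bytes_per_frame);  % assumes no trailing data
mov = zeros(x_dim, y_dim, n_frames, 'uint8');
for k = 1:n_frames
    frame_num = fread(fid, 1, 'uint16');
    mov(:, :, k) = fread(fid, [x_dim y_dim], '*uint8');
end
fclose(fid);
delete(filename);
```
&lt;br&gt;
Deriving the frame count from the file size avoids hard-coding 100, but&lt;br&gt;
it only works if the file contains nothing besides the header and the&lt;br&gt;
frames.&lt;br&gt;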
&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
-Peter&lt;br&gt;
</description>
    </item>
    <item>
      <pubDate>Wed, 07 May 2008 04:07:02 -0400</pubDate>
      <title>Re: loading in large data...</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/168781#430720</link>
      <author>jay vaughan</author>
      <description>"jay vaughan" &amp;lt;jvaughan5.nospam@gmail.com&amp;gt; wrote in message&lt;br&gt;
&amp;lt;fvod1b$di3$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; Hi,&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I am having some trouble optimizing the loading &amp; handling&lt;br&gt;
&amp;gt; of large files (movies, in a kind of weird format). Any&lt;br&gt;
&amp;gt; comment on the following?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; 1) My code for loading the data (see below) was slow, and it&lt;br&gt;
&amp;gt; didn't scale linearly with the number of iterations of the&lt;br&gt;
&amp;gt; loop as I had thought it would. Any ideas on how to speed it up?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; num_iterations = [10 30 100 300]; % # of frames loaded&lt;br&gt;
&amp;gt; time_measured = [0.098 0.37 2.18 14.3]; % time in seconds&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; 2) Loading in the data as uint8. I think I want to work with&lt;br&gt;
&amp;gt; the data as uint8 not double since the data is 8 bit anyway&lt;br&gt;
&amp;gt; and uint8 is 8x more compact than double. Does this seem&lt;br&gt;
&amp;gt; like a good strategy?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Below is my code to load the data. The details of the file&lt;br&gt;
&amp;gt; format are after the code in case it is helpful.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; [fid,msga]=fopen(filename,'r','ieee-le');&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; % first find xy dimensions&lt;br&gt;
&amp;gt; x_dim = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt; y_dim = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; % loop through frames reading the data, reading first the &lt;br&gt;
&amp;gt; % frame number then the frame data&lt;br&gt;
&amp;gt; frame_num = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt; mov = uint8( fread(fid,[x_dim, y_dim],'uint8') );&lt;br&gt;
&amp;gt; for k = 2:100;&lt;br&gt;
&amp;gt;     frame_num = fread(fid,1,'uint16');&lt;br&gt;
&amp;gt;     frame = fread(fid,[x_dim, y_dim],'uint8');&lt;br&gt;
&amp;gt;     mov = cat(3,mov,uint8(frame));&lt;br&gt;
&amp;gt; end&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; fclose(fid);&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; The movie format is as follows. First, 4 bytes indicate&lt;br&gt;
&amp;gt; the x dimension size and y dimension size (2 bytes each).&lt;br&gt;
&amp;gt; Then two bytes indicate a frame number, followed by a single&lt;br&gt;
&amp;gt; byte for each of x_dim*y_dim pixels to form a frame. The&lt;br&gt;
&amp;gt; frame number, frame data structure is repeated until the end&lt;br&gt;
&amp;gt; of the movie. &lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Peter and John,&lt;br&gt;
&lt;br&gt;
thanks for the replies. Preallocation helped, as did keeping&lt;br&gt;
the frame in uint8.&lt;br&gt;
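&lt;br&gt;
On the uint8 question from the original post, a quick sketch of the&lt;br&gt;
trade-off. The array sizes and pixel values are made up; the point is&lt;br&gt;
the storage ratio and the fact that uint8 arithmetic saturates:&lt;br&gt;
&lt;br&gt;
```matlab
% Compare storage for the same number of elements in double and uint8.
frames_double = zeros(64, 64, 10);             % 8 bytes per element
frames_uint8 = zeros(64, 64, 10, 'uint8');     % 1 byte per element
info_double = whos('frames_double');
info_uint8 = whos('frames_uint8');
ratio = info_double.bytes / info_uint8.bytes;  % storage ratio, double vs uint8

% Integer arithmetic saturates instead of overflowing or widening.
saturated = uint8(200) + uint8(100);           % clipped to the uint8 maximum
widened = double(uint8(200)) + double(uint8(100));  % convert first to keep the true sum
```
&lt;br&gt;
So storing the movie as uint8 saves memory, but sums, averages, or&lt;br&gt;
differences of frames may need a conversion to double (or single) first.&lt;br&gt;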
&lt;br&gt;
Now I am working on how to display the movie frames quickly&lt;br&gt;
in a GUI when controlled with arrow keys. We'll see how that&lt;br&gt;
goes...&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
J&lt;br&gt;
</description>
    </item>
  </channel>
</rss>
