Thread Subject:
textscan uses VAST amounts of memory with some larger text files

Subject: textscan uses VAST amounts of memory with some larger text files

From: Thomas

Date: 27 Jan, 2014 20:27:07

Message: 1 of 2

Hi
I am currently using textscan to import non-rectangular text files into MATLAB. The data has the following basic format (shown here with lower precision to aid readability):

NVM_V3 %header

3 % number of cameras followed by camera list (filename/extrinsics/intrinsics)
DSC05814.JPG 8774.7363 0.982 -0.099 -0.0984 -0.128 -0.174 0.008 -0.361 -0.312 0
DSC05826.JPG 8719.6439 0.970 -0.039 -0.162 -0.170 -0.811 -0.668 -0.872 -0.289 0
DSC05825.JPG 8718.2906 0.977 -0.059 -0.108 -0.176 -0.956 -0.083 -0.976 -0.286 0

10 % number of points followed by a list of row vectors (x/y/z/R/G/B/views/measurements)
3.706 0.009 5.521 147 116 87 2 1 4695 -10.072 829.875 2 4129 138.551 20.650
4.118 0.115 5.901 98 71 54 1 1 5698 1308.469 704.791
3.680 -0.351 5.285 171 137 102 2 1 6613 -586.595 -81.978 3 4142 -489.869 -1032.766
3.479 0.0586 5.469 49 30 21 1 2 6752 -574.997 1148.147 26
3.417 -0.086 5.224 105 68 38 2 2 7826 -1111.384 885.410 3 4167 -979.546 -2.273
3.964 0.059 5.749 120 88 65 2 1 7107 815.728 710.646 3 4171 959.160 -61.294
4.032 0.139 5.837 51 33 22 2 2 5371 1090.961 839.350 3 4174 1242.225 89.978
3.732 -0.132 5.410 195 165 141 2 1 5167 -153.592 457.226 3 4175 -16.533 -431.148
3.68557024172 -0.126260038974 5.39401729277 109 76 51 1 1 5307 -282.079 513.668
3.683 -0.094 5.410 90 58 42 2 2 5537 -247.375 598.106 3 4183 -106.569 -271.090

where the first list contains the cameras with their intrinsic / extrinsic parameters and the second list contains the points (xyz-rgb), each followed by a list of measurements. The number of measurements can differ from point to point (i.e. the list is non-rectangular), and the point list is several orders of magnitude longer than the camera list.
I want to read this entire file into MATLAB with each row put into a separate cell as a character array (I leave the camera info as string data but convert the point list into numeric data to do other operations on it). I can get this result using the following code:

% Open the target NVM model file
[filename, pathname] = uigetfile('*.nvm', 'Multiselect', 'off');
fullpath = fullfile(pathname, filename);
fid = fopen(fullpath, 'r');
% Read every line after the header as one string per cell
c = textscan(fid, '%s', 'delimiter', '', 'whitespace', '', 'HeaderLines', 1, 'BufSize', 6500);
fclose(fid);

% Extract the lines from the outer cell
C = c{1};

However, whilst this works for text files that are a few MB to a few tens of MB in size, most of my data is 500 MB+. Using the above code for files of this size results in memory being eaten up at an alarming rate: I tried it with a 500 MB file on a 64 GB workstation today and the entire physical memory was consumed in a couple of minutes!
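To put the storage cost in context, here is a minimal sketch comparing one point line held as a char array inside a cell with the same line held as doubles. MATLAB stores char data at two bytes per character and each cell adds a header of its own; the header size is version dependent, so the cell figure printed here is only indicative. The variable names are made up for illustration. This overhead alone would not be expected to exhaust 64 GB, so it is only part of the picture.

```matlab
% One point line from the sample above
txt = '4.118 0.115 5.901 98 71 54 1 1 5698 1308.469 704.791';

asCell = {txt};                 % how textscan with '%s' hands the line back
asNum  = sscanf(txt, '%f').';   % the same line as a numeric row vector

infoCell = whos('asCell');
infoNum  = whos('asNum');

fprintf('characters in line : %d\n', numel(txt));
fprintf('bytes as cell/char : %d\n', infoCell.bytes);
fprintf('bytes as double    : %d\n', infoNum.bytes);
```

The double version costs 8 bytes per value regardless of how many digits were printed in the file, which is one argument for converting the point list to numeric data as early as possible.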
I'm not sure what the best approach is here. Would it be best to bring in just the camera data as strings and then import the larger point list as numeric data? I'm not sure how this could best be achieved, given that importdata() expects rectangular data as input.
Any advice / solutions would be greatly appreciated.
Thanks
Thomas
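One possible way to do what the question asks (cameras kept as strings, points converted to numeric data) is to read the file line by line with fgetl and parse each point line with sscanf, so that the ragged rows never have to fit a rectangular format. The sketch below is not from the thread: it writes a tiny sample file in the layout shown above so that it can be run as is, and it assumes that each count sits on a line of its own (optionally followed by a comment) and that blank lines may separate the sections. All variable names are illustrative.

```matlab
% Write a small sample file in the layout described above
sample = {'NVM_V3', '', '2', ...
    'DSC05814.JPG 8774.7363 0.982 -0.099 -0.0984 -0.128 -0.174 0.008 -0.361 -0.312 0', ...
    'DSC05826.JPG 8719.6439 0.970 -0.039 -0.162 -0.170 -0.811 -0.668 -0.872 -0.289 0', ...
    '', '2', ...
    '3.706 0.009 5.521 147 116 87 2 1 4695 -10.072 829.875 2 4129 138.551 20.650', ...
    '4.118 0.115 5.901 98 71 54 1 1 5698 1308.469 704.791'};
fname = [tempname '.nvm'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', strjoin(sample, sprintf('\n')));
fclose(fid);

fid = fopen(fname, 'r');
fgetl(fid);                              % skip the NVM_V3 header line

% Skip blank lines, then read the camera count
txt = '';
while ischar(txt) && isempty(strtrim(txt))
    txt = fgetl(fid);
end
nCams = sscanf(txt, '%d', 1);

% Camera lines stay as strings
cameras = cell(nCams, 1);
for k = 1:nCams
    cameras{k} = fgetl(fid);
end

% Skip blank lines, then read the point count
txt = '';
while ischar(txt) && isempty(strtrim(txt))
    txt = fgetl(fid);
end
nPts = sscanf(txt, '%d', 1);

% Each point line becomes a numeric row vector of whatever length it has
points = cell(nPts, 1);
for k = 1:nPts
    points{k} = sscanf(fgetl(fid), '%f').';
end
fclose(fid);
delete(fname);
```

Reading one line at a time is slower than a single textscan call, but only one line of text is held in memory at any moment, and the result is already numeric.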

Subject: textscan uses VAST amounts of memory with some larger text files

From: Thomas

Date: 28 Jan, 2014 11:07:12

Message: 2 of 2

Just in case anyone is interested, I solved this problem by reading the file in blocks of lines and storing each incoming block in a structure array:

block_size = 10000;   % lines read per call to textscan
data = struct;
format = '%s';
count = 0;
file_id = fopen(fullpath);
while ~feof(file_id)
    count = count + 1;
    % Read the next block_size lines, one string per line
    segarray = textscan(file_id, format, block_size, 'delimiter', '', 'whitespace', '');
    data(count).blocks = segarray;
    disp(['processed block ' num2str(count)]);
end
fclose(file_id);

This solves the problem for me. Perhaps others may find it useful.
Thomas
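A variation on the block approach above, sketched here as an assumption rather than as what was actually run, is to convert each block to numeric data as soon as it is read, so that the strings of only one block are alive at a time. The example keeps just the first six values of each point line (x/y/z/R/G/B) and assumes the file position is already at the start of the point list and that every line has at least six values. It writes its own small test file; xyzrgb, blk and the other names are illustrative.

```matlab
% Small test file containing point lines only
pts = {'3.706 0.009 5.521 147 116 87 2 1 4695 -10.072 829.875 2 4129 138.551 20.650', ...
       '4.118 0.115 5.901 98 71 54 1 1 5698 1308.469 704.791', ...
       '3.680 -0.351 5.285 171 137 102 2 1 6613 -586.595 -81.978 3 4142 -489.869 -1032.766', ...
       '3.964 0.059 5.749 120 88 65 2 1 7107 815.728 710.646 3 4171 959.160 -61.294', ...
       '4.032 0.139 5.837 51 33 22 2 2 5371 1090.961 839.350 3 4174 1242.225 89.978'};
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', pts{:});
fclose(fid);

block_size = 2;            % tiny on purpose; the thread used 10000
xyzrgb = zeros(0, 6);      % numeric result, grown block by block

fid = fopen(fname, 'r');
while ~feof(fid)
    seg = textscan(fid, '%s', block_size, 'delimiter', '', 'whitespace', '');
    blockLines = seg{1};
    if isempty(blockLines)
        break;             % nothing was read, avoid looping forever
    end
    blk = zeros(numel(blockLines), 6);
    for k = 1:numel(blockLines)
        v = sscanf(blockLines{k}, '%f', 6);   % first six values only
        blk(k, :) = v.';
    end
    xyzrgb = [xyzrgb; blk]; %#ok<AGROW>
end
fclose(fid);
delete(fname);
```

Growing xyzrgb by concatenation is acceptable for a sketch; if the point count from the file is read first, the matrix can be preallocated instead.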
