MATLAB Answers

writetable takes forever - what is faster?

cmo on 13 Aug 2015
I have a table with ~500,000 rows and ~20 columns. The table is a mix of text and numbers. It is about 90 MB as a text file.
It takes MATLAB FOREVER to write the table to a text file via the "writetable" function. I'm talking ~30 minutes.
Clearly, this is totally unacceptable.
How can I speed this up?
Please note - there are many (> 20) columns, and the number of them may change according to my data. So please do not suggest any manual solutions like "fprintf('%s\t%f\t%f')" etc.

  2 Comments

Titus Edelhofer on 13 Aug 2015
30 minutes does indeed sound far too long for a large but not giant table like this. Just to be sure: you are writing to a local hard drive and not to some (slow) network drive? I want to check that it is MATLAB that is "to blame" and not the OS/network ... :)
Titus
Álvaro López de Quadros on 23 May 2018
Hi cmo
It's been almost three years since the post of your question and yet there have been 51 views in the last 30 days. A good answer for this highly technical issue could be golden and I want to give it a try; if you still have that code around, please attach the .m file.
Thank you,
AFLQ


Answers (2)

per isakson on 14 Aug 2015
Edited: per isakson on 14 Aug 2015
I've made a simple test with R2013b, 64bit, Win7, local SSD, and a spinning HD.
Some results
  • elapsed time for writing increases linearly with the size of the table variable
  • writing speed is approx. 0.15 MB/sec. EDIT: "speed" refers to the size of the table variable.
  • writing speed is practically the same with the HD
  • elapsed time for 90MB would be approx. 10 minutes. EDIT: "90MB" refers to the size of the table variable.
>> [et,mb] = cssm(1e2)
et =
0.1496
mb =
0.0325
>> [et,mb] = cssm(1e3)
et =
1.4222
mb =
0.2287
>> [et,mb] = cssm(1e4)
et =
14.2710
mb =
2.1907
where
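function [et,mb] = cssm( N )
    % Build a test table with N rows and time writetable on it.
    % et: elapsed write time in seconds; mb: size of the table variable in MB.
    str( N, 9 ) = 'z';                      % preallocate an N-by-9 char array
    for jj = 1 : N
        str(jj,:) = sprintf( 'Row%06d', jj );
    end
    sas.Name = str;
    % 20 numeric columns and 20 single-character text columns
    for jj = 1 : 20
        sas.(sprintf('F%02d',jj)) = rand(N,1);
        sas.(sprintf('S%02d',jj)) = char( randi( double('AZ'), [N,1] ) );
    end
    T  = struct2table( sas );
    sz = whos('T');
    mb = sz.bytes/1e6;                      % size of the variable in memory, not of the file
    tic, writetable( T, 'c:\m\cssm\T1.txt' ), et(1)=toc;
end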
AFAIK: There is no faster, still user-friendly, alternative to writetable.

  2 Comments

Titus Edelhofer on 14 Aug 2015
Nice! Note, though, that mb is the size of the variable, not of the file. The file is roughly double that size, so a 90 MB file should take about 5 minutes instead of 10. I get more or less the same timings, btw.
Titus
per isakson on 14 Aug 2015
Thanks! Yes, my fault. I edited my answer.
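The estimate discussed in these comments can be reproduced with a few lines of arithmetic. This is only a sketch: the timings are the ones printed in the answer above, the factor of 2 is Titus's "roughly double" observation, and the variable names are made up for illustration.

```matlab
% Measurements copied from the cssm(1e2), cssm(1e3), cssm(1e4) runs above.
et = [0.1496 1.4222 14.2710];   % elapsed seconds
mb = [0.0325 0.2287 2.1907];    % size of the table variable in MB
rate = mb ./ et;                % MB of table variable written per second

% Estimate for a 90 MB text file, using the rounded rate quoted in the answer.
rateQuoted = 0.15;              % MB/sec, as stated in the answer
fileMB     = 90;                % size of cmo's text file
filePerVar = 2;                 % assumption: file is about twice the variable size
varMB      = fileMB / filePerVar;
estMinutes = varMB / rateQuoted / 60;   % about 5 minutes

fprintf('rate per run (MB/s): %s\n', mat2str(rate, 3));
fprintf('estimated write time: %.1f minutes\n', estMinutes);
```

If the factor between file size and variable size differs for real data (longer strings, more digits), the estimate shifts proportionally.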



Jan on 14 Aug 2015
You ask us not to suggest fprintf('%s\t%f\t%f'), but of course this is the most direct and fastest solution. You can create the format string automatically based on the type of the data, so the number of columns does not have to be known in advance. So why do you hesitate to call fprintf?
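A minimal sketch of what "create the format string automatically" could look like is given here. It is not code from the thread: writeTableFast is a hypothetical helper name, and it assumes every table variable is either a numeric column vector or a cell array of non-empty character vectors. As far as I know, fprintf skips empty arguments, so an empty text cell would shift the columns. Whether this is actually faster than writetable on a given table has to be measured.

```matlab
function writeTableFast(T, filename)
% Hypothetical helper: write table T as tab-delimited text with a format
% string derived from the type of each variable.
names = T.Properties.VariableNames;
nVars = numel(names);
tab   = sprintf('\t');
spec  = cell(1, nVars);            % one conversion specifier per column
data  = cell(nVars, height(T));    % row k holds the values of variable k
for k = 1:nVars
    col = T.(names{k});
    if isnumeric(col) && iscolumn(col)
        spec{k}   = '%.15g';
        data(k,:) = num2cell(col(:).');
    elseif iscellstr(col)
        spec{k}   = '%s';
        data(k,:) = col(:).';
    else
        error('writeTableFast:type', ...
              'Unsupported type in variable %s.', names{k});
    end
end
fid = fopen(filename, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', filename);
closer = onCleanup(@() fclose(fid));   % close the file even on error
fprintf(fid, '%s\n', strjoin(names, tab));            % header line
% data{:} expands column by column, i.e. one table row at a time, and
% fprintf reuses the format until all arguments are consumed.
fprintf(fid, [strjoin(spec, tab) '\n'], data{:});
end
```

A call would look like writeTableFast(T, 'T1.txt'). The intermediate cell array costs a lot of memory for 500,000 rows, so writing the table in blocks of rows may be the more practical variant.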

  2 Comments

cmo on 17 Aug 2015
The format is subject to change, as the table is liable to have a varying number of columns (depending on the input data).
Walter Roberson on 17 Aug 2015
fmt = ['%s', repmat('\t%f', 1, NumNumericColumns), '\n'];   % '\n' terminates each row
fprintf(fid, fmt, TheString, TheNumericVector);
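To show the idea end to end, here is a small self-contained sketch that applies such a repmat-built format row by row. The data, the variable names labels and values, and the output path are invented for illustration. A newline is appended to the format so that each row ends up on its own line.

```matlab
% Illustrative data (hypothetical): one text label plus 4 numeric columns.
labels  = {'Row000001'; 'Row000002'; 'Row000003'};
values  = rand(3, 4);
outFile = fullfile(tempdir, 'fprintf_demo.txt');   % hypothetical output path

% The format adapts to however many numeric columns the data has.
NumNumericColumns = size(values, 2);
fmt = ['%s', repmat('\t%f', 1, NumNumericColumns), '\n'];

fid = fopen(outFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', outFile);
for r = 1:size(values, 1)
    % One label and one row of numbers fill the format exactly once.
    fprintf(fid, fmt, labels{r}, values(r, :));
end
fclose(fid);
```

Because the format is derived from size(values, 2), nothing has to be edited by hand when the number of columns changes.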
