Speeding Up Text File Reading

Nick G on 13 Jul 2015
Answered: Walter Roberson on 15 Jul 2015
I'm trying to read large .txt files (e.g., 500 MB) and downsample them at the same time to make the data more manageable. I've noticed that the textscan() calls really slow down my code. Is there a faster alternative?
In C, we'd use something like sscanf(), which is lightning fast.
% Read and Downsample %
in = fopen('sampreport.txt', 'r');
fprintf('\nReading sample report at %i hz downsample...', desiredHz);
tline = fgetl(in); %eat header
fileline = 0;
currsamp = 0;
i = 0;
tline = fgetl(in);
while ischar(tline)
    if isempty(strfind(tline, '. .'))
        temp = textscan(tline, '%*f %*f %*s %*f %*f %*f %*s %*f %*s %*s %*f %*s %*f %*f %*f %*s %*f %*f %*f %f %f');
        currtime = temp{1}-temp{2};
        if mod(currtime,2) == 1
            currtime = currtime + 1;
        end
        if currtime <= maxtime && ~mod(currtime,(1000/desiredHz))
            i = i + 1;
            temp = textscan(tline,'%f %f %*s %f %f %f %*s %f %*s %*s %f %s %*f %*f %*f %*s %f %f %*f %*f %*f');
            sample.RIGHT_GAZE_X(i) = temp{1};
            sample.RIGHT_GAZE_Y(i) = temp{2};
            sample.block(i) = temp{3};
            sample.trialNum(i) = temp{4};
            sample.subjNum(i) = temp{5};
            sample.targLoc(i) = temp{6};
            sample.singLoc(i) = temp{7};
            sample.singPres(i) = temp{8};
            sample.ACC(i) = temp{9};
            sample.RT(i) = temp{10};
            sample.sampTime(i) = currtime;
            sample.currentSamp(i) = currtime*(desiredHz/1000);
        end
    end
    fileline = fileline + 1; % countline
    if mod(fileline,200000)<1
        fprintf('.');
    end
    tline = fgetl(in);
end
fclose(in);
  1 Comment
Stephen23 on 14 Jul 2015
Edited: Stephen23 on 14 Jul 2015
This code is very inefficient. MATLAB is not C (or any other language), and it has its own idioms that need to be applied to use it efficiently.
In particular, reading this file line by line wastes textscan's ability to read the whole file (or large parts of it) at once. Growing the output arrays on every iteration, without any preallocation, is also going to be slow. These are not difficult problems to solve...
Unfortunately, you have not provided any sample data for us to work with, so advising you on how to improve your code is difficult. If you actually want help, we need something to work with; otherwise, how can we test and check whether our suggestions are an improvement? Please upload a sample file (yes, it can be trimmed to make it smaller) using the paperclip button, then press both the Choose file and Attach file buttons.
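Since no sample file was posted, the whole-file approach described above can only be sketched on made-up data. The example below is a minimal illustration, not a drop-in replacement: the 4-column layout (gaze X, gaze Y, and two time columns), the temporary file, and the values chosen for desiredHz and maxtime are all assumptions. The real report has 21 columns, so the format string would have to be extended to match it. It also assumes that missing values appear as a lone '.' in numeric fields, which is what the '. .' check in the question suggests.

```matlab
% Sketch only: a made-up 4-column file stands in for the real report.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'X Y T0 T1\n');                  % header line
for t = 1:20
    fprintf(fid, '%.1f %.1f %d %d\n', 10*t, 5*t, 1000 + t, 1000);
end
fprintf(fid, '. . 1021 1000\n');              % row with missing gaze data
fclose(fid);

desiredHz = 250;                              % assumed target rate
maxtime   = 16;                               % assumed cutoff (ms)
step      = 1000 / desiredHz;                 % keep every 4th millisecond

% One textscan call reads every row; lone '.' fields come back as NaN.
fid = fopen(fname, 'r');
C = textscan(fid, '%f %f %f %f', 'HeaderLines', 1, 'TreatAsEmpty', '.');
fclose(fid);
delete(fname);

% Same time arithmetic as the loop in the question, applied to whole columns.
currtime = C{3} - C{4};
odd = mod(currtime, 2) == 1;
currtime(odd) = currtime(odd) + 1;

% Logical mask replaces the per-line if-tests, so no array grows in a loop.
keep = ~isnan(C{1}) & ~isnan(C{2}) & ...
       currtime <= maxtime & mod(currtime, step) == 0;

sample.RIGHT_GAZE_X = C{1}(keep).';
sample.RIGHT_GAZE_Y = C{2}(keep).';
sample.sampTime     = currtime(keep).';
sample.currentSamp  = sample.sampTime * (desiredHz / 1000);
```

Because every output field is built with a single indexing operation, the preallocation problem disappears as well. For files too large to hold in memory, textscan also accepts a repeat count as its third argument, so the same code could be run on blocks of rows.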

Answers (1)

Walter Roberson on 15 Jul 2015
MATLAB has sscanf()
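As a rough illustration of what that could look like for a single line, here is a minimal sketch. The line contents and the shortened 7-field layout are invented for the example; the real format string would need one conversion per column of the report, with %*s (or %*f) for the fields to skip.

```matlab
% Made-up line: two gaze values, one text field to skip, then four numbers.
tline = '512.3 384.9 RIGHT 2 17 1042 1000';

% sscanf returns all converted numbers in one column vector;
% %*s reads the text field and discards it.
vals = sscanf(tline, '%f %f %*s %f %f %f %f');

% sscanf stops at the first field it cannot convert, so a short
% result signals a line that did not match the format.
isComplete = numel(vals) == 6;

if isComplete
    gazeX    = vals(1);
    gazeY    = vals(2);
    block    = vals(3);
    trialNum = vals(4);
    currtime = vals(5) - vals(6);
end
```

Unlike textscan, sscanf returns a plain numeric vector rather than a cell array, so there is no per-field cell indexing on each line. Text fields that need to be kept would have to be handled separately, since sscanf converts characters to their numeric codes when mixed with numbers.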
