Cannot get textscan() while loop to work

David malins on 13 Jun 2018
Hi, I have a .csv file that I am trying to read with textscan in a while loop. Below is a sample; the actual file has 1,700,000 rows and 5 columns/fields. I can manually read each successive line and concatenate the results to build up TSOutput iteratively, but when run from a .m file the code just returns a 1x5 cell array with nothing in it.
Device;Date;50212 Most restrictive Status 10M ();50187 Average Power 10M (kW);49411 Average Speed 10M (rpm)
A01;01/02/2018 00:00:00.000;100;2500.0;13.0
A01;01/02/2018 00:10:00.000;100;2499.9;13.0
A01;01/02/2018 00:20:00.000;100;2500.0;13.0
A01;01/02/2018 00:30:00.000;100;2500.0;13.0
A01;01/02/2018 00:40:00.000;100;2500.0;13.0
A01;01/02/2018 00:50:00.000;100;2500.1;13.0
A01;01/02/2018 01:00:00.000;100;2500.0;13.0
A01;01/02/2018 01:10:00.000;100;2499.9;13.0
FileID = fopen('X-MinutalXX.csv');
TSOutput = textscan(FileID, '%s%s%d%d%d/n','Delimiter',';','HeaderLines',1);
while not(feof(FileID))
    TempData = textscan(FileID, '%s%s%d%d%d/n','Delimiter',';'); % read next row from the file
    if feof(FileID)
        break;
    end
    TSOutput = [TSOutput; TempData];
end
fclose(FileID);
Hope someone can help me?

Answers (3)

dpb on 13 Jun 2018
The format string is in error; the last two numeric fields in each record are floating point (e.g. 2500.0 and 13.0), so %d is the wrong conversion for them.
fmt=['%s%s' repmat('%f',1,3)]; % textscan handles EOR automagically unless doing something funky
With that correction the file reads without a problem, but note that dynamically reallocating the output on every pass will be an exceedingly slow bottleneck as the record count grows...do NOT do this!!!!
What is the end result needed? The better solution will be to read sizable chunks of the file into memory at a time and process those, or to use some of the other features in MATLAB for large files; which, specifically, would be most beneficial depends on the need.
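As a rough illustration of the chunked approach suggested above, here is a minimal sketch, not a definitive implementation. It writes a few of the sample rows to a temporary file (with a shortened, made-up header line), reads them N records at a time with the corrected format, keeps only the numeric columns of each chunk, and concatenates once at the end instead of growing the output per record. The variable names, the chunk size and the choice to keep only the numeric columns are assumptions for the example.

```matlab
% Write a few sample rows to a temporary file (stand-in for the real CSV;
% the header text is shortened for the example).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Device;Date;Status;Power;Speed\n');
fprintf(fid, 'A01;01/02/2018 00:00:00.000;100;2500.0;13.0\n');
fprintf(fid, 'A01;01/02/2018 00:10:00.000;100;2499.9;13.0\n');
fprintf(fid, 'A01;01/02/2018 00:20:00.000;100;2500.0;13.0\n');
fprintf(fid, 'A01;01/02/2018 00:30:00.000;100;2500.0;13.0\n');
fprintf(fid, 'A01;01/02/2018 00:40:00.000;100;2500.1;13.0\n');
fclose(fid);

fmt = ['%s%s' repmat('%f',1,3)];  % corrected format from this answer
N = 2;                            % records per chunk (tiny here, assumed value)
parts = {};                       % one numeric block per chunk

fid = fopen(fname);
fgetl(fid);                       % skip the header line once
while ~feof(fid)
    C = textscan(fid, fmt, N, 'Delimiter', ';');
    if isempty(C{1})              % nothing more could be read: stop
        break;
    end
    parts{end+1} = [C{3} C{4} C{5}]; %#ok<AGROW> grows per chunk, not per record
end
fclose(fid);
delete(fname);

numericData = vertcat(parts{:});  % single concatenation at the end
```

For a file with millions of rows, N would be set far larger (something like 1e5 is a plausible starting point, but that is a guess to tune against available memory).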

OCDER on 13 Jun 2018
FID = fopen('X-MinutalXX.csv');
Data = textscan(FID, '%s%{MM/dd/yyyy HH:mm:ss.SSS}D%f%f%f', 'Delimiter', ';', 'Headerlines', 1);
fclose(FID);
textscan reads through the whole file until the format no longer matches or end-of-file is reached, so you don't need that while loop. textscan is used differently from fgetl, which reads one line per call.
  2 Comments
dpb on 13 Jun 2018
I'm guessing OP ran into memory issues reading the 1.7E6-record file that way...it might be possible to keep it all in memory, but a very quick test indicated that the file would be 850 MB as a cell array...depending on what else is in memory and how much memory the machine has, who knows???
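One way to run that kind of quick memory test is sketched below. It is only an assumed approach, not necessarily what was done here: build a cell array shaped like the textscan output for a small number of rows, ask whos for its size, and scale to 1.7E6 rows. The sample size and variable names are made up for the example, and the resulting figure will vary with MATLAB version and with how the fields are stored (character vectors versus datetime, for instance).

```matlab
nSample = 1000;   % assumed sample size for the estimate
nTotal  = 1.7e6;  % record count quoted in the question

% Cell array shaped like the textscan output: two text columns, three numeric.
C = {repmat({'A01'}, nSample, 1), ...
     repmat({'01/02/2018 00:00:00.000'}, nSample, 1), ...
     100*ones(nSample, 1), ...
     2500*ones(nSample, 1), ...
     13*ones(nSample, 1)};

info = whos('C');                        % struct with a 'bytes' field
bytesPerRow = info.bytes / nSample;
estMB = bytesPerRow * nTotal / 2^20;     % extrapolated size in MB
fprintf('Estimated size for %g rows: %.0f MB\n', nTotal, estMB);
```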
OCDER on 13 Jun 2018
Good point - missed that gigantic data issue. textscan can then be used to read the next N lines of data at a time, like this:
FID = fopen('X-MinutalXX.csv');
FMT = '%s%{MM/dd/yyyy HH:mm:ss.SSS}D%f%f%f';
N = 2; % number of lines to read in one chunk
fgetl(FID); % skip the header line once, before the loop
while ~feof(FID)
    Data = textscan(FID, FMT, N, 'Delimiter', ';'); % N is the added argument
    % process Data here; it is overwritten on the next pass
end
fclose(FID);


David malins on 13 Jun 2018
Many thanks for correcting my format mistake. I was initially planning to count the rows to allocate memory, but as I wish to filter on the asset number (first field) to generate a matrix to save as a .mat file, I will read an asset chunk at a time with
TempData = textscan(FileID, '%s%s%f%f%f','Delimiter',';');
if strcmp(TempData{1},'A01')
    TSOutput = [TSOutput; TempData];
end
OCDER, the reason I had the feof break was that I thought it would read n+1 lines before the while statement realised feof = 1. Is this not correct?
Thanks again
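One point about the filter above: when TempData{1} holds more than one row, strcmp returns a logical column, and an if on a non-scalar logical only runs when every element is true. A minimal sketch of a row-wise alternative using a logical mask follows. The TempData here is a hand-built stand-in for one chunk from textscan, the A02 row is hypothetical (the posted sample only contains A01), and A01Matrix is a made-up name.

```matlab
% Stand-in for one chunk returned by textscan (the A02 row is hypothetical).
TempData = { {'A01'; 'A02'; 'A01'}, ...
             {'01/02/2018 00:00:00.000'; ...
              '01/02/2018 00:00:00.000'; ...
              '01/02/2018 00:10:00.000'}, ...
             [100; 100; 100], ...
             [2500.0; 1200.5; 2499.9], ...
             [13.0; 9.5; 13.0] };

isA01 = strcmp(TempData{1}, 'A01');  % logical column, one entry per row

% Keep only the numeric columns of the matching rows.
A01Matrix = [TempData{3}(isA01) TempData{4}(isA01) TempData{5}(isA01)];
```

The same mask could be applied to each chunk inside a read loop, with the per-chunk matrices collected and concatenated once at the end.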
