Read one column of unknown length from CSV file
Jacee Johnson
on 17 May 2015
Commented: Nicholas Paul Patzke
on 30 Mar 2018
Let's say I want to read column C from a CSV file which has an unknown number of rows. I have written the following code:
csvread('Filepath',0,2,[0 2 Inf 2])
Which gives me the error:
Error using dlmread (line 157)
Internal size mismatch
Error in csvread (line 50)
m=dlmread(filename, ',', r, c, rng);
Have also tried:
csvread('Filepath',0,2,[0 2 : 2])
Which gives the error:
Attempted to access range(3); index out of bounds because numel(range)=2.
Error in dlmread (line 105)
if r > range(3) || c > range(4), result= []; return, end
Error in csvread (line 50)
m=dlmread(filename, ',', r, c, rng);
So any suggestions on how to accomplish this? Thanks
Accepted Answer
per isakson
on 17 May 2015
Edited: per isakson
on 17 May 2015
I failed to make csvread do this. Since csvread is based on textscan, I propose you use textscan directly. It's at least better documented, IMO.
These two scripts read the third column of csv_test.txt.
fid = fopen( 'csv_test.txt', 'r' );
cac = textscan( fid, '%*d%*d%d%*d%*d%*d', 'Delimiter',',', 'Headerlines',1 );
fclose( fid );
fid = fopen( 'csv_test.txt', 'r' );
cac = textscan( fid, '%d%*[^\n]', 'Delimiter',',' ...
, 'Headerlines',1, 'Headercolumns',2 );
fclose( fid );
('Headercolumns',2 is an undocumented option I stumbled upon when I tried to find out how to use csvread.)
Here csv_test.txt contains:
A,B,C,D,E,F
11,12,13,14,15,16
21,22,23,24,25,26
31,32,33,34,35,36
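textscan returns a cell array, so the column itself still has to be pulled out of cac. A minimal sketch of that last step, assuming the sample file above is recreated in the current folder (the variable name colC is just for illustration):

```matlab
% Recreate the sample file so the sketch is self-contained.
fid = fopen('csv_test.txt', 'w');
fprintf(fid, 'A,B,C,D,E,F\n');
fprintf(fid, '11,12,13,14,15,16\n');
fprintf(fid, '21,22,23,24,25,26\n');
fprintf(fid, '31,32,33,34,35,36\n');
fclose(fid);

% Read only the third column; '%*d' parses a field and discards it.
fid = fopen('csv_test.txt', 'r');
cac = textscan(fid, '%*d%*d%d%*d%*d%*d', 'Delimiter', ',', 'HeaderLines', 1);
fclose(fid);

% textscan returns one cell per field that was kept; here there is only one.
colC = cac{1};          % column vector, one element per data row
colC = double(colC);    % '%d' yields int32; convert if doubles are needed
```

Using '%f' instead of '%d' in the format would return doubles directly and also cope with non-integer values.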
More Answers (1)
Walter Roberson
on 17 May 2015
csvread() invokes dlmread().
dlmread() invokes textscan() with the format set to '' (the empty string), which triggers an undocumented mode in which textscan tries to determine the number of columns that are present. It also passes in the undocumented 'headercolumns' parameter, which is 0 to skip no columns and otherwise gives the starting column to read from. Finally, it uses the undocumented feature of passing -1 as the number of rows to textscan(), instead of the documented Inf, to mean an unlimited number of rows.
The "Internal size mismatch" error is disabled in dlmread if the number of rows is negative. And here is a hack: the number of rows will be -1 (telling textscan to read the entire file) if the third element of the range you specify is exactly 2 less than the starting row number that you give. So instead of using
csvread('Filepath',0,2,[0 2 Inf 2])
invoke
csvread('Filepath',0,2,[0 2 -2 2])
-2 minus 0 gives -2, and the internal calculation adds 1 to that, so the result is -1. That value is passed to textscan() as the number of rows, so it reads everything, and the fact that the number of rows is negative disables the mismatch error.
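The arithmetic can be written out as a small sketch. This only mirrors the calculation described above; it is not the actual source of dlmread, and the variable names (r, rangeArg, nrows, readAll) are made up for illustration:

```matlab
% Illustration only: mirrors the arithmetic described above,
% not the real internals of dlmread.
r        = 0;                      % starting row passed to csvread
rangeArg = [0 2 -2 2];             % [R1 C1 R2 C2], with R2 = r - 2
nrows    = rangeArg(3) - r + 1;    % -2 - 0 + 1
readAll  = nrows < 0;              % negative count: read everything, skip the size check
```

By the same rule, a read starting at row 5 would need 3 as the third element of the range.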
By the way, when you specify a column range to csvread() it passes it to dlmread(). When dlmread() is given a column range, it uses the undocumented headercolumns parameter to tell textscan() the starting column. textscan() reads all of the columns from there to the end of the line, and returns all those columns. And then dlmread() throws away any extra columns.
So all in all, you might as well invoke textscan() directly and use a '%*' specifier to tell textscan not to return data for the columns you don't want, the way that per isakson showed.
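For an arbitrary column, the format string can be built rather than typed out. A rough sketch, assuming numeric data, one header line, and the csv_test.txt sample from the accepted answer; col, fmt and colData are illustrative names, and '%*[^\n]' is the same skip-rest-of-line idiom used in that answer:

```matlab
% Recreate the sample file so the sketch is self-contained.
fid = fopen('csv_test.txt', 'w');
fprintf(fid, 'A,B,C,D,E,F\n');
fprintf(fid, '11,12,13,14,15,16\n');
fprintf(fid, '21,22,23,24,25,26\n');
fprintf(fid, '31,32,33,34,35,36\n');
fclose(fid);

col = 3;    % 1-based index of the wanted column (assumption for this example)

% Skip col-1 fields, keep one, then discard the remainder of each line.
fmt = [repmat('%*f', 1, col - 1), '%f%*[^\n]'];

fid = fopen('csv_test.txt', 'r');
cac = textscan(fid, fmt, 'Delimiter', ',', 'HeaderLines', 1);
fclose(fid);

colData = cac{1};   % double column vector; length follows the file
```

No row count is passed, so textscan keeps applying the format until the end of the file, which is what handles the unknown number of rows.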