Read one column of unknown length from CSV file

Let's say I want to read column C from a CSV file which has an unknown number of rows. I have written the following code:
csvread('Filepath',0,2,[0 2 Inf 2])
Which gives me the error:
Error using dlmread (line 157)
Internal size mismatch
Error in csvread (line 50)
m=dlmread(filename, ',', r, c, rng);
I have also tried:
csvread('Filepath',0,2,[0 2 : 2])
Which gives the error:
Attempted to access range(3); index out of bounds because numel(range)=2.
Error in dlmread (line 105)
if r > range(3) || c > range(4), result= []; return, end
Error in csvread (line 50)
m=dlmread(filename, ',', r, c, rng);
So any suggestions on how to accomplish this? Thanks

Accepted Answer

per isakson on 17 May 2015
Edited: per isakson on 17 May 2015
I failed to get csvread to do this. Since csvread is based on textscan, I propose you use textscan directly. It's at least better documented, IMO.
These two scripts read the third column of csv_test.txt.
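% Script 1: skip columns 1-2 and 4-6 with '%*d'; return only column 3.
fid = fopen( 'csv_test.txt', 'r' );
cac = textscan( fid, '%*d%*d%d%*d%*d%*d', 'Delimiter',',', 'Headerlines',1 );
fclose( fid );
% Script 2: skip two leading columns via 'Headercolumns', read one value,
% then discard the rest of each line with '%*[^\n]'.
fid = fopen( 'csv_test.txt', 'r' );
cac = textscan( fid, '%d%*[^\n]', 'Delimiter',',' ...
, 'Headerlines',1, 'Headercolumns',2 );
fclose( fid );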
('Headercolumns',2 is an undocumented option I stumbled upon when I tried to find out how to use csvread.)
and where csv_test.txt contains
A,B,C,D,E,F
11,12,13,14,15,16
21,22,23,24,25,26
31,32,33,34,35,36
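For anyone who wants to try this without preparing a file by hand, here is a minimal self-contained sketch. It writes the sample data above to a temporary file and reads column C as doubles. The temporary file location and the variable names `fname` and `colC` are my own assumptions; it uses `%f` rather than `%d` only so the result is a double column.

```matlab
% Write the sample file shown above to a temporary location (assumed path).
fname = fullfile(tempdir, 'csv_test.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'A,B,C,D,E,F\n');
fprintf(fid, '11,12,13,14,15,16\n21,22,23,24,25,26\n31,32,33,34,35,36\n');
fclose(fid);

% Skip two fields, keep the third, discard the remainder of each line.
fid = fopen(fname, 'r');
cac = textscan(fid, '%*f%*f%f%*[^\n]', 'Delimiter', ',', 'HeaderLines', 1);
fclose(fid);

colC = cac{1};   % column vector holding the values of column C
disp(colC)
```

textscan returns a cell array with one cell per non-skipped field, so `cac{1}` is the whole column regardless of how many rows the file has.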
  1 Comment
Jacee Johnson on 18 May 2015
Thanks to the both of you for your help. This seems like the best approach but now I have trouble using fopen for multiple files. I will create another thread for that topic.


More Answers (1)

Walter Roberson on 17 May 2015
csvread() invokes dlmread()
dlmread() invokes textscan() with the format set to '' (the empty string), which invokes an undocumented mode of textscan that allows it to try to determine the number of columns that are present. It also passes in the undocumented 'headercolumns' parameter, which is 0 to skip no columns and is otherwise the starting column to read from. It also uses the undocumented feature of passing -1 as the number of rows to textscan() instead of using the documented inf to mean infinite number.
The "Internal size mismatch" error is disabled in dlmread if the number of rows is negative. And here is a hack: the number of rows will be -1 (telling textscan to read the entire file) if the third element of the range you specify is precisely 2 less than the starting row number that you give. So... instead of using
csvread('Filepath',0,2,[0 2 Inf 2])
invoke
csvread('Filepath',0,2,[0 2 -2 2])
-2 minus 0 gives -2, and the internal calculation adds 1 to that, giving -1. That value is passed to textscan() as the number of rows, so it reads everything, and because the number of rows is negative the mismatch error is disabled.
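The arithmetic can be checked in isolation. This is only a paraphrase of the calculation described here, not the actual dlmread source, and the variable names are made up for illustration.

```matlab
% Paraphrase of the row-count calculation described above (not dlmread's code).
r       = 0;    % starting row passed to csvread
lastRow = -2;   % third element of the range, chosen as r - 2
nrows   = lastRow - r + 1;   % the "1 is added" step
fprintf('nrows = %d\n', nrows);
```

For any other starting row the same rule applies: pick the third range element as `r - 2` so that `nrows` comes out as -1.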
By the way, when you specify a column range to csvread() it passes it to dlmread(). When dlmread() is given a column range, it uses the undocumented headercolumns parameter to tell textscan() the starting column. textscan() reads all of the columns from there to the end of the line, and returns all those columns. And then dlmread() throws away any extra columns.
So all in all, you might as well invoke textscan() directly and use an '%*' specifier to tell textscan to not even return data for the columns you don't want, the way that per isakson showed.
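As a rough sketch of that advice, the format string can be built from a column index instead of being typed out by hand. The names `col`, `ncols`, and `fname`, and the small test file, are assumptions for illustration. The sketch also assumes you know the total number of columns, as in per isakson's first script.

```matlab
% Small numeric test file with 5 columns and no header (assumed contents).
fname = fullfile(tempdir, 'one_column_demo.csv');
fid = fopen(fname, 'w');
fprintf(fid, '1,2,3,4,5\n6,7,8,9,10\n');
fclose(fid);

col   = 4;   % column to keep
ncols = 5;   % total number of columns in the file

% '%*f' reads a field and discards it; '%f' reads a field and returns it.
fmt = [repmat('%*f', 1, col-1), '%f', repmat('%*f', 1, ncols-col)];

fid = fopen(fname, 'r');
cac = textscan(fid, fmt, 'Delimiter', ',');
fclose(fid);

wanted = cac{1};   % only the requested column is returned
disp(wanted)
```

Because every other field is skipped with `%*`, textscan never allocates output for the unwanted columns, which is the point made above about dlmread reading extra columns and then throwing them away.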
  5 Comments
Jacee Johnson on 19 May 2015
Per, thanks once again, this command made my code much simpler and easier to manage. Thank you very much for all of your help.
Nicholas Paul Patzke on 30 Mar 2018
Passing along the -1 to textscan made it work for me!
