How can I read csv file data correctly? I tried multiple ways

Hi,
I have a CSV file that I am trying to import into MATLAB. An example of the file content is the following (this is one row):
689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62
I tried using
readtable()
but the data is not split into columns at the commas.
Then, I tried
csvread()
and I get the error:
Error using dlmread (line 147)
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 1, field number 1) ==>
Also, I tried
textscan()
and it did not do anything (no error, but no content extracted either).
Lastly, I tried to import the data manually using the interface, but the same problem as with readtable() occurred.
How can I read the data correctly and put it in a matrix or table?
Thank you

Accepted Answer

Stephen23 on 2 Nov 2020
Edited: Stephen23 on 2 Nov 2020
textscan has no problems importing the file data simply and efficiently (sample file is attached):
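opt = {'Delimiter',',','CollectOutput',true}; % comma-separated; merge adjacent columns of the same type
fmt = '%s%s%q%q%q%q%f%f%f%f'; % %s = text, %q = text with surrounding double quotes removed, %f = number
[fid,msg] = fopen('temp0.txt','rt'); % open the file for reading in text mode
assert(fid>=3,msg) % stop with fopen's message if the file could not be opened
out = textscan(fid,fmt,opt{:});
fclose(fid);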
Giving:
>> out{1} % character data
ans =
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2'
[1x27 char] 'true' 'timestamp' ' 3' ' 9' ' 1'
[1x27 char] 'true' 'timestamp' ' 4' ' 10' ' 0'
[1x27 char] 'true' 'timestamp' ' 5' ' 11' ' -1'
>> out{2} % numeric data
ans =
25.2100 17.5366 39.0000 62.0000
25.2300 17.5366 40.0000 63.0000
25.2400 17.5366 41.0000 64.0000
25.2500 17.5366 42.0000 65.0000
>>
Because readtable also supports the format specifier I see no reason why it shouldn't work as well. I might try later.
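A minimal, untested sketch of what that readtable call might look like, assuming the same format specifier is passed through the 'Format' option. The temporary file and its two rows (copied from the samples in this thread) are only there to make the example self-contained; the names fnm and T are placeholders.

```matlab
% Write two sample rows to a temporary file so the example can run on its own.
fnm = [tempname,'.txt'];
[fid,msg] = fopen(fnm,'wt');
assert(fid>=3,msg)
fprintf(fid,'%s\n','689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62');
fprintf(fid,'%s\n','679d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,63');
fclose(fid);

% Same format specifier as used with textscan above.
fmt = '%s%s%q%q%q%q%f%f%f%f';
T = readtable(fnm, ...
    'Format',fmt, ...
    'Delimiter',',', ...
    'ReadVariableNames',false); % the file has no header line

delete(fnm) % remove the temporary file
```

With 'ReadVariableNames' set to false the columns get default names (Var1, Var2, ...), so the last numeric column would be accessed as T.Var10.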
  2 Comments
Nora Khaled on 2 Nov 2020
Thank you for your help!
This code works very well for my problem.
But I would like to ask, why check assert(fid>=3,msg)? I thought fid contains the data from the csv file
Stephen23 on 3 Nov 2020
Edited: Stephen23 on 3 Nov 2020
"But I would like to ask, why check assert(fid>=3,msg)?"
To print an informative error message if the file could not be opened.
"I thought fid contains the data from the csv file"
No, it does not.
The command fopen opens a file and returns a kind of handle to the open file. That handle is known as a "file identifier" (this is explained in the fopen documentation). Any functions and operators which need to operate on that file (e.g. reading data, writing data, moving the current position in the file, etc.) are then given that file ID so that they can perform their operations on the open file.
In this case textscan takes the file ID of an open file and imports the file data using the options that we defined.
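As a small illustration of the check (a sketch only: the file names and the variables fidBad, msgBad, fidGood are made up for this example), fopen does not throw an error when it fails. It returns -1 together with a message, while a successfully opened file gets an identifier of 3 or greater, because 0, 1 and 2 are reserved for standard input, output and error:

```matlab
% A file that does not exist: no error is thrown, fopen returns -1 and a message.
[fidBad,msgBad] = fopen('no_such_file_12345.txt','rt'); % hypothetical file name
disp(fidBad)
disp(msgBad)

% A file that can be opened: the identifier is an integer of 3 or greater.
fnm = [tempname,'.txt'];
[fidGood,msgGood] = fopen(fnm,'wt');
assert(fidGood>=3,msgGood) % passes, so execution continues
fprintf(fidGood,'hello\n'); % the identifier tells fprintf which file to write to
fclose(fidGood);
delete(fnm) % remove the temporary file
```

Without the assert, a failed fopen would only show up later as a less helpful error from textscan about an invalid file identifier.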


More Answers (1)

Mathieu NOE on 2 Nov 2020
Hello,
It seems MATLAB has an issue with the format of your data (especially with ).
I could not make it work with readtable, whatever options I tried.
I ended up writing a small workaround function using basic operations.
It seems to work, at least on my MATLAB installation.
Input data: 4 lines, slightly different from each other, saved as a CSV file:
689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62
679d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,63
669d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,64
659d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,65
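Function code as follows:
function output_matrix = retrieve_csv(Filename)
% Read a comma-separated file into a cell array of character vectors.
% Note: every comma is treated as a delimiter, including commas inside quotes.
[fid,msg] = fopen(Filename,'rt');
assert(fid>=3,msg) % stop with fopen's message if the file could not be opened
output_matrix = {};
tline = fgetl(fid);
k = 0;
while ischar(tline) % fgetl returns -1 (not char) at the end of the file
    k = k+1; % row (line) index
    sep = strfind(tline,','); % positions of the delimiters (strfind replaces the older findstr)
    ind = [0;sep(:);length(tline)+1];
    for ci = 1:length(ind)-1
        tline_extract = tline(ind(ci)+1:ind(ci+1)-1);
        % remove undesired characters (")
        ind_rem = strfind(tline_extract,'"');
        tline_extract(ind_rem) = '';
        output_matrix{k,ci} = tline_extract;
    end
    tline = fgetl(fid);
end
fclose(fid);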
Output:
output_matrix =
Columns 1 through 7
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
Columns 8 through 10
'17.536593101926' '39' '62'
'17.536593101926' '39' '63'
'17.536593101926' '39' '64'
'17.536593101926' '39' '65'
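Every cell in this output is text. If numeric values are needed, one possible follow-up step (a sketch, assuming columns 4 to 10 are meant to be numbers and column 2 is a true/false flag; the names num and flag are made up here) is to convert them with str2double and strcmp:

```matlab
% One row in the same layout as the output above (every cell is text).
output_matrix = {'689d-40bf-9c61-551c0c1a69bf','true','timestamp', ...
    ' 2',' 8',' 2','25.21','17.536593101926','39','62'};

% str2double accepts a cell array of character vectors and returns a
% double array of the same size; leading spaces such as ' 2' are ignored.
num = str2double(output_matrix(:,4:10));

% Convert the 'true'/'false' text in column 2 to a logical value.
flag = strcmp(output_matrix(:,2),'true');
```

Any cell that cannot be interpreted as a number would come back as NaN, which makes malformed rows easy to spot with isnan.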
  2 Comments
Nora Khaled on 2 Nov 2020
Thank you very much!
I was wondering how I was going to read the file if no reading function worked, so this is really helpful.
You can also see Stephen's answer: he used the function textscan() with a format specifier.
Mathieu NOE on 3 Nov 2020
Yes, I learned from Stephen's answer myself! Good for me too!
