Read data from a complex .csv file

Hello everyone! I've been trying to read data from a .csv file that looks like:
[[0.714792626301598, -0.697224229221414, 0.05431275698645074], 0, 0, 'pos_0'],[[-0.5884201907614969, 0.7739447885416152, -0.23403235544146034], 1, 0, 'pos_1'],[[0.1944985746440766, -0.9746176263199874, 0.11086382154614778], 50, (0, 2), 'pos_4']....
The idea is to get separately for each column:
[0.714792626301598, -0.697224229221414, 0.05431275698645074]
0
0 %this could be in a (0, 2) format as displayed in the third column
'pos_0'
The file has around 200 columns. I've tried the following code:
S = textscan(fid,'%*c %[^] %*c %f %*c %f %s]','Delimiter',',');
resulting in a badly ordered cell array. Any help will be much appreciated!!
Thanks!

3 Comments

Stephen23 on 2 Nov 2018
Edited: Stephen23 on 2 Nov 2018
@Pablo: do not give us screenshots of data: we cannot import screenshots of data, we cannot test code on screenshots of data, we cannot search screenshots of data, we cannot edit screenshots of data, we cannot use screenshots of data in any meaningful way.
Please upload the actual text data file (or a file that represents all of its salient features).
Here is the data file. Thanks for your comment.
Stephen23 on 2 Nov 2018
Edited: Stephen23 on 2 Nov 2018
Ugh, what a badly formatted "CSV" file: it is a liberal mess of double quotes, square brackets, and superfluous single quotes... there are random parentheses around two fields, double quotes surround the entirety of each "line", and there is only one newline at the very end of the file! It should be named "Incoherence_disorder.txt".
The best solution to your problem is to fix the thing that created that awful file.

 Accepted Answer

Stephen23 on 2 Nov 2018
Edited: Stephen23 on 2 Nov 2018
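This code imports that very messed up "CSV" file (attached):
opt = {'Delimiter',',', 'CollectOutput',true};
S = fileread('Coherence_order.csv');
% Strip the leading "[[ and the trailing ]" (plus any trailing whitespace):
S = regexprep(S,{'^"\[\[','\]"\s*$'},'');
% Replace each record separator ]","[[ with a carriage return (an end-of-line for textscan):
S = strrep(S,']","[[',char(13));
% Remove the bracket that closes each three-element vector:
S = strrep(S,'],',',');
% Turn (0, 2) into "0, 2" so that %q reads it as a single quoted field:
S = strrep(S,'(','"');
S = strrep(S,')','"');
C = textscan(S,'%f%f%f%f%q%s', opt{:});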
A summary of the data:
>> size(C{1})
ans =
165 4
>> size(C{2})
ans =
165 2
>> C{1}(1:8,:) % first eight "lines" of numeric data:
ans =
0.71479 -0.69722 0.05431 0.00000
-0.58842 0.77394 -0.23403 1.00000
-0.39404 -0.91007 0.12852 2.00000
0.73498 0.67216 -0.08951 3.00000
0.19450 -0.97462 0.11086 50.00000
0.47600 -0.87518 0.08647 25.00000
-0.10445 -0.98660 0.12531 75.00000
0.09842 0.97114 -0.21727 50.00000
>> C{2}(1:8,:) % first eight "lines" of char data:
ans =
'0' ''pos_0''
'0' ''pos_1''
'0' ''pos_2''
'0' ''pos_3''
'0, 2' ''pos_4''
'0, 2' ''pos_5''
'0, 2' ''pos_6''
'1, 3' ''pos_7''
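To see what each clean-up step does without the attached file, the same sequence can be run on a short in-memory sample. This is only a sketch: it assumes the layout described in the comments (every record wrapped in double quotes, one newline at the very end), and the sample string and the names xyz, num, grp and lbl are illustrative, not part of the original answer. The last few lines show one possible way to turn the two char columns into numbers and bare labels.

```matlab
% Sample text in the assumed file layout: three records, each wrapped in
% double quotes, followed by a single trailing newline.
S = ['"[[0.714792626301598, -0.697224229221414, 0.05431275698645074], 0, 0, ''pos_0'']",', ...
     '"[[-0.5884201907614969, 0.7739447885416152, -0.23403235544146034], 1, 0, ''pos_1'']",', ...
     '"[[0.1944985746440766, -0.9746176263199874, 0.11086382154614778], 50, (0, 2), ''pos_4'']"', ...
     char(10)];

% Same clean-up sequence as in the answer above.
S = regexprep(S,{'^"\[\[','\]"\s*$'},'');   % drop leading "[[ and trailing ]"
S = strrep(S,']","[[',char(13));            % one record per "line"
S = strrep(S,'],',',');                     % drop the bracket closing each vector
S = strrep(S,'(','"');                      % (0, 2) becomes "0, 2" for %q
S = strrep(S,')','"');
C = textscan(S,'%f%f%f%f%q%s','Delimiter',',','CollectOutput',true);

% Optional post-processing (illustrative names):
xyz = C{1}(:,1:3);                          % one three-element vector per row
num = C{1}(:,4);                            % the scalar that follows the vector
% Parse '0' or '0, 2' into a numeric row vector; sscanf repeats the format
% and stops when the text runs out.
grp = cellfun(@(t) sscanf(t,'%f,').', C{2}(:,1), 'UniformOutput',false);
lbl = strrep(C{2}(:,2), '''', '');          % strip the single quotes from the labels
```

Keeping grp as a cell array is a deliberate choice here, since some records hold one value and others hold a pair.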

1 Comment

Thank you very much!!! this will do it perfectly. :)

More Answers (1)

madhan ravi on 2 Nov 2018
Edited: madhan ravi on 2 Nov 2018
fid=fopen('coherence_order.csv','r')
f = textscan(fid,'%s','delimiter',{']'})
fclose(fid)

7 Comments

f = textscan(fid,'%s','delimiter','\n')
This takes data as it is in the file. But the user wants all three columns separated.
madhan ravi on 2 Nov 2018
Edited: madhan ravi on 2 Nov 2018
"This takes data as it is in the file"
No, it doesn't; instead it reads all of them as separate cells.
"But the user wants all three columns separated."
True.
Many thanks for your time. It gave me an empty cell as a result. Here is the data file if you want to try it.
{'[[0.714792626301598, -0.697224229221414, 0.05431275698645074], 0, 0, 'pos_0'],' }
{'[[-0.5884201907614969, 0.7739447885416152, -0.23403235544146034], 1, 0, 'pos_1'],' }
{'[[0.1944985746440766, -0.9746176263199874, 0.11086382154614778], 50, (0, 2), 'pos_4'],'}
The above is the result of:
f = textscan(fid,'%s','delimiter','\n')
It scans the text exactly as it appears in the file.
Did you try this?
f = textscan(fid,'%s','delimiter',{']'})
{'[[0.714792626301598, -0.697224229221414, 0.05431275698645074' }
{', 0, 0, 'pos_0'' }
{',' }
{'[[-0.5884201907614969, 0.7739447885416152, -0.23403235544146034'}
{', 1, 0, 'pos_1'' }
{',' }
{'[[0.1944985746440766, -0.9746176263199874, 0.11086382154614778' }
{', 50, (0, 2), 'pos_4'' }
{',' }
madhan ravi on 2 Nov 2018
Edited: madhan ravi on 2 Nov 2018
However, it is not the expected solution, because of the way the file is arranged.
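If whole records are wanted rather than fragments split at each ']', one alternative is to let regexp extract the four fields of every bracketed record as tokens. This is a different technique from the textscan calls in this thread, sketched under assumptions: the pattern pat and the names vec, num, grp and lbl are made up for illustration, the pattern is based only on the sample records shown in the question, and it has not been checked against the attached file.

```matlab
% Two sample records copied from the question (assumed representative).
txt = ['[[0.714792626301598, -0.697224229221414, 0.05431275698645074], 0, 0, ''pos_0''],', ...
       '[[0.1944985746440766, -0.9746176263199874, 0.11086382154614778], 50, (0, 2), ''pos_4'']'];

% Tokens: 1) vector contents, 2) scalar, 3) scalar or parenthesised pair,
% 4) label without its single quotes.
pat = '\[\[([^\]]*)\],\s*([^,]+),\s*(\([^)]*\)|[^,]+),\s*''([^'']*)''\]';
tok = regexp(txt, pat, 'tokens');           % one 1-by-4 cell per record

n   = numel(tok);
vec = cell(n,1);
num = zeros(n,1);
grp = cell(n,1);
lbl = cell(n,1);
for k = 1:n
    vec{k} = sscanf(tok{k}{1}, '%f,').';                         % three-element row vector
    num(k) = str2double(tok{k}{2});                              % scalar field
    grp{k} = sscanf(regexprep(tok{k}{3},'[()]',''), '%f,').';    % one or two values
    lbl{k} = tok{k}{4};                                          % label text
end
```

Because regexp simply collects every match, any characters sitting between records are skipped rather than parsed, which is why the pattern is anchored on the double opening bracket and the closing bracket after the label.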

Release: R2017b
Asked: 2 Nov 2018
Commented: 5 Nov 2018
