Using regexp to create dataset
I have imported a large database using textscan(). Now I have data with 12 variables. Each observation looks like this:
5,573346285,746540138,NA,1341119065,NA,7,0,2,1341111281,"-1,-1,-1,0,-1",-0.8
The data are stored in a cell array, and I would like to convert them to a dataset. My problem is that the 11th variable is a quoted string that may contain several numbers separated by commas. I cannot use something like regexp(datacell{1,1}{6,1}, ',\s*', 'split'), because it would split the 11th variable into several parts. Can you please suggest code that handles this? Thank you.
Accepted Answer
More Answers (2)
Walter Roberson
on 21 May 2016
If you are using one of the more recent versions of textscan then you can use the %q format to read the double-quoted string as a single item.
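As a rough sketch of the %q suggestion, the sample line from the question could be read as below. The variable names txt, fmt and C are made up for illustration, and the choice of %f with 'TreatAsEmpty' for the NA entries is an assumption about how the numeric columns should be handled, not something stated in the thread.

```matlab
% Sample observation from the question (12 comma-separated fields)
txt = '5,573346285,746540138,NA,1341119065,NA,7,0,2,1341111281,"-1,-1,-1,0,-1",-0.8';

% Ten numeric fields, one double-quoted field read with %q, one numeric field
fmt = [repmat('%f', 1, 10) '%q%f'];

% 'TreatAsEmpty' turns the NA placeholders into NaN in the numeric columns
C = textscan(txt, fmt, 'Delimiter', ',', 'TreatAsEmpty', 'NA');

quoted = C{11}{1};   % the 11th variable, kept as a single piece of text
lastValue = C{12};   % the final numeric field
```

For a file, the same format string would be passed with a file identifier from fopen in place of txt.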
Azzi Abdelmalek
on 21 May 2016
Edited: Azzi Abdelmalek
on 21 May 2016
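a = '5,573346285,746540138, NA ,1341119065,NA,7,0,2,1341111281,"-1,-1,-1,0,-1",-0.8';
% Pull out the double-quoted field together with its trailing comma
b = regexp(a, '"[^"]*",', 'match');
% Remove it from the line so the remaining commas are true delimiters
c = strrep(a, b{1}, '');
% Split the remaining fields on commas and any surrounding whitespace
data1 = regexp(c, '[\s,]+', 'split');
% Split the contents of the quoted field (quotes and trailing comma stripped)
data2 = regexp(b{1}(2:end-2), '[\s,]+', 'split');
% The values from the quoted field are appended after the other 11 fields
data = [data1 data2]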
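If the quoted field should stay in its original position as a single item, which is what the question asks for, one possible regexp-only variant is sketched here. It assumes there are no empty fields (two consecutive commas would be skipped) and no escaped quotes inside the quoted field. The names fields and values are illustrative.

```matlab
a = '5,573346285,746540138, NA ,1341119065,NA,7,0,2,1341111281,"-1,-1,-1,0,-1",-0.8';

% Match either a complete double-quoted field or a run of non-comma characters
fields = regexp(a, '"[^"]*"|[^,]+', 'match');

% Trim stray whitespace, then drop the surrounding double quotes
fields = strtrim(fields);
fields = regexprep(fields, '^"|"$', '');

% The 11th field is still one string; split it separately when needed
values = str2double(regexp(fields{11}, ',', 'split'));
```

Here fields has 12 entries, matching the 12 variables, and values holds the numbers from the 11th variable as a numeric row vector.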