"Derik " <d.nospam.schupbach@lombardodier.please.com> wrote in message <hc2gsv$375$1@fred.mathworks.com>...
> Dear Sunday readers,
> I am trying to read the file format below. I tried textscan, but I must be missing something: I get either errors or an empty cell (I run version 7.5.0, R2007b).
> I have several difficulties as a beginner:
> * the double quotes do not seem to be handled correctly
> * unfortunately, the comma delimiter is also the thousands separator
> * I would like the first line to become the variable names of the columns
> * I would like to convert the date strings "MM/DD/YYYY" to MATLAB dates
> * the file is around 7000 lines and 70 variables
>
> extract of the file:
> "Fund_ID","Fund","Firm","Structure","Minimum_Investment","Additional_Investment","Inception","Reporting"
> "10003","Enterprise Fund Ltd. (Class E) - Emerging Markets","Advantage Management Limited","Corporation","10,000","","06/01/2003","Monthly"
>
> Thank you very much in advance
> derik
Another approach using regexp:
fid = fopen(filename,'rt');
val = textscan(fid,'%s','Delimiter','\n'); % Read each line of the file as one string
fclose(fid);
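Header = regexp(val{1}{1},'\w+','match'); % Column names from the first line
% The steps below handle the first data line only, val{1}{2}
as = regexprep(val{1}{2},'\d*,\d{3}','${strrep($&,'','','''')}'); % Remove thousands separators: 10,000 -> 10000
as = regexprep(as,'\d{2}/\d{2}/\d{4}','${num2str(datenum($&,''mm/dd/yyyy''))}'); % Convert MM/DD/YYYY to a MATLAB serial date number
as = regexprep(as,'"',''); % Remove double quotes
Data = regexp(as,',','split'); % Split into fields (assumes no commas are left inside text fields)
Data{5} = str2double(Data{5}); % Minimum_Investment: string to numeric
Data{7} = str2double(Data{7}); % Inception: date number string to numeric
DATA = cell2struct(Data,Header,2); % Struct with one field per column name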
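To cover all rows rather than just the first data line, something along these lines might work. It is only a sketch under a few assumptions: every field on every line is wrapped in double quotes (as in the extract), column 5 is numeric and column 7 is a date (taken from the extract, so the indices will differ for the full 70 columns), and `lines` stands in for `val{1}` from the textscan call. Splitting on the `","` between fields, instead of on a bare comma, keeps commas inside a quoted field intact.

```matlab
% Sample lines copied from the extract; in practice use lines = val{1};
lines = { ...
    '"Fund_ID","Fund","Firm","Structure","Minimum_Investment","Additional_Investment","Inception","Reporting"'; ...
    '"10003","Enterprise Fund Ltd. (Class E) - Emerging Markets","Advantage Management Limited","Corporation","10,000","","06/01/2003","Monthly"'};

Header = regexp(lines{1}, '\w+', 'match');   % column names from the first line
rows   = lines(2:end);                       % remaining lines are data
Data   = cell(numel(rows), numel(Header));

for k = 1:numel(rows)
    % Drop the outer quotes, then split on the "," that separates fields
    fields = regexp(rows{k}(2:end-1), '","', 'split');
    if numel(fields) ~= numel(Header)
        error('Line %d has %d fields, expected %d.', k+1, numel(fields), numel(Header));
    end
    Data(k,:) = fields;
end

% Column 5 (assumed numeric): strip thousands separators, then convert
Data(:,5) = num2cell(str2double(strrep(Data(:,5), ',', '')));

% Column 7 (assumed MM/DD/YYYY): convert non-empty entries to serial date numbers
hasDate = ~cellfun('isempty', Data(:,7));
dn = nan(size(Data,1), 1);
if any(hasDate)
    dn(hasDate) = datenum(Data(hasDate,7), 'mm/dd/yyyy');
end
Data(:,7) = num2cell(dn);

DATA = cell2struct(Data, Header, 2);         % N-by-1 struct array, one element per row
```

After this, `DATA(1).Minimum_Investment` holds a number and `DATA(1).Inception` a serial date number. Empty numeric or date fields end up as NaN.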
Branko