How do I extract sections of data from a csv file?

I am having difficulty extracting the data I require from a csv file. I have been provided with a csv file which has the outputs of a number of simulations. However, I have hit a dead end on how to extract the data without doing it individually, i.e. opening the file in the import window, selecting the range of cells I wish to import, and repeating this with a different range for each simulation. The format of the csv is attached: the output comes in the form of a table where one simulation is produced with the given headings along the top and the simulated results in the columns below. The next simulation is then produced below in the same form. Ideally I would want to read all the data directly into arrays, e.g. the year data into an 11x10 matrix, with each column holding the relevant data for one simulation, i.e. the first column of the matrix holding A3:A13, the second column holding A17:A27, and so on. Any advice on how to extract the data would be greatly appreciated. Thanks in advance.
  3 Comments
Andrew Hair on 18 Jun 2015
Sorry, I've tried to update it to be clearer. I just attached one of the files I had and didn't realise that the format you viewed it in would be different from the Excel format I was viewing it in. The 10x10 was a mistake. I hope this helps and thank you for taking the time to look at it.
dpb on 18 Jun 2015
See Answer below--there's really nothing to worry about regarding the format/interpretation as far as I can see--looks like a machine-generated csv file with a blank line after the last set of values for each year and the explicit commas for each field irrespective of data in the field or not for the title line.
There is something a little funky, as is often the case in my experience using textscan--I had to insert an extra fgetl to get the file pointer to the next line after the initial section read; trying the loop w/o it got off somehow...


Accepted Answer

dpb on 18 Jun 2015
Edited: dpb on 18 Jun 2015
d=[]; % initialize an array for the data
fmt=repmat('%f',1,14); % format string to match file
fid=fopen('filename');
for i=1:10 % repeat for all sections in file
d=[d;cell2mat(textscan(fid,fmt,12,'headerlines',2, ...
'collectoutput',1, ...
'delimiter',','))]; % read each section;concatenate
fgetl(fid); % had to do this to get synchronized again...find it often
end
fid=fclose(fid);
The above returns the data in one array; if you instead want each simulation separately, instead of the concatenation above use a cell array to store each read section--
ERRATUM
fmt=[repmat('%f',1,14) '\n']; % format string to match line of file data
d=cell(10,1); % one cell per section
fid=fopen('filename');
for i=1:10 % repeat for all sections in file
d(i)=textscan(fid,fmt,12,'headerlines',2, ...
'collectoutput',1, ...
'delimiter',','); % read each section into cell array
end
fid=fclose(fid);
NB: The \n (newline) in the format string seems to fix the file-position problem. I guess it's one of those cases where the format is probably technically wrong without it: the scanning routines skip over the newline transparently while the format is repeated within a single call, but not when the scan is started over. Again, "why" is indecipherable as far as I can tell; it "just is".
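If the file-pointer bookkeeping with textscan proves fragile, one alternative (not the approach used in the answer above, just a sketch) is to read the whole file as text and keep only the lines whose first field parses as a number. The demo file below is a hypothetical stand-in: 2 sections, each with 2 header lines, 3 data rows and a trailing blank line. The names fname, nRows, txtLines, rows and yr are made up for illustration; adjust nRows to the real number of data rows per section.

```matlab
% Build a small stand-in file (hypothetical layout, not the real data).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
for s = 1:2
    fprintf(fid, 'Simulation %d,,\n', s);   % title line with explicit commas
    fprintf(fid, 'Year,A,B\n');             % column headings
    for r = 1:3
        fprintf(fid, '%d,%g,%g\n', 2000 + r, s + r/10, s*r);
    end
    fprintf(fid, '\n');                     % blank line ends the section
end
fclose(fid);

nRows = 3;                                  % data rows per section (assumed known)
txtLines = regexp(fileread(fname), '\r?\n', 'split');
rows = [];
for k = 1:numel(txtLines)
    if isempty(strtrim(txtLines{k})), continue; end   % skip blank lines
    vals = str2double(regexp(txtLines{k}, ',', 'split'));
    if ~isnan(vals(1))                      % header lines give NaN in field 1
        rows = [rows; vals];                %#ok<AGROW> fine for small files
    end
end

% One column per simulation, as asked for in the question.
yr = reshape(rows(:,1), nRows, []);
delete(fname);
```

This relies on every data row having the same number of fields and on no header line starting with a number; a data row with an empty first field would be dropped as well.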
  8 Comments
Andrew Hair on 19 Jun 2015
Thank you very much for the help guys, I got it working this morning. It has been a great help. Definitely learnt some new commands for MATLAB.
dpb on 19 Jun 2015
Edited: dpb on 19 Jun 2015
No problem....if I had to guess formatted input for particular file structure is quite possibly the number one question...there are so many apparent possibilities and such a number of nuances to almost every one of them as to make it almost overwhelming to the initiate...
