From: Doug Schwarz <see@sig.for.address.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: reading an annoying ascii text file
References: <hc2gsv$375$1@fred.mathworks.com>
Message-ID: <see-839DE8.23435925102009@news.frontiernet.net>
Date: Sun, 25 Oct 2009 23:43:59 -0400


In article <hc2gsv$375$1@fred.mathworks.com>,
 "Derik " <d.nospam.schupbach@lombardodier.please.com> wrote:

> Dear Sunday readers,
> I am trying to read the file format below. I tried textscan but I must be 
> missing something... I get either errors or an empty cell (I run version 7.5.0, R2007b).
> I have several difficulties as a beginner:
> * the double quotes do not seem to be handled correctly 

Use the %q format with textscan.
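
As a small illustration of what %q does, here is a sketch that scans a 
made-up line (the string "sample" is my own example, not your file). The 
quotes are dropped, and a comma inside a quoted field should not be treated 
as a delimiter:

```matlab
% Made-up sample line; the field values are taken from the extract.
sample = '"10003","10,000","Monthly"';

% %q reads a double-quoted field and discards the surrounding quotes.
c = textscan(sample, '%q%q%q', 'Delimiter', ',');

fund_id = c{1}{1};   % expected '10003'
amount  = c{2}{1};   % expected '10,000' (still a string at this point)
freq    = c{3}{1};   % expected 'Monthly'
```

Each element of c is a cell array of strings, one entry per line scanned.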


> * Unfortunately the comma delimiter is also the thousand delimiter
> * I would like to have the first line transformed as the variable names of 
> the columns

Don't create a separate variable for each column; it's more trouble than 
it's worth.  Instead, use the column headers as field names of a 
structure array.
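
To show what I mean by a structure array with header-derived field names, 
here is a minimal sketch using cell2struct. The second row ('10004', 
'Trust') is invented purely for illustration:

```matlab
% Header names become field names; each row becomes one struct element.
names = {'Fund_ID', 'Structure'};
rows  = {'10003', 'Corporation'; ...
         '10004', 'Trust'};          % hypothetical second record

% The 2 says that the fields run along dimension 2 (the columns) of rows.
s = cell2struct(rows, names, 2);     % 2x1 struct array

first_id       = s(1).Fund_ID;       % access one value by name
all_structures = {s.Structure};      % gather one column into a cell array
```

This only works if every header is a valid MATLAB identifier, which is the 
case for the names in your extract.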


> * I would like to change the date string "MM/DD/YYYY" to matlab dates
> * the file is around 7000 lines and 70 variables
> 
> extract of the file:
> "Fund_ID","Fund","Firm","Structure","Minimum_Investment","Additional_Investment","Inception","Reporting"
> "10003","Enterprise Fund Ltd. (Class E) - Emerging Markets","Advantage Management Limited","Corporation","10,000","","06/01/2003","Monthly"
> 
> Thank you very much in advance
> derik

Here's what I would do (assume your data is in a file called derik.dat):

% Read the header line first, then the rest of the file.
fid = fopen('derik.dat');
header = textscan(fid,'%q%q%q%q%q%q%q%q',1,'Delimiter',',');
raw = textscan(fid,'%q%q%q%q%q%q%q%q','Delimiter',',');
fclose(fid);
 
% Store data in a structure array, data.
fields = [header{:}];
raw_array = [raw{:}];
data = cell2struct(raw_array,fields,2);
 
% Convert column 5 (Minimum_Investment) from string to numeric.
% str2double accepts the comma as a thousands separator; empty strings become NaN.
min_invest_str = {data.(fields{5})};
min_invest = str2double(min_invest_str);
min_invest_cell = num2cell(min_invest);
[data.(fields{5})] = min_invest_cell{:};
 
% Convert column 7 (Inception) into date numbers.
date_str = {data.(fields{7})};
date_num = datenum(date_str,'mm/dd/yyyy');
date_num_cell = num2cell(date_num);
[data.(fields{7})] = date_num_cell{:};
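
Once the data is in this form, using it is straightforward. The sketch 
below builds a tiny stand-in for data by hand (the fund names and values 
are made up) so that it runs on its own:

```matlab
% Hypothetical stand-in for the structure array produced above.
data = struct('Fund', {'Fund A'; 'Fund B'}, ...
              'Minimum_Investment', {10000; 250000}, ...
              'Inception', {datenum(2003,6,1); datenum(2005,1,15)});

% Gather a numeric field into an ordinary vector.
min_invest = [data.Minimum_Investment];

% Select the funds with a minimum investment of at most 50,000.
affordable = data(min_invest <= 50000);

% Convert a date number back to a string for display.
first_date = datestr(data(1).Inception, 'mm/dd/yyyy');
```

The same pattern ([data.FieldName] for numbers, {data.FieldName} for 
strings) applies to any of the other columns.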

-- 
Doug Schwarz
dmschwarz&ieee,org
Make obvious changes to get real email address.