From:  "G.A.M." <x0Zero@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Need File I/O help for reading and rewriting string data
Date: Sun, 16 Sep 2007 02:42:18 -0000
Message-ID: <1189910538.669127.261880@w3g2000hsg.googlegroups.com>



I need to read from a CSV file of mixed data types (mostly strings). I
need to process each record (such as adding a few more values to it).
Then I need to write (or append) the data to another CSV file.

I can't seem to find a decent way to do this. The code below is super
slow on large files (and this code is a drastic simplification of what
I really need to do). I actually have to look for and remove duplicate
records before writing out the modified data.

I would appreciate any suggestions for improving the code. (For now
I'm stuck using ASCII files, so I can't go to a database or binary
files.)

% Read the whole file, one line (record) per cell.
filedata = textread(myFilename, '%s', 'delimiter', '\n');
outputData = cell(length(filedata), 1); % preallocate
for k = 1 : length(filedata)
    % Split the record into its comma-separated fields.
    record = strread(filedata{k}, '%s', 'delimiter', ',');
    % Rebuild the record field by field.
    charRec = '';
    for m = 1 : length(record)
        temp = strcat(record{m}, ',');
        charRec = strcat(charRec, temp);
    end
    % Append the new value and store the finished record.
    charRec = strcat(charRec, 'newData');
    outputData{k} = charRec;
end

% Write the modified records, one per line.
testFile = fopen(testFilename, 'wt');
if testFile ~= -1
    for k = 1 : length(outputData)
        fprintf(testFile, '%s\n', outputData{k});
    end
    fclose(testFile);
end
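
One possible direction, offered only as a rough sketch: when the new values can be appended to a record without first splitting it into fields, the inner loop can be dropped and the whole file handled as a cell array of lines. The function name appendFieldDedup and its arguments are hypothetical, and "duplicate" is assumed here to mean an exact, character-for-character repeat of a line, which may not match the real definition of a duplicate record.

```matlab
function outRecs = appendFieldDedup(inFile, outFile, newField)
% APPENDFIELDDEDUP  Hypothetical helper (not from the original post).
% Copies a CSV file line by line, drops exact duplicate lines, and
% appends one extra comma-separated field to each remaining record.

% Read every line of the input file into a cell array of strings.
fid = fopen(inFile, 'rt');
if fid == -1
    error('Could not open %s for reading.', inFile);
end
c = textscan(fid, '%s', 'Delimiter', '\n', 'Whitespace', '');
fclose(fid);
recs = c{1};

% unique() returns sorted values; keep the index of the first occurrence
% of each line and sort those indices to restore the original file order.
[unused, firstIdx] = unique(recs, 'first');
recs = recs(sort(firstIdx));

% Append the new field to every record in one call (no per-field loop).
outRecs = strcat(recs, ',', newField);

% Write all records with a single fprintf; the format is reused per line.
fid = fopen(outFile, 'wt');
if fid == -1
    error('Could not open %s for writing.', outFile);
end
if ~isempty(outRecs)
    fprintf(fid, '%s\n', outRecs{:});
end
fclose(fid);
```

A call might look like appendFieldDedup(myFilename, testFilename, 'newData'). If duplicates have to be judged on only some fields, or the new values depend on the contents of each record, the records would still need to be split, so this covers only the simplified case shown above.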


3 lines of sample data:
lastname1,firstname,DOB,Male,Caucasian,,,sp = 0 ti = 26,22,ox(1),03-Aug-2007 10:36:48,13.15,5.85,189.058,18.9,5.8
lastname2,firstname,DOB,Male,Caucasian,,,sp = 0 ti = 33,22,ox(2),03-Aug-2007 10:37:20,16.54,6.35,213.073,21.3,7.3
lastname3,firstname,DOB,Male,Caucasian,,,sp = 0 ti = 27,22,ox(3),03-Aug-2007 10:53:16,15.86,7.68,192.082,19.2,8.2

I hope all this code and data come out readable.