From:  "G.A.M." <x0Zero@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: need help fixing embarrassingly bad File I/O code
Date: Sun, 16 Sep 2007 04:29:26 -0000
Message-ID: <1189916966.986612.323400@22g2000hsm.googlegroups.com>



I have worked on this for a while and it seems to be close to working
correctly. However, I only arrived at some of the transposes through
trial and error, and the code looks unmaintainable to me. I can't
believe there isn't a more elegant way to write this in MATLAB. I hope
someone will point me in the right direction for accomplishing my goal
with good code.

My goal is to read from an ASCII file, modify the records, remove
duplicate records, and write the resulting data to another ASCII file.

The code below should run in MATLAB R2007a. It is close to working
correctly (except that the output has extra line breaks). However, I
can't live with such poor quality code, which is why I'm asking for
advice. Thanks.

function fileStuff()
%example

	%4 lines of sample data (record 4 duplicates record 1); each string is
	%split with ... so that news-reader line wrapping cannot break it
	file(1) = {['lastname1,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
		'26,22,ox(1),03-Aug-2007 10:36:48,13.15,5.85,189.058,18.9,5.8']};
	file(2) = {['lastname2,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
		'33,22,ox(2),03-Aug-2007 10:37:20,16.54,6.35,213.073,21.3,7.3']};
	file(3) = {['lastname3,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
		'27,22,ox(3),03-Aug-2007 10:53:16,15.86,7.68,192.082,19.2,8.2']};
	file(4) = {['lastname1,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
		'26,22,ox(1),03-Aug-2007 10:36:48,13.15,5.85,189.058,18.9,5.8']};
	file = file.';%transpose so format is the same as if read from file.
	myFileData = file;

	actualFileName = 'sample'; %hypothetical placeholder base name; substitute your own
	myFileName = strcat(actualFileName, '.test.txt');
%using above to simulate reading from a file, so these lines are
%commented out:
% 	fid = fopen(myFileName, 'r');
% 	file = textscan(fid, '%s','delimiter','\n');
%	myFileData = file{1}; %unbundle first level of cell
	myRecordCount = size(myFileData, 1); %number of records
	%I'd like to preallocate uniqueRecords, but it breaks the code below.
	uniqueRecords = {};
	outputRecords = {}; %I'd like to preallocate this too

	n = 1;
	%work from last record forward
	for r = myRecordCount: -1: 1
		cellData = textscan(myFileData{r},'%s','delimiter',',');
		currentRecord = cellData{1};%textscan requires unbundling cells
		isUnique = true; %logical flag; also avoids shadowing built-in unique()
		for k = 1 : size(uniqueRecords, 2)
			if (strcmpi(uniqueRecords{k}(10), currentRecord{10}))
				%non-unique
				if (~strcmpi(uniqueRecords{k}(11), currentRecord{11}))
					error('this shouldn''t happen');
				end
				isUnique = false;
				break;
			end
		end%for
		if (isUnique)
			uniqueRecords{n} = currentRecord;
			currentRecord=currentRecord.';
			fmt = repmat('%s,',1,size(currentRecord,2)); %don't reuse loop variable r
			s = [fmt,'\n'];
			txt=sprintf(s, currentRecord{:});
			outputRecords{n} = txt;
			n = n + 1;
		end
	end

	outputRecords = outputRecords.';
	outputFile = fopen (myFileName,'wt');
	if outputFile ~= -1
		for k = 1 : size(outputRecords, 1)
			fprintf(outputFile,'%s', char(outputRecords{k})');
		end
		fclose(outputFile);
	end

	disp (char(outputRecords));
	type (myFileName);
end