I have worked on this for a while and it seems to be close to working
correctly. However, I only arrived at some of the transposes after a
bunch of trial and error. This code looks unmaintainable to me. I
can't believe there isn't a more elegant way to write this in MATLAB.
I hope someone will point me in the right direction for accomplishing
my goal with good code.
My goal is to read from an ASCII file, modify the records, remove
duplicate records, and write the resulting data to another ASCII file.
The code below should run in MATLAB R2007a. It is close to working
correctly (except that the output has extra line breaks). However, I
can't live with such poor-quality code, which is why I'm asking for
advice. Thanks.
function fileStuff()
%example
%4 lines of sample data (long strings are split with ... continuation)
file(1) = {['lastname1,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
    '26,22,ox(1),03-Aug-2007 10:36:48,13.15,5.85,189.058,18.9,5.8']};
file(2) = {['lastname2,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
    '33,22,ox(2),03-Aug-2007 10:37:20,16.54,6.35,213.073,21.3,7.3']};
file(3) = {['lastname3,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
    '27,22,ox(3),03-Aug-2007 10:53:16,15.86,7.68,192.082,19.2,8.2']};
file(4) = {['lastname1,firstname,DOB,Male,Caucasian,,,sp = 0 ti = ', ...
    '26,22,ox(1),03-Aug-2007 10:36:48,13.15,5.85,189.058,18.9,5.8']};
file = file.';%transpose so format is the same as if read from file.
myFileData = file;
actualFileName = 'sample'; %placeholder base name; substitute your own
myFileName = strcat(actualFileName, '.test.txt');
%using above to simulate reading from a file, so these lines are
%commented out:
% fid = fopen(myFileName, 'r');
% file = textscan(fid, '%s','delimiter','\n');
% myFileData = file{1}; %unbundle first level of cell
myRecordCount = size(myFileData, 1); %number of records
uniqueRecords = {}; %I'd like to preallocate, but it breaks the code below.
outputRecords = {}; %I'd like to preallocate this too
n = 1;
%work from last record forward
for r = myRecordCount : -1 : 1
    cellData = textscan(myFileData{r}, '%s', 'delimiter', ',');
    currentRecord = cellData{1}; %textscan requires unbundling cells
    isUnique = true; %logical flag; avoids shadowing the built-in unique()
    for k = 1 : size(uniqueRecords, 2)
        %records count as duplicates when field 10 matches
        if strcmpi(uniqueRecords{k}{10}, currentRecord{10})
            %non-unique: field 11 is expected to match as well
            if ~strcmpi(uniqueRecords{k}{11}, currentRecord{11})
                error('this shouldn''t happen');
            end
            isUnique = false;
            break;
        end
    end %for k
    if isUnique
        uniqueRecords{n} = currentRecord;
        currentRecord = currentRecord.'; %row cell, so size(...,2) counts fields
        fmt = [repmat('%s,', 1, size(currentRecord, 2)), '\n']; %one '%s,' per field
        txt = sprintf(fmt, currentRecord{:});
        outputRecords{n} = txt;
        n = n + 1;
    end
end
outputRecords = outputRecords.';
outputFile = fopen (myFileName,'wt');
if outputFile ~= -1
for k = 1 : size(outputRecords, 1)
fprintf(outputFile,'%s', char(outputRecords{k})');
end
fclose(outputFile);
end
disp (char(outputRecords));
type (myFileName);
end
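For comparison, here is roughly the shape I was hoping the
de-duplication step could take. It is only a minimal sketch under
some assumptions: the sample records are made up and shortened,
field 10 is the key (as in the loop above), duplicate lines are
assumed to be identical so it does not matter which copy is kept,
and the names keys, keepIdx, kept and outputText are just
placeholders. It leans on the built-in unique() instead of the
nested loop, and leaves the original lines untouched instead of
rebuilding them with sprintf.

```matlab
% Made-up sample records; the fourth repeats the key of the first.
records = { ...
    'lastname1,firstname,DOB,Male,Caucasian,x,y,z,22,ox(1),03-Aug-2007 10:36:48'; ...
    'lastname2,firstname,DOB,Male,Caucasian,x,y,z,22,ox(2),03-Aug-2007 10:37:20'; ...
    'lastname3,firstname,DOB,Male,Caucasian,x,y,z,22,ox(3),03-Aug-2007 10:53:16'; ...
    'lastname1,firstname,DOB,Male,Caucasian,x,y,z,22,ox(1),03-Aug-2007 10:36:48'};

% Pull field 10 out of every record to use as the comparison key.
keys = cell(size(records));
for r = 1:numel(records)
    fields = textscan(records{r}, '%s', 'delimiter', ',');
    keys{r} = lower(fields{1}{10}); % lower() gives a case-insensitive match
end

% unique() returns one index per distinct key; sorting the indices
% keeps the surviving records in their original file order.
[uniqueKeys, keepIdx] = unique(keys);
keepIdx = sort(keepIdx);
kept = records(keepIdx);

% One record per line, with a single newline after each.
outputText = sprintf('%s\n', kept{:});
disp(outputText);
```

Writing to the file would then be a single fprintf(outputFile, '%s',
outputText) call instead of a loop over records.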