Modify then write data in the given format

Hi, I will start with a brief overview of what I am trying to achieve in this code.
Below is a sample of the data I want to manipulate. It stores the xyz coordinates and velocities of atoms in solution.
    1SOL     OW    1   4.309   5.254   4.135 -0.2790  0.3440  0.2064
    1SOL    HW1    2   4.314   5.169   4.082 -1.5406  0.3918 -0.0293
    1SOL    HW2    3   4.388   5.312   4.114 -1.3375  0.9272 -2.6151
    2SOL     OW    4   1.743   1.687   2.366  0.2136  0.2777  0.3181
    2SOL    HW1    5   1.818   1.750   2.387  0.3115  0.1542  0.3431
 4502OCTA     H13545   2.108   5.326   1.045 -1.2169  0.4890 -2.6144
 4502OCTA     H13546   2.068   5.492   1.036  0.7609  0.6650  0.8612
 4502OCTA     H13547   2.285   5.388   1.207  3.0144  2.5562  1.0920
 4502OCTA     H13548   2.121   5.425   1.265 -1.2460 -1.3635  1.4829
 4502OCTA    Oc13549   2.131   5.677   1.238 -0.0183 -0.0221 -1.0402
 4502OCTA    Oh13550   2.353   5.635   1.208 -0.6036  0.2241 -0.8140
 4502OCTA     H13551   2.383   5.198   0.399  0.4893  0.7154 -0.9915
 4502OCTA    Ho13552   2.413   5.565   1.189 -0.4685 -0.0421 -2.1107
What I need to do is add a specific value, for instance add 1 to the numbers in the columns 4 and 5 and rows 3 to 8. Basically, I want to translate the positions of certain atoms. It is important that I keep the original file's format.
There were two challenging aspects to this code. The first is that the second and third columns merge when the numbers in the third column (atom IDs) reach five digits. I've worked around that, albeit not elegantly.
The second issue which I haven't been able to solve is how to write the new coordinates into a file. Matlab ignores the empty spaces before each row begins, and ignores the spaces in between the columns, and ignores the spaces after the rows end. I've tried using horzcat, mat2str, strcat, and maybe some others without success. I will leave my code below for you to examine.
function output = Gro_editor(filename, x1, y1, y2,addval)
%Initialize Values%
filend = 0;
fin = 0;
data = cell(15,22);
Xpos = 1; %X-coordinate
Ypos = 1; %Y-coordinate
%Open and get permission to write target file%
fid = fopen(filename,'r');
if fid == -1
    disp('Could not open file');
else
    disp('File Open...');
    %Index the entire file with meaningful partitions.
    %First 15 are indexed individually, following seven are read as blocks.
    while filend == 0
        if Xpos < 16
            data(Ypos,Xpos) = {cellstr(fscanf(fid,'%c',1))};
        elseif Xpos > 15
            data(Ypos,Xpos) = {fscanf(fid,'%f',1)};
        end
        if Xpos == 22
            Xpos = 1;
            Ypos = Ypos + 1;
            fscanf(fid,'%c',2);
        elseif Xpos < 22
            Xpos = Xpos + 1;
        end
        if Ypos == 20
            filend = 1;
        end
    end
    closeresult = fclose(fid);
    if closeresult == 0
        disp('File closed')
    else
        disp('File close unsuccessful')
    end
    %Writing new cell array
    while fin == 0
        data{y1,x1} = data{y1,x1}+addval
        if y1 < y2
            y1 = y1+1;
        elseif y1 == y2
            fin = 1;
        end
    end
    %So far so good. I don't know what I am doing after here.%
    ----------------------------------------------------------
    output=data;
    fid = fopen('output.txt','w');
    [nrows,ncols] = size(data);
    formatSpec = '%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c\n';
    for row = 1:nrows
        for col = 1:ncols
            line = horzcat(data(nrows,1:15))
            fprintf(fid,formatSpec,line(col));
        end
    end
    fclose(fid);
    type output.txt
end
  1 Comment
dpb
dpb on 13 Mar 2015
Edited: dpb on 15 Mar 2015
It is a pita to read fixed-width nondelimited data with Matlab owing to its C i/o formatting--the idea of having such was apparently overlooked and it is, simply put, impossible w/o counting columns if the fields are ever actually full.
That said, not sure where, precisely, your output formatting issues are arising but to write fixed column fields use a counted field of the proper width for both the string and the numeric fields and you can control that however is desired.
Provide a precise definition of the first two string columns' content (and spacing within the column if that is actually also significant) and I'm sure we can write an appropriate format string.
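As a minimal sketch of the counted-field idea above: if the file follows the GROMACS .gro column layout (an assumption suggested by the function name `Gro_editor` and the sample rows, not something confirmed in the thread), each row is a 5-wide residue number, a left-justified 5-wide residue name, a right-justified 5-wide atom name, a 5-wide atom number, three 8.3 positions and three 8.4 velocities. The variable names below are hypothetical and chosen only for illustration.

```matlab
% One atom record, taken from the sample data in the question.
resnr    = 4502;                        % residue number
resname  = 'OCTA';                      % residue name (left-justified field)
atomname = 'Oc';                        % atom name (right-justified field)
atomnr   = 13549;                       % atom number
xyz      = [2.131 5.677 1.238];         % positions
vel      = [-0.0183 -0.0221 -1.0402];   % velocities

% Counted field widths: every field has a fixed width, so padding blanks
% are regenerated on output instead of having to be preserved on input.
row_format = '%5d%-5s%5s%5d%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f';
row_text   = sprintf(row_format, resnr, resname, atomname, atomnr, xyz, vel);

disp(['[' row_text ']']);               % brackets make leading blanks visible
```

The same format string with a trailing `\n` can be passed to `fprintf` with a file identifier to write the record to a file.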

Accepted Answer

per isakson
per isakson on 15 Mar 2015
Edited: per isakson on 15 Mar 2015
The documentation of textscan doesn't cover fixed-width very well. However, textscan has "undocumented"/hidden capabilities.
Approach
  • Read the first three columns into one string, since they are only copied to the output file.
  • Read the following six columns into a double array.
  • Add 1 to the prescribed elements of the array.
  • Use the same format string to write the data (don't forget the new-line).
Run example code (I use R2013b)
fixed_width_format(13)
where
function fixed_width_format( N )
    fid = fopen( 'fixed_width_format.txt' );
    format_spec = '%20s%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f';
    cac = textscan( fid, format_spec, N     ...
        ,   'Whitespace'    , ''            ...
        ,   'Delimiter'     , ''            ...
        ,   'CollectOutput' , true          );
    fclose( fid );
    RowHead = cac{1};
    Data = cac{2};
    % add 1 to the numbers in the columns 4 and 5 and rows 3 to 8.
    Data( 3:8, 4:5 ) = Data( 3:8, 4:5 ) + 1;
    fid = fopen( 'fixed_width_format_out.txt', 'w' );
    for rr = 1 : N
        fprintf( fid, [format_spec,'\n'], RowHead{rr}, Data(rr,:) );
    end
    fclose( fid );
end
and where fixed_width_format.txt contains
    1SOL     OW    1   4.309   5.254   4.135 -0.2790  0.3440  0.2064
    1SOL    HW1    2   4.314   5.169   4.082 -1.5406  0.3918 -0.0293
    1SOL    HW2    3   4.388   5.312   4.114 -1.3375  0.9272 -2.6151
    2SOL     OW    4   1.743   1.687   2.366  0.2136  0.2777  0.3181
    2SOL    HW1    5   1.818   1.750   2.387  0.3115  0.1542  0.3431
 4502OCTA     H13545   2.108   5.326   1.045 -1.2169  0.4890 -2.6144
 4502OCTA     H13546   2.068   5.492   1.036  0.7609  0.6650  0.8612
 4502OCTA     H13547   2.285   5.388   1.207  3.0144  2.5562  1.0920
 4502OCTA     H13548   2.121   5.425   1.265 -1.2460 -1.3635  1.4829
 4502OCTA    Oc13549   2.131   5.677   1.238 -0.0183 -0.0221 -1.0402
 4502OCTA    Oh13550   2.353   5.635   1.208 -0.6036  0.2241 -0.8140
 4502OCTA     H13551   2.383   5.198   0.399  0.4893  0.7154 -0.9915
 4502OCTA    Ho13552   2.413   5.565   1.189 -0.4685 -0.0421 -2.1107
--- 0---|--- 10---|--- 20---|--- 30---|--- 40---|--- 50---|--- 60---
123456789|123456789|123456789|123456789|123456789|123456789|123456789
'%20s%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f'
and where fixed_width_format_out.txt contains
    1SOL     OW    1   4.309   5.254   4.135 -0.2790  0.3440  0.2064
    1SOL    HW1    2   4.314   5.169   4.082 -1.5406  0.3918 -0.0293
    1SOL    HW2    3   4.388   5.312   4.114 -0.3375  1.9272 -2.6151
    2SOL     OW    4   1.743   1.687   2.366  1.2136  1.2777  0.3181
    2SOL    HW1    5   1.818   1.750   2.387  1.3115  1.1542  0.3431
 4502OCTA     H13545   2.108   5.326   1.045 -0.2169  1.4890 -2.6144
 4502OCTA     H13546   2.068   5.492   1.036  1.7609  1.6650  0.8612
 4502OCTA     H13547   2.285   5.388   1.207  4.0144  3.5562  1.0920
 4502OCTA     H13548   2.121   5.425   1.265 -1.2460 -1.3635  1.4829
 4502OCTA    Oc13549   2.131   5.677   1.238 -0.0183 -0.0221 -1.0402
 4502OCTA    Oh13550   2.353   5.635   1.208 -0.6036  0.2241 -0.8140
 4502OCTA     H13551   2.383   5.198   0.399  0.4893  0.7154 -0.9915
 4502OCTA    Ho13552   2.413   5.565   1.189 -0.4685 -0.0421 -2.1107
Finally, does this example rely on undocumented features of textscan?
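One way to sidestep that question is to avoid textscan altogether and slice each line by character position. The sketch below is not the code from this answer; it is a hypothetical helper (`translate_fixed_width` and all of its argument names are made up here) that assumes a 20-character row header followed by six 8-character numeric fields, as in the format string above, and that every selected line is a full-width data row. Columns are counted within the numeric block, so `1:2` shifts x and y, whereas `Data(3:8,4:5)` in the example above addresses the fourth and fifth numeric fields.

```matlab
function translate_fixed_width(infile, outfile, rows, cols, addval)
% Add addval to selected numeric fields of a fixed-width text file,
% leaving every other character of the file untouched.
% Assumed layout: 20-character header, then 3 x %8.3f and 3 x %8.4f.

    % Read the file line by line; fgetl keeps leading and trailing blanks.
    fid = fopen(infile, 'r');
    assert(fid ~= -1, 'Could not open %s', infile);
    lines = {};
    tline = fgetl(fid);
    while ischar(tline)
        lines{end+1} = tline; %#ok<AGROW>
        tline = fgetl(fid);
    end
    fclose(fid);

    widths = [8 8 8 8 8 8];
    fmts   = {'%8.3f', '%8.3f', '%8.3f', '%8.4f', '%8.4f', '%8.4f'};
    starts = 20 + cumsum([1 widths(1:end-1)]);   % first column of each field

    for r = rows
        for c = cols
            idx   = starts(c) : starts(c) + widths(c) - 1;
            value = str2double(lines{r}(idx)) + addval;
            txt   = sprintf(fmts{c}, value);
            % Refuse to write a value that no longer fits its field.
            assert(numel(txt) == widths(c), 'Value overflows field width');
            lines{r}(idx) = txt;
        end
    end

    % Write all lines back, one per row, with the original spacing.
    fid = fopen(outfile, 'w');
    assert(fid ~= -1, 'Could not open %s', outfile);
    fprintf(fid, '%s\n', lines{:});
    fclose(fid);
end
```

A call such as `translate_fixed_width('fixed_width_format.txt', 'fixed_width_format_out.txt', 3:8, 1:2, 1)` would then add 1 to the x and y fields of rows 3 to 8. Because only the selected character ranges are overwritten, the merged atom-name and atom-number columns never need to be parsed.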
Addendum triggered by comment
>> data = fixed_width_format(7);
>> data(1:5,:)
ans =
4.3090 5.2540 4.1350 -0.2790 0.3440 0.2064
4.3140 5.1690 4.0820 -1.5406 0.3918 -0.0293
4.3880 5.3120 4.1140 -0.3375 1.9272 -2.6151
1.7430 1.6870 2.3660 1.2136 1.2777 0.3181
1.8180 1.7500 2.3870 1.3115 1.1542 0.3431
>> data(6:7,:)
ans =
0 111 111111 1222222 2222333 3333333
123456 789012 345678 9012345 6789012 3456789
>>
where
function data = fixed_width_format(N)
    fid = fopen( 'fixed_width_format_dpb.txt' );
    format_spec = '%6f%6f%6f%7f%7f%7f';
    cac = textscan( fid, format_spec, N     ...
        ,   'Whitespace'    , ''            ...
        ,   'Delimiter'     , ''            ...
        ,   'CollectOutput' , true          );
    fclose( fid );
    data = cac{1};
end
and where fixed_width_format_dpb.txt contains
 4.309 5.254 4.135-0.2790 0.3440 0.2064
 4.314 5.169 4.082-1.5406 0.3918-0.0293
 4.388 5.312 4.114-0.3375 1.9272-2.6151
 1.743 1.687 2.366 1.2136 1.2777 0.3181
 1.818 1.750 2.387 1.3115 1.1542 0.3431
000000000111111111122222222223333333333
123456789012345678901234567890123456789
  27 Comments
per isakson
per isakson on 10 May 2015
Edited: per isakson on 10 May 2015
"the display result forces a cast to the lower precision" Thanks! I was too occupied with textscan to think of that.
Over the years there is a remarkable number of new and updated Matlab functions to read flat text files. And many toolboxes have their own variants. Obviously, TMW wants to enhance the "user experience". With my comment I just wanted to say that they should try even harder; "constant dropping wears the stone".
Thank you for the detailed explanation. However, IMO, there are way too many subtle details for a "high level language" like Matlab.
I'll be back with a new thread.
dpb
dpb on 10 May 2015
Edited: dpb on 11 May 2015
"...there are way too many subtle details for a "high level language" like Matlab."
I'd concur although I think it's inevitable given the choice of the underlying implementation; it's just inherent with the way the C library operates for these kinds of cases and there is so much generality that one must be able to handle to make a truly universal tool.
IMO it would help, however, if the documentation were written as a definitive normative description that did have sufficient detail that one could infer the result simply from the TMW-supplied help files. But, then they would be so complex that
  1. Nobody would read them, and
  2. It would take a "language lawyer" to parse the result in the exotic cases if they did.
The latter is a frequent topic at comp.lang.fortran, where one of the regulars is a former editor of the Standard and there are regular discussions and disagreements as to whether a given construct is or is not "standard".
ADDENDUM
BTW, it is the combination of #1 and #3 above that is perhaps the most critical difference between Fortran and C on the interpretation of fixed-width fields input. That W characters are read irrespective of a presumed interpretation as "white space" (1) and that a field is "zero-filled" as necessary (3) so that a blank field is thus NOT presumed empty. (Hmmmm....interesting thought--would that mean that for your function you could use the "read character array to memory" idea and do a global substitution of zeros for blanks and then the field width count from existing textscan would work? Not sure if it would be totally general or not otomh but it's an intriguing thought, methinks...)
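A minimal sketch of that blank-to-zero idea, using sscanf with counted field widths rather than textscan (a substitution made here to keep the example small; whether textscan behaves the same way is not shown). The line is the first row of fixed_width_format_dpb.txt, assuming it carries one leading blank so that its fields are 6, 6, 6, 7, 7 and 7 characters wide.

```matlab
% First row of the fused-field file, with its assumed leading blank.
raw = ' 4.309 5.254 4.135-0.2790 0.3440 0.2064';

% Fortran-style "blank means zero": every field is now completely full,
% so no leading white space is left for the C-style scanner to skip.
filled = strrep(raw, ' ', '0');

% Counted widths split the fused fields purely by character position.
values = sscanf(filled, '%6f%6f%6f%7f%7f%7f').';

disp(values);
```

This only holds when no blank sits in front of a sign: a field such as `  -1.5` would become `00-1.5` and no longer parse as one number, which is one reason the substitution is not totally general.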
