Thread Subject: read nonuniform text

Subject: read nonuniform text

From: wldf wal

Date: 28 Oct, 2008 09:57:01

Message: 1 of 4

hi,

I have to read a non-uniform text file. I tried several ways to read it, but without success. My text looks like this:
#cP2008 1 1 0 0 0.00000000 96 ORBIT IGS05 HLM IGS
## 1460 172800.00000000 900.00000000 54466 0.0000000000000
+ 32 G01G02G03G04G05G06G07G08G09G10G11G12G13G14G15G16G17
+ G18G19G20G21G22G23G24G25G26G27G28G29G30G31G32 0 0
+ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
++ 2 3 2 2 3 2 3 3 2 3 3 3 3 3 3 3 2
++ 2 2 3 3 2 3 2 3 2 3 3 0 2 2 4 0 0
++ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
++ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
++ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
%c G cc GPS ccc cccc cccc cccc cccc ccccc ccccc ccccc ccccc
%c cc cc ccc ccc cccc cccc cccc cccc ccccc ccccc ccccc ccccc
%f 1.2500000 1.025000000 0.00000000000 0.000000000000000
%f 0.0000000 0.000000000 0.00000000000 0.000000000000000
%i 0 0 0 0 0 0 0 0 0
%i 0 0 0 0 0 0 0 0 0
/* FINAL ORBIT COMBINATION FROM WEIGHTED AVERAGE OF:
/* cod emr esa gfz jpl mit ngs sio
/* REFERENCED TO IGS TIME (IGST) AND TO WEIGHTED MEAN POLE:
/* PCV:IGS05_1461 OL/AL:FES2004 NONE Y ORB:CMB CLK:CMB
* 2008 1 1 0 0 0.00000000
PG01 -14698.277128 -2535.799307 -21763.079031 176.099007 10 9 8 169
PG02 22442.389224 -10272.174156 -9560.136872 162.720385 11 9 10 193
PG03 -13620.463422 8919.821941 20695.070658 166.322185 7 6 8 150
PG04 19958.435008 150.295195 -17787.161153 -44.873145 7 11 8 144
PG05 -2823.115526 -16462.410098 -20939.207089 625.222106 14 9 10 185
PG06 -7143.196104 -25514.299802 1830.117866 170.117599 8 7 8 123
PG07 -10237.510207 -23940.114838 5512.438860 18.039115 11 10 12 127
PG08 14017.015904 7231.477558 21355.061525 -135.440094 12 11 11 183
PG09 12181.805529 -21204.777222 -10751.462189 103.555164 10 8 8 168
PG10 20955.493066 -3772.425757 15961.204995 -191.937552 14 12 13 177
PG11 -7911.583661 23011.271914 -10406.461307 26.858878 11 8 11 164
PG12 3597.925369 -14744.679693 -21709.078627 -353.004481 11 6 7 171
PG13 6302.303623 25466.378831 4141.700761 231.438520 9 9 8 165
PG14 -13901.965954 -14297.974424 -17487.878896 -308.992361 12 11 8 190
PG15 8043.076227 -19262.657551 16459.331387 -70.724287 8 6 7 127
PG16 -23557.523541 435.160643 12502.752577 130.642791 11 12 9 161
PG17 14425.037573 14961.864887 -16604.253188 45.370751 7 10 6 158
PG18 -14740.869594 -17038.887817 14502.939124 -215.440947 6 7 9 155
PG19 -9283.543487 18906.294600 16165.872654 14.687729 9 8 8 157
PG20 -502.268859 15950.765314 -21343.725180 123.445198 10 11 8 192
PG21 -6088.360637 -15011.938405 21125.289387 74.387588 9 8 7 145
PG22 -23711.369538 -11741.131155 2703.641648 206.205087 11 10 10 180
PG23 -1529.915775 24934.711906 -8730.648105 322.616742 11 9 11 162
PG24 7406.306436 -20068.249051 16104.791945 58.635678 10 7 7 164
PG25 167.773732 18905.588920 19085.370779 496.447720 11 7 10 169
PG26 9723.166517 -15936.361423 18196.454530 142.441239 8 6 10 149
PG27 6211.243170 14847.570806 21768.371652 157.876417 12 9 10 169
PG28 22657.886048 13092.931016 5887.875225 -12.977759 10 9 10 184
PG29 3816.647710 -23882.297183 -10941.975317 999999.999999
PG30 -9359.334917 -16911.316266 -18591.680456 58.197189 10 6 7 148
PG31 -20533.999421 2614.616320 -16481.402576 -3.607840 9 7 8 188
PG32 -24787.761236 -6801.202466 -5876.358848 999999.999999
* 2008 1 1 0 15 0.00000000
PG01 -15058.326297 -5000.778117 -21092.964767 176.099823 9 8 8 156
PG02 21733.552608 -9237.464754 -12017.289171 162.723274 10 9 9 182
PG03 -14441.820840 6605.755634 21031.270746 166.327278 7 6 8 147
PG04 18483.506644 1643.505769 -19220.844162 -44.878291 7 10 8 141
PG05 -492.171174 -15995.283710 -21473.031033 625.238575 14 9 9 189
PG06 -6825.928215 -25628.429721 -970.178911 170.104991 8 7 8 122
PG07 -10049.336278 -24451.170524 2761.218462 18.070673 11 10 12 68


The first 22 lines are a header that I don't need, but I do need the lines beginning with a star (*) and the ones beginning with PGXX.

I can't use dlmread (which is great for reading ASCII text) because of the lines that start with a star (*). And textscan is impractical, because the text is long (around 3000-4000 lines) and I never know the number of rows in advance.

It would be possible to use the non-interactive editor "sed", but I need a solution that is independent of the OS.

Can somebody help me?

Thank you

Subject: read nonuniform text

From: Andres

Date: 28 Oct, 2008 14:58:02

Message: 2 of 4

A suggestion with txt2mat from the file exchange:
.
A = txt2mat('file.txt','ReplaceExpr',{{'*','0'},{'PG','1 '}});
.
The values in the first column of A, 0 or 1, will indicate whether the numbers were taken from a line starting with '*' or 'PG', respectively. Rows will be padded with NaNs.
Hth
Andres
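
Once A is loaded this way, the flag column makes it easy to separate the two record types. A minimal sketch, assuming the layout described above (flag in column 1, short rows padded with NaN). The small matrix A below is typed in by hand from the sample data as a stand-in for the real txt2mat output, and the names isEpoch, epochs, sats and satEpoch are made up for illustration:

```matlab
% Hand-typed stand-in for the txt2mat result (flag, then the numbers of each line)
A = [0, 2008, 1, 1, 0,  0, 0, NaN, NaN, NaN;
     1, 1, -14698.277128,  -2535.799307, -21763.079031, 176.099007, 10, 9,  8, 169;
     1, 2,  22442.389224, -10272.174156,  -9560.136872, 162.720385, 11, 9, 10, 193;
     0, 2008, 1, 1, 0, 15, 0, NaN, NaN, NaN;
     1, 1, -15058.326297,  -5000.778117, -21092.964767, 176.099823,  9, 8,  8, 156];

isEpoch  = A(:,1) == 0;          % rows that came from '*' lines
epochs   = A(isEpoch, 2:7);      % year month day hour minute second
sats     = A(~isEpoch, 2:end);   % satellite number, then the remaining values

% Running count of '*' rows gives the epoch each satellite row belongs to
epochIdx = cumsum(isEpoch);
satEpoch = epochIdx(~isEpoch);   % index into the rows of epochs
```

With this, sats(k,:) was recorded at the time given in epochs(satEpoch(k),:).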

Subject: read nonuniform text

From: wldf wal

Date: 28 Oct, 2008 21:52:01

Message: 3 of 4

"Andres" <rantore@werb.deNoRs> wrote in message <ge799q$ss$1@fred.mathworks.com>...
> A suggestion with txt2mat from the file exchange:
> .
> A = txt2mat('file.txt','ReplaceExpr',{{'*','0'},{'PG','1 '}});
> .
> The values in the first column of A, 0 or 1, will indicate whether the numbers were taken from a line starting with '*' or 'PG', respectively. Rows will be padded with NaNs.
> Hth
> Andres

Thank you for your answer :]. I will try your advice (not today, but tomorrow :]).

For now I have this code, which uses the function "fgetl()", then splits each line into strings and converts the numeric strings to double. I also need the PGXX number, which is converted to a number without the 'PG' via a regular expression.

Here is the most important part of my code:

fid = fopen('file.txt');

while ~feof(fid)
  % Read line from file
  newLine = fgetl(fid);
  .
  .
  .
  % Split the line into separate strings via a regular expression
  workLine = regexp(newLine, '([^ \t]*)', 'match');
  .
  .
  .
  % Some lines begin with a star (*), so I check every line and if I find a star (*) I substitute zero for it
  if strcmp(newLine(ii),'*')
    SP3data(lineIndex-22,ii) = 0;
  end
  .
  .
  .
  % This part converts the "PGXX" string into a double via a regular expression
  a = char(workLine(ii));
  aa = regexp(char(a),'([0-9]*)', 'match');
  aaa = char(aa);
  % Row index is shifted by the 22 header lines, as in the other assignments
  SP3data(lineIndex-22,ii) = str2double(aaa);
  .
  .
  .
  % This part converts a number stored as a string to double
  SP3data(lineIndex-22,ii) = str2double(workLine(ii));
  .
  .
end
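
For reference, the fragments above could be put together roughly as follows. This is only a sketch and not the original code: it swaps the regular expressions for sscanf, and instead of counting the 22 header lines it keeps only lines that start with '*' or 'PG' (none of the header lines in the sample do). It assumes the fields are separated by whitespace, as in the sample. To keep it runnable on its own, it first writes a few of the sample lines to a temporary file; fname would be replaced by the real file, and demoLines and nCols are illustrative names.

```matlab
% Tiny demo file built from lines of the sample (stand-in for the real file)
demoLines = { ...
    '/* cod emr esa gfz jpl mit ngs sio', ...
    '* 2008 1 1 0 0 0.00000000', ...
    'PG01 -14698.277128 -2535.799307 -21763.079031 176.099007 10 9 8 169', ...
    'PG29 3816.647710 -23882.297183 -10941.975317 999999.999999'};
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', demoLines{:});
fclose(fid);

nCols   = 9;                    % longest record: satellite number + 8 values
SP3data = zeros(0, nCols);

fid = fopen(fname, 'r');
while true
    newLine = fgetl(fid);
    if ~ischar(newLine), break; end                  % fgetl returns -1 at end of file
    if strncmp(newLine, '*', 1)
        vals = [0; sscanf(newLine(2:end), '%f')];    % leading 0 marks an epoch row
    elseif strncmp(newLine, 'PG', 2)
        vals = sscanf(newLine(3:end), '%f');         % first value is the satellite number
    else
        continue                                     % header or any other line
    end
    n = min(numel(vals), nCols);
    row = zeros(1, nCols);                           % short rows stay padded with zeros
    row(1:n) = vals(1:n);
    SP3data(end+1, :) = row;                         %#ok<AGROW>
end
fclose(fid);
delete(fname);
```

Since satellite numbers start at 1, a 0 in the first column identifies an epoch row without ambiguity.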

Subject: read nonuniform text

From: wldf wal

Date: 28 Oct, 2008 22:04:02

Message: 4 of 4

I forgot to mention that I have to (!!!) control the number of strings/columns in every line. I know the longest line (9 in my case), so I have to add zeros after the numbers (e.g. number number number 0 0 0 0 0 0 ;]).

Or you can create a big matrix with 9 columns (containing only zeros) and overwrite each position. Which one you choose is up to you ;].
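
A minimal sketch of the second variant (zero matrix first, then overwrite), assuming the '*' and 'PG' lines have already been collected into a cell array. The names recLines and nCols are made up for illustration, and regexprep/sscanf stand in for the token-by-token conversion:

```matlab
% A few record lines from the sample, already separated from the header
recLines = { ...
    '* 2008 1 1 0 0 0.00000000', ...
    'PG01 -14698.277128 -2535.799307 -21763.079031 176.099007 10 9 8 169', ...
    'PG29 3816.647710 -23882.297183 -10941.975317 999999.999999'};

nCols   = 9;                                  % length of the longest line
SP3data = zeros(numel(recLines), nCols);      % big matrix containing only zeros

for k = 1:numel(recLines)
    % Replace a leading '*' by '0 ' and strip a leading 'PG'
    txt  = regexprep(recLines{k}, {'^\*', '^PG'}, {'0 ', ''});
    vals = sscanf(txt, '%f');
    n    = min(numel(vals), nCols);
    SP3data(k, 1:n) = vals(1:n);              % positions not overwritten stay 0
end
```

Because the matrix is allocated once, no explicit padding step is needed and the loop does not grow the array.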
