From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Pull out specific numbers from unstructured text file
Date: Sat, 09 Feb 2013 15:50:08 -0600
Organization: Aioe.org NNTP Server
Lines: 121
Message-ID: <kf6gae$jv4$1@speranza.aioe.org>
References: <kf1bg4$d0u$1@newscl01ah.mathworks.com> <kf1dlk$cg0$1@speranza.aioe.org> <kf1pc7$684$1@speranza.aioe.org> <kf672a$8l5$1@newscl01ah.mathworks.com>

On 2/9/2013 1:12 PM, Stan wrote:
> ^^^^^Okay I think I don't understand Lines 5,7,8 in your shortcut code:
>
> Line 1: > fid=fopen(....,'rt');
> Line 2: > l=' ';
> Line 3: > while 1
> Line 4: > l=fgetl(fid);
> Line 5: > if strfind(l,'Nmoves')>0,break,end
> Line 6: > end
> Line 7: > Nmoves=sscanf(l,'Nmoves=%d');
> Line 8: > Nrequired=fscanf(fid,'Nrequired=%d');
> Line 9: > fid=fclose(fid);
>
> My explanation is:
>
> while 1
> .
> .
> .
> end
>
> This is for lines 4-6 and this reads the file. If fgetl encounters the
> end-of-file indicator, it returns -1. So, as long as it returns 1 (i.e.
> anywhere before the end of the file), this statement is saying the while
> loop should perform the actions inside the if statement.

Not quite--the '1' in the WHILE construct is a constant and never 
changes--only finding the string 'Nmoves=' somewhere in the file will 
break the loop.

The condition in the WHILE would have to test the variable l after it's 
returned by fgetl() if it were to have any effect.  I chose not to do 
that 'cuz I presumed you'd only use this on an appropriate file, and it 
would mean reading the first line outside the loop or otherwise 
initializing l before the loop starts.  An alternative that would be a 
little cleaner in case the string isn't in the file would be to use 
while ~feof(fid), which would at least die gracefully on the EOF 
(eventually).
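
A minimal sketch of that while ~feof(fid) variant, for what it's worth. 
The throwaway file, its contents and the 'found' flag are made up for 
illustration; only the 'Nmoves=' line follows what you described. The 
ischar() test guards against fgetl() handing back -1 at end-of-file.

```matlab
% Build a small sample file (contents are hypothetical)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'some header text\nNmoves=12\nNrequired=7\ntrailing text\n');
fclose(fid);

fid = fopen(fname, 'rt');
found = false;
while ~feof(fid)                 % quits at end-of-file instead of spinning forever
    l = fgetl(fid);              % one whole line, or -1 if nothing was left
    if ischar(l) && ~isempty(strfind(l, 'Nmoves'))
        found = true;
        break
    end
end
if found
    Nmoves = sscanf(l, 'Nmoves=%d');
else
    Nmoves = [];                 % the string never showed up
end
fclose(fid);
delete(fname);
```

If the string is missing you end up w/ an empty Nmoves rather than a 
loop that never exits.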

> My explanation for line 5:
> If 'Nmoves' is found in the string l (where l is the contents of the
> file that have been read up to that point) then stop reading at that line.

Essentially--it breaks the loop having found the desired string and 
therefore the first line to parse (on the assumption the string pattern 
only exists for the line desired or at least it is the first 
occurrence).  At that point 'l' holds the content of just the one line 
read, not everything read so far--the strfind() simply scans that line 
for a match and returns the starting index of any match (empty if none).
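
A quick illustration of what strfind() hands back, using made-up strings 
(not lines from your file):

```matlab
idx  = strfind('xx Nmoves=12', 'Nmoves');   % index where the match starts
none = strfind('nothing here', 'Nmoves');   % no match, so the result is empty

if idx > 0                                   % true when a match was found
    hit = true;
else
    hit = false;
end
```

An empty result makes the IF condition false, which is why the test in 
the loop just falls through on lines that don't contain the string.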

>
> My explanations for lines 7 and 8:
> 7: Scan l for 'Nmoves=%d'.
> 8. Scan fid for 'Nrequired=%d'.

Well, depends on what you mean by "scan" -- they both do input 
conversion, matching the input against the formatting string according 
to the rules for each element in it.  The rule for a literal string is 
to match that string in the input and essentially ignore those matching 
characters.  %d converts a field as a decimal integer.  sscanf() works 
from a string variable ('l' in this case, which we filled w/ the desired 
line from the file previously, so now we're getting the desired value 
into a variable) while fscanf() takes input from the file which has been 
connected via fopen() and associated w/ a valid file handle (fid is just 
a convenient variable name for that).
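
To see the literal-matching rule by itself, here is a small sketch on 
hand-typed strings (the values are invented, not taken from your file):

```matlab
l = 'Nmoves=12';
Nmoves = sscanf(l, 'Nmoves=%d');             % literal 'Nmoves=' is matched and skipped, %d converts 12

% When the literal text doesn't match the input, nothing is converted
nomatch = sscanf('Other=12', 'Nmoves=%d');   % comes back empty
```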

> Questions:
> In line 8, why did you change from l to fid?

Because we need to scan another line, and fscanf() does that in one 
source code line directly from the file, whereas earlier we had used 
fgetl() to suck up each record in its entirety while searching for the 
target first line.  By your file, the next line was the location of the 
next value wanted, so there was no need for any more searching to find 
another randomly placed record--it was given to be the next.

> What is the connection between line 5 and lines 7,8?
> How does it know, after line 5 (i.e. after reaching the end of the line
> containing Nmoves), that it needs to search for the next two lines?

You described the file format and said the next line after the one 
containing "Nmoves" was the next desired field to be parsed.

You still don't seem to grasp that fgetl() reads a whole record up to 
and including the \n (newline), returns the text (minus the newline 
itself) in the character variable 'l', and that the first sscanf() 
parses that string--nothing else has happened in the file at that point 
(after the sscanf(), that is).  _THEN_ we went back to the file and got 
as much of the next record as required to get the next variable by the 
use of fscanf().

fscanf(), however, unlike fgetl(), does _NOT_ automagically read the 
entire record _UNLESS_ the format string provided tells it to do that.  
Your initial description didn't say anything about reading anything 
except these two values so I did just that--read records until I found 
the first one desired, then read just what was needed to get the 
variable value requested from the following record.  Period. End of 
story.  That's why later when you came back and said "Oh, that's not the 
end of what's needed" I said what I gave you was a shortcut specifically 
for the first problem outlined.

Now, the problem is that to read the rest of the desired records by 
continuing on w/ fscanf() you've got to write specific formatting 
strings to handle them (a pita since they're not symmetric in much of 
any useful way), and also deal w/ the fact that the file position marker 
has been left in the middle of the Nrequired record as above.

So, as noted in my previous response, given you want to do the other 
stuff I'd suggest it's simpler to revert to fgetl/sscanf pairs.
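
Roughly what I mean by fgetl/sscanf pairs, as a sketch only. The sample 
file and its contents are invented, and it assumes the 'Nmoves' line 
really is in the file w/ 'Nrequired' on the very next line, as you 
described:

```matlab
% Build a small sample file (contents are hypothetical)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'header\nNmoves=12\nNrequired=7\nmore stuff\n');
fclose(fid);

fid = fopen(fname, 'rt');
l = fgetl(fid);
while ischar(l) && isempty(strfind(l, 'Nmoves'))
    l = fgetl(fid);                       % keep reading whole records
end
Nmoves = sscanf(l, 'Nmoves=%d');          % parse the record just found

l = fgetl(fid);                           % whole next record
Nrequired = sscanf(l, 'Nrequired=%d');

l = fgetl(fid);                           % the record after that, again whole
fclose(fid);
delete(fname);
```

Every fgetl() leaves the file positioned at the start of the following 
record, so each later line can be handled the same way w/ its own 
sscanf() format.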

Again, take the sample code and your example file and just type the 
while loop in at the command line and look at what the contents of 'l' 
are and then what happens if you follow the fscanf() call w/ a fgetl() 
to understand the difference...

Also read

doc fscanf
doc fgetl

and friends carefully...
