From: Predictor <predictr@bellatlantic.net>
Newsgroups: comp.soft-sys.matlab
Subject: Reading Web Data
Date: Sun, 25 Nov 2007 06:59:39 -0800 (PST)
Message-ID: <e96768fe-79a6-411b-b231-78bc1ca0b169@j44g2000hsj.googlegroups.com>



Is there a way to read Web data exactly as it appears under "View > Page
Source" in a browser?  In my experiments, urlread() seems to ignore HTML
tags and other items, but I'd like to read the exact contents of a Web
page and then interpret or filter out the tags and so forth in my own
code.  Any ideas?


Thanks,
Will