webread is not getting all text on a website

12 views (last 30 days)
Will Kinsman
Will Kinsman on 28 Feb 2016
Answered: Walter Roberson on 28 Feb 2016
Hi all,
I am trying to build a program to get the plain text from a website. The issue I am encountering is that webread does not seem to be collecting all of the text on the site (specifically, the table; see website below). I see my options are twofold:
  1. query a third-party html-to-plain text website that can do a better job
  2. determine if there is a workaround that catches more text than the webread method I am using now
here is my code:
html = webread('https://finance.yahoo.com/q/bs?s=MXWL');
txt = regexprep(html,'<script.*?/script>','');
saveTXT(txt,'htmlplaintext');
help is greatly appreciated; I love you guys!
Will

Answers (1)

Walter Roberson
Walter Roberson on 28 Feb 2016

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!