4.5 | 4 ratings | 31 Downloads (last 30 days) | File Size: 4.91 KB | File ID: #29642

Get HTML-table data into MATLAB via urlread and without builtin browser

by Sven Körner

 

07 Dec 2010

Based on getTableFromWeb, with a little more functionality for bringing table data from the web into MATLAB


File Information
Description

The function getTableFromWeb_mod is based on the very good "Pick of the Week" from August 20th, 2010 (http://www.mathworks.com/matlabcentral/fileexchange/22465-get-html-table-data-into-matlab) by Jeremy Barry.
It was motivated by the restrictions of the original function and is meant to help users who had problems with the loading time of the requested web page. As a workaround, it does not use MATLAB's internal web browser but instead uses the urlread function to import and analyze the table data.
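
The general idea of reading a table from page source instead of a browser can be sketched as follows. This is a minimal illustration only, not the code of getTableFromWeb_mod: the regular expressions and the sample HTML string are assumptions, and nested tables are not handled. With a real page, the sample string would be replaced by the result of urlread.

```matlab
% Hypothetical sketch; NOT the implementation of getTableFromWeb_mod.
% With a live page you would fetch the source first:
%   html = urlread(url_string);
% A small sample string is used here so the sketch runs offline.
html = ['<html><body><table><tr><th>Name</th><th>Value</th></tr>', ...
        '<tr><td>A</td><td>1.5</td></tr></table></body></html>'];
nr_table = 1;                                  % table to extract

% Capture the inner HTML of every table on the page.
tables = regexp(html, '<table[^>]*>(.*?)</table>', 'tokens');
table_html = tables{nr_table}{1};

% Split the chosen table into rows, then each row into cells.
rows = regexp(table_html, '<tr[^>]*>(.*?)</tr>', 'tokens');
out_table = {};
for r = 1:numel(rows)
    cells = regexp(rows{r}{1}, '<t[dh][^>]*>(.*?)</t[dh]>', 'tokens');
    for c = 1:numel(cells)
        % Remove any remaining tags and surrounding whitespace.
        out_table{r, c} = strtrim(regexprep(cells{c}{1}, '<[^>]*>', ''));
    end
end
disp(out_table)
```

Because only the page source is downloaded and parsed, nothing has to be rendered, which is what avoids the browser loading time.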

To get table data, you need to know the URL to read from and the number of the table on that page.
If you have a URL but do not know which table number holds the data, use the original function getTableFromWeb
(http://www.mathworks.com/matlabcentral/fileexchange/22465-get-html-table-data-into-matlab) to check which table number has the content you are interested in.
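
As an alternative way to find a table number, one could print a short text preview of every table found in the page source. The sketch below is an assumption for illustration and is not part of either submission; the sample HTML stands in for the output of urlread.

```matlab
% Hypothetical helper sketch for locating a table number.
% With a live page: html = urlread(url_string);
html = ['<table><tr><td>Menu</td></tr></table>', ...
        '<p>text</p>', ...
        '<table><tr><td>Departure</td><td>12:05</td></tr></table>'];

% Capture the inner HTML of every table on the page.
tables = regexp(html, '<table[^>]*>(.*?)</table>', 'tokens');
n_tables = numel(tables);
previews = cell(n_tables, 1);

for k = 1:n_tables
    txt = regexprep(tables{k}{1}, '<[^>]*>', ' ');   % tags -> spaces
    txt = strtrim(regexprep(txt, '\s+', ' '));       % collapse whitespace
    previews{k} = txt(1:min(40, numel(txt)));        % first 40 characters
    fprintf('Table %d: %s\n', k, previews{k});
end
```

The index printed next to the preview you recognize is the value to pass as nr_table, assuming the function counts tables in document order.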

The first example (at the end of the description) gets current departure information from German Railways for the railway station Frankfurt Hbf (identified by its IBNR, the international railway station number).

The second example corresponds to the original example by Jeremy Barry and gets financial information.

There are two input arguments:
url_string -- string containing the URL of the requested web page
nr_table -- number of the table to read and return in out_table

Output argument:
out_table -- cell array containing the requested data
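
Since the data comes from HTML text, the entries of out_table are presumably strings, so numeric columns need converting before use. A minimal sketch with made-up sample values (the cell contents below are assumptions, not real output of the function):

```matlab
% Made-up sample standing in for a returned out_table (assumption).
out_table = {'A', '196.05'; 'B', '24.1'; 'C', 'n/a'};

% str2double accepts a cell array of strings and returns NaN for
% entries that cannot be interpreted as numbers.
values = str2double(out_table(:, 2));
is_num = ~isnan(values);

disp(values(is_num))    % only the entries that converted successfully
```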

Example:

% German Railways travel information example
 ibnr = 8098105; % IBNR railway station: Frankfurt-Hbf (for more IBNRs see: http://www.ibnr.de.vu/)
 url_string = [ 'http://reiseauskunft.bahn.de/bin/bhftafel.exe/dn?rt=1&ld=10000&evaId=', num2str(ibnr) ,'&boardType=dep&time=actual&productsDefault=1111000000&start=yes']; % query string for requesting current departure information for Frankfurt Hbf
 nr_table = 2; % table containing the travel information data
 out_table = getTableFromWeb_mod(url_string, nr_table)

% Finance example
% run getTableDataScript to see which table is number 7 (Valuation Measures)
url_string = 'http://finance.yahoo.com/q/ks?s=GOOG';
nr_table = 7;
out_table = getTableFromWeb_mod(url_string, nr_table)

Acknowledgements

The author wishes to acknowledge the following in the creation of this submission:
Get HTML Table Data into MATLAB

MATLAB release MATLAB 7.10 (2010a)
Comments and Ratings (5)
28 Mar 2011 Brian

Great tool. One issue I discovered is that on some HTML sites the table identifier is capitalized (e.g. </TABLE> instead of </table>). In these cases the function fails because the string comparison commands are case sensitive. Modifying the function to use regexprep(...,'preservecase') and regexpi() where appropriate allows tables to be extracted from websites where the original function failed.
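
The case-sensitivity issue described above can be illustrated with a minimal sketch. The pattern and the sample HTML are assumptions for illustration, not the expressions used inside getTableFromWeb_mod.

```matlab
% Sample page source with upper-case tags (assumption for illustration).
html = '<TABLE><TR><TD>x</TD></TR></TABLE>';
pattern = '<table[^>]*>(.*?)</table>';

% Case-sensitive matching finds nothing in upper-case markup ...
sensitive = regexp(html, pattern, 'tokens');

% ... while regexpi ignores case and captures the table body.
% regexp(html, pattern, 'tokens', 'ignorecase') behaves the same way.
insensitive = regexpi(html, pattern, 'tokens');

fprintf('case-sensitive matches:   %d\n', numel(sensitive));
fprintf('case-insensitive matches: %d\n', numel(insensitive));
```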

14 Sep 2011 Borislav Oreshkin

Excellent.

09 Jan 2012 Raj Sodhi

Fantastic!

15 Mar 2012 David Jessop

That's great! Do you know how I remove the ' ' from around the numbers found so I can use them?

15 Mar 2012 David Jessop

Sorry, stupid question above; I have now sorted it!

Tag Activity for this File
Tag Applied By Date/Time
urlread Sven Körner 09 Dec 2010 08:43:02
cell array Sven Körner 09 Dec 2010 08:43:02
data Sven Körner 09 Dec 2010 08:43:02
table Sven Körner 09 Dec 2010 08:43:02
html Sven Körner 09 Dec 2010 08:43:02
html Fabrice 09 Mar 2011 11:16:52
table Jay 01 Oct 2011 14:25:48
