4.5 | 6 ratings | 94 Downloads (last 30 days) | File Size: 4.91 KB | File ID: #29642

Get HTML-table data into MATLAB via urlread and without builtin browser

Based on getTableFromWeb, with a little more functionality for bringing table data from the web into MATLAB

File Information
Description

The function getTableFromWeb_mod is based on the very good "Pick of the Week" from August 20th, 2010 (http://www.mathworks.com/matlabcentral/fileexchange/22465-get-html-table-data-into-matlab) by Jeremy Barry.
It was motivated by the restrictions of the original function and is meant to help users who had problems with the loading time of the requested web page. As a workaround, it does not use MATLAB's internal web browser; instead it uses the urlread function to import and analyze the table data.

To get table data, you need to know the URL to read from and the number of the table on that page. If you have a URL but do not know which table number holds the data, use the original function getTableFromWeb
(http://www.mathworks.com/matlabcentral/fileexchange/22465-get-html-table-data-into-matlab) to check which table number contains the content you are interested in.

The first example (at the end of this description) gets current departure information from German Railways for the railway station Frankfurt Hbf (identified by its IBNR, the international railway station number).

The second example follows the original example by Jeremy Barry and gets financial information.

There are two input arguments:
url_string -- the URL of the requested web page, as a string
nr_table -- the number of the table to read and return in out_table

Output argument:
out_table -- a cell array containing the requested data
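
The following is a minimal sketch of how a urlread-based table reader of this kind could work. It is not the actual source of getTableFromWeb_mod; the regular expressions and the variable names html, tables, body, rows and cells are assumptions made for illustration. A literal HTML string stands in for the result of urlread(url_string) so that the snippet runs without network access. Matching is case-insensitive, so <TABLE> and <table> are treated alike. Nested tables are not handled.

```matlab
% Hypothetical sketch: extract the nr_table-th table from an HTML string.
% In real use, html could come from: html = urlread(url_string);
html = ['<html><body><TABLE><tr><td>skip</td></tr></TABLE>' ...
        '<table border="1"><tr><th>Name</th><th>Value</th></tr>' ...
        '<tr><td>P/E</td><td>21.5</td></tr></table></body></html>'];
nr_table = 2;

% Find every <table ...>...</table> block, ignoring letter case.
tables = regexpi(html, '<table(?:\s[^>]*)?>(.*?)</table>', 'tokens');
if nr_table > numel(tables)
    error('No table number %d detected.', nr_table);
end
body = tables{nr_table}{1};

% Split the table body into rows, then each row into header/data cells.
rows = regexpi(body, '<tr(?:\s[^>]*)?>(.*?)</tr>', 'tokens');
out_table = {};
for r = 1:numel(rows)
    cells = regexpi(rows{r}{1}, '<t[dh](?:\s[^>]*)?>(.*?)</t[dh]>', 'tokens');
    for c = 1:numel(cells)
        % Remove nested tags and surrounding whitespace from the cell text.
        out_table{r, c} = strtrim(regexprep(cells{c}{1}, '<[^>]*>', ''));
    end
end
disp(out_table)
```

In this sketch, rows with fewer cells than the widest row leave the missing entries as empty ([]) cells. To collect several tables from one page, the same steps could be repeated in a loop over a vector of table numbers.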

Example:

% German Railways travel information example
 ibnr = 8098105; % IBNR of the railway station Frankfurt Hbf (for more IBNRs see: http://www.ibnr.de.vu/)
 url_string = [ 'http://reiseauskunft.bahn.de/bin/bhftafel.exe/dn?rt=1&ld=10000&evaId=', num2str(ibnr) ,'&boardType=dep&time=actual&productsDefault=1111000000&start=yes']; % query string requesting current departure information for Frankfurt Hbf
 nr_table = 2; % table with the travel information data
 out_table = getTableFromWeb_mod(url_string, nr_table)

% Finance example
% run getTableDataScript to see which table is number 7 (Valuation Measures)
url_string = 'http://finance.yahoo.com/q/ks?s=GOOG';
nr_table = 7;
out_table = getTableFromWeb_mod(url_string, nr_table)
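
Since out_table is a cell array, numeric entries typically arrive as text. A minimal sketch of converting one column to numbers is shown here, assuming the cells hold character strings; the sample values are a made-up stand-in for a returned table, not real data, and the names values and is_numeric are illustrative.

```matlab
% Hypothetical stand-in for a table returned by getTableFromWeb_mod.
out_table = {'Market Cap',   '150.2'; ...
             'Trailing P/E', '21.5';  ...
             'PEG Ratio',    'N/A'};

% str2double converts each string in the cell array to a double and
% returns NaN for entries that are not plain numbers.
values = str2double(out_table(:, 2));

% Logical mask of the rows that were converted successfully.
is_numeric = ~isnan(values);
disp(values(is_numeric))
```

Entries that cannot be parsed stay NaN, so they can be filtered out or inspected separately.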

Acknowledgements

Get Html Table Data Into Matlab inspired this file.

MATLAB release MATLAB 7.10 (R2010a)
Comments and Ratings (8)
21 Sep 2014 Gareth Thomas

Worked like a charm, thanks:)

16 Jun 2014 Jorge

Great function! However, I seem to have found a problem with a webpage that has only one table, where I got "No Table detected". I changed:

if i>1 % if there are tables to read

to

if i>=1 % if there are tables to read

This corrects for the case where only one table is detected, and it worked perfectly.

23 Jan 2014 Douglas

Hi, I am pretty new to this. What if I need to extract more than one table each time I run the script? Also, some of the text could not be read and it returned a [] cell array. Thanks.

15 Mar 2012 David Jessop

Sorry, stupid question above; I have now sorted it!

15 Mar 2012 David Jessop

That's great! Do you know how I remove the ' ' from around the numbers found so I can use them?

09 Jan 2012 Raj Sodhi

Fantastic!

14 Sep 2011 Borislav Oreshkin

Excellent.

28 Mar 2011 Brian

Great tool. One issue I discovered is that on some HTML sites the table identifier is capitalized (e.g. </TABLE> instead of </table>). In these cases the function fails because the string comparison commands are case sensitive. Modifying the function to use regexprep(...,'preservecase') and regexpi() where appropriate allows tables to be extracted from websites where the original function failed.
