webbot
WEBBOT is a Java-based browser with download support and Perl-style regular expressions. The function extracts all links from a web page and displays them. The linked documents can then be downloaded.
WEBBOT(URL)
URL is a string giving the base page address; it must point to an HTML file. The function lists all links in that file. URL can also be a cell vector of URL strings.
WEBBOT(URL, WHAT)
displays only specific links. WHAT is a string:
'all_links': displays all links (default).
'page_links': displays all links to an html web page*.
'local_links': displays all local links on the server*.
'external_links': displays all links to external websites.
'image_links': displays all links to an image file**.
'image_tags': displays all image tags <img src="xxx">.
'.xxx.yyyy.zz': displays all links to files with any of the listed extensions; case is ignored ('zip' will find 'ZiP'); e.g. '.zip.gz.gzip.tar.Z'.
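The extension form of WHAT can be illustrated with a minimal sketch. This is not taken from webbot's source; it only shows one way a string such as '.zip.gz.gzip.tar.Z' could be turned into a case-insensitive filter. The variable names and the example.com links are assumptions made for illustration.

```matlab
% Illustrative sketch only (not webbot's actual code): filter a cell array
% of links using a webbot-style extension string.
what  = '.zip.gz.gzip.tar.Z';
links = {'http://example.com/a.ZiP', ...        % hypothetical links
         'http://example.com/index.html', ...
         'http://example.com/src.tar', ...
         'http://example.com/notes.gzip'};

% Split the WHAT string on '.' and drop the empty leading piece.
exts = strsplit(what, '.');
exts = exts(~cellfun(@isempty, exts));

% Escape any regular-expression metacharacters in the extensions.
exts = cellfun(@(e) regexptranslate('escape', e), exts, ...
               'UniformOutput', false);

% One pattern that matches any listed extension at the end of a link.
pattern = ['\.(' strjoin(exts, '|') ')$'];

% regexpi ignores case, so 'zip' also matches 'ZiP'.
isMatch = ~cellfun(@isempty, regexpi(links, pattern, 'once'));
matched = links(isMatch);
disp(matched(:));
```

The sketch anchors the pattern at the end of the link, so links that carry a query string after the file name would not match; whether webbot itself handles that case is not stated here.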
WEBBOT(URL, WHAT, ACT)
performs an action on found links. ACT is a string:
'noaction': just display links (default)
'download': downloads all found links to the local machine.
'cartoons': downloads all image tags found on linked pages. This is useful for cartoon websites where each cartoon (e.g. "01.gif") is on its own html page (e.g. "c01.html").
'follow.x': follows links to html pages and recursively performs the same action on the resulting page. 'x' is an integer indicating the recursion depth (0 is equivalent to 'noaction').
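As a rough illustration of the 'follow.x' form, the following sketch extracts the depth from an ACT string with a regular expression. It is an assumption about how such a string could be parsed, not webbot's own parsing code; treating every other ACT value as depth 0 is likewise a choice made only for this example.

```matlab
% Illustrative sketch only: read the recursion depth out of 'follow.x'.
act = 'follow.2';

% Capture the integer after 'follow.'; tok is empty if ACT has another form.
tok = regexp(act, '^follow\.(\d+)$', 'tokens', 'once');

if isempty(tok)
    depth = 0;                    % assumption: no recursion for other actions
else
    depth = str2double(tok{1});   % e.g. 'follow.2' gives depth 2
end

fprintf('Recursion depth: %d\n', depth);
```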
lks = WEBBOT(URL, ...)
returns a cell array with the links of URL{end}.
Notes: * Links explicitly pointing to a .htm or .html url.
** Image links are recognized by the following file types:
.jpg .jpeg .gif .pict .bmp .tif .tiff .ras .png (.giff)
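The image file types listed above can be checked with a short sketch like the one below. It is not webbot's implementation; it simply compares each link's extension against the documented list, ignoring case. The links are hypothetical.

```matlab
% Illustrative sketch only: flag links whose extension is one of the
% image file types listed in the notes.
imageExts = {'.jpg', '.jpeg', '.gif', '.pict', '.bmp', ...
             '.tif', '.tiff', '.ras', '.png', '.giff'};

links = {'http://example.com/pics/01.GIF', ...   % hypothetical links
         'http://example.com/c01.html', ...
         'http://example.com/pics/logo.png'};

isImage = false(size(links));
for k = 1:numel(links)
    % fileparts returns the extension including the leading dot.
    [~, ~, ext] = fileparts(links{k});
    % Compare in lower case so '.GIF' is treated like '.gif'.
    isImage(k) = ismember(lower(ext), imageExts);
end

disp(links(isImage)');
```

Links with a query string after the file name would need extra handling before the extension is compared.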
Try it with:
webbot('http://www.unitedmedia.com/comics/dilbert/archive/', ...
'local_links', 'cartoons');
Written by L.Cavin, 28.09.2003, (c) CSE
This code is free to use and modify for non-commercial purposes.
Web address: http://ltcmail.ethz.ch/cavin/CSEDBLib.html#WEBBOT
Cite As
Laurent Cavin (2024). webbot (https://www.mathworks.com/matlabcentral/fileexchange/4023-webbot), MATLAB Central File Exchange.
Platform Compatibility
Windows macOS Linux
Categories
- MATLAB > External Language Interfaces > Web Services with MATLAB > Call Web Services from MATLAB Using HTTP
Version | Published | Release Notes
---|---|---
1.0.0.0 | | Major update: