Is there an efficient way to test the validity of a large set of HTTP URLs in MATLAB?

If I use urlread/webread, it downloads the entire content, which is time-consuming and unnecessary.
If an HTTP URL doesn't exist (204 no content / 404 error), the function hangs for several seconds.
Is there a faster way?

Answers (1)

Walter Roberson on 25 Aug 2015
If you do not find a way with urlread() you can use urlread2() passing 'HEAD' as the method.
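For reference, a HEAD request can also be sketched without urlread2 by calling the Java networking classes that ship with MATLAB. This is only a rough sketch under assumptions: `headStatus` is a hypothetical helper name, the 2000 ms default timeout is arbitrary, and it assumes `java.net.URL` opens an `HttpURLConnection` for the URL in your MATLAB release. A server or proxy may also answer HEAD differently from GET, so the code it returns is a hint rather than a guarantee.

```matlab
function code = headStatus(urlStr, timeoutMs)
%HEADSTATUS Return the HTTP status code of a HEAD request, or NaN on failure.
%   headStatus is a hypothetical helper, not part of MATLAB or urlread2.
    if nargin < 2
        timeoutMs = 2000;                  % assumed default, in milliseconds
    end
    code = NaN;
    try
        u = java.net.URL(urlStr);          % throws on a malformed URL
        conn = u.openConnection();
        conn.setRequestMethod('HEAD');     % request headers only, no body
        conn.setConnectTimeout(timeoutMs); % cap the wait for a connection
        conn.setReadTimeout(timeoutMs);    % cap the wait for a response
        code = double(conn.getResponseCode());
        conn.disconnect();
    catch
        % Malformed URL, DNS failure, timeout, non-HTTP URL: code stays NaN
    end
end
```

For a list of URLs stored in a cell array `urls`, something like `codes = cellfun(@(u) headStatus(u, 2000), urls);` followed by `ok = codes >= 200 & codes < 400;` would flag the ones that answered successfully. Because `getResponseCode` reports 4xx and 5xx codes as numbers instead of throwing, a dead link shows up as its code, and only a failed connection shows up as NaN.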
  4 Comments
Ray Lee on 25 Aug 2015
Same problem with a non-existing URL:
[a,b] = urlread2('http://www.qwert.yui','HEAD')
produces the following errors:
Response stream is undefined
below is a Java Error dump (truncated):
Error using urlread2 (line 217)
Java exception occurred:
java.io.IOException: Server returned HTTP response code: 503 for URL:
http://www.qwert.yui
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
But URLs that return 204 (no content) or 404 (not found), for example
url = 'http://ws.resif.fr/fdsnws/dataselect/1/query?network=SX&station=FBE&starttime=2000-01-01&endtime=2000-01-02&channel=BHZ'
[a,b] = urlread2(url,'HEAD')
produce
a =
Empty string: 1-by-0
b =
allHeaders: [1x1 struct]
firstHeaders: [1x1 struct]
status: [1x1 struct]
url: 'http://ws.resif.fr/fdsnws/dataselect/1/query?netw...'
isGood: 1
and b.firstHeaders is
Response: 'HTTP/1.1 200 OK'
Date: 'Tue, 25 Aug 2015 17:17:41 GMT'
Server: 'Apache-Coyote/1.1'
Content_Type: 'text/plain; charset=UTF-8'
X_Cache: 'MISS from www-cache-3'
Via: '1.1 www-cache-3 (squid/3.4.12)'
Connection: 'keep-alive'
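Since `isGood` is 1 here, one option is to look at the numeric status code in the `Response` field directly. A minimal sketch, assuming the header struct has the layout shown above; `statusFromResponse` is a hypothetical helper name, not part of urlread2.

```matlab
function code = statusFromResponse(responseLine)
%STATUSFROMRESPONSE Extract the numeric status code from an HTTP status line.
%   Hypothetical helper; returns NaN if the input is not a status line.
    % Match e.g. 'HTTP/1.1 200 OK' and capture the three-digit code
    tok = regexp(responseLine, '^HTTP/\d+(?:\.\d+)?\s+(\d{3})', 'tokens', 'once');
    if isempty(tok)
        code = NaN;                 % no status line found
    else
        code = str2double(tok{1});  % convert the captured digits to a number
    end
end
```

With the output above, `statusFromResponse(b.firstHeaders.Response)` would give the code the server (or an intermediate cache) actually sent for the HEAD request, which can then be compared against 204 or 404 explicitly.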
