urlread2

version 1.2 (9.5 KB) by

Generalizes HTTP requests, providing more control and access to input and output

4.85714
7 Ratings

174 Downloads

Version 1.1

Replacement for urlread. Example functions in the urlread2 file show how to make calls equivalent to those made with urlread. Helper functions (some are provided) can be written to add functionality without always needing to modify the urlread2 code.

Specific improvements:
- improved Unicode support
- improved binary retrieval support
- request and response header access
- response status access

A technical description of the implementation can be found at:
http://undocumentedmatlab.com/blog/expanding-urlreads-capabilities/

Comments and Ratings (32)

Arek Majka

I have been using this API successfully for months. It stopped working today... I am getting error:

Error using Quandl.api (line 36)
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->
<!--[if IE 7]> <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->
<head>
<title>Access denied | www.quandl.com used CloudFlare to restrict access</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
<meta name="robots" content="noindex, nofollow" />
<meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1" />
<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" type="text/css" media="screen,projection" />
<!--[if lt IE 9]><link rel="stylesheet" id='cf_styles-ie-css' href="/cdn-cgi/styles/cf.errors.ie.css" type="text/css" media="screen,projection" /><![endif]-->
<style type="text/css">body{margin:0;padding:0}</style>
<!--[if lte IE 9]><script type="text/javascript" src="/cdn-cgi/scripts/jquery.min.js"></script><![endif]-->
<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/zepto.min.js"></script><!--<![endif]-->
<script type="text/javascript" src="/cdn-cgi/scripts/cf.common.js"></script>

</head>
<body>
<div id="cf-wrapper">
<div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>
<div id="cf-error-details" class="cf-error-details-wrapper">
<div class="cf-wrapper cf-header cf-error-overview">
<h1>
<span class="cf-error-type" data-translate="error">Error</span>
<span class="cf-error-code">1010</span>
<small class="heading-ray-id">Ray ID: 2f0bf4c4ae045e82 &bull; 2016-10-12 16:33:53 UTC</small>
</h1>
<h2 class="cf-subheadline" data-translate="error_desc">Access denied</h2>
</div><!-- /.header -->

<section></section><!-- spacer -->

<div class="cf-section cf-wrapper">
<div class="cf-columns two">
<div class="cf-column">
<h2 data-translate="what_happened">What happened?</h2>
<p>The owner of this website (www.quandl.com) has banned your access based on your browser's signature (2f0bf4c4ae045e82-ua21).</p>
</div>

</div>
</div><!-- /.section -->

<div class="cf-error-footer cf-wrapper">
<p>
<span class="cf-footer-item">CloudFlare Ray ID: 2f0bf4c4ae045e82</span>
<span class="cf-footer-separator">&bull;</span>
<span class="cf-footer-item"><span data-translate="your_ip">Your IP</span>: 12.234.165.254</span>
<span class="cf-footer-separator">&bull;</span>
<span class="cf-footer-item"><span data-translate="performance_security_by">Performance &amp; security by</span> <a data-orig-proto="https"
data-orig-ref="www.cloudflare.com/5xx-error-landing?utm_source=error_footer" id="brand_link" target="_blank">CloudFlare</a></span>

</p>
</div><!-- /.error-footer -->

</div><!-- /#cf-error-details -->
</div><!-- /#cf-wrapper -->

<script type="text/javascript">
window._cf_translation = {};

</script>

</body>
</html>

Error in Quandl.get (line 124)
csv = Quandl.api(path, 'params', params);

What happened??

How can I attach the session cookie to every request? I tried adding the cookie in headersIn, but the server returns a new session cookie every time.

Thanks for the great function.

Eli

Dan

A great submission. However, how would you go about passing a cookie? Several websites I need to access require a cookie to be passed along with the login.

Jim Hokanson

Hi José,

Sorry for the delay. Two things jump out at me.

1) Your Python example doesn't include the nonce, whereas the MATLAB version does
2) The headers input should be a structure array, so you want to use [] instead of {} when concatenating the header entries

header = [ struct('name','Key','value','KKKK'),...
struct('name','Sign','value','SSSS'),...
struct('name','nonce','value','NNNN')
];

Let me know if you have any more questions and best of luck.

Jim

Hello,

First of all, thank you Jim for this complete function.

I have a question about the 'POST' method and the use of a header with several fields in your function urlread2.

What I want to do is easy to implement in Python, but I have not been able to do it in MATLAB. So I will provide the original Python code, followed by my attempt in MATLAB with your function. I hope this is enough information to see where the problem is.

PYTHON:
post_data = 'https://poloniex.com/tradingApi?command=returnBalances&nonce=NNNN'

sign = hmac.new(self.Secret, post_data, hashlib.sha512).hexdigest()

headers = {
'Sign': sign,
'Key': self.APIKey
}

ret = urllib2.urlopen(urllib2.Request('https://poloniex.com/tradingApi', post_data, headers))

My try in MATLAB:
body = 'https://poloniex.com/tradingApi?command=returnBalances&nonce=1440435809';

header = { struct('name','Key','value','KKKK'),...
struct('name','Sign','value','SSSS'),...
struct('name','nonce','value','NNNN')
};

urlbase = 'https://poloniex.com/tradingApi'

json = urlread2(urlbase,'POST',body,header);

Thank you in advance and kind regards

José
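
For reference, the signing step in the Python snippet above can be sketched in Python 3 roughly as follows. This is only an illustration: the key, secret and post data are placeholders, the helper names `sign_post_data` and `build_headers` are hypothetical, and exactly which string a service expects to be signed is an assumption that has to be checked against that service's documentation. Python 3's `hmac.new` needs bytes, so both the secret and the data are encoded first.

```python
import hashlib
import hmac


def sign_post_data(secret: str, post_data: str) -> str:
    """Return the hex HMAC-SHA512 digest of post_data, keyed by secret."""
    return hmac.new(
        secret.encode("utf-8"),      # hmac.new requires a bytes key in Python 3
        post_data.encode("utf-8"),   # the message must be bytes as well
        hashlib.sha512,
    ).hexdigest()


def build_headers(api_key: str, secret: str, post_data: str) -> dict:
    """Build the 'Key' and 'Sign' headers used in the snippet above."""
    return {"Key": api_key, "Sign": sign_post_data(secret, post_data)}


# Placeholder values for illustration only (not real credentials).
post_data = "command=returnBalances&nonce=1440435809"
headers = build_headers("KKKK", "not-a-real-secret", post_data)
```

The resulting `Sign` value is a 128-character hex string; it would be the value placed in the 'Sign' header entry of the request.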

Maxwell Agnew

I'm trying to use urlread2 to make a post request to the Etrade API. Their documentation says:

"Since this is a POST request, the parameters are included in the request as XML or JSON"

Can urlread2 handle a 'POST' request with the xml? An example would be much appreciated.
Thanks!

Francisco

Thanks

Jim Hokanson

@Francisco,

My apologies for PATCH not working. This is a problem with the underlying Java classes. I'll try to look into alternative Java classes.

Jim

Francisco

Hello.

The HTTP DELETE method works OK:

url ='https://api-fxpractice.oanda.com/v1/accounts';
header = http_createHeader('Authorization','Bearer XXXXXXXX-YYYYYYYYY');
urlread2(url,'DELETE','',header)

returns:

ans =

{
"id" : 619104742,
"instrument" : "EUR_GBP",
"units" : 1,
"side" : "buy",
"price" : 0.79643,
"time" : "2014-07-22T03:25:49.000000Z",
"type" : "BuyEntry"
}

Cheers

Francisco

Thanks for your answer.

I get this error message when I use the PATCH method here:
header = http_createHeader('Authorization','XXXXXXXX-YYYYYYYYY');
params = {'units' '2'};
uparams = http_paramsToString(params);
url ='https://api-fxpractice.oanda.com/v1/accounts/1125870/orders/619104742';
urlread2(url,'PATCH',uparams,header)
Error using urlread2 (line 157)
Java exception occurred:
java.net.ProtocolException: Invalid HTTP method: PATCH

at java.net.HttpURLConnection.setRequestMethod(Unknown Source)

at sun.net.www.protocol.https.HttpsURLConnectionImpl.setRequestMethod(Unknown Source)

What do you think is causing it?
Cheers

Francisco

Jim Hokanson

@Francisco,

Yes, both are. I've exposed the entirety of the HTTP request and response so you can do anything you want with it.

Jim

Francisco

Hello. Thanks for your help. Are the HTTP PATCH and DELETE methods supported? Thanks

Jim Hokanson

@Francisco,

My apologies for the confusing documentation. When providing headers you also need to provide the body, so that the order of the inputs is maintained. In this case an empty body is fine.

urlread2 (url,'GET','',header)

One day I'd like to rewrite this code base so that this is unnecessary and clearer ...

Best of luck and let me know if you have any other questions.

Jim

Francisco

Hello,

I'm a newbie in MATLAB. I want to use MATLAB to send HTTP GET, POST, PATCH and DELETE commands to a REST API:

https://api-fxpractice.oanda.com/v1/accounts

For identification, a header has to be sent in this format:
Parameter:Authorization
Value:Bearer XXXXXXXX-YYYYYYYYY

So I type:
header = http_createHeader('Authorization','Bearer XXXXXXXX-YYYYYYYYY')

url = 'https://api-fxpractice.oanda.com/v1/accounts'

urlread2 (url,'GET',header)

and I get this message:
Error using urlread2 (line 180)
Function input: body, should be of class char, uint8, or int8, detected: struct

Is this a problem with the header?

Thanks

Francisco

Jim Hokanson

@Dan,

Usually problems arise because the site uses JavaScript. To get around this I'll usually use a program called "Fiddler" (on Windows). I'll go to the site in my web browser and then look at Fiddler to see what requests are being made to the server. Look for a request and subsequent response from the site that contains the information you want. Then look more closely at the request to see how you would make the same request yourself.

Best of luck.

Jim

Dan

Thought that URLREAD2 would alleviate the issues I have using URLREAD to retrieve the contents of a URL page.
y = urlread( 'http://www.realtor.com/international/listing-detail/Costa-Guimar%C3%A3es_DISTRITO-DE-BRAGA_PO_666733');

I would like to extract information from the page (e.g. the price). While I can see all the info in the page source (in the browser), none of it is in the return value from URLREAD2. Is the server somehow trying to protect the info? Is there anything that could be done?

Thanks a lot.

none poncho

Perfect solution for one of MATLAB's problems, but I'm experiencing difficulties in using it. I'm trying to upload binary contents. Can you please provide an example?

Cary Belas

Perfectly solved my questions

Clemens

This function solved the long waiting time I had with the Java call findproxyforurl. Thank you!

Alex

Very helpful, thanks! This gave me exactly what I needed.

Julian

Thanks, I like it. I only wanted to see what my web server was sending in a header, and it did the trick nicely. I am more comfortable with MATLAB than web scripting, so this was the easiest solution for me. With regard to my rating, though, please bear in mind that I did not take it for a thorough test drive.

GMark

Thanks for your availability. I'll wait for your example, maybe I can figure out better what you are suggesting.

Jim Hokanson

Ah, that clarifies things a bit. That brings up a tricky issue. Your use case isn't necessarily envisioned by their server and you might be blocked temporarily. To minimize this possibility you can space out your queries in time, perhaps one every 10-20 seconds. I've run into this case with Google Scholar, where eventually they popped up a Captcha to try to prevent me from making automatic queries. It is a bit frustrating that they don't provide a code interface, which would minimize their server load while still allowing access to users.
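
The idea of spacing out queries can be sketched as below. This is a minimal Python illustration, not part of urlread2: the function name `run_spaced` and its parameters are hypothetical, and the 15-second default is simply a value inside the 10-20 second range mentioned above. The `sleep` argument is injectable so the pacing can be checked without actually waiting.

```python
import time
from typing import Callable, Iterable, List


def run_spaced(tasks: Iterable[Callable[[], object]],
               delay_seconds: float = 15.0,
               sleep: Callable[[float], None] = time.sleep) -> List[object]:
    """Run each task in order, pausing delay_seconds between consecutive tasks."""
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            sleep(delay_seconds)  # pause only between tasks, not before the first
        results.append(task())
    return results


# Dry run: record the requested pauses instead of really sleeping.
pauses = []
results = run_spaced([lambda: "first", lambda: "second"],
                     delay_seconds=15.0, sleep=pauses.append)
```

In real use each task would be a function that performs one request, and the default `time.sleep` would be left in place.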

It looks like you are making a GET request, which means that all of the parameters are attached to the URL during the request. I've described this briefly in:
http://undocumentedmatlab.com/blog/expanding-urlreads-capabilities/
Essentially, the way to do what you are asking is to make an example request in your browser where you change the date ranges, and see how those dates are placed into the request URL. Then you learn to modify the request URL appropriately. I have code which would help with this, but it is not included in this package. I can work on including it. Basically, the code first splits the query string on '&' characters to get [property]=[value] pairs. Then you split each pair on '=' to separate the property from the value. Then you decode each property and value using the function urldecode. Once this is done it should be pretty obvious where your dates are in the request. You then just modify those dates, re-encode and repackage all of the parameters, and add them onto the original URL after the '?' symbol.
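
The steps described above (split, decode, modify, repackage) can be sketched in Python as follows. This is not the MATLAB code referred to in the comment; it swaps in Python's standard `urllib.parse` module, and the function name `replace_query_params`, the example URL and the parameter names `start` and `end` are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def replace_query_params(url: str, updates: dict) -> str:
    """Return url with the given query parameters replaced or appended."""
    parts = urlsplit(url)
    # parse_qsl splits the query on '&', splits each pair on '=',
    # and percent-decodes every property name and value.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    seen = set()
    rewritten = []
    for name, value in pairs:
        if name in updates:
            value = updates[name]  # swap in the new value, keep the position
            seen.add(name)
        rewritten.append((name, value))
    # Parameters that were not already in the URL are appended at the end.
    rewritten.extend((n, v) for n, v in updates.items() if n not in seen)
    # urlencode re-encodes the pairs and joins them with '&'.
    return urlunsplit(parts._replace(query=urlencode(rewritten)))


# Hypothetical search URL with a date range.
original = "http://example.com/search?q=Pittsburgh+weather&start=2012-01-01&end=2012-01-31"
updated = replace_query_params(original, {"start": "2012-02-01", "end": "2012-02-29"})
```

Calling the function once per date range gives one request URL per interval, each of which can then be fetched in turn.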

I'll try and post an example of this with new code by the end of the week.

GMark

Sorry, I forgot to tell you the most important thing. I meant, the search on Google News. Now it should be clear.

Jim Hokanson

When you mention different time intervals I assume you mean running the function multiple times over the course of many hours/days/weeks etc. If that is the case these links might be helpful:
http://www.mathworks.com/matlabcentral/answers/30481-how-to-automatically-run-a-matlab-function-at-a-particular-time-every-day
http://www.mathworks.com/support/solutions/en/data/1-361S45/index.html?product=ML&solution=1-361S45

GMark

I used this function and I realized that the output contains what I need, that is, the number of results of a search. Is it possible to collect it automatically for different time intervals?

Jim Hokanson

Doubtful. That error occurs because some router or Google server is blocking your IP address (for example, if you are using a computer in China). You would need to use a proxy to get around that. The fix I just uploaded may provide a more explicit error message than the one you were getting, but the result should be the same as if you entered that URL in a browser. Note: with the fix, the error message is returned as the output (instead of an error being thrown), and the extras structure has a status indicating that an error occurred and that the output is an error message.

Shane Lin

Do you think it will work if I add "urlConnection.setFollowRedirects(1);" to urlread? Thanks

Jim Hokanson

Well that stinks. It works for me. There are two issues going on here.
1) The example should work, I'll find a better one!
2) I made some last minute changes to try and directly handle errors instead of letting them cause code errors further down in the code which were harder to debug. What I didn't realize was that the handler recognizes HTTP status codes and processes them as errors as well. In other words, I thought the 403 error code you are seeing would go through just fine and that you would need to check the status and the output to see that something was wrong.
I'll address these changes and upload a new version ASAP.

Thanks!
Jim

Shane Lin

Just tested the GET example:

??? Error using ==> urlread2 at 203
Java exception occurred:
java.io.IOException: Server returned HTTP response code: 403 for URL:
http://www.google.com/search?hl=en&query=Pittsburgh+weather

Updates

1.2

Changed description

1.1

Fixed GET example and code now returns HTTP status codes that are errors along with the error text instead of just throwing an error.

MATLAB Release
MATLAB 7.13 (R2011b)
