Code covered by the BSD License  

5.0 | 4 ratings | 125 Downloads (last 30 days) | File Size: 9.5 KB | File ID: #35693

urlread2

by Jim Hokanson

17 Mar 2012 (Updated 26 Mar 2012)

Generalizes HTTP requests, providing more control and access to input and output


File Information
Description

Version 1.1

Replacement for urlread. The example functions in the urlread2 file show how to make calls equivalent to those made with urlread. Helper functions (some are provided) can add functionality without requiring changes to the urlread2 code itself.

Specific improvements:
- improved Unicode support
- improved binary retrieval support
- request and response header access
- response status access

A technical description of the implementation can be found at:
http://undocumentedmatlab.com/blog/expanding-urlreads-capabilities/

Acknowledgements

Rewrites Of Urlread And Urlwrite and New Useful Urlread Urlreadv inspired this file.

Required Products: MATLAB
MATLAB release: MATLAB 7.13 (R2011b)
Comments and Ratings (14)
25 Jun 2014 none poncho

Perfect solution for one of MATLAB's problems, but I'm experiencing difficulties in using it. I'm trying to upload binary contents. Can you please provide an example?

05 Mar 2014 Cary Belas

Perfectly solved my questions.

28 Nov 2013 Clemens

This function solved my long waiting time of the java call findproxyforurl. Thank you!

21 May 2013 Alex

Very helpful, thanks! This gave me exactly what I needed.

23 Aug 2012 Julian

Thanks, I like it. I only wanted to see what my web server was sending in a header, and it did the trick nicely. I am more comfortable with MATLAB than web scripting, so this was the easiest solution for me. With regard to my rating though, please bear in mind that I did not take it for a thorough test drive.

04 Apr 2012 Gerardo Manzo

Thanks for your availability. I'll wait for your example, maybe I can figure out better what you are suggesting.

03 Apr 2012 Jim Hokanson

Ah, that clarifies things a bit. That brings up a tricky issue. Your use case isn't necessarily envisioned by their server and you might be blocked temporarily. To minimize this possibility you can space out your queries in time, perhaps one every 10 - 20 seconds. I've run into this case with Google Scholar, where eventually they popped up a Captcha to try and prevent me from making automatic queries. It is a bit frustrating that they don't provide a code interface, which would minimize their server load while still allowing access to users.
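The spacing idea can be sketched as follows. This is an illustration in Python rather than MATLAB, and it is not part of urlread2; the function name `run_spaced` and its parameters are hypothetical. It waits a random 10 to 20 seconds between consecutive tasks, where each task would be one query.

```python
import random
import time


def run_spaced(tasks, min_delay=10.0, max_delay=20.0, sleep=time.sleep):
    """Run each zero-argument callable in `tasks`, pausing between calls.

    `run_spaced` is a hypothetical helper name. The pause is drawn uniformly
    from [min_delay, max_delay] seconds; no pause is made before the first task.
    """
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            # Randomized spacing so the requests do not arrive on a fixed beat.
            sleep(random.uniform(min_delay, max_delay))
        results.append(task())
    return results
```

The `sleep` argument exists only so the delay can be replaced during testing; in normal use the default `time.sleep` is what actually spaces the queries.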

It looks like you are making a GET request, which means that all of the parameters are attached to the URL during the request. I've described this briefly in:
http://undocumentedmatlab.com/blog/expanding-urlreads-capabilities/
Essentially, the way to do what you are asking is to make an example request in your browser where you change the date ranges, and see how those dates are placed into the request URL. Then you learn to modify the request URL appropriately. I have code which would help with this but it is not included in this package. I can work on including it. Basically, the code first splits the part of the URL after the '?' on '&' characters to get [property]=[value] pairs. Then you split each pair on '=' to separate the property from the value. Then you take each property and value and decode them using the function urldecode. Once this is done it should be pretty obvious where your dates are in the request. You then just modify those dates, re-encode and repackage all of the parameters, and add them onto the original URL after the '?' symbol.
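A minimal sketch of those steps, written in Python for illustration rather than MATLAB (it is not the code mentioned above and not part of urlread2). The function name `replace_query_params` and the parameter names `start` and `end` are hypothetical; a real search URL will use whatever names show up in the browser request. The sketch assumes the URL has no '#' fragment.

```python
from urllib.parse import unquote_plus, urlencode


def replace_query_params(url, updates):
    """Return `url` with selected query parameters replaced.

    `updates` maps decoded parameter names to new values. Parameters not
    named in `updates` keep their value and their original order.
    """
    # Everything after the first '?' is the query string.
    base, _, query = url.partition("?")

    params = []
    for pair in query.split("&"):       # split into property=value pairs
        if not pair:
            continue
        name, _, value = pair.partition("=")  # separate property from value
        params.append((unquote_plus(name), unquote_plus(value)))  # decode both

    # Swap in the new values, then re-encode and reattach after the '?'.
    params = [(name, updates.get(name, value)) for name, value in params]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    original = "http://example.com/search?q=Pittsburgh+weather&start=2012-01-01&end=2012-01-31"
    print(replace_query_params(original, {"start": "2012-02-01", "end": "2012-02-29"}))
```

Decoding before editing matters because the dates may appear percent-encoded in the raw URL; comparing against the decoded names and values avoids missing them.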

I'll try and post an example of this with new code by the end of the week.

02 Apr 2012 Gerardo Manzo

Sorry, I forgot to tell you the most important thing. I meant, the search on Google News. Now it should be clear.

02 Apr 2012 Jim Hokanson

When you mention different time intervals I assume you mean running the function multiple times over the course of many hours/days/weeks etc. If that is the case these links might be helpful:
http://www.mathworks.com/matlabcentral/answers/30481-how-to-automatically-run-a-matlab-function-at-a-particular-time-every-day
http://www.mathworks.com/support/solutions/en/data/1-361S45/index.html?product=ML&solution=1-361S45

01 Apr 2012 Gerardo Manzo

I used this function and I realized that the output contains what I need, that is, the number of results of a search. Is it possible to collect it automatically for different time intervals?

26 Mar 2012 Jim Hokanson

Doubtful. That error occurs because some router or Google server is blocking your IP address (for example, if you are using a computer in China). You would need to use a proxy to get around that. The fix I just uploaded may provide a more explicit error message than what it was previously giving you, but the result should be just the same as if you entered that URL in a browser. Note: with the fix, the error message will be the output (instead of an error being thrown), and the extras structure will have a status indicating that the error occurred and that the output is an error message.
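The general pattern described in that note (hand back the error body together with a status, rather than throwing) can be sketched like this. The sketch is in Python for illustration only and does not reproduce urlread2's code; the function name `read_with_status` and the field names `status` and `is_error` are hypothetical stand-ins, not the actual fields of the extras structure.

```python
import urllib.error
import urllib.request


def read_with_status(url, opener=urllib.request.urlopen):
    """Fetch `url` and return (body_bytes, extras) without raising on HTTP errors.

    `extras` is a plain dict with hypothetical field names: `status` holds the
    HTTP status code and `is_error` says whether the body is an error message.
    """
    try:
        with opener(url) as response:
            return response.read(), {"status": response.status, "is_error": False}
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        # Return whatever body it sent instead of propagating the exception.
        return err.read(), {"status": err.code, "is_error": True}
```

With this shape the caller checks `extras["is_error"]` (or the status code) before using the body. Failures where no HTTP response exists at all, such as a DNS error, still raise.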

26 Mar 2012 Shane Lin

Do you think it will work if I add "urlConnection.setFollowRedirects(1);" to urlread? Thanks

25 Mar 2012 Jim Hokanson

Well that stinks. It works for me. There are two issues going on here.
1) The example should work, I'll find a better one!
2) I made some last minute changes to try and directly handle errors instead of letting them cause code errors further down in the code which were harder to debug. What I didn't realize was that the handler recognizes HTTP status codes and processes them as errors as well. In other words, I thought the 403 error code you are seeing would go through just fine and that you would need to check the status and the output to see that something was wrong.
I'll address these changes and upload a new version ASAP.

Thanks!
Jim

23 Mar 2012 Shane Lin

Just tested the get example:

??? Error using ==> urlread2 at 203
Java exception occurred:
java.io.IOException: Server returned HTTP response code: 403 for URL:
http://www.google.com/search?hl=en&query=Pittsburgh+weather

Updates
25 Mar 2012

Fixed GET example and code now returns HTTP status codes that are errors along with the error text instead of just throwing an error.

26 Mar 2012

Changed description
