Download a file from a website?
Accepted Answer
Walter Roberson
on 6 Aug 2022
Edited: Walter Roberson
on 6 Aug 2022
22 Comments
Ara
on 7 Aug 2022
Hi Walter, Thank you. Here is the code, though I cannot provide my user name and password. Do you know where the problem is?
-----------------------------
clear all; clc;
url = 'http://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=scnLv1';
buffer = urlread (url);
% pattern = '
%(2)
% - Add the code for reading data, update the function so it outputs them
username = 'xxx' ;
password = 'xxx' ;
% - Define path to wget.
% wgetExec = '"C:\Program Files\GnuWin32\bin\wget"' ;
wgetExec = '"C:\Program Files (x86)\GnuWin32\bin\wget"' ;
% % - Define base path for accessing data (may have to update it).
dataBase = 'http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.';
% - Disable warning when overwriting folders.
warning( 'off', 'MATLAB:MKDIR:DirectoryExists' ) ;
for dayId = 01:227
% - Create directory and cd.
dirName = sprintf( '2013_0%02d', dayId ) ;
mkdir( dirName ) ;
cd( dirName ) ;
% - Build wget call string and make the call.
command = sprintf( '%s -nd -np -l 1 -r -w 2 --http-user=%s --http-passwd=%s %s%03d', wgetExec, username, password, dataBase, dayId ) ;
status = system( command, '-echo' ) ; % You may want to remove the echo when it works.
if status
warning( 'There was an issue with WGET on day %d.', dayId ) ;
end
% - Return to top directory.
cd( '..' ) ;
end
warning( 'on', 'MATLAB:MKDIR:DirectoryExists' ) ;
Walter Roberson
on 7 Aug 2022
Is there a reason you are calling an external wget instead of using webread()?
Walter Roberson
on 8 Aug 2022
You would use weboptions to supply the username and password. You would construct a URL from your dataBase and dayId. You would set the options to request binary (raw) data.
The output you get back would be a column vector of uint8. You would fopen() a file ending with .tar in the name, and fwrite() the data to the file. You would then use untar to extract the files.
You should probably check to be sure the stream of bytes was not empty... I suspect that you are either getting an error message about access or else there just isn't any data there.
Ara
on 9 Aug 2022
Edited: Ara
on 9 Aug 2022
Thank you, Walter.
I do not know how to remove wget and how to write the program. Would it be possible for you to tell me how to modify the code?
I used
options = weboptions('Username','xxx','Password','xxx');
Walter Roberson
on 9 Aug 2022
Edited: Walter Roberson
on 9 Aug 2022
url = 'http://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=scnLv1';
username = 'xxx' ;
password = 'xxx' ;
options = weboptions('Username', username, 'Password', password, ...
'type', 'raw');
% % - Define base path for accessing data (may have to update it).
dataBase = 'http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.';
origdir = pwd();
for dayId = 1:227
% - Create directory and cd.
dirName = sprintf( '2013_%03d', dayId ) ;
cd( origdir );
mkdir( dirName );
cd( dirName );
rawdata = webread( [dataBase dirName], options );
if isempty(rawdata)
fprintf('no data reading %s\n', dirName);
else
try
fid = fopen('rawdata.tar', 'w');
fwrite(fid, rawdata, 'uint8');
fclose(fid)
untar('rawdata.tar');
fprintf('success %s\n', dirName);
catch ME
fprintf('some failure on %s\n', dirName);
disp(ME.message)
end
end
cd( origdir );
end
Ara
on 9 Aug 2022
Thank you very much.
I got this error:
Error using weboptions
'type' is not a recognized parameter. For a list of valid name-value pair arguments,
see the documentation for weboptions.
Error in weboptions>parseInputs (line 638)
p.parse(args{:});
Error in weboptions (line 375)
inputs = parseInputs(options, varargin);
Error in main_dowloadFromWebsite (line 4)
options = weboptions('Username', username, 'Password', password, ...
Walter Roberson
on 9 Aug 2022
url = 'http://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=scnLv1';
username = 'xxx' ;
password = 'xxx' ;
options = weboptions('Username', username, 'Password', password, ...
'ContentType', 'raw');
% % - Define base path for accessing data (may have to update it).
dataBase = 'http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.';
origdir = pwd();
for dayId = 1:227
% - Create directory and cd.
dirName = sprintf( '2013_%03d', dayId ) ;
cd( origdir );
mkdir( dirName );
cd( dirName );
rawdata = webread( [dataBase dirName], options );
if isempty(rawdata)
fprintf('no data reading %s\n', dirName);
else
try
fid = fopen('rawdata.tar', 'w');
fwrite(fid, rawdata, 'uint8');
fclose(fid)
untar('rawdata.tar');
fprintf('success %s\n', dirName);
catch ME
fprintf('some failure on %s\n', dirName);
disp(ME.message)
end
end
cd( origdir );
end
Ara
on 9 Aug 2022
Thank you, Walter. I got an error. Please see below:
Error using matlab.internal.webservices.HTTPConnector/copyContentToByteArray (line
396)
The server returned the status 301 with message "Moved Permanently" in response to
the request to URL
http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.2013_001.
Error in readContentFromWebService (line 46)
byteArray = copyContentToByteArray(connection);
Error in webread (line 125)
[varargout{1:nargout}] = readContentFromWebService(connection, options);
Error in main_dowloadFromWebsite (line 16)
rawdata = webread( [dataBase dirName], options );
Walter Roberson
on 10 Aug 2022
You would have to log the HTTP session; the details would show the new URL to use.
There are also ways to do it using telnet, but it is a nuisance to get right, especially with the authentication step.
I do not have an account with them so I cannot trace it myself.
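One way to see where the server is redirecting, without an external logger, is to send a single request with redirects turned off and read the Location header of the response. The following is a minimal sketch using the matlab.net.http interface rather than webread. It assumes the redirect is issued before authentication, so no credentials are supplied, and the URL is simply the one from the error message above.

```matlab
% Sketch: request one URL without following redirects, then print the
% address the server asks the client to use instead.
import matlab.net.http.RequestMessage
import matlab.net.http.HTTPOptions

% URL copied from the 301 error message (assumed to redirect the same way
% for every day of the year).
oldurl = 'http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.2013_001';

% MaxRedirects = 0 makes send() return the 3xx response itself instead of
% following it.
opts = HTTPOptions('MaxRedirects', 0);
response = send(RequestMessage('GET'), oldurl, opts);

% Show the status code reported by the server.
disp(response.StatusCode)

% The Location header, if present, holds the new URL.
location = response.getFields('Location');
if ~isempty(location)
    fprintf('redirected to: %s\n', location(1).Value);
else
    fprintf('no Location header in the response\n');
end
```

If the printed address differs from the original only in the scheme (http versus https), changing the prefix of dataBase would probably be enough.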
Ara
on 11 Aug 2022
Edited: Walter Roberson
on 11 Aug 2022
Dear Walter,
Here is the website that appears after entering the user name and password. When I run the code using this URL (https://cdaac-www.cosmic.ucar.edu/cdaac/tar/rest.html), it shows an error. Please see below.
I can share my username and password only with you (I can send them using the contact provided in your profile), but please keep them to yourself. Please let me know and I will send them to you.
Error using mkdir
Access is denied.
Error in main_dowloadFromWebsite (line 15)
mkdir( dirName );
Walter Roberson
on 11 Aug 2022
The following puts in error checks. It does not try to write into your current directory, as your earlier error messages show that you do not have write access to your current directory. Instead, it writes into a temporary directory that you should have write access to unless your system is misconfigured.
If you get "error reading from url" then the URL in dataBase is wrong, or you have an authentication error.
If you get "no error but no data available from" then reading from the URL did not error but no data was delivered.
If you get "got data but could not untar" then you received some data but it was not a valid tar file. Either the site sent an in-line message or else we did not properly figure out how to tell it how to download a file.
Any other warning message represents a problem on your side.
%url = 'http://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=scnLv1';
username = 'xxx' ;
password = 'xxx' ;
options = weboptions('Username', username, 'Password', password, ...
'ContentType', 'raw');
% % - Define base path for accessing data (may have to update it).
dataBase = 'http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.';
origdir = pwd();
td = tempdir;
outputdir = fullfile(td, 'cosmic2013');
if ~isdir(outputdir)
try
mkdir(outputdir)
catch ME
error('could not create output directory "%s"', outputdir);
end
end
fprintf('Extracting data into directory "%s"\n', outputdir);
for dayId = 1:227
% - Create directory and cd.
dirName = fullfile(outputdir, sprintf( '2013_%03d', dayId ) );
if ~isdir(dirName)
try
mkdir( dirName );
catch ME
warning('skipping day %d, could not create directory at "%s"', dayId, dirName);
continue;
end
end
cd( dirName );
thisurl = [dataBase dirName];
try
rawdata = webread( thisurl, options );
catch ME
warning('error reading from url "%s"', thisurl);
continue;
end
if isempty(rawdata)
warning('no error but no data available from "%s"', thisurl);
continue
else
tarname = fullfile(dirName, 'rawdata.tar');
fid = fopen(tarname, 'w');
if fid < 0
warning('skipping day %d, got data but could not create tar at "%s"', dayId, tarname);
continue;
end
fwrite(fid, rawdata, 'uint8');
fclose(fid)
try
untar(tarname);
catch ME
warning('got data but could not untar, examine "%s"', tarname);
continue
end
fprintf('success for day %d\n', dayId);
end
end
cd(origdir)
Ara
on 11 Aug 2022
Dear Walter,
Thank you very much. It runs, but the folder is empty. There is an error in the URL.
The URL is "https://cdaac-www.cosmic.ucar.edu/cdaac/tar/rest.html". After entering the username and password, you select cosmic2013 and then select a file from the calendar, and the resulting folder is full of NetCDF files from which to extract S4, time, etc. I would appreciate it if you could resolve the error. Which URL do I need to use?
Ara
on 11 Aug 2022
Here is the warning. It does not stop MATLAB.
Warning: error reading from url
"http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.C:\Users\Aramesh\AppData\Local\Temp\cosmic2013\2013_006"
Walter Roberson
on 11 Aug 2022
%url = 'http://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=scnLv1';
username = 'xxx' ;
password = 'xxx' ;
options = weboptions('Username', username, 'Password', password, ...
'ContentType', 'raw');
% % - Define base path for accessing data (may have to update it).
dataBase = 'http://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.';
origdir = pwd();
td = tempdir;
outputdir = fullfile(td, 'cosmic2013');
if ~isdir(outputdir)
try
mkdir(outputdir)
catch ME
error('could not create output directory "%s"', outputdir);
end
end
fprintf('Extracting data into directory "%s"\n', outputdir);
for dayId = 1:227
% - Create directory and cd.
dirName = sprintf('2013_%03d', dayId);
dayoutputdir = fullfile(outputdir, dirName);
if ~isdir(dayoutputdir)
try
mkdir( dayoutputdir );
catch ME
warning('skipping day %d, could not create directory at "%s"', dayId, dayoutputdir);
continue;
end
end
cd( dayoutputdir );
thisurl = [dataBase dirName];
try
rawdata = webread( thisurl, options );
catch ME
warning('error reading from url "%s"', thisurl);
continue;
end
if isempty(rawdata)
warning('no error but no data available from "%s"', thisurl);
continue
else
tarname = fullfile(dayoutputdir, 'rawdata.tar');
fid = fopen(tarname, 'w');
if fid < 0
warning('skipping day %d, got data but could not create tar at "%s"', dayId, tarname);
continue;
end
fwrite(fid, rawdata, 'uint8');
fclose(fid)
try
untar(tarname);
catch ME
warning('got data but could not untar, examine "%s"', tarname);
continue
end
fprintf('success for day %d\n', dayId);
end
end
cd(origdir)
Walter Roberson
on 11 Aug 2022
The redirect error turned out to be because you were making an http request but the site wants https requests.
The below code is tested.
Note: the below code leaves the .tar file in place; you might want to delete those.
Note: if you want to interrupt, control C repeatedly. I built in a test that it gives up after 5 errors, and that turns out to include errors caused by interrupting the download.
%url = 'http://cdaac-www.cosmic.ucar.edu/cdaac/cgi_bin/fileFormats.cgi?type=scnLv1';
username = 'xxx' ;
password = 'xxx' ;
options = weboptions('Username', username, 'Password', password, ...
    'ContentType', 'raw');
% - Define base path for accessing data (may have to update it).
dataBase = 'https://cdaac-www.cosmic.ucar.edu/cdaac/rest/tarservice/data/cosmic2013/scnLv1/2013.';
origdir = pwd();
td = tempdir;
outputdir = fullfile(td, 'cosmic2013');
if ~isfolder(outputdir)
    try
        mkdir(outputdir)
    catch ME
        error('could not create output directory "%s"', outputdir);
    end
end
fprintf('Extracting data into directory "%s"\n', outputdir);
errorcount = 0;
maxerror = 5;
for dayId = 1:227
    % - Create the per-day output directory.
    dirName = sprintf('%03d', dayId);
    dayoutputdir = fullfile(outputdir, dirName);
    if ~isfolder(dayoutputdir)
        try
            mkdir( dayoutputdir );
        catch ME
            warning('skipping day %d, could not create directory at "%s"', dayId, dayoutputdir);
            errorcount = errorcount + 1; if errorcount >= maxerror; error('too many errors, giving up'); end
            continue;
        end
    end
    % - Download the tar archive for this day as raw bytes.
    thisurl = [dataBase dirName];
    try
        rawdata = webread( thisurl, options );
    catch ME
        warning('error reading from url "%s"', thisurl);
        errorcount = errorcount + 1; if errorcount >= maxerror; error('too many errors, giving up'); end
        continue;
    end
    if isempty(rawdata)
        warning('no error but no data available from "%s"', thisurl);
        errorcount = errorcount + 1; if errorcount >= maxerror; error('too many errors, giving up'); end
        continue
    else
        % - Write the bytes to a .tar file, then extract it.
        tarname = fullfile(dayoutputdir, 'rawdata.tar');
        fid = fopen(tarname, 'w');
        if fid < 0
            warning('skipping day %d, got data but could not create tar at "%s"', dayId, tarname);
            errorcount = errorcount + 1; if errorcount >= maxerror; error('too many errors, giving up'); end
            continue;
        end
        fwrite(fid, rawdata, 'uint8');
        fclose(fid);
        try
            cd( dayoutputdir );
            untar(tarname);
            cd( origdir );
        catch ME
            cd( origdir );
            warning('got data but could not untar, examine "%s"', tarname);
            errorcount = errorcount + 1; if errorcount >= maxerror; error('too many errors, giving up'); end
            continue
        end
        fprintf('success for day %d\n', dayId);
    end
end
cd(origdir)
fprintf('Files extracted to "%s"\n', outputdir);
Example output file:
/private/var/folders/jq/wx1hzy713dj_408tpm5fck040000gn/T/cosmic2013/001/cosmic2013/scnLv1/2013.001/scnLv1_C001.2013.001.11.58.0005.G04.03_2013.3520_nc
This code creates the "cosmic2013/001" level, and untar creates the cosmic2013/scnLv1/2013.001 level under that. The /private/var/folders/jq/wx1hzy713dj_408tpm5fck040000gn/T here is the temporary directory returned by tempdir().
I download into a directory relative to tempdir() because you do not seem to have write access to your current directory.
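If you do want to remove each .tar file once it has been unpacked, one option is to delete it only after untar returns without error. The sketch below is self-contained and does not touch the network: it builds a small stand-in archive with tar() in a scratch folder (the names sample.txt and extracted are made up for the demonstration), extracts it, and then deletes the archive. It also passes the output folder to untar directly, which would avoid the cd() calls.

```matlab
% Sketch: extract an archive, then remove it only if extraction succeeded.

% Scratch folder and a placeholder file standing in for downloaded data.
workdir = tempname;
mkdir(workdir);
fid = fopen(fullfile(workdir, 'sample.txt'), 'w');
fprintf(fid, 'placeholder\n');
fclose(fid);

% Stand-in for the rawdata.tar written by the download loop.
tarname = fullfile(workdir, 'rawdata.tar');
tar(tarname, 'sample.txt', workdir);

% Extract into a named folder, then delete the archive on success.
extractdir = fullfile(workdir, 'extracted');
try
    untar(tarname, extractdir);
    delete(tarname);
catch ME
    % Keep the archive so that it can be examined.
    warning('kept "%s" for inspection: %s', tarname, ME.message);
end
```

In the download loop, the equivalent change would be a delete(tarname) placed right after the untar call inside the try block.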
Walter Roberson
on 11 Aug 2022
I tested the code on my system (Mac). For example,
>> ls /private/var/folders/jq/wx1hzy713dj_408tpm5fck040000gn/T/cosmic2013/001/cosmic2013/scnLv1/2013.001/
scnLv1_C001.2013.001.00.00.0004.G08.03_2013.3520_nc scnLv1_C002.2013.001.02.02.0033.G14.01_2013.3520_nc scnLv1_C002.2013.001.18.53.0027.G16.01_2013.3520_nc scnLv1_C005.2013.001.08.12.0008.G10.01_2013.3520_nc
scnLv1_C001.2013.001.00.01.0001.G22.02_2013.3520_nc scnLv1_C002.2013.001.02.03.0001.G20.02_2013.3520_nc scnLv1_C002.2013.001.18.54.0001.G05.02_2013.3520_nc scnLv1_C005.2013.001.08.18.0001.G04.02_2013.3520_nc
scnLv1_C001.2013.001.00.02.0001.G31.02_2013.3520_nc scnLv1_C002.2013.001.02.04.0019.G02.01_2013.3520_nc scnLv1_C002.2013.001.18.54.0023.G19.01_2013.3520_nc scnLv1_C005.2013.001.08.18.0001.G28.03_2013.3520_nc
(and more)
I did not test it on Windows (I am not sure I have a functioning Windows MATLAB installed at the moment.)
More Answers (1)
Ara
on 11 Aug 2022
Dear Walter,
It works. For folder 1, all the data downloaded completely. Thank you very much. The only problem is that it is very slow: MATLAB stays busy for about 25 minutes for one file. Is there any way to improve it?
3 Comments
Ara
on 11 Aug 2022
Dear Walter,
Now, I have to read the NetCDF files to extract S4, time, longitude, and latitude. Do you know how I can do this? Do I need to open another question?
You did great work for me and I would greatly appreciate your help.
Walter Roberson
on 12 Aug 2022
You are mostly being limited by the speed of your internet connection.
If you change the assignment
td = tempdir();
you could change the download directory to an SSD if you have one. That could potentially make the untar step faster.
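As a rough sketch of that change, the assignment could point at a folder on another drive and fall back to tempdir() if that folder cannot be created. The path D:\cosmic_downloads is only a hypothetical example; substitute a folder on the drive you want to use.

```matlab
% Hypothetical target folder on another drive; change this to suit.
preferred = 'D:\cosmic_downloads';

% mkdir with outputs reports failure instead of throwing an error, and it
% also reports success when the folder already exists.
[ok, msg] = mkdir(preferred);
if ok
    td = preferred;
else
    warning('cannot use "%s" (%s), falling back to tempdir', preferred, msg);
    td = tempdir();
end

% The rest of the script builds on td exactly as before.
outputdir = fullfile(td, 'cosmic2013');
fprintf('Downloads will go under "%s"\n', outputdir);
```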
Ara
on 12 Aug 2022
Edited: Ara
on 12 Aug 2022
Oh, I see! Yes the internet connection is not very good.
Does SSD mean external storage? Would it be possible to download all the files to it instead of to C:? I do not know how to change it. Would you please let me know how to change the path to the current folder, or specifically to the external drive?