How to websave correctly?

Haron Shaker on 18 Mar 2021
Edited: Haron Shaker on 23 Mar 2021
Dear all,
I want to download some .edf files with websave, using the function below. My problem is that I only get .html files instead of the actual .edf files. This does not happen with .txt files: the same function works for .txt but not for .edf.
I have attached a list of .edf URLs ('apri_edf.xls') and a list of .txt URLs ('founded_medicine_folder_ohne_.xls'), so you can try it yourself.
cellArrayData=readcell('apri_edf.xls')
folderName='apri_edfs'
downloadFolderContentToHardDrive(cellArrayData, folderName)
% downloads files to the hard drive from a list of URLs in cellArrayData
% cellArrayData CellArray with expected data
% 1) file/folder
% 2) last modified time/date
% 3) file type
% folderName Name of the folder in which the data will be saved
% hint: run listAllDirectories first if excel file is not yet available
% recursion
function downloadFolderContentToHardDrive(cellArrayData, folderName)
import matlab.net.http.*
%check if folder already exists, otherwise create one
folderName = string(folderName);
if ~exist(folderName, 'dir')
mkdir(folderName);
end
for i = 1:size(cellArrayData,1)
%first check whether the entry represents a folder
if strcmp(cellArrayData{i,3}, 'folder')
dirName = getFileNameFromURLstring(cellArrayData{i,1});
tempFolderName = folderName;
%ensure that all subfolders are built correctly
if endsWith(dirName, '/')
folderName = strcat(folderName, dirName);
else
folderName = strcat(folderName, dirName, '/');
end
%get all files in the folder
len = length(cellArrayData{i,1}); %length of string in that cell array entry
subArrayData = cellArrayData(strncmp(cellArrayData{i,1}, cellArrayData(:,1), len), :);
%subtract from original data array to not download data twice
len = size(subArrayData, 1) - 1;
cellArrayData(i+1 : i+len, :) = [];
%delete first row, otherwise will have too many folders around
subArrayData(1,:) = [];
%recursion for the win
downloadFolderContentToHardDrive(subArrayData, folderName);
%reset folderName
folderName = tempFolderName;
%check if I'm done with the cell array;
%otherwise I get an out-of-bounds error, since I'm deleting
%entries for the recursion
if i == size(cellArrayData, 1)
break;
else
continue
end
%skip loop iteration when the entry is empty, missing, or the first (header) line
elseif isempty(cellArrayData{i,1})
continue
elseif ismissing(cellArrayData{i,1})
continue
elseif strcmp(cellArrayData{i,1}, 'file type')
continue
end
%get filename
if endsWith(folderName, '/')
filename = strcat(folderName, getFileNameFromURLstring(cellArrayData{i,1}));
else
filename = strcat(folderName, '/', getFileNameFromURLstring(cellArrayData{i,1}));
end
%finally save the data
options = weboptions('Username', '...', 'Password', '...', 'Timeout', 60);
url = cellArrayData{i,1};
try
websave(filename, url, options);
catch
%first attempt failed: wait, then retry once
pause(10);
try
websave(filename, url, options);
catch
%report the failure only if the retry fails as well
disp("saving failed: " + filename)
end
end
end
end
%returns the file name from a URL string
function fileName = getFileNameFromURLstring(url)
x = strsplit(url, '/');
if strcmp(x(end), "")
%folder
fileName = x(end -1);
else
%not a folder
fileName = x(end);
end
end
  6 Comments
Rik on 20 Mar 2021
If you simply rename them to .edf instead of .edf.html, do you end up with the correct files?
If not, you could try setting the ContentType in weboptions to 'raw'.
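A minimal sketch of that second suggestion, for a single file. The URL and output file name are placeholders rather than entries from the attached lists, and whether 'raw' changes what the server returns depends on the server and possibly on the MATLAB release, so treat this as something to try rather than a confirmed fix:

```matlab
% Hypothetical URL and file name, for illustration only
url = 'https://example.com/data/recording.edf';
filename = 'recording.edf';

% Same options as in the question, plus ContentType set to 'raw'
% (credentials elided as in the original post)
options = weboptions('Username', '...', 'Password', '...', ...
    'Timeout', 60, 'ContentType', 'raw');

% websave returns the full path of the file it wrote
outfile = websave(filename, url, options);

% Read the first bytes to see whether the server sent an HTML page
fid = fopen(outfile, 'r');
head = lower(fread(fid, 64, '*char')');
fclose(fid);

if contains(head, '<html') || contains(head, '<!doctype')
    warning('%s looks like an HTML page, not an EDF file.', outfile);
end
```

If the warning appears, the saved content is probably a login or index page, which would suggest the problem lies with the URL or the authentication rather than with the file extension.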
Haron Shaker on 20 Mar 2021
Edited: Haron Shaker on 23 Mar 2021
Thank you very much, Rik!
