Loop over subfolders in a directory instead of using 'cd', perform calculations, and give output for every subfolder

Hello all,
Thank you in advance for your help. I have tried several approaches but could not get this to work.
I have a main folder that contains three subfolders. Each subfolder contains Excel files (read as *.csv in the code below), and I run the same operations on all the files in a folder: merge them, keep only the relevant columns, and build a matrix at the end.
For example:
code 1:
cd 'F:\ '
files = dir('*.csv');
output = readmatrix(files(1).name);
for x = 2:numel(files)
    new = readmatrix(files(x).name);
    output = vertcat(output, new);
end
output = rmmissing(output);
t = [output(:,1) output(:,3)];
t(:,3) = sqrt(output(:,4).^2 + output(:,5).^2 + output(:,6).^2);
code 2:
cd 'F:\'
files = dir('*.csv');
output = readmatrix(files(1).name);
for x = 2:numel(files)
    new = readmatrix(files(x).name);
    output = vertcat(output, new);
end
output = rmmissing(output);
t = [output(:,1) output(:,3)];
The code should work like this.
I give the path to the main folder in the first line of the code, and the code should do everything else. It should select only the subfolders, not the files present in the main folder.
It should check how many subfolders the main folder has. It should then select the first subfolder, run code 1 or code 2 depending on the subfolder's name, and give the output t1. After that it should go to the second subfolder and do the same, and then to the final subfolder.
At the moment I have to run this operation on every folder by hand, and it takes a lot of time when I use 'cd', because it changes the directory every time.
Example: loop over every subfolder and, depending on the folder name, use code 1 if the subfolder is pump1 or pump2, and code 2 if it is filter1 or filter2. The subfolders should produce t1, t2, t3 as their respective outputs, saved to the workspace.
I tried using dir, fullfile and other approaches, but it is pretty confusing. It would be really helpful if someone could give me an idea of how to do it.

Answers (2)

Hello,
This is my suggestion; you may have to adapt it to your specific needs.
clc
clearvars
%% define path
yourpath = pwd; % or your specific path
list=dir(yourpath); %get info of files/folders in current directory
isfile=~[list.isdir]; %determine index of files vs folders
dirnames={list([list.isdir]).name}; % directories names (including . and ..)
dirnames=dirnames(~(strcmp('.',dirnames)|strcmp('..',dirnames))); % remove . and .. directories names from list
%% demo for excel files
sheet = 1; % specify which sheet to be processed (my demo) - if needed
%% Loop on each folder
for ci = 1:length(dirnames)
    fileDir = char(dirnames(ci)); % current directory name
    % Build the pattern from yourpath so the listing does not depend on the current folder
    S = dir(fullfile(yourpath, fileDir, 'Sheeta*.xlsx')); % get list of data files in directory according to name structure 'Sheeta*.xlsx'
    S = natsortfiles(S); % sort file names into natural order (which dir does not do), see FEX:
    %(https://fr.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort)
    %% Loop inside folder
    for k = 1:length(S) % read data in specified sheet
        data = xlsread(fullfile(S(k).folder, S(k).name), sheet); % or use a structure (S(k).data) to store the full data structure
        % your own code here for data processing. this is just for my demo
        % for now:
        title_str = [fileDir ' / ' S(k).name ' / sheet : ' num2str(sheet)];
        figure, plot(data), title(title_str);
    end
end
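Applied to the CSV workflow from the question, the same folder loop could look roughly like the sketch below. It is only an illustration under stated assumptions: the demo folder tree, the names mainFolder and results, and the rule "folder name starts with pump means code 1" are made up here to mirror the question, not taken from the asker's real data.

```matlab
% Hypothetical demo tree so the sketch runs anywhere; point mainFolder
% at your own data instead and delete this block.
mainFolder = tempname;
subNames = {'pump1', 'filter1'};
for n = 1:numel(subNames)
    mkdir(fullfile(mainFolder, subNames{n}));
    for f = 1:2
        writematrix(rand(4, 6), fullfile(mainFolder, subNames{n}, sprintf('part%d.csv', f)));
    end
end

% List only the subfolders of mainFolder (skip files, '.' and '..').
list = dir(mainFolder);
list = list([list.isdir] & ~ismember({list.name}, {'.', '..'}));

results = struct();   % one field per subfolder, named after the subfolder
for ci = 1:numel(list)
    subFolder = fullfile(list(ci).folder, list(ci).name);
    files = dir(fullfile(subFolder, '*.csv'));
    if isempty(files)
        continue;     % nothing to merge in this subfolder
    end

    % Merge all CSV files of this subfolder into one matrix, without cd.
    parts = cell(numel(files), 1);
    for k = 1:numel(files)
        parts{k} = readmatrix(fullfile(files(k).folder, files(k).name));
    end
    output = rmmissing(vertcat(parts{:}));

    % Columns 1 and 3 are kept in both variants.
    t = [output(:, 1) output(:, 3)];
    if startsWith(list(ci).name, 'pump')
        % "code 1": append the magnitude of columns 4 to 6 as a third column.
        t(:, 3) = sqrt(sum(output(:, 4:6).^2, 2));
    end
    results.(matlab.lang.makeValidName(list(ci).name)) = t;
end
```

After the loop, results.pump1 and results.filter1 hold the per-subfolder matrices, so nothing is overwritten between iterations and no cd call is needed.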

3 Comments

Thank you for your fast response.
I tried this but it is still not working. The names of the Excel sheets differ, so I do not want to specify a name; the code should simply read the first sheet. I managed to change a few things, but it is not working. Please look into it.
The code stops working from the point where 'S' is defined.
yourpath='F:\';
list=dir(yourpath);
isfile=~[list.isdir];
dirnames={list([list.isdir]).name};
dirnames=dirnames(~(strcmp('.',dirnames)|strcmp('..',dirnames)));
for ci=1:length(dirnames)
fileDir = char (dirnames(ci));
S=dir(fullfile(fileDir, '*.csv'));
for k=1:length(S)
data = readmatrix(S(k).name);
%% using vertcat or something i want to merge all the excel files in subfolder into one file matrix
end
%% do some calculations on the final merged matrix of the one subfolder
%% this matrix data should be saved, so that in the next loop the new merged matrix will be saved with a new name.
end
Also, when I use this line
" S = dir(fullfile(fileDir,'Sheeta*.xlsx'));"
it does not give a list of files.
In the workspace it shows this:
S =
name
folder
date
bytes
isdir
datenum
If I can get the list of files into 'S', I think I can do the next part easily. This is where I am stuck.
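One likely reason for the empty 'S' (an assumption, since the thread does not confirm it): dirnames holds bare folder names, so fullfile(fileDir, '*.csv') is resolved against the current folder rather than against yourpath, and readmatrix(S(k).name) has the same problem. The sketch below uses a made-up demo tree to show the difference between the relative and the absolute pattern.

```matlab
% Hypothetical demo tree: <temp>/pump1/a.csv
mainFolder = tempname;
mkdir(fullfile(mainFolder, 'pump1'));
writematrix(magic(3), fullfile(mainFolder, 'pump1', 'a.csv'));

fileDir = 'pump1';   % bare name, as stored in dirnames

% Relative pattern: resolved against pwd, so it finds nothing unless
% the current folder happens to be mainFolder.
S_rel = dir(fullfile(fileDir, '*.csv'));

% Absolute pattern: resolved against mainFolder, independent of pwd.
S_abs = dir(fullfile(mainFolder, fileDir, '*.csv'));

% Build the full file name from the folder field that dir returns.
data = readmatrix(fullfile(S_abs(1).folder, S_abs(1).name));
```

Using S(k).folder together with S(k).name when reading means the file is found no matter what the current folder is.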
Hello,
My code was a general example; you can modify it as follows.
To list all the xlsx files in the directory, rather than only files whose names contain specific characters,
replace
S = dir(fullfile(fileDir,'Sheeta*.xlsx')); % get list of data files in directory according to name structure 'Sheeta*.xlsx'
with
S = dir(fullfile(fileDir,'*.xlsx')); % get list of all xlsx files in directory
Also, 'sheet' is an optional parameter passed to xlsread; it is not mandatory. By default the first sheet is loaded. If xlsread is not appropriate, you have other options (readtable).
Replace
data = xlsread(fullfile(fileDir, S(k).name),sheet);
with
data = xlsread(fullfile(fileDir, S(k).name));
=> demo code modified according to the explanations above
clc
clearvars
%% define path
yourpath = pwd; % or your specific path
list=dir(yourpath); %get info of files/folders in current directory
isfile=~[list.isdir]; %determine index of files vs folders
dirnames={list([list.isdir]).name}; % directories names (including . and ..)
dirnames=dirnames(~(strcmp('.',dirnames)|strcmp('..',dirnames))); % remove . and .. directories names from list
%% Loop on each folder
for ci = 1:length(dirnames)
    fileDir = char(dirnames(ci)); % current directory name
    % Build the pattern from yourpath so the listing does not depend on the current folder
    S = dir(fullfile(yourpath, fileDir, '*.xlsx')); % get list of all xlsx files in directory
    S = natsortfiles(S); % sort file names into natural order (which dir does not do), see FEX:
    %(https://fr.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort)
    %% Loop inside folder
    for k = 1:length(S) % read data from the first sheet (xlsread default)
        data = xlsread(fullfile(S(k).folder, S(k).name)); % or use a structure (S(k).data) to store the full data structure
        % your own code here for data processing. this is just for my demo
        % for now:
        title_str = [fileDir ' / ' S(k).name];
        figure, plot(data), title(title_str);
    end
end


Why not just set up a fileDatastore that collects all the CSV files below some top-level folder, like
% Set up a fileDatastore
topLevelFolder = pwd; % Wherever you want, like 'C:\CSV files' or wherever.
% IncludeSubfolders makes the datastore descend into subfolders; FileExtensions keeps only CSV files.
ds = fileDatastore(topLevelFolder, 'ReadFcn', @readmatrix, 'IncludeSubfolders', true, 'FileExtensions', '.csv')
% Extract all the individual file names. They'll have the folder as well as the base file name plus extension.
allFileNames = ds.Files
numCSVFiles = numel(allFileNames);
fprintf('A total of %d files.\n', numCSVFiles)
% Loop through them all reading them into a variable.
for k = 1 : numCSVFiles
thisFullFileName = allFileNames{k};
fprintf('Now reading %s\n', thisFullFileName)
data = readmatrix(thisFullFileName);
% Now process data however you want.
end
fprintf('Done reading all %d CSV files.\n', numCSVFiles)
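If each subfolder has to be processed separately, the file names from the datastore can be grouped by their parent folder before reading. The following is a minimal sketch under assumptions: the demo tree and the names folders, groupIdx and merged are hypothetical, and the datastore is created with 'IncludeSubfolders' and 'FileExtensions' so that it descends into the subfolders.

```matlab
% Hypothetical demo tree so the sketch runs anywhere: two subfolders, two CSV files each.
topLevelFolder = tempname;
subNames = {'pump1', 'filter1'};
for n = 1:numel(subNames)
    mkdir(fullfile(topLevelFolder, subNames{n}));
    for f = 1:2
        writematrix(n * ones(3, 2), fullfile(topLevelFolder, subNames{n}, sprintf('part%d.csv', f)));
    end
end

% Collect every CSV file below the top-level folder.
ds = fileDatastore(topLevelFolder, 'ReadFcn', @readmatrix, ...
    'IncludeSubfolders', true, 'FileExtensions', '.csv');

% Group the full file names by the folder they live in.
folders = cellfun(@fileparts, ds.Files, 'UniformOutput', false);
[uniqueFolders, ~, groupIdx] = unique(folders);

merged = cell(numel(uniqueFolders), 1);   % one merged matrix per subfolder
for g = 1:numel(uniqueFolders)
    groupFiles = ds.Files(groupIdx == g);
    parts = cellfun(@readmatrix, groupFiles, 'UniformOutput', false);
    merged{g} = vertcat(parts{:});
    [~, folderName] = fileparts(uniqueFolders{g});
    fprintf('%s: %d files, %d rows\n', folderName, numel(groupFiles), size(merged{g}, 1));
end
```

Each cell of merged then corresponds to one entry of uniqueFolders, so the per-subfolder calculations can run on merged{g} inside the same loop.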

4 Comments

Hello @Image Analyst, thank you for your fast response.
I cannot combine all the CSV files. Every subfolder has to be selected individually, with some calculations performed on it and an output matrix produced for it, because every subfolder is different from the others even though they all contain CSV files.
You can ask the user to specify a file name this way:
% Have user browse for a file, from a specified "starting folder."
% For convenience in browsing, set a starting folder from which to browse.
startingFolder = pwd; % or 'C:\wherever';
if ~isfolder(startingFolder)
% If that folder doesn't exist, just start in the current folder.
startingFolder = pwd;
end
% Get the name of the file that the user wants to use.
defaultFileName = fullfile(startingFolder, '*.csv');
[baseFileName, folder] = uigetfile(defaultFileName, 'Select a file');
if baseFileName == 0
% User clicked the Cancel button.
return;
end
fullFileName = fullfile(folder, baseFileName)
The x output matrix only shows the data of the last iteration (the last subfolder). Is it possible to create x1, x2, x3 as outputs, one for every subfolder loop?
I read that variable names should not be created dynamically, like x1, x2, x3. Is there another way to create the output matrix 'X' named after its subfolder?
If I have subfolders pump, fan, ..., the x output matrix should use the subfolder name and give pump, fan as output matrices.
Is the way I used the X matrix to add up the data OK?
No, that would be a really bad idea, as you've already learned. Why? See the FAQ:
I don't see why you can't just use data in your computations. Why do you need to save them all? If you still do, you can put data into a cell array:
data{k} = readmatrix(thisFullFileName);
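To look results up by subfolder name without creating variables dynamically, one option is a containers.Map keyed by the folder name. This is a sketch only: the folder names and the stand-in matrices below are hypothetical placeholders for the real merged data.

```matlab
names = {'pump', 'fan'};   % hypothetical subfolder names

% Map from subfolder name to its result matrix; 'any' allows matrices of any size.
results = containers.Map('KeyType', 'char', 'ValueType', 'any');
for k = 1:numel(names)
    X = k * ones(2, 3);    % stand-in for the merged matrix of subfolder k
    results(names{k}) = X; % store under the subfolder name
end

pumpData = results('pump');   % retrieve one subfolder's matrix by name
storedNames = keys(results);  % all stored subfolder names
```

Unlike struct field names, map keys may contain spaces or start with digits, which can matter for real folder names.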

Release: R2020a
Asked: 30 Mar 2022
Commented: 1 Apr 2022
