How do I import and read multiple text files with the same name from different folders

Hi,
I'm a beginner with MATLAB and struggling with opening and importing a large set of data. I have a folder DATA with subfolders P01, P02, P03 etc. Each subfolder contains 10 text files, and I would like to open and read one text file from each subfolder, named RESULTS_TRIALS. The data is 19 by 7 and I have 150 participants (P01 - P150), so I would like to make a data array of 19x7x150.
I want to write a for loop that goes through all those folders and imports one txt file per subfolder, but I have been struggling for a couple of days now and still don't have any clue how to make this happen.
Thanks in advance

4 Comments

Can you provide a sample of a RESULTS_TRIALS file?
Thanks for your quick answer!
I have added a txt file
Oh! My answer will need modification for this type of text file as I assumed from the fact that you want a 3D array that all your data was numerical.
This sort of data should be stored in a table, which can't be 3D. In any case, you would be better off vertically concatenating all the results into just one table. Does the "ID" match the "Pxx" folder name? If so, the whole concatenation is trivial. Otherwise, I'd just add one variable to the table holding the source folder (assuming that you care about that information).
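The source-folder idea can be sketched roughly as below. This is only an illustration, not code from the thread: the function name importWithSource and the variable name SourceFolder are made up, and it assumes readtable can parse the files and that every file has the same variable names.

```matlab
function alldata = importWithSource(datafolder, dataname)
% Read dataname from every P* subfolder of datafolder and stack the rows
% into one table, recording the folder each row came from.
filelist = dir(fullfile(datafolder, 'P*', dataname));
tbls = cell(numel(filelist), 1);
for fidx = 1:numel(filelist)
    t = readtable(fullfile(filelist(fidx).folder, filelist(fidx).name));
    [~, pname] = fileparts(filelist(fidx).folder);         % last path part, e.g. 'P01'
    t.SourceFolder = repmat(string(pname), height(t), 1);  % hypothetical variable name
    tbls{fidx} = t;
end
alldata = vertcat(tbls{:});  % needs identical variable names in every file
end
```

It would be called as alldata = importWithSource('C:\somewhere\somefolder', 'RESULTS_TRIALS.txt'); after which alldata.SourceFolder tells you which participant folder each row belongs to.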


 Accepted Answer

datafolder = 'C:\somewhere\somefolder'; %your root data folder
dataname = 'RESULTS_TRIALS.txt'; %replace with appropriate extension since you haven't specified
filelist = dir(fullfile(datafolder, 'P*', dataname)); %get list of dataname files in all P* subfolders of datafolder
%optional: sort filelist by the P folder number:
[~, order] = sort(str2double(regexp({filelist.folder}, '\d+$', 'match', 'once')));
filelist = filelist(order);
%loop over files and import into alldata
alldata = [];
datasize = [];
for fidx = 1:numel(filelist)
    filedata = readmatrix(fullfile(filelist(fidx).folder, filelist(fidx).name));
    if fidx == 1 %first file read. Preallocate destination matrix
        datasize = size(filedata);
        alldata = zeros([datasize, numel(filelist)]); %preallocate 3D array
    else %2nd to last file, make sure the size matches previous files
        %isequal returns the scalar logical that assert requires
        assert(isequal(size(filedata), datasize), 'Inconsistent matrix size in folder %s', filelist(fidx).folder);
    end
    alldata(:, :, fidx) = filedata; %copy file data in destination matrix
end

5 Comments

The code I wrote goes into all the P* folders of datafolder and, if there's a 'RESULTS_TRIALS.txt' (dataname) file in that P* folder, loads it into alldata. However, from your screenshot it doesn't look like that's exactly what you want.
Is the name of the file to load the same in each P* folder? If not how do you know which file to load? And is it just one file per folder?
As mentioned, the code needs adapting to the fact that it's not just numeric data in your files. That's an easy change to make.
Yes, so every Pxx folder contains exactly the same number of .txt files, with the same names. Also, every file with the same name will have exactly the same dimensions.
So for now I was only trying to import the RESULTS_EHMI_TRIALS.txt files from every Pxx folder, in such a way that I will still be able to know which results belong to which participant.
But I think it might be handy to also import the other txt files already, since for example 'demographics' also needs to be linked to participants later on...
The txt files i am gonna use are only those four: (dimensions are different for those files)
RESULTS_EHMI_DIGITS
RESULTS_EHMI_QUESTIONS
RESULTS_EHMI_TRIALS
RESULTS_QUESTIONS
Ok, this will create as many tables as there are types of files to import. They're all stored in a cell array. For each type you get just one table, which is the vertical concatenation of the files in all the P* directories. It is assumed that the file on its own has the required information (e.g. the ID field) to know where it came from:
datafolder = 'C:\somewhere\somefolder'; %your root data folder
filenames = {'RESULTS_EHMI_DIGITS.txt', 'RESULTS_EHMI_QUESTIONS.txt', 'RESULTS_EHMI_TRIALS.txt', 'RESULTS_QUESTIONS.txt'};
importedtables = cell(size(filenames));
for filetype = 1:numel(filenames)
    filelist = dir(fullfile(datafolder, 'P*', filenames{filetype})); %get list of files for the filetype in all P* folders
    tbl = arrayfun(@(f) readtable(fullfile(f.folder, f.name)), filelist, 'UniformOutput', false);
    importedtables{filetype} = vertcat(tbl{:});
end
Once the 4 tables are imported, you can join, innerjoin or outerjoin the tables as required.
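As a rough illustration of the joins, here is a sketch on two toy tables. The key name ID and the variables Trial and Age are assumptions chosen for the example; substitute whatever variable your files actually share.

```matlab
% Toy tables standing in for two of the imported tables. The shared key
% 'ID' and the other variable names are assumptions for illustration.
trials = table([1; 1; 2], [0; 1; 0], 'VariableNames', {'ID', 'Trial'});
demographics = table([1; 2; 3], [24; 31; 28], 'VariableNames', {'ID', 'Age'});

% innerjoin keeps only the IDs present in both tables
both = innerjoin(trials, demographics, 'Keys', 'ID');

% outerjoin keeps every ID; MergeKeys collapses the two key columns into one
everyone = outerjoin(trials, demographics, 'Keys', 'ID', 'MergeKeys', true);
```

Here both has one row per trial with the matching Age attached, while everyone also keeps ID 3, with a missing value (NaN) for Trial because that participant has no trial rows.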
Thanks it worked!:))!!
although, got an error in this line, but i changed f.file to f.name
tbl = arrayfun(@(f) readtable(fullfile(f.folder, f.file)), filelist, 'UniformOutput', false);
I have 1 more question, is it possible to remove in RESULTS_EHMI_TRIALS, the first row from every file?
Yep, sorry it was meant to be f.name indeed. Fixed now.
Assuming that by first row, you mean the Trial index 0:
importedtables{3}(importedtables{3}.Trial_Index_ == 0, :) = []; %delete all rows whose trial index is 0
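If the row to discard is not reliably marked by a trial index of 0, an alternative is to drop the first data row of each file as it is read. A minimal sketch, with a made-up helper name (readWithoutFirstRow) and assuming readtable detects the header line correctly:

```matlab
function t = readWithoutFirstRow(filename)
% Read a text file into a table and discard its first data row.
t = readtable(filename);
t = t(2:end, :);  % also safe when the file has no data rows
end
```

It would replace the readtable call in the import loop, e.g. tbl = arrayfun(@(f) readWithoutFirstRow(fullfile(f.folder, f.name)), filelist, 'UniformOutput', false);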

Asked:

on 27 Nov 2019

Commented:

on 27 Nov 2019
