Opening Sequential .txt Files and Populating Array with Data

Hello All,
I would like to write a script that will open multiple tab-delimited, headerless .txt files from a folder (e.g. file names "Test1", "Test2", etc.) and copy the second column of each into an array of zeros. So far I have tried to follow a similar case I found here:
%This script is designed to open multiple .txt files containing
%data and write the second column of data into an array.
%1)Set-up File Directory
%Copy folder path here:
MyFolder = Users/Chris/Desktop/Test;
TextFiles = dir(Users/Chris/Desktop/Test .txt);
%Create an array of zeros to rewrite data over
Absorbance = zeroes (2048, 63); %2048 x values and ## spectra files
%3)Import Absorbance Data
for i = 1:63 %Number of spectra files in directory
TextFileNames = ['Test_Absorbance_' num2str(k) '.txt'];
%Open Text File, 'rt' = read text (only works for txt files)
fid = fopen(TextFileNames, 'rt');
%Read file into 2 separate variables
Data = textscan (fid, '%d%d', 2);
%4)Write Abs. data to array
%Insert absorbance data into array
Absorbance(:,k+1) = Data {2}; %put in 2nd row of data
fclose(fid);
end
The code-error alert icon in the upper right is green; however, when I run the script I get an "undefined function or variable 'Users'" error. I tried switching the path to ~Desktop\Test and got the same error. I would appreciate it if someone could explain what the proper command should be.
Thank you in advance,

Accepted Answer

dpb
dpb on 29 Apr 2014
Again, once you've got the results from dir for the files in question, then use the names as returned therein...that is, instead of
DataFiles = dir('Test_Absorbance_*.txt');
fid = fopen('Test_Absorbance_00001.txt');
which hardcodes the filename and so makes the call to dir pointless, use
fid=fopen(DataFiles(k).name,'r');
inside your loop.
Absorbance = zeros(2048, length(DataFiles)); % preallocate the two that you want to keep
Wavelength=Absorbance;
for k = 1:length(DataFiles); % this is right...begin the loop
fid=fopen(DataFiles(k).name,'r'); % open the k-th file for reading
TextData = textscan(fid, '%n%n', 2048, 'delimiter', '\t');
Wavelength(:,k) = TextData{1}; % fill k-th column of preallocated
Absorbance(:,k) = TextData{2}; % arrays w/ the stuff from the file
fid=fclose(fid);
end
  3 Comments
dpb
dpb on 30 Apr 2014
Edited: dpb on 30 Apr 2014
dir returns a structure array with one element per file; the filename is stored in the field called name. To retrieve a file name, index the ith element of the structure array and reference the desired field using "dot" notation. Try
d=dir('*.txt');
d
at the command line to see what dir returns in d and then
d.name
doc dir % for details
It also wouldn't hurt to read up on Matlab data types and their syntax in the Data Types chapter, which you can reach via
doc struct
for details, and to broaden your Matlab horizons if this is new or unfamiliar territory for you.
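As a rough illustration of the indexing described above, the sketch below builds a small structure array by hand so it can be run without any files on disk. The two file names are made up for the example; a real call to dir returns more fields than just name.

```matlab
% Sketch: a hand-built struct array shaped like the output of dir.
% The names are hypothetical stand-ins for real files.
d = struct('name', {'Test_Absorbance_00001.txt', 'Test_Absorbance_00002.txt'});

n = numel(d);            % number of entries (files)
first = d(1).name;       % name of the first entry, a char vector
allNames = {d.name};     % every name collected into a 1-by-n cell array

% Walk the array one element at a time, as the reading loop would.
for i = 1:n
    fprintf('%d: %s\n', i, d(i).name);
end
```

Typing d.name on its own at the command line lists each name in turn, which is why wrapping it in braces collects them into a cell array.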
ADDENDUM:
BTW, if you require the files to be in numerical order, note that dir returns them in whatever order the OS provides, which may or may not be sequential. Odds are good if they were created that way, but any renaming or moving around is fairly likely to change the order, and even without that it isn't guaranteed. For robustness you may therefore want to sort the array before using it. That takes one "trick": concatenating the filenames by row, since they're character variables of possibly differing lengths --
d=dir('*.dat'); % sample directory
[~,ix]=sortrows(strvcat(d.name)); % sort the padded names, keep the permutation
d=d(ix); % reorder the structure array to match
The "trick" is strvcat to automagically vertically concatenate the list of names, padding them as needed to then be able to use sortrows.
The limitation in the above is that if the filenames are not created using zero-filled numerals, as in your first example of
'Test_Absorbance_00001.txt'
then the ASCII sort won't be the same as a numeric sort on just the numeric field. In that case the expedient of building the name manually is probably the easier route.
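If building the names by hand is not convenient, one alternative (not part of the suggestion above) is to extract the numeric field and sort on that. This is only a sketch: the three unpadded file names are hypothetical, and the pattern assumes the number sits directly before the .txt extension.

```matlab
% Hypothetical file names without zero padding, standing in for {d.name}.
names = {'Test_Absorbance_10.txt', 'Test_Absorbance_2.txt', 'Test_Absorbance_1.txt'};

% Pull the run of digits immediately before ".txt" out of each name.
tok = regexp(names, '(\d+)\.txt$', 'tokens', 'once');
num = cellfun(@(t) str2double(t{1}), tok);

% Sort on the numeric value and reorder the names to match.
[~, idx] = sort(num);
sortedNames = names(idx);
```

With a real directory listing the same index vector can reorder the structure array itself, i.e. d = d(idx).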
SA
SA on 25 Mar 2021
With the above code, a single file is opened and processed on each pass, and the for loop works through all the files in the directory. My question: is there any way to read a group of files at a time (say 30 or 40 files per batch, processed with Matlab code) and continue in this way until all the files in the directory (say 1000 .txt files) have been read? Thanks


More Answers (2)

dpb
dpb on 28 Apr 2014
Edited: dpb on 29 Apr 2014
MyFolder = Users/Chris/Desktop/Test;
...the code-error alert icon in the upper right is green, however when I run the script I get an "undefined function or variable 'Users'" error
Because you wrote the path w/o enclosing the string in quotes. Matlab parses the line as a chain of divisions of four separate variables, with the result to be stored in the LHS variable MyFolder; only at run time does it discover that there is no variable Users.
You meant to write
MyFolder = 'Users/Chris/Desktop/Test';
BTW, I'd approach it slightly differently...use dir to return the files and iterate over the collection and avoid the grief of building file names manually...plus if one isn't there, you won't have to try to skip it--
Instead of
for i = 1:63 %Number of spectra files in directory
TextFileNames = ['Test_Absorbance_' num2str(k) '.txt'];
fid = fopen(TextFileNames, 'rt');
...
I'd suggest
d=dir('Test_Absorbance_*.txt'); % find the extant absorbance files
for i=1:length(d)
fid = fopen(d(i).name, 'rt');
...
Also NB: if these files are in the directory given above by MyFolder, both of the above will fail unless that is your working directory. In that case, use fullfile to build the file specification for dir and the fully qualified name for the fopen call.
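A minimal sketch of that fullfile approach follows. The folder is an assumption: it is the path from the question written as an absolute path, so substitute the real location of the data files. The check on fid is an extra precaution, not something the thread asked for.

```matlab
% Assumption: the question's folder written as an absolute path.
MyFolder = '/Users/Chris/Desktop/Test';

% Ask dir for the matching files inside that folder.
d = dir(fullfile(MyFolder, 'Test_Absorbance_*.txt'));

for i = 1:length(d)
    fname = fullfile(MyFolder, d(i).name);   % fully qualified file name
    fid = fopen(fname, 'rt');
    if fid == -1                              % fopen returns -1 on failure
        warning('Could not open %s', fname);
        continue
    end
    % ... read the data here, e.g. with textscan ...
    fclose(fid);
end
```

If the folder does not exist or holds no matching files, dir returns an empty structure array and the loop body simply never runs.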
  1 Comment
dpb
dpb on 28 Apr 2014
Edited: dpb on 28 Apr 2014
ADDENDUM
BTW, here's the kind of place where old-fashioned textread is a real improvement actually over the newer textscan -- it's unfortunate imo that TMW has so strongly deprecated its use.
d=dir('Test_Absorbance_*.txt'); % find the extant absorbance files
for i=1:length(d)
data = textread(d(i).name);
...
You get a way to address the file directly w/o dealing w/ the file handle and a friendly data array directly instead of the encapsulating cell that there's no need for.



Chris
Chris on 29 Apr 2014
DPB,
Thank you for your help. I've been able to get the following script, which calls on the 1st file (Test_Absorbance_00001.txt), to work (it will open the file, read it, and take the cell elements and create the Wavelength and Absorbance elements). However, I am having a good bit of difficulty understanding how to incorporate the for loop and have it call up _00002.txt and _00003.txt and so forth. Here is what I have:
%The following line of code creates a structure containing all of the names
%of the files in the directory that are .txt files
DataFiles = dir ('Test_Absorbance_*.txt');
Absorbance = zeros (2048, length (DataFiles));
%This command opens a for loop that will read each file in the directory
for k = 1:length(DataFiles);
%The following two lines of code tell MatLab to open a .txt file from the
%Ocean Optics spectrometer. They will produce a 1x2 cell called TextData
%where {1,1} is the wavelength values and {1,2} are the absorbance values.
fid = fopen ('Test_Absorbance_00001.txt');
TextData = textscan (fid, '%n%n', 2048, 'delimiter', '\t');
%The following two lines create vectors named Wavelength and Absorbance
%from TextData {1,1} and TextData {1,2} respectively.
Wavelength = TextData {1,1};
Absorbance = TextData {1,2};
%Re-write k-th column of Absorbance with this spectrum's
%TextData {1,2} data
%Absorbance (:,k) = TextData {1,2};
%Close the open file in fid
fclose(fid);
end
%The plot command will graph Absorbance vs Wavelength.
plot (Wavelength, Absorbance)
