# populating a tall array in a for loop

Still Learning Matlab on 6 Jun 2018
Edited: KAE on 22 Apr 2021
*I acknowledge that my approach is flawed, but am curious whether this solution exists.
Can I populate a tall array in a for loop?
I am running a large number of calculations and want to store the results in a vector. Let's say the results form a 1-by-10^12 vector of doubles. Clearly this exceeds the memory of my laptop.
The way I have it coded now is to keep track of how many calculations have been performed. Once a particular number is exceeded, I save off the workspace variable and then clear the variable from memory.
count = 1
for i = 1:1*10^12
    if count > 784000000
        if exist('A') == 1
            save('TestsaveforProject.mat','A','-v7.3')
            clear A
        end
        B = zeros(1,784000000);
        B(count-78399999) = calculation
        if count > 2*784000000
            if exist('B') == 1
                save('TestsaveforProject2.mat','B','-v7.3')
                clear B
                C = zeros(1,784000000);
            end
            C(count-2*78399999) = calculation
        end
    else
        A(count) = calculation
    end
    count = count+1;
end
Can I convert the series of 'if' statements to a few lines that populate a tall table? For 1*10^12 cases I would need more than 100 if statements like this, and the save function is pretty clunky. I'm open to any other suggestions on data storage.
Thanks
dpb on 6 Jun 2018
Do you need the large vector of values to do the calculation, or is it the result of the calculations (as I think I gather)?
If it is the latter, look at the example of using matfile at Growing an array. Also read the documentation on the various MATLAB tools for large data (Large Files and Big Data) to get an overview of the facilities that exist and see what fits your problem most naturally.
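As a rough illustration of the matfile suggestion, here is a minimal sketch that grows a result vector on disk one chunk at a time, so that only one chunk is ever in memory. The file name, the variable name `results`, the chunk size, and the use of `rand` as a stand-in for the real calculation are all assumptions made for the example, and the sizes are deliberately tiny.

```matlab
% Sketch: grow a result vector on disk in chunks using matfile.
% File name, sizes, and the rand() stand-in are assumptions for illustration.
fname = fullfile(tempdir, 'matfile_growth_example.mat');
if exist(fname, 'file')
    delete(fname);                      % start from a clean file
end
m = matfile(fname, 'Writable', true);   % file is created on the first write

totalCount = 10010;                     % total number of results (small demo value)
chunkSize  = 1000;                      % results held in memory at any one time

written = 0;
while written < totalCount
    n = min(chunkSize, totalCount - written);
    chunk = rand(1, n);                 % placeholder for the real calculation
    % Assigning beyond the current end grows the variable inside the file.
    % matfile needs an index for every dimension, hence the explicit row 1.
    m.results(1, written+1:written+n) = chunk;
    written = written + n;
end

% Only the requested slice is read back into memory
firstFive = m.results(1, 1:5);
sz = size(m, 'results');
```

With this pattern the chain of if statements collapses into a single loop, and the chunk size can be tuned to whatever fits comfortably in memory.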

Edric Ellis on 7 Jun 2018
What I think you should do is something like the following:
% Choose a directory to store the files, creating it if necessary
outDir = '/tmp/tall_eg';
if ~exist(outDir, 'dir')
    mkdir(outDir);
end
% Counter indicating which file we'll save to next
fileIdx = 1;
% How many rows of data to save in each file
rowsPerFile = 100;
% How many rows have been written so far
rowsWritten = 0;
% How many rows to write in total
totalRows = 10010;
while rowsWritten < totalRows
    % Choose how many rows to write to this file
    rowsThisTime = min(rowsPerFile, totalRows - rowsWritten);
    % Build the rows
    data = rand(rowsThisTime, 1);
    % Choose a file name - ensure these progress in order
    fname = fullfile(outDir, sprintf('data_%05d.mat', fileIdx));
    % Save the data and increment counters
    save(fname, 'data');
    fileIdx = 1 + fileIdx;
    rowsWritten = rowsThisTime + rowsWritten;
end
% Read the data back in as a tall array. First create a datastore ...
ds = fileDatastore(fullfile(outDir, '*.mat'), ...
    'ReadFcn', @(fname) getfield(load(fname), 'data'), ...
    'UniformRead', true);
% ... and then a tall array
tdata = tall(ds)
Note the 'ReadFcn' argument to the fileDatastore is a little tricky - it loads the file and then simply extracts the 'data' field and returns that. 'UniformRead' is required to ensure that we get a tall numeric vector rather than a tall cell array.
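Once the tall array exists, it can be used much like an ordinary vector, with evaluation deferred until gather is called. The following is a small self-contained sketch under assumed conditions: it writes three tiny chunk files into a temporary directory (the file contents and names are made up for the example), reads them back through a fileDatastore, and computes a few summary values without loading everything at once.

```matlab
% Sketch: summarise chunk files through a tall array.
% The directory, file names, and data values are assumptions for illustration.
outDir = tempname;
mkdir(outDir);
for k = 1:3
    data = (1:100)' + 100*(k-1);        % column vector, 100 rows per file
    save(fullfile(outDir, sprintf('data_%05d.mat', k)), 'data');
end

% Each read returns the 'data' variable of one file; UniformRead lets
% the reads be concatenated vertically into one numeric column
ds = fileDatastore(fullfile(outDir, '*.mat'), ...
    'ReadFcn', @(f) getfield(load(f), 'data'), ...
    'UniformRead', true);
tdata = tall(ds);

% These calls only build up deferred operations ...
tMean = mean(tdata);
tMax  = max(tdata);
tRows = size(tdata, 1);

% ... and gather evaluates them together in one pass over the files
[mu, mx, nRows] = gather(tMean, tMax, tRows);
```

Gathering several results in a single call matters for large data, because each separate gather would otherwise trigger its own pass over all of the files.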
##### 2 Comments
KAE on 22 Apr 2021
This should be added as an example in the Matlab documentation.

R2018a
