How to preallocate memory for storing data in same mat file?

Question

Hi, I wrote the code below and would like to preallocate memory so that it runs faster. I know that once I preallocate I can no longer append and must index into the array to store the output. Can you suggest how to do this for the code below?

Here the value of f is a 1*5449 double. Final output is 5449*5449 double.

clc;
n=1; %system order 
m=1; %number of inputs
p=6;%number of outputs
Final = []; 
for i = 1:7783
  for j = 1:50
      if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
          load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
          A1 = A{1}; 
          A1 = A1 / max(abs(eig(A1)));
          B1 = B{1}; 
          C1 = C{1};
          index = 1;
          for k = 1:7783
              for l = 1:50
                      if exist(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat'],'file')
                          load(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat']);
                          A2 = A{1}; 
                          A2 = A2 / max(abs(eig(A2)));
                          B2 = B{1};  
                          C2 = C{1};
                          f(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
                          index = index + 1;
                      end
              end
          end 
          Final = [Final;f];
      end
  end
end
save('Distance','Final');

Sunny on 21 Oct 2018

Edited: Sunny on 21 Oct 2018

Thanks. I changed the program to this. I think this is faster. A is 10*10 double, B is 1*10 and C is 6*10. Now the cell arrays f, o and g are 1*5449.

clc;
n=10; %system order 
m=1; %number of inputs
p=6;%number of outputs
Final = [];
k = 1;
for i = 1:7783
    for j = 1:50
        if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
            load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
            f{k} = A{1};
            o{k} = B{1};
            g{k} = C{1};
            k = k+1;
        end
    end
end
save('Rescaled_A_Values_All_States','f');
save('Rescaled_B_Values_All_States','o');
save('Rescaled_C_Values_All_States','g');
for c = 1:5449
    A1 = f{c};
    A1 = A1 / max(abs(eig(A1)));
    B1 = o{c};
    C1 = g{c};
    index = 1;
    for d = 1:5449
        A2 = f{d};
        A2 = A2 / max(abs(eig(A2)));
        B2 = o{d};
        C2 = g{d};
        q(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
        index = index + 1;
    end
    Final = [Final;q];
end

Guillaume on 21 Oct 2018

Well, yes it's going to be much faster. You're reading each file only once. You're still doing N^2 unnecessary eigs and related calculations. And nearly 99% of the files you test for existence don't exist, so it'd be faster to do a dir so the OS just tells you which files are there.

Finally, depending on what distance1_matlab does, it may well be that your 2nd loop is not needed.
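A minimal sketch of the point about the repeated eig calls, assuming the matrices are already held in a cell array f as in the code above. The rescaling depends only on each matrix, so it can be done once per matrix before the pairwise loop rather than once per pair. The three matrices here are made-up stand-ins, not data from the thread, and fn is a hypothetical name.

```matlab
% Made-up stand-in matrices; in the thread these would be the A{1} from each file
f = {magic(3), 2 * eye(3), [0 1; -2 -3]};

fn = cell(size(f));                          % preallocate the output cell array
for c = 1:numel(f)
    fn{c} = f{c} / max(abs(eig(f{c})));      % scale so the spectral radius is 1
end
% The pairwise loop can now read fn{c} and fn{d} directly, with no further eig calls
```

With N matrices this costs N eigenvalue decompositions instead of one for every (c, d) pair.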

Answer 1

Guillaume on 21 Oct 2018

Depending on what distance1_matlab does, this code could be significantly improved.

I'm also assuming that all files that match the pattern 'ID_*_file_*_Variables.mat' need to be loaded.

filelist = dir('ID_*_file_*_Variables.mat');      %get list of files that exist
fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once');  %extract numeric ids as text
fileids = str2double(vertcat(fileids{:}));   %and convert to numeric
%you may want to sort fileids and filelist to match the order of your original loops
%it's trivial to do. For now I assume it does not matter.
filedata = struct('A', cell(numel(filelist), 1), 'B', [], 'C', []);  %preallocate structure to receive file content and final result
%note that A, B and C are very poor field names.
for fileiter = 1:numel(filelist)
   filecontent = load(filelist(fileiter).name);   %load into a structure rather than into the workspace
   filedata(fileiter).A = filecontent.A{1} / max(abs(eig(filecontent.A{1})));   %normalise once per file
   filedata(fileiter).B = filecontent.B{1};
   filedata(fileiter).C = filecontent.C{1};
end
[idx1, idx2] = ndgrid(1:numel(filedata));  %cartesian product of all file indices with themselves (indices are used because ndgrid expects numeric input)
distfun = @(s1, s2) distance1_matlab(s1.A, s2.A, s1.B, s2.B, s1.C, s2.C);
distance = arrayfun(@(i1, i2) distfun(filedata(i1), filedata(i2)), idx1, idx2);  %assumes that the result of distance1_matlab is scalar

Note that the last line assumes distance1_matlab returns a scalar. If not, change it to:

distance = arrayfun(@(i1, i2) distfun(filedata(i1), filedata(i2)), idx1, idx2, 'UniformOutput', false);
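On the earlier remark that fileids and filelist can be sorted to match the order of the original loops, here is one possible sketch. The file names are hypothetical stand-ins for what dir would return, and the variable order is introduced only for illustration. Name-based ordering would typically put ID_10 before ID_2, which is why the numeric ids are sorted instead.

```matlab
% Hypothetical file names standing in for the output of dir
names = {'ID_10_file_2_Variables.mat', 'ID_2_file_11_Variables.mat', 'ID_2_file_3_Variables.mat'};
filelist = struct('name', names);            % 1x3 struct array, like a dir listing

fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once');
fileids = str2double(vertcat(fileids{:}));   % one row per file: [ID, file number]

[fileids, order] = sortrows(fileids);        % sort by ID first, then by file number
filelist = filelist(order);                  % reorder the file list the same way
```

After this, filelist(1) is the file with the smallest ID and file number, which matches the order in which the nested i and j loops would have visited it.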

If you want the result in the same form as your original Final, then:

distance = distance(:);  %if scalar result out of distance1_matlab (flattens the N-by-N matrix into a single column)
distance = vertcat(distance{:});   %otherwise
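For the original question about preallocating and indexing instead of appending, a plain double loop could look roughly like the sketch below. The data is made up, and distfun is a hypothetical placeholder because the code of distance1_matlab is not shown in the thread; it also assumes the distance is a scalar.

```matlab
% Made-up stand-in data; in the thread these come from the .mat files
N = 4;
filedata = struct('A', cell(N, 1), 'B', [], 'C', []);
for k = 1:N
    filedata(k).A = k * eye(2);
    filedata(k).B = ones(1, 2);
    filedata(k).C = ones(6, 2);
end

% Hypothetical placeholder for distance1_matlab (assumed to return a scalar)
distfun = @(A1, A2, B1, B2, C1, C2) norm(A1 - A2, 'fro');

Final = zeros(N, N);                 % preallocate the whole result once
for c = 1:N
    for d = 1:N
        % index into the preallocated matrix instead of growing it
        Final(c, d) = distfun(filedata(c).A, filedata(d).A, ...
                              filedata(c).B, filedata(d).B, ...
                              filedata(c).C, filedata(d).C);
    end
end
```

Final then has the same N-by-N layout as in the question (row c holds the distances from file c to every file) and can be saved with save('Distance','Final') as before.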

Sunny on 26 Oct 2018

@Guillaume

Can I use parfor instead of for to speed up execution with parallel processing? Do the loops synchronize?

Guillaume on 26 Oct 2018

I doubt that using parfor for the loading loop would help much. The slow part of that is not the processor but the disk access. If anything, it's possible that parfor will slow things down as parallel threads compete for disk access. You'll only know if you try.

I don't know if the parallel toolbox can parallelise arrayfun (I don't have the toolbox). arrayfun is a for loop in disguise. Parallelising that code could certainly result in a speed-up.
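As a rough sketch of what parallelising the pairwise step with parfor might look like, assuming the distance is a scalar: each outer iteration fills one row of a preallocated matrix. The data and distfun are hypothetical placeholders, not the real distance1_matlab. Iterations of a parfor loop are independent and run in no guaranteed order, so there is no synchronisation between them; each iteration may only write its own slice of the output. Without the Parallel Computing Toolbox the loop should simply run serially.

```matlab
% Made-up stand-in data and a hypothetical placeholder distance function
N = 4;
data = arrayfun(@(k) k * eye(2), 1:N, 'UniformOutput', false);
distfun = @(A1, A2) norm(A1 - A2, 'fro');

Final = zeros(N, N);                 % preallocated, sliced output variable
parfor c = 1:N
    row = zeros(1, N);               % temporary variable, local to this iteration
    for d = 1:N
        row(d) = distfun(data{c}, data{d});
    end
    Final(c, :) = row;               % each iteration writes only its own row
end
```

Whether this is faster in practice depends on how expensive the distance function is compared with the overhead of starting the workers and sending them the data.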

However, as I've said (twice now), depending on what distance1_matlab does, it's likely that this 2nd loop/arrayfun is not needed at all and that the function can be vectorised. This would probably be the most efficient way to improve your code, which is why I asked for the details of this function.
