# Running Matlab in Parallel on local machine or another suggestion


### Answers (2)

Voss
on 30 Jan 2022

You've got two nested for loops there: the outer one loops over all the F.* files and the inner one loops over all the Nareplicate.* files. Both loops use the same index variable k, so each pass through the inner loop body works on the matching pair S(k), T(k). With 5 files of each type, the inner loop body runs 25 times, performing the same set of operations 5 times on each pair of files. With 1000 files of each type it would execute 1 million times (doing the same thing on each pair of files 1000 times), so I would expect that fixing this will speed things up significantly.
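To see the repetition concretely, here is a minimal sketch (not the original script; `nFiles` and `timesProcessed` are made-up names for illustration) that counts how often each file pair would be processed. It relies on the documented behavior that MATLAB's `for` statement overrides any change made to the loop variable inside the loop body, so the outer loop still runs `nFiles` times even though the inner loop reuses `k`.

```matlab
% Hypothetical illustration: count how often each pair (S(k), T(k)) is processed
% when the inner loop reuses the outer loop's index variable k.
nFiles = 5;                          % assumed number of files of each type
timesProcessed = zeros(1, nFiles);   % timesProcessed(k): visits to pair k

for k = 1:nFiles                     % outer loop over the F.* files
    for k = 1:nFiles                 % inner loop reuses k on purpose, as in the script
        timesProcessed(k) = timesProcessed(k) + 1;
    end
end

disp(timesProcessed)                 % one counter per file pair
fprintf('Inner loop body ran %d times in total.\n', sum(timesProcessed));
```

With `nFiles = 5`, every pair is visited 5 times, for 25 passes in total; a single loop over `k` would visit each pair once.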

The current two-loop implementation with those 5 pairs of files takes ~1s:

    clc
    clear all
    tic
    S = dir('F.*');
    for k = 1:numel(S)
        S(k).name;
        T = dir('Nareplicate.*');
        for k = 1:numel(T)
            T(k).name;
            Nadump = dlmread(T(k).name, ' ', 9, 0);
            Fdump = dlmread(S(k).name, ' ', 9, 0);
            L1 = length(Nadump);
            L2 = length(Fdump);
            for i=1:L1
                for j=1:L2
                    X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
                    X(i) = X(i)/10;
                    Y(j,i) = (X(i));
                    %X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
                    %X(i) = X(i)/10;
                    %Y(j,i) = (X(i));
                end
            end
            %S = zeros(L2, L1);
            %for j = 1:L2
            %S(j,:) = sort(Y(j,:));
            %end
            %S1= S(:,1);
            Y1 = Y';
            Y2 = sort(Y1);
            S1= sort ((Y2(1,:))');
            %Find indices to elements in first column of A that satisfy the equalit
            ind1 = S1(:,1) < .28;
            ind2 = S1(:,1) < .55;
            ind3 = S1(:,1) < .78;
            %ind4 = S(:,1) > .79;
            %Use the logical indices to index into A to return required sub-matrices
            A1 = S1(ind1,:);
            A2 = S1(ind2,:);
            A3 = S1(ind3,:);
            %A4 = S(ind4,:);
            Q1(k,:) = [length(A1), (length(A2)-length(A1)), length(A3)-length(A2), 125-length(A3)];
        end
    end
    W= sum(Q1)/(length(T));
    W1 = W/125;
    toc
    bar(diag(W1),'stacked', 'BarWidth', 1)
    dlmwrite('Ion-Pair-Stat.txt',Q1,'delimiter','\t','precision',3)

Using one loop takes ~0.3s:

    clc
    clear all
    tic
    S = dir('F.*');
    T = dir('Nareplicate.*');
    for k = 1:numel(S)
        S(k).name;
        T(k).name;
        Nadump = dlmread(T(k).name, ' ', 9, 0);
        Fdump = dlmread(S(k).name, ' ', 9, 0);
        L1 = length(Nadump);
        L2 = length(Fdump);
        for i=1:L1
            for j=1:L2
                X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
                X(i) = X(i)/10;
                Y(j,i) = (X(i));
                %X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
                %X(i) = X(i)/10;
                %Y(j,i) = (X(i));
            end
        end
        %S = zeros(L2, L1);
        %for j = 1:L2
        %S(j,:) = sort(Y(j,:));
        %end
        %S1= S(:,1);
        Y1 = Y';
        Y2 = sort(Y1);
        S1= sort ((Y2(1,:))');
        %Find indices to elements in first column of A that satisfy the equalit
        ind1 = S1(:,1) < .28;
        ind2 = S1(:,1) < .55;
        ind3 = S1(:,1) < .78;
        %ind4 = S(:,1) > .79;
        %Use the logical indices to index into A to return required sub-matrices
        A1 = S1(ind1,:);
        A2 = S1(ind2,:);
        A3 = S1(ind3,:);
        %A4 = S(ind4,:);
        Q1(k,:) = [length(A1), (length(A2)-length(A1)), length(A3)-length(A2), 125-length(A3)];
    end
    W= sum(Q1)/(length(T));
    W1 = W/125;
    toc
    bar(diag(W1),'stacked', 'BarWidth', 1)
    dlmwrite('Ion-Pair-Stat.txt',Q1,'delimiter','\t','precision',3)

And simplifying the computation cuts the time down again by almost half. If anything is unclear about what I did here, you can set a breakpoint, inspect the variables, and convince yourself that it does the same thing it used to do, and/or come back here and post a comment and I'll explain it:

    clc
    clear all
    tic
    S = dir('F.*');
    T = dir('Nareplicate.*');
    N = numel(S);
    Q1 = zeros(N,4);                 % preallocate one row of counts per file pair
    for k = 1:N
        Nadump = dlmread(T(k).name, ' ', 9, 0).';   % transposed: one column per Na row
        Fdump = dlmread(S(k).name, ' ', 9, 0);
        L1 = size(Nadump,2);
        L2 = size(Fdump,1);
        Y = zeros(L2,L1,3);
        for m = [3 4 5]
            % difference in file column m between every F row and every Na row
            Y(:,:,m-2) = Fdump(:,m)-Nadump(m,:);
        end
        % smallest scaled distance from each F row to any Na row
        S1 = min(sqrt(sum(Y.^2,3))/10,[],2);
        % count the F rows below each cutoff
        N1 = nnz(S1 < 0.28);
        N2 = nnz(S1 < 0.55);
        N3 = nnz(S1 < 0.78);
        Q1(k,:) = [N1 N2-N1 N3-N2 L2-N3];
    end
    W1 = sum(Q1,1)/N/125;
    toc
    bar(diag(W1),'stacked', 'BarWidth', 1)
    dlmwrite('Ion-Pair-Stat.txt',Q1,'delimiter','\t','precision',3)

If you adopt any or all of those changes, I bet you will see a significant improvement in the speed of your code when you run it on the real case (1000s of files).
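One way to convince yourself that the simplified computation matches the double loop is to run both on small synthetic data. The sketch below is only an illustration under stated assumptions: `Fxyz` and `Naxyz` are made-up stand-ins for columns 3 to 5 of `Fdump` and `Nadump` (assumed here to be coordinates), and the column-minus-row subtraction relies on implicit expansion, which needs MATLAB R2016b or later.

```matlab
% Synthetic stand-ins for columns 3:5 of Fdump and Nadump (assumed to be x, y, z).
rng(0);                      % fixed seed so the check is repeatable
Fxyz  = 10 * rand(6, 3);     % 6 hypothetical F positions
Naxyz = 10 * rand(4, 3);     % 4 hypothetical Na positions
L2 = size(Fxyz, 1);
L1 = size(Naxyz, 1);

% Loop version: scaled distance from every F (row j) to every Na (column i).
D = zeros(L2, L1);
for i = 1:L1
    for j = 1:L2
        D(j, i) = sqrt(sum((Fxyz(j, :) - Naxyz(i, :)).^2)) / 10;
    end
end
loopMin = min(D, [], 2);     % nearest-Na distance for each F

% Vectorized version: one L2-by-L1 page of coordinate differences per axis.
NaT = Naxyz.';               % 3-by-L1, so NaT(m, :) is a row vector
Y = zeros(L2, L1, 3);
for m = 1:3
    Y(:, :, m) = Fxyz(:, m) - NaT(m, :);   % column minus row expands to L2-by-L1
end
vecMin = min(sqrt(sum(Y.^2, 3)) / 10, [], 2);

maxDiff = max(abs(loopMin - vecMin));
fprintf('Largest difference between the two versions: %g\n', maxDiff);
```

The two results should agree to within floating-point round-off. The loop version sorts before counting, but the counts below each cutoff do not depend on the order, so taking the minimum along dimension 2 is enough.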

Walter Roberson
on 28 Jan 2022

Are the files already stored on an SSD?

If not then are they split between two (or more) hard drives? Preferably on different controllers?

Generally speaking, hard drives typically reach peak performance with two reading processes per drive and one (sometimes two) drives per controller.

I have been testing some Samsung BAR+ USB 3.1 flash drives (make sure you get the 128 GB or larger version; the smaller ones are slower). On a very new M1 MacBook, I am reading about 304 megabits/second from them; on my 2013 iMac and an external USB3 hub, I am reading about 386 megabits/second from them. Write speed is only on the order of 60 megabits/second, but the read speed is very nice.

A little over a year ago, I connected an external Thunderbolt drive bay to my 2013 iMac; with it and WD Red drives or HG Star drives, I am able to write about 225 megabits/second and read about 285 megabits/second. Reading speed is not as good as with those new flash drives... on the other hand, I am running it through a Thunderbolt 2 <-> Thunderbolt 3 converter, and would likely get a significant performance improvement if I were to switch it to my newer iMac.

My Samsung EVO (SSD) drive in the same external enclosure is giving me write speeds of about 490 megabits/second and read speeds of about 530 megabits/second.

... The point being that paying attention to what kind of drives you have and how they are connected can help gain a significant performance improvement.

If you are using a USB 2 drive and you have a USB 3 controller, pick up a quality SSD or a quality flash drive. The Samsung BAR+ 128 GB drives cost me only C$30 each.
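If you want a rough number for your own setup, a sketch along these lines times a sequential read from MATLAB. It is only an illustration: `testFile`, `nBytes`, and `megabitsPerSecond` are made-up names, the scratch file is created in the system temp folder (point `testFile` at the drive you actually want to measure), and a file that was just written may be served from the operating system's cache, so the figure can be optimistic.

```matlab
% Hypothetical throughput check: write a scratch file, then time reading it back.
testFile = [tempname '.bin'];        % scratch file in the system temp folder
nBytes   = 8 * 2^20;                 % 8 MiB of test data (arbitrary size)

fid = fopen(testFile, 'w');
fwrite(fid, zeros(nBytes, 1, 'uint8'), 'uint8');
fclose(fid);

tic
fid  = fopen(testFile, 'r');
data = fread(fid, Inf, 'uint8=>uint8');   % read the whole file as bytes
fclose(fid);
elapsed = toc;

megabitsPerSecond = numel(data) * 8 / 1e6 / elapsed;
fprintf('Read %d bytes in %.4f s (about %.0f megabits/second)\n', ...
    numel(data), elapsed, megabitsPerSecond);

delete(testFile);                    % clean up the scratch file
```

Using a larger `nBytes`, or a file that has not been touched since the last reboot, gives a number closer to what the drive itself can deliver.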
