Efficient way to assign indices to variables in a matrix

I have a 3e6-by-4 matrix, matrix1, and a 7e6-by-4 matrix, matrix2. I need to replace each element of matrix1 with the row index at which that value occurs in the 4th column of matrix2. I have written a nested for loop that does this, but it takes 4 hours. How can I do it more efficiently?
for i = 1:size(matrix1,1)
    for j = 1:size(matrix1,2)
        matrix1(i,j) = find(matrix2(:,4)==matrix1(i,j));
    end
end
  4 Comments
Adam Fitchett on 29 Sep 2022
Pre-allocation isn't necessary because the result matrix already exists.
matrix1 already exists; I am simply assigning a value to each element iteratively.

Answers (1)

Jan on 29 Sep 2022
Edited: Jan on 1 Oct 2022
[~, Result] = ismember(A, B(:, 4));
A look up table is even faster: instead of searching for the element A(i,j) in B(:, 4), create a vector LUT in which the element at position v holds the row index of the value v in B(:, 4). Then LUT(A) returns all indices with a single indexing operation. A limitation is that look up tables work for positive integer values only, and a vector as long as the maximum value must fit into the available RAM. If this is not the case, use ismember or the ismembc2 approach.
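To make the look up table idea concrete, here is a minimal sketch on made-up toy data. B4 and A are hypothetical stand-ins for B(:, 4) and the matrix to be converted, and the sketch assumes that the values in B4 are unique positive integers:

```matlab
% Toy data (assumption): B4 stands in for B(:, 4) and holds unique positive integers.
B4 = [7; 3; 10; 5];
A  = [3 10; 5 7];            % values to be replaced by their row index in B4

% LUT(v) is the row of B4 that holds the value v; unused slots stay 0.
LUT = zeros(max(B4), 1);
LUT(B4) = 1:numel(B4);

ResultLUT = LUT(A);                 % one indexing operation, no searching
[~, ResultRef] = ismember(A, B4);   % documented function as a reference

assert(isequal(ResultLUT, ResultRef), 'look up table and ismember disagree');
```

The table costs one element per possible value, not per stored value, which is why the maximum value determines the memory consumption.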
Some timings using input data which are a factor of 100 smaller: (Win10, R2018b, 4 Core i7)
index = randperm(12e4, 7e4).';
B = [zeros(7e4, 3), index];
A0 = index(randi([1, numel(index)], 3e4, 4));
% ISMEMBER:
tic;
[~, Result] = ismember(A0, B(:, 4));
toc
% Look up table:
tic;
n = max(A0(:));
LUT = zeros(n, 1);
LUT(B(:, 4)) = 1:size(B, 1);
A = LUT(A0);
toc;
assert(isequal(Result, A), 'wrong result');
% 2 loops:
A = A0;
tic
for i = 1:size(A,1)
    for j = 1:size(A,2)
        A(i,j) = find(B(:,4) == A(i,j));
    end
end
toc
assert(isequal(Result, A), 'wrong result');
% 1 FOR loop
A = A0;
tic
V = B(:, 4);
for k = 1:numel(A)
    A(k) = find(V == A(k));
end
toc
assert(isequal(Result, A), 'wrong result');
% 1 PARFOR loop:
A = A0;
gcp; % Open a parallel pool
tic
V = B(:, 4);
parfor k = 1:numel(A)
    A(k) = find(V == A(k));
end
toc
assert(isequal(Result, A), 'wrong result');
% ISMEMBC2 (undocumented):
A = A0;
tic
[Vs, idx] = sort(B(:, 4));
A = idx(ismembc2(A, Vs)); % [EDITED, without a loop over indices of A]
toc
assert(isequal(Result, A), 'wrong result');
% Elapsed time is 0.008757 seconds. ismember
% Elapsed time is 0.002307 seconds. look up table
% Elapsed time is 8.971745 seconds. 2 loops
% Elapsed time is 9.019042 seconds. 1 loop
% Elapsed time is 3.829381 seconds. 1 PARFOR loop
% Elapsed time is 0.006160 seconds. ismembc2
For the input data of the original size:
index = randperm(12e6, 7e6).';
B = [zeros(7e6, 3), index];
A0 = index(randi([1, numel(index)], 3e6, 4));
I get 1.8 seconds for ismember and 0.53 seconds for the look up table. Compared with your 4 hours, the look up table is faster by a factor of about 27,000. Nice.
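Whether the look up table is applicable depends on the data, so the choice between the two fast methods could be sketched as follows. The toy data and the memory budget maxLUTBytes are assumptions for illustration only. For the data above, the largest value is 12e6, so a table of doubles needs about 96 MB.

```matlab
% Toy data (assumption): B4 stands in for B(:, 4).
B4 = [7; 3; 10; 5];
A  = [3 10; 5 7];

maxLUTBytes = 1e9;   % assumed memory budget for the table, adjust to your machine

% The table needs positive integer values and one slot per possible value.
vMax     = max(max(B4), max(A(:)));
isPosInt = all(B4 > 0 & B4 == floor(B4)) && all(A(:) > 0 & A(:) == floor(A(:)));
useLUT   = isPosInt && 8 * vMax <= maxLUTBytes;   % 8 bytes per double

if useLUT
    LUT = zeros(vMax, 1);      % sized so that every value of A is a valid index
    LUT(B4) = 1:numel(B4);
    Result = LUT(A);
else
    [~, Result] = ismember(A, B4);
end
```

Sizing the table by the largest value of both inputs means that a value of A which does not occur in B4 maps to 0, as it does with ismember, instead of causing an indexing error.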
  4 Comments
Jan on 4 Oct 2022
Do you mean this one:
Result = zeros(size(A)); % Pre-allocation!!!
B4 = B(:, 4);
for k = 1:numel(A)
    Result(k) = find(A(k) == B4, 1);
end
But the ismember, ismembc2 and look up table methods are tens of thousands of times faster.
At the very least, use a binary search:
Result = zeros(size(A)); % Pre-allocation!!!
[Vs, idx] = sort(B(:, 4));
for k = 1:numel(A)
    Result(k) = idx(ismembc2(A(k), Vs)); % Write to the pre-allocated Result, not to A
end
Avoiding the loop by
Result = idx(ismembc2(A, Vs));
speeds this up even further.

Release

R2019b
