# Speeding up this function involving 3D matrix multiplications.

Ankur Kamboj on 12 Feb 2024
Commented: Ankur Kamboj on 13 Feb 2024
I have this function that is called many times in my project. Running the code analyzer shows that it is the bottleneck. It contains a lot of 3D matrix multiplications. I was originally looping to evaluate it, but then someone suggested vectorizing it. However, the only function I could find that helped is pagemtimes(). Is there a way I can speed this up further? My aim is to deploy this in a real-time experiment, which is why speed is of paramount importance!
Function:
function [fun] = evalP(input, xd, d, pars, Wo, Ph)
% input - 67x1000 double
% xd - 2x1 double
% d - scalar
% pars - 7x1 double
% Wo - 2x4 double
% Ph - 4x19 double
dim = size(Ph, 2);
sz = size(input, 2);
x = zeros(2,1,sz); w1 = zeros(4,1,sz); w2 = w1; dq = zeros(dim, 1, sz); u = dq; q = dq; fun = zeros(67, sz);
x(:, 1, :) = input(1:2, :);
w1(:, 1, :) = input(3:6, :); w2(:, 1, :) = input(7:10, :);
dq(:, 1, :) = input(11:29, :); u(:, 1, :) = input(30:48, :); q(:, 1, :) = input(49:67, :);
wh = cat(2, w1, w2);
f1 = x + d * pagemtimes(Wo * Ph, u);
common_term = pagemtimes(Ph, pagemtimes(pagemtimes(dq, permute(dq, [2,1,3])), Ph'));
f2 = w1 + d * -pars(1) * permute( pagemtimes(permute(w1 - Wo(1, :)', [2,1,3]), common_term), [2,1,3]);
f3 = w2 + d * -pars(1) * permute( pagemtimes(permute(w2 - Wo(2, :)', [2,1,3]), common_term), [2,1,3]);
f4 = dq + d * (-pars(7)*dq + u) + sqrt(d) * pars(6) * randn(dim, 1, sz);
f5 = u + d * -pars(2) * ( pagemtimes( pagemtimes(pagemtimes(Ph', pagemtimes(wh, permute(wh, [2,1,3]))), Ph) + pars(3)*eye(dim), u) - pars(4) * pagemtimes(pagemtimes(Ph', wh), (xd - x)) ) + sqrt(d) * pars(5) * randn(dim, 1, sz);
f6 = q + d * u;
fun(:,:) = cat(1, f1, f2, f3, f4, f5, f6);
end
Edit: This MATLAB function computes a function value for each of 1000 data points (the columns of input). W is a 2x4x1000 double. Here input is [x; w1; w2; dq; u; q], where W is split into w1 and w2 to vectorize it, and p_i denotes pars(i).
James Tursa on 12 Feb 2024
Can you add some explanatory text stating what the calculation is, maybe with a general equation etc.? I would note that all of your permute( ) and indexing and cat( ) stuff results in deep data copies (performance drag), so my first suggestion would be to rearrange your data up front to maybe avoid some or all of this, but you could help us by showing at a higher level what the calculations are doing.
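One way to read the suggestion above is to keep each block of input as a plain 2-D matrix wherever the math allows it. The following is a minimal sketch under that assumption, not code from the thread: the random test data and the names f1_2d, f6_2d, and maxDiff are placeholders. It shows that f1 and f6 need no 3-D pages at all, because Wo*Ph is the same matrix for every data point.

```matlab
% Sketch: evaluate f1 and f6 with plain 2-D matrix products.
% Sizes follow the question; the random values are placeholders.
sz = 1000; dim = 19;
input = randn(67, sz);
Wo = randn(2, 4); Ph = randn(4, dim); d = 0.01;

% Row blocks of the 67-by-sz input, kept 2-D.
x = input(1:2, :);      % 2-by-sz
u = input(30:48, :);    % dim-by-sz
q = input(49:67, :);    % dim-by-sz

% A single 2-D product handles all sz columns at once.
f1_2d = x + d * ((Wo * Ph) * u);   % 2-by-sz
f6_2d = q + d * u;                 % dim-by-sz

% Reference: the paged form used in the question.
x3 = reshape(x, 2, 1, sz);
u3 = reshape(u, dim, 1, sz);
f1_3d = x3 + d * pagemtimes(Wo * Ph, u3);   % 2-by-1-by-sz

% Largest absolute difference between the two forms.
maxDiff = max(abs(f1_2d - reshape(f1_3d, 2, sz)), [], 'all');
```

Whether this is actually faster depends on the machine and MATLAB release, so it is worth timing both forms (for example with timeit) before committing to one.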
Ankur Kamboj on 12 Feb 2024
Not sure if it helps, but the function is basically evaluating a mathematical function for some 1000 67x1 inputs (Edited into the OP).
I'll try to see if avoiding permute() and cat() results in any performance gains. Thanks!

Matt J on 12 Feb 2024
Edited: Matt J on 12 Feb 2024
Don't use permute(___,[2,1,3]) to transpose the pages in conjunction with pagemtimes. Use the transposition flags instead.
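A minimal sketch of what the transposition flags could look like for the common_term line of the question. This is an illustration, not necessarily the code from the original answer; the test data and the names ct_permute, ct_flags, and maxDiff are placeholders.

```matlab
% Sketch: replace permute(___,[2,1,3]) with pagemtimes transposition flags.
sz = 1000; dim = 19;
Ph = randn(4, dim);
dq = randn(dim, 1, sz);

% Form used in the question: explicit page transpose via permute.
ct_permute = pagemtimes(Ph, ...
    pagemtimes(pagemtimes(dq, permute(dq, [2,1,3])), Ph'));

% Same product, with the transposes requested through the flags.
outer    = pagemtimes(dq, 'none', dq, 'transpose');      % dim-by-dim-by-sz
ct_flags = pagemtimes(Ph, ...
    pagemtimes(outer, 'none', Ph, 'transpose'));         % 4-by-4-by-sz

% Largest absolute difference between the two forms.
maxDiff = max(abs(ct_permute - ct_flags), [], 'all');
```

The flag syntax pagemtimes(X, transpX, Y, transpY) accepts 'none', 'transpose', or 'ctranspose' for each operand, so no transposed copy of the pages has to be materialized first.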
Matt J on 12 Feb 2024
I have similar concerns about f5, but to know how best to handle it, we need to know what you plan to use it for.
But one thing that's for sure is that you should not build the matrix if you are merely going to multiply it with u. Instead, re-express the product so that the matrix is never formed explicitly.
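As a hedged sketch of such a re-expression (a reconstruction, not necessarily the expression from the original answer): the matrix in f5 is Ph'*wh*wh'*Ph + pars(3)*eye(dim), and multiplying it with u can be evaluated right to left as Ph'*(wh*(wh'*(Ph*u))) + pars(3)*u. The test data and the names p3, Mu_full, Mu_fast, and relErr are placeholders.

```matlab
% Sketch: apply (Ph'*wh*wh'*Ph + p3*I) to u without forming the matrix.
sz = 1000; dim = 19;
Ph = randn(4, dim);
p3 = 0.5;                      % stands in for pars(3)
wh = randn(4, 2, sz);
u  = randn(dim, 1, sz);

% Form used in the question: a dim-by-dim matrix on every page.
M = pagemtimes(pagemtimes(Ph', ...
        pagemtimes(wh, permute(wh, [2,1,3]))), Ph) + p3 * eye(dim);
Mu_full = pagemtimes(M, u);                            % dim-by-1-by-sz

% Re-associated form: only matrix-vector products, right to left.
Pu = pagemtimes(Ph, u);                                % 4-by-1-by-sz
t  = pagemtimes(wh, 'transpose', Pu, 'none');          % 2-by-1-by-sz
Mu_fast = pagemtimes(Ph', pagemtimes(wh, t)) + p3 * u; % dim-by-1-by-sz

% Relative difference between the two forms.
relErr = max(abs(Mu_full - Mu_fast), [], 'all') / max(abs(Mu_full), [], 'all');
```

The intermediate results here are 4-by-1 and 2-by-1 per page instead of dim-by-dim, which is where any saving would come from; the actual gain should be confirmed with the profiler.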
Ankur Kamboj on 13 Feb 2024
I just implemented what you suggested for f2 and f3, and it really is more than a 2x speedup. Thank you!! I am still trying to wrap my head around how the elementwise multiplication and the sum operation equate to the analytical expressions for f2 and f3.
But I get the gist: vector-matrix multiplications are much faster than matrix-matrix ones. I'll try to work out a similar expression for f5 as well. According to the profiler, f5 is the only one left that's eating up the majority of the computation time.
Again, much thanks! This saved me a lot of time, trouble, and research.
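The equivalence mentioned in the comment above can be checked numerically. The suggested code for f2 and f3 is not shown here, so the following is an assumed reconstruction of the idea: with v = Ph*dq, the common term is v*v', and for e = w1 - Wo(1,:)' the transposed product (e'*v*v')' equals v*(v'*e). Since v'*e is a scalar on each page, it can be written as sum(v .* e, 1). The test data and the names p1, f2_ref, f2_fast, and relErr are placeholders.

```matlab
% Sketch: f2 via elementwise multiplication and sum versus the original form.
sz = 1000; dim = 19;
Ph = randn(4, dim); Wo = randn(2, 4);
d = 0.01;
p1 = 2;                        % stands in for pars(1)
w1 = randn(4, 1, sz);
dq = randn(dim, 1, sz);

% Form used in the question.
common_term = pagemtimes(Ph, ...
    pagemtimes(pagemtimes(dq, permute(dq, [2,1,3])), Ph'));
e = w1 - Wo(1, :)';                                   % 4-by-1-by-sz
f2_ref = w1 + d * -p1 * permute( ...
    pagemtimes(permute(e, [2,1,3]), common_term), [2,1,3]);

% Elementwise form: common_term equals v*v' on each page.
v = pagemtimes(Ph, dq);        % 4-by-1-by-sz
s = sum(v .* e, 1);            % 1-by-1-by-sz, the scalar v'*e per page
f2_fast = w1 - d * p1 * (v .* s);

% Relative difference between the two forms.
relErr = max(abs(f2_ref - f2_fast), [], 'all') / max(abs(f2_ref), [], 'all');
```

The same pattern would apply to f3 with w2 and Wo(2,:)' in place of w1 and Wo(1,:)'.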


Release: R2023b
