# Efficient Row x Column multiplication

Asked by Martijn on 20 Mar 2012
Latest activity Commented on by Daniel on 20 Mar 2012

Hi all,

Consider the matrices A of size s x 4 and B of size 4 x s.

Now, I am only interested in the products A(i,:)*B(:,i) for i = 1,...,s, which together form a vector of size s x 1. That is, only each row times the column with the same index.

The solution I found myself is: diag(A*B);

But I think there must be a faster solution, since I calculate many useless matrix elements if s>>4.

Do you guys have a suggestion?

Answer by Honglei Chen on 20 Mar 2012
```
sum(A'.*B)
```
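A quick sanity check may help here. The following is a minimal sketch, not part of the original answer, which confirms on small random matrices that the element-wise approach reproduces `diag(A*B)`. The size `s = 6` and the variable names `x_diag`, `x_sum` and `x_sum2` are illustrative assumptions; `.'` is used instead of `'` so that the check would also hold for complex input.

```matlab
s = 6;                          % illustrative size (assumption)
A = randn(s, 4);                % s x 4
B = randn(4, s);                % 4 x s

x_diag = diag(A*B);             % reference: forms the full s x s product first
x_sum  = sum(A.'.*B, 1).';      % column sums of a 4 x s array, returned as s x 1
x_sum2 = sum(A.*B.', 2);        % same values via row sums of an s x 4 array

err1 = max(abs(x_diag - x_sum));   % should be at rounding level
err2 = max(abs(x_diag - x_sum2));  % should be at rounding level
```

Passing the dimension argument to `sum` explicitly keeps the result well defined whatever the shape of the input.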

Martijn on 20 Mar 2012

Strangely enough this method is much faster than dot()

Martijn on 20 Mar 2012

Sum is better if s>50

Martijn on 20 Mar 2012

better than diag()

Answer by Titus Edelhofer on 20 Mar 2012

Hi,

don't know if it is really more efficient, give it a try:

```
x = dot(A',B)'
```

Titus

## 1 Comment

Martijn on 20 Mar 2012

```
t=linspace(0,1,100);
tf=[t'.^3 t'.^2 t' ones(size(t'))];
M=[-1 3 -3 1; 3 -6 3 0 ; -3 0 3 0; 1 4 1 0]; %Standard form for matrix uniform cubic spline evaluation
P=randn(4,length(t));

tic; x = 1/6*diag(tf*M*P); toc
Elapsed time is 0.000065 seconds.
tic; y = 1/6*dot((tf*M)',P)'; toc
Elapsed time is 0.002589 seconds.
```

Hence, too slow.

Answer by Daniel on 20 Mar 2012

Maybe I am missing something here, but it seems like you already have a solution ...

```
x = zeros(s, 1);              % preallocate the result
for i = 1:s
    x(i) = A(i, :)*B(:, i);   % row i of A times column i of B
end
```

## 1 Comment

Martijn on 20 Mar 2012

This is too slow

Answer by Daniel on 20 Mar 2012

I am posting this as a separate answer. First, MATLAB has a hard time timing things that take only 0.000065 seconds, so you should put the code in a loop when timing it. Second, the size of the matrices needs to be established before discounting an answer as too slow.

```
t=linspace(0,1,1e4);
tf=[t'.^3 t'.^2 t' ones(size(t'))];
M=[-1 3 -3 1; 3 -6 3 0 ; -3 0 3 0; 1 4 1 0]; %Standard form for matrix uniform cubic spline evaluation
P=randn(4,length(t));
```

```
% Method 1: diag of the full product
tic;
for ii = 1:10
    x = diag(tf*M*P);
end
toc
```

```
% Method 2: element-wise product and sum
tic;
for ii = 1:10
    x = sum((tf*M)'.*P)';
end
toc
```

```
% Method 3: dot
tic;
for ii = 1:10
    x = dot((tf*M)', P)';
end
toc
```

```
% Method 4: explicit loop over rows and columns
tic;
s = length(t);
x = zeros(s, 1);
A = (tf*M);
B = P;
for ii = 1:10
    for i = 1:s
        x(i) = A(i, :)*B(:, i);
    end
end
toc
```

On my machine I get:

1. Elapsed time is 7.584815 seconds.
2. Elapsed time is 0.004872 seconds.
3. Elapsed time is 0.004445 seconds.
4. Elapsed time is 0.149892 seconds.

So the "too slow" loop is about 50 times faster than diag here. With a large s, there is essentially no difference between the dot and sum methods. For your example matrix sizes, the error checking in dot takes a significant portion of the time.
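As an alternative to wrapping `tic`/`toc` in a hand-written loop, newer MATLAB releases (R2013b onward, so later than this thread) provide `timeit`, which calls a function handle repeatedly and returns a typical run time. The following is a minimal sketch under that assumption. The size `s = 1e3` and the handle names `f_sum`, `f_rows` and `f_dot` are illustrative, and no particular ranking is implied since the results depend on machine and release.

```matlab
s = 1e3;                               % illustrative size (assumption)
A = randn(s, 4);
B = randn(4, s);

f_sum  = @() sum(A.'.*B, 1).';         % column sums of a 4 x s array
f_rows = @() sum(A.*B.', 2);           % row sums of an s x 4 array
f_dot  = @() dot(A.', B).';            % dot over the columns of two 4 x s arrays

% timeit handles warm-up and repetition internally
t_sum  = timeit(f_sum);
t_rows = timeit(f_rows);
t_dot  = timeit(f_dot);

fprintf('sum over columns: %.3g s\n', t_sum);
fprintf('sum over rows:    %.3g s\n', t_rows);
fprintf('dot:              %.3g s\n', t_dot);

% all three variants should agree up to rounding
agree = max(abs(f_sum() - f_dot())) < 1e-10 && ...
        max(abs(f_sum() - f_rows())) < 1e-10;
```

The two `sum` variants touch the same numbers in a different memory order, so comparing them is one simple way to see whether access order matters for a given size.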

To really optimize your code you want to think about the order in which operations are occurring and memory access.

http://undocumentedmatlab.com/blog/matrix-processing-performance/

Martijn on 20 Mar 2012

Thanks for the input! However, I do not understand how you get a calculation time of more than 7 seconds for the diag() function.

Anyway, for s>>4 diag() is indeed very slow and sum or dot are much much faster.

Daniel on 20 Mar 2012

Because I chose s to be very large (1e4) and I looped 10 times. If s is close to 4, the diag method doesn't perform that many useless calculations. It is only when s >> 4 that the number of useless calculations dominates.
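To put a number on the useless calculations, here is a rough operation count. It is only a sketch: it counts multiplications and ignores memory traffic and function-call overhead. With $A \in \mathbb{R}^{s \times 4}$ and $B \in \mathbb{R}^{4 \times s}$, the wanted entries are

$$x_i = \sum_{k=1}^{4} A_{ik} B_{ki}, \qquad i = 1, \dots, s,$$

which takes $4s$ multiplications in total. Forming the full product $AB$ first computes all $s^2$ entries at 4 multiplications each:

$$\frac{\text{multiplications for } \operatorname{diag}(AB)}{\text{multiplications needed}} = \frac{4s^2}{4s} = s.$$

For $s = 10^4$ that is $4 \times 10^8$ multiplications instead of $4 \times 10^4$. The intermediate $s \times s$ matrix of doubles also occupies $10^8 \times 8$ bytes, about 800 MB, which may account for a good part of the measured time.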