# Multiple results for iteratively multiplying a matrix with a vector quickly

4 views (last 30 days)
James McIntosh on 17 Jun 2018
Commented: James McIntosh on 17 Jun 2018
Essentially I want a fast way of doing: c = (A^k)*b0. But I want the result for multiple values of k (I don't need it for all values of k, just some).
At the moment, I am just doing this in a normal for loop (b1 = A*b, b2 = A*b1, b3 = A*b2, etc.) for all k. But I am wondering if there is a faster way (maybe using GPUs).
Doing loops in GPUs doesn't seem like the way forward. I was thinking I could just request c = (A^k)*b0 (which is very fast on the GPU) for only the k that I want, but if I want many (for example for k = [1:5:1000]) this still ends up being slower than just doing it on a loop on the CPU.
Any suggestions? Thanks -
clear; rng(1);
N = 301;
k = 1000; A = randn(N)/17; b = rand(N,1);
f = @() r(A,b,k);
t = timeit(f);disp(t)
function b = r(A,b,k)
for ix = 1:k
% currently not pulling out b for the k of interest (but is doable here)
b = A*b;
end
end
##### 2 CommentsShowHide 1 older comment
James McIntosh on 17 Jun 2018
Thanks - there is no pattern to the k (for all intents and purposes, the k that I want to query is randomly distributed between 1 and 5000, and there might be for example, 500 samples).

James Tursa on 17 Jun 2018
Edited: James Tursa on 17 Jun 2018
Note that as the power value gets higher, the numerical stability of successive matrix multiplies will degrade and you will not get as accurate an answer as calling the MATLAB mpower function directly (which uses the matrix exponential function). Also, I am not sure all of those successive matrix multiplies will be faster than calling mpower directly anyway. E.g., what kind of timing do you get with this compared to your looping?
k = 1:5:1000;
tic
result = arrayfun(@(p)X^p*b,k,'uni',false);
toc
James McIntosh on 17 Jun 2018
Thanks for looking into the problem. I know that the numerical error accumulation is manageable for my purposes. I am really just looking for a faster way than the loop method on a CPU - but maybe that's as good as it's going to get.

Walter Roberson on 17 Jun 2018
Since you are calculating many A^k, you should probably do an svd to get the U, S, V, and then you can easily raise S to multiple powers since it is a vector. You would probably still need a loop for the reconstruction -- though perhaps you would be able to us pagefun() with gpuarrays
James McIntosh on 17 Jun 2018
Thanks - I think I do not understand the SVD approach though. Maybe if A was symmetric (so that U*(S^k)*V' = X^k) - but it is not in this case.