File Exchange

## randomized Singular Value Decomposition

version 1.0 (778 Bytes) by

Extremely fast computation of truncated SVD

4.66667
8 Ratings

Updated

This functions implements a fast truncated SVD.
We often want to compute singular value decompositions. But most of the time, we actually don't need all the singular vectors/values as in Principal Components Analysis.

This is also justified by the fact that many matrices occuring in practice do exhibit some kind of structure that leads to only a few singular values actually being non-negligible.

Recent research has shown that when we want a truncated SVD, randomized algorithms can yield an incredible amount of acceleration.

usage :

input:
* A : matrix whose SVD we want
* K : number of components to keep

output:
* U,S,V : classical output as the builtin svd matlab function

Here is a small example for a 2000x2000 matrix of rank 100 on my small laptop computer:

>> A = randn(2000,100)*randn(100,2000);
>> tic; [U1,S1,V1] = svd(A); toc
Elapsed time is 6.509186 seconds.

>> tic; [U2,S2,V2] = rsvd(A,150); toc
Elapsed time is 0.238175 seconds.

>> norm(U1*S1*V1'-U2*S2*V2')

ans =

2.3591e-11

So in that case a near 30x speed improvement. It just gets crazy when your matrix gets big...

You'll find more information about these breakthrough algorithms in a nice paper by Halko et al. : "finding structure through randomness"

enjoy !

Hengfa Lu

Volker Rath

### Volker Rath (view profile)

On second thought, it seems plausible that there is no chance to approximate fully random matrix with a small k. You can see that the code works fine when k in near N/2. After all the paper of Halko et al. says "finding structure through randomness"...

Volker Rath

### Volker Rath (view profile)

Tested this code with a lot of different matrices, comparing to the standard svd(A, 'econ') or svds results, and also other independent randomized code (found at https://stats.stackexchange.com/questions/2806/best-pca-algorithm-for-huge-number-of-features-10k/11934), but I did not observe this kind of differences. However I could reproduce the results in this example also for the other randomized code.

Just some observations while playing with Dimitris' example:
- the difference between the two methods slightly decreases with number of k
- replacing the mat in Dimitris' code by mat = randn(2000,100)*randn(100,2000) produced excellent results. So this must be matrix dependent
- replacing mat by randn(dims, dims-900) is excellent, but by randn(dims, dims-800) produces the large deviations

Possibly some of the assumptions are violated with a square purely random matrix?

Dimitris Stamos

### Dimitris Stamos (view profile)

There is a big difference between the values recovered my svds for a limited number of singular values and rsvd?

Example:

dims = 100;
mat = randn(dims, dims);
numberOfSVs = 6;

tic
[uMatlab, sMatlab, vMatlab] = svds(mat, numberOfSVs);
matlabToc = toc;
fprintf('matlabs toc: %7.5f\n', matlabToc)

tic
[uRSVD, sRSVD, vRSVD] = rsvd(mat, numberOfSVs);
rsvdToc = toc;
fprintf('rsvd toc: %7.5f\n', rsvdToc)
[diag(sMatlab), diag(sRSVD)]

ZGMLS

### ZGMLS (view profile)

Dear,What is the results difference between the original algorithm and the original algorithm?

Rahul Anand Sharma

### Rahul Anand Sharma (view profile)

Great Code . Reduced our time consumption from few hours to few seconds

Lifang Yu

### Lifang Yu (view profile)

I'm working on spliting an image into many small matrix, so very fast svd on small size matrix is what I need. This svd implementaion is lower than Matlab's svd when processing small size matrix. I don't try it on matrix with large size.

It is 500 faster than eig function for a 192000 x 192000 matrix, thanks a lot!

mike

Antoine Liutkus

### Antoine Liutkus (view profile)

Concerning the theoretical analysis, I just recommend the excellent overview paper by Halko et al "finding structure through randomness". This is just a plain implementation of one of the algorithms proposed there. Cheers

I just updated my rating,... this contribution is just great. I have used it on huge matrices (up to 1.5 GB) to compute up to 1500 singular values/vectors.