How can I do a simple matrix multiplication without running into an out of memory error

Asked by Mehrnaz Shok on 18 May 2013

I want to multiply a big matrix (possibly 6000-by-6000) with its transpose. However, I get an out of memory error. I know that I don't have enough RAM to make this work... but is there any way I can get this to work? I am running Matlab R2012b on 32-bit Windows Vista. Can someone show me how to do this block-wise, or suggest some other method? The only thing that is important for me is that I need to use the result of the matrix multiplication for another analysis.

This is a sample of what I am trying to do (I need to use B to get another piece of code running):

    A = ones(6000); B = A'*A;

Please advise, Thanks
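
For scale, a rough back-of-the-envelope estimate (an editorial addition, not from the original post): a 6000-by-6000 double matrix takes 6000*6000*8 bytes, roughly 275 MB, and a 32-bit MATLAB process on Windows typically has only about 2 GB of (fragmented) address space, so A, B, and any temporary copy together leave little headroom:

    % Rough footprint estimate, assuming full double precision (8 bytes/element)
    n = 6000;
    mbPerMatrix = n^2 * 8 / 2^20;                  % ~275 MB per matrix
    fprintf('One matrix: %.0f MB\n', mbPerMatrix);
    fprintf('A plus B:   %.0f MB\n', 2*mbPerMatrix);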

2 Comments

per isakson on 18 May 2013

R2012a, 64-bit, Windows 7, old vanilla desktop, 8 GB RAM.

    >> tic, A = ones(6000); B = A'*A; toc
    Elapsed time is 6.767988 seconds.
    >> tic, A = rand(6000); B = A'*A; toc
    Elapsed time is 7.429135 seconds.

I guess from looking at the Task Manager window that 4GB would have been enough so far.

    >> tic, A = rand(9000); B = A'*A; toc
    Elapsed time is 22.454364 seconds.


Try

    doc memory
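
As a concrete illustration of that suggestion (an added sketch, not part of the original comment): on 32-bit Windows, the memory function reports the largest contiguous block still available, which is usually the binding constraint here:

    % Windows-only: how large an array could MATLAB allocate right now?
    user = memory;
    fprintf('Largest possible array: %.0f MB\n', ...
            user.MaxPossibleArrayBytes / 2^20);
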
Iain on 22 May 2013

You can reduce the memory you need by working with singles.
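
A minimal sketch of that idea, assuming single precision is accurate enough for the follow-up analysis (singles use 4 bytes per element instead of 8):

    A = ones(6000, 'single');   % ~137 MB instead of ~275 MB
    B = A' * A;                 % B is single too, also half the size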



2 Answers

Answer by Jan Simon on 18 May 2013

Perhaps there are some magic tricks that allow computing A'*A with less memory. When you call the BLAS function through a MEX interface, A' is not calculated explicitly. Perhaps http://www.mathworks.com/matlabcentral/fileexchange/13604-calllapack-matlab-interface-of-lapack-and-blas-functions helps. But nothing can compete with real RAM: install a 64-bit version and a bunch of gigabytes. This will be cheaper than about 1 to 2 hours of an engineer's work at a usual salary.

3 Comments

James Tursa on 19 May 2013

FYI, MATLAB already calls symmetric BLAS routines when doing A'*A (without doing the transpose explicitly), so a MEX interface will not improve things ... you will be doing exactly what MATLAB is doing already.

Jan Simon on 22 May 2013

@James: I assume this symmetric BLAS routine is triggered by the undocumented JIT acceleration. Unfortunately I do not know of any documentation for it, so I cannot tell whether this applies to my Matlab version.

Another point could be that in A = rand(6000); B = A'*A two large matrices must exist at the same time. When A is not required anymore, an in-place construction would be useful. As far as I know, the original BLAS DGEMM does not exploit the fact that the result of A'*A is symmetric. Perhaps the MKL and ATLAS implementations do.

James Tursa on 22 May 2013

@Jan: I am unaware of any specific documentation covering the rules for when symmetric BLAS routines are called. The doc used to list the routines that were used (without giving rules for when they were used), but I don't even see that anymore. E.g., is the symmetric routine called for shared data copy cases such as this:

    A = rand(6000);
    C = A;
    B = C'*A;

Given that the rules are not documented, I don't know if one can always depend on the symmetric behavior.

If you want to know what MATLAB is doing on your particular machine for a particular case, you can always compare the timing to MTIMESX in SPEED mode for a large case. MTIMESX will always key off of the real data pointer value to determine symmetric cases, so it will catch the shared data copy inputs. The timing difference for large symmetric multiplies is an obvious discriminator (something like 30% or more on my machine if I remember correctly). And the symmetric result won't in general match the DGEMM result.
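
A minimal timing probe along those lines, using only built-ins rather than MTIMESX (an added sketch; the D = A + 0 line forces a genuinely distinct matrix, so it cannot be recognized as a shared data copy):

    A = rand(6000);
    C = A;                        % shared data copy: same underlying data
    D = A + 0;                    % distinct data, identical values

    tic, B1 = A'*A; t1 = toc;     % symmetric (DSYRK-style) path expected
    tic, B2 = C'*A; t2 = toc;     % does the shared copy still get it?
    tic, B3 = D'*A; t3 = toc;     % generic DGEMM path expected
    fprintf('A''*A %.2f s | C''*A %.2f s | D''*A %.2f s\n', t1, t2, t3);

If the first two times roughly match each other and beat the third, the symmetric routine is being applied to the shared data copy as well.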

Answer by Iain on 22 May 2013

Do it the old-fashioned way using loops. That way you do NOT need to have A, its transpose, and the result in memory all at once, but it is slower.
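
One block-wise sketch of this idea (the file name 'bigA.mat' and the block size are illustrative, not from the thread): keep A in a v7.3 MAT-file and accumulate B = A'*A from blocks of rows, so only one block plus the 6000-by-6000 result is in memory at a time:

    % One-time preparation elsewhere: save('bigA.mat', 'A', '-v7.3')
    m = matfile('bigA.mat');              % allows partial reads from disk
    [nRows, nCols] = size(m, 'A');
    blockSize = 1000;                     % tune to the available RAM
    B = zeros(nCols);                     % the result itself must still fit
    for i = 1:blockSize:nRows
        rows = i:min(i + blockSize - 1, nRows);
        Ablk = m.A(rows, :);              % load one row block of A
        B = B + Ablk' * Ablk;             % A'*A is the sum over row blocks
    end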

0 Comments

