MATLAB Newsgroup

Can anyone explain the following behavior on a 16GB machine? Notice that the memory error occurs in two different places: 'plus' and 'mtimes'.

>> clear all;

>> n = 2.5*10^4;

% --- GBytes = 8*n^2/10^9;

% --- Memory needed for each nxn matrix = 5GB;

>> C = rand(n,n); D = rand(n,n);

% --- WTM (Windows Task Manager) records total memory in use 12.6GB = 2.6 (OS + ?) + (5+5) matrices: checks

>> whos

Name Size Bytes Class Attributes

C 25000x25000 5000000000 double

D 25000x25000 5000000000 double

n 1x1 8 double

>> memory

Maximum possible array: 3498 MB (3.667e+009 bytes) *

Memory available for all arrays: 3498 MB (3.667e+009 bytes) *

Memory used by MATLAB: 9929 MB (1.041e+010 bytes)

Physical Memory (RAM): 16381 MB (1.718e+010 bytes)

>> C = 2*C;

??? Error using ==> mtimes

Out of memory. Type HELP MEMORY for your options

>> C = C+C;

??? Error using ==> plus

Out of memory. Type HELP MEMORY for your options.

>> for i = 1:n, for j = 1:n, C(i,j) = 2*C(i,j); end; end; % --- works ok ---

>> % --- Interrupted at keyboard after 5mins ---

>> for i = 1:n, for j = 1:n, C(i,j) = C(i,j)+C(i,j); end; end; % --- works ok ---

>> % --- Interrupted at keyboard after 5mins ---

>> ver

-------------------------------------------------------------------------------------

MATLAB Version 7.6.0.324 (R2008a)

MATLAB License Number:

Operating System: Microsoft Windows Vista Version 6.0 (Build 6001: Service Pack 1)

Java VM Version: Java 1.6.0 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode

-------------------------------------------------------------------------------------

Derek O'Connor

"Derek O'Connor" <derekroconnor@eircom.net> wrote in message <gr5ega$t7f$1@fred.mathworks.com>...

Well, even if you would not compute those operations you are near the limit of your physical ram just with matrix storage ... and that's not the best for ram performance. Second, there are large possibilities that you don't need all that amount of data, because those that really need don't have these problems =)))

Whatever, in the C = operations(C) instruction you need a copy of C, so your estimated memory usage would be 15 GB at least, and crumbs waste memory too...

Third, if you really want to work with all those data try to load only some data in memory and keep the whole matrix in a file,

and last, you really need a double random number? Try to recast to something less expensive.
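
The suggestion to keep the whole matrix in a file and load only part of it into memory could be sketched roughly as below. This is only an illustration under stated assumptions: the scratch file name, the small matrix order n and the block width blk are all hypothetical stand-ins, and the file is assumed to hold raw doubles in column-major order.

```matlab
% Sketch: keep a matrix on disk as raw doubles and double it one column block at a time.
% fname, n and blk are illustrative assumptions, not values from the thread.
fname = tempname();            % hypothetical scratch file
n     = 200;                   % small stand-in for the real matrix order
blk   = 50;                    % number of columns held in memory at once

A = rand(n, n);
fid = fopen(fname, 'w');  fwrite(fid, A, 'double');  fclose(fid);

fid = fopen(fname, 'r+');      % open the existing file for reading and writing
for j0 = 0:blk:n-1
    ncols  = min(blk, n - j0);
    offset = 8*n*j0;                        % byte offset of column j0+1 (column-major, 8 bytes per double)
    fseek(fid, offset, 'bof');
    B = fread(fid, [n, ncols], 'double');   % only an n-by-ncols block is in memory
    B = 2*B;
    fseek(fid, offset, 'bof');              % go back and overwrite the same block
    fwrite(fid, B, 'double');
end
fclose(fid);

% Read everything back only to check the result on this small example.
fid = fopen(fname, 'r');  A2 = fread(fid, [n, n], 'double');  fclose(fid);
delete(fname);
```

With blk columns in memory at a time, the working set is about 8*n*blk bytes plus the temporary for 2*B, independent of the total file size.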

"Romeo " <romeo_aristogatto@yahoo.com> wrote in message <gr5k49$akq$1@fred.mathworks.com>...

Dear Romeo,

Are you familiar with Shakespeare's phrase in Henry IV, Part 2, "Don't shoot the messenger"? --- Anyway, on to your 5 points :

1.

"Well, even if you would not compute those operations you are near the limit of your physical ram just with matrix storage ... and that's not the best for ram performance."

The whole point of these simple tests was to see how much of the 16GB Matlab could use. Why pay for a machine with 16GB if the software cannot use it? Perhaps that should be: Why pay for the software if it cannot use your machine? Scilab 5.1 64-bit is even worse than Matlab -- it gives a memory allocation error at 5.6 GB.

I do not understand what you mean by "... not the best for ram performance."

2.

"Second, there are large possibilities that you don't need all that amount of data, because those that really need don't have these problems =)))"

Who says I don't need all that data? See these sites:

Tim Davis's site which has huge (sparse) matrices http://www.cise.ufl.edu/research/sparse/matrices/

One of the test problems for the 9th DIMACS Challenge on the implementation of Shortest Path algorithms is the full USA map which has 23,947,347 nodes and 58,333,344 arcs. That is a 24x10^6 x 24x10^6 adjacency matrix, but it is sparse, with about 2 to 3 non-zeros per row. This is typical of road networks. http://www.dis.uniroma1.it/~challenge9/download.shtml

I realise that this may sidetrack the discussion into sparse vs dense but the question is "What is the largest dense (or sparse) matrix that can be loaded into 16GB under Matlab", with room left over for a couple of n-vectors. Remember: most iterative linear equation solvers are of the form xnew = C*xold + d, which requires the storage of one matrix and three vectors and the only significant operation is an O(n^2) Blas 2 matrix-vector multiplication.

3. You say "in the C = operations(C) instruction you need a copy of C". Are you sure of that? Can anyone verify that statement? Can you explain these lines of code?

>> C = 2*C; % --- out-of-memory failure ---

>> C = C+C; % --- out-of-memory failure ---

>> for i = 1:n, for j = 1:n, C(i,j) = 2*C(i,j); end; end; % --- works ok ---

>> for i = 1:n, for j = 1:n, C(i,j) = C(i,j)+C(i,j); end; end; % --- works ok ---

4. "Third, if you really want to work with all those data try to load only some data in memory and keep the whole matrix in a file,"

Yes indeed. This is really what I was trying to do. You see the actual matrix I'm working on is 500GB and sits on a 1TB Maxtor. I was trying to load into memory a small 10GB (10/500 = 2%) part of it.

5. "... and last, you really need a double random number? Try to recast to something less expensive."

Oh dear! I've been here before. See James Tursa's and my posts on this subject here:

http://www.mathworks.com/matlabcentral/newsreader/view_thread/239186#637344

Of course, this whole question could be answered in a trice by a competent Mathworkser.

Regards,

Derek O'Connor.

"Derek O'Connor" <derekroconnor@eircom.net> wrote in message <gr6cci$k50$1@fred.mathworks.com>...

>

> 3. You say "in the C = operations(C) instruction you need a copy of C". Are you sure of that? Can anyone verify that statement? Can you explain these lines of code?

>

> >> C = 2*C; % --- out-of-memory failure ---

> >> C = C+C; % --- out-of-memory failure ---

> >>for i = 1:n, for j = 1:n, C(i,j) = 2*C(i,j);end;end; % --- works ok ---

> >>for i = 1:n, for j = 1:n, C(i,j) = C(i,j)+C(i,j);end;end; % --- works ok ---

>

MATLAB first creates the result of 2*C. This requires a matrix the same size of C, which is what the phrase "copy of C" actually meant. So that doubles the amount of storage needed, at least temporarily. Then, had it worked, the original C memory is freed and C is set to this new result. That is why your double loops do not run out of memory ... they do the operation in place. Same comments for the C+C equation and loop. The fact that your 2*C operation runs out of memory whereas your loops do not should be proof enough that a temp matrix the same size as C is used.
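
The arithmetic behind this can be sketched using only the figures quoted in the original post. The one assumption, marked in the comments, is that 2*C needs exactly one temporary the size of C; the 2.6 GB overhead is the "OS + ?" figure reported from Task Manager.

```matlab
% Back-of-envelope peak memory for C = 2*C, using figures quoted in this thread.
n           = 2.5e4;                  % matrix order from the original post
bytesPerDbl = 8;                      % a double occupies 8 bytes
matGB       = bytesPerDbl*n^2/1e9;    % 5 GB per n-by-n matrix
residentGB  = 2*matGB;                % C and D are both held in memory
overheadGB  = 2.6;                    % "OS + ?" as reported by Task Manager
tempGB      = matGB;                  % ASSUMPTION: one temporary the size of C
peakGB      = residentGB + overheadGB + tempGB;
physGB      = 16381*2^20/1e9;         % physical RAM reported by the memory command
fprintf('estimated peak %.1f GB, physical RAM %.1f GB\n', peakGB, physGB);
```

On these numbers the estimated peak is above the physical RAM, which is consistent with the out-of-memory error, although it does not by itself prove how MATLAB allocates the result.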

If you really need to do these simple operations on a very large matrix that is not sharing memory with another variable, you can do them in place in a MEX routine, e.g.

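#include "mex.h"

/* mult2(a): multiply every stored element of a double array by 2, in place.
   Caution: this writes directly into the input array, so it is only safe
   when that array does not share memory with another variable. */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize i, numel;
    double *pr;

    /* Exactly one input, and no output because the input is modified. */
    if( nrhs != 1 || nlhs != 0 ) {
        mexErrMsgTxt("Need one input and no outputs.");
    }
    if( !mxIsDouble(prhs[0]) ) {
        mexErrMsgTxt("Input must be double.");
    }

    if( mxIsSparse(prhs[0]) ) {
        /* Sparse: Jc[N] is the number of stored (non-zero) elements. */
        numel = *(mxGetJc(prhs[0]) + mxGetN(prhs[0]));
    } else {
        /* Full: every element is stored. */
        numel = mxGetNumberOfElements(prhs[0]);
    }

    /* Scale the real data in place; no temporary array is allocated. */
    pr = mxGetPr(prhs[0]);
    for( i=0; i<numel; i++ ) {
        pr[i] *= 2.0;
    }
}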

To build it, save the file as mult2.c and then do this:

>> mex mult2.c

Here is a timing test of the three methods:

>> a=rand(5000);

>> tic;2*a;toc

Elapsed time is 0.458821 seconds.

>> tic;for k=1:5000;for m=1:5000; a(k,m) = 2*a(k,m);end;end;toc

Elapsed time is 43.492720 seconds.

>> tic;mult2(a);toc

Elapsed time is 0.134097 seconds.

> 5. "... and last, you really need a double random number? Try to recast to something less expensive."

>

> Oh dear! I've been here before. See James Tursa's and my posts on this subject here:

> http://www.mathworks.com/matlabcentral/newsreader/view_thread/239186#637344

>

Based on that thread, I have been experimenting with a sparse int8 class that I have created. So far it just does simple things like plus, minus, and times. I am slowly working on adding more capability for potential upload to the FEX someday. Would this be of interest to you? If so, what operations would you need/want for a sparse int8 class?

James Tursa

"James Tursa" <aclassyguywithaknotac@hotmail.com> wrote in message <gr76gq$5ap$1@fred.mathworks.com>...

============================================

Dear James,

I'll look at your mex functions later but respond to the first part of

your post here.

You say :

"MATLAB first creates the result of 2*C. This requires a matrix the

same size of C, which is what the phrase "copy of C" actually meant.

So that doubles the amount of storage needed, at least temporarily."

Your statement seems to be verified by this result :

>> clear all; n = 3.0*10^4; C = 2*rand(n,n);

??? Out of memory. Type HELP MEMORY for your options.

But it seems to be contradicted by this result :

>> clear all; n = 3.0*10^4; C = rand(n,n); C = C+C;

>> clear all; n = 3.0*10^4; C = rand(n,n); C = 2*C;

>> clear all; n = 4.2*10^4; C = rand(n,n); C = 2*C;

>> memory

Maximum possible array: 214 MB (2.249e+008 bytes) *

Memory available for all arrays: 214 MB (2.249e+008 bytes) *

Memory used by MATLAB: 13837 MB (1.451e+010 bytes)

Physical Memory (RAM): 16381 MB (1.718e+010 bytes)

WTM (Windows Task Manager) reports 15.5 GB (96%) physical memory in use.

It is very important when problem sizes are close to the memory limit that

no unnecessary temporaries are generated. I wrote the small function below

to show that you can solve problems that are close to the memory limit and

that use very little extra space above that for the problem.

You can see that it solved a 42,000 x 42,000 dense linear equations problem

in core using 96% of the physical memory. Notice that it uses the two-step

trick shown above : C = rand(n,n); C = C/n; That is bad.

Programmers should not use tricks.
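
One way to avoid depending on whether C = C/n allocates a full-size temporary is to scale one column at a time, so that any temporary is at most n-by-1. The sketch below uses a small illustrative n; it assumes that indexed assignment into C does not itself trigger a full copy, which should hold when C is not shared with another variable. Cref is built only to check the result and would be omitted in a memory-critical run.

```matlab
% Sketch: scale C by 1/n column by column so that temporaries stay n-by-1.
n    = 500;                  % illustrative size, far below the sizes in this thread
C    = rand(n, n);
Cref = C/n;                  % reference only; this line does allocate a second matrix
for j = 1:n
    C(:, j) = C(:, j)/n;     % right-hand side is a single column
end
maxdiff = max(abs(C(:) - Cref(:)));   % expected to be 0 or within rounding
```

This is slower than a single vectorised statement but much faster than the element-by-element double loop, since each iteration is still a vector operation.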

Regards,

Derek O'Connor

>-------------------------------------------------------------------------------

function [C,d,xsol,xnew,res] = TestFxPtIter(n,tol,maxits);

%

% Generate nxn matrix: C, nx1 vectors: d,xsol, to test

% Iteration x(k+1) = Cx(k)+d : converges if ||C|| < 1;

% Set tol = 0 for full prec

% USE : [C,d,xsol,xnew,res] = TestFxPtIter(10^4,0,100);

%

% Derek O'Connor, 4 April 2009. derekroconnor@eircom.net

% --------- Generate the problem data C and d -----------------

C = rand(n,n); % Matlab gives out-of-mem error

C = C/n; % if C = rand(n,n)/n is used.

xsol = ones(n,1); % any given solution will do

d = xsol-C*xsol; %

xold = zeros(n,1); % In theory any starting value will do

%

% --------- Start of iterations -------------------------------

tic;

converged = false; k = 0;

while ~converged && k < maxits % at most maxits iterations

k = k+1;

xnew = C*xold+d; % O(n^2) work per iteration

nreldelx = norm(xnew-xold,inf)/norm(xold,inf);

converged = nreldelx <= tol+eps;

xold = xnew;

end;

res = [n k nreldelx toc];

%-------------------- end of TestFxPtIter ---------------------

% clear all;

% [C,d,xsol,xnew,res] = TestFxPtIter(4.2*10^4,10^(-7),100);

% res =

% n = 42000 iters = 24 tol = 6.0267e-008 time = 118.36 secs

% 2 x Quad Xeon 5345s @ 2.33GHz, 16GB ram. Matlab 7.6, Vista 64 bit.

% WTM : 15.5 GB, 96% Phys Mem used in 87 processes

yes Derek I'm sorry, my last 5 minutes of attention I give you. I promise
