Error in loading a very large matlab file

Christina on 23 Jul 2012
I need to load a 14.6 GB .mat file. However, the maximum possible array that MATLAB can create is 13.8 GB.
Also, the total memory used by MATLAB is 480 MB and the physical memory (RAM) is 8 GB. I cannot reduce the size of my matrix either.
When I try to load this file, after an hour I get the error: "Error in load 'filename.mat'".
Any ideas??

Answers (4)

Titus Edelhofer on 23 Jul 2012
Hi Christina,
Does this .mat file contain several variables or one large matrix? In the latter case, I guess you will have no option other than using a bigger machine. In the former case: did you try to load the entire file or only some variables? You might try to load, e.g., a single variable A with
aStruct = load('yourFBigFile.mat', 'A');
Titus
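As a minimal sketch of this suggestion: the variables in a MAT-file can be listed with whos before anything is loaded, and load can then be asked for one variable by name. The file name demoBigFile.mat and the variables A and B are stand-ins created here only for illustration; they are not Christina's data.

```matlab
% Create a small demo file (a stand-in for the real file; the name is hypothetical).
demoFile = fullfile(tempdir, 'demoBigFile.mat');
A = rand(100, 3);
B = magic(4);
save(demoFile, 'A', 'B', '-v7.3');

% List the variables and their sizes without loading any data.
info = whos('-file', demoFile);
for k = 1:numel(info)
    fprintf('%s: %s, %d bytes\n', info(k).name, mat2str(info(k).size), info(k).bytes);
end

% Load only variable A; the result is a struct with a single field A.
aStruct = load(demoFile, 'A');
A_loaded = aStruct.A;
```

If whos reports a single variable that is larger than the available RAM, loading by name does not help, and a partial read is needed instead.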
  7 Comments
per isakson on 23 Jul 2012
Edited: per isakson on 27 Jul 2012
It looks like matfile (R2012a) is cheating; the entire variable is loaded:
function output = inefficientPartialLoad(obj, indexingStruct, varName)
warning(message('MATLAB:MatFile:OlderFormat', obj.Properties.Source, varName));
output = loadEntireVariable(obj, varName);
output = builtin('subsref', output, indexingStruct(2:end));
end
The name of the function suggests that TMW plans to improve it.
I stand corrected regarding partial loading of version 7.3 MAT-files.
Titus Edelhofer on 24 Jul 2012
Hi Per, the function is only "cheating" on "older" MAT-files. From the documentation:
matfile only supports partial loading and saving for MAT-files in Version 7.3 format (described in MAT-File Versions). If you index into a variable in a Version 7 (the current default) or earlier MAT-file, MATLAB warns and temporarily loads the entire contents of the variable.
Since Christina's file is version 7.3 (otherwise it could not be that large), a partial read should indeed do what it promises, namely read only part of the variable.
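A minimal sketch of such a partial read, assuming a version 7.3 file and a plain numeric matrix. The file name demoPartial.mat and the variable X are hypothetical and are created here only so that the example is self-contained.

```matlab
% Create a small version 7.3 demo file (stand-in for the real data).
demoFile = fullfile(tempdir, 'demoPartial.mat');
X = reshape(1:2000, 200, 10);
save(demoFile, 'X', '-v7.3');

% Open the file without loading any variable into memory.
m = matfile(demoFile);

% Query the size of X from the file, again without loading it.
[nRows, nCols] = size(m, 'X');

% Read only rows 1 to 50 of column 3 from disk.
block = m.X(1:50, 3);
```

For a matrix that does not fit in RAM, the same indexing can be put in a loop that processes one block of rows or columns at a time.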



Jan on 23 Jul 2012
Edited: Jan on 23 Jul 2012
Loading a 14 GB file into 8 GB of RAM will be slow, because the system has to swap to disk excessively. Out-of-memory problems are very likely. Therefore any clean and stable solution must include installing more RAM: 16 GB would be a start, 32 GB would be better. If this seems expensive, estimate the number of hours you will spend waiting for your machine to crash, double that number, and subtract it from your lifetime.

per isakson on 23 Jul 2012
Edited: per isakson on 27 Jul 2012
Version 7.3 MAT-files use an HDF5-based format. I think one can read them piecewise with the HDF5 functions of MATLAB. I ran a tiny experiment with a few 1x1 double variables. The functions h5info and h5disp provide a start.
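A minimal sketch of reading a version 7.3 MAT-file with the high-level HDF5 functions, assuming a plain numeric variable saved at the top level of the file. The file name demoHdf5.mat and the variable X are hypothetical; cell arrays and structures are stored in a more involved layout that this sketch does not cover.

```matlab
% Create a small version 7.3 demo file (stand-in for the real data).
demoFile = fullfile(tempdir, 'demoHdf5.mat');
X = reshape(1:2000, 200, 10);
save(demoFile, 'X', '-v7.3');

% Inspect the file: top-level numeric variables show up as HDF5 datasets.
info = h5info(demoFile);
names = {info.Datasets.Name};

% Print the metadata of the dataset that holds X.
h5disp(demoFile, '/X');

% Read a hyperslab: start at row 1, column 3, and read 50 rows by 1 column.
col = h5read(demoFile, '/X', [1 3], [50 1]);
```

The start and count arguments of h5read select the part of the dataset to read, so only that part has to fit in memory.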
--- Answer to [Answer by Christina on 24 Jul 2012 at 11:41] ---
Now I know a little more, so I'll try again to answer your question.
  • With 64-bit R2012a there are no practical limits regarding the address space.
  • R2012a can save to several different MAT-file formats, e.g. v7 and v7.3. Only v7.3 is based on HDF5.
  • AFAIK: By default R2012a saves to v7, not v7.3. There is "no" upper limit to the size of v7 MAT-files. The limits mentioned above apply to data items, e.g. double arrays.
  • With matfile it is possible to make partial reads from mat-files v7.3. However, the indexing functionality is limited; "complex" indexing is not supported. Thus, read the indexing part of the documentation on matfile carefully.
  • With the HDF5-support it is possible to read any piece of data from mat-files v7.3, e.g. a column of a data array that is the value of a field of a nested structure. However, it might be a bit complicated.
Thus, make sure that
  1. the MAT-file is v7.3,
  2. you use the 64-bit version of R2012a, and
  3. matfile supports the indexing of the variables you need
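The first two items of this checklist can be checked programmatically. The following is a rough sketch: it relies on the text header at the start of a MAT-file, which begins with "MATLAB 7.3 MAT-file" for version 7.3 files, and on the architecture string returned by computer('arch'). The file name demoVersion.mat is hypothetical.

```matlab
% Create a small version 7.3 demo file (stand-in for the real data).
demoFile = fullfile(tempdir, 'demoVersion.mat');
X = 1:10;
save(demoFile, 'X', '-v7.3');

% Read the descriptive text at the start of the MAT-file header.
fid = fopen(demoFile, 'r');
header = fread(fid, 116, 'uint8=>char')';
fclose(fid);

% Version 7.3 files start with 'MATLAB 7.3 MAT-file' (19 characters).
isV73 = strncmp(header, 'MATLAB 7.3 MAT-file', 19);

% On 64-bit MATLAB the architecture string ends in '64', e.g. 'win64'.
arch = computer('arch');
is64bit = ~isempty(regexp(arch, '64$', 'once'));

fprintf('v7.3 file: %d, 64-bit MATLAB: %d\n', isV73, is64bit);
```

The third item still has to be checked against the matfile documentation, because it depends on how the variables are structured.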

Christina on 24 Jul 2012
Thanks everyone for the replies.
I now have MATLAB R2012a. I tried to work with the 'matfile' function, but it does not help in my case, as the matrix contains multiple variables.
The problem with my dataset is its structure. It is stored as a cell array d, where each cell d{1,i} is a structure with fields such as d{1,i}.variable1, and i is the sample index (the full dataset has 10000 samples; for now I am testing with 5 samples to save time).
When I try, for example: load 'matrixname.mat', d{1,2}.variable1 , I get an empty matrix as a result.
Any ideas about that?
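One point that may explain the result: load selects variables by name and does not accept indexing expressions such as d{1,2}.variable1. A possible workaround, which was not proposed in the thread and assumes the data can be saved again, is to store every sample as its own top-level variable so that load can pick one by name. The names sample_00001, sample_00002, ... and the file demoSamples.mat are hypothetical.

```matlab
% Small stand-in for the real data: a 1x5 cell array of structures.
d = cell(1, 5);
for i = 1:5
    d{1, i}.variable1 = i * ones(3, 2);
end

% Collect the samples in a struct with one field per sample.
s = struct();
for i = 1:numel(d)
    s.(sprintf('sample_%05d', i)) = d{1, i};
end

% Save each field of s as a separate variable in a version 7.3 file.
demoFile = fullfile(tempdir, 'demoSamples.mat');
save(demoFile, '-struct', 's', '-v7.3');

% Load a single sample by name and access its field.
out = load(demoFile, 'sample_00002');
v = out.sample_00002.variable1;
```

With this layout, each call to load only has to hold one sample in memory, which is the single-variable approach suggested earlier in the thread.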
