Better memory-mapped files in Matlab

version 1.31.0.0 (33.6 KB) by Dylan Muir
A better, transparent memmapfile, with complex number support.

6 Downloads

Updated 18 May 2020

See also http://dylan-muir.com/articles/mapped_tensor/
If this function is useful to your academic work, please cite the publication in lieu of thanks:
Muir and Kampa, 2015. "FocusStack and StimServer: A new open source MATLAB toolchain for visual stimulation and analysis of two-photon calcium neuronal imaging data". Frontiers in Neuroinformatics.

This class transparently maps large tensors of arbitrary dimensions to temporary files on disk. Referencing is identical to a standard MATLAB tensor, so a MappedTensor can be passed into functions without requiring that the function be written specifically to use MappedTensors. This is in contrast to memmapfile objects, which cannot be used in such a way. Using a MappedTensor as an argument requires that the tensor is indexed inside the function (as opposed to using the object with no indices). This implies that a function using a MappedTensor must not be fully vectorised, but must operate on the mapped tensor in segments inside a for loop.
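As a rough illustration of the "indexed inside a for loop" requirement, here is a minimal sketch of a function that only ever touches one slice at a time. The name `slice_means` is hypothetical and not part of MappedTensor; the sketch assumes that `size` and slice indexing behave on a MappedTensor as they do on a standard array.

```matlab
function vfMeans = slice_means(tData)
% SLICE_MEANS  Mean of each slice along the third dimension.
% Hypothetical helper, not part of MappedTensor. tData may be an
% ordinary array; a MappedTensor is assumed to index the same way.

nSlices = size(tData, 3);
vfMeans = zeros(nSlices, 1);

for iSlice = 1:nSlices
    % Index explicitly, so only this slice is held in memory
    mfSlice = tData(:, :, iSlice);
    vfMeans(iSlice) = mean(mfSlice(:));
end
end
```

The same call works on an ordinary array, for example `slice_means(rand(4, 4, 3))`, which is what makes this style of function usable with either kind of input.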

MappedTensor also offers support for basic operations such as permute and sum, without requiring space for the tensor to be allocated in memory. memmapfile sometimes runs out of virtual addressing space, even if the data is stored only on disk. MappedTensor does not suffer from this problem.

Functions that work on every element of a tensor, with an output the same size as the input tensor, can be applied to a MappedTensor without requiring the entire tensor to be allocated in memory. This is done with the convenience function "SliceFunction".
An existing binary file can also be mapped, similarly to memmapfile. However, memmapfile offers more flexibility in terms of file format. MappedTensors transparently support complex numbers, which is an advantage over memmapfile.

Example:
mtVar = MappedTensor(500, 500, 1000, 'Class', 'single');
% A new tensor is created, 500x500x1000 of class 'single'.
% A temporary file is generated on disk to contain the data for this tensor.

for (i = 1:1000)
   mtVar(:, :, i) = rand(500, 500);
   mtVar(:, :, i) = abs(fft(mtVar(:, :, i)));
end

mtVar = mtVar';

mtVar(3874)

mtVar(:, 1, 1)

mfSum = sum(mtVar, 3);
% The sum is performed without allocating space for mtVar in
% memory.

mtVar2 = SliceFunction(mtVar, @(m)(fft2(m)), 3);
% 'fft2' will be applied to each Z-slice of mtVar
% in turn, with the result returned in the newly-created
% MappedTensor mtVar2.

clear mtVar mtVar2
% The temporary files are removed

mtVar = MappedTensor('DataDump.bin', 500, 500, 1000);
% The file 'DataDump.bin' is mapped to mtVar.

SliceFunction(mtVar, @()(randn(500, 500)), 3);
% "Slice assignment" is supported, by using "generator" functions that accept no arguments. The assignment occurs while only allocating space for a single tensor slice in memory.

mtVar = -mtVar;
mtVar = 5 + mtVar;
mtVar = 5 - mtVar;
mtVar = 12 .* mtVar;
mtVar = mtVar / 5;
% Unary and binary mathematical operations are supported, as long as they are performed with a scalar. Multiplication, division and negation take O(1) time; addition and subtraction take O(N) time.

Cite As

Dylan Muir (2020). Better memory-mapped files in Matlab (https://github.com/DylanMuir/MappedTensor), GitHub. Retrieved .

Comments and Ratings (11)

Dylan,
Sorry for my previous rude comment. Actually I could do some tricks and it really works.
So I still have some questions, I wrote an Issue on GitHub, here is a link:
https://github.com/DylanMuir/MappedTensor/issues/16

Dylan Muir

Kerim.
Apologies for the `subsasgn` error; an explicit output was not required in previous versions of Matlab.
As I'm sure you're aware, cross-platform C development is non-trivial, especially if you don't have access to every platform that someone might use. The mex files compile without warning on OS X, and others have compiled with no problems on Windows. If you create an issue on GitHub, I can attempt to help you with your compilation problem.
You will of course have seen the extensive examples immediately above, as well as included in the help text. If you have suggestions for additional examples, or questions that were not answered, then please create an issue on GitHub.
Finally, if in future you would like the author of free software to assist you, I suggest that you write more politely.

Sorry, but as for me, this is one of the most useful functions, but in my R2016b Matlab the file "MappedTensor.m" opened in the editor has three identical errors highlighted in red: Line 1662 (1668,268): output SUBSASGN must be assigned to a variable
Two of the three mex files were successfully compiled with MinGW; the exception was "mapped_tensor_shim.c". I think there is a simple error.
And the last thing is that the description of how to use it is very, very bad. You could give some examples of calling such a class.
It is very annoying puzzling out your code for two days. Does it really work on other machines?

Peter Cook

fvff

Dylan Muir

Hi Avinash,

MappedTensor can't handle MATLAB-format (.mat) data files, which can be compressed. MappedTensor can only handle uncompressed, unpacked binary data formats. If you write your tensor out to disk in a binary format using fwrite then MappedTensor should work just fine.

Let me know if you have any other questions.
Dylan.
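A minimal sketch of the binary export described in this reply, using `fwrite` for the raw write (`fprintf` would produce formatted text instead). The sizes, variable names and temporary filename are illustrative assumptions. The final MappedTensor call is left as a comment: it combines the filename form and the 'Class' option from the examples in the description, and that combination is an assumption here.

```matlab
% Stand-in data; the sizes are illustrative only
M = 4; N = 3; T = 2;
tData = reshape(1:(M * N * T), [M N T]);

% Write the raw values in column-major order: no header, no compression
strFile = [tempname '.bin'];     % hypothetical temporary filename
fid = fopen(strFile, 'w');
assert(fid ~= -1, 'Could not open file for writing');
fwrite(fid, tData, 'double');
fclose(fid);

% Read the file back with plain fread to confirm the layout
fid = fopen(strFile, 'r');
tCheck = reshape(fread(fid, M * N * T, 'double'), [M N T]);
fclose(fid);

% With MappedTensor on the path, the same file could then be mapped
% instead of deleted (assumed call form, see the note above):
%   mtData = MappedTensor(strFile, M, N, T, 'Class', 'double');

delete(strFile);
```

A file produced by `save` is not laid out this way, which is consistent with slices appearing shifted or scrambled when a .mat file is mapped directly.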

I've run into a problem with the use of MappedTensor that I can't quite figure out. I have a large tensor matrix in matlab of size M x N x T. I saved this to the path fPath as
save(fPath,'IM.mat');
Then, I try to access this using MappedTensor as follows
MT = MappedTensor(fPath,[M N T]);
However, when I look at any frame in MT as
imagesc(MT(:,:,10)), axis image
The image appears circularly shifted along the 1st dimension. I am not sure why this is happening. Anyone have any ideas?
Thanks!

Wouter

Just what I needed in order to process huge (18GB) files in a memory map fashion without the excessive virtual memory usage from matlab's builtin memmapfile functionality.

Dylan Muir

Hi Holger,
Thanks a lot for your bug report.
I converted the single-line // comments to /*-style comments. I also fixed the htobe16 issue. However, I can't replicate the "fopen" error, and can't see what the problem would be with the line. Could you please try the updated code and let me know?
Dylan

nightrome

I found various bugs in the code:
- Line 1590 in mapped_tensor_shim_nomex: One or more output arguments not assigned during call to "fopen".

- In the mex files: The //-style comments are not allowed in ANSI C-code, which is why I had to specify that as a compiler flag (would have been good to know).

- mapped_tensor_shim doesn't compile at all, since htobe16 passed 2 arguments, but takes just 1.

Is it possible that you rework the current for Matlab 2014?

Updates

1.31.0.0

Updated description

1.31.0.0

Updated formatting

1.30.0.0

Added "neuroscience" tag

1.30.0.0

Moved to github hosting

1.29.0.0

Improved referencing for edge cases.

1.28.0.0

Improved referencing to make it more similar to matlab tensors. Fixed a referencing bug involving a confusion between "58" and ":".

1.27.0.0

Updated usage notes.

1.26.0.0

Added paper reference.

1.25.0.0

Indexing improvement

1.24.0.0

Accelerated reading of data, especially when accessing chunks of data in sequential order.

1.23.0.0

Updated description

1.22.0.0

Fixed bug where call to fopen failed

1.21.0.0

Removed compiled MEX files from archive

1.20.0.0

Fixed several mex compilation bugs.

1.19.0.0

Updated summary

1.18.0.0

Updated description

1.17.0.0

Updated description

1.16.0.0

Fixed a regression, such that SliceFunction no longer worked. Thanks to Stanislas Rapacchi for the bug report.

1.15.0.0

Minor bug fixes

1.13.0.0

MappedTensor now uses mex-accelerated internal functions, if possible. MappedTensor is now much faster.

1.11.0.0

Fixed a referencing bug, where repeated indices and multi-dimensional indices were not referenced correctly on reads.

1.10.0.0

Accelerated SliceFunction; SliceFunction now provides a slice index argument; better error reporting when too many dimensions were used for indexing; SliceFunction now provides feedback during operation

1.9.0.0

MappedTensor now does not rely internally on memmapfile, but performs optimised direct binary file reads. It is now much faster than memmapfile, for some tasks. You can now specify a header offset to skip, when mapping an existing file.

1.8.0.0

Updated description

1.7.0.0

Updated image

1.5.0.0

Added support for unary uplus, uminus; binary plus, minus, times, mtimes, m/l/r/divide (all with a scalar).

1.4.0.0

Fixed a bug in linear indexing of a permuted tensor; added support for slice assignment; added support for complex values.

1.3.0.0

Added support for "sum"; added SliceFunction.

1.2.0.0

Added a brief example, more details of restrictions.

MATLAB Release Compatibility
Created with R2009a
Compatible with any release
Platform Compatibility
Windows macOS Linux