MATLAB Digest - March 2004
Using Memory Mapped Files for Fast Data Transfer
by Tom Gaudette
Sometimes it is necessary to move large amounts of data into MATLAB for further processing and then move it back into another application. The MATLAB engine interface supports this exchange of data and works well in most cases, but it can become the bottleneck for large data transfers.
This article describes an alternative method for moving large data sets between MATLAB and other applications. This method uses a memory mapped file instead of the engine interface to gain some speed.
An example of an application that requires fast data exchange is the XDEV capability in the LeCroy Wavemaster oscilloscope. The Windows-based oscilloscopes allow users to define a MATH channel on their scopes and apply MATLAB functions to the streaming data. The oscilloscope captures the data and transfers it into the MATLAB environment. The user can then apply any operations they wish to the data using MATLAB syntax. The output of their analysis is then transferred back into the oscilloscope for display.
Transferring large data sets such as this can be challenging using the engine interface. The engine interface on the Windows platform uses the COM standard, which requires users to wrap the data from their application into a form that COM can marshal to MATLAB. MATLAB then unwraps the data from the COM data type and converts it into the MATLAB data type. This marshaling and transferring of data is necessary for isolating the two applications, but becomes a bottleneck when working with large data sets.
Memory mapped files offer significant speed improvements for transferring large data sets without the need for marshaling the data as with the COM interface.
Memory Mapped Files
A memory mapped file is a file whose contents the operating system maps into a process's address space, so that a program reads and writes it as ordinary memory. To the user, these files are easy to work with because the operating system handles all of the interactions with them. For example, several programs can access the same memory mapped file simultaneously, and the operating system takes care of sharing the data between the applications.
Unlike an ordinary file, which you read and write in MATLAB with functions such as fscanf and fprintf, a memory mapped file behaves like a data array in memory. You can use memory copy commands or pointer manipulation to access different sections of the data, which simplifies how you use the data in your program: you point to the data and act on it as if it were just another variable. Also, because the data is in memory, access is faster than sharing through an ordinary file or marshaling through the COM interface.
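As a minimal sketch of what "behaves like a data array" means, the C++ fragment below sums and copies one section of a region using only pointer arithmetic and memcpy. The region is an ordinary array standing in for a mapped view, and the names sumSection and copySection are hypothetical; they are not part of the article's project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sum `count` doubles starting at element `first` of a region.
// `base` stands in for the pointer that a mapped view would provide.
double sumSection(const double* base, std::size_t first, std::size_t count)
{
    const double* p = base + first;   // pointer arithmetic selects the section
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        total += p[i];
    return total;
}

// Copy the same kind of section out of the region with a single memcpy.
void copySection(const double* base, std::size_t first, std::size_t count,
                 double* dst)
{
    assert(dst != nullptr);
    std::memcpy(dst, base + first, count * sizeof(double));
}
```

Neither function needs to know whether the pointer refers to a local array or to shared memory, which is the property the rest of the article relies on.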
Components of Using Memory Mapped Files
Figure 1 shows a pictorial representation of the different components needed for using a memory mapped file.

Figure 1. A diagram of the different components in our example.
Each of these components will be discussed further:
- The simple MEX file called from MATLAB
- A helper object used to transfer data to and from the memory mapped file
- User applications that use the helper object
- MATLAB calls into a simple MEX file
Simple MEX File
A MEX file is a DLL that exports a standard entry point that MATLAB knows how to call. (Please refer to the external API guide for a full explanation.)
The code below shows this standard entry point, labeled mexFunction. This entry point is the function that MATLAB executes when a user in MATLAB invokes the MEX function. It receives four arguments:
- Number of left hand side arguments, nlhs
- Pointer to the list of variables on the left hand side, plhs[]
- Number of right hand side arguments, nrhs
- Pointer to the list of variables on the right hand side, prhs[]
The left hand side and right hand side are the parts before and after the equal sign, as in a math equation. For example, if you type a=sum([1:10]); in MATLAB, this translates to one left hand side variable called a and one right hand side variable [1:10].
// ******************************************************************
// MEX Interface.
// ******************************************************************
// MATLAB MAIN entry point.
extern "C"
void mexFunction( int nlhs, mxArray* plhs[],
                  int nrhs, const mxArray* prhs[])
{
    // *************************************************************
    // Register the MEX exit function first time through the MEX file.
    // Set up a persistent mxArray that we will use for passing data.
    // *************************************************************
    if (initFlag)
    {
        mexAtExit(mexExitFunction);
        persistent_array_ptr = mxCreateDoubleMatrix(1,1,mxREAL);
        mexMakeArrayPersistent(persistent_array_ptr);
        memory_holder = mxGetPr(persistent_array_ptr);
        initFlag = false;
    }
    // Error checking removed for clarity of code
Section 1: MEX file: The main entry point and housekeeping
The MEX code shown is broken into sections. Section 1 does some housekeeping and is executed only the first time the MEX file is called. The next section does error checking on the passed parameters. It is removed here for clarity, but it is in the project code described at the end of the article.
    // ***************************************************************
    // Actual work of the MEX file.
    // ***************************************************************
    if (nrhs==1)
    {   // User wants to put data into the memory mapped file.
        hObj->putData(mxGetPr(prhs[0]),
                      mxGetNumberOfElements(prhs[0]));
    }
    else // Or user wants to get data from the memory mapped file.
    {
        plhs[0]=persistent_array_ptr;
        // Notice that we pass back a pointer to the mem-map
        // and we do not copy the data out of the pointer
        mxSetPr(plhs[0],hObj->getPointer());
        // Tell the mxArray the size of the data in the mem-map
        mxSetN(plhs[0],hObj->getLength());
    }
}
Section 2: MEX file: The actual work done by the MEX file
Section 2 is where we either put data into the memory mapped file or retrieve data from it. We assume that if the MEX file is called with a parameter, that parameter is to be stored, so we send our helper object a pointer to the real array of data and the length of the data array. If no parameter is passed in, we assume that we should return all the data stored in the memory mapped file. In that case we swap in the pointer to the memory mapped array and the length of the array, filling in the appropriate fields of the mxArray being returned to MATLAB. One key step to notice is that we are not doing a memory copy, but only moving pointers. This works because we have told MATLAB that this variable is persistent, which forces MATLAB not to move it or change it in any way.
Helper Object Code
To keep the code that sets and gets the data in one central place, we wrote a helper object. This object's only purpose is to deal with the memory map. Its code is used both in the MEX file shown above and in the user application code that communicates with the memory mapped file.
MMFile::MMFile(void)
{
    // Create memory mapped file
    hMMFile = CreateFileMapping (INVALID_HANDLE_VALUE, // back the mapping with the system paging file
        NULL,            // default security
        PAGE_READWRITE,  // read/write permission
        0,               // max object size (high-order DWORD)
        0x01400000,      // max object size (low-order DWORD): 20 MB
        szMapFileName);  // name of mapping object
    // Map a view of this file for us to use in this process.
    PointerLength = MapViewOfFile (hMMFile, // handle to mapped object
        FILE_MAP_ALL_ACCESS, // read/write access
        0,               // file offset (high-order DWORD)
        0,               // file offset (low-order DWORD)
        0);              // number of bytes to map: 0 maps the whole object
    // Save pointer to the data, which starts just past the length field
    PointerData = (LPVOID) ((long long)PointerLength+sizeof(long));
}
Section 3: Helper object: create method
In creating the object we first create the memory mapped file with the CreateFileMapping function. The key for two applications to share data is to have both applications open the same mapping. The parameter szMapFileName seen in the code above is the name of the mapping object. Because we defined this name in the file that both applications use, all our projects will be using the same memory mapped file.
Once we have a handle to the memory mapped file, we need to map the file into the current process's memory space with the MapViewOfFile function. When the helper object lives in a DLL, a new helper object is created each time a process attaches to the DLL, so the mapping is done on every load of the DLL. When the object's code is used directly in a project, each creation of the object gets a handle to the memory mapped file and maps it into the current process's memory space.
The last step of the creation process is to get a pointer to the data array. Because a memory map looks like a plain array of bytes, we reserve the first sizeof(long) bytes to hold the length of the valid data and start the data array right after them. Now, when we put data into the array, we record how many data points are valid.
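The layout described above, a length field followed by the data, can be sketched in portable C++ without any Windows calls. The class below is a hypothetical stand-in (RegionSketch is not part of the article's project) that uses a plain byte buffer in place of the mapped view; it is meant only to illustrate the header-plus-data arrangement and the capacity arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a mapped view: a byte buffer laid out as
// [ long count ][ double, double, ... ].
class RegionSketch {
public:
    explicit RegionSketch(std::size_t totalBytes) : bytes_(totalBytes, 0)
    {
        assert(totalBytes >= sizeof(long));   // room for the length field
    }

    // Largest number of doubles that fit after the length field.
    std::size_t capacity() const
    {
        return (bytes_.size() - sizeof(long)) / sizeof(double);
    }

    // Number of valid doubles, read from the start of the region.
    long length() const
    {
        long n = 0;
        std::memcpy(&n, bytes_.data(), sizeof n);
        return n;
    }

    // Store `count` doubles after the length field, then record the count.
    bool put(const double* src, long count)
    {
        if (count < 0 || static_cast<std::size_t>(count) > capacity())
            return false;                     // would run past the region
        std::memcpy(bytes_.data() + sizeof(long), src, count * sizeof(double));
        std::memcpy(bytes_.data(), &count, sizeof count);
        return true;
    }

    // Copy out at most `count` valid doubles; returns how many were copied.
    long get(double* dst, long count) const
    {
        long n = length();
        if (count < n) n = count;
        if (n < 0) n = 0;
        std::memcpy(dst, bytes_.data() + sizeof(long), n * sizeof(double));
        return n;
    }

private:
    std::vector<unsigned char> bytes_;
};
```

For a region of 0x01400000 bytes, the size used in the article, capacity() works out to 2,621,439 doubles whether long is 4 or 8 bytes wide.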
// Destructor
MMFile::~MMFile()
{
    UnmapViewOfFile (PointerLength);
    CloseHandle (hMMFile);
}
Section 4: Helper object: destructor method
During the destruction of the object we simply unmap our view of the data and close our handle to the memory mapped file. This allows the memory map to go away if no one else is using it, which frees the memory for the system to use.
// Put data into the file
void MMFile::putData(double * newData, long len)
{
    // Number of doubles that fit in the memory map after the length field
    long maxElements = (0x01400000-sizeof(long))/sizeof(double);
    if (len<=maxElements)
    {
        memcpy(PointerData,newData,len*sizeof(double));
        setLength(len);
    }
}
// Get data from the file
void MMFile::getData(long len, double *retData)
{
    long dataLength = getLength();
    if (len<=dataLength)
    {
        memcpy(retData,PointerData,len*sizeof(double));
    }
    else
    {   // Should throw an error here; for now return only the valid data.
        memcpy(retData,PointerData,dataLength*sizeof(double));
    }
}
// Get pointer to data in file
double* MMFile::getPointer(void)
{
    return (double*)PointerData;
}
Section 5: Helper object: sending and receiving data from the memory mapped file
The most important part of the helper object is the actual sending and receiving of the data. These functions are shown above. In the putData function we do a memory copy from the data passed in to the memory mapped file. In getData we also do a memory copy. But remember that in the MEX file we do not use getData; we use the getPointer function instead to move the data into MATLAB. This means no memory copy is needed to get data from the memory mapped file into MATLAB, which greatly increases the speed of moving the data in.
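The difference between the copying path and the pointer path can be illustrated with a small portable sketch. SharedSketch and pointerSeesLaterWrite are hypothetical names, and a std::vector stands in for the memory mapped region; the point is only that a copy is a snapshot, while a pointer keeps referring to the live data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the shared region: an array of doubles.
struct SharedSketch {
    std::vector<double> storage;
    explicit SharedSketch(std::size_t n) : storage(n, 0.0) {}

    // Copying access, in the spirit of getData: the caller owns a snapshot.
    void copyOut(double* dst, std::size_t n) const
    {
        assert(n <= storage.size());
        std::memcpy(dst, storage.data(), n * sizeof(double));
    }

    // Pointer access, in the spirit of getPointer: no bytes are moved.
    double* pointer() { return storage.data(); }
};

// Returns true when a write made after both accesses is visible through
// the pointer but not through the earlier copy.
bool pointerSeesLaterWrite()
{
    SharedSketch region(4);
    double snapshot[4];
    region.copyOut(snapshot, 4);       // costs one memcpy
    double* view = region.pointer();   // moves no data
    region.storage[2] = 42.0;          // a later writer updates the region
    return view[2] == 42.0 && snapshot[2] == 0.0;
}
```

The pointer path avoids the copy, but it also means the values seen through the pointer follow whatever is written to the region afterwards.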
How to Link from User Application Code
If you want to create a program that moves data into the memory mapped file so that MATLAB can take advantage of it, you only need to include the helper object in your project and call the appropriate function. The following section of code is a sample console application that puts ten numbers into the memory mapped file. If you have already called mmfile in MATLAB, the memory map already exists and this program replaces the data in it.
#include <tchar.h>
#include "../MMFileDemo/MMFile.h"
int _tmain(int argc, _TCHAR* argv[])
{
    double data[] = {0,1,2,3,4,5,6,7,8,9};
    MMFile *Obj = new MMFile();
    // Put the numbers 0-9 into the array.
    Obj->putData(data,10);
    // Delete the object; once we leave we will not have a
    // connection to the memory mapped file
    delete Obj;
    return 0;
}
Section 6: User application code: putting data into the memory mapped file
The following section of code will get ten numbers from the memory mapped file and display them.
#include <stdio.h>
#include <tchar.h>
#include "../MMFileDemo/MMFile.h"
int _tmain(int argc, _TCHAR* argv[])
{
    MMFile *Obj = new MMFile();
    double data[10];
    // Get the first 10 elements in the memory mapped file
    Obj->getData(10,&data[0]);
    // Display the numbers
    printf("{");
    for (int idx=0;idx<9;idx++)
        printf("%g, ",data[idx]);
    printf("%g}\n",data[9]);
    // Delete the Obj before we leave, which removes
    // any connection to the memory mapped file.
    delete Obj;
    return 0;
}
Section 7: User application code: getting data from the memory mapped file
Using MATLAB with the Memory Mapped File
The MEX file described in this article is available for download. To use it, call mmfile with a variable on the input if you want to pass data in, or with a variable on the output if you want to get data from the memory mapped file.
indata=mmfile; | Getting data from the memory mapped file and putting it into the variable
mmfile(outdata); | Putting data from the MATLAB variable into the memory mapped file
This lets you easily transfer data to and from the memory mapped file using standard MATLAB syntax.
After the mmfile function has been called once, the MEX file stays in memory, so the memory mapped file exists for other applications to use. When you call clear mex, the MEX file is unloaded from memory and the helper object is destroyed. If no other applications are connected to the memory mapped file at that point, the file is deleted.
The MATLAB Digest is the MathWorks electronic news bulletin for the MATLAB and Simulink community. To subscribe, become a MathWorks Account member.