Comments and Ratings

   
Date File Comment by Comment Rating
19 Nov 2009 Compression Routines Compress Matlab variables in the workspace. (supports cells, structs, matrices, strings, objects) Author: Jesse Hopkins Sebastiaan

Returning to the issue of compressing large matrices, I made the following patch. It chunks byteData into blocks of 5 MiB and writes the output to a structure, which also records the chunk size and the size of the uncompressed byte array.

The amount of heap space available for compression is rather unpredictable. Sometimes 10MiB blocks were too large. The command 'java.lang.Runtime.getRuntime.freeMemory' does not return a usable value either.
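One likely reason freeMemory is hard to use on its own: the Java Runtime reports free space only inside the heap that is currently allocated, while the heap may still grow up to the limit given by maxMemory. The sketch below assumes a MATLAB session with the JVM enabled and combines the three standard java.lang.Runtime calls into a rough upper bound. The variable names are illustrative, and the result is still only an estimate, since uncollected garbage counts as used.

```matlab
rt = java.lang.Runtime.getRuntime;
freeInHeap = rt.freeMemory;    % free bytes inside the currently allocated heap
allocated  = rt.totalMemory;   % bytes the JVM has claimed so far
heapLimit  = rt.maxMemory;     % ceiling the heap is allowed to grow to

% Rough upper bound on what could still be allocated (an estimate only)
roughAvail = heapLimit - (allocated - freeInHeap);
fprintf('Approx. %.1f MiB of Java heap may be available\n', roughAvail / 1024^2);
```

Even this figure is not a safe basis for choosing a block size, which is consistent with picking a conservative fixed chunk as the patch does.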

Using a structure adds 736 bytes of overhead compared to your current version. This could be reduced to 24 bytes if the block size and uncompressed size were stored in the byte array as well. However, I found the structure clearer, and the overhead is negligible for larger data, which cannot be compressed currently.

CompressLib.test shows that all compressions were successful.

The only things that cannot be compressed now are sparse matrices. (I have got something working, but I have no idea how to get it compiled on Windows, so I do not want to submit it to the FX now.)
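A possible workaround that needs no compilation, and is not the approach mentioned above, is to flatten the sparse matrix into ordinary full arrays before compressing and to rebuild it afterwards. The sketch below shows only the flatten and rebuild steps with built-in functions; the struct field names are hypothetical, and it assumes the resulting struct would then be handed to CompressLib.compress, which the file description says supports structs.

```matlab
S = sprand(1000, 1000, 0.01);      % example sparse matrix

% Flatten: row indices, column indices, nonzero values, and dimensions
[i, j, v] = find(S);
[m, n] = size(S);
parts = struct('i', i, 'j', j, 'v', v, 'm', m, 'n', n);

% "parts" contains only full arrays and scalars, so it could be compressed here.

% Rebuild the sparse matrix from the stored pieces
S2 = sparse(parts.i, parts.j, parts.v, parts.m, parts.n);
```

The rebuilt matrix has the same nonzero entries and dimensions as the original, although the allocated storage (nzmax) may differ.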

Many thanks for your work! It helped me solve a lot of memory issues.

Sebastiaan

Patch:
diff old/CompressionLib/CompressLib.m new/CompressionLib/CompressLib.m
43c43
< function out = decompress(byteArray)
---
> function out = decompress(compressedData)
46c46
< % out = CompressLib.decompress(byteArray)
---
> % out = CompressLib.decompress(compressedData)
48,49c48,49
< % Function will decompress "byteArray" (created by CompressLib.compress).
< % "byteArray" must be a 1-D array of bytes (uint8).
---
> % Function will decompress "compressedData" (created by CompressLib.compress).
> % "compressedData" must be a compression structure.
59,71c59,77
< if ~strcmpi(class(byteArray),'uint8') || ndims(byteArray) > 2 || min(size(byteArray) ~= 1)
< error('Input must be a 1-D array of uint8');
< end
<
< %------Decompress byte-array "byteArray" to "byteData" using java methods------
< a=java.io.ByteArrayInputStream(byteArray);
< b=java.util.zip.GZIPInputStream(a);
< isc = InterruptibleStreamCopier.getInterruptibleStreamCopier;
< c = java.io.ByteArrayOutputStream;
< isc.copyStream(b,c);
< byteData = typecast(c.toByteArray,'uint8');
< %----------------------------------------------------------------------
<
---
> if ~isstruct(compressedData) || ~isfield(compressedData, 'compressed') || ~isequal(compressedData.compressed, 'GZIP')
> error('Input must be a compression structure.');
> end
>
> % Reserve memory
> byteData = zeros(compressedData.UncompressedSize, 1, 'uint8');
>
> % Decompress data in chunks
> for Iter=1:length(compressedData.Data)
> %------Decompress byte-array "byteArray" to "byteData" using java methods------
> a=java.io.ByteArrayInputStream(compressedData.Data{Iter});
> b=java.util.zip.GZIPInputStream(a);
> isc = InterruptibleStreamCopier.getInterruptibleStreamCopier;
> c = java.io.ByteArrayOutputStream;
> isc.copyStream(b,c);
> byteData((Iter-1)*compressedData.BlockSize+1:min(Iter*compressedData.BlockSize, length(byteData))) = typecast(c.toByteArray,'uint8');
> %----------------------------------------------------------------------
> end
>
74d79
< end
76c81,83
< function byteArray = compress(in)
---
> end
>
> function compressedData = compress(in)
79c86
< % byteArray = CompressLib.compress(in)
---
> % compressedData = CompressLib.compress(in)
88c95
< % Outputs an array of type uint8. Use CompressLib.decomress to decompress
---
> % Outputs a compression structure. Use CompressLib.decompress to decompress
98,106c105,120
< %-------compress the array of bytes using java GZIP--------------------
< f=java.io.ByteArrayOutputStream();
< g=java.util.zip.GZIPOutputStream(f);
< g.write(byteData);
< g.close;
< byteArray=typecast(f.toByteArray,'uint8');
< f.close;
< %----------------------------------------------------------------------
< end
---
> % Compress data in chunks
> compressedData.compressed = 'GZIP';
> compressedData.BlockSize = 5*1024^2; % Make 5 MiB chunks
> compressedData.UncompressedSize = length(byteData);
> compressedData.Data = cell(ceil(compressedData.UncompressedSize/compressedData.BlockSize), 1);
> for Iter = 1:length(compressedData.Data)
> %-------compress the array of bytes using java GZIP--------------------
> f=java.io.ByteArrayOutputStream();
> g=java.util.zip.GZIPOutputStream(f);
> g.write(byteData((Iter-1)*compressedData.BlockSize+1:min(Iter*compressedData.BlockSize, compressedData.UncompressedSize)));
> g.close;
> compressedData.Data{Iter}=typecast(f.toByteArray,'uint8');
> f.close;
> %----------------------------------------------------------------------
> end
> end
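The block indexing used in the patch can be checked in isolation, without any Java calls. The sketch below is a toy version with a block size of 4 bytes instead of 5 MiB; the variable names are illustrative and no compression takes place. It only shows that the ranges (k-1)*blockSize+1 : min(k*blockSize, n) tile the byte array exactly once, including a shorter final block.

```matlab
blockSize = 4;                      % toy value; the patch uses 5*1024^2
data = uint8(1:10);                 % stand-in for byteData
n = length(data);

nBlocks = ceil(n / blockSize);
blocks = cell(nBlocks, 1);
for k = 1:nBlocks
    first = (k-1)*blockSize + 1;
    last  = min(k*blockSize, n);    % the last block may be shorter
    blocks{k} = data(first:last);   % the patch GZIPs this range instead
end

% Reassemble into a preallocated array, as the patched decompress does
rebuilt = zeros(1, n, 'uint8');
for k = 1:nBlocks
    rebuilt((k-1)*blockSize+1 : min(k*blockSize, n)) = blocks{k};
end
```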

06 Nov 2009 Compression Routines Compress Matlab variables in the workspace. (supports cells, structs, matrices, strings, objects) Author: Jesse Hopkins Sebastiaan

Well, I started thinking about that yesterday: chopping my matrix into smaller blocks, by octree indexing or maybe just a simple block approach, and then compressing those smaller blocks, since I know that the data is clustered together and large blocks are all zeros.

I wonder which function is used to compress variables before writing them to a MAT file. If the contents could be written directly to a variable instead of a file, this would get rid of the size limit of the Java function.

05 Nov 2009 Compression Routines Compress Matlab variables in the workspace. (supports cells, structs, matrices, strings, objects) Author: Jesse Hopkins Hopkins, Jesse

Wow, I never tried it with a single matrix that large. You could probably still save a lot of memory by splitting up that large matrix, for example by compressing the 514x435 2-D slices individually, so that you have 217 compressed variables.
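The slice-by-slice idea could look roughly like the sketch below. To keep it runnable on its own, packFn and unpackFn are hypothetical stand-ins that do nothing; with CompressLib on the path they would be replaced by the calls shown in the comments. The small array size is only for illustration.

```matlab
% Stand-ins so the sketch runs without CompressLib. With the library, use:
%   packFn   = @(x) CompressLib.compress(x);
%   unpackFn = @(x) CompressLib.decompress(x);
packFn   = @(x) x;
unpackFn = @(x) x;

A = rand(6, 5, 4, 'single');        % stand-in for the 514x435x217 matrix
nSlices = size(A, 3);

% Compress each 2-D slice separately so no single call sees the whole array
packed = cell(nSlices, 1);
for k = 1:nSlices
    packed{k} = packFn(A(:, :, k));
end

% Restore slice by slice into a preallocated array
B = zeros(size(A), class(A));
for k = 1:nSlices
    B(:, :, k) = unpackFn(packed{k});
end
```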

CompressLib could probably get some smarts to compress the input in "chunks", but I probably won't be able to get around to that for a while.

05 Nov 2009 Compression Routines Compress Matlab variables in the workspace. (supports cells, structs, matrices, strings, objects) Author: Jesse Hopkins Sebastiaan

Nice utility, but it fails for large matrices. I have a single matrix of 514x435x217 consuming 190MB. Trying to compress it gives a heap space error:

??? Java exception occurred:
java.lang.OutOfMemoryError: Java heap space

Error in ==> CompressLib>CompressLib.compress at 101
g.write(byteData);
 

28 Apr 2009 Explore Convenient way to open windows explorer. Author: Jesse Hopkins Hopkins, Jesse

Eric, the point of this script is that you don't have to type a full path name to the explorer command. For example, say you have an m-file called my_func.m somewhere on your MATLAB path. If you want to open an explorer window at its location, you can simply type 'explore my_func'. Otherwise you would have to type !explorer c:\path\to\my_func.
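For readers curious how such a lookup can work, here is a minimal sketch. It is not necessarily how explore is implemented: it simply resolves the name with which and passes the result to the Windows explorer command with its /select switch. The name my_func is the example from the comment above, and the fallback to the current directory is an assumption.

```matlab
target = which('my_func');          % full path if found on the MATLAB path, '' otherwise
if isempty(target) || ~exist(target, 'file')
    % Not found (or a built-in): fall back to the current directory
    cmd = ['explorer "' pwd '"'];
else
    % Open the containing folder with the file highlighted
    cmd = ['explorer /select,"' target '"'];
end

if ispc
    system(cmd);                    % explorer is only available on Windows
end
```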

16 Apr 2009 Explore Convenient way to open windows explorer. Author: Jesse Hopkins Eric

Or just use the built-in explorer command in DOS along with the ! functionality. Try from the command line:

!explorer (Will open explorer window at top level of windows directory)

!explorer . (Will open explorer window in PWD)

!explorer c:/temp (Will open explorer window at specific address)

16 Apr 2009 Explore Convenient way to open windows explorer. Author: Jesse Hopkins Johansen, Steen

Does not seem to work in 2007R:

>> explore
??? No appropriate method or public field builtinGetActiveDocument for class com.mathworks.mlservices.MLEditorServices.

Error in ==> explore at 45
string = char(com.mathworks.mlservices.MLEditorServices.builtinGetActiveDocument);
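Since the failing call is an undocumented method that is not present in every release, one defensive option is to wrap it in try/catch and fall back to the current directory. This is only a sketch of that idea, not a tested fix for explore; the variable name target is illustrative.

```matlab
try
    % Undocumented call quoted in the error above; missing in some releases
    target = char(com.mathworks.mlservices.MLEditorServices.builtinGetActiveDocument);
catch
    target = '';
end

if isempty(target)
    target = pwd;                   % no active document available: use the current directory
end
```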

15 Apr 2009 Explore Convenient way to open windows explorer. Author: Jesse Hopkins Kiger, Stead

See also "Explorer Toolbar Shortcut"

http://www.mathworks.com/matlabcentral/fileexchange/14140

25 Nov 2008 genpath_exclude Executes like genpath, but can ignore specified directories. Author: Jesse Hopkins Hopkins, Jesse

Thanks for the comment Thierry. I did look at modifying genpath, and for some reason on my first attempt, the excluded path would still be added, but not the tree beneath it. That is when I went off looking into the regexp business to remove the offending paths.

Taking a second look, I realize that it is possible to modify genpath directly to accomplish this goal, which should operate faster since it wouldn't recurse into the ignored directories.
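A pruning version along these lines might look like the sketch below. It is not the genpath_exclude implementation; the function name genpath_skip and the excludePattern argument are hypothetical, and the pattern is matched against each directory name with regexp. Excluded directories are never entered, so their subtrees are not walked at all.

```matlab
function p = genpath_skip(root, excludePattern)
% GENPATH_SKIP  Build a path string like genpath, but prune directories
% whose name matches the regular expression excludePattern.
p = [root pathsep];
entries = dir(root);
for k = 1:numel(entries)
    name = entries(k).name;
    if ~entries(k).isdir || strcmp(name, '.') || strcmp(name, '..')
        continue;
    end
    % Like genpath, skip class (@), package (+) and private directories
    if any(name(1) == '@+') || strcmp(name, 'private')
        continue;
    end
    if ~isempty(regexp(name, excludePattern, 'once'))
        continue;                   % prune: do not descend into this directory
    end
    p = [p genpath_skip(fullfile(root, name), excludePattern)]; %#ok<AGROW>
end
end
```

For example, addpath(genpath_skip(pwd, '^\.svn$')) would add the current tree while skipping any directory named .svn.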

25 Nov 2008 genpath_exclude Executes like genpath, but can ignore specified directories. Author: Jesse Hopkins Dalon, Thierry

I think regexp with the 'split' option is quite new, so this may not work with R14 or earlier.
As an optimization, why don't you modify genpath directly so that it does not go into excluded directories?

 
