From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: How do I compress an array of floating numbers in Matlab?
Date: Sun, 4 Apr 2010 19:20:05 +0000 (UTC)
Organization: Magma Geosciences Inc.
Lines: 48
Message-ID: <hpaop5$qse$>
References: <> <>
Reply-To: <HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: 1270408805 27534 (4 Apr 2010 19:20:05 GMT)
NNTP-Posting-Date: Sun, 4 Apr 2010 19:20:05 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 872154
Xref: comp.soft-sys.matlab:623390

Luna Moon <> wrote in message <>...
> On Apr 3, 8:29 pm, "Mark Shore" <> wrote:
> > Luna Moon <> wrote in message <>...
> > > Hi all,
> >
> > > I have a vector of real numbers in Matlab. How do I compress them?  Of
> > > course this has to be lossless, since I need to be able to recover
> > > them.
> >
> > > The goal is to study the Shannon rate and entropy of these real
> > > numbers, so I decide to compress them and see how much compression
> > > ratio I can have.
> >
> > > I don't need to write the result into compressed files, so those
> > > headers, etc. are just overhead for me which affect me calculating the
> > > Entropy... so I just need a bare version of the compress ratio...
> >
> > > Any pointers?
> >
> > > Thanks a lot!
> >
> > An exceedingly simple test involving little or no effort on your part would be to take representative binary files and compress them with off-the-shelf utilities such as WinZip or 7-Zip.
> >
> > This would certainly give you some idea of what level of lossless compression you can expect from reasonably well-tested and mature algorithms before you try to adapt your own.
> Thanks a lot folks.
> Please remember the goal is not to compress the floating numbers per
> se. It's actually to measure the entropy of the data.
> I don't really care how much compression it can maximally achieve.
> Using WinZip is a great idea, however, I am looking for
> (1) a command inside Matlab;
> (2) a bare-bone compression, without the header info, etc. in Winzip,
> because those are overheads in terms of measuring entropy...
> Any more thoughts?
> Thank you!

Off the top of my head, I don't know which built-in commands or third-party tools are available in MATLAB for this. You did make your overall goal clear in your first posts, which is why I suggested file compression utilities as an indirect measure of the entropy of a given data set.

This can work if the data set is large enough. For example, as a test I just compressed a binary 1591200x15 matrix of double-precision values representing a time series of 24-bit measurements from an array of magnetometers. WinZip compresses the original 190,944,400-byte file to 34,162,706 bytes (about 17.9% of the original size) using its maximum compression setting. An equal-sized binary array filled with pseudorandom numbers compressed to 179,929,479 bytes (about 94.2%) using the same setting. This difference seems reasonable given the higher entropy of the random set.
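
A minimal sketch of the same kind of comparison, written in Python with its standard zlib module rather than in MATLAB. The random-walk signal is only an assumed stand-in for measured data (it is not the magnetometer set), and the sample count and variable names are made up for illustration. The absolute ratios will differ from the WinZip figures; only the contrast between the two arrays is the point.

```python
import random
import zlib
from array import array


def compression_ratio(values):
    """Compressed size divided by raw size for floats stored as 64-bit doubles."""
    raw = array("d", values).tobytes()
    return len(zlib.compress(raw, 9)) / len(raw)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000              # hypothetical sample count, far smaller than the real file

# Assumed stand-in for measured data: a slow random walk in integer counts,
# stored as doubles (not the actual magnetometer series).
level = 0
measured = []
for _ in range(n):
    level += rng.randint(-3, 3)
    measured.append(float(level))

# Same number of doubles filled with uniform pseudorandom values in [0, 1).
noise = [rng.random() for _ in range(n)]

ratio_measured = compression_ratio(measured)
ratio_noise = compression_ratio(noise)
print(f"quantized signal: compressed/raw = {ratio_measured:.3f}")
print(f"pseudorandom:     compressed/raw = {ratio_noise:.3f}")
```

The quantized signal should shrink to a small fraction of its raw size while the pseudorandom array barely compresses, which is the same qualitative gap as in the WinZip test.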

If you are dealing with very small files, then I agree: any header overhead from the compressed file format would likely make this less useful.
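
To put a rough number on that overhead, here is a small Python sketch (again an illustration, not a MATLAB command) that compresses the same bytes twice with zlib: once with the zlib wrapper, which adds a 2-byte header and a 4-byte Adler-32 checksum, and once as a raw deflate stream with no wrapper at all. The helper name and the input sizes are hypothetical. ZIP and gzip containers add more bytes than the zlib wrapper shown here.

```python
import random
import zlib
from array import array


def deflate_size(raw, wbits):
    """Size in bytes of `raw` compressed at level 9 with the given zlib wbits setting."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return len(compressor.compress(raw) + compressor.flush())


rng = random.Random(1)
small = array("d", (rng.random() for _ in range(16))).tobytes()      # 128 bytes
large = array("d", (rng.random() for _ in range(20_000))).tobytes()  # 160,000 bytes

# wbits=15 gives the zlib-wrapped stream; wbits=-15 gives raw deflate (no header, no checksum).
small_wrapped, small_bare = deflate_size(small, 15), deflate_size(small, -15)
large_wrapped, large_bare = deflate_size(large, 15), deflate_size(large, -15)

# Fraction of the wrapped output that is wrapper rather than compressed data.
small_overhead = (small_wrapped - small_bare) / small_wrapped
large_overhead = (large_wrapped - large_bare) / large_wrapped
print(f"small input: wrapped {small_wrapped} B, bare {small_bare} B, overhead {small_overhead:.2%}")
print(f"large input: wrapped {large_wrapped} B, bare {large_bare} B, overhead {large_overhead:.4%}")
```

The wrapper costs the same few bytes in both cases, so it matters for the tiny input and is negligible for the large one. Using the raw deflate size is one way to get a "bare" compressed size without container headers.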