Thread Subject:
arithenco Help

Subject: arithenco Help

From: Garry Higgins

Date: 21 Apr, 2009 13:48:01

Message: 1 of 10

Hi,

I'm trying to use the Arithmetic Encoder function in MATLAB (arithenco) and am looking for some help.

I have a 1x7500 array of values that range from 0 to 0.9998. Of these values 7492 of them are unique. I'm having trouble figuring out how to calculate the "counts" vector that needs to be passed to the arithenco function. Here's what I've tried:

counts = histc(A, unique(A));

But this gives me a 1x7492 double ranging from 1 to 9, which I know isn't right. I thought histc would be the correct function to use from reading its help page, and I saw a similar example using it elsewhere. Is there something else I should be using instead, or am I doing something else wrong that I'm not seeing?

Hope you guys can help. I don't have a huge amount of MATLAB experience.

Subject: arithenco Help

From: Roger Stafford

Date: 21 Apr, 2009 16:26:03

Message: 2 of 10

"Garry Higgins" <mathworks.20.seker@xoxy.net> wrote in message <gskiqh$hie$1@fred.mathworks.com>...
> Hi,
>
> I'm trying to use the Arithmetic Encoder function in MATLAB (arithenco) and am looking for some help.
>
> I have a 1x7500 array of values that range from 0 to 0.9998. Of these values 7492 of them are unique. I'm having trouble figuring out how to calculate the "counts" vector that needs to be passed to the arithenco function. Here's what I've tried:
>
> counts = histc(A, unique(A));
>
> But this gives me a 1x7492 double ranging from 1 to 9, which I know isn't right. I thought histc would be the correct function to use from reading its help page, and I saw a similar example using it elsewhere. Is there something else I should be using instead, or am I doing something else wrong that I'm not seeing?
>
> Hope you guys can help. I don't have a huge amount of MATLAB experience.

  What makes you think the 'counts' values are not right? The 'histc' function is telling you that of the 7492 unique values in A, all but one of them occur only once each and that remaining value occurs nine times. What is not right about that?

  Of course with statistics like that, you will not be able to gain much compression with 'arithenco', but that is not the fault of 'histc', it is a problem with your particular A. What else could it do with an alphabet of length 7492? I think you had better review what it is you are trying to accomplish. It sounds as if the A array is inappropriate.

Roger Stafford

Subject: arithenco Help

From: Garry Higgins

Date: 22 Apr, 2009 13:08:02

Message: 3 of 10


> What makes you think the 'counts' values are not right? The 'histc' function is telling you that of the 7492 unique values in A, all but one of them occur only once each and that remaining value occurs nine times. What is not right about that?
>
> Of course with statistics like that, you will not be able to gain much compression with 'arithenco', but that is not the fault of 'histc', it is a problem with your particular A. What else could it do with an alphabet of length 7492? I think you had better review what it is you are trying to accomplish. It sounds as if the A array is inappropriate.
>
> Roger Stafford

Thanks for the reply.

Sorry. I should have specified why I felt 'counts' was not right. If I proceed with that value and try to pass it to 'arithenco', it fails with an error saying: "The symbol counts parameter must be a vector or positive finite integers." and doesn't return anything.

At the moment the data in my A array is random. This is going to be used as part of a bigger system that I don't have the data sets for yet. Shouldn't 'arithenco' still work but without any compression gains?

Subject: arithenco Help

From: Roger Stafford

Date: 22 Apr, 2009 19:50:16

Message: 4 of 10

"Garry Higgins" <mathworks.20.seker@xoxy.net> wrote in message <gsn4ri$ebi$1@fred.mathworks.com>...
> Sorry. I should have specified why I felt 'counts' was not right. If I proceed with that value and try to pass it to 'arithenco', it fails with an error saying: "The symbol counts parameter must be a vector or positive finite integers." and doesn't return anything.
>
> At the moment the data in my A array is random. This is going to be used as part of a bigger system that I don't have the data sets for yet. Shouldn't 'arithenco' still work but without any compression gains?

  You pose a mystery here, Garry, but you have the means of checking to see if arithenco's error message is telling you the truth. You can check directly on 'counts' with something like

 all(counts==round(counts) & counts>0)

to see if all elements of 'counts' are positive integers and

 any(size(counts)==1)

to see if it is actually a vector and not a matrix.

  If both checks indicate that 'counts' is truly a vector of positive integers, then something is possibly haywire with 'arithenco'. If not, then something mysterious is going on in 'histc' or 'unique'. 'histc' is supposed to give out only non-negative integer counts, but the way you are using it with 'unique', all counts should in fact be positive. There should be no zero bin counts in this case.

  In other words, Garry, you need to conduct some detective work to narrow down the source of difficulty here. Your description of A as having 7500 values between 0 and .9998 suggests that perhaps at least two of them differ only by a single least bit in their significands. If so, it is conceivable that 'unique' regards them as different and 'histc' considers them to be equal. If so, you could receive a zero value in one or more of the bin count values. Another possibility is that 'arithenco' is not really set up to deal with sequences of symbols that are closely-spaced fractional values. The only example I see in its documentation is with sequences of +1 and -1 values. Perhaps its error message actually refers to the 'seq' argument, not the 'counts' argument. It is a common failing of error messages to give the wrong reason for a detected error.

  If you manage to boil your problem down to something very simple and reproducible, it would be time to consult with members of Mathworks' support team, but you need to have some very specific examples that they can easily repeat.
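
A rough sketch of the checks suggested above, extended to the 'seq' argument. It assumes, from my reading of the arithenco documentation and not from anything confirmed in this thread, that every element of 'seq' must be a positive integer that indexes into 'counts'. The values in A are made up for illustration.

```matlab
% Made-up stand-in for the real data: fractional symbols with one repeat.
A = [0.1 0.5 0.5 0.9];
counts = histc(A, unique(A));

% The two checks on 'counts': a vector of positive finite integers.
countsOk = isvector(counts) && ...
    all(isfinite(counts) & counts == round(counts) & counts > 0);

% Assumed requirement on 'seq': positive integers no larger than numel(counts).
seqOk = all(A == round(A) & A >= 1 & A <= numel(counts));

disp([countsOk seqOk])
```

With data like this, countsOk is true while seqOk is false, which would be consistent with the error being triggered by the sequence rather than by the counts.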

Roger Stafford

Subject: arithenco Help

From: Roger Stafford

Date: 22 Apr, 2009 23:33:13

Message: 5 of 10

"Garry Higgins" <mathworks.20.seker@xoxy.net> wrote in message <gsn4ri$ebi$1@fred.mathworks.com>...
> At the moment the data in my A array is random. This is going to be used as part of a bigger system that I don't have the data sets for yet. Shouldn't 'arithenco' still work but without any compression gains?

  Something you said has rung a bell, Garry. You said, "At the moment the data in my A array is random." By any chance did you use the 'rand' function to generate those 7500 random numbers in A between 0 and 0.9998? If so, the odds are very heavily against your getting any matches at all with such a method, and yet you claimed to have obtained one case with nine numbers all equal. How is that possible? The odds against this should be astronomically large. What is this random number that reoccurred nine times? It makes me think that you haven't told us the full story of your proceedings. I think you should give many more details of what you were doing when this error of yours occurred and do so in such a manner that others can reproduce it if we are to seriously consider this problem.

Roger Stafford

Subject: arithenco Help

From: Garry Higgins

Date: 23 Apr, 2009 15:40:17

Message: 6 of 10

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gsnsdo$8ml$1@fred.mathworks.com>...
> "Garry Higgins" <mathworks.20.seker@xoxy.net> wrote in message <gsn4ri$ebi$1@fred.mathworks.com>...
> > Sorry. I should have specified why I felt 'counts' was not right. If I proceed with that value and try to pass it to 'arithenco', it fails with an error saying: "The symbol counts parameter must be a vector or positive finite integers." and doesn't return anything.
> >
> > At the moment the data in my A array is random. This is going to be used as part of a bigger system that I don't have the data sets for yet. Shouldn't 'arithenco' still work but without any compression gains?
>
> You pose a mystery here, Garry, but you have the means of checking to see if arithenco's error message is telling you the truth. You can check directly on 'counts' with something like
>
> all(counts==round(counts) & counts>0)
>
> to see if all elements of 'counts' are positive integers and
>
> any(size(counts)==1)
>
> to see if it is actually a vector and not a matrix.
>
> If both checks indicate that 'counts' is truly a vector of positive integers, then something is possibly haywire with 'arithenco'. If not, then something mysterious is going on in 'histc' or 'unique'. 'histc' is supposed to give out only non-negative integer counts, but the way you are using it with 'unique', all counts should in fact be positive. There should be no zero bin counts in this case.
>
> In other words, Garry, you need to conduct some detective work to narrow down the source of difficulty here. Your description of A as having 7500 values between 0 and .9998 suggests that perhaps at least two of them differ only by a single least bit in their significands. If so, it is conceivable that 'unique' regards them as different and 'histc' considers them to be equal. If so, you could receive a zero value in one or more of the bin count values. Another possibility is that 'arithenco' is not really set up to deal with sequences of symbols that are closely-spaced fractional values. The only example I see in its documentation is with sequences of +1 and -1 values. Perhaps its error message actually refers to the 'seq' argument, not the 'counts' argument. It is a common failing of error messages to give the wrong reason for a detected error.
>
> If you manage to boil your problem down to something very simple and reproducible, it would be time to consult with members of Mathworks' support team, but you need to have some very specific examples that they can easily repeat.
>
> Roger Stafford

I tried the two tests above and both came back true (1).

You might be right about it having to do with the number of significant digits in the array values. Strangely, I accidentally discovered that if I quantize A and then pass the quantized A with the original 'counts' to arithenco, it works perfectly.

Regarding the array having repeated numbers even though it's made up of random numbers, this might be because I made this array out of a much larger one. I was running into memory problems with the bigger array so just copied some of it to a new one. I thought getting a simplified version working first would be easier :) If it helps I can upload the .mat file somewhere.

I've since decided that A = rand(32, 256*10); would be an accurate example of the final product. This might complicate things, so we can probably ignore it for the minute.

If it is a limitation in histc, does anyone know of another way I could populate the 'counts' vector? Here's a description of it from the help file:
"The vector counts represents the source's statistics by listing the number of times each symbol of the source's alphabet occurs in a test data set."

Thanks again for taking the time to help Roger. I really appreciate it.

Subject: arithenco Help

From: Garry Higgins

Date: 24 Apr, 2009 14:25:05

Message: 7 of 10

I'm just posting to say that I seem to have identified the source of the problem (with the help of a friend).

The problem is with the 'histc' function. It works fine if all integer values in the range are present, but if one isn't, it causes a problem. As a simple example, take an array with numbers from 1-10. If the array contains at least 1 of each value (e.g. A=[1,2,3,4,5,6,7,8,9,10]) then histc will work fine for calculating counts and passing to arithenco. If, however, one of the numbers is not represented (e.g. A=[1,2,3,4,4,6,7,8,9,10]), histc will omit the zero count number (5 in this case) and the 'counts' vector it creates is not of the form required by arithenco.
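
A small sketch of why counts built from unique(A) do not line up with the data in this example. It assumes that arithenco treats each value in the sequence as an index into 'counts', which is my reading of the documentation and not something confirmed in this thread.

```matlab
A = [1 2 3 4 4 6 7 8 9 10];          % the value 5 never occurs

% One bin per distinct value, so the missing 5 gets no entry at all.
counts = histc(A, unique(A));

nCounts   = numel(counts);           % number of bins actually produced
maxSymbol = max(A);                  % largest symbol value in the data

% Under the indexing assumption, the largest symbol needs counts(maxSymbol).
mismatch = maxSymbol > nCounts;
```

Here 'counts' has one entry fewer than the largest symbol value, so under that assumption the symbol 10 has no matching entry in 'counts'.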

Hope this helps somebody else if they find themselves in a similar situation. Thanks again for all your help Roger. You steered me in the right direction.

I don't suppose anyone knows of an alternative function to do what I need in MATLAB? Save me having to try and write one :)

Subject: arithenco Help

From: Steven Lord

Date: 24 Apr, 2009 14:52:10

Message: 8 of 10


"Garry Higgins" <mathworks.20.seker@xoxy.net> wrote in message
news:gssi41$igq$1@fred.mathworks.com...
> I'm just posting to say that I seem to have identified the source of the
> problem (with the help of a friend).
>
> The problem is with the 'histc' function. It works fine if all integer
> values in the range are present but if one isn't, it causes a problem. As
> a simple example take an array with numbers from 1-10. If the array
> contains at least 1 of each value (e.g. A=[1,2,3,4,5,6,7,8,9,10]) then
> histc will work fine for calculating counts and passing to arithenco. If,
> however, one of the numbers is not represented (e.g.
> A=[1,2,3,4,4,6,7,8,9,10]), histc will omit the zero count number (5 in
> this case) and the 'counts' vector it creates is not of the form required
> by arithenco.

I suspect the fault is not in HISTC, but in the edges vector you pass into
HISTC. If you're passing in something like unique(A) as the edges vector,
you will receive the behavior you described above. If you instead pass in a
different edges vector, you'll receive a different set of bin counts. Try:

>> A = [1:4 6:10];
>> bin1 = unique(A);
>> n = histc(A, bin1)

n =

     1 1 1 1 1 1 1 1 1

>> n2 = histc(A, min(A):1:max(A))

n2 =

     1 1 1 1 0 1 1 1 1 1

Compare the set of bins used to generate n and the set of bins used to
generate n2 and you'll see they're slightly different -- I haven't used
ARITHENCO but from your description I think you want the n2 bins.

--
Steve Lord
slord@mathworks.com

Subject: arithenco Help

From: Garry Higgins

Date: 27 Apr, 2009 12:08:02

Message: 9 of 10

"Steven Lord" <slord@mathworks.com> wrote in message <gssjmp$7pg$1@fred.mathworks.com>...
>
> "Garry Higgins" <mathworks.20.seker@xoxy.net> wrote in message
> news:gssi41$igq$1@fred.mathworks.com...
> > I'm just posting to say that I seem to have identified the source of the
> > problem (with the help of a friend).
> >
> > The problem is with the 'histc' function. It works fine if all integer
> > values in the range are present but if one isn't, it causes a problem. As
> > a simple example take an array with numbers from 1-10. If the array
> > contains at least 1 of each value (e.g. A=[1,2,3,4,5,6,7,8,9,10]) then
> > histc will work fine for calculating counts and passing to arithenco. If,
> > however, one of the numbers is not represented (e.g.
> > A=[1,2,3,4,4,6,7,8,9,10]), histc will omit the zero count number (5 in
> > this case) and the 'counts' vector it creates is not of the form required
> > by arithenco.
>
> I suspect the fault is not in HISTC, but in the edges vector you pass into
> HISTC. If you're passing in something like unique(A) as the edges vector,
> you will receive the behavior you described above. If you instead pass in a
> different edges vector, you'll receive a different set of bin counts. Try:
>
> >> A = [1:4 6:10];
> >> bin1 = unique(A);
> >> n = histc(A, bin1)
>
> n =
>
> 1 1 1 1 1 1 1 1 1
>
> >> n2 = histc(A, min(A):1:max(A))
>
> n2 =
>
> 1 1 1 1 0 1 1 1 1 1
>
> Compare the set of bins used to generate n and the set of bins used to
> generate n2 and you'll see they're slightly different -- I haven't used
> ARITHENCO but from your description I think you want the n2 bins.
>
> --
> Steve Lord
> slord@mathworks.com
>

Thanks for the reply Steve. That does indeed solve the problem of zero-count values. Unfortunately it brings me back to another problem I forgot about where arithenco requires all values in counts to be greater than zero. Looks like I'm going round in circles here.

Thanks again for the reply though. At least you saved me a couple of days of writing my own code to do the same thing and then rediscovering this problem. When/if I find a solution I'll post it here. Might have to bite the bullet and start hacking my own version of arithenco together.

In the meantime, if anyone else has any advice I'd really appreciate it :)

Subject: arithenco Help

From: Garry Higgins

Date: 28 Apr, 2009 13:03:03

Message: 10 of 10

Ok. I found a solution to my problem. The first thing I did was edit out the error-checking part of arithenco and arithdeco (copy the files to your working directory for safety; MATLAB will use these copies before the ones in the toolbox). The next step was to convert the signal to a form where it starts at 1 and contains only integers. To do this I divided by min(A) and rounded:

>> B = round(A/min(A));

From there I could proceed as normal.

>> counts = histc(B, min(B):1:max(B));
>> code = arithenco(B,counts);

And decoding gives me back the B signal:

>> dseq = arithdeco(code,counts,length(B));
>> isequal(dseq,B)

ans =

     1

Obviously, the manipulation of the A signal to give B can result in data loss so be aware of that when doing this. Also, I so far haven't come across any problems with removing the error checking but that's not to say I won't.
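
For anyone who would rather not edit the toolbox files, here is a hedged sketch of a lossless alternative. The idea is to encode the position of each value in the sorted list of unique values instead of the value itself, so every symbol is a positive integer and every count is at least 1. It assumes arithenco and arithdeco accept such an index sequence, which is how I read their documentation. It also avoids dividing by min(A), which would be zero for data that includes 0. The data and the names 'symbols', 'idx' and 'Arec' are made up for illustration.

```matlab
% Made-up stand-in data: fractional values, some repeated, including 0.
A = [0.25 0.5 0.25 0.9998 0 0.5 0.25];

% symbols: sorted unique values; idx: position of each A(k) within symbols.
[symbols, ~, idx] = unique(A);
idx = reshape(idx, 1, []);                 % force a row vector

% counts(k) = number of times symbols(k) occurs in A (never zero).
counts = accumarray(idx(:), 1).';

if exist('arithenco', 'file') && exist('arithdeco', 'file')
    % Encode and decode the index sequence (needs the toolbox functions).
    code = arithenco(idx, counts);
    dseq = reshape(arithdeco(code, counts, numel(idx)), 1, []);
    Arec = symbols(dseq);                  % map indices back to values
else
    % Toolbox functions not found: only show the index mapping round trip.
    Arec = symbols(idx);
end

disp(isequal(Arec, A))
```

The decoder needs 'symbols' as well as 'counts' to recover the original values, so both would have to be stored or sent alongside the code.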

I hope this helps anyone having similar problems with using arithenco. Just be aware of the above caveats. Thanks to Steven and Roger for their help. I wouldn't have figured it out without yer help.
