Generate Huffman code dictionary for source with known probability model

`[dict,avglen] = huffmandict(symbols,p)`

`[dict,avglen] = huffmandict(symbols,p,N)`

`[dict,avglen] = huffmandict(symbols,p,N,variance)`

The `huffmandict` function generates a Huffman code dictionary corresponding to a source with a known probability model. The required inputs are

- `symbols`, which lists the distinct signal values that the source produces. It can have the form of a numeric vector, numeric cell array, or alphanumeric cell array. If it is a cell array, it must be either a row or a column.
- `p`, a probability vector whose kth element is the probability with which the source produces the kth element of `symbols`. The length of `p` must equal the length of `symbols`.

The outputs of `huffmandict` are

- `dict`, a two-column cell array in which the first column lists the distinct signal values from `symbols` and the second column lists the corresponding Huffman codewords. In the second column, each Huffman codeword is represented as a numeric row vector.
- `avglen`, the average length among all codewords in the dictionary, weighted according to the probabilities in the vector `p`.
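The weighting that defines `avglen` can be written out explicitly. In the sketch below, the symbols $K$ (the number of source symbols), $p_k$ (the kth element of `p`), and $\ell_k$ (the length of the codeword for the kth symbol) are notation introduced here for illustration, not names used by the function.

```latex
\[
  \texttt{avglen} \;=\; \sum_{k=1}^{K} p_k \,\ell_k
\]
% Worked instance: probabilities [.3 .3 .2 .1 .1] with codeword lengths 2, 2, 2, 3, 3
\[
  0.3\cdot 2 + 0.3\cdot 2 + 0.2\cdot 2 + 0.1\cdot 3 + 0.1\cdot 3
  \;=\; 0.6 + 0.6 + 0.4 + 0.3 + 0.3 \;=\; 2.2
\]
```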

`[dict,avglen] = huffmandict(symbols,p)` generates a binary Huffman code dictionary using the maximum variance algorithm.

`[dict,avglen] = huffmandict(symbols,p,N)` generates an `N`-ary Huffman code dictionary using the maximum variance algorithm. `N` is an integer between 2 and 10 that must not exceed the number of source symbols whose probabilities appear in the vector `p`.
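As a minimal sketch of the `N`-ary form, the following builds a ternary dictionary and prints each symbol next to its codeword. The choice `N = 3`, the probability vector, and the variable names `dict3` and `avglen3` are illustrative assumptions; the sketch also assumes that codeword elements are drawn from the integers 0 through `N`-1.

```matlab
symbols = 1:5;                  % Distinct signal values (illustrative)
p = [.3 .3 .2 .1 .1];           % Probability of each value (illustrative)
N = 3;                          % Ternary code alphabet; N <= numel(p)

[dict3,avglen3] = huffmandict(symbols,p,N);

% Print each symbol next to its codeword
for k = 1:size(dict3,1)
    fprintf('%d -> %s\n',dict3{k,1},mat2str(dict3{k,2}));
end
```

With `N` greater than 2, each codeword element carries one of `N` values rather than one bit, so `avglen3` counts `N`-ary digits, not bits.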

`[dict,avglen] = huffmandict(symbols,p,N,variance)` generates an `N`-ary Huffman code dictionary with the minimum variance if `variance` is `'min'` and the maximum variance if `variance` is `'max'`. `N` is an integer between 2 and 10 that must not exceed the length of the vector `p`.

```
symbols = 1:5;                  % Distinct signal values
p = [.3 .3 .2 .1 .1];           % Probability of each signal value
[dict,avglen] = huffmandict(symbols,p)
samplecode = dict{5,2}          % Codeword for fifth signal value
```

The output is below, where the first column of `dict` lists the values in `symbols` and the second column lists the corresponding codewords.

    dict =

        [1]    [1x2 double]
        [2]    [1x2 double]
        [3]    [1x2 double]
        [4]    [1x3 double]
        [5]    [1x3 double]

    avglen =

        2.2000

    samplecode =

         1     1     0
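To see what the `variance` argument changes, the following sketch builds both dictionaries for one source and compares the probability-weighted mean and variance of the codeword lengths. The probability vector, chosen so that ties can arise while probabilities are merged, and all variable names are illustrative assumptions rather than part of the example above. Because both results are Huffman codes, the two average lengths should agree; only the spread of the codeword lengths is expected to differ.

```matlab
symbols = 1:5;                  % Distinct signal values (illustrative)
p = [.4 .2 .2 .1 .1];           % Illustrative probabilities with ties

[dictMin,avgMin] = huffmandict(symbols,p,2,'min');
[dictMax,avgMax] = huffmandict(symbols,p,2,'max');

% Length of each codeword, as column vectors
lenMin = cellfun(@numel,dictMin(:,2));
lenMax = cellfun(@numel,dictMax(:,2));

% Probability-weighted variance of the codeword lengths
varMin = sum(p(:).*(lenMin(:) - avgMin).^2);
varMax = sum(p(:).*(lenMax(:) - avgMax).^2);

fprintf('min: avglen = %.4f, variance = %.4f\n',avgMin,varMin);
fprintf('max: avglen = %.4f, variance = %.4f\n',avgMax,varMax);
```

A lower variance means codeword lengths stay closer to the average, which can matter when the encoded stream feeds a fixed-rate channel through a buffer.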

[1] Sayood, Khalid. *Introduction to Data Compression*. San Francisco: Morgan Kaufmann, 2000.
