Data Compression


Huffman codes

Huffman codes are optimal prefix codes for a given distribution: no other prefix code has a smaller expected length. What's more, if we know the distribution, a Huffman code is easy to construct. The construction assigns longer codewords to less-likely symbols, and does so in a tree-structured way so that the resulting code is prefix-free.
\begin{example}
Consider the distribution $X$\ taking values in the set $\Xc =
...
... assign codewords on the tree. The average codelength is 2.3 bits.
\end{example}
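
The tree construction can be sketched in a few lines of Python. This is a minimal illustration rather than the construction used in the notes: the function names `huffman_code` and `average_length` and the five-symbol distribution in the usage lines are assumptions made here, and the distribution is not the one from the example above.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Build a binary Huffman code for a dict {symbol: probability}.

    Returns a dict {symbol: codeword}, with codewords as strings of '0'/'1'.
    """
    if len(probs) < 2:
        raise ValueError("need at least two symbols")
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    # Each heap entry holds a subtree: (total probability, tiebreak, partial code).
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least-likely subtrees under a new parent node.
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def average_length(probs, code):
    """Expected codeword length, in bits per symbol."""
    return sum(p * len(code[s]) for s, p in probs.items())


# Hypothetical distribution, chosen only for illustration.
probs = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
code = huffman_code(probs)
print(code, average_length(probs, code))
```

Because every merge prepends one bit to all codewords in the two merged subtrees, symbols that are merged early (the unlikely ones) end up with the longest codewords, and distinct leaves of the tree never share a codeword prefix.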

Codes over an alphabet with more than $D=2$ symbols can also be built, as described in the book.

Proving the optimality of the Huffman code begins with the following simple lemma:
\begin{lemma}
For any distribution, there exists an optimal instantaneous code
...
...it, and
correspond to the two least-likely symbols.
\end{enumerate}\end{lemma}

\begin{proof}(Sketch)
\begin{enumerate}
\item Simply swap lengths.
\item If t...
...st code, which
contradicts the optimality property.
\end{enumerate}\end{proof}
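
The swap argument can be written out explicitly. The following is a sketch under the assumption that the first property is the usual one, namely that a more probable symbol never receives a longer codeword; the indices $j$, $k$ and the swapped code $C'$ are notation introduced here for illustration. Suppose $p_j > p_k$ but $l_j > l_k$ in a code $C$, and let $C'$ be the code obtained by exchanging the two codewords. Then
\begin{displaymath}\begin{aligned}
L(C') - L(C) &= (p_j l_k + p_k l_j) - (p_j l_j + p_k l_k) \\
&= (p_j - p_k)(l_k - l_j) < 0,
\end{aligned}\end{displaymath}
so $C'$ has a strictly smaller expected length and $C$ could not have been optimal.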

For a code on $m$ symbols, assume (w.l.o.g.) that the probabilities are ordered $p_1 \geq p_2 \geq \cdots \geq p_m$. Define the merged code on $m-1$ symbols by merging the two least probable symbols, with probabilities $p_{m-1}$ and $p_m$, into a single symbol of probability $p_{m-1}+p_m$. The codeword for this merged symbol is the common prefix of the two least-probable (longest) codewords, which exists by the lemma. The expected length of the code $C_m$ is

\begin{displaymath}\begin{aligned}
L(C_m) &= \sum_{i=1}^m p_i l_i \\
&= \sum_{i...
...+ p_m(l_{m}+1) \\
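&= L(C_{m-1}) + p_{m-1} + p_m.
\end{aligned}\end{displaymath}

The extra term $p_{m-1}+p_m$ arises because each of the two split codewords is one bit longer than the merged codeword, and it does not depend on how the codewords are chosen. Minimizing $L(C_m)$ is therefore equivalent to minimizing $L(C_{m-1})$: the optimization problem on $m$ symbols has been reduced to an optimization problem on $m-1$ symbols. Proceeding inductively, we get down to two symbols, for which the optimal code is obvious: one symbol is assigned 0 and the other 1.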
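
The inductive reduction can be mirrored directly in code. The sketch below is one possible reading of the argument, not code from the notes: `huffman_by_merging` and `expected_length` are hypothetical names, and the probabilities in the usage lines are made up for illustration. It merges the two least probable symbols, solves the smaller problem recursively, and then splits the merged codeword by appending one bit.

```python
def huffman_by_merging(probs):
    """Return binary Huffman codewords for a list of probabilities.

    The codewords are returned in the same order as `probs`.
    """
    m = len(probs)
    if m < 2:
        raise ValueError("need at least two symbols")
    if m == 2:
        return ["0", "1"]  # base case: one bit per symbol
    # Indices of the two least probable symbols.
    order = sorted(range(m), key=lambda i: probs[i])
    a, b = order[0], order[1]
    rest = [i for i in range(m) if i not in (a, b)]
    # Merged problem on m - 1 symbols; the merged symbol goes last.
    merged_probs = [probs[i] for i in rest] + [probs[a] + probs[b]]
    merged_code = huffman_by_merging(merged_probs)
    code = [None] * m
    for pos, i in enumerate(rest):
        code[i] = merged_code[pos]
    # Split the merged codeword: each child is one bit longer.
    prefix = merged_code[-1]
    code[a] = prefix + "0"
    code[b] = prefix + "1"
    return code


def expected_length(probs, code):
    """Expected codeword length, in bits per symbol."""
    return sum(p * len(c) for p, c in zip(probs, code))


# Hypothetical distribution, chosen only for illustration.
p5 = [0.4, 0.2, 0.2, 0.1, 0.1]
p4 = [0.4, 0.2, 0.2, 0.1 + 0.1]  # two least probable symbols merged
L5 = expected_length(p5, huffman_by_merging(p5))
L4 = expected_length(p4, huffman_by_merging(p4))
print(L5 - L4)  # difference between the m-symbol and (m-1)-symbol problems
```

In this example the expected length of the five-symbol code exceeds that of the merged four-symbol code by the probability of the merged symbol, up to floating-point rounding, since only the two split codewords gain a bit.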

Copyright 2008, by the Contributing Authors. admin. (2006, May 17). Data Compression. Retrieved January 07, 2011, from Free Online Course Materials - USU OpenCourseWare Web site: http://ocw.usu.edu/Electrical_and_Computer_Engineering/Information_Theory/lecture6_6.htm. This work is licensed under a Creative Commons License.