Arithmetic Coding
Introduction
Arithmetic coding overcomes some of the problems of Huffman coding, in particular the potential overhead of up to 1 bit per symbol. It operates as a human might, using information already observed to predict what might be coming, and coding based on the prediction. In addition, the technique explicitly separates the prediction portion from the encoding portion.

In arithmetic coding (AC), a bit sequence is interpreted as an interval on the real line from 0 to 1. For example, 01 is interpreted as 0.01..., which (not knowing what the following digits are) corresponds to the interval [0.01, 0.10) in binary, which is [0.25, 0.5) in base ten. (Note the brackets: the interval is half-open, so it includes its left endpoint but not its right endpoint.) A longer string, 01101, corresponds to the interval [0.01101, 0.01110). The longer the string, the shorter the interval it represents on the real line.

Assume we are dealing with an alphabet $\mathcal{A}_X = \{a_1, a_2, \ldots, a_I\}$, where $a_I$ is a special symbol meaning "end of transmission." The source produces the sequence $x_1, x_2, \ldots, x_n, \ldots$, which is not necessarily i.i.d. We further assume (or model) that there is a predictor which computes, or estimates, $P(x_n = a_i \mid x_1, \ldots, x_{n-1})$, and which is available at both encoder and decoder.

We divide the segment [0,1) into $I$ intervals whose lengths are equal to the probabilities $P(x_1 = a_i)$, $i = 1, \ldots, I$. The first interval is $[0, P(x_1 = a_1))$. The second interval is $[P(x_1 = a_1), P(x_1 = a_1) + P(x_1 = a_2))$, and so forth. More generally, to provide for the possibility of considering symbols other than just $x_1$, we define the lower and upper cumulative probabilities:

$$Q_n(a_i \mid x_1, \ldots, x_{n-1}) = \sum_{i'=1}^{i-1} P(x_n = a_{i'} \mid x_1, \ldots, x_{n-1}),$$

$$R_n(a_i \mid x_1, \ldots, x_{n-1}) = \sum_{i'=1}^{i} P(x_n = a_{i'} \mid x_1, \ldots, x_{n-1}).$$

Then, for example, $a_2$ corresponds to the interval $[Q_1(a_2), R_1(a_2))$.

Now we represent the probabilities for the next symbol. Take, for example, the interval for $a_1$, and subdivide it into intervals $a_1 a_1, a_1 a_2, \ldots, a_1 a_I$, where the length of the subinterval $a_1 a_j$ is proportional to $P(x_2 = a_j \mid x_1 = a_1)$, that is, equal to $P(x_1 = a_1) P(x_2 = a_j \mid x_1 = a_1)$.
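The cumulative probabilities can be made concrete with a small sketch. The function name `cumulative_bounds` and the three-symbol distribution below are illustrative assumptions, not part of the original notes; exact fractions stand in for the infinite-precision arithmetic the description presumes.

```python
from fractions import Fraction


def cumulative_bounds(probs):
    """Return one (Q, R) pair per symbol: the lower and upper cumulative
    probabilities, so that symbol i owns the half-open interval [Q, R)."""
    bounds = []
    lower = Fraction(0)
    for p in probs:
        upper = lower + p
        bounds.append((lower, upper))
        lower = upper
    return bounds


# Hypothetical first-symbol distribution P(x_1 = a_i) for I = 3 symbols.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
bounds = cumulative_bounds(probs)

# The second symbol a_2 (index 1) owns [Q_1(a_2), R_1(a_2)).
q_a2, r_a2 = bounds[1]
print(q_a2, r_a2)
```

Because each interval starts where the previous one ends, the intervals tile [0,1) exactly and the last upper bound is 1 whenever the probabilities sum to 1.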
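Then we note that the sum of the lengths of these subintervals will be

$$\sum_{j=1}^{I} P(x_1 = a_1) P(x_2 = a_j \mid x_1 = a_1) = P(x_1 = a_1),$$

which, sure enough, is the correct length. More generally, we subdivide each of the first-symbol intervals similarly, so that the subinterval for $a_i a_j$ has length

$$P(x_1 = a_i, x_2 = a_j) = P(x_1 = a_i) P(x_2 = a_j \mid x_1 = a_i).$$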
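Then we continue subdividing each subinterval for strings of length $N$. The following algorithm (MacKay, p. 151) shows how to compute the interval $[u,v)$ for the string $x_1 x_2 \ldots x_N$. (Note: this is for demonstration purposes, since it requires infinite-precision arithmetic. In practice, the algorithm is arranged so that infinite precision is not required.)

    u := 0.0
    v := 1.0
    p := v - u
    for n = 1 to N {
        compute the cumulative probabilities Q_n and R_n
        v := u + p R_n(x_n | x_1, ..., x_{n-1})
        u := u + p Q_n(x_n | x_1, ..., x_{n-1})
        p := v - u
    }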
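The interval computation can be sketched in Python as follows. This is a demonstration only: it uses exact `Fraction` arithmetic in place of infinite precision, and the names `string_interval` and `iid_predictor`, the integer symbol indices, and the example distribution are all assumptions made for illustration.

```python
from fractions import Fraction


def string_interval(symbols, predictor):
    """Return the interval [u, v) for a sequence of symbol indices.

    predictor(context) is a hypothetical callback returning a list of
    Fractions, one probability per alphabet symbol, given the tuple of
    symbols seen so far.
    """
    u, v = Fraction(0), Fraction(1)
    p = v - u
    for n, x in enumerate(symbols):
        probs = predictor(tuple(symbols[:n]))
        q = sum(probs[:x], Fraction(0))  # lower cumulative probability Q_n
        r = q + probs[x]                 # upper cumulative probability R_n
        v = u + p * r                    # update v first: both use the old u
        u = u + p * q
        p = v - u
    return u, v


def iid_predictor(context):
    # Assumed model: ignores the context, so the source is treated as i.i.d.
    return [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]


u, v = string_interval([0, 1, 2], iid_predictor)
print(u, v, v - u)
```

The final width `v - u` equals the product of the symbol probabilities (here 1/2 · 1/4 · 1/4 = 1/32), which is the probability the model assigns to the whole string.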
In encoding, the interval is subdivided for each new symbol. To encode the string $x_1 x_2 \ldots x_N$, we send the binary string whose interval lies within the interval determined by the sequence.
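As a rough sketch of this last step, the code below searches for the shortest bit string whose binary interval fits inside a given $[u,v)$. The function names are hypothetical, and the brute-force search over lengths is chosen for clarity; it is not how a practical coder emits bits.

```python
import math
from fractions import Fraction


def binary_interval(bits):
    """Interval [lo, hi) of reals in [0,1) whose binary expansion starts with bits."""
    width = Fraction(1, 2 ** len(bits))
    lo = Fraction(int(bits, 2)) * width if bits else Fraction(0)
    return lo, lo + width


def shortest_codeword(u, v):
    """Shortest bit string whose binary interval lies inside [u, v).

    Assumes 0 <= u < v <= 1, with u and v given as Fractions.
    """
    length = 0
    while True:
        length += 1
        scale = 2 ** length
        k = math.ceil(u * scale)           # leftmost grid point at or above u
        if Fraction(k + 1, scale) <= v:    # does [k/scale, (k+1)/scale) fit?
            return format(k, "0{}b".format(length))


code = shortest_codeword(Fraction(11, 32), Fraction(3, 8))
print(code, binary_interval(code))
```

At each length only the leftmost candidate needs checking: if the grid cell starting at the first grid point at or above `u` does not fit below `v`, no cell further right can fit either.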
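One of the benefits of arithmetic coding is that the worst-case redundancy for an entire string (which may, for example, consist of an entire file) is at most two bits, assuming the probabilistic model is correct. Given a probabilistic model $\mathcal{H}$, the ideal message length for a sequence $\mathbf{x}$ is $l(\mathbf{x}) = \log_2 [1/P(\mathbf{x} \mid \mathcal{H})]$. Suppose that the interval of width $P(\mathbf{x} \mid \mathcal{H})$ just barely straddles the boundary between two binary intervals of about that width, so that neither fits inside it. Then the largest binary intervals contained in the interval of width $P(\mathbf{x} \mid \mathcal{H})$ can be smaller than it by a factor of up to 4. This factor of 4 corresponds to $\log_2 4 = 2$ bits of overhead in the worst case.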
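One way to make the two-bit bound explicit is the short derivation below. It is a sketch of a standard argument, not taken from the original notes. It writes $p = P(\mathbf{x} \mid \mathcal{H})$ for the width of the interval $[u,v)$ and $\ell$ for a codeword length that is always sufficient.

```latex
\begin{align*}
p &= P(\mathbf{x} \mid \mathcal{H}) = v - u, \\
\ell &= \left\lceil \log_2 \frac{1}{p} \right\rceil + 1
  \quad\Longrightarrow\quad 2^{-\ell} \le \frac{p}{2}, \\
\ell &< \log_2 \frac{1}{p} + 2 .
\end{align*}
```

The binary intervals of width $2^{-\ell}$ tile [0,1), and any interval at least twice that wide must contain one complete tile. So a codeword of length $\ell$ always exists, and it exceeds the ideal length $\log_2 (1/p)$ by less than two bits.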


















