# Channel Capacity

## A closer look at capacity

The rationale for the coding theorem: "for large block lengths, every channel looks like the noisy typewriter. **The channel has a subset of inputs that produce essentially disjoint sequences at the output**."

For each typical input *n*-sequence, there are approximately $2^{nH(Y|X)}$ possible *Y* sequences, each of them more or less equally likely by the AEP. In order to reliably detect them, we want to ensure that no two *X* sequences produce the same *Y* sequence. The total number of possible (typical) *Y* sequences is approximately $2^{nH(Y)}$. This has to be divided into sets of size $2^{nH(Y|X)}$, corresponding to the different *X* sequences. The total number of disjoint sets is therefore approximately $2^{n(H(Y) - H(Y|X))} = 2^{nI(X;Y)}$. Hence we can send at most approximately $2^{nI(X;Y)}$ distinguishable sequences of length *n*.
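The counting argument above can be checked numerically. The following is a minimal sketch, assuming a binary symmetric channel with uniform inputs as the example channel (the channel, the crossover probability 0.1, and the function names are illustrative choices, not part of the text). It works with base-2 exponents rather than the counts themselves, since the counts are astronomically large.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) random variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_counts(n: int, crossover: float) -> dict:
    """Exponents (base 2) in the counting argument for a BSC with uniform inputs.

    Assumed example: for a binary symmetric channel with uniform inputs,
    the output is uniform, so H(Y) = 1 bit, and H(Y|X) = H(crossover).
    """
    h_y = 1.0
    h_y_given_x = binary_entropy(crossover)
    mutual_info = h_y - h_y_given_x  # I(X;Y) = H(Y) - H(Y|X)
    return {
        "log2_typical_outputs": n * h_y,                # about 2^{nH(Y)} typical Y sequences
        "log2_fan_out_per_input": n * h_y_given_x,      # about 2^{nH(Y|X)} outputs per input
        "log2_distinguishable_inputs": n * mutual_info, # about 2^{nI(X;Y)} disjoint sets
    }


counts = bsc_counts(n=1000, crossover=0.1)
for name, exponent in counts.items():
    print(f"{name}: {exponent:.1f}")
```

Dividing the number of typical outputs by the fan-out per input corresponds to subtracting the exponents, which is why the last entry equals the difference of the first two.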

When we talk about data transmission through a channel, the issue of coding arises. We define a code as follows:
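One standard way to state such a definition is sketched here. The notation is an assumption and not necessarily the author's: $M$ is the number of messages, $x^n(\cdot)$ the encoder, $g$ the decoder, and $\mathcal{X}$, $\mathcal{Y}$ the channel input and output alphabets. An $(M, n)$ code consists of

$$
\begin{aligned}
&\text{a message set} && W = \{1, 2, \dots, M\}, \\
&\text{an encoding function} && x^n : W \to \mathcal{X}^n, \\
&\text{a decoding function} && g : \mathcal{Y}^n \to W.
\end{aligned}
$$

The $M$ sequences $x^n(1), \dots, x^n(M)$ are the codewords.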

In other words, the code takes symbols from *W*, and encodes them to produce a sequence of *n* symbols in the channel input alphabet.

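The probability of error is defined as the conditional probability that the decoder output differs from the message that was sent.

In other words, if the message symbol is *i*, but the output symbol is not *i*, then we have an error. This can also be written using the indicator function of the error event.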
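The indicator-function form can be sketched as follows. The symbols are assumptions chosen for illustration: $\lambda_i$ is the conditional probability of error for message $i$, $x^n(i)$ is the codeword for message $i$, $g$ is the decoding function, and $I(\cdot)$ is the indicator function, equal to 1 when its argument is true and 0 otherwise.

$$
\lambda_i = \Pr\bigl(g(Y^n) \neq i \mid X^n = x^n(i)\bigr)
          = \sum_{y^n} p\bigl(y^n \mid x^n(i)\bigr)\, I\bigl(g(y^n) \neq i\bigr)
$$

The sum runs over all output sequences $y^n$, and the indicator keeps only those that the decoder maps to a message other than $i$.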

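In our development, it may be convenient to deal with the maximal probability of error. If it can be shown that the maximal probability of error goes to zero, then clearly the other probabilities of error do also. The maximal probability of error is the largest of the conditional probabilities of error, taken over all messages.

The average probability of error is the average of the conditional probabilities of error over all messages.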
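In symbols (assumed notation: $\lambda_i$ for the conditional probability of error of message $i$, and $M$ messages), the maximal error is $\lambda^{(n)} = \max_i \lambda_i$ and the average error is $P_e^{(n)} = \frac{1}{M} \sum_{i=1}^{M} \lambda_i$. As a minimal sketch, the code below computes all three quantities by brute-force enumeration for a hypothetical two-message repetition code with majority decoding over a binary symmetric channel. The code, the channel, and the function names are illustrative assumptions, and messages are labelled 0 and 1 rather than 1 and 2 for convenience.

```python
from itertools import product


def repetition_encode(message: int, n: int) -> tuple:
    """Encode a one-bit message as n copies of that bit."""
    return (message,) * n


def majority_decode(received: tuple) -> int:
    """Decode by majority vote (n is assumed odd, so there are no ties)."""
    return 1 if sum(received) > len(received) / 2 else 0


def conditional_error_probs(n: int, crossover: float) -> list:
    """Return [lambda_0, lambda_1] for the repetition code on a BSC.

    Each lambda_i is the sum over all output sequences y^n of
    p(y^n | x^n(i)) times the indicator that the decoder output differs from i.
    """
    lambdas = []
    for message in (0, 1):
        codeword = repetition_encode(message, n)
        error_prob = 0.0
        for received in product((0, 1), repeat=n):
            flips = sum(a != b for a, b in zip(codeword, received))
            likelihood = crossover ** flips * (1 - crossover) ** (n - flips)
            if majority_decode(received) != message:  # indicator of the error event
                error_prob += likelihood
        lambdas.append(error_prob)
    return lambdas


lambdas = conditional_error_probs(n=3, crossover=0.1)
max_error = max(lambdas)                  # maximal probability of error
avg_error = sum(lambdas) / len(lambdas)   # average probability of error
print(lambdas, max_error, avg_error)
```

For this symmetric example the two conditional error probabilities coincide, so the maximal and average errors are equal. In general the average never exceeds the maximum, which is why driving the maximal error to zero is the stronger statement.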

This operational definition of capacity, as the largest achievable rate, is different from the "information" channel capacity of a DMC presented above. What we will show (this is Shannon's theorem) is that the two definitions are equivalent.

The implication of this definition of capacity is that for an achievable rate, the probability of error tends to zero as the block length gets large. Since the capacity is the largest achievable rate, **for rates less than the capacity, the probability of error can be driven to zero as the block length grows.**
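For reference, the notions of rate and achievability used here are commonly formalized as below. This is a sketch in assumed notation, not necessarily the author's: $M$ is the number of messages, $n$ the block length, and $\lambda^{(n)}$ the maximal probability of error of a code with block length $n$.

$$
R = \frac{\log_2 M}{n} \quad \text{bits per channel use}
$$

$$
R \text{ is achievable} \iff \text{there exist } \bigl(\lceil 2^{nR} \rceil, n\bigr) \text{ codes with } \lambda^{(n)} \to 0 \text{ as } n \to \infty
$$

$$
C = \sup \{ R : R \text{ is achievable} \}
$$

Under this formulation, the statement in bold says that every rate $R < C$ admits a sequence of codes whose maximal probability of error vanishes as $n$ grows.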