Linear Minimum Mean-Square Error Filtering
Background
Recall that for random variables $X$ and $Y$ with finite variance, the MSE $E[(X - h(Y))^2]$ is minimized by $h(Y) = E[X \mid Y]$. That is, the best estimate of $X$ using a measured value of $Y$ is to find the conditional average of $X$ given $Y$. One aspect of this estimate is that:
The error is orthogonal to the data. More precisely, the error $X - E[X \mid Y]$ satisfies $E[(X - E[X \mid Y])\, g(Y)] = 0$ for all measurable functions $g$ with $E[g^2(Y)] < \infty$.
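We want to show that $h(Y)$ minimizes $E[(X - h(Y))^2]$ if and only if $E[(X - h(Y))\, g(Y)] = 0$ (orthogonality), for all measurable $g$ such that $E[g^2(Y)] < \infty$.
Suppose first that orthogonality holds. For any other estimate $\tilde h(Y)$ with finite second moment, take $g = h - \tilde h$; the cross term then vanishes and
$$E[(X - \tilde h(Y))^2] = E[(X - h(Y))^2] + E[(h(Y) - \tilde h(Y))^2] \ge E[(X - h(Y))^2].$$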
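Conversely, suppose for some $g$ that $E[(X - h(Y))\, g(Y)] \ne 0$. Consider the estimate $h(Y) + a\, g(Y)$, where $a = E[(X - h(Y))\, g(Y)] / E[g^2(Y)]$. Then
$$E[(X - h(Y) - a\, g(Y))^2] = E[(X - h(Y))^2] - \frac{\left(E[(X - h(Y))\, g(Y)]\right)^2}{E[g^2(Y)]} < E[(X-h(Y))^2],$$
so $h(Y)$ does not minimize the MSE.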
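The orthogonality property can be checked numerically. The following is a minimal Monte Carlo sketch under an assumed toy model that is not part of these notes: $Y \sim N(0,1)$ and $X = Y^2 + W$ with $W \sim N(0,1)$ independent of $Y$, so that $E[X \mid Y] = Y^2$. The function name `orthogonality_demo` and the chosen test functions $g$ are illustrative only.

```python
import math
import random


def orthogonality_demo(n=100_000, seed=0):
    """Monte Carlo check that X - E[X|Y] is orthogonal to functions of Y.

    Assumed toy model (not from the notes): Y ~ N(0, 1) and
    X = Y**2 + W with W ~ N(0, 1) independent of Y, so E[X | Y] = Y**2.
    """
    rng = random.Random(seed)
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xs = [y * y + rng.gauss(0.0, 1.0) for y in ys]

    def mean(values):
        return sum(values) / len(values)

    # Error of the conditional-mean estimate h(Y) = Y**2.
    errors = [x - y * y for x, y in zip(xs, ys)]

    # Sample versions of E[(X - h(Y)) g(Y)] for a few test functions g.
    test_functions = {
        "g(y) = y": lambda y: y,
        "g(y) = y^2": lambda y: y * y,
        "g(y) = cos(y)": math.cos,
    }
    inner_products = {
        name: mean([e * g(y) for e, y in zip(errors, ys)])
        for name, g in test_functions.items()
    }

    # MSE of the conditional mean versus a competing estimate (the constant E[X] = 1).
    mse_conditional = mean([e * e for e in errors])
    mse_constant = mean([(x - 1.0) ** 2 for x in xs])
    return inner_products, mse_conditional, mse_constant
```

Calling `orthogonality_demo()` should return sample inner products close to zero and a smaller MSE for the conditional mean than for the constant estimate, up to Monte Carlo error.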
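Suppose now we are given two random processes $X_t$ and $Y_t$ that are statistically related (that is, not independent). Suppose, to begin, that the time index is continuous, $t \in \mathbb{R}$. Suppose we observe $Y_\tau$ over the interval $a \le \tau \le b$, and based on the information gained we want to estimate $X_t$ for some fixed $t$ as a function of $\{Y_\tau, a \le \tau \le b\}$. That is, we form
$$\hat X_t = f(\{Y_\tau, a \le \tau \le b\})$$
for some functional $f$.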
If $t < b$: We say that the operation of the function is smoothing.
If $t = b$: We say that the operation of the function is filtering.
If $t > b$: We say that the operation of the function is prediction.
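The three cases can be stated as a small helper. This is only an illustrative sketch; the function name `estimation_mode` is hypothetical and simply encodes the comparison of the estimation time $t$ with the end $b$ of the observation interval.

```python
def estimation_mode(t, b):
    """Name the operation of estimating X_t from observations of Y on [a, b].

    Only the right endpoint b of the observation interval matters here.
    """
    if t < b:
        return "smoothing"    # estimate a time inside (or before the end of) the data
    if t == b:
        return "filtering"    # estimate at the most recent observation time
    return "prediction"       # estimate a time beyond the data
```

For example, with observations up to $b = 10$, estimating $X_{12}$ is a prediction problem.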
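The error in the estimate is $\epsilon_t = X_t - \hat X_t$. The mean-squared error is $E[\epsilon_t^2] = E[(X_t - \hat X_t)^2]$.
Fact (built on our previous intuition): The MSE $E[(X_t - \hat X_t)^2]$ is minimized by the conditional expectation
$$\hat X_t = E[X_t \mid Y_\tau, a \le \tau \le b].$$
Furthermore, the orthogonality principle applies:
$$E\big[(X_t - \hat X_t)\, g(\{Y_\tau, a \le \tau \le b\})\big] = 0$$
for every measurable functional $g$ of the observations with finite second moment.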
While we know the theoretical result, it is difficult in general to compute the desired conditional expectation.
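We therefore restrict attention to linear estimates. Let $\mathcal{H}_y$ denote the set of estimates of the form $c + \sum_i a_i Y_{\tau_i}$, with $\tau_i \in [a, b]$ and constants $c$, $a_i$, together with their mean-square limits.
Note that the sums in $\mathcal{H}_y$ may include infinitely many terms, so we assume mean-square limits. The set $\mathcal{H}_y$ contains mean-square derivatives, mean-square integrals, and other linear transformations of $\{Y_\tau, a \le \tau \le b\}$. (The set $\mathcal{H}_y$ is the Hilbert space generated by the linear span of $\{Y_\tau, a \le \tau \le b\}$ and the constants.)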
Let's now solve
$$\min_{\hat X_t \in \mathcal{H}_y} E[(X_t - \hat X_t)^2]. \tag{*}$$
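- If $E[X_t^2] < \infty$ then $\hat X_t \in \mathcal{H}_y$ solves (*) if and only if $E[(X_t - \hat X_t) Z] = 0$ for all $Z \in \mathcal{H}_y$. That is, the error is orthogonal to all elements of $\mathcal{H}_y$.
- $E[(X_t - \hat X_t) Z] = 0$ for all $Z \in \mathcal{H}_y$ if and only if (1) $E[X_t - \hat X_t] = 0$ and (2) $E[(X_t - \hat X_t) Y_\sigma] = 0$ for all $\sigma \in [a, b]$.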
This is a restatement of orthogonality, but for a restricted space.
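That is, writing the linear estimate as $\hat X_t = c_t + \int_a^b h(t,\tau) Y_\tau \, d\tau$, with $\mu_X(t) = E[X_t]$ and $\mu_Y(\tau) = E[Y_\tau]$, condition (1) is $E[X_t - \hat X_t] = 0$,
so that
$$c_t = \mu_X(t) - \int_a^b h(t,\tau)\, \mu_Y(\tau)\, d\tau,$$
and (2):
$$E[(X_t - \hat X_t) Y_\sigma] = 0$$
for $a \le \sigma \le b$.
This gives us two equations in the unknowns $c_t$ and $h(t,\tau)$. Substituting (1) into (2) gives
$$E\left[\left(X_t - \mu_X(t) - \int_a^b h(t,\tau)\,(Y_\tau - \mu_Y(\tau))\, d\tau\right) Y_\sigma\right] = 0,$$
or
$$C_{XY}(t,\sigma) - \int_a^b h(t,\tau)\, C_Y(\tau,\sigma)\, d\tau = 0,$$
or
$$C_{XY}(t,\sigma) = \int_a^b h(t,\tau)\, C_Y(\tau,\sigma)\, d\tau, \qquad a \le \sigma \le b,$$
where $C_{XY}(t,\sigma) = \mathrm{Cov}(X_t, Y_\sigma)$ and $C_Y(\tau,\sigma) = \mathrm{Cov}(Y_\tau, Y_\sigma)$.
The optimal $h(t,\tau)$ is a solution of this integral equation.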
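Since we are dealing with covariances, the means have been eliminated. It is frequently assumed that $X_t$ and $Y_t$ have zero means. In this case, the covariances are equal to the correlations, the estimate is $\hat X_t = \int_a^b h(t,\tau) Y_\tau \, d\tau$, and we can write
$$R_{XY}(t,\sigma) = \int_a^b h(t,\tau)\, R_Y(\tau,\sigma)\, d\tau, \qquad a \le \sigma \le b,$$
where $R_{XY}(t,\sigma) = E[X_t Y_\sigma]$ and $R_Y(\tau,\sigma) = E[Y_\tau Y_\sigma]$.
This equation is called the Wiener-Hopf equation.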
An integral equation of this form is called a Fredholm equation. The theory on the existence of solutions of Fredholm integral equations is well known. In practice, solutions are usually computed numerically.
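As an illustration of a numerical solution, the following sketch treats a discrete-time analogue in which the integral becomes a finite sum and the Wiener-Hopf equation becomes a linear system $\sum_k h_k R_Y(\tau_k, \sigma_i) = R_{XY}(t, \sigma_i)$. The model is an assumption made only for this example: a zero-mean signal with $R_X(s,u) = \rho^{|s-u|}$, observed as $Y_k = X_k + N_k$ with white noise uncorrelated with the signal. The function names and default parameters are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def discrete_wiener_weights(times, t, rho=0.9, noise_var=0.5):
    """Weights h_k for the estimate X_hat_t = sum_k h_k * Y_{times[k]}.

    Assumed toy model: zero-mean X with R_X(s, u) = rho**|s - u|, and
    Y_k = X_k + N_k with white noise of variance noise_var, uncorrelated with X.
    Returns the weights and the resulting mean-squared error.
    """
    def r_x(s, u):
        return rho ** abs(s - u)

    n = len(times)
    # R_Y(tau_j, sigma_i) = R_X(tau_j, sigma_i) + noise_var on the diagonal.
    R_y = [[r_x(times[i], times[j]) + (noise_var if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    # R_XY(t, sigma_i) = E[X_t Y_{sigma_i}] = R_X(t, sigma_i) for this model.
    r_xy = [r_x(t, times[i]) for i in range(n)]
    h = solve_linear_system(R_y, r_xy)
    # By orthogonality, MSE = R_X(t, t) - sum_k h_k R_XY(t, tau_k).
    mmse = r_x(t, t) - sum(hk * rk for hk, rk in zip(h, r_xy))
    return h, mmse
```

For instance, `discrete_wiener_weights([0, 1, 2, 3, 4], 2)` is a smoothing problem, while `discrete_wiener_weights([0, 1, 2, 3, 4], 5)` is a one-step prediction problem under the same assumed model.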
The solution $h(t,\tau)$ is sometimes called a Wiener filter.
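When the observation interval is the entire real line ($a = -\infty$, $b = \infty$), the estimate of $X_t$ uses future as well as past values of $Y$, and the filter is called a non-causal Wiener filter.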
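A sketch of the standard frequency-domain form of the non-causal filter, under assumptions added here for illustration: $X_t$ and $Y_t$ are zero-mean and jointly wide-sense stationary, with $R_{XY}(u) = E[X_{t+u} Y_t]$ and $R_Y(u) = E[Y_{t+u} Y_t]$, and the observation interval is $(-\infty, \infty)$. The kernel then depends only on the time difference, $h(t,\tau) = h(t-\tau)$, and the Wiener-Hopf equation becomes a convolution:

$$R_{XY}(u) = \int_{-\infty}^{\infty} h(s)\, R_Y(u - s)\, ds, \qquad -\infty < u < \infty.$$

Taking Fourier transforms of both sides, with $S_{XY}(\omega)$ and $S_Y(\omega)$ the transforms of $R_{XY}$ and $R_Y$, gives

$$S_{XY}(\omega) = H(\omega)\, S_Y(\omega) \quad\Longrightarrow\quad H(\omega) = \frac{S_{XY}(\omega)}{S_Y(\omega)},$$

wherever $S_Y(\omega) > 0$.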
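It can be shown that the residual error for the noncausal Wiener filter, with jointly wide-sense stationary zero-mean processes, $\hat X_t = \int_{-\infty}^{\infty} h(\tau) Y_{t-\tau}\, d\tau$ and $R_{XY}(\tau) = E[X_{t+\tau} Y_t]$, is
$$\mathrm{MMSE} = E[(X_t - \hat X_t)^2] = R_X(0) - \int_{-\infty}^{\infty} h(\tau)\, R_{XY}(\tau)\, d\tau.$$
This can be seen as follows:
$$E[(X_t - \hat X_t)^2] = E[(X_t - \hat X_t) X_t] - E[(X_t - \hat X_t) \hat X_t].$$
By orthogonality, the last term is 0, which implies that
$$\mathrm{MMSE} = E[X_t^2] - E[\hat X_t X_t] = R_X(0) - \int_{-\infty}^{\infty} h(\tau)\, R_{XY}(\tau)\, d\tau.$$
The MMSE is sometimes written as
$$\mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ S_X(\omega) - \frac{|S_{XY}(\omega)|^2}{S_Y(\omega)} \right] d\omega,$$
where $S_X(\omega)$ and $S_Y(\omega)$ are the power spectral densities of $X_t$ and $Y_t$, and $S_{XY}(\omega)$ is their cross-spectral density.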
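As a worked example of the spectral form of the noncausal MMSE, $\mathrm{MMSE} = \frac{1}{2\pi}\int \left[S_X(\omega) - |S_{XY}(\omega)|^2 / S_Y(\omega)\right] d\omega$, consider the additive-noise model $Y_t = X_t + N_t$. This model is an assumption introduced here for illustration: $N_t$ is zero-mean, wide-sense stationary, and uncorrelated with $X_t$, with power spectral density $S_N(\omega)$. Then $S_{XY}(\omega) = S_X(\omega)$ and $S_Y(\omega) = S_X(\omega) + S_N(\omega)$, so

$$H(\omega) = \frac{S_X(\omega)}{S_X(\omega) + S_N(\omega)},$$

$$\mathrm{MMSE} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ S_X(\omega) - \frac{S_X^2(\omega)}{S_X(\omega) + S_N(\omega)} \right] d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{S_X(\omega)\, S_N(\omega)}{S_X(\omega) + S_N(\omega)}\, d\omega.$$

The filter passes frequencies where the signal dominates the noise and attenuates those where the noise dominates, and the error vanishes only where one of the two spectra is zero.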






