KALMAN FILTER

Linear Prediction Model
Given the measured values y_{n-1}, ..., y_{n-m}, the prediction of y_n is denoted Y_n, and the prediction error is the difference between the newly measured value and its prediction,

f_m(n) = y_n - Y_n( y_{n-1}, ..., y_{n-m} )

a_n = f_m(n) is called the innovation. We assume that the prediction is a linear function of the measured values (linear prediction), with coefficients c_{n,k}. The innovation can therefore be written

a_n = y_n + ∑_{k=1..m} c_{n,k} y_{n-k}

The coefficients of the linear prediction are obtained by minimizing the mean-square prediction error, which is equivalent to requiring that the innovation be uncorrelated with the measurements (the orthogonality principle),

E[ a_n y_{n-k} ] = 0     for k = 1..m

or equivalently

E[ a_n a_{n-k} ] = 0     for k = 1..m

If the process is stationary the prediction coefficients c_{n,k} do not depend on the index n, but only on the delay index k. In this case they are computed by solving the linear system (the normal equations)

∑_{k=1..m} E[ y_{n-h} y_{n-k} ] c_k + E[ y_{n-h} y_n ] = 0     for h = 1..m
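As a sketch, the stationary normal equations can be solved numerically. The following is a minimal Python/NumPy illustration; the AR(1) test signal and the function name are illustrative choices, not part of the text:

```python
import numpy as np

def prediction_coefficients(y, m):
    """Solve sum_{k=1..m} E[y_{n-h} y_{n-k}] c_k + E[y_{n-h} y_n] = 0
    for the stationary linear-prediction coefficients c_1..c_m."""
    n = len(y)
    # biased sample autocorrelations R[d] ~ E[y_n y_{n-d}], d = 0..m
    R = np.array([np.dot(y[: n - d], y[d:]) / n for d in range(m + 1)])
    # Toeplitz system: T[h,k] = R[|h-k|], right-hand side -R[h]
    T = np.array([[R[abs(h - k)] for k in range(m)] for h in range(m)])
    return np.linalg.solve(T, -R[1 : m + 1])

# check on an AR(1) process y_n = 0.9 y_{n-1} + noise:
# the innovation a_n = y_n + c_1 y_{n-1} is white when c_1 = -0.9
rng = np.random.default_rng(0)
y = np.zeros(20000)
for i in range(1, len(y)):
    y[i] = 0.9 * y[i - 1] + rng.standard_normal()
c = prediction_coefficients(y, 1)
print(c)  # approximately [-0.9]
```

For larger m the Toeplitz structure of the system can be exploited (e.g., by the Levinson-Durbin recursion) instead of a generic solve.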

The estimate of the parameter variable x is assumed to be a linear function of the measurements or, equivalently, of the innovations (the innovations are obtained from the measurements by an invertible linear transformation, so the two carry the same information),

X_n( y_n, y_{n-1}, ..., y_{n-m} ) = X_n( a_n, a_{n-1}, ..., a_{n-m} ) = ∑_{k=0..m} b_k a_{n-k}

The coefficients b_k are found by minimizing the mean-square distance between x_n and X_n. In order to find the minimum of

F = E[ | x_n - ∑_{k=0..m} b_k a_{n-k} |^2 ]

the derivatives of F with respect to the bk are computed and equated to zero. This gives the equations

∑_{j=0..m} E[ a_{n-k} a_{n-j} ] b_j = E[ x_n a_{n-k} ]     for k = 0..m

Since E[ a_k a_j ] = 0 for k ≠ j (the innovations are uncorrelated), this system is immediately solved,

b_k = E[ x_n a_{n-k} ] / E[ a_{n-k} a_{n-k} ]

and

X_n(y) = X_{n-1}(y) + b_n a_n

which is suitable for a recursive implementation, since X_n(y_1, ..., y_n) is expressed as X_{n-1}(y_1, ..., y_{n-1}) plus a correction term: the coefficient b_n, which depends on the statistics of x_n and a_n (i.e., of x_n and y_n), multiplied by the innovation a_n.
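A toy scalar instance of this recursion (an illustrative example, not from the text): if x is an unknown constant and y_n = x + v_n with white noise v, the innovation is a_n = y_n - X_{n-1} and the gain works out to b_n = 1/n, so the recursion reproduces the running sample mean:

```python
import numpy as np

# Recursive estimation of an unknown constant x from noisy measurements
# y_n = x + v_n.  Innovation: a_n = y_n - X_{n-1}; gain: b_n = 1/n.
rng = np.random.default_rng(2)
x_true = 3.0
y = x_true + 0.5 * rng.standard_normal(1000)

X = 0.0
for n, yn in enumerate(y, start=1):
    a = yn - X          # innovation
    X = X + a / n       # X_n = X_{n-1} + b_n a_n with b_n = 1/n

print(X)                # equals the batch sample mean np.mean(y)
```

The point of the recursive form is that each new measurement updates the previous estimate in O(1) work, without revisiting the whole history.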


Kalman Filter

Suppose we have a model described by the (linear) evolution equation

x_{n+1} = A_n x_n + B_{n+1} u_{n+1} + w_n

where A_n is an MxM state transition matrix, x_n is an M-dimensional state vector (the parameter variable of the system), and w_n is an M-dimensional noise vector with normal probability distribution, zero mean, and covariance W,

E[ w_n w_k^t ] = W δ_{n,k}

u_{n+1} is an additional control parameter, which will be ignored in the sequel.

The measured variable is

y_n = H_n x_n + v_n

where H_n is an NxM matrix and v_n an N-dimensional measurement noise vector, with normal probability distribution, zero mean, covariance V, and uncorrelated with w_k. The process and measurement noise covariances might change at each step, i.e., they might depend on the index n, but we will assume that they are constant. The matrices A_n and H_n are assumed known; in general they change with the time step.

The covariance of the a-priori state estimate error (defined below) is P_{n+1|n} = E[ e_{n+1|n} e_{n+1|n}^t ].

The a-priori estimate of the system state (i.e., the estimate at time step n+1 given the knowledge up to time n) is

x_{n+1|n} = A_n x_{n|n}

(the noise w_n has zero mean and is unknown, so it does not contribute to the estimate),

and the a-posteriori estimate of the system state, given the measurement y_{n+1}, is

x_{n+1|n+1} = x_{n+1|n} + K_{n+1} ( y_{n+1} - H_{n+1} x_{n+1|n} )

where the Kalman gain matrix K_{n+1} is to be determined.

The estimate errors are

e_{n+1|n} = x_{n+1} - x_{n+1|n}
e_{n+1|n+1} = x_{n+1} - x_{n+1|n+1}

The covariances of the a-priori and a-posteriori estimate errors are computed recursively,

P_{n+1|n} = A_n P_{n|n} A_n^t + W
P_{n+1|n+1} = ( I - K_{n+1} H_{n+1} ) P_{n+1|n}

and the Kalman gain matrix is

K_{n+1} = P_{n+1|n} H_{n+1}^t ( V + H_{n+1} P_{n+1|n} H_{n+1}^t )^{-1}
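Putting the recursions together, one predict/update step can be sketched in Python/NumPy. The constant-velocity model and the specific matrices A, H, W, V below are illustrative assumptions, not part of the text:

```python
import numpy as np

# Illustrative constant-velocity model: state x = (position, velocity),
# only the position is measured.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # state transition (unit time step)
H = np.array([[1.0, 0.0]])      # measurement matrix
W = 1e-4 * np.eye(2)            # process-noise covariance
V = np.array([[0.25]])          # measurement-noise covariance

def kalman_step(x, P, y):
    # a-priori estimate:  x_{n+1|n} = A x_{n|n},  P_{n+1|n} = A P_{n|n} A^t + W
    x_pred = A @ x
    P_pred = A @ P @ A.T + W
    # Kalman gain:  K = P_{n+1|n} H^t ( V + H P_{n+1|n} H^t )^{-1}
    K = P_pred @ H.T @ np.linalg.inv(V + H @ P_pred @ H.T)
    # a-posteriori estimate, driven by the measurement innovation
    x_post = x_pred + K @ (y - H @ x_pred)
    P_post = (np.eye(len(x)) - K @ H) @ P_pred
    return x_post, P_post

# track a target moving at constant velocity 0.5, position noise std 0.5
rng = np.random.default_rng(1)
x_est, P = np.zeros(2), np.eye(2)
for n in range(1, 201):
    y = np.array([0.5 * n + 0.5 * rng.standard_normal()])
    x_est, P = kalman_step(x_est, P, y)

print(x_est)   # estimated (position, velocity), close to (100, 0.5)
```

Note that the gain and covariance recursion do not depend on the measurements, only on A, H, W, V, so for constant matrices K converges to a steady-state value that could be precomputed.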

To derive these results we begin by writing the a-posteriori estimate as a linear combination of the a-priori estimate and the difference between the actual measurement and the predicted measurement,

x_{n+1|n+1} = x_{n+1|n} + K_{n+1} ( y_{n+1} - H_{n+1} x_{n+1|n} )

The term in parentheses is called the measurement innovation: when it is zero there is no difference between the predicted and the actual measurement. The matrix K_{n+1} is called the gain and is chosen so as to minimize the a-posteriori error covariance.

To find it we substitute this equation into the definition of the covariance, take the derivative with respect to K, and equate it to zero. The result is the expression of the Kalman gain matrix written above.

When the process is not linear, the Kalman filter theory can be applied to the system linearized about the current mean and covariance. This is called the Extended Kalman Filter.
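As a sketch of the linearization, consider a scalar system with a nonlinear measurement (the cubic sensor below is an illustrative choice, not from the text): x_{n+1} = x_n + w_n, y_n = x_n^3 + v_n. The EKF replaces H by the Jacobian dh/dx = 3 x^2 evaluated at the current estimate:

```python
import numpy as np

# Extended Kalman Filter, scalar toy problem:
#   state:        x_{n+1} = x_n + w_n           (A = 1)
#   measurement:  y_n = x_n**3 + v_n,  linearized with H_n = 3 x**2
W, V = 1e-4, 0.01
rng = np.random.default_rng(3)
x_true, x_est, P = 1.5, 1.0, 1.0

for _ in range(200):
    y = x_true**3 + np.sqrt(V) * rng.standard_normal()
    # predict (A = 1, so the a-priori estimate is unchanged)
    P = P + W
    # linearize the measurement about the current estimate
    Hlin = 3.0 * x_est**2
    K = P * Hlin / (V + Hlin * P * Hlin)
    # update: the innovation uses the full nonlinear measurement model
    x_est = x_est + K * (y - x_est**3)
    P = (1.0 - K * Hlin) * P

print(x_est)   # close to the true value 1.5
```

Note that the innovation is computed with the exact nonlinear function h(x) = x^3; only the gain and covariance recursions use the Jacobian, so the EKF is a local approximation and can diverge when the linearization is poor.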


"Kalman Filter Theory" in S. Haykin, Adaptive Filter Theory, Prentice Hall, 1986, p. 269.
http://www.cs.unc.edu/~welch/kalman/


Marco Corvi - Page hosted by geocities.com.