Separation principle in stochastic control

The separation principle, one of the fundamental principles of stochastic control theory, states that the problems of optimal control and state estimation can be decoupled under certain conditions.

In its most basic formulation it deals with a linear stochastic system with a state process, an output process, and a control, driven by a Wiener process.

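Under suitable conditions, the optimal control is a linear feedback of the state estimate produced by the Kalman filter, where the feedback gain is the gain of the optimal linear-quadratic regulator obtained by taking the system to be noise-free with a deterministic initial state.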

There is also a non-Gaussian version of this problem in which the driving Wiener process is replaced by a more general square-integrable martingale with possible jumps.[1] In this case, the Kalman filter needs to be replaced by a nonlinear filter providing an estimate of the (strict sense) conditional mean of the state given the filtration generated by the output process, i.e., the increasing family of sigma fields representing the data as it is produced.
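For the basic Gaussian case, the decoupling can be illustrated in a discrete-time analogue of the problem. The sketch below is only an illustration under stated assumptions: the discrete-time model, the function names `lqr_gains` and `kalman_gains`, and all matrix names are hypothetical and not taken from the text. It shows that the regulator gains are computed from the dynamics and cost weights alone, while the filter gains are computed from the dynamics and noise covariances alone.

```python
import numpy as np

def lqr_gains(A, B, Q, R, S, N):
    """Backward Riccati recursion for a finite-horizon LQ regulator.

    Assumed model: x_{k+1} = A x_k + B u_k, cost sum(x'Qx + u'Ru) + x_N' S x_N.
    Only (A, B) and the cost weights (Q, R, S) enter; no noise statistics.
    Returns gains K[0..N-1] for the law u_k = K[k] @ xhat_k.
    """
    P = S
    gains = []
    for _ in range(N):
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A + B @ K)
        gains.append(K)
    return gains[::-1]  # recursion runs backward in time

def kalman_gains(A, C, W, V, P0, N):
    """Forward Riccati recursion for the Kalman filter.

    Assumed model: process noise covariance W, measurement noise covariance V,
    prior covariance P0. Only (A, C) and the noise statistics enter; no cost.
    Returns measurement-update gains L[0..N-1].
    """
    n = A.shape[0]
    P = P0
    gains = []
    for _ in range(N):
        L = P @ C.T @ np.linalg.inv(C @ P @ C.T + V)
        gains.append(L)
        P = A @ (np.eye(n) - L @ C) @ P @ A.T + W
    return gains

# Illustrative numbers (assumptions): a sampled double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
K = lqr_gains(A, B, Q=np.eye(2), R=np.eye(1), S=np.eye(2), N=20)
L = kalman_gains(A, C, W=0.01 * np.eye(2), V=0.1 * np.eye(1), P0=np.eye(2), N=20)
```

The two recursions share only the system matrix, so either design can be changed without recomputing the other. This mirrors the certainty-equivalence structure described above, but only for this simplified discrete-time Gaussian setting.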

In the early literature on the separation principle it was common to allow as admissible controls all processes that are adapted to the filtration generated by the output process.

This is equivalent to allowing all non-anticipatory Borel functions as feedback laws, which raises the question of existence of a unique solution to the equations of the feedback loop.

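Here the covariance matrix in question is that of the state estimation error. The separation principle would now follow immediately if this covariance matrix were independent of the control.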

Since the control process is in general a nonlinear function of the data and thus non-Gaussian, so is the output process.

Consequently, the problem with possibly control-dependent sigma fields does not occur in the usual discrete-time formulation.

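However, constructing the continuous-time error covariance as a limit of its discrete-time counterpart, which does not depend on the control, is circular or at best incomplete; see Remark 4 in Georgiou and Lindquist.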
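The discrete-time claim can be checked numerically. The following sketch assumes a discrete-time linear model with a Kalman filter that knows the applied control; the function `run_loop`, the two feedback laws, and all numerical values are hypothetical choices made for illustration. The same noise sample path is fed through the loop under a linear law and under a discontinuous relay law, and the estimation errors are compared.

```python
import numpy as np

def run_loop(control_law, w, v, x0, A, B, C, W, V, P0):
    """Closed-loop simulation of x_{k+1} = A x_k + B u_k + w_k, y_k = C x_k + v_k
    with a Kalman filter. Returns the estimation errors x_k - xhat_k."""
    n = A.shape[0]
    x = x0.copy()
    xpred = np.zeros(n)          # prior mean of x_0 (assumed zero)
    P = P0.copy()                # prior covariance of x_0
    errors = []
    for k in range(len(w)):
        y = C @ x + v[k]
        L = P @ C.T @ np.linalg.inv(C @ P @ C.T + V)
        xhat = xpred + L @ (y - C @ xpred)      # measurement update
        errors.append(x - xhat)
        u = control_law(xhat)                    # feedback on the estimate
        x = A @ x + B @ u + w[k]                 # true state
        xpred = A @ xhat + B @ u                 # time update uses the same u
        P = A @ (np.eye(n) - L @ C) @ P @ A.T + W  # no u appears here
    return np.array(errors)

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
W, V, P0 = 0.01 * np.eye(2), np.array([[0.1]]), np.eye(2)
N = 50
w = rng.multivariate_normal(np.zeros(2), W, size=N)
v = rng.multivariate_normal(np.zeros(1), V, size=N)
x0 = rng.multivariate_normal(np.zeros(2), P0)

linear = lambda xhat: np.array([-1.0 * xhat[0] - 1.5 * xhat[1]])
relay = lambda xhat: np.array([-np.sign(xhat[0])])   # nonlinear, discontinuous

e_lin = run_loop(linear, w, v, x0, A, B, C, W, V, P0)
e_rel = run_loop(relay, w, v, x0, A, B, C, W, V, P0)
# The difference is expected to be at the level of floating-point rounding.
print(np.max(np.abs(e_lin - e_rel)))
```

The control term cancels in the one-step prediction error, so neither the error sequence nor the covariance recursion depends on the feedback law in this setting. The sketch says nothing about the continuous-time case, where the subtleties discussed in the text arise.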

One approach is to change the probability measure via a Girsanov transformation so that a suitably transformed noise process becomes a new Wiener process, which (under the new probability measure) can be assumed to be unaffected by the control.

The question of how this could be implemented in an engineering system is left open.

Although a nonlinear control law will produce a non-Gaussian state process, it can be shown, using nonlinear filtering theory (Chapter 16.1 in Liptser and Shiryaev[14]), that the state process is conditionally Gaussian given the filtration generated by the output process.

However, this requires quite a sophisticated analysis and is restricted to the case where the driving noise is a Wiener process.

In this more general formulation the embedding procedure of Lindquist[2] defines the class

This approach considers stochastic systems as well-defined maps between sample paths rather than between stochastic processes and allows us to extend the separation principle to systems driven by martingales with possible jumps.

The approach is motivated by engineering thinking where systems and feedback loops process signals, and not stochastic processes per se or transformations of probability measures.

Hence the purpose is to create a natural class of admissible control laws that make engineering sense, including those that are nonlinear and discontinuous.

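The feedback equation has a unique strong solution if there exists a non-anticipating function that maps the input to a solution with probability one, and all other solutions coincide with it with probability one.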

The resulting feedback loop is deterministically well-posed in the sense that the feedback equations admit a unique solution that causally depends on the input for each input sample path.

In this context, a signal is defined to be a sample path of a stochastic process with possible discontinuities.

Hence the response of a typical nonlinear operation that involves thresholding and switching can be modeled as a signal.

Systems are taken to be non-anticipatory maps between signals: the output at any given time is a measurable function of past values of the input and time.
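A switching element acting on a sampled path gives a small illustration of such a non-anticipatory map. The sketch below is hypothetical: the function `relay_with_hysteresis`, its thresholds, and the random-walk input are assumptions made for illustration. It produces a discontinuous output path from an input path and checks that changing the input in the future leaves the earlier output untouched.

```python
import numpy as np

def relay_with_hysteresis(z, low=-0.5, high=0.5, initial=-1.0):
    """Map an input path z[0..N-1] to a switching output path.

    The output at index k depends only on z[0..k]: it switches to +1 when
    the input rises above `high`, to -1 when it falls below `low`, and
    otherwise holds its previous value.
    """
    out = np.empty(len(z))
    state = initial
    for k, zk in enumerate(z):
        if zk > high:
            state = 1.0
        elif zk < low:
            state = -1.0
        out[k] = state
    return out

rng = np.random.default_rng(1)
z = np.cumsum(rng.normal(scale=0.3, size=200))   # one input sample path
z_alt = z.copy()
z_alt[100:] += 5.0                               # change only the later part

y = relay_with_hysteresis(z)
y_alt = relay_with_hysteresis(z_alt)
# Causality check: the outputs agree before the input was changed.
print(np.array_equal(y[:100], y_alt[:100]))
```

The map is defined path by path, with no reference to a probability measure, which is the sense in which thresholding and switching operations fit the signal-based viewpoint described here.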

For example, stochastic differential equations with Lipschitz coefficients driven by a Wiener process induce maps between corresponding path spaces; see page 127 in Rogers and Williams[16] and pages 126-128 in Klebaner.[17]

Also, under fairly general conditions (see, e.g., Chapter V in Protter[18]), stochastic differential equations driven by martingales with sample paths in the space of right-continuous functions with left limits have strong solutions that are semimartingales.

Examples of simple systems that are not deterministically well-posed are given in Remark 12 in Georgiou and Lindquist.[1]

By considering only feedback laws that are deterministically well-posed, all admissible control laws are physically realizable in the engineering sense that they induce a signal that travels through the feedback loop.

The proof of the following theorem can be found in Georgiou and Lindquist 2013.

For the linear stochastic system described above, consider the problem of minimizing the quadratic functional J(u) over the class of all deterministically well-posed feedback laws.