For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the Moon with minimum fuel expenditure.
[2] Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy.
[3] A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory.
[6] The method is largely due to the work of Lev Pontryagin and Richard Bellman in the 1950s, after contributions to the calculus of variations by Edward J. McShane.
The question is, how should the driver press the accelerator pedal in order to minimize the total traveling time?
In this example, the term control law refers specifically to the way in which the driver presses the accelerator and shifts the gears.
The system consists of both the car and the road, and the optimality criterion is the minimization of the total traveling time.
Another related optimal control problem may be to find the way to drive the car so as to minimize its fuel consumption, given that it must complete a given course in a time not exceeding some amount.
Conditions requiring the state weighting matrix Q to be positive semi-definite and the control weighting matrix R to be positive definite in the infinite-horizon case are enforced to ensure that the cost functional remains positive.
Furthermore, in order to ensure that the cost functional is bounded, the additional restriction is imposed that the pair (A, B) be controllable.
Note that the LQ or LQR cost functional can be thought of physically as attempting to minimize the control energy (measured as a quadratic form).
The infinite horizon problem (i.e., LQR) may seem overly restrictive and essentially useless because it assumes that the operator is driving the system to the zero state and hence driving the output of the system to zero.
In fact, it can be proved that this secondary LQR problem can be solved in a very straightforward manner.
For the finite-horizon LQ problem, the Riccati equation is integrated backward in time using the terminal boundary condition S(t_f) = S_f, where S_f is the terminal cost weighting matrix.
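The backward sweep can be sketched on an assumed scalar, discrete-time analogue of the finite-horizon LQ problem (the coefficients a, b, q, r and the terminal weight s_f below are illustrative choices, not values from the text); the Riccati difference equation is iterated backward from the terminal condition:

```python
# Scalar discrete-time LQ sketch: minimize sum of (q*x_t^2 + r*u_t^2)/2
# subject to x_{t+1} = a*x_t + b*u_t, with terminal weight s_f.
# The Riccati recursion runs backward from s_T = s_f.

a, b, q, r, s_f = 1.0, 1.0, 1.0, 1.0, 0.0   # illustrative coefficients

def riccati_sweep(T):
    """Return the sequence s_T, ..., s_0 and the feedback gains k_t,
    computed backward from the terminal condition s_T = s_f."""
    s_list, k_list = [s_f], []
    s = s_f
    for _ in range(T):
        k = a * s * b / (r + b * s * b)                  # gain for u_t = -k*x_t
        s = q + a * s * a - (a * s * b) ** 2 / (r + b * s * b)
        s_list.append(s)
        k_list.append(k)
    return s_list[::-1], k_list[::-1]

s_vals, gains = riccati_sweep(50)
# For these coefficients the recursion converges to the fixed point
# s = (1 + 5**0.5) / 2 (the golden ratio) as the horizon grows.
```

In the continuous-time setting the sweep would instead integrate the Riccati differential equation backward with an ODE solver; the discrete recursion above shows the same backward-in-time structure.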
As a result, it is necessary to employ numerical methods to solve optimal control problems.
In an indirect method, the calculus of variations is employed to obtain the first-order optimality conditions.
This boundary-value problem actually has a special structure because it arises from taking the derivative of a Hamiltonian.
Here the derivative is taken of the augmented Hamiltonian (the Hamiltonian with the path constraints adjoined through Lagrange multipliers), and in an indirect method the resulting boundary-value problem is solved using the appropriate boundary or transversality conditions.
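As a minimal sketch of an indirect method, consider an assumed toy problem (not from the text): minimize (1/2) times the integral of u^2 over [0, 1], subject to x' = u, x(0) = 0, x(1) = 1. The Hamiltonian H = u^2/2 + lam*u gives the stationarity condition u = -lam and the costate equation lam' = 0, and single shooting on the unknown initial costate solves the resulting boundary-value problem:

```python
# Indirect single-shooting sketch for the toy problem above.
# Pontryagin: H = u^2/2 + lam*u, so dH/du = 0 gives u = -lam,
# and the costate equation is lam' = -dH/dx = 0.

def shoot(lam0, n=1000):
    """Integrate the state equation forward (explicit Euler) for a guessed
    initial costate and return the terminal state x(1)."""
    dt = 1.0 / n
    x, lam = 0.0, lam0
    for _ in range(n):
        u = -lam        # stationarity condition: minimizes the Hamiltonian
        x += dt * u     # state equation x' = u
        # costate equation lam' = 0, so lam stays constant
    return x

def solve_bvp(target=1.0, tol=1e-10):
    """Secant iteration on the unknown initial costate lam(0) so that the
    terminal boundary condition x(1) = target is satisfied."""
    a, b = 0.0, 1.0
    fa, fb = shoot(a) - target, shoot(b) - target
    while abs(fb) > tol:
        a, fa, b = b, fb, b - fb * (b - a) / (fb - fa)
        fb = shoot(b) - target
    return b

lam0 = solve_bvp()   # analytic solution: lam = -1, so u(t) = 1
```

Here the analytic solution makes the shooting residual linear in the guess, so the secant iteration lands on it almost immediately; realistic indirect methods must root-find over the full vector of unknown initial costates, which is considerably harder.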
[10] The approach that has risen to prominence in numerical optimal control since the 1980s is that of so-called direct methods.
The reason for the relative ease of computation, particularly of a direct collocation method, is that the NLP is sparse and many well-known software programs exist (e.g., SNOPT[13]) to solve large sparse NLPs.
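The discretize-then-optimize idea behind a direct method can be sketched on an assumed toy problem (an illustrative choice, not from the text): minimize (1/2) times the integral of u^2 over [0, 1], with x' = u, x(0) = 0 and terminal target x(1) = 1. The controls on each interval become the NLP variables; here the terminal condition is handled with a quadratic penalty, and plain gradient descent stands in for a sparse NLP solver such as SNOPT:

```python
# Discretize-then-optimize sketch of a direct method for the toy problem.
# The N interval controls are the decision variables; the terminal condition
# is enforced with a quadratic penalty.

N = 50
dt = 1.0 / N
rho = 1e4            # penalty weight on the terminal-constraint violation
step = 1e-3          # gradient-descent step size

def terminal_state(u):
    """Explicit-Euler transcription of x' = u from x(0) = 0."""
    x = 0.0
    for uk in u:
        x += dt * uk
    return x

u = [0.0] * N        # decision variables: one control value per interval
for _ in range(2000):
    defect = terminal_state(u) - 1.0
    # gradient of (dt/2)*sum(u_k^2) + (rho/2)*defect^2 with respect to u_k
    u = [uk - step * (dt * uk + rho * defect * dt) for uk in u]

# u_k tends to rho/(1 + rho), close to the analytic optimum u(t) = 1
```

A real direct collocation code would also keep the states as variables, enforce the dynamics as sparse defect constraints, and hand the resulting NLP to a dedicated solver; the point of the sketch is only the transcription from an infinite-dimensional control to a finite-dimensional optimization.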
Examples of academically developed MATLAB software tools implementing direct methods include RIOTS,[20] DIDO,[21] DIRECT,[22] FALCON.m,[23] and GPOPS,[24] while an example of an industry developed MATLAB tool is PROPT.
[26] Finally, general-purpose MATLAB optimization environments such as TOMLAB have made coding complex optimal control problems significantly easier than was previously possible in languages such as C and FORTRAN.
The Theory of Consistent Approximations[27][28] provides conditions under which solutions to a sequence of increasingly accurate discretized optimal control problems converge to the solution of the original, continuous-time problem.
[29] For instance, using a variable step-size routine to integrate the problem's dynamic equations may generate a gradient which does not converge to zero (or point in the right direction) as the solution is approached.
A common solution strategy in many optimal control problems is to solve for the costate (sometimes called the shadow price).
The costate summarizes in one number the marginal value of expanding or contracting the state variable next turn.
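This marginal-value interpretation can be checked numerically on an assumed scalar, discrete-time LQ problem (the coefficients below are illustrative, not from the text): the optimal cost-to-go is quadratic, V(x) = s*x^2/2, and the costate lam_t = s_t*x_t, so lam_0 should match a finite-difference derivative of the optimal cost with respect to the initial state:

```python
# Shadow-price check: minimize sum of (q*x_t^2 + r*u_t^2)/2
# subject to x_{t+1} = a*x_t + b*u_t.

a, b, q, r = 1.0, 1.0, 1.0, 1.0   # illustrative coefficients

def riccati(T):
    """Backward Riccati sweep from s_T = 0; returns s_0."""
    s = 0.0
    for _ in range(T):
        s = q + a * s * a - (a * s * b) ** 2 / (r + b * s * b)
    return s

def optimal_cost(x0, T=200):
    """Simulate the optimal feedback u = -k*x and accumulate the cost.
    T is large enough that s and the gain have effectively converged."""
    s = riccati(T)
    k = a * s * b / (r + b * s * b)
    x, J = x0, 0.0
    for _ in range(T):
        u = -k * x
        J += 0.5 * (q * x * x + r * u * u)
        x = a * x + b * u
    return J

x0, eps = 2.0, 1e-5
costate = riccati(200) * x0       # lam_0 = s_0 * x_0
marginal = (optimal_cost(x0 + eps) - optimal_cost(x0 - eps)) / (2 * eps)
# costate and marginal agree to numerical precision
```

The central difference and s*x0 agree to roughly floating-point precision, confirming that the costate prices out a marginal unit of the state.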
Having obtained the costate, the turn-t optimal value for the control can usually be solved as a differential equation conditional on knowledge of the costate.
Again it is infrequent, especially in continuous-time problems, that one obtains the value of the control or the state explicitly.
Usually, the strategy is to solve for thresholds and regions that characterize the optimal control and use a numerical solver to isolate the actual choice values in time.
The objective is to maximize profits over the period of ownership, with no time discounting.