Lyapunov optimization

This article describes Lyapunov optimization for dynamical systems.

Lyapunov functions are used extensively in control theory to ensure different forms of system stability.

The state of a system at a particular time is often described by a multi-dimensional vector.

A Lyapunov function is a nonnegative scalar measure of this multi-dimensional state.

Typically, the function is defined to grow large when the system moves towards undesirable states.

System stability is achieved by taking control actions that make the Lyapunov function drift in the negative direction towards zero.

Lyapunov drift is central to the study of optimal control in queueing networks.

Minimizing the drift of a quadratic Lyapunov function leads to the backpressure routing algorithm for network stability, also called the max-weight algorithm.

[1][2] Adding a weighted penalty term to the Lyapunov drift and minimizing the sum leads to the drift-plus-penalty algorithm for joint network stability and penalty minimization.

define: This function is a scalar measure of the total queue backlog in the network.

It is called quadratic Lyapunov function on the queue state.

This equation can be used to compute a bound on the Lyapunov drift for any slot t: Rearranging this inequality, summing over all

and dividing by 2 leads to: where: Suppose the second moments of arrivals and service in each queue are bounded, so that there is a finite constant

the following property holds: Taking conditional expectations of (Eq.

1) leads to the following bound on the conditional expected Lyapunov drift: In many cases, the network can be controlled so that the difference between arrivals and service at each queue satisfies the following property for some real number

is non-negative and rearranging the terms in the above expression proves the result.

Suppose the goal is to stabilize the queueing network while minimizing the time average of

For example, to stabilize the network while minimizing time average power,

can be defined as the total power incurred by the network on slot t.[8] To treat problems of maximizing the time average of some desirable reward

[3] To stabilize the network while minimizing the time average of the penalty

network algorithms can be designed to make control actions that greedily minimize a bound on the following drift-plus-penalty expression on each slot

is a non-negative weight that is chosen as desired to affect a performance tradeoff.

A key feature of this approach is that it typically does not require knowledge of the probabilities of the random network events (such as random job arrivals or channel realizations).

reduces to minimizing a bound on the drift every slot and, for routing in multi-hop queueing networks, reduces to the backpressure routing algorithm developed by Tassiulas and Ephremides.

leads to the drift-plus-penalty algorithm for minimizing average power subject to network stability developed by Neely.

as the negative of an admission control utility metric leads to the drift-plus-penalty algorithm for joint flow control and network routing developed by Neely, Modiano, and Li.

[3] A generalization of the Lyapunov drift theorem of the previous section is important in this context.

parameter can be tuned to make time average penalty as close to (or below) the target as desired, with a corresponding queue size tradeoff.

and rearranging terms proves the time average penalty bound.

A similar argument proves the time average queue size bound.