Optimal stopping problem and Stochastic control

Optimal stopping is a problem in the context of optimal [[stochastic control]] whose solution is obtained through the [[obstacle problem]].


== Description of the problem ==
The setting is the following. There is a stochastic process $X_t$, and we have the choice of stopping it at any time $\tau$. When we stop, we are given a quantity $\varphi(X_\tau)$. The problem is to choose the stopping time $\tau$ that maximizes the expected value of the final payoff $\varphi(X_\tau)$.


The choice of the stopping time $\tau$ has to be made using only the information available up to time $\tau$. In probabilistic terms, $\tau$ has to be a stopping time with respect to the filtration associated with the stochastic process $X_t$: the decision to stop at time $t$ can depend only on the history of the process up to time $t$.


The problem may have some extra constraints. For example, we may be forced to stop before an expiration time $T$, or as soon as $X_t$ exits a domain $D$.
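As a toy illustration, the following Python sketch simulates a symmetric random walk that must stop no later than an expiration time $T$ and estimates, by Monte Carlo, the expected payoff $\mathbb E[\varphi(X_\tau)]$ of two admissible stopping rules; the payoff $\varphi$ and the threshold rule are arbitrary choices made only for this example.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    # Illustrative payoff: we are rewarded for stopping at a high value.
    return max(x, 0.0)

def expected_payoff(stop_rule, n_paths=5000, T=100):
    """Monte Carlo estimate of E[phi(X_tau)] for an adapted stopping rule.

    The rule may only look at the current time and position, so the
    decision to stop uses information available up to that time only.
    """
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        for t in range(T):
            if stop_rule(t, x):
                break
            x += rng.choice((-1.0, 1.0))   # symmetric random walk step
        total += phi(x)
    return total / n_paths

# Two admissible rules: stop immediately, or stop the first time the walk
# reaches an (arbitrary) threshold, and in any case no later than time T.
stop_now = lambda t, x: True
threshold_rule = lambda t, x: x >= 5.0

print("stop immediately:", expected_payoff(stop_now))
print("threshold rule  :", expected_payoff(threshold_rule))
</syntaxhighlight>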


== Connection with American options ==
In finance, an option gives an agent the right to buy or sell a given asset or basket of assets in the future. The payoff of the option is a random variable that depends on the value of these assets at the moment the option is exercised. American options can be exercised at any time up to their expiration time $T$. If we model the price of the assets by a stochastic process $X_t$, choosing the moment to exercise the option so as to maximize the expected payoff is exactly the optimal stopping problem.


This is a highly simplified model for the pricing of American options. In [[financial mathematics]] there are other factors that enter into consideration (mostly related to risk).
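As a minimal sketch of this idea, the following Python snippet values an American put on a Cox-Ross-Rubinstein binomial tree: at every node the holder compares the payoff of exercising immediately with the discounted expected value of continuing, which is precisely the stop-versus-continue decision of the optimal stopping problem. The numerical parameters are arbitrary, and risk enters only through the standard risk-neutral probabilities.

<syntaxhighlight lang="python">
import numpy as np

def american_put_binomial(S0, K, r, sigma, T, n):
    """Value of an American put on an n-step CRR binomial tree."""
    dt = T / n
    u = np.exp(sigma * np.sqrt(dt))      # up factor
    d = 1.0 / u                          # down factor
    p = (np.exp(r * dt) - d) / (u - d)   # risk-neutral up probability
    disc = np.exp(-r * dt)

    # Asset prices and exercise payoffs at maturity.
    S = S0 * u ** np.arange(n, -1, -1) * d ** np.arange(0, n + 1)
    V = np.maximum(K - S, 0.0)

    # Backward induction: value = max(exercise now, expected value of continuing).
    for step in range(n - 1, -1, -1):
        S = S0 * u ** np.arange(step, -1, -1) * d ** np.arange(0, step + 1)
        continuation = disc * (p * V[:-1] + (1 - p) * V[1:])
        V = np.maximum(K - S, continuation)
    return V[0]

print(american_put_binomial(S0=100, K=100, r=0.05, sigma=0.2, T=1.0, n=500))
</syntaxhighlight>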


== Derivation of the obstacle problem ==


Let us assume that the stochastic process $X_t$ is a [[Levy process]] with generator operator $L$. The expected value of $\varphi(X_{\tau})$ naturally depends on the initial point $x$ where the Levy process starts, so we write $u(x) = \sup_\tau \mathbb E[\varphi(X_{\tau}) \mid X_0 = x]$ for the value obtained with the optimal choice of the stopping time $\tau$.


The choice of when to stop depends only on the current position of $X_t$. This is a simple consequence of the Markov property of Levy processes or, in layman's terms, of the fact that the future of a Levy process depends only on the current position and not on the past. Thus, there are some points $x$ where we would choose to stop and others where we would choose to continue.
 
At those points $x$ where we would choose to continue, the function $u$ satisfies the equation given by the generator operator, $Lu(x) = 0$. If we choose to continue, it is because this choice is at least as good as stopping, so necessarily $u(x) \geq \varphi(x)$ at these points.
 
There are other points where we would choose to stop the process. At those points we are immediately given the value of the payoff function, thus $u(x) = \varphi(x)$. On the other hand, if we choose to stop, it is because continuing would not improve the expected value of the payoff, and therefore $Lu(x) \leq 0$ at those points (the function is a supersolution there).
 
Therefore we have derived the conditions of the [[obstacle problem]].
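In other words, $u \geq \varphi$ everywhere, $Lu \leq 0$ everywhere, and at every point either $u = \varphi$ or $Lu = 0$. These conditions are commonly summarized in the complementarity form
\[ \min\left(-Lu, \ u - \varphi\right) = 0. \]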

== Stochastic control ==

Stochastic control refers to the general area in which the distributions of some random variables depend on the choice of certain controls, and one looks for an optimal strategy for choosing those controls in order to maximize or minimize the expected value of a given random variable.

The random variable to optimize is computed in terms of some stochastic process. It is usually the value of some given function evaluated at the end point of the stochastic process.

=== Standard stochastic control: the [[Bellman equation]] ===

Consider a family of stochastic processes $X_t^\alpha$ indexed by a parameter $\alpha \in A$, whose corresponding generator operators are $L^\alpha$. We consider the following dynamic programming setting: the parameter $\alpha$ is a control that can be changed at any moment in time.

We look for the optimal choice of the control that maximizes the expected value of a given function $g$ evaluated the first time the process exits a domain $D$. If we call this maximal expected value $u(x)$, in terms of the initial point $X_0 = x$, the function $u$ solves the [[Bellman equation]]
\[ \sup_{\alpha \in A} L^\alpha u = 0 \qquad \text{in } D,\]
together with the boundary condition $u = g$ on $\partial D$.

This is a fully nonlinear convex equation.
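As a rough illustration of how the equation can be solved numerically, the sketch below replaces $X_t^\alpha$ by a controlled random walk on a one-dimensional grid (an arbitrary toy setup), lets the control $\alpha$ choose the probability of stepping to the right, and solves the discrete analogue of $\sup_{\alpha} L^\alpha u = 0$ in $D$, $u = g$ on the boundary, by value iteration.

<syntaxhighlight lang="python">
import numpy as np

# Discretized domain D = (0, 1); the two endpoints play the role of the boundary.
N = 101
x = np.linspace(0.0, 1.0, N)
g = np.where(x < 0.5, 0.0, 1.0)        # arbitrary exit payoff g

# Each control alpha makes the walk step right with probability p(alpha).
p_alpha = [0.3, 0.5, 0.7]              # arbitrary set of admissible controls

# Value iteration for u(i) = max_alpha [ p u(i+1) + (1 - p) u(i-1) ] in the
# interior, with u = g at the endpoints: the discrete Bellman equation.
u = g.copy()
for _ in range(20000):
    candidates = [p * u[2:] + (1.0 - p) * u[:-2] for p in p_alpha]
    new_interior = np.max(candidates, axis=0)
    if np.max(np.abs(new_interior - u[1:-1])) < 1e-9:
        break
    u[1:-1] = new_interior

print("value at the center of D:", u[N // 2])
</syntaxhighlight>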

When the operators $L^\alpha$ are second order and uniformly elliptic, the solution is $C^{2,\alpha}$. This is the result of the [[Evans-Krylov theorem]]. When the $L^\alpha$ are integro-differential operators given by integral kernels, one can still prove that the solution is classical if the kernels satisfy some uniform assumptions. This is the [[nonlocal Evans-Krylov theorem]].

There are many variants of this problem. If the value of $g$ is collected at a previously specified time $T$ instead of at the exit from $D$, then the value function $u(x,t)$ solves a backward parabolic Bellman equation.
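For this variant the value function is commonly written as the solution of
\[ \partial_t u + \sup_{\alpha \in A} L^\alpha u = 0 \quad \text{for } t < T, \qquad u(x,T) = g(x), \]
so that the data $g$ enters through the terminal condition and the equation is solved backwards in time.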

=== [[Optimal stopping problem]] ===
The optimal stopping problem, described in the first part of this page, can be viewed as a stochastic control problem in which the only control is the choice of the time $\tau$ at which the process is stopped; its value function solves the [[obstacle problem]].

=== Zero sum games: the [[Isaacs equation]] ===
When two players control the process with opposite objectives, one trying to maximize the expected payoff and the other trying to minimize it, the value of the resulting zero sum game solves the [[Isaacs equation]], which is fully nonlinear but, unlike the Bellman equation, in general neither convex nor concave.
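In its usual form, with the two players choosing controls $\alpha \in A$ and $\beta \in B$ and with $L^{\alpha\beta}$ denoting the corresponding generator operators, the equation reads
\[ \inf_{\beta \in B} \, \sup_{\alpha \in A} L^{\alpha\beta} u = 0 \qquad \text{in } D. \]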