# Lecture notes on nonlocal equations

Indication: A star (*) in an exercise indicates that I don't know how to solve it.

By Luis 14:00, 12 May 2012 (CDT)

# Lecture 1

## Definitions: linear equations

The first lecture serves as an overview of the subject and to familiarize ourselves with the type of equations under study.

The aim of the course is to see some regularity results for elliptic equations. Most of these results can be generalized to parabolic equations as well. However, this generalization presents extra difficulties that involve nontrivial ideas.

The prime example of an elliptic equation is the Laplace equation. $\Delta u(x) = 0 \text{ in } \Omega.$

Elliptic equations are those which have properties similar to those of the Laplace equation. This is a vague definition.

Fully nonlinear elliptic equations of second order have the form $F(D^2u, Du, u, x)=0 \text{ in } \Omega,$ for a function $F$ such that $\frac{\partial F}{\partial M_{ij}} > 0 \text{ and } \frac{\partial F}{\partial u} \leq 0.$

These are the minimal monotonicity conditions for which you can expect a comparison principle to hold. The appropriate notion of weak solution, viscosity solutions, is based on this monotonicity.

What is the Laplacian? The most natural (coordinate independent) definition may be $\Delta u(x) = \lim_{r \to 0} \frac c {r^{n+2}} \int_{B_r} (u(x+y)-u(x)) \, dy.$
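The limit above can be tested numerically in one dimension. The sketch below is only an illustration: it assumes $n=1$, where $B_r = (-r,r)$ and a Taylor expansion gives the normalizing constant $c = 3$ (a value worked out here, not stated in the notes), and it uses a plain midpoint rule. The function name `averaged_laplacian_1d` is hypothetical.

```python
import math

def averaged_laplacian_1d(u, x, r, m=2000):
    """Approximate (c / r**(n+2)) * integral over B_r of (u(x+y) - u(x)) dy for n = 1.

    Assumptions: B_r = (-r, r) and c = 3, which follows from
    integral_{-r}^{r} y**2 / 2 dy = r**3 / 3.
    The integral is computed with the midpoint rule on m subintervals.
    """
    h = 2.0 * r / m
    total = 0.0
    for i in range(m):
        y = -r + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += u(x + y) - u(x)
    return 3.0 / r**3 * total * h

x0 = 0.7
for r in (0.4, 0.2, 0.1, 0.05):
    # The second column should approach u''(x0) = -sin(x0) as r decreases.
    print(r, averaged_laplacian_1d(math.sin, x0, r), -math.sin(x0))
```

For a fixed $r > 0$ the same function evaluates the left hand side of the non infinitesimal equation discussed next.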

A simple (although rather uninteresting) example of a nonlocal equation would be the following non-infinitesimal version of the Laplace equation: $\frac c {r^{n+2}} \int_{B_r} (u(x+y)-u(x)) \, dy = 0 \text{ for all } x \in \Omega.$

The equation tells us that the value $u(x)$ equals the average of $u$ in the ball $B_r(x)$. A more general integral equation is a weighted version of the above: $\int_{\R^n} (u(x+y)-u(x)) K(y) dy = 0 \text{ for all } x \in \Omega,$ where $K:\R^n \to \R$ is a non negative kernel.

The equations show that $u(x)$ is a weighted average of the values of $u$ in the neighborhood of $x$. This is true in some sense for all elliptic equations, but it is most apparent for integro-differential ones.

For the Dirichlet problem, the boundary values have to be prescribed in the whole complement of the domain. \begin{align*} \int_{\R^n} (u(x+y)-u(x)) K(y) dy &= 0 \text{ for all } x \in \Omega, \\ u(x) &= g(x) \text{ for all } x \notin \Omega. \end{align*}

This type of equation has a natural motivation from probability, as we will see below.

## Probabilistic derivation

Let us start with an overview of how to derive the Laplace equation from Brownian motion.

Let $B_t^x$ be Brownian motion starting at the point $x$ and $\tau$ be the first time it hits the boundary $\partial \Omega$. If we call $u(x) = \mathbb E[g(B_\tau^x)]$ for some prescribed function $g: \partial \Omega \to \R$, then $u$ will solve the classical Laplace equation \begin{align*} \Delta u(x) &= 0 \text{ in } \Omega,\\ u(x) &= g(x) \text{ on } \partial \Omega. \end{align*}
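A discrete analogue makes the statement concrete. The sketch below replaces Brownian motion by a simple symmetric random walk on the sites $\{0, \dots, N\}$ (this replacement, and the name `expected_exit_value`, are assumptions made for illustration). The expected boundary value then satisfies the one-step averaging rule, which is the discrete counterpart of $\Delta u = 0$, and the code solves it by repeated averaging.

```python
def expected_exit_value(n_sites, g_left, g_right, sweeps=5000):
    """Expected boundary value for a simple symmetric random walk on {0, ..., n_sites}.

    u[i] approximates E[g(X_tau)] for the walk started at site i. It is obtained
    by repeatedly imposing u[i] = (u[i-1] + u[i+1]) / 2 at the interior sites
    (Jacobi iteration), with the boundary values held fixed.
    """
    u = [0.0] * (n_sites + 1)
    u[0], u[n_sites] = g_left, g_right
    for _ in range(sweeps):
        new = u[:]
        for i in range(1, n_sites):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u

u = expected_exit_value(10, 0.0, 1.0)
# In one dimension the discrete harmonic function is the linear interpolation
# of the boundary data, so u[i] should be close to i / 10.
print([round(value, 6) for value in u])
```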

A variation would be to consider diffusions other than Brownian motion. If $X^x_t$ is the stochastic process given by the SDE: $X_0^x = x$ and $dX_t^x = \sigma(X_t^x) \, dB_t$, and we define as before $u(x) = \mathbb E[g(X_\tau^x)]$, then $u$ will solve \begin{align*} a_{ij}(x) \partial_{ij} u(x) &= 0 \text{ in } \Omega,\\ u(x) &= g(x) \text{ on } \partial \Omega, \end{align*} where $a_{ij}(x) = \sigma^*(x) \sigma(x)$ is a non negative definite matrix for each point $x$.

Nonlinear equations arise from stochastic control problems. Say that we can choose the coefficients $a_{ij}(x)$ from a family of possible matrices $\{a_{ij}^\alpha\}$ indexed by a parameter $\alpha \in A$. For every point $x$, we can choose a different $a_{ij}(x)$ and our objective is to make $u(x)$ as large as possible. The maximum possible value of $u(x)$ will satisfy the equation \begin{align*} \sup_{\alpha} a_{ij}^\alpha \partial_{ij} u &= 0 \text{ in } \Omega,\\ u(x) &= g(x) \text{ on } \partial \Omega. \end{align*}

Sketch of the proof. If $v$ is any solution to \begin{align*} a_{ij}(x) \partial_{ij} v(x) &= 0 \text{ in } \Omega,\\ v(x) &= g(x) \text{ on } \partial \Omega. \end{align*} with $a_{ij}(x) \in \{a_{ij}^\alpha : \alpha \in A\}$, then from the equation that $u$ solves, we have $a_{ij}(x) \partial_{ij} u(x) \leq 0 \text{ in } \Omega.$ Therefore $u \geq v$ in $\Omega$ by the comparison principle for linear elliptic PDE. $\Box$

Integro-differential equations are derived from discontinuous stochastic processes: Levy processes with jumps.

Let $X_t^x$ be a pure jump Levy process starting at $x$. Now $\tau$ is the first exit time from $\Omega$. The point $X_\tau$ may be anywhere outside of $\Omega$ since $X_t$ jumps. The jumps take place at random times determined by a Poisson process. The jumps in any direction $y \in A$, for some set $A \subset \R^n$, follow a Poisson process with intensity $\int_A K(y) dy.$ The kernel $K$ thus represents the frequency of jumps in each direction. Processes of this type are well understood and studied in the probability community.

The small jumps may happen more often than large ones. In fact, small jumps may happen infinitely often and we still have a well defined stochastic process. This means that the kernels $K$ may have a singularity at the origin. The exact assumption one has to make is $\int_{\R^n} K(y) (1 \wedge |y|^2) dy < +\infty.$ The generator operator of the Levy process is $Lu(x) = \int_{\R^n} (u(x+y) - u(x) - y \cdot Du(x) \chi_{B_1}(y)) K(y) dy.$

We may assume that $K(y)=K(-y)$ in order to simplify the expression. This assumption is not essential, but it makes the computations more compact. This way we can write \begin{align*} Lu(x) &= PV \int_{\R^n} (u(x+y) - u(x)) K(y) dy, \text{ or } \\ &= \frac 12 \int_{\R^n} (u(x+y) + u(x-y) - 2u(x)) K(y) dy. \end{align*}

An optimal control problem for jump processes leads to the integro-differential Bellman equation $Iu(x) := \sup_{\alpha} \int_{\R^n}(u(x+y)-u(x)) K^\alpha(y) dy = 0 \text{ in } \Omega.$

Another possibility is to consider a problem with two parameters, which are controlled by two competitive players. This is the integro-differential Isaacs equation. $Iu(x) := \inf_\beta \ \sup_{\alpha} \int_{\R^n}(u(x+y)-u(x)) K^{\alpha\beta}(y) dy = 0 \text{ in } \Omega.$

Other contexts in which integral equations arise are the following:

## Uniform ellipticity

Regularity results require stronger monotonicity assumptions. For fully nonlinear elliptic equations of second order $F(D^2u)=0$, uniform ellipticity means that there exist two constants $\Lambda \geq \lambda > 0$ such that $\lambda I \leq \frac{\partial F}{\partial M_{ij}}(M) \leq \Lambda I.$

Big Theorems:

• Krylov-Safonov (1979): Solutions to fully nonlinear uniformly elliptic equations are $C^{1,\alpha}$ for some $\alpha>0$.
• Evans-Krylov (1982): Solutions to convex fully nonlinear uniformly elliptic equations are $C^{2,\alpha}$ for some $\alpha>0$.

At the end of this course, we should be able to understand the proof of these two theorems and their generalizations to nonlocal equations.

We first need to understand what ellipticity means in an integro-differential equation. The prime example will be the fractional Laplacian. For $s \in (0,2)$, define $-(-\Delta)^{s/2} u(x) = \int_{\R^n} (u(x+y)-u(x)) \frac{c_{n,s}}{|y|^{n+s}} dy.$

This is an integro-differential operator with a kernel which is radially symmetric, homogeneous, and singular at the origin.

A natural ellipticity condition for linear integro-differential operators would be to impose that the kernel is comparable to that of the fractional Laplacian. The condition could be $c_{s,n} \frac \lambda {|y|^{n+s}} \leq K(y) \leq c_{s,n} \frac \Lambda {|y|^{n+s}}, \text{ plus } K(y)=K(-y).$ But other conditions are possible.

Uniform ellipticity is linked to extremal operators. The classical Pucci maximal operators are the extremes of all uniformly elliptic operators which vanish at zero. \begin{align*} M^+(D^2 u) &= \sup_{\lambda I \leq \{a_{ij}\} \leq \Lambda I} a_{ij} \partial_{ij} u(x) = \Lambda tr(D^2u)^+ - \lambda tr(D^2u)^-,\\ M^-(D^2 u) &= \inf_{\lambda I \leq \{a_{ij}\} \leq \Lambda I} a_{ij} \partial_{ij} u(x) = \lambda tr(D^2u)^+ - \Lambda tr(D^2u)^-. \end{align*} Here $(D^2u)^+$ and $(D^2u)^-$ denote the positive and negative parts of the matrix $D^2u$. A fully nonlinear equation $F(D^2u)=0$ is uniformly elliptic if and only if for any two symmetric matrices $X$ and $Y$, $M^-(X-Y) \leq F(X) - F(Y) \leq M^+(X-Y).$ This definition is originally from [1].
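In terms of eigenvalues, $M^+$ weights the positive eigenvalues of the matrix by $\Lambda$ and the negative ones by $\lambda$, and $M^-$ does the opposite. The following minimal sketch evaluates both operators for a symmetric $2 \times 2$ matrix and compares them with a few linear operators $a_{ij} X_{ij}$ whose coefficients satisfy $\lambda I \leq \{a_{ij}\} \leq \Lambda I$. The function names and the sample matrices are hypothetical, chosen only for illustration.

```python
import math

def pucci_2x2(a, b, c, lam, Lam):
    """Pucci extremal operators of the symmetric matrix X = [[a, b], [b, c]].

    Returns (M_minus, M_plus), computed from the eigenvalues e_i of X:
    M_plus  = Lam * sum(e_i^+) - lam * sum(e_i^-),
    M_minus = lam * sum(e_i^+) - Lam * sum(e_i^-).
    """
    mean = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    eigs = (mean - rad, mean + rad)
    pos = sum(max(e, 0.0) for e in eigs)    # sum of positive parts
    neg = sum(max(-e, 0.0) for e in eigs)   # sum of negative parts
    return lam * pos - Lam * neg, Lam * pos - lam * neg

def linear_operator(a, b, c, a11, a12, a22):
    """a_ij X_ij for the symmetric coefficient matrix [[a11, a12], [a12, a22]]."""
    return a11 * a + 2.0 * a12 * b + a22 * c

lam, Lam = 1.0, 3.0
X = (1.0, 0.0, -2.0)                         # X = diag(1, -2)
m_minus, m_plus = pucci_2x2(*X, lam, Lam)
# Coefficient matrices with eigenvalues in [lam, Lam]; the first two are
# the choices that attain the supremum and the infimum for this X.
for coeffs in [(3.0, 0.0, 1.0), (1.0, 0.0, 3.0), (2.0, 0.5, 2.0)]:
    value = linear_operator(*X, *coeffs)
    print(m_minus, "<=", value, "<=", m_plus)
```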

Given any family of kernels $\mathcal L$, we define \begin{align*} M_{\mathcal L}^+ u(x) &= \sup_{K \in \mathcal L} \int (u(x+y)-u(x)) K(y) dy, \\ M_{\mathcal L}^- u(x) &= \inf_{K \in \mathcal L} \int (u(x+y)-u(x)) K(y) dy. \end{align*} Thus, for a nonlocal operator $I$ (which is a black box that maps $C^2$ functions into continuous functions), we can say it is uniformly elliptic if for any two $C^2$ functions $u$ and $v$, $M_{\mathcal L}^- (u-v)(x) \leq Iu(x) - Iv(x) \leq M_{\mathcal L}^+ (u-v)(x).$

The first choice of $\mathcal L$ would be the one described above $\mathcal L = \left\{ K : c_{s,n} \frac \lambda {|y|^{n+s}} \leq K(y) \leq c_{s,n} \frac \Lambda {|y|^{n+s}}, \text{ plus } K(y)=K(-y) \right\}.$

In this case, the maximal operators take a particularly simple form

\begin{align*} M_{\mathcal L}^+ u(x) &= \frac{c_{n,s}}2 \int_{\R^n} \frac{\Lambda (u(x+y)+u(x-y)-2u(x))^+ - \lambda (u(x+y)+u(x-y)-2u(x))^-}{|y|^{n+s}} dy, \\ M_{\mathcal L}^- u(x) &= \frac{c_{n,s}}2 \int_{\R^n} \frac{\lambda (u(x+y)+u(x-y)-2u(x))^+ - \Lambda (u(x+y)+u(x-y)-2u(x))^-}{|y|^{n+s}} dy. \end{align*}
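The explicit form can be checked pointwise in $y$. The computation below is a sketch of that check; the shorthand $\delta(x,y)$ is introduced here only for illustration and is not notation from the notes.

```latex
% Shorthand (assumption of this sketch): second difference of u at x.
\delta(x,y) := u(x+y) + u(x-y) - 2u(x), \qquad \delta(x,y) = \delta(x,-y).

% For a symmetric kernel K in \mathcal L,
\int_{\R^n} (u(x+y)-u(x)) K(y) \, dy = \frac 12 \int_{\R^n} \delta(x,y) K(y) \, dy.

% The two-sided bound on K gives, for every y,
\delta(x,y) K(y) \leq c_{n,s} \, \frac{\Lambda \, \delta(x,y)^+ - \lambda \, \delta(x,y)^-}{|y|^{n+s}},

% with equality for the kernel
K(y) = c_{n,s} \, \frac{\Lambda \chi_{\{\delta(x,\cdot) > 0\}}(y) + \lambda \chi_{\{\delta(x,\cdot) \leq 0\}}(y)}{|y|^{n+s}},
% which is symmetric because \delta(x,\cdot) is even, so it belongs to \mathcal L.
```

Integrating the pointwise bound gives the formula for $M^+_{\mathcal L}$; exchanging the roles of $\lambda$ and $\Lambda$ gives the one for $M^-_{\mathcal L}$.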

For other choices of $\mathcal L$, the operators $M^+_{\mathcal L}$ and $M^-_{\mathcal L}$ may not have an explicit expression.

Exercise 1. Let $I : C^2(\R^n) \to C(\R^n)$ be a nonlinear operator which satisfies $M^-(D^2(u-v)) \leq Iu - Iv \leq M^+(D^2(u-v)),$ for any two functions $u$ and $v$, where $M^+$ and $M^-$ are the classical Pucci operators. Prove that $I$ is a fully nonlinear uniformly elliptic operator of the form $Iu(x) = F(D^2u(x))$ (in particular you have to show that $I$ is local).

Exercise 2 (*). Let $I : C^2(\R^n) \to C(\R^n)$ be a nonlinear operator, uniformly elliptic with respect to $\mathcal L$ in the sense that for any two functions $u$ and $v$, $M_{\mathcal L}^-(u-v) \leq Iu - Iv \leq M_{\mathcal L}^+(u-v).$ Is it true that there always exists a family of kernels $K^{\alpha \beta} \in \mathcal L$ and constants $c^{\alpha \beta}$ such that $Iu(x) = \inf_{\alpha} \ \sup_{\beta} \ c^{\alpha \beta} + \int_{\R^n} (u(x+y)-u(x)) K^{\alpha \beta}(y) dy \ ?$

# Lecture 2

## Viscosity solutions

Definition. We say that $Iu \leq 0$ in $\Omega$ in the viscosity sense if, whenever a function $\varphi : \R^n \to \R$ and a point $x \in \Omega$ satisfy

1. $\varphi$ is $C^2$ in a neighborhood of $x$,
2. $\varphi(x) = u(x)$,
3. $\varphi(y) \leq u(y)$ everywhere in $\R^n$,

then $I\varphi(x) \leq 0$.

The point of the definition is to translate the difficulty of evaluating the operator $I$ into a smooth test function $\varphi$. In this way, the function $u$ is only required to be continuous (lower semicontinuous for the inequality $Iu \leq 0$). The function $\varphi$ is a test function touching $u$ from below at $x$.

The inequality $Iu \geq 0$ is defined analogously using test functions touching $u$ from above. A viscosity solution is a function $u$ for which both $Iu \leq 0$ and $Iu \geq 0$ hold in $\Omega$.

Viscosity solutions have the following basic properties:

• Stability under uniform limits.

For second order equations this means that if $F_n(D^2 u_n) = 0$ in $\Omega$ and we have both $F_n \to F$ and $u_n \to u$ locally uniformly, then $F(D^2 u)=0$ also holds in the viscosity sense.

• Uniqueness by the comparison principle.

This is available under several sets of assumptions. Some are rather difficult to prove, like the case of second order equations with variable coefficients.

• Existence by Perron's method.

The method can be applied to find the viscosity solution of the Dirichlet problem every time the comparison principle holds and some barrier construction can be used to assure the boundary condition.

Let us analyze the case of integral equations. Whenever a test function $\varphi$ touching $u$ from above at $x$ exists, there is a vector $b$ ($=\nabla \varphi(x)$) and a constant $c$ (any $c > |D^2 \varphi(x)|$ works) such that $u(x+y) \leq u(x) + b \cdot y + c|y|^2$ for $y$ in a neighborhood of the origin. Therefore, the positive part of the integral $\int_{\R^n} (u(x+y) + u(x-y) - 2u(x))^+ K(y) dy$ has an $L^1$ integrand. The negative part can a priori integrate to $-\infty$. In any case, we can assign a value to the integral in $[-\infty,\infty)$, and also to any expression of the form $Iu(x) = \inf_\beta \ \sup_{\alpha} \int_{\R^n}(u(x+y)-u(x)) K^{\alpha\beta}(y) dy.$ Thus, the value of $Iu(x)$ can be evaluated classically. In the case that $I$ is uniformly elliptic one can even show that the negative part of the integral is also finite. This small observation makes it more comfortable to deal with viscosity solutions of integro-differential equations than in the classical PDE case, since the equation is evaluated directly on the solution $u$ at all points $x$ where there is a test function $\varphi$ touching $u$ either from above or below.

## An open problem

Uniqueness with variable coefficients

Exercise 3 (*). Prove that the comparison principle holds for equations of the form $\inf_\alpha \ \sup_\beta \int_{\R^n} (u(x+y)-u(x)) K^{\alpha \beta}(x,y) dy = 0,$ under appropriate ellipticity and continuity conditions on the kernel $K$.

The closest result available, due to Cyril Imbert and Guy Barles [2], is for equations of the form $\inf_{\alpha} \ \sup_\beta \int_{\R^n} (u(x+j(x,y))-u(x)) K^{\alpha \beta}(y) dy = 0.$ Here $j$ is assumed to be essentially Lipschitz continuous with respect to $x$, among other nondegeneracy conditions for $j$ and $K$.

## Second order equations as limits of integro-differential equations

We can recover second order elliptic operators as limits of integral ones. Consider $\lim_{s \to 2} \int_{\R^n} (u(x+y)-u(x)) \frac{(2-s)a(y/|y|)}{|y|^{n+s}} dy.$

For $u \in C^3$, we write the expansion $u(x+y) = u(x) + Du(x) \cdot y + \frac 12 y^t \ D^2u(x)\ y + O(|y|^3).$

Let us split the integral above in the domains $B_R$ and $\R^n \setminus B_R$ for some small $R>0$.

For the first part, the first order term integrates to zero (in the principal value sense) under the symmetry assumption $K(y)=K(-y)$, that is $a(\theta) = a(-\theta)$, and we have \begin{align*} \int_{B_R} (u(x+y)-u(x)) \frac{(2-s)a(y/|y|)}{|y|^{n+s}} dy &= \int_{B_R} \left(\frac 12 y^t \ D^2u(x) \ y + O(|y|^3)\right) \frac{(2-s)a(y/|y|)}{|y|^{n+s}} dy \\ &= \frac 12 \int_0^R (2-s) \frac{r^2}{r^{n+s}} r^{n-1} \int_{\partial B_1} (\theta^t \ D^2u(x) \ \theta) \, a(\theta) \, d\theta \, dr + (2-s) O(R^{3-s}) \\ &= \frac{R^{2-s}}2 \int_{\partial B_1} (\theta^t \ D^2u(x) \ \theta) \, a(\theta) \, d\theta + (2-s) O(R^{3-s}). \end{align*}

For the second part, if $u$ and $a$ are bounded, the integral over $\R^n \setminus B_R$ is at most $C (2-s) R^{-s} \|u\|_{L^\infty}$, which goes to zero as $s \to 2$. Therefore, when we take $s\to 2$, we obtain $\lim_{s \to 2} \int_{\R^n} (u(x+y)-u(x)) \frac{(2-s)a(y/|y|)}{|y|^{n+s}} dy = \frac 12 \int_{\partial B_1} \theta^t \ D^2u(x) \ \theta \ a(\theta) d\theta,$ which is a linear operator in $D^2u$, hence it equals $a_{ij} \partial_{ij}u$ for some matrix $a_{ij}$.
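The convergence can be observed numerically in a one dimensional example. The sketch below makes several assumptions for the sake of a stable computation: $n=1$, $a \equiv 1$ and $u = \cos$, for which the second difference has the closed form $u(x+y)+u(x-y)-2u(x) = -4\cos(x)\sin^2(y/2)$. With the coefficient $\frac 12$ of the second order Taylor term taken into account, the integral over $B_R$ should approach $u''(x) = -\cos x$ as $s \to 2$, while the integral over the complement is controlled by a factor $(2-s)$. The function names are hypothetical.

```python
import math

def second_difference_over_y2(x, y):
    """(u(x+y) + u(x-y) - 2u(x)) / y**2 for u = cos, written without cancellation.

    For u = cos the second difference equals -4 cos(x) sin(y/2)**2.
    """
    half = 0.5 * y
    # Use the Taylor expansion of sin(h)/h near zero (y may underflow to 0.0).
    sinc = 1.0 - half * half / 6.0 if half < 1e-4 else math.sin(half) / half
    return -math.cos(x) * sinc * sinc

def ball_part(x, s, R=1.0, m=20000):
    """(2-s) * integral_0^R (u(x+y)+u(x-y)-2u(x)) * y**(-1-s) dy for u = cos.

    The substitution t = y**(2-s) removes the singular weight and leaves
    the integral over (0, R**(2-s)) of g(t**(1/(2-s))) dt, where g is the
    function above. The midpoint rule is used in the variable t.
    """
    T = R ** (2.0 - s)
    h = T / m
    total = 0.0
    for i in range(m):
        t = (i + 0.5) * h
        total += second_difference_over_y2(x, t ** (1.0 / (2.0 - s)))
    return total * h

def tail_bound(s, R=1.0):
    """Bound for (2-s) * integral_R^inf |second difference| * y**(-1-s) dy when |u| <= 1."""
    return (2.0 - s) * 4.0 * R ** (-s) / s

x0 = 0.3
for s in (1.0, 1.5, 1.9, 1.99):
    # The ball part should approach -cos(x0) and the tail bound should shrink.
    print(s, ball_part(x0, s), -math.cos(x0), tail_bound(s))
```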

## Smooth approximations of viscosity solutions to fully nonlinear elliptic equations

One of the common difficulties one encounters when dealing with viscosity solutions is that it is difficult to make density type arguments. More precisely, a viscosity solution cannot be approximated by a classical $C^2$ solution in any standard way. We can do it however, if we use nonlocal equations [3].

Consider the equation \begin{align*} 0 = F(D^2u) &= \inf_\alpha \ \sup_\beta a^{\alpha \beta}_{ij} \partial_{ij} u\\ &= \frac \lambda 2 \Delta u + \inf_\alpha \ \sup_\beta b^{\alpha \beta}_{ij} \partial_{ij} u. \end{align*}

We approximate each linear operator $b^{\alpha \beta}_{ij} \partial_{ij} u$ by an integro-differential one $b^{\alpha \beta}_{ij} \partial_{ij} u = \lim_{r\to 0} \int_{\R^n} (u(x+y)-u(x)) K_r^{\alpha \beta}(y) dy,$ where $K_r^{\alpha \beta}(y) = \frac 1 {r^{n+2}} K^{\alpha \beta} \left( \frac y r \right),$ and each $K^{\alpha \beta}$ is smooth and compactly supported. Then, we approximate the equation with $\frac \lambda 2 \Delta u_r + \inf_\alpha \ \sup_\beta \int_{\R^n} (u_r(x+y)-u_r(x)) K_r^{\alpha \beta}(y) dy = 0.$ For each $r>0$, the solution $u_r$ will be $C^{2,1}$ (very smooth), and $u_r \to u$ as $r \to 0$, where $u$ is the solution to $F(D^2 u)=0$.

Regularity results, such as Harnack or $C^{1,\alpha}$, can be proved uniformly in $r$ bypassing the technical difficulties of viscosity solutions if we are willing to deal with integral equations.

## Regularity of nonlinear equations: how to start

In order to show that the solution to a fully nonlinear equation $F(D^2 u)=0$ is $C^{1,\alpha}$ for some $\alpha>0$, we differentiate the equation and study the equation that the derivative satisfies. Formally, if we differentiate in an arbitrary direction $e$, $\frac{\partial F}{\partial M_{ij}} (D^2u) \partial_{ij} (\partial_e u) = 0.$

If we call $a_{ij}(x) = \frac{\partial F}{\partial M_{ij}} (D^2u(x))$, we do not know much about these coefficients a priori (they are technically not well defined), but we know that for all $x$ $\lambda I \leq a_{ij}(x) \leq \Lambda I,$ because of the uniform ellipticity assumption on $F$.

What we need is to prove that a solution to an equation of the form $a_{ij}(x) \partial_{ij} v = 0$ is Holder continuous, with an estimate which depends on the ellipticity constants of $a_{ij}$ but is independent of any other property of $a_{ij}$ (no smoothness assumption can be made). This is the fundamental result by Krylov and Safonov.

### Differentiating the equation

When we try to make the argument above rigorous, we encounter some technical difficulties. The first obvious one is that $\partial_e u$ may not be a well defined function. We must take incremental quotients. $v(x) = \frac{u(x+h)-u(x)}{|h|}.$ The coefficients of the equation may not be well defined either, but what can be shown is that $M^+(D^2 v) \geq 0 \text{ and } M^-(D^2 v) \leq 0,$ for the classical Pucci operators of order 2.

For fully nonlinear integro-differential equations, one gets the same thing with the appropriate extremal operators corresponding to the uniform ellipticity assumption. If $Iu=0$ in $\Omega$ and $v$ is defined as above, then $M_{\mathcal L}^+(v) \geq 0 \text{ and } M_{\mathcal L}^-(v) \leq 0,$ wherever $x \in \Omega$ and $x+h \in \Omega$.

The challenge is then to find a Holder estimate based on these two inequalities. The result says that if $v$ satisfies in the viscosity sense both inequalities $M_{\mathcal L}^+(v) \geq 0$ and $M_{\mathcal L}^-(v) \leq 0$ in (say) $B_1$, then $v$ is $C^\alpha(B_{1/2})$ with the estimate $\|v\|_{C^\alpha(B_{1/2})} \leq C \|v\|_{L^\infty(\R^n)}.$

The fact that the $L^\infty$ norm is taken in the full space $\R^n$ is an unavoidable consequence of the fact that the equation is non local. This feature does make the proof of $C^{1,\alpha}$ regularity more involved and it even forces us to add extra assumptions.

It is good to keep in mind that for smooth functions $v$, the two inequalities above are equivalent to the existence of some kernel $K(x,y)$ such that $\int_{\R^n} (v(x+y)-v(x)) K(x,y) dy = 0,$ and that $K(x,\cdot) \in \mathcal L$ for all $x$. But no assumption can be made about the regularity of $K$ with respect to $x$.

### Holder estimates

The proof of the Holder estimates is relatively simple if we do not care about how the constants $C$ and $\alpha$ depend on $s$. If we want a robust estimate that passes to the limit as $s \to 2$, the proof will be much harder. We will start with the simple case.

This simple case was originally proved in [4]. The harder case with uniform constants is in [1].

Theorem. Let $\mathcal L$ be the usual class of kernels $\mathcal L = \left\{ K : c_{s,n} \frac \lambda {|y|^{n+s}} \leq K(y) \leq c_{s,n} \frac \Lambda {|y|^{n+s}}, \text{ plus } K(y)=K(-y) \right\}.$

Let $u$ be a continuous function, bounded in $\R^n$ such that \begin{align*} M^+_{\mathcal L} u &\geq 0 \text{ in } B_1, \\ M^-_{\mathcal L} u &\leq 0 \text{ in } B_1. \end{align*} Where both of the inequalities above are understood in the viscosity sense.

Then, there are constants $C$ and $\alpha>0$ (depending only on $\lambda$, $\Lambda$, $n$ and $s$) such that $|u(x) - u(0)| \leq C |x|^\alpha \|u\|_{L^\infty(\R^n)}.$

There is nothing special about the point $0$. Thus, the estimate can be made uniformly in any set of points compactly contained in $B_1$.

Proof. The factor $\|u\|_{L^\infty(\R^n)}$ can be assumed to be $1$ thanks to the simple normalization $u/\|u\|_{L^\infty}$. So, we assume that $\|u\|_{L^\infty}=1$ and will prove that there is a constant $\theta>0$ such that $osc_{B_{2^{-k}}} u \leq (1-\theta)^k.$ The result then follows taking $\alpha = \log(1-\theta)/\log(1/2)$ and $C = (1-\theta)^{-1}$.
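The relation between $\theta$, $\alpha$ and $C$ in the last sentence can be verified directly: with this $\alpha$ we have $(1-\theta)^k = (2^{-k})^\alpha$, so for $2^{-(k+1)} < |x| \leq 2^{-k}$ the oscillation bound $(1-\theta)^k$ is at most $C|x|^\alpha$. The sketch below checks this inequality for a few radii; the function names and the sample value $\theta = 0.1$ are hypothetical.

```python
import math

def holder_constants(theta):
    """Exponent alpha and constant C obtained from osc_{B_{2^-k}} u <= (1-theta)^k."""
    alpha = math.log(1.0 - theta) / math.log(0.5)
    C = 1.0 / (1.0 - theta)
    return alpha, C

def dyadic_bound(r, theta):
    """The bound (1-theta)^k on the oscillation in B_r, where 2^-(k+1) < r <= 2^-k."""
    k = math.floor(-math.log2(r))
    return (1.0 - theta) ** k

theta = 0.1
alpha, C = holder_constants(theta)
for r in (0.9, 0.3, 0.26, 0.01, 1e-6):
    # The first number should never exceed the second one.
    print(r, dyadic_bound(r, theta), C * r ** alpha)
```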

We will prove the above estimate for dyadic balls inductively. It is certainly true for $k \leq 0$ since $\|u\|_{L^\infty} = 1$. Now we assume it holds up to some value of $k$ and want to prove it for $k+1$.

In order to prove the inductive step, we rescale the function so that $B_{2^{-k}}$ corresponds to $B_1$. Let $v(x) = (1-\theta)^{-k} u(2^{-k} x) - a_k .$ The function $v$ is scaled, and the constant $a_k$ is chosen, so that $-1/2 \leq v \leq 1/2$ in $B_1$.

The scale invariance of $M^+_{\mathcal L}$ and $M^-_{\mathcal L}$ plays a crucial role here in that $v$ satisfies the same extremal equations as the original function $u$.

From the inductive hypothesis, $osc_{B_{2^{-j}}} u \leq (1-\theta)^j$ for all $j \leq k$, so we have that $osc_{B_{2^{j}}} v \leq (1-\theta)^{-j}$ for all $j \geq 0$.

There are two obvious ways in which the oscillation of $v$ in the ball of radius $1/2$ can be smaller than its oscillation in $B_1$: either the supremum of $v$ is smaller in $B_{1/2}$ or the infimum is larger. We prove one or the other depending on which of the sets $\{v < 0\} \cap B_1$ or $\{v > 0\} \cap B_1$ has larger measure. Let us assume the former. The other case follows by exchanging $v$ with $-v$. We want to prove now that $v \leq (1/2-\theta)$ in $B_{1/2}$.

Note that since we know that $osc_{B_{2^{j}}} v \leq (1-\theta)^{-j}$ for all $j \geq 0$, then $v(x) \leq (2|x|)^\alpha-1/2 \text{ for } x \notin B_1.$

The point is to choose $\theta$ and $\alpha$ appropriately so that the following three points

• $v(x) \leq (2|x|)^\alpha-1/2 \ \text{ for all } x \notin B_1$.
• $|\{v < 0\} \cap B_1| > 1/2 |B_1|$.
• $M^+_{\mathcal L} v \geq 0$ in $B_1$

imply that $v \leq (1/2-\theta)$ in $B_{1/2}$.

If that holds for any choice of $\alpha$ and $\theta$, it also holds for smaller values. Thus, a posteriori, we can make one of them smaller so that $\alpha = \log(1-\theta)/\log(1/2)$.

Let $\rho$ be a smooth radial function supported in $B_{3/4}$ such that $\rho \equiv 1$ in $B_{1/2}$.

If $v > (1/2-\theta)$ at any point in $B_{1/2}$, then $(v+\theta \rho)$ would have a local maximum at a point $x_0 \in B_{3/4}$ for which $(v+\theta \rho)(x_0) > 1/2$. $\max_{B_1} (v+\theta \rho) = (v+\theta \rho)(x_0) > 1/2.$ In order to obtain a contradiction, we evaluate $M^+ (v+\theta \rho)(x_0)$.

On one hand \begin{align*} M^+ (v+\theta \rho)(x_0) &\geq M^+ v(x_0) + \theta M^- \rho(x_0) \\ &\geq \theta \ \min_{B_{3/4}} \ M^- \rho(x). \end{align*}

On the other hand, the other estimate is more delicate. Let $w = (v+\theta \rho)$. \begin{align*} M^+ (v+\theta \rho)(x_0) &= \int_{\R^n} \frac{\Lambda (w(x_0+y)+w(x_0-y)-2w(x_0))^+ - \lambda (w(x_0+y)+w(x_0-y)-2w(x_0))^-}{|y|^{n+s}} dy \\ &\leq 2\int_{\R^n} \frac{\Lambda (w(x_0+y)-w(x_0))^+ - \lambda (w(x_0+y)-w(x_0))^-}{|y|^{n+s}} dy \\ &\leq 2\int_{x_0+y \notin B_1} (\dots) + 2\int_{x_0+y \in B_1} (\dots) \end{align*}

The first integral can be bounded using that $v(x) \leq (2|x|)^\alpha-1/2$ for all $x \notin B_1$. In fact, it is arbitrarily small if $\alpha$ is chosen close to $0$. $\int_{x_0+y \notin B_1} (\dots) \leq \int_{x_0+y \notin B_1} ((2|x_0+y|)^\alpha-1) \frac{\Lambda}{|y|^{n+s}} dy \ll 1.$

The second integral has a non positive integrand just because $(v+\theta \rho)$ takes its maximum in $B_1$ at $x_0$. But we can say more using the set $G = \{v < 0\} \cap B_1$. In $G$ we have $w \leq \theta$ while $w(x_0) > 1/2$, so $(w(x_0+y)-w(x_0))^- \geq 1/2 - \theta$ whenever $x_0+y \in G$. \begin{align*} \int_{x_0+y \in B_1} (\dots) &= \int_{x_0+y \in G} (\dots) + \int_{x_0+y \in B_1 \setminus G} (\dots) \\ &\leq \int_{x_0+y \in G} (\dots) = \int_{x_0+y \in G} - \lambda \frac{(w(x_0+y)-w(x_0))^-}{|y|^{n+s}} dy \\ &\leq \int_{x_0+y \in G} - \lambda \frac{1/2-\theta}{|y|^{n+s}} dy \leq -C. \end{align*} In the last inequality we use that $|y|^{-n-s}$ is bounded below, $\theta$ is chosen less than $1/2$, and $|G|>|B_1|/2$.

So, for $\theta$ and $\alpha$ small enough the sum of the two terms will be negative and less than $\theta \min M^- \rho$, arriving at a contradiction. This finishes the proof. $\Box$

Inspecting the proof above we see that the argument is much more general than presented. The only assumptions used on $\mathcal L$ are that:

1. The extremal operators are scale invariant.
2. For the smooth bump function $\rho$, $M^- \rho$ is bounded.
3. $M^+ w(x_0)$ can be bounded above by a negative constant at a point $x_0 \in B_{3/4}$ which achieves the maximum of $w$ in $B_1$ provided that
• $w(x) \leq w(x_0) + (2|x|)^\alpha-1$ for $x \notin B_1$.
• $|\{w(x) \leq w(x_0)-1\} \cap B_1| \geq |B_1|/2$.

There are very general families of non local operators which satisfy those conditions above.

Exercise 4. Verify that the proof above also holds for equations of the form $\int_{\R^n} (u(x+y) - u(x)) K(x,y) dy = 0 \text{ in } B_1,$ for which we assume that for every $x \in B_1$, $\frac{\lambda}{|y|^{n+s}} \leq K(x,y) \leq \frac{\Lambda}{|y|^{n+s}},$ where $s \in (0,1)$ but we do not assume that $K$ is symmetric in $y$.

Exercise 5. Verify that the proof above also holds for equations of the form $\int_{\R^n} (u(x+y) - u(x) - y \cdot Du(x)) K(x,y) dy = 0 \text{ in } B_1,$ for which we assume that for every $x \in B_1$, $\frac{\lambda}{|y|^{n+s}} \leq K(x,y) \leq \frac{\Lambda}{|y|^{n+s}},$ where $s \in (1,2)$ but we do not assume that $K$ is symmetric in $y$.

# Lecture 3

## $C^{1,\alpha}$ estimates for nonlinear nonlocal equations

Let $u$ be a bounded function in $\R^n$ which solves $Iu = 0$ in $B_1$ in the viscosity sense, where $I$ is a nonlocal operator uniformly elliptic with respect to a class $\mathcal L$. Let us also assume that $I$ is translation invariant, meaning that if $u$ solves $Iu = 0$ in $\Omega$, then $u(\cdot-x)$ also solves $Iu = 0$ in $x+\Omega$.

We want to obtain a $C^{1,\alpha}$ estimate of the following form: $\|u\|_{C^{1,\alpha}(B_{1/2})} \leq C \|u\|_{L^\infty(\R^n)}.$

The strategy of the proof is the following. Let us assume that $I0=0$ (the value of $I$ applied to the zero function is zero). From the ellipticity assumption $M^-_{\mathcal L} u \leq Iu - I0 \leq M^+_{\mathcal L} u.$ Thus, the two inequalities hold \begin{align*} M^-_{\mathcal L} u &\leq 0 \text{ in } B_1, \\ M^+_{\mathcal L} u &\geq 0 \text{ in } B_1. \end{align*} So, from the Holder estimates, $u \in C^\alpha$ in the interior of $B_1$.

Now, for any small vector $h \in \R^n$, we define the incremental quotient $v(x) = \frac{u(x+h)-u(x)}{|h|^\alpha}.$ This function $v$ is bounded independently of $h$ in any set compactly contained in $B_1$ (say $B_{1-\varepsilon}$). From this we would like to apply the Holder estimates to obtain that $v \in C^\alpha$ in the interior of $B_1$ independently of $h$. The problem is that the right hand side in the Holder estimate depends on the $L^\infty$ norm of $v$ in the full space $\R^n$ and not only $B_{1-\varepsilon}$.

One way to overcome this difficulty is to impose stronger assumptions on the family of kernels $\mathcal L$. Let us define the following more restrictive family, where we impose a bound on the derivatives of the kernels: $\mathcal L_1 = \left\{ K : c_{s,n} \frac \lambda {|y|^{n+s}} \leq K(y) \leq c_{s,n} \frac \Lambda {|y|^{n+s}}, \text{ and } |\nabla K(y)| \leq \frac C{|y|^{n+s+1}}, \text{ plus } K(y)=K(-y) \right\}.$

Now, we can "integrate by parts" the contribution of the tails of the integrals in $M^+_{\mathcal L_1} v$ and $M^-_{\mathcal L_1} v$. If we split the domain of the integrals of each kernel in $\mathcal L_1$ \begin{align*} \int_{\R^n} (v(x+y)-v(x)) K(y) dy &= \int_{B_r} (v(x+y)-v(x)) K(y) dy + \int_{\R^n \setminus B_r} (v(x+y)-v(x)) K(y) dy, \\ &= \int_{B_r} (v(x+y)-v(x)) K(y) dy + \int_{\R^n \setminus B_r} (u(x+y)-u(x)) (K(y)-K(y+h)) dy \end{align*}

The second term is bounded (depending on $r$) thanks to the bound on $DK$ away from zero, and the first term is what we really need to work out the $C^\alpha$ norm of $v$ in terms of the $L^\infty$ norm of $v$ in $B_{1-2 \varepsilon}$.

From the equation above we get that $v \in C^\alpha$ independently of $h$ applying a small variation of the Holder estimates explained above. That implies that $u \in C^{2\alpha}$. Iterating the procedure we get $u \in C^{3\alpha}$, $u \in C^{4\alpha}$, $\dots$, up to $u$ Lipschitz. Then one more iteration gives $u\in C^{1,\alpha}$ but no more gains in regularity are possible with this method because the $C^{1,\alpha}$ estimate of $u$ is not equivalent to any uniform bound of an incremental quotient of $u$.

Exercise 6 (*). Is the extra assumption on the boundedness of the derivatives of the kernels really necessary to obtain $C^{1,\alpha}$ estimates? This condition is unnecessary if the equation holds in the full space. But in fact this condition is necessary if $s<1$ even for linear equations. The result is not clear (and in fact open) for $s>1$.

## Holder estimates in the parabolic case

We will now work out the parabolic version of the Holder estimates that we obtained in the previous lecture. This will show some of the extra difficulties that one faces when dealing with parabolic equations.

The result that we prove is the following.

Theorem. Let $\mathcal L$ be the usual class of kernels $\mathcal L = \left\{ K : c_{s,n} \frac \lambda {|y|^{n+s}} \leq K(y) \leq c_{s,n} \frac \Lambda {|y|^{n+s}}, \text{ plus } K(y)=K(-y) \right\}.$

Let $u$ be a continuous function, bounded in $\R^n \times [-1,0]$ such that \begin{align*} u_t - M^+_{\mathcal L} u &\leq 0 \text{ in } B_1 \times (-1,0], \\ u_t - M^-_{\mathcal L} u &\geq 0 \text{ in } B_1 \times (-1,0]. \end{align*}

Then $u \in C^\alpha(B_{1/2} \times [-1/2,0])$ and $\|u\|_{C^\alpha(B_{1/2} \times [-1/2,0])} \leq C \|u\|_{L^\infty},$ for constants $\alpha$ and $C$ that depend on $\lambda$, $\Lambda$, $s$ and $n$.

As in the elliptic case, the proof is much harder if we want to make sure that $C$ and $\alpha$ have a finite positive limit as $s \to 2$. We will do the simple case now, in which we do not care about how $C$ and $\alpha$ depend on $s$.

This result first appeared in [5] and [6] for equations with integral diffusion and drift.

Proof. Let us normalize the function $u$ such that $osc_{\R^n \times [-1,0]} u = 1$. We will show that there is a Holder modulus of continuity at the origin, i.e. $|u(x,-t) - u(0,0)| \leq C(|x|^\alpha+t^{\alpha/s}).$

It is convenient to keep in mind the natural scaling of the equation. The function $u_r(x,t) = u(rx,r^st)$ satisfies the same two inequalities \begin{align*} \partial_t u_r - M^+_{\mathcal L} u_r &\leq 0 \text{ in } B_{1/r} \times (-1/r^s,0], \\ \partial_t u_r - M^-_{\mathcal L} u_r &\geq 0 \text{ in } B_{1/r} \times (-1/r^s,0]. \end{align*} Thus, $|x|^\alpha$ has the same scaling as $t^{\alpha/s}$.
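The scaling claim follows from a change of variables. The computation below is a sketch of that step; the rescaled kernel $\tilde K$ is notation introduced here for illustration only.

```latex
% Time derivative of u_r(x,t) = u(rx, r^s t):
\partial_t u_r(x,t) = r^s \, (\partial_t u)(rx, r^s t).

% For K in \mathcal L, substitute z = ry:
\int_{\R^n} (u_r(x+y,t) - u_r(x,t)) K(y) \, dy
  = r^s \int_{\R^n} (u(rx+z, r^s t) - u(rx, r^s t)) \, \tilde K(z) \, dz,
\qquad \tilde K(z) := r^{-n-s} K(z/r).

% The rescaled kernel satisfies the same bounds, so \tilde K \in \mathcal L:
c_{s,n} \frac{\lambda}{|z|^{n+s}} \leq \tilde K(z) \leq c_{s,n} \frac{\Lambda}{|z|^{n+s}},
\qquad \tilde K(z) = \tilde K(-z).

% Taking the supremum (or infimum) over K in \mathcal L:
\partial_t u_r - M^\pm_{\mathcal L} u_r
  = r^s \left( \partial_t u - M^\pm_{\mathcal L} u \right)(rx, r^s t).
```

Since the factor $r^s$ is positive, the signs of both inequalities are preserved, and the point $(rx, r^s t)$ lies in $B_1 \times (-1,0]$ exactly when $(x,t) \in B_{1/r} \times (-1/r^s,0]$.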

Let us define the parabolic cylinders $Q_r$ with the right scaling as $Q_r := B_r \times [-r^s,0].$

What we will prove is the inequality $$\tag{1} osc_{Q_{2^{-k}}} u \leq (1-\theta)^k.$$ From this, the Holder continuity follows as in the elliptic case.

From the assumption that $osc_{\R^n \times [-1,0]} u = 1$, we know that (1) holds for all $k \leq 0$. That gives us the base for the induction. Now we assume it is true up to some value of $k$ and want to show it also holds for $k+1$.

We start by rescaling the function so as to map $Q_{2^{-k}}$ to $Q_1$. Let $v(x,t) = (1-\theta)^{-k} u(2^{-k}x, 2^{-ks}t) - a_k$, where $a_k$ is chosen so that $-1/2 \leq v \leq 1/2$ in $Q_1$.

From the inductive hypothesis, $osc_{B_{2^j} \times [-1,0]} v \leq (1-\theta)^{-j}$ for all $j \geq 0$.

In order to show that $osc_{Q_{1/2}} v \leq (1-\theta)$ we must show either that $v \leq 1/2-\theta$ in $Q_{1/2}$ or that $v \geq -1/2+\theta$ in $Q_{1/2}$. Which of the two alternatives we manage to prove depends on which of the two sets $\{v \leq 0\} \cap (B_1 \times [-1,-1/2^s])$ or $\{v \geq 0\} \cap (B_1 \times [-1,-1/2^s])$ has larger measure. Let us assume it is the first, otherwise the same proof upside down would work with the opposite inequalities.

The function $v$ satisfies the following three conditions

• $v(x,t) \leq (2|x|)^\alpha - 1/2$ for all $x \notin B_1$ and $t \in [-1,0]$.
• $|\{v \leq 0 \} \cap (B_1 \times [-1,-1/2^s])| \geq \frac 12 |B_1 \times [-1,-1/2^s]|$.
• $\partial_t v - M^+_{\mathcal L} v \leq 0$ in $Q_1$

We need to show that for small enough $\theta>0$ and $\alpha>0$, these three conditions imply that $v \leq 1/2-\theta$ in $Q_{1/2}$.

Let $\rho$ be a smooth radial function supported in $B_{3/4}$ such that $\rho \equiv 1$ in $B_{1/2}$. We will show that the function $v$ stays below the function $b(x,t) = 1/2 + \epsilon + \delta (t+1) - m(t) \rho(x)$ in $B_1 \times [-1,0]$ where $m$ is the solution to the ODE: \begin{align*} m(-1) &= 0, \\ m'(t) &= c_0 | \{x \in B_1: v(x,t) \leq 0\}| - C_1 m(t). \end{align*} for constants $c_0$ and $C_1$ to be chosen later.

We show that the inequality holds by proving that it can never be invalidated for the first time. Indeed, assume there was a point $(x_0,t_0)$ where equality holds. This point must be in the support of $\rho$ (strict inequality holds in the rest since $v \leq 1/2$), thus $x_0 \in B_{3/4}$.

We have the simple inequality $v_t(x_0,t_0) \geq b_t(x_0,t_0) = -m'(t_0) \rho(x_0) + \delta.$

Let $G(t) = \{x \in B_1: v(x,t) \leq 0\}$. We know, by the assumption above, that $\int_{-1}^{-1/2^s} |G(t)| dt \geq c$ for some constant $c>0$.

Recall that $M^+_{\mathcal L} v(x_0,t_0)$ is the supremum of all integro-differential operators with kernels $K \in \mathcal L$. $M^+_{\mathcal L} v(x_0,t_0) = \sup_{K \in \mathcal L} \int_{\R^n} (v(x_0+y,t_0)-v(x_0,t_0))K(y) dy.$ We will find a negative upper bound for all these integro-differential operators. We divide the domain of integration between $x_0+y \in B_1$ and $x_0+y \notin B_1$. For the latter, we have $$\tag{2} \int_{x_0+y \notin B_1} (v(x_0+y,t_0)-v(x_0,t_0))K(y) dy \leq \int_{x_0+y \notin B_1} ((2|x_0+y|)^\alpha-1) \frac \Lambda {|y|^{n+s}} dy \leq C(\alpha)$$ and this right hand side can be made arbitrarily small by picking $\alpha$ small.

The rest of the integral is $$\tag{3} \begin{aligned} \int_{x_0+y \in B_1} (v(x_0+y,t_0)-v(x_0,t_0))K(y) dy &\leq \int_{x_0+y \in B_1} (b(x_0+y,t_0) - b(x_0,t_0)) K(y) dy + \int_{x_0+y \in G(t_0)} (0-b(x_0+y,t_0)) K(y) dy \\ &\leq -C m(t_0) M^+_{\mathcal L} \rho(x_0,t_0) - c_0 |G(t_0)|. \end{aligned}$$

Plugging (2) and (3) into the equation, we obtain $v_t(x_0,t_0) - M^+_{\mathcal L} v(x_0,t_0) \geq -m'(t_0) \rho(x_0) + \delta - C(\alpha) + C m(t_0) M^+_{\mathcal L} \rho(x_0,t_0) + c_0 |G(t_0)|.$ Recall that $m'(t) = c_0 |G(t)| - C_1 m(t)$ by definition (this is when $c_0$ is chosen). Since $\rho \leq 1$, we have $v_t(x_0,t_0) - M^+_{\mathcal L} v(x_0,t_0) \geq \delta - C(\alpha) + C m(t_0) M^+_{\mathcal L} \rho(x_0,t_0) + C_1 m(t_0) \rho(x_0).$ We choose $\alpha$ small so that $C(\alpha) < \delta$, so we have $v_t(x_0,t_0) - M^+_{\mathcal L} v(x_0,t_0) \geq C m(t_0) M^+_{\mathcal L} \rho(x_0,t_0) + C_1 m(t_0) \rho(x_0).$ Now we have to choose $C_1$ appropriately to make this right hand side positive and contradict the equation for $v$.
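The step that uses $\rho \leq 1$ is a one line computation. We spell it out here as a sketch, assuming only the ODE for $m$ and that $c_0 |G(t_0)| \geq 0$.

```latex
% Substitute m'(t_0) = c_0 |G(t_0)| - C_1 m(t_0) and use \rho(x_0) \leq 1.
\begin{align*}
-m'(t_0)\rho(x_0) + c_0 |G(t_0)|
&= -\left(c_0 |G(t_0)| - C_1 m(t_0)\right)\rho(x_0) + c_0 |G(t_0)| \\
&= c_0 |G(t_0)| \, (1-\rho(x_0)) + C_1 m(t_0) \rho(x_0) \\
&\geq C_1 m(t_0) \rho(x_0).
\end{align*}
```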

This is clearly possible if we know a lower bound for $\rho(x_0)$. However, we must also consider that $x_0$ may be a point where $\rho$ is very small. It turns out that $M^+_{\mathcal L} \rho > 0$ where $\rho$ is small since trivially $M^+_{\mathcal L} \rho(x) > 0$ if $\rho(x)=0$ (from the formula for $M^+_{\mathcal L}$) and $M^+_{\mathcal L} \rho$ is a continuous function. Thus, where $\rho$ is small, the right hand side is automatically positive. We choose $C_1$ large so that this right hand side is also positive where $\rho$ is large. This gives us a contradiction with the equation and proves that then $v$ must stay below the function $b$.

To finish the proof, all we need is to show that $b \leq 1/2-\theta$ in $Q_{1/2}$. We analyze the ODE that defines $m(t)$ and we realize that $m(t)$ is going to be bounded below for $t \in [-1/2^s,0]$ in terms of the measure of the set $\{v \leq 0\} \cap (B_1 \times [-1,-1/2^s])$. In fact, an explicit formula can be given for $m$. Let $\theta$ be half of this lower bound for $m(t)$ and let us choose $\delta < \theta/4$ and $\epsilon < \theta/4$. We will then have $b(x,t) = 1/2 + \epsilon + \delta(t+1) - m(t) \rho(x) \leq 1/2-\theta$ in $Q_{1/2}$, which finishes the proof. $\Box$
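The lower bound for $m$ can be made explicit by solving the linear ODE. The following is a sketch using variation of constants, with $G(\tau)$ denoting the set where the rescaled function is non positive at time $\tau$, as in the proof above.

```latex
% Variation of constants for m' = c_0 |G(t)| - C_1 m with m(-1) = 0.
% For t \in [-1/2^s, 0] and \tau \in [-1, -1/2^s] we have 0 \leq t - \tau \leq 1.
\begin{align*}
m(t) &= c_0 \int_{-1}^{t} |G(\tau)| \, e^{-C_1 (t-\tau)} \, d\tau \\
&\geq c_0 \, e^{-C_1} \int_{-1}^{-1/2^s} |G(\tau)| \, d\tau
\qquad \text{for } t \in [-1/2^s, 0].
\end{align*}
```

The last integral is exactly the measure of the set that was assumed to fill at least half of $B_1 \times [-1,-1/2^s]$.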

Exercise 7. Adapt the proof of the previous result to equations with drift and diffusion. Let $u$ be a continuous function, bounded in $\R^n \times [-1,0]$ such that for some $B>0$ and $s \geq 1$, \begin{align*} u_t - M^+_{\mathcal L} u - B|\nabla u| &\leq 0 \text{ in } B_1 \times (-1,0], \\ u_t - M^-_{\mathcal L} u + B|\nabla u| &\geq 0 \text{ in } B_1 \times (-1,0]. \end{align*} Then $u \in C^\alpha(Q_{1/2})$ with $\|u\|_{C^\alpha(Q_{1/2})} \leq C \|u\|_{L^\infty(\R^n \times [-1,0])}.$

The two inequalities above are implied by an equation of the form $u_t + b \cdot \nabla u - \int_{\R^n} (u(x+y)-u(x)) K(x,y) dy = 0.$ where $\|b\|_{L^\infty} \leq B$ and $K(x,\cdot)$ belongs to the class $\mathcal L$ for all $x$.

Exercise 8. Let $I$ be uniformly elliptic with respect to the usual class $\mathcal L$ (without any condition on the derivatives of the kernel) and translation invariant. Prove that bounded solutions to the equation (in the full space) $u_t - Iu = 0 \text{ in } \R^n \times (0,\infty),$ become immediately $C^{1,\alpha}$ in space and time for positive time.

# Lecture 4

The proofs of the Holder estimates we have done so far relied very heavily on the nonlocal character of the equation. They took advantage of the integral quantities in the most crucial step of the proof. It is no surprise that the constants in the estimate degenerate as the order of the equation converges to 2.

In order to obtain Holder estimates for integro-differential equations that pass to the local limit as $s \to 2$, we must generalize the Holder estimates for uniformly elliptic equations in non divergence form by Krylov and Safonov.

It is best to first understand how to prove the result in the classical case of second order, so that then we can discuss how to generalize it to integro-differential equations.

## How to prove Holder estimates for 2nd order equations

Let us first state the result.

Theorem. Let $u$ be a viscosity solution of \begin{align*} M^+(D^2 u) &\geq 0 \text{ in } B_1,\\ M^-(D^2 u) &\leq 0 \text{ in } B_1, \end{align*} where $M^+$ and $M^-$ are the classical Pucci operators. Then $u$ is $C^\alpha$ in the interior of $B_1$ and $\|u\|_{C^\alpha(B_{1/2})} \leq C \|u\|_{L^\infty(B_1)}.$

Unlike the previous case of nonlocal equations, the $L^\infty$ norm in the right hand side of the estimate is taken only over the domain of the equation $B_1$, and not over the whole space.

The two inequalities above, in the case that $u$ is a classical $C^2$ function, are equivalent to the existence of measurable coefficients $a_{ij}(x)$, uniformly elliptic, such that $a_{ij}(x) \partial_{ij} u = 0 \ \text{ in } B_1.$

The proof, as in the other cases, is done showing iteratively that the oscillation in dyadic balls has exponential decay: $osc_{B_{2^{-k}}} u \leq (1-\theta)^k.$

The crucial step in the proof (and also the only step which is proved differently from the previous proofs) is the following lemma, sometimes called the growth lemma.

Lemma. (growth lemma) There exist $\theta>0$ and $r>0$ so that if $u:B_1 \to \R$ satisfies

• $u \leq 1$ in $B_1$.
• $M^+(D^2 u) \geq 0$ in $B_1$ ($u$ is a subsolution)
• $|\{u \leq 0\} \cap B_r| \geq |B_r|/2$.

Then $u \leq 1-\theta$ in $B_{1/2}$.

### The ABP estimate

The main difficulty is to be able to extract information about measure of level sets from the equation. The proof for nonlocal equations is easier because a set of positive measure has an apparent effect on the integro-differential operator at every point. Here, we need to obtain some integral estimate, and the way to do it is more subtle. The key is the following classical result by Alexandroff.

Theorem. (The ABP estimate) Let $u : \overline{B_1} \to \R$ be such that \begin{align*} u &\geq 0 \text{ on } \partial B_1, \\ M^-(D^2 u) &\leq f \text{ in } B_1. \end{align*} Then the following estimate holds: $\max_{B_1} (-u) \leq C \left( \int_{\{\Gamma = u\}} f(x)^n dx \right)^{1/n}.$

Here $\Gamma$ is the convex envelope of $u$. It is crucial for the proof of Holder continuity that the integration only takes place on the contact set between $u$ and its convex envelope $\Gamma$.

Sketch of the proof. Let us analyze the image of the gradient map $x \mapsto D\Gamma(x)$. If $a \in \R^n$ is a vector so that $|a| < \max_{B_1} (-u)/2$, then one can see that if we slide a plane with slope $a$ from below, it will touch the graph of $u$ in the interior of $B_1$ before it touches it on $\partial B_1$. This is because if we make $ax+b$ coincide with $u$ wherever the $\max (-u)$ is achieved, this plane stays negative on $\partial B_1$. And the actual value of $b$ that makes $ax+b$ tangent to $u$ has to be even smaller. Therefore, the ball of vectors $a$ with radius $\max(-u)/2$ is contained in the image of the gradient map. Moreover, all these gradients are achieved in the set $\{u = \Gamma\}$, since all these supporting planes of $\Gamma$ must also be supporting planes of $u$. Hence $$\tag{4} B_{\max(-u)/2} \subset D \Gamma(\{u = \Gamma\}) \Rightarrow c \ \max(-u)^n \leq |D \Gamma(\{u = \Gamma\})|.$$

The measure of the image of the gradient map can also be computed in an elementary way by integrating its Jacobian. $|D \Gamma(\{u = \Gamma\})| = \int_{\{u = \Gamma\}} \det D^2 \Gamma(x) dx.$

The useful property of the convex envelope $\Gamma$ is that since $D^2 \Gamma \geq 0$, the elliptic equation is a weighted sum of the eigenvalues of $D^2 \Gamma$ which are all non negative, and therefore there is no cancellation. The Jacobian $\det D^2 \Gamma(x)$ is the product of these eigenvalues. By the geometric-arithmetic mean inequality, and because $D^2 \Gamma$ is positive semidefinite, we have that $\det D^2\Gamma(x)^{1/n} \leq \frac{\Delta \Gamma(x)} n = \frac 1 {\lambda n} M^-(D^2 \Gamma).$ Since $\Gamma$ is tangent to $u$ from below at all points in the contact set $\{\Gamma = u\}$, then $0 \leq \lambda^{-1} M^-(D^2 \Gamma(x)) \leq \lambda^{-1} M^-(D^2 u(x))$. Combining all these inequalities we obtain $c \ \max(-u)^n \leq \int_{\{u = \Gamma\}} (\lambda^{-1} M^-(D^2 u(x)))^n dx,$ from which the estimate of the theorem easily follows. $\Box$
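The geometric-arithmetic mean step can be written in terms of eigenvalues. In this sketch, $\mu_1, \dots, \mu_n$ is notation introduced here for the eigenvalues of $D^2 \Gamma(x)$, which are non negative because $\Gamma$ is convex.

```latex
% \mu_1, ..., \mu_n \geq 0 are the eigenvalues of D^2\Gamma(x) (notation introduced here).
% Non negativity is what allows the arithmetic-geometric mean inequality to be applied.
\begin{align*}
\det D^2 \Gamma(x) = \prod_{i=1}^n \mu_i
\leq \left( \frac 1 n \sum_{i=1}^n \mu_i \right)^n
= \left( \frac{\Delta \Gamma(x)} n \right)^n .
\end{align*}
```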

Exercise 9 (*). Find the smallest $p>0$ depending on $\lambda$, $\Lambda$ and $n$ such that the function $u$, under the same assumptions as in the ABP estimate, satisfies $\max_{B_1} (-u) \leq \int_{B_1} |f(x)|^p dx.$ (this is a fairly well known open problem)

### A special bump function

The main estimate to prove the growth lemma will be obtained applying the ABP estimate to the solution $u$ minus an explicit bump function $\varphi$. We construct this bump function now.

Lemma. For any $r>0$, there exists a smooth function $\varphi$ such that

• $\varphi \geq 0$ in $B_1$.
• $\varphi = 0$ on $\partial B_1$.
• $\varphi > 2$ in $B_{1/2}$.
• $M^- \varphi \geq -f$, where $f$ is supported in $B_r$.

The proof of this lemma is a simple explicit computation (thanks to the non divergence character of the equation). One can take $\varphi(x) = |x|^{-p}-1$ in $B_1 \setminus B_r$, for some constant $p$ depending on the ellipticity constants $\lambda$, $\Lambda$ and the dimension, and extend it smoothly inside $B_r$.
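The explicit computation can be sketched as follows. We assume here the usual formula for the Pucci operator in terms of eigenvalues, stated in the comment, and $p > 0$.

```latex
% Assumption: M^-(X) = \lambda \sum_{\mu_i > 0} \mu_i + \Lambda \sum_{\mu_i < 0} \mu_i,
% where \mu_i are the eigenvalues of X.
% For x \neq 0, D^2 (|x|^{-p}) has the eigenvalue p(p+1)|x|^{-p-2} in the radial direction
% and the eigenvalue -p|x|^{-p-2}, with multiplicity n-1, in the tangential directions.
\begin{align*}
M^-\left(D^2 |x|^{-p}\right)
&= \lambda \, p(p+1) |x|^{-p-2} - \Lambda \, (n-1) \, p \, |x|^{-p-2} \\
&= p \, |x|^{-p-2} \left( \lambda (p+1) - \Lambda (n-1) \right) \geq 0
\quad \text{if } p \geq \frac{\Lambda (n-1)}{\lambda} - 1 .
\end{align*}
```

So for $p$ large the function is a subsolution away from the origin, and the right hand side $f$ only appears where the function is modified, that is inside $B_r$.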

### A first estimate in measure

Lemma. There exist a small $\mu > 0$ and a large $M$ such that if $u$ satisfies

• $M^-(D^2 u) \leq 0$ in $B_1$ ($u$ is a supersolution)
• $u \geq 0$ in $B_1$
• $\inf_{B_{1/2}} u \leq 1$

Then $|\{u \geq M\} \cap B_r| \leq (1-\mu) |B_r|,$ where $r>0$ is the one from the special function.

This lemma would imply the growth lemma (applied to $M(1-u)$) if $\mu=1/2$. However, at this point we only get that when $\mu$ is a small quantity.

Sketch of the proof. We apply the ABP estimate to $v = u-\varphi$. We obtain \begin{align*} 1 \leq \max (-v) &\leq C \left( \int_{\{v=\Gamma\}} M^-(D^2v)^n dx \right)^{1/n} \\ &\leq C |\{v=\Gamma\} \cap B_r|^{1/n}. \end{align*} Then we observe that $v(x)=\Gamma(x)$ only at points where $v(x)<0$, in particular at points where $u(x) < \varphi(x)$. We finish the proof setting $M = \max \varphi$. $\Box$
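The reason why only the measure of the contact set inside $B_r$ appears can be sketched as follows. We assume the standard sub-additivity properties of the Pucci operators stated in the comment, and that the function $f$ from the special bump function is non negative and bounded.

```latex
% Assumptions: M^-(X+Y) \leq M^-(X) + M^+(Y) and M^+(-Y) = -M^-(Y) for symmetric matrices X, Y;
% f \geq 0 is bounded and supported in B_r, with M^-(D^2 \varphi) \geq -f.
\begin{align*}
M^-(D^2 v) &\leq M^-(D^2 u) + M^+(-D^2 \varphi) = M^-(D^2 u) - M^-(D^2 \varphi) \leq f, \\
\int_{\{v = \Gamma\}} f(x)^n \, dx &\leq \|f\|_{L^\infty}^n \, |\{v = \Gamma\} \cap B_r| .
\end{align*}
```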

### The $L^\epsilon$ estimate

Scaling the previous estimate in measure says that if the function $u$ is larger than $K$ in a proportion larger than $(1-\mu)$ in any ball $B_\rho(x)$, then it will be larger than $K/M$ everywhere in $B_{\rho/r}$. For $r$ being a small number, this property applied at every scale can be used together with the following covering argument to improve the previous measure estimate.

Lemma. (growing ink spots) Let $A \subset B$ be two sets (contained in a large cube $C_1$) so that every time $|B_\rho(x) \cap A| \geq (1-\mu) |B_\rho(x)|$, we have $B_{\rho/r}(x) \subset B$. Assume also that $|A| \leq (1-\mu) |C_1|$. Then $|A| \leq (1-\mu) |B|$.

The lemma above can be proved for example using the Calderon-Zygmund decomposition of the set $A$.

As a consequence, we combine it with the estimate in measure above, and by an iterative argument conclude that under the same hypothesis $|\{u \geq M^k\} \cap B_r| \leq (1-\mu)^k |B_{1/4}|.$

The previous estimate is telling us that the distribution function of $u$ has certain decay. It is called the $L^\epsilon$ estimate because it means that there is some small $\epsilon > 0$ for which $\int_{B_r} u^\epsilon dx \leq C.$
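To see how the decay of the distribution function gives integrability of a small power, here is a sketch. The constant $C_0$ is notation introduced here for the measure factor in the decay estimate, and $\epsilon$ is assumed small enough that $M^\epsilon (1-\mu) < 1$.

```latex
% Assumptions: u \geq 0, M > 1, and |\{u \geq M^k\} \cap B_r| \leq C_0 (1-\mu)^k for all k \geq 0.
% Choose \epsilon > 0 with M^\epsilon (1-\mu) < 1, i.e. \epsilon < \log(1/(1-\mu)) / \log M.
% On \{u < 1\} we use u^\epsilon \leq 1; on \{M^k \leq u < M^{k+1}\} we use u^\epsilon \leq M^{(k+1)\epsilon}.
\begin{align*}
\int_{B_r} u^\epsilon \, dx
&\leq |B_r| + \sum_{k \geq 0} \int_{\{M^k \leq u < M^{k+1}\} \cap B_r} u^\epsilon \, dx \\
&\leq |B_r| + \sum_{k \geq 0} M^{(k+1)\epsilon} \, C_0 (1-\mu)^k \\
&= |B_r| + \frac{C_0 M^\epsilon}{1 - M^\epsilon (1-\mu)} < \infty .
\end{align*}
```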

The case that concerns us is to find one value of $k$ for which $(1-\mu)^k \leq 1/2$. In that case, we obtain the growth lemma as a corollary, just applying the $L^\epsilon$ estimate to the function $M^k(1-u)$ and obtaining the result for $\theta = 1/M^k$.

## About the proof of uniform Holder estimates for non local equations

If we want to generalize the Krylov-Safonov result to integro-differential equations, we have no option but to adapt the proof above to nonlocal equations. We will also have to deal with the growth of the function outside $B_1$ in each iteration step, but we deal with that in the same way as before.

The first thing we have to analyze is in what steps the equation is actually applied. It turns out that most of the steps are general statements about functions and sets. There are only two crucial steps where the equation enters the proof.

1. The construction of the special function $\varphi$.
2. The estimate $\det D^2 \Gamma(x) \leq C f(x)^n$ in the proof of the ABP estimate.

The construction of a special function $\varphi$ is just one computation. It is not so easy for nonlocal equations, but one can do it with enough patience. The function will also be equal to $|x|^{-p}$ outside of a ball, for some appropriate power $p$ (and some appropriate ball).

The most severe difficulty is how to deal with the second point. A nonlocal equation will never control the second derivatives of $\Gamma$. We cannot get an analog of this step, and therefore need to find some alternative argument to use in place of the ABP inequality.

It is important to observe that the usefulness of the ABP estimate in the proof comes just as a way to estimate the measure of the set where $u-\varphi$ is negative. We need to replace the ABP estimate with a lemma like the following.

Lemma. Let $u \geq 0$ in $\R^n \setminus B_1$ and $M^-_{\mathcal L} u \leq C \chi_{B_r}$ in $B_1$. Assume that $\inf_{B_{1/2}} u = -1$, then $|\{u \leq 0\} \cap B_r| \geq \mu |B_r|,$ for some constant $\mu > 0$.

The contact set of $u$ with its envelope $\Gamma$ still has a useful non cancellation property since at those points $x$, the second order incremental quotients $u(x+y)+u(x-y)-2u(x)$ are non negative. This is true for all $y$ (even large ones) if $\Gamma$ is the convex envelope of $u$ in $B_3$.

As $s$ approaches 2, the equation becomes more and more local. The effect of sets of positive measure in the value of an operator at a point is strong only if this set is relatively close to the point. Thus, evaluating the equation at each point $x$ will give us only a small set around $x$ where $u$ is negative. We need a way to assemble these small sets together. It will turn out that the key is again to use the inequality (4).

Let $x \in \{u = \Gamma\}$. Since all second order differences are non negative, we have \begin{align*} M^-u(x) &= \int_{\R^n} (u(x+y)+u(x-y)-2u(x)) \frac {(2-s)\lambda} {|y|^{n+s}} dy. \end{align*}

The normalization constant $(2-s)$ plays a role if we want to obtain estimates with uniform constants (especially when we deal with right hand sides).

What we need to prove is that there is a small $r>0$ such that

• $|D \Gamma(B_r(x))| \leq C|B_r|$, so that the image of the ball by the gradient map has comparable measure.
• $|\{u \leq 0\} \cap B_{2r}| \geq c |B_r|$, so that $u$ is negative in a portion of the ball.

Putting these balls together (after a classical covering argument), using (4), we will obtain that the measure $|\{u \leq 0\} \cap B_r|$ is bounded below as we want.

Let us consider the dyadic rings $R_k = B_{2r} \setminus B_r$ for $r = r_0 2^{-k}$. What we can show is that $u$ separates from its supporting plane $L$ at $x$ by less than $Cr^2$ in a large proportion (say $(1-\delta)$) of $R_k$. If this were not true, we would obtain a contradiction by evaluating the equation at $x$, as the computation below shows.

\begin{align*} M^-u(x) &= \int_{\R^n} (u(x+y)-u(x)) \frac {(2-s)\lambda} {|y|^{n+s}} dy, \\ &\geq \sum_{k \geq 1} \int_{R_k} (u(x+y)-u(x)) \frac {(2-s)\lambda} {|y|^{n+s}} dy, \\ &\geq \sum_{k \geq 1} \ C \delta r_0^{2-s} (2-s) 2^{-k(2-s)}, \\ &= C \delta r_0^{2-s} \frac {2-s}{1-2^{-2+s}}, \\ &\geq C \end{align*}

Note that the constant above is independent of $s$, and $r_0$ can be chosen arbitrarily small as $s \to 2$.
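The contribution of each ring and the uniformity in $s$ can be sketched as follows. Here $c_n$ and $C'$ are constants introduced for this sketch, and we use that the linear part of $L$ integrates to zero over each ring against a symmetric kernel, so that $u(x+y)-u(x)$ can be replaced by $u(x+y)-L(x+y)$.

```latex
% Assumption (for contradiction): u(x+y) - L(x+y) \geq C r^2 on a subset of R_k of measure
% at least \delta |R_k|, where r = r_0 2^{-k}. Here |R_k| = c_n r^n and |y| \leq 2r in R_k.
% C' = C \lambda c_n 2^{-n-s} is bounded below independently of s \in (0,2).
\begin{align*}
\int_{R_k} (u(x+y) - L(x+y)) \frac{(2-s)\lambda}{|y|^{n+s}} \, dy
&\geq C r^2 \cdot \frac{(2-s) \lambda}{(2r)^{n+s}} \cdot \delta \, c_n r^n
= C' \delta (2-s) \, r_0^{2-s} \, 2^{-k(2-s)}, \\
\sum_{k \geq 1} 2^{-k(2-s)} &= \frac{2^{-(2-s)}}{1 - 2^{-(2-s)}}, \\
\frac{2-s}{1 - 2^{-(2-s)}} &\geq \frac 1 {\log 2} \quad \text{for all } s \in (0,2),
\text{ since } 1 - e^{-z} \leq z \text{ for } z = (2-s) \log 2 .
\end{align*}
```

The factor $(2-s)$ in the kernel is exactly what compensates the divergence of the geometric series as $s \to 2$, and $r_0^{2-s} \to 1$ in that limit for any fixed $r_0$.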

The fact that $|\{u \leq L + Cr^2\} \cap R_k| \geq (1-\delta) |R_k|$ implies that $\Gamma -L \leq Cr^2$ in the full ball $B_{3r/2}$, which implies that $|D \Gamma(B_r(x))| \leq C|B_r|$.

Exercise 10. Prove the following statement, which was implicitly used in the last paragraph.

Let $\Gamma: B_2 \to \R$ be a convex function. Assume that $|\{\Gamma \leq A\} \cap (B_2 \setminus B_1)| \geq (1-\delta) |B_2 \setminus B_1|$ for a constant $\delta$ sufficiently small. Prove that $\Gamma \leq A$ everywhere in $B_1$.

# Lecture 5

The strongest regularity result we presented so far is the $C^{1,\alpha}$ regularity for nonlinear nonlocal equations. This is not enough regularity for the solutions to be considered classical if the order of the equation is above one. Indeed, when $K$ is a symmetric kernel so that $K(y) \leq C|y|^{-n-s}$, a linear integro-differential operator of the form $Lu(x) = \int_{\R^n} (u(x+y)+u(x-y)-2u(x)) K(y) dy ,$ is computable classically if $u \in C^{s+\epsilon}$ for some $\epsilon > 0$. This regularity ensures that the integrals converge near the origin $y=0$ (here $s+\epsilon$ may be larger than 1). If $u$ is only $C^{1+\alpha}$ and $1+\alpha \leq s$, the integrals may not be well defined classically at every point.
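The convergence near the origin can be checked directly. This is a sketch under the assumption that $s+\epsilon \leq 2$; the constant $C_u$ and the radial variable $\tau$ are notation introduced here.

```latex
% Assumption: s + \epsilon \leq 2 and u \in C^{s+\epsilon} near x, so that second order
% differences satisfy |u(x+y)+u(x-y)-2u(x)| \leq C_u |y|^{s+\epsilon} for |y| \leq 1.
\begin{align*}
\int_{B_1} |u(x+y)+u(x-y)-2u(x)| \, K(y) \, dy
&\leq \int_{B_1} C_u |y|^{s+\epsilon} \, C |y|^{-n-s} \, dy \\
&= C_u C \, |\partial B_1| \int_0^1 \tau^{\epsilon - 1} \, d\tau
= \frac{C_u C \, |\partial B_1|}{\epsilon} < \infty .
\end{align*}
```

With $\epsilon = 0$ the last integral diverges logarithmically, which is why regularity of order exactly $s$ is not enough for this argument.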

For second order equations of the form $F(D^2 u)=0$, the solution is classical when $u \in C^2$. In general one cannot prove that viscosity solutions of fully nonlinear uniformly elliptic equations are $C^2$ (and in fact it is not true [7]).

There is a special case in which viscosity solutions of fully nonlinear equations are indeed $C^2$ (and therefore classical). This is when we add the extra hypothesis that $F$ is either convex or concave. We saw in the probabilistic derivation of the equation that from stochastic control problems we naturally derive the Bellman equation either with the sup or the inf. $F(D^2 u) = \sup_k \ a^k_{ij} \partial_{ij} u = 0,$ or $F(D^2 u) = \inf_k \ a^k_{ij} \partial_{ij} u = 0.$

It is a famous result of Evans [8] and Krylov [9] that these equations have $C^{2,\alpha}$ solutions for some $\alpha > 0$.

As we will see, a similar result holds for the integro-differential Bellman equation [10] $\inf_k \int_{\R^n} (u(x+y)-u(x)) K^k(y) dy = 0,$ if all kernels $K^k$ belong to the class $\mathcal L_2$ given by \begin{align*} \mathcal L_2 = \{K : & (2-s) \lambda |y|^{-n-s} \leq K(y) \leq (2-s) \Lambda |y|^{-n-s}, \\ & |D^2K(y)| \leq \Lambda |y|^{-n-s-2}, \\ & K(y) = K(-y) \}. \end{align*} The viscosity solution $u$ will belong to the space $C^{s+\alpha}$ for some $\alpha>0$, which is enough regularity for the solution to be considered classical. Moreover, the estimate involves constants which do not blow up as $s \to 2$.

The restriction on $|D^2 K|$ plays the same role as the restriction on $|DK|$ in the $C^{1,\alpha}$ estimates. In this case we must control $D^2 K$ in case $s+\alpha > 1$.

## The (classical?) second order case: the Evans-Krylov theorem

As we did for the Krylov-Safonov theorem, we will first review the second order case of the result.

Theorem. Let $u$ be a viscosity solution of $F(D^2u) = 0$ in $B_1$, where $F$ is uniformly elliptic and either concave or convex. Then $u \in C^{2,\alpha}(B_{1/2})$ and the estimate $\|u\|_{C^{2,\alpha}(B_{1/2})} \leq C \|u\|_{L^\infty(B_1)}$ holds, where $C$ and $\alpha>0$ depend on $\lambda$, $\Lambda$ and the dimension $n$.

Besides the original papers, a proof of this theorem can be found in the book of Caffarelli and Cabre [11].

The proof of this theorem has two clearly divided parts. These are

1. $\|u\|_{C^{1,1}(B_{1/2})} \leq C \|u\|_{L^\infty(B_1)}$.
2. $\|u\|_{C^{2,\alpha}(B_{1/2})} \leq C \|u\|_{C^{1,1}(B_1)}$.

We will see the proofs of these two estimates separately.

In this particular proof, it is quite difficult to deal with the technical difficulties caused by the viscosity solution framework. We will assume that the solutions are classical and present the a priori estimates, which makes the proof significantly simpler. One way to go around these technical difficulties is approximating the equation with integral equations, as explained in the second lecture.

### The assumptions revisited

This theorem provides a continuity result for the second derivatives of the solution. Let us analyse what the assumptions say in terms of the values of $D^2u(x)$.

We said that $F$ is uniformly elliptic. This means by definition that $M^-(X-Y) \leq F(X)-F(Y) \leq M^+(X-Y)$, where $M^+$ and $M^-$ are the classical Pucci operators. We know that at every point $D^2u(x)$ is a symmetric matrix for which $F(D^2u(x))=0$. Thus \begin{align*} M^-(D^2u(x)-D^2u(y)) &\leq F(D^2u(x)) - F(D^2u(y)) = 0, \\ M^+(D^2u(x)-D^2u(y)) &\geq F(D^2u(x)) - F(D^2u(y)) = 0. \end{align*} One can check that the two relations above mean that the sum of the positive eigenvalues and the sum of the negative eigenvalues of $(D^2u(x)-D^2u(y))$ are comparable. More precisely $$\tag{5} \frac \lambda \Lambda tr(D^2u(x)-D^2u(y))^- \leq tr(D^2u(x)-D^2u(y))^+ \leq \frac \Lambda \lambda tr(D^2u(x)-D^2u(y))^-$$ We could rephrase the above as $\|(D^2u(x)-D^2u(y))^-\| \approx \|(D^2u(x)-D^2u(y))^+\|$, since for positive semidefinite matrices the trace and norm are comparable.
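The check that leads to (5) is short. Here is a sketch, assuming the usual expression of the Pucci operators in terms of the positive and negative parts of a symmetric matrix $X$ (notation introduced here for $D^2u(x)-D^2u(y)$).

```latex
% Assumption: for a symmetric matrix X = X^+ - X^-,
%   M^+(X) = \Lambda \, tr X^+ - \lambda \, tr X^-   and   M^-(X) = \lambda \, tr X^+ - \Lambda \, tr X^-.
% Take X = D^2u(x) - D^2u(y).
\begin{align*}
M^-(X) \leq 0 &\iff \lambda \, tr X^+ \leq \Lambda \, tr X^-
\iff tr X^+ \leq \frac \Lambda \lambda \, tr X^-, \\
M^+(X) \geq 0 &\iff \Lambda \, tr X^+ \geq \lambda \, tr X^-
\iff tr X^+ \geq \frac \lambda \Lambda \, tr X^- .
\end{align*}
```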

The equation (5) is a way to understand what it means for a function to be a solution to some uniformly elliptic equation (in fact it is equivalent to the existence of some uniformly elliptic $F$ for which $F(D^2u)=0$ [7]).

The concavity and translation invariance of $F$ make the second order incremental quotients subsolutions of an equation. More precisely, since $F$ is concave we have that $F\left(\frac{D^2u(x+h) + D^2u(x-h)} 2 \right) \geq \frac{F(D^2u(x+h)) + F(D^2u(x-h))} 2 = 0.$ Therefore $M^+ \left(\frac{D^2u(x+h) + D^2u(x-h) - 2D^2u(x)} 2 \right) \geq F\left(\frac{D^2u(x+h) + D^2u(x-h)} 2 \right) - F(D^2u(x)) \geq 0.$

Because of the homogeneity of $M^+$, $M^+(D^2v)\geq 0$ where $v(x)$ is a second order incremental quotient $v(x) = \frac{u(x+h)+u(x-h)-2u(x)}{|h|^2}.$ Passing to the limit (if $u$ is regular enough) we see that for a second derivative in any direction $e$ we have $M^+(D^2 \partial_{ee} u) \geq 0.$

The computation above can be repeated to obtain $$\tag{6} M^+(D^2 (a_{ij} \partial_{ij} u)) \geq 0,$$ for any positive semidefinite matrix $a_{ij}$. This is because $a_{ij} \partial_{ij}u(x)$ can be approximated with second order incremental quotients which are a weighted average of values of $u$ in points near $x$ minus the value of $u$ at $x$.

The two formulas (5) and (6) are the basis of the proof of the Evans-Krylov theorem. This is the same in all available proofs.

### The half Harnack inequalities

In order to prove the Evans-Krylov theorem we need the most precise version of Harnack inequality. We only explained the proof of the $C^\alpha$ estimate in the previous lecture. The $L^\epsilon$ estimate was explained as part of the proof. With a little extra work, one can also derive the second one of these two results below.

Theorem. (the $L^\epsilon$ estimate) Let $u: B_1 \to \R$ be a non negative supersolution: $M^-(D^2 u) \leq 0$ in $B_1$. Then for some $\epsilon>0$, $\|u\|_{L^\epsilon(B_r)} \leq C \inf_{B_{1/2}} u.$

Theorem. Let $u: B_1 \to \R$ be a subsolution: $M^+(D^2 u) \geq 0$ in $B_1$. Then for any $p>0$, $\sup_{B_{1/2}} u \leq C \|u^+\|_{L^p(B_1)}.$

Intuitively, the first estimate says that if a supersolution is large in most points in a ball centered at $x$, then it will be large at $x$. The second one says that if a subsolution is small in most points of $B_r(x)$ (and bounded), then $u(x)$ will be small.

### The a priori estimate in $C^{1,1}$

The idea of the a priori estimate in $C^{1,1}$ is simple to say. From (6), we know that the second derivatives are subsolutions to an equation. This will imply that they are all bounded above. Then, from the equation $F(D^2u(x))=0$ and the uniform ellipticity of $F$, we immediately conclude that they are bounded below as well.

Step 1. $\Delta u \in L^1$.

The equation we are studying is $\inf_k a_{ij}^k \partial_{ij} u + c^k = 0 \text{ in } B_1.$ Let us assume for simplicity that all $c^k=0$. In particular, each of the second order operators $a^k_{ij} \partial_{ij}u$ is non negative in $B_1$. Let us pick one of them $a_{ij}^0\partial_{ij}u$, which (after a change of coordinates if necessary) we can assume it is the Laplacian (case $a^0_{ij} = I$).

Just from the fact that $\Delta u \geq 0$ and $u \in L^\infty(B_1)$, we easily obtain an estimate for $\Delta u$ in $L^1(B_{1/2})$ in the following way. Let $b: B_1 \to \R$ be a non negative smooth function compactly supported inside $B_1$ such that $b \equiv 1$ in $B_{1/2}$, then $\|\Delta u\|_{L^1(B_{1/2})} \leq \int_{B_1} b(x) \Delta u(x) dx = \int_{B_1} \Delta b(x) u(x) dx \leq C \|u\|_{L^\infty(B_1)}.$
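The chain of inequalities above uses only the sign of $\Delta u$ and two integrations by parts. Spelled out as a sketch:

```latex
% Uses \Delta u \geq 0, b \geq 0, b \equiv 1 in B_{1/2}, and that b vanishes near \partial B_1
% (so no boundary terms appear when integrating by parts twice).
\begin{align*}
\|\Delta u\|_{L^1(B_{1/2})} = \int_{B_{1/2}} \Delta u \, dx
&\leq \int_{B_1} b \, \Delta u \, dx
= \int_{B_1} u \, \Delta b \, dx \\
&\leq \|\Delta b\|_{L^1(B_1)} \, \|u\|_{L^\infty(B_1)} .
\end{align*}
```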

Step 2. $u \in W^{2,2}$.

From (6), $M^+(D^2 \Delta u) \geq 0$, $\Delta u \geq 0$ in $B_1$, and $\Delta u \in L^1_{loc} (B_1)$. We conclude from the half Harnack inequality that $\Delta u$ is bounded in $B_{1/2}$.

Using Calderon-Zygmund theory, then $D^2 u(x) \in L^p(B_{1/2})$ for any $p < \infty$. For the remainder of the proof we will only use the simplest estimate $D^2 u(x) \in L^2(B_{1/2})$ which follows from Fourier analysis.

Step 3. $u \in C^{1,1}$.

Since we got that every second derivative $\partial_{ee} u$ is in $L^2_{loc} (B_1)$, then using the half Harnack inequality, they are all bounded above.

We now use the equation again. From the uniform ellipticity we have that $M^-(D^2u(x)) \leq F(D^2u(x)) - F(0) \leq M^+(D^2 u(x)),$ which says that if the positive part $\|D^2u(x)^+\|$ is bounded, then also the negative part $\|D^2u(x)^-\|$ must be bounded.
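The bound on the negative part can be made explicit. This is a sketch, assuming the usual expression of $M^+$ in terms of the positive and negative parts of the Hessian.

```latex
% Assumption: M^+(X) = \Lambda \, tr X^+ - \lambda \, tr X^-, and F(D^2u(x)) = 0.
\begin{align*}
-F(0) = F(D^2u(x)) - F(0) \leq M^+(D^2u(x))
&= \Lambda \, tr (D^2u(x))^+ - \lambda \, tr (D^2u(x))^- \\
\Longrightarrow \quad tr (D^2u(x))^-
&\leq \frac 1 \lambda \left( \Lambda \, tr (D^2u(x))^+ + F(0) \right).
\end{align*}
```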

This concludes the proof of the $C^{1,1}$ estimate.

### The a priori estimate in $C^{2,\alpha}$

The second part of the proof of Evans-Krylov theorem consists in proving the a priori estimate $\|u\|_{C^{2,\alpha}(B_{1/2})} \leq C \|u\|_{C^{1,1}}.$

This second step is more involved than the first. All the proofs available are based on the relations (5) and (6), but there are different ways to organize the arguments. If we see for example the book of Caffarelli and Cabre [11], the proof involves a quite sophisticated understanding on how the set of values of $D^2u(x)$ decreases as we restrict the domain into smaller balls around the origin. The values of second derivatives in all directions play a role in the proof. This will be naturally difficult to extend to the nonlocal case. For integro-differential equations, there is no natural analog of a pure second order derivative $\partial_{ee} u$. Our study of nonlocal equations actually forced us to develop a new approach to prove this a priori estimate. The proof that is presented in this section is actually the proof for integro-differential equations adapted (or better said stripped down) to second order equations (originally from [12]). It is one of the very few cases in which the ideas flow in the opposite direction. The proof is still based on the same principles (equations (5) and (6)) but it is organized somewhat differently.

Proof. We want to set up an improvement of oscillation iteration but this time for the Hessian $D^2u$ instead of the solution $u$. More precisely, we want to show that for some $C>0$ and $\theta > 0$, $\sup_{B_{2^{-k}}} \|D^2u(x) - D^2u(0)\| \leq C \|u\|_{C^{1,1}} (1-\theta)^k.$ There is nothing special about the origin, so if we prove the above inequality, we prove the interior estimate by translating the result. We define $P$ and $N$ to be the sum of positive and negative eigenvalues of $D^2u(x) - D^2u(0)$. \begin{align*} P(x) &= tr (D^2u(x) - D^2u(0))^+, \\ N(x) &= tr (D^2u(x) - D^2u(0))^-. \end{align*} We are using the convention that $A = A^+ - A^-$, so both $P$ and $N$ are non negative quantities. Because of (5), the quantities $P(x)$, $N(x)$ and $\|D^2u(x) - D^2u(0)\|$ are all comparable (meaning that the ratio between any two is bounded below and above): $P(x) \approx N(x) \approx \|D^2 u(x) - D^2u(0)\|.$

Since all these quantities are comparable, we only have to prove that $\sup_{B_{2^{-k}}} P(x) \leq C \|u\|_{C^{1,1}} (1-\theta)^k.$ We will proceed with an iterative improvement of oscillation, as in the proofs of Holder continuity. This time, instead of the oscillation, we are decreasing the maximum of $P$ in $B_{2^{-k}}$ in each step.

By the standard scaling of the equation, we reduce the problem to unit scale. We must show that if $P \leq 1$ in $B_1$, then $P \leq (1-\theta)$ in $B_{1/2}$ for some $\theta > 0$.

The following simple observation is useful. One way to characterize $P$ is $P(x) = \max_{\{a_{ij}\} \text{ orthogonal proj. matrix}} a_{ij} (\partial_{ij}u(x)-\partial_{ij}u(0)).$

Let $x_0$ be the point in $\overline{B_{1/2}}$ such that $P(x_0) = \max_{\overline{B_{1/2}}} P$. Let $a_{ij}$ be the orthogonal projection matrix to the eigenspace of $D^2u(x_0) - D^2u(0)$ corresponding to positive eigenvalues. Thus, we have $P(x_0) = a_{ij} (\partial_{ij} u(x_0) - \partial_{ij}u(0)).$ Let $v(x) = a_{ij} (\partial_{ij} u(x) - \partial_{ij}u(0))$. We have $v(x_0) = P(x_0)$ and $v(x) \leq P(x) \leq 1$ in $B_1$.

By (6), we know that $M^+(D^2 v) \geq 0$ in $B_1$. So, if $v \leq (1-C\theta)$ in a large proportion of $B_1$, we would immediately obtain that $v \leq (1-\theta)$ everywhere in $B_{1/2}$ using the half Harnack inequality. This means that $P \leq (1-\theta)$ in $B_{1/2}$ and would finish the proof of the iterative step.

We have to resolve the bad case now: if $v \geq (1-C\theta)$ for most of the points of $B_1$ (in measure). In fact we will prove that this bad case never takes place.

Assume then, that $P(x) \geq v(x) \geq (1-C\theta)$ for most points in $B_1$. At these points $P-v$ is smaller than $C\theta$, so the orthogonal projection $a_{ij}$ is almost the optimal for $P$. The idea is that in this bad case the eigenvectors of $D^2u(x) - D^2u(0)$ are almost aligned in most of $B_1$, which makes the equation almost linear. Indeed, if $\{b_{ij}\} := I - \{a_{ij}\}$, and $w(x) := b_{ij} (\partial_{ij} u(x) - \partial_{ij} u(0))$, we have \begin{align*} \Delta u(x) - \Delta u(0) &= w(x) + v(x) = P(x) - N(x) \text{ for all } x \in B_1, \end{align*} which implies that \begin{align*} w(x) &= P(x) - v(x) - N(x) \leq C \theta - N(x) \\ &\leq C\theta - c_0 P(x) && \text{ since $P$ and $N$ are comparable, } \\ &\leq (1+c_0) C \theta - c_0 \leq -c_0/2 && \text{ for $\theta$ small enough. } \end{align*}

But again from (6), $M^+(D^2w) \geq 0$ in $B_1$. From the half Harnack inequality, if $w$ is less than $-c_0/2$ in most of $B_1$, then it has to be negative in $B_{1/2}$. But this is impossible since $w(0) = b_{ij} (\partial_{ij} u(0) - \partial_{ij} u(0)) = 0.$ This contradiction tells us that the bad case never happens, and therefore we can always do the iterative step and finish the proof. $\Box$

## The integro-differential Bellman equation

The integro-differential Bellman equation was motivated in the first lecture from a problem of stochastic control. It consists in the equation $\inf_k \int_{\R^n} (u(x+y)-u(x)) K^k(y) dy = 0 \ \text{ for all } x \in B_1.$ Since it is a very natural equation from stochastic control, it is interesting that for this particular equation we can prove that the viscosity solutions are actually classical.

We can interpret this equation as a generic concave uniformly elliptic equation.

The following is the theorem from [10].

Theorem. Let $u$ be a viscosity solution of $\inf_k \int_{\R^n} (u(x+y)-u(x)) K^k(y) dy = 0 \ \text{ in } B_1.$ Assume that for every $k$, the kernel $K^k$ belongs to the class \begin{align*} \mathcal L_2 = \{K : & (2-s) \lambda |y|^{-n-s} \leq K(y) \leq (2-s) \Lambda |y|^{-n-s}, \\ & |D^2K(y)| \leq \Lambda |y|^{-n-s-2}, \\ & K(y) = K(-y) \}. \end{align*} Then $u \in C^{s+\alpha}$ for some $\alpha>0$. In particular the solutions are classical in the sense that all the integrals are computable and Hölder continuous. Moreover, there is an estimate $\|u\|_{C^{s+\alpha}(B_{1/2})} \leq C \|u\|_{L^\infty(B_1)},$ where the constants $C$ and $\alpha$ depend on $\lambda$, $\Lambda$, $n$, and $s$, but they do not blow up as $s \to 2$.

Note that the theorem of Evans and Krylov is obtained from the previous one by passing to the limit as $s \to 2$.

The details of the proof of this theorem would be too long to explain in these lectures comprehensibly. We'll just sketch the outline of the proof.

### Reinterpreting the assumptions

In this version of the theorem the equation does not depend on second order derivatives, nor are we going to manipulate second derivatives in the proof. We need an appropriate reinterpretation of the assumptions.

Let us start with the uniform ellipticity. We know that the difference of two solutions satisfies two inequalities for the maximal and minimal operators $M^+_{\mathcal L}$ and $M^-_{\mathcal L}$. In particular for $v = u(x+\cdot) - u$, we have $M^+_{\mathcal L} v \geq 0 \text{ and } M^-_{\mathcal L} v \leq 0.$ These two inequalities together can also be interpreted as saying that the positive and negative parts of the integral are comparable. Applying this to a function and its translate, we get that if we let \begin{align*} P(x) = \int_{\R^n} (u(x+y)+u(x-y)-2u(x) - u(y) - u(-y) + 2u(0))^+ \frac{(2-s)}{|y|^{n+s}} dy, \\ N(x) = \int_{\R^n} (u(x+y)+u(x-y)-2u(x) - u(y) - u(-y) + 2u(0))^- \frac{(2-s)}{|y|^{n+s}} dy, \end{align*} then $P \approx N$ in $B_1$ (meaning that their ratio is bounded above and below), and also $P(x) \approx N(x) \approx \int_{\R^n} |u(x+y)+u(x-y)-2u(x) - u(y) - u(-y) + 2u(0)| \frac{(2-s)}{|y|^{n+s}} dy.$
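The comparability $P \approx N$ can be checked by hand in the simplest setting. The computation below is only a sketch: it assumes that the extremal operators are taken over all symmetric kernels satisfying just the two-sided bound $(2-s)\lambda |y|^{-n-s} \leq K(y) \leq (2-s)\Lambda |y|^{-n-s}$ (for a smaller class the inequality $M^+ v \geq 0$ is only stronger), and the shorthand $\delta(x,y)$ is introduced here for illustration.

```latex
% \delta(x,y) is shorthand introduced for this sketch, not notation from the notes
\begin{align*}
\delta(x,y) &:= u(x+y)+u(x-y)-2u(x) - u(y) - u(-y) + 2u(0)
             = v(y) + v(-y) - 2v(0), \\
0 \leq M^+ v(0) &= \frac12 \int_{\R^n} \big( \Lambda\, \delta(x,y)^+ - \lambda\, \delta(x,y)^- \big) \frac{(2-s)}{|y|^{n+s}} dy
   = \frac12 \big( \Lambda P(x) - \lambda N(x) \big), \\
0 \geq M^- v(0) &= \frac12 \big( \lambda P(x) - \Lambda N(x) \big), \\
\frac{\lambda}{\Lambda} N(x) &\leq P(x) \leq \frac{\Lambda}{\lambda} N(x), \\
P(x) &\leq P(x) + N(x) = \int_{\R^n} |\delta(x,y)| \frac{(2-s)}{|y|^{n+s}} dy \leq \Big( 1 + \frac{\Lambda}{\lambda} \Big) P(x).
\end{align*}
```

Under these assumptions the comparability constants depend only on the ratio $\Lambda / \lambda$.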

The concavity assumption implies that any function that is an average of values of $u$ around $x$ minus the value of $u(x)$ is a subsolution of the maximal equation for $M^+_{\mathcal L}$. This applies to any second order incremental quotient, from which we can infer that it applies to pure second derivatives. However, since we won't be able to show any bound on second derivatives, what we will use is that $M^+(Lu) \geq 0$ in $B_{1-\epsilon}$ where $L$ is any linear integro-differential operator supported in $B_\epsilon$: $Lu(x) := \int (u(x+y) - u(x)) K(y) dy.$

### The estimate in $W^{s,\infty}$ (what that means)

The first step in the proof of the theorem is to show that each one of the linear operators $L^k u(x)$ is bounded. The reason is that every function $L^ku(x)$ is nonnegative, so an argument involving an integration by parts similar to the first step in the proof for the second order case gives us that $L^k u$ is bounded in $L^1$. Furthermore $L^k u$ is a subsolution of an equation involving the maximal operator $M^+_{\mathcal L}$, so applying an integro-differential version of the half Harnack inequality gives us that every $L^ku$ is bounded.

At this point we want to take a kernel $K$ and prove that the linear integro-differential operator $Lu$ associated to $K$ is bounded. The strategy is to first use the Fourier transform to show that, since $L^ku \in L^2(B_1)$ for some $k$, also $Lu \in L^2(B_1)$. Then we use that $M^+(Lu)\geq 0$ to obtain that $Lu$ is bounded above. This can be done in greater generality than only for kernels $K$ in $\mathcal L_2$ or even $\mathcal L$. Only an upper bound $K(y) \leq (2-s) |y|^{-n-s}$ and symmetry are needed. Since the upper bound holds for any such kernel $K$, we can use the equation to show that the lower bound holds as well.

After this procedure (whose detailed explanation would take quite a bit more space), we have the estimate $\int_{\R^n} |u(x+y)+u(x-y)-2u(x)| \frac{2-s}{|y|^{n+s}} dy \leq C \ \text{ for all } x \in B_{3/4}.$

This estimate is not the same as $u \in C^s$ but it has the same scaling. It means that the quantities $P$ and $N$ defined above are bounded.

Exercise 11. Let $\alpha \in (0,1]$ and $u$ be a function such that for all $x \in B_1$ $\int_{\R^n} |u(x+y)-u(x)| \frac 1 {|y|^{n+\alpha}} dy \leq C.$ Prove that $u \in C^\alpha(B_1)$, but the opposite implication does not hold.

### The estimate in $C^{s+\alpha}$

In order to show that $u \in C^{s+\alpha}$ for some $\alpha>0$, we show that the quantity $P$ is bounded by $P(x) \leq C |x|^\alpha.$ Since $N \approx P$, this also implies the same type of estimate on $N$. Furthermore, since $(-\Delta)^{s/2} u(x) - (-\Delta)^{s/2} u(0) = P(x) - N(x)$, we will be obtaining that $(-\Delta)^{s/2} u \in C^\alpha$, which classically implies that $u \in C^{s+\alpha}$.

The estimate on $P$ is proved through an iterative improvement on the maximum of $P$ on dyadic balls $B_{2^{-k}}$.
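To see how the dyadic improvement produces the exponent $\alpha$, here is a short sketch. It assumes the normalization $P \leq 1$ in $B_1$ and that iterating the one-step improvement (after rescaling) gives $\sup_{B_{2^{-k}}} P \leq (1-\theta)^k$ for every integer $k \geq 0$.

```latex
\begin{align*}
\alpha &:= \log_2 \frac{1}{1-\theta}, \qquad \text{so that } (1-\theta)^k = 2^{-\alpha k}, \\
2^{-k-1} < |x| \leq 2^{-k} \ \Longrightarrow \
P(x) &\leq (1-\theta)^k = 2^{\alpha}\, 2^{-\alpha (k+1)} < 2^{\alpha} |x|^{\alpha} = \frac{1}{1-\theta}\, |x|^\alpha .
\end{align*}
```

In this normalization the constant in $P(x) \leq C|x|^\alpha$ can be taken to be $C = 1/(1-\theta)$.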

For the proof, we start by scaling the equation to the unit ball so that $P \leq 1$ in $B_1$. We want to show that $P \leq 1-\theta$ in $B_{1/2}$. Let $x_0 \in B_{1/2}$ be such that $P(x_0) = \max_{B_{1/2}} P$. Define $A$ to be the set $\{ y : (u(x_0+y)+u(x_0-y)-2u(x_0) - u(y) - u(-y) + 2u(0)) > 0 \}$. This set $A$ plays the role of the projection matrix $\{a_{ij}\}$ in the proof of the second order case. In the second order case, $\{a_{ij}\}$ is the orthogonal projection matrix to the space of directions where $u(x_0+\cdot)-u(\cdot)$ has positive second derivatives. In this case, $A$ is the set of directions $y$ for which the second order incremental quotient $(u(x_0+y)+u(x_0-y)-2u(x_0) - u(y) - u(-y) + 2u(0))$ is positive. In particular we have \begin{align*} P(x_0) = \int_{A} (u(x_0+y)+u(x_0-y)-2u(x_0) - u(y) - u(-y) + 2u(0)) \frac{(2-s)}{|y|^{n+s}} dy, \\ -N(x_0) = \int_{\R^n \setminus A} (u(x_0+y)+u(x_0-y)-2u(x_0) - u(y) - u(-y) + 2u(0)) \frac{(2-s)}{|y|^{n+s}} dy. \end{align*} We define the function $v$ as the linear integro-differential operator which integrates in the directions in $A$ for all points $x \in B_1$ (in particular $P(x_0)=v(x_0)$): $v(x) := \int_{A} (u(x+y)+u(x-y)-2u(x) - u(y) - u(-y) + 2u(0)) \frac{(2-s)}{|y|^{n+s}} dy.$

With these functions $P$, $N$ and $v$ we recreate the proof that we explained above for the second order case. The function $v$ satisfies $M^+_{\mathcal L} v \geq 0$ (at least if we neglect the errors that come from outside of the ball $B_1$). If $v$ is smaller than $(1-C\theta)$ in a substantial proportion of $B_1$, then $v \leq (1-\theta)$ in all of $B_{1/2}$ and we obtain the desired inequality $\max_{B_{1/2}} P \leq (1-\theta)$.

The bad case is if $v$ is larger than $(1-C\theta)$ in most of $B_1$. In that case we consider the operator which integrates on the directions which are not in $A$. $w(x) := \int_{\R^n \setminus A} (u(x+y)+u(x-y)-2u(x) - u(y) - u(-y) + 2u(0)) \frac{(2-s)}{|y|^{n+s}} dy.$

As in the second order case, we observe that $0 \leq P - v \leq C\theta$ and $P-N = v+w$, so $w \approx -N \approx -P$ is a negative quantity in most of $B_1$. By the half Harnack inequality this would imply that $w(0)<0$, which is a contradiction because $w(0)=0$.

Exercise 12 (*). Prove the following generalization of the theorem to variable coefficients. If $u$ is a bounded viscosity solution to $\inf_k \int_{\R^n} (u(x+y) - u(x)) K^k(x,y) dy = 0 \text{ in } B_1,$ where $K^k(x,\cdot) \in \mathcal L_2$ for every $x$ and $k$, $s>1$, and $K^k$ is Hölder continuous with respect to $x$, then $u \in C^{s+\alpha}(B_{1/2})$ for some $\alpha>0$ and $\|u\|_{C^{s+\alpha}(B_{1/2})} \leq C \|u\|_{L^\infty(B_1)}.$ The constants $C$ and $\alpha$ depend on $\lambda$, $\Lambda$ and dimension $n$ only (not on $s$).

In [10], only the case of translation invariant equations is considered (i.e. $x$ independent kernels). The $C^{1,\alpha}$ regularity result was extended to variable coefficients in [13]. For second order fully nonlinear equations, the corresponding result can be found in [11]. However, the statement of Exercise 12 has not been proved anywhere yet. It is not clear how difficult it would be. In [13] the regularity obtained is always lower than the degree of the equation. This is because of a subtle difficulty in handling the error from outside of the ball in the iteration.

# Lecture 6

## The obstacle problem

The obstacle problem consists in finding the smallest super-solution to a non local equation which is constrained to remain above a given obstacle $\varphi$. The equation reads \begin{align*} &Lu \leq 0 \ \text{ in } \Omega, \\ &u \geq \varphi \ \text{ in } \Omega, \\ &Lu = 0 \ \text{ wherever } u>\varphi. \end{align*}

In order to have a well posed problem, it should be complemented with a boundary condition. For example, we can consider the Dirichlet condition $u=\varphi$ in $\R^n \setminus \Omega$.

The definition of the problem already tells us how to prove the existence of the solution: we use Perron's method. Under reasonable assumptions on the non local operator $L$, it is possible to prove the uniqueness of solutions as well.

The regularity of both the solution $u$ and the free boundary $\partial \{u=\varphi\}$ is a much more delicate subject. It is in fact not well understood for generic non local operators $L$, even linear.

Note that the equation can be written as a Bellman equation, $\max(Lu,\varphi-u) = 0 \ \text{ in } \Omega.$
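The equivalence with the three conditions of the obstacle problem is a pointwise case analysis. A short sketch, with the inequalities understood at each point of $\Omega$:

```latex
\begin{align*}
\max(Lu, \varphi - u) = 0
&\iff Lu \leq 0, \quad \varphi - u \leq 0, \quad \text{and at each point } Lu = 0 \text{ or } u = \varphi \\
&\iff Lu \leq 0, \quad u \geq \varphi, \quad Lu = 0 \text{ wherever } u > \varphi .
\end{align*}
```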

The equation models a problem in stochastic control known as the optimal stopping problem. We follow a Lévy process $X(t)$ with generator $L$ (assume it is linear). We are allowed to stop at any time while $X(t) \in \Omega$ but we must stop if at any time $X(t) \notin \Omega$. Whenever we stop at a time $\tau$, we are given the payoff $\varphi(X(\tau))$. Our objective is to maximize the expected payoff by choosing the best stopping time $u(x) := \sup_{\text{stopping time }\tau} \mathbb E \big[ \varphi(X(\tau)) \vert X(0)=x \big].$ The stopping time $\tau$ must fulfill the probabilistic definition of stopping time. That is, it has to be measurable with respect to the filtration associated with $X$. In plain words, our decision to stop or continue must be based only on the current position $X(t)$ (the past is irrelevant from the Markovian assumption and the future cannot be predicted).

Let us explain heuristically why $u$ solves the obstacle problem. For every $x \notin \Omega$, we are forced to stop the process immediately, thus naturally $u(x) = \varphi(x)$. If $x \in \Omega$, we have the choice to either stop or follow the process. If we choose to stop, we get $u(x)=\varphi(x)$. If we choose to continue, we get $Lu(x)=0$. Moreover, we make those choices because the other choice would give us a worse expectation. That is, if we choose to stop at $x$ then $Lu(x) \leq 0$, and if we choose to continue at $x$ then $u(x) \geq \varphi(x)$. These are all the conditions that define the obstacle problem.
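The same heuristic can be written as a formal dynamic programming computation. This is only a sketch: the small time step $h$ is introduced here for illustration, and we assume that $u$ is smooth enough for the expansion $\mathbb E[u(X(h)) \vert X(0)=x] = u(x) + h\, Lu(x) + o(h)$ to hold, which is what it means for $L$ to be the generator of $X$.

```latex
\begin{align*}
u(x) &= \max\Big( \varphi(x), \ \mathbb E\big[ u(X(h)) \,\vert\, X(0)=x \big] \Big)
      && \text{(stop now, or wait a short time $h$)} \\
     &= \max\Big( \varphi(x), \ u(x) + h\, Lu(x) + o(h) \Big), \\
0 &= \max\Big( \varphi(x) - u(x), \ h\, Lu(x) + o(h) \Big) .
\end{align*}
```

The first entry gives $u \geq \varphi$. Dividing the second entry by $h$ and letting $h \to 0$ gives $Lu \leq 0$, with $Lu = 0$ wherever $u > \varphi$.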

The regularity is quite delicate even in the simplest cases. We will start by only considering the case $L = -(-\Delta)^s$ for $s \in (0,1)$. Moreover, we will take $\Omega = \R^n$ to avoid issues concerning the boundary. The problem is well posed in the full space $\R^n$ if $\varphi$ is compactly supported and $n > 1$. The content of this lecture uses only tools from potential analysis and is based on [14]. From the analogy between the obstacle problem for the fractional Laplacian and the thin obstacle problem it is possible to derive refined estimates and study the regularity of the free boundary [15].

### Lipschitz regularity and semiconvexity

The next two propositions hold for generic non local operators $L$. In fact not even the linearity of $L$ is used, only the convexity of $L$ is necessary for the second proposition (the one on semiconvexity).

The first proposition says in particular that if $\varphi$ is Lipschitz, then also $u$ is Lipschitz with the same seminorm.

Proposition. Assume $\varphi$ has a modulus of continuity $\omega$, then also $u$ has the modulus of continuity $\omega$.

Proof. We know that for all $x$ and $y$ in $\R^n$, $\varphi(x+y) + \omega(|y|) \geq \varphi(x).$ Since $u \geq \varphi$, we also have $u(x+y) + \omega(|y|) \geq \varphi(x).$ Fixing $y$, we can take the left hand side of the inequality as a function of $x$, and we realize it is a supersolution of the equation which is above $\varphi$. Therefore, from the definition of the obstacle problem, it is also above $u$ (recall that $u$ was the minimum of such supersolutions). Then $u(x+y) + \omega(|y|) \geq u(x).$ The fact that this holds for any values of $x,y \in \R^n$ means that $u$ has the modulus of continuity $\omega$. $\Box$

The following proposition implies that if $\varphi$ is smooth, then $u$ is semiconvex in the sense that $D^2 u \geq -C \, \mathrm{I}$. By this inequality we mean that the function $u(x) + C \frac{|x|^2}2$ is convex.

Proposition. Assume $D^2 \varphi \geq -C \, \mathrm I$. Then also $D^2 u \geq -C \, \mathrm I$.

Proof. For any $x,y \in \R^n$ we have $\varphi(x+y) + \varphi(x-y) - 2 \varphi(x) \geq -C |y|^2.$ This can be rewritten as $\frac{\varphi(x+y)+\varphi(x-y)+C|y|^2} 2 \geq \varphi(x).$ Since $u \geq \varphi$, $\frac{u(x+y)+u(x-y)+C|y|^2} 2 \geq \varphi(x).$ This is the point in which the convexity of the equation plays a role. Notice that the obstacle problem can be written as a Bellman equation. It is, itself, a convex problem, meaning that the average of two solutions is a super-solution. Therefore, for fixed $y$, the left hand side of the inequality is a super-solution above $\varphi$, and therefore must be larger than $u$: $\frac{u(x+y)+u(x-y)+C|y|^2} 2 \geq u(x).$ This is precisely the fact that $D^2 u \geq -C \, \mathrm I$. $\Box$
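The last step of the proof can be checked directly against the definition of semiconvexity. In the sketch below, $g$ is a name introduced here for the function $u(x) + C\frac{|x|^2}{2}$, and we assume $u$ is continuous so that midpoint convexity implies convexity.

```latex
\begin{align*}
g(x) &:= u(x) + C \frac{|x|^2}{2}, \\
|x+y|^2 + |x-y|^2 - 2|x|^2 &= 2|y|^2, \\
g(x+y) + g(x-y) - 2 g(x) &= u(x+y) + u(x-y) - 2u(x) + C |y|^2 \geq 0 .
\end{align*}
```

All second order differences of $g$ are nonnegative, so $g$ is midpoint convex and hence convex.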

### $C^{2s}$ regularity

For the classical Laplacian, the optimal regularity is $C^{1,1}$, which was originally proved by Frehse. The proof goes like this.

• On one hand, we have the semiconvexity: $D^2 u \geq -C \, \mathrm I$.
• On the other hand, we have from the equation: $\Delta u \leq 0$.

These two things combined give an estimate $|D^2 u| \leq C$.

A similar argument works for the fractional Laplacian to prove $u \in C^{2s}$. However, this regularity is not optimal as soon as $s<1$. Instead $u \in C^{1+s}$ would be the optimal one. The argument goes like this.

• On one hand, $u$ is bounded and semiconvex: $D^2 u \geq -C \, \mathrm I$ and $u \in L^\infty$. Therefore $(-\Delta)^s u \leq C$.
• On the other hand, we have from the equation: $(-\Delta)^s u \geq 0$.

These two things combined say that $|(-\Delta)^s u| \leq C$. This is almost like $u \in C^{2s}$.
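The upper bound $(-\Delta)^s u \leq C$ in the first item can be made explicit. The following is a sketch, assuming the standard symmetrized formula for the fractional Laplacian with a normalization constant $c_{n,s}$ (not written out in these notes) and splitting the integral at the unit ball, which is an arbitrary choice.

```latex
\begin{align*}
(-\Delta)^s u(x) &= \frac{c_{n,s}}{2} \int_{\R^n} \frac{2u(x) - u(x+y) - u(x-y)}{|y|^{n+2s}} dy \\
&\leq \frac{c_{n,s}}{2} \left( \int_{B_1} \frac{C |y|^2}{|y|^{n+2s}} dy + \int_{\R^n \setminus B_1} \frac{4 \|u\|_{L^\infty}}{|y|^{n+2s}} dy \right) \\
&= \frac{c_{n,s}\, |\partial B_1|}{2} \left( \frac{C}{2-2s} + \frac{4 \|u\|_{L^\infty}}{2s} \right).
\end{align*}
```

The first integral uses the semiconvexity, $2u(x) - u(x+y) - u(x-y) \leq C|y|^2$, and the second uses only the boundedness of $u$.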

The precise estimate $u \in C^{2s}$ follows with a little extra work, but we will not need it.

Exercise 13. Let $u$ be a function such that for some constant $C$, $|u| \leq C$, $D^2 u \geq -C \, \mathrm I$ and $(-\Delta)^s u \geq 0$ in $\R^n$. Prove that there is a constant $\tilde C$ (depending only on $C$) so that for all $x \in \R^n$, $\int_{\R^n} \frac{|u(x+y) - u(x)|}{|y|^{n+2s}} dy \leq \tilde C.$ Recall from Exercise 11 that the conclusion above implies that $u \in C^{2s}$.

### $C^{2s+\alpha}$ regularity

The next step is to prove that $u$ is a little bit more regular than $C^{2s}$. We will now show that $w := (-\Delta)^s u$ is $C^\alpha$ for some $\alpha>0$ small (the optimal $\alpha$ will appear in the next section).

Note that the function $w$ solves the following Dirichlet problem for $(-\Delta)^{1-s}$, \begin{align*} (-\Delta)^{1-s} w &= -\Delta \varphi \ \text{ inside } \{u=\varphi\},\\ w &= 0 \ \text{ in } \{u > \varphi\}. \end{align*}
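Formally, this Dirichlet problem comes from composing fractional powers of the Laplacian. A sketch of the computation, ignoring questions about the sense in which each identity holds:

```latex
\begin{align*}
(-\Delta)^{1-s} w &= (-\Delta)^{1-s} (-\Delta)^{s} u = -\Delta u
   && \text{(the Fourier symbols multiply: } |\xi|^{2-2s} |\xi|^{2s} = |\xi|^2 \text{)}, \\
-\Delta u &= -\Delta \varphi && \text{in the interior of } \{u=\varphi\}, \\
w &= (-\Delta)^s u = 0 && \text{in } \{u>\varphi\}, \text{ since } Lu = -(-\Delta)^s u = 0 \text{ there.}
\end{align*}
```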

Here is an important distinction between the cases $s=1$ and $s<1$. If $s=1$, the function $w = \Delta u$ does not satisfy any useful equation, whereas here we can obtain some extra regularity from the equation for $w$.

We can observe here a heuristic reason for the optimal regularity. If we expect that $w$ will be continuous and the contact set $\{u=\varphi\}$ will have a smooth boundary, then the regularity of $w$ on the boundary $\partial \{u=\varphi\}$ is determined by the equation and should be $w \in C^{1-s}$. This corresponds precisely to $u \in C^{1+s}$. The proof will be long though, because we do not know a priori either that $\{u=\varphi\}$ has a smooth boundary or that $w$ is continuous across $\partial \{u=\varphi\}$.

It is convenient to rewrite the problem in terms of $u-\varphi$ instead of $u$. We will abuse notation and still call it $u$, which now solves an obstacle problem with zero obstacle but with a non zero right hand side $\phi := -(-\Delta)^s \varphi$. \begin{align*} u &\geq 0 \ \text{ in } \R^n, \\ (-\Delta)^s u &\geq \phi \ \text{ in } \R^n, \\ (-\Delta)^s u &= \phi \ \text{ wherever } u>0. \end{align*} We also call $w = (-\Delta)^s u$.

Recall that $D^2 u \geq -C \, \mathrm I$ for some constant $C$. In particular $(-\Delta)^{1-s} w \leq C$ for some $C$.

In this section, we will show that $w \in C^\alpha$ for some small $\alpha>0$, which implies that $u \in C^{2s+\alpha}$. We will prove it with an iterative improvement of oscillation scheme. That is, we will show that for any point $x$, $\mathrm{osc}_{B_{2^{-k}}(x)} w \leq C (1-\theta)^k,$ for some $\theta>0$.

The location of $x$ is irrelevant, so let us center these balls at the origin. Naturally, the most delicate position to achieve regularity is the free boundary $FB := \partial \{u>0\}$, so let us concentrate on the case $0 \in FB$.

The following lemma plays an important role in the proof.

Lemma. Given $\mu>0$ and $u$ a semiconvex function (i.e. $D^2 u \geq -C \, \mathrm I$), if $u(x_0) \geq \mu r^2$ at some point $x_0 \in B_r$, then $|\{ x \in B_{2r} : u(x) > 0 \} | \geq \delta |B_r|.$ Here, the constant $\delta$ depends on $\mu$ and $C$ only.

We leave the proof as an exercise.

The right hand side $\phi$ is a smooth function. We will use in particular that it is Lipschitz. By an appropriate scaling of the solution (i.e. replacing $u(x)$ and $w(x)$ by $cr^{-2s}u(rx)$ and $cw(rx)$ with $r \ll 1$ and $c \, \mathrm{osc} \ w = 1$), we can assume that \begin{equation*} \tag{7} \begin{aligned} D^2 u &\geq - \varepsilon \, \mathrm I, \\ (-\Delta)^{1-s} w &\leq \varepsilon, \\ |\phi(x) - \phi(y)| &\leq \varepsilon |x-y|, \text{ for all } x,y, \\ \mathrm{osc}_{\R^n} w &\leq 1. \end{aligned} \end{equation*} Here $\varepsilon$ is arbitrarily small. All these estimates except the last one will remain valid for all scaled solutions in our iteration. Instead of the last one, we will keep along the iteration the following estimate, which is common for non local problems: $\mathrm{osc}_{B_{2^k}} w \leq 2^{\alpha k} \ \text{ for all integers } k \geq 1.$ Here $\alpha$ is a small positive number to be determined later. We will do an iterative improvement of oscillation scheme similar to the proof of the Hölder estimates which we saw in Lecture 2. In order to carry this out, we need to prove that under these assumptions $\mathrm{osc}_{B_{1/2}} w \leq (1-\theta) \ \text{ for some } \theta>0.$
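To see why the scaling produces an arbitrarily small $\varepsilon$ in (7), one can track how each quantity transforms. In the sketch below the names $u_r$, $w_r$, $\phi_r$ are introduced for the rescaled functions, and $C$ denotes the constants in $D^2 u \geq -C \, \mathrm I$ and $(-\Delta)^{1-s} w \leq C$ before scaling.

```latex
\begin{align*}
u_r(x) &:= c\, r^{-2s} u(rx), \qquad w_r(x) := c\, w(rx), \qquad \phi_r(x) := c\, \phi(rx), \\
(-\Delta)^s u_r(x) &= c\, r^{-2s} r^{2s} \big( (-\Delta)^s u \big)(rx) = w_r(x), \\
D^2 u_r(x) &= c\, r^{2-2s} (D^2 u)(rx) \geq - c\, C\, r^{2-2s} \, \mathrm I, \\
(-\Delta)^{1-s} w_r(x) &= c\, r^{2-2s} \big( (-\Delta)^{1-s} w \big)(rx) \leq c\, C\, r^{2-2s}, \\
|\phi_r(x) - \phi_r(y)| &\leq c\, r\, \|\nabla \phi\|_{L^\infty} |x-y| .
\end{align*}
```

Since $s<1$, every right hand side tends to zero as $r \to 0$, while the oscillation of $w_r$ over $\R^n$ stays equal to $c \, \mathrm{osc} \ w = 1$.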

We know that $(-\Delta)^{1-s} w \leq \varepsilon$. So, $w$ is a subsolution to a non local equation of order $2-2s$. There is one easy case of improvement of oscillation. Assume that $$\tag{8} | \{ w - \min_{B_1} \, w < 1/2\} \cap B_1 | \geq \delta |B_1|,$$ for some $\delta>0$. Then we can literally use the result of Lecture 2 to conclude that $w < (1-\theta) + \min_{B_1} w$ in $B_{1/2}$ and obtain the improvement of oscillation.

We will now prove that (8) is in fact always true provided that $\delta$ is chosen sufficiently small (and $0 \in FB$). If there is some point in $B_1$ where $u>0$ (this is certainly true if $0 \in FB$), then $w(x)=\phi(x)$ at some point. Note that since $w \geq \phi$ everywhere and $\phi$ has a very small oscillation in $B_1$ (bounded by an arbitrary constant $\varepsilon$), then the minimum value of $w$ in $B_1$ is approximately the same as the value of $\phi(0)$, or any other value of $\phi$ in $B_1$.

Assuming that (8) is not true, then the opposite must hold, $| \{ w - \min_{B_1} \, w > 1/2\} \cap B_1 | > (1-\delta) |B_1|.$ In particular, since $w=\phi \approx \min_{B_1} w$ where $u>0$, then $| \{ u=0 \} \cap B_1 | > (1-\delta) |B_1|.$ From the Lemma stated above, then $u<\mu$ in $B_{1/2}$ for some $\mu$ (arbitrarily small) depending on $\varepsilon$ and $\delta$.

Let $z$ be a point where $u(z)>0$ and $|z| \ll 1$. Let $x_0$ be the point where the (positive) maximum value of $u(x) - 100 \mu |x-z|^2$ is achieved in $B_{1/2}$. Since $u < \mu$, we see that necessarily $x_0 \in B_{1/10}$. Since $u(x_0)>0$, we must have \begin{align*} \phi(x_0) &= w(x_0) = (-\Delta)^s u(x_0), \\ &= \int_{\R^n} \frac{u(x_0) - u(y)}{|x_0-y|^{n+2s}} dy, \\ &= \int_{B_{1/2}} \frac{u(x_0) - u(y)}{|x_0-y|^{n+2s}} dy + \int_{\R^n \setminus B_{1/2}} \frac{u(x_0) - u(y)}{|x_0-y|^{n+2s}} dy. \\ \end{align*} Note that the first integral is bounded by the fact that $u$ is touched from above by a parabola of opening $100 \mu$ in $B_{1/2}$. For the second integral, we use only that $u(x_0) \geq 0$, $\phi(x_0) \geq - C \mu + \int_{\R^n \setminus B_{1/2}} \frac{- u(y)}{|x_0-y|^{n+2s}} dy.$ Since we assume that (8) does not hold, we know that there will be a point $x_1$ so that $|x_1-x_0| < C \delta^{1/n}$ and $w(x_1) > \min_{B_1} w + 1/2$. In particular $u(x_1)=0$. Therefore \begin{align*} \phi(x_0) - \varepsilon + 1/2 &< w(x_1) = (-\Delta)^s u(x_1),\\ &= \int_{\R^n} \frac{- u(y)}{|x_1-y|^{n+2s}} dy, \\ &\leq \int_{\R^n \setminus B_{1/2}} \frac{-u(y)}{|x_1-y|^{n+2s}} dy. \\ &\leq \int_{\R^n \setminus B_{1/2}} \frac{-u(y)}{|x_0-y|^{n+2s}} dy + C |x_0-x_1| \end{align*} The last term is somewhat delicate since we are not keeping track of the growth of $u$ at infinity. We need to make the estimate depending on what we know of $w$ only. We omit the details of this computation.

We have obtained \begin{align*} \phi(x_0) &\geq - C \mu + \int_{\R^n \setminus B_{1/2}} \frac{- u(y)}{|x_0-y|^{n+2s}} dy,\\ \phi(x_0) - \varepsilon + 1/2 &< \int_{\R^n \setminus B_{1/2}} \frac{-u(y)}{|x_0-y|^{n+2s}} dy + C \delta^{1/n}. \end{align*} We get a contradiction if $\mu$, $\delta$ and $\varepsilon$ are chosen small enough.
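The contradiction can be read off by eliminating the common integral from the two inequalities. In this sketch, $I$ is shorthand introduced here, and $C$ stands for the (possibly different) constants appearing in each line.

```latex
\begin{align*}
I &:= \int_{\R^n \setminus B_{1/2}} \frac{-u(y)}{|x_0-y|^{n+2s}} dy, \\
\phi(x_0) - \varepsilon + \frac12 &< I + C\delta^{1/n} \leq \phi(x_0) + C\mu + C \delta^{1/n}, \\
\frac12 &< \varepsilon + C\mu + C\delta^{1/n} .
\end{align*}
```

This is impossible as soon as $\varepsilon + C\mu + C\delta^{1/n} \leq 1/2$.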

This contradiction means that the inequality (8) must always hold, and we can always carry out the improvement of oscillation.

### Almost optimal regularity

In the notation of the previous section, we have a continuous function $w$ such that \begin{align*} (-\Delta)^{1-s} w &= 0 \ \text{ inside the contact set } \{u=0\}, \\ w &= \phi \ \text{ in } \{u > 0\}. \end{align*}

The regularity of fractional harmonic functions with smooth Dirichlet conditions in smooth domains is well understood (of course we have not established yet, nor will we ever, that $\{u=0\}$ has a smooth boundary). This can be achieved by constructing appropriate barriers. The Poisson kernel of the fractional Laplacian is explicit in the upper half space, which allows us to make direct computations. That is, if we want a function $B$ solving \begin{align*} B &= f \ \text{ in the lower half space } \{x_n \leq 0\}, \\ (-\Delta)^\sigma B &= 0 \ \text{ in the upper half space } \{x_n > 0\}, \end{align*} then $B$ is given, for $x_n > 0$, by the formula $B(x) = \int_{\{y_n \leq 0\}} f(y) P(x,y) dy.$ Here $P$ is the explicit Poisson kernel $P(x,y) = C_{n,\sigma} \left( \frac {x_n} {|y_n|} \right)^{\sigma} \frac 1 {|x-y|^n}.$

We see that $B$ will be as smooth as $f$ (even locally) up to $C^\sigma$, which is the maximum regularity that can be expected at the boundary.

Using this, if the contact set $\{u=0\}$ is convex, we can easily construct barriers to prove that $w \in C^{1-s}$, and therefore $u \in C^{1+s}$.

In general we do not know whether $\{u=0\}$ is convex. However, based on the semiconvexity assumption on $u$, we can prove an almost convexity result for its level sets. This is the content of the following lemma.

Lemma. Let $w \in C^\alpha$ as before and $u(x_0)=0$. There is a small $\delta>0$ (depending on $\alpha$ and $s$) and a constant $C$ such that $x_0$ does not belong to the convex envelope of the set $\{x \in B_r(x_0) : w(x) > w(x_0) + C r^{\alpha+\delta} \}.$

The previous lemma says in some sense that the level sets of $u$ are almost convex up to a correction. This estimate allows us to construct certain barrier functions (a tedious but elementary computation) and improve the initial $C^\alpha$ estimate on $w$ to a $C^{\tilde \alpha}$ estimate, where $\tilde \alpha$ can be computed explicitly. Iterating this procedure, we can obtain a sequence of estimates which converge to (but never reach) $w \in C^{1-s}$. That is, we can prove that $w \in C^\alpha$ for all $\alpha < 1-s$. In other words $u \in C^{1+\alpha}$ for all $\alpha < s$.
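The translation between the exponents for $w$ and for $u$ is simple arithmetic. A sketch, using that $w = (-\Delta)^s u \in C^\alpha$ gives $u \in C^{2s+\alpha}$ (as in the previous section) and writing $\beta$, a name introduced here, for the resulting Hölder exponent of $Du$ when $2s + \alpha > 1$:

```latex
\begin{align*}
w \in C^{\alpha} \ &\Longrightarrow \ u \in C^{2s+\alpha} = C^{1+\beta}, \qquad \beta := 2s + \alpha - 1, \\
\alpha < 1-s \ &\iff \ \beta < s, \\
\alpha \nearrow 1-s \ &\iff \ \beta \nearrow s .
\end{align*}
```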

The optimal regularity $u \in C^{1+s}$ is also true. But we will need to develop completely different tools [15] (related to the thin obstacle problem) in order to prove it.

# References

1. Caffarelli, Luis; Silvestre, Luis (2009), "Regularity theory for fully nonlinear integro-differential equations", Communications on Pure and Applied Mathematics (Wiley Online Library) 62 (5): 597–638, ISSN 0010-3640
2. Barles, Guy; Imbert, Cyril (2008), "Second-order elliptic integro-differential equations: viscosity solutions' theory revisited", Annales de l'Institut Henri Poincaré. Analyse Non Linéaire 25 (3): 567–585, doi:10.1016/j.anihpc.2007.02.007, ISSN 0294-1449
3. Caffarelli, Luis; Silvestre, Luis (2010), "Smooth Approximations of Solutions to Nonconvex Fully Nonlinear Elliptic Equations", Nonlinear partial differential equations and related topics: dedicated to Nina N. Uraltseva (Amer Mathematical Society) 229: 67
4. Silvestre, Luis (2006), "Holder estimates for solutions of integro-differential equations like the fractional laplace", Indiana University Mathematics Journal (Bloomington, Ind.: Dept. of Mathematics, Indiana University, c1970-) 55 (3): 1155–1174, ISSN 0022-2518
5. Silvestre, Luis (2011), "On the differentiability of the solution to the Hamilton–Jacobi equation with critical fractional diffusion", Advances in Mathematics (Elsevier) 226 (2): 2020–2039, ISSN 0001-8708
6. Silvestre, Luis (2010), "Holder estimates for advection fractional-diffusion equations", Arxiv preprint Arxiv:1009.5723
7. Nadirashvili, Nikolai; Vladuţ, Serge (2007), "Nonclassical solutions of fully nonlinear elliptic equations", Geometric and Functional Analysis 17 (4): 1283–1296, doi:10.1007/s00039-007-0626-7, ISSN 1016-443X
8.
9. Krylov, N. V. (1982), "Boundedly inhomogeneous elliptic and parabolic equations", Izvestiya Akademii Nauk SSSR. Seriya Matematicheskaya 46 (3): 487–523, ISSN 0373-2436
10. Caffarelli, Luis; Silvestre, Luis (2011), "The Evans-Krylov theorem for nonlocal fully nonlinear equations", Annals of Mathematics. Second Series 174 (2): 1163–1187, doi:10.4007/annals.2011.174.2.9, ISSN 0003-486X
11. Caffarelli, Luis A.; Cabré, Xavier (1995), Fully nonlinear elliptic equations, American Mathematical Society Colloquium Publications, 43, Providence, R.I.: American Mathematical Society, ISBN 978-0-8218-0437-7
12. Caffarelli, Luis; Silvestre, Luis (2010), "On the Evans-Krylov theorem", Proceedings of the American Mathematical Society 138 (1): 263–265, doi:10.1090/S0002-9939-09-10077-1, ISSN 0002-9939
13. Caffarelli, Luis; Silvestre, Luis (2011), "Regularity results for nonlocal equations by approximation", Archive for Rational Mechanics and Analysis 200 (1): 59–88, doi:10.1007/s00205-010-0336-4, ISSN 0003-9527
14. Silvestre, Luis (2007), "Regularity of the obstacle problem for a fractional power of the Laplace operator", Comm. Pure Appl. Math. 60: 67–112, doi:10.1002/cpa.20153, ISSN 0010-3640
15. Caffarelli, Luis A.; Salsa, Sandro; Silvestre, Luis (2008), "Regularity estimates for the solution and the free boundary of the obstacle problem for the fractional Laplacian", Invent. Math. 171: 425–461, doi:10.1007/s00222-007-0086-6, ISSN 0020-9910