LAGRANGE MULTIPLIERS



     By now you're used to finding maxima and minima of both single- and multi-variable functions. But what if a restriction is placed on the variables? Such problems, called Constrained Optimization Problems, come up frequently in applications. Suppose

        you want to maximize production of T-shirts by allocating a fixed amount of capital between labor costs and manufacturing costs,

        or you release a weather balloon at the end of a fixed length of cable, and you want to know where it ends up,

        or you're planning an expedition to the South Pole and need to pack food - but you've got a limited amount of space on your dogsled.
 

     There’s a very powerful method, called Lagrange Multipliers, that works in a wide variety of such problems. Let’s set up an idealized mathematical model as an example by taking 'a hike in the mountains':

  Problem: Find the maximum and minimum values of

$z  =  f(x, y)  =  y^2 - x^2$

subject to the constraint

$g(x,\,y)\,=\,x^2+y^2-4\,=\,0$.

The graph of $f$ is the familiar hyperbolic paraboloid shown to the right, while the graph of $g$ is a circle of radius $2$ centered at the origin in the $xy$-plane. The role of $g$ is to place restrictions on the values of $x,\,y$ used to evaluate $f(x,\,y)$. Without restrictions on $x, y$ there'd be no maximum or minimum value of $f(x,\,y).$ There'd only be the saddle point!

Don't see any hiking?

Well, the trouble is that the graph of $f$ is a surface in 3-space, while the graph of $g$ is a circle in the $xy$-plane. To interpret the graph of 

$g(x,\,y) = x^2+y^2 - 4 = 0$ 

as a surface in $3$-space we simply think of it as a cylinder.

     Ah, now both graphs are surfaces! Imposing the constraint $g(x,\,y)= 0$ on the input variables means we look for the maximum and minimum values of $f(x,\,y)$ only where the surface graphs of $f$ and $g$ intersect, and that means asking for the maximum and minimum heights along the path formed by the curve of intersection. These must exist because the path is a closed loop, as you'd find out soon enough by taking a hike along it. No need to go to the South Pole now.



    Actually, two different max/min methods from one variable could be used. Let's see how!


   Method I: solving for $y^2$ in $g$ gives $y^2 = 4 - x^2$, so on $g$, 

$f(x,\,y) = y^2 - x^2 = 4 - 2 x^2$,

which is now a function of one variable, $x$, whose critical point we can find:

$ f'(x) = -4 x = 0$,

 i.e., $x = 0$, in which case $y = \pm 2$. Then

$f(0,\,-2) = 4 = f(0,\,2)$,

giving a maximum value because $f''(0) = -4 < 0$. On the other hand, if $x$ is eliminated instead, so that $f = 2y^2 - 4$ on $g = 0$, the same method gives a minimum value

 $f(-2,\, 0) = -4 = f(2,\, 0)$.
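
Method I can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper name `h`, the grid size `n`, and the use of plain sampling in place of calculus are all illustrative assumptions. It evaluates the reduced function $4 - 2x^2$ across $-2 \le x \le 2$, the only values of $x$ that occur on the circle.

```python
# Method I check: on the circle x^2 + y^2 = 4 the objective reduces to
# h(x) = 4 - 2x^2, with x restricted to the interval [-2, 2].

def h(x):
    """f restricted to the constraint, written as a function of x alone."""
    return 4 - 2 * x**2

# Sample x on [-2, 2]; the grid size n is an arbitrary illustrative choice.
n = 400
xs = [-2 + 4 * i / n for i in range(n + 1)]
values = [h(x) for x in xs]

max_value = max(values)
min_value = min(values)
x_at_max = xs[values.index(max_value)]
x_at_min = [x for x, v in zip(xs, values) if v == min_value]

print("max", max_value, "at x =", x_at_max)
print("min", min_value, "at x =", x_at_min)
```

The sampled maximum is $4$ at $x = 0$ and the sampled minimum is $-4$ at $x = \pm 2$. Notice that the minimum sits at the endpoints of the interval, where $f'(x) \ne 0$, which is why solving $f'(x) = 0$ alone does not find it and the other variable has to be eliminated instead.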

   Method II: the circle $g$ can be parameterized by

 $\mathbf{r}(t) = (2 \cos t, \, 2 \sin t)$.

 Since we want to maximize and minimize $f$ only along this curve, we can plug into $f(x,\,y)$:

 $f(\mathbf{r}(t)) = (2\sin t)^2 - (2 \cos t)^2 = 4(\sin^2 t - \cos^2 t)$,

giving us the height along the path as a function of the single parameter $t$. We can now optimize $f(\mathbf{r}(t))$ by finding critical points:

$\displaystyle{\frac{d}{dt} f(\mathbf{r}(t)) = 16 \sin t \cos t = 0\,,}$

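i.e., $t = 0,\, \frac{\pi}{2},\, \pi,\, \frac{3\pi}{2}$, and it's easy to check that on $g$ the minimum value of $f$ occurs at $t = 0, \pi$, while the maximum value occurs at $t = \frac{\pi}{2},\, \frac{3\pi}{2}$. To see this, note that $4(\sin^2 t - \cos^2 t) = -4\cos 2t$, which equals $-4$ at $t = 0,\, \pi$ and $4$ at $t = \frac{\pi}{2},\, \frac{3\pi}{2}$.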
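
The four critical values of $t$ in Method II can also be checked by direct evaluation. This is a minimal Python sketch using only the standard library; the function names `f` and `r` are chosen to mirror the symbols above and are otherwise illustrative.

```python
import math

def f(x, y):
    """The objective f(x, y) = y^2 - x^2."""
    return y**2 - x**2

def r(t):
    """Parameterization of the constraint circle of radius 2."""
    return (2 * math.cos(t), 2 * math.sin(t))

# Critical values of t, where 16 sin t cos t = 0.
critical_ts = [0, math.pi / 2, math.pi, 3 * math.pi / 2]

# Height of the path at each critical value.
heights = [f(*r(t)) for t in critical_ts]

for t, height in zip(critical_ts, heights):
    print(f"t = {t:.4f}, f = {height:.4f}")
```

Up to floating-point rounding, the heights are $-4,\ 4,\ -4,\ 4$, matching the minimum at $t = 0,\, \pi$ and the maximum at $t = \frac{\pi}{2},\, \frac{3\pi}{2}$.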


 

     Whichever of these two methods we use, therefore, the end result is the same as we saw earlier with the surface approach: the minimum values occur at $(\pm 2,\,0)$ and the maximum values at $(0,\, \pm 2)$. The surface approach does make clearer what's happening, but why bother? Well, Method I depended on solving the constraint equation $g = 0$ to eliminate all but one variable, while Method II needed a parameterization of $g = 0$. Usually we aren't so lucky, especially when $f, \, g$ are functions of more than two variables!

    By adopting the surface approach, we pull together a great deal of what we've learned in one very powerful method. Below to the left, the path of intersection of the surface graphs of $f, g$ is shown in orange, with a few key points identified. To the right, in the $xy$-plane, the graph of $g$, shown in black, has been layered over the contour map of $f$, shown in blue, with the corresponding key points identified.

 



    Clearly the maximum and minimum values of $f(x, y)$ subject to the constraint $g = 0$ occur where contours of $f$ are tangential to the constraint curve $g = 0$. That makes perfectly good sense: when we are at the highest or lowest point on the path, like $P$ and $R$, we'll be walking in the same direction as the contour through that point. Any other direction would mean we're still going uphill or downhill, as at $Q$! How can this be expressed in derivative terms? Well, recall that the gradient vector at a point is perpendicular to the contour through that point. The constraint curve $g = 0$ is itself a contour of $g$, so $\nabla g$ is perpendicular to it. Hence the contour of $f$ will be tangential to the graph of $g = 0$ at $(a, b)$ when $(\nabla f)(a, b)$ is parallel to $(\nabla g)(a, b)$. And when are these two vectors parallel? Assuming $(\nabla g)(a, b) \ne 0$, it happens when $(\nabla f)(a, b)$ is a multiple of $(\nabla g)(a, b)$. This multiple is usually denoted by $\lambda$, and in practice it has an important meaning. Summarizing:


   Method of Lagrange Multipliers: the maximum and minimum values of
       $z = f(x, y)$ subject to the constraint $g(x, y) = 0$ occur at a point $(a, b)$ for
       which there exists $\lambda$ such that 
$(\nabla f)(a, b) = \lambda (\nabla g)(a, b), \,\,\,\,\, g(a, b) = 0.$
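
As a check, the method can be applied to the hiking example, where $f(x, y) = y^2 - x^2$ and $g(x, y) = x^2 + y^2 - 4$. The gradients are

$\nabla f = (-2x,\, 2y), \qquad \nabla g = (2x,\, 2y)$,

so the two conditions become

$-2x = 2\lambda x, \qquad 2y = 2\lambda y, \qquad x^2 + y^2 = 4$.

The first equation forces $x = 0$ or $\lambda = -1$, and the second forces $y = 0$ or $\lambda = 1$. Since $x$ and $y$ can't both be zero on the circle, either $x = 0,\ y = \pm 2,\ \lambda = 1$ with $f = 4$, or $y = 0,\ x = \pm 2,\ \lambda = -1$ with $f = -4$. These are the same maximum and minimum found by Methods I and II. A short Python sketch can confirm the arithmetic; the helper names `grad_f`, `grad_g` and `cross` are illustrative and not part of the original notes.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = y^2 - x^2."""
    return (-2 * x, 2 * y)

def grad_g(x, y):
    """Gradient of g(x, y) = x^2 + y^2 - 4."""
    return (2 * x, 2 * y)

def cross(u, v):
    """2D cross product u1*v2 - u2*v1; it is zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

# Candidate points and multipliers from the worked solution above.
candidates = [((0, 2), 1), ((0, -2), 1), ((2, 0), -1), ((-2, 0), -1)]

results = []
for (x, y), lam in candidates:
    gf, gg = grad_f(x, y), grad_g(x, y)
    assert gf == (lam * gg[0], lam * gg[1])  # grad f = lambda * grad g
    assert x**2 + y**2 - 4 == 0              # the constraint g = 0 holds
    results.append(((x, y), lam, y**2 - x**2))
    print((x, y), "lambda =", lam, "f =", y**2 - x**2)

# At a point on the circle that is not a candidate, such as (sqrt 2, sqrt 2),
# the gradients are not parallel, so the cross product is nonzero.
s = 2 ** 0.5
off_candidate = cross(grad_f(s, s), grad_g(s, s))
print("cross product at (sqrt 2, sqrt 2):", off_candidate)
```

The nonzero cross product at $(\sqrt{2}, \sqrt{2})$ reflects the fact that the contour of $f$ crosses the circle there instead of touching it, so the path is still climbing or descending.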

Since $\nabla f, \nabla g$ are well-defined when $w = f(x, y, z)$ and $g(x, y, z)$ are functions of $3$ variables (or any greater number of variables for that matter), the Method of Lagrange Multipliers works with any number of variables $x, y, z, \ldots$. It is this in part that makes it so important in so many applications in engineering, science and business. Let's return to the ones mentioned at the beginning.