PARTIAL DIFFERENTIATION, TANGENT PLANES etc


     For a function $y = f(x)$ of one variable the derivative $f'(a)$ was defined as the limit

$\displaystyle{\lim_{h\,\to\,0} \frac{f(a+h) - f(a)}{h}}$

of the Newtonian Quotient. The value of $f'(a)$ was the rate of change of $f(x)$ at $x = a$. It was interpreted, for instance, as the limit of the slope of secant lines passing through the point $(a, f(a))$ as
shown to the right in the case of a parabola and $a = -1$. Knowing a few derivatives, we computed many more using the Product, Quotient, and Chain Rules. Via the Point Slope formula the equation of the tangent line at $(a, f(a))$ became

$y = f(a) + f'(a)(x-a).$

This provided a Linearization, $L(x) = f(a) + f'(a)(x-a)$,
of $f$ that was useful in various estimates. In addition, first and second order derivatives turned out to be very helpful with determining graphs and with optimization.

    So how can we deal with functions $z = f(x, y)$ of two variables? Given all that we've done with surfaces, slicing and vectors, it should come as no surprise that we employ these ideas to the full, just one variable at a time. Partial derivatives, $f_x, f_y, ... $ are the rates of change of $f$ with respect to each variable separately:

$\displaystyle{f_x(a, b) = \frac{\partial f}{\partial x}\Big|_{(a, b)} = \lim_{h \, \to \, 0} \frac{f(a+h, b) - f(a, b)}{h}, \quad  f_y(a, b) =  \frac{\partial f}{\partial y}\Big|_{(a, b)} = \lim_{k \, \to\, 0} \frac{f(a, b+k) - f(a, b)}{k}. }$  

In other words, we differentiate with respect to one variable exactly as in the one variable case, holding all the other variables fixed. All the standard techniques of single variable calculus thus apply. And once we've understood what it all means for functions $f(x, y)$ of two variables, then functions $f(x, y, z, ... )$ of more than two variables can be dealt with in exactly the same way, but don't expect to look at 'pictures' of things for functions of three or more variables!
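
To see the definition in action, here is a minimal numerical sketch: it evaluates the difference quotients above for a small step, holding the other variable fixed. The sample function $f(x, y) = x^2 y + \sin y$, the point $(1, 2)$, the step size and the helper names `partial_x`, `partial_y` are all assumptions chosen purely for illustration.

```python
import math

def partial_x(f, a, b, h=1e-6):
    """Difference quotient for f_x(a, b): vary x only, keep y = b fixed."""
    return (f(a + h, b) - f(a, b)) / h

def partial_y(f, a, b, k=1e-6):
    """Difference quotient for f_y(a, b): vary y only, keep x = a fixed."""
    return (f(a, b + k) - f(a, b)) / k

def f(x, y):
    # Illustrative sample function (an assumption, not from the notes).
    return x**2 * y + math.sin(y)

a, b = 1.0, 2.0
# Exact partials by the usual one-variable rules:
#   f_x = 2xy,  f_y = x^2 + cos y
fx_exact = 2 * a * b
fy_exact = a**2 + math.cos(b)

print("f_x estimate:", partial_x(f, a, b), " exact:", fx_exact)
print("f_y estimate:", partial_y(f, a, b), " exact:", fy_exact)
```

The estimates only approximate the limit, so they should agree with the exact values to several decimal places rather than exactly.
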
Let's see what it means geometrically: take vertical slices in the $x$ and $y$-directions passing through the point $P = (a, b, f(a, b))$ on the surface. Then the traces of the surface on these slices are the graphs of the respective vector functions
 
$\mathbf{r}_1(x) = \big\langle x, b, f(x, b) \big\rangle, \qquad \mathbf{r}_2(y) = \big\langle a, y, f(a, y) \big\rangle$,

both shown in orange on the surface.
The vector derivatives


$\mathbf{T}_1 = \mathbf{r}'_1(a) = \big\langle 1, 0, f_x(a, b) \big\rangle = \mathbf{i} + f_x(a, b)\,\mathbf{k}, \qquad \mathbf{T}_2 = \mathbf{r}'_2(b) = \big\langle 0, 1, f_y(a, b) \big\rangle = \mathbf{j} + f_y(a, b)\,\mathbf{k},$

are tangent vectors at $P$ to these space curves, and the tangent plane at $P$ to the surface is the one containing these two tangent vectors. After a bit of calculation using $\mathbf{i} \times \mathbf{k} = -\mathbf{j}$, etc., we then get

$\mathbf{n} = \mathbf{T}_1 \times \mathbf{T}_2 = -f_x(a,b)\,\mathbf{i} - f_y(a, b)\,\mathbf{j} + \mathbf{k}$

for the normal to the surface at $P$.
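
The 'bit of calculation' can also be checked by machine. The sketch below computes the cross product of the two tangent vectors for one concrete case; the sample function $f(x, y) = x^2 + 3xy$ at $(a, b) = (1, 2)$ and the helper names `cross` and `dot` are assumptions made only for this illustration.

```python
def cross(u, v):
    """Cross product u x v of two vectors in 3-space, given as 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Illustrative example: f(x, y) = x^2 + 3xy at (a, b) = (1, 2),
# so f_x = 2x + 3y = 8 and f_y = 3x = 3 there.
fx, fy = 8.0, 3.0

T1 = (1.0, 0.0, fx)   # tangent to the slice in the x-direction
T2 = (0.0, 1.0, fy)   # tangent to the slice in the y-direction

n = cross(T1, T2)
print("n =", n)                          # components are -f_x, -f_y, 1
print("n . T1 =", dot(n, T1), " n . T2 =", dot(n, T2))
```

Both dot products vanish, which is just the statement that the normal is perpendicular to each tangent vector.
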

Ah, now I see! The tangent line in 2-space becomes the tangent plane in 3-space and this is where the normal vector $\mathbf{n}$ comes in: at the point $P = (a, b, f(a, b))$ on the surface an equation for the tangent plane is

$\big\langle x - a,\, y - b,\, z - f(a, b) \big\rangle \cdot \mathbf{n} = 0$,      i.e.,      $z  =  f(a, b) + (x-a)f_x(a, b) + (y-b)f_y(a, b)$.

So near $P$ the Linearization of $f$ will be

$L(x, y)  =  f(a, b) + (x-a) f_x(a, b) + (y-b) f_y(a, b)$,

while the Change in $f$ will be

$\Delta f  =  f(x, y) - f(a, b)  \approx  L(x, y) - f(a, b)  =  f_x(a, b)\, \Delta x + f_y(a, b)\, \Delta y$, where $\Delta x = x - a$ and $\Delta y = y - b$.

The pattern is just the same as for one variable, making allowances for the appearance of the second variable. And the corresponding formulas for functions of more than two variables will follow exactly this pattern. Thus we are set to exploit differential calculus in several variables in a completely analogous way to the one variable case.
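
As a quick illustration of how such an estimate works, the sketch below linearizes a sample function near a point and compares the result with the true value. The choice $f(x, y) = \sqrt{x^2 + y^2}$ at $(a, b) = (3, 4)$ and the nearby point $(3.02, 3.97)$ are assumptions picked only because the arithmetic is clean.

```python
import math

def f(x, y):
    # Illustrative sample function: distance from the origin.
    return math.sqrt(x**2 + y**2)

a, b = 3.0, 4.0
f_ab = f(a, b)          # = 5
fx_ab = a / f_ab        # f_x = x / sqrt(x^2 + y^2) = 3/5 at (3, 4)
fy_ab = b / f_ab        # f_y = y / sqrt(x^2 + y^2) = 4/5 at (3, 4)

def L(x, y):
    """Linearization of f at (a, b): the height of the tangent plane."""
    return f_ab + (x - a) * fx_ab + (y - b) * fy_ab

x, y = 3.02, 3.97
print("L(x, y) =", L(x, y))   # tangent-plane estimate
print("f(x, y) =", f(x, y))   # true value, both close to 4.988
print("estimated change in f:", L(x, y) - f_ab)
```

The estimate is only good near $(a, b)$; moving the test point further away makes the gap between $L$ and $f$ grow.
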

      What about the Chain Rule? In one variable it refers to the derivative of the composition of two functions of one variable and says that
if $y = f(x)$ and $x = x(t)$, then $y = f(x(t))$ is a function of $t$ such that

$\displaystyle{\frac{d y }{d t}  =  \frac{d f}{d x} \frac{d x}{d t}}$.

 In two variables there is a similar formula, but there are more possibilities. The General Version of the Chain Rule starts with a function $z = f(x, y)$ of two variables and says that if $x = x(s, t)$ and $y = y(s, t)$ are themselves functions of two variables $s, t$, then the composition $z  = f(x(s, t), y(s, t))$ is a function of $s, t$ and

$\displaystyle{\frac{\partial z}{\partial s}   =   \frac{\partial f}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s},}$                  $\displaystyle{\frac{\partial z}{\partial t}   =   \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}.}$

The pattern should be clear.  For instance, if $w = f(x, y, z)$ and $x = x(r, s, t),  y = y(r, s, t),  z = z(r, s, t)$, then

$w =   f(x(r, s, t), y(r, s, t), z(r, s, t))$

is a function of $r, s, t$ while

$\displaystyle{\frac{\partial w}{\partial r}   =   \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial r},}$

and so on; it looks like dot products really are involved in some way!

      Why might we be interested in such compositions? Well, earlier we looked at the idea of using both Cartesian and Polar coordinates in the plane. In this case $x = x(r, \theta) = r \cos \theta$ and  $y = y(r, \theta) = r \sin \theta$, so under the change of variable from Cartesian to Polar coordinates, a function $z = f(x, y)$ becomes a function $z = f(x(r, \theta), y(r, \theta))$ of polar variables and

$\displaystyle{\frac{\partial z}{\partial r}    =    \frac{\partial f}{\partial x}\frac{\partial x}{\partial r}+ \frac{\partial f}{\partial y} \frac{\partial y}{\partial r}    =    \cos \theta \frac{\partial f}{\partial x} + \sin \theta \frac{\partial f}{ \partial y}, }$

 $\displaystyle{\frac{\partial z}{\partial \theta}    =    \frac{\partial f}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta}    =    -r \sin \theta \frac{\partial f}{\partial x} + r \cos \theta \frac{\partial f}{\partial y}}$.
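
These two polar formulas are easy to sanity-check numerically. In the sketch below the right-hand sides are compared with direct difference-quotient estimates of $\partial z / \partial r$ and $\partial z / \partial \theta$; the sample function $f(x, y) = x^2 y$, the point $(r, \theta) = (2, 0.7)$ and the use of a symmetric (central) difference quotient are all assumptions made for illustration.

```python
import math

# Illustrative sample function and its partials (computed by hand).
def f(x, y):
    return x**2 * y

def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x**2

def z(r, theta):
    """f rewritten in polar variables: z = f(r cos(theta), r sin(theta))."""
    return f(r * math.cos(theta), r * math.sin(theta))

r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

# Chain Rule values from the two formulas above.
dz_dr_chain = math.cos(theta) * f_x(x, y) + math.sin(theta) * f_y(x, y)
dz_dtheta_chain = (-r * math.sin(theta) * f_x(x, y)
                   + r * math.cos(theta) * f_y(x, y))

# Direct estimates using symmetric difference quotients in r and theta.
h = 1e-6
dz_dr_num = (z(r + h, theta) - z(r - h, theta)) / (2 * h)
dz_dtheta_num = (z(r, theta + h) - z(r, theta - h)) / (2 * h)

print("dz/dr:     chain", dz_dr_chain, " numeric", dz_dr_num)
print("dz/dtheta: chain", dz_dtheta_chain, " numeric", dz_dtheta_num)
```
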

Conversely, if we start with a function $z = f(r, \theta)$ and change to Cartesian coordinates, what will be the corresponding formulas for ${\partial z}\big/{\partial x}$ and ${\partial z}\big/{\partial y}$? In 3-space also there is often the need to change from Cartesian to other sets of coordinates (or the other way round), so here too there will be a need for the Chain Rule, this time for functions of three variables.

      A 'Less General Version' of the Chain Rule would be the case when $ z = f(x, y)$ is a function of two variables, but $x = x(t), y = y(t)$ are just functions of a single variable, say $t$. Then $z = f(x(t), y(t))$ is simply a function of $t$, so that we can ask for $z'(t)$. The Chain Rule is basically the same:

$\displaystyle{z'(t)    =    \frac{d z }{dt}    =   \frac{\partial f}{\partial x} \frac{d x}{d t} + \frac{\partial f}{\partial y} \frac{dy}{dt}   =   x'(t) \frac{\partial f}{\partial x} + y'(t) \frac{\partial f}{\partial y}}$.

Again this looks like a dot product, and the next few sections will show exactly why it is and why it too occurs very naturally.
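
Here is a small sketch of this version of the Chain Rule written literally as a dot product and then compared with a direct estimate of $z'(t)$. The sample function $f(x, y) = x^2 + xy$, the curve $x = \cos t$, $y = \sin t$ and the value $t = 0.5$ are assumptions chosen only for illustration.

```python
import math

# Partials of the illustrative function f(x, y) = x^2 + x*y.
def f_x(x, y):
    return 2 * x + y

def f_y(x, y):
    return x

def z(t):
    """The composition z(t) = f(x(t), y(t)) with x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    return x**2 + x * y

t = 0.5
x, y = math.cos(t), math.sin(t)

partials = (f_x(x, y), f_y(x, y))          # <f_x, f_y> at (x(t), y(t))
velocity = (-math.sin(t), math.cos(t))     # <x'(t), y'(t)>

# Chain Rule as a dot product of the two vectors above.
z_prime_chain = sum(p * v for p, v in zip(partials, velocity))

# Direct estimate of z'(t) by a symmetric difference quotient.
h = 1e-6
z_prime_num = (z(t + h) - z(t - h)) / (2 * h)

print("chain rule:", z_prime_chain, " numeric:", z_prime_num)
```
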

     An 'Even Less General Version' of the Chain Rule would be a case that really occurred in single variable calculus except that now we can be 'cleverer' by regarding an equation $f(x, y) = 0$ as defining $y$ implicitly as a function of $x$. In all previous examples in single variable calculus there was an explicit equation like $x^2 + y^2 = 1$ relating $x$ and $y$, and the single variable Chain Rule was then used to compute $dy \big/ dx$. But now if the 'Less General Version' of the Chain Rule is used with $t = x$, then $f(x, y(x)) = 0$ and

$\displaystyle{\frac{d f}{dx}   =   \frac{\partial f}{\partial x}\frac{d x}{ d x} + \frac{\partial f}{\partial y} \frac{d y }{dx}  =  \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{d y}{d x}   =   f_x + f_y \frac{d y}{d x}  =   0 }$,             i.e.,  $\displaystyle{\frac{d y}{d x}   =   - \frac{f_x}{f_y}}$, provided $f_y \ne 0$.
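
For the circle $x^2 + y^2 = 1$ mentioned above, take $f(x, y) = x^2 + y^2 - 1$. The sketch below compares the slope $-f_x / f_y$ with the slope obtained by solving explicitly for the upper semicircle $y = \sqrt{1 - x^2}$; the point $(0.6, 0.8)$ and the helper names are assumptions chosen for illustration.

```python
import math

# f(x, y) = x^2 + y^2 - 1, so the circle is the curve f(x, y) = 0.
def f_x(x, y):
    return 2 * x

def f_y(x, y):
    return 2 * y

def y_upper(x):
    """Explicit solution for the upper semicircle, valid for -1 < x < 1."""
    return math.sqrt(1 - x**2)

x0, y0 = 0.6, 0.8                      # a point on the circle with f_y != 0

slope_implicit = -f_x(x0, y0) / f_y(x0, y0)    # dy/dx = -f_x / f_y

# Slope of the explicit branch, estimated by a symmetric difference quotient.
h = 1e-6
slope_explicit = (y_upper(x0 + h) - y_upper(x0 - h)) / (2 * h)

print("implicit:", slope_implicit, " explicit:", slope_explicit)  # both near -0.75
```

At the points $(\pm 1, 0)$ we have $f_y = 0$, the tangent line is vertical, and the formula does not apply.
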