DG

Differentiable functions

Definition: A function f:R -> R is differentiable at a point x if the slopes of the secants between x and x+dx have a limit as dx -> 0. This limit is called the derivative of f at x, written f'(x)=lim [f(x+dx)-f(x)]/dx.
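The limit of secant slopes can be watched numerically. The sketch below is only an illustration: the function sin, the point x0 = 1 and the helper name secant_slope are choices made here, not part of the notes.

```python
import math

def secant_slope(f, x, dx):
    """Slope of the secant through (x, f(x)) and (x+dx, f(x+dx))."""
    return (f(x + dx) - f(x)) / dx

# Illustrative choice: f = sin at x0 = 1, so the slopes should approach cos(1).
x0 = 1.0
slopes = [secant_slope(math.sin, x0, 10.0 ** -k) for k in range(1, 7)]

for k, s in enumerate(slopes, start=1):
    print(f"dx = 1e-{k}: secant slope = {s:.8f}")
print(f"cos(1) = {math.cos(x0):.8f}")
```

As dx shrinks, the printed slopes settle toward cos(1), which is the derivative of sin at 1.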

This idea goes back to Barrow, who constructed tangent lines as limits of secants (see Barrow's differential triangle).

For functions of more than one variable, say f:R^2 -> R, such a limit usually does not exist, because the slope of a secant depends on the direction from which you approach the point. If you consider secants between (x, y) and (x+dx, y+dy) for small dx and dy, their slopes typically vary with the relative sizes of dx and dy, i.e. with the direction of the vector (dx, dy).
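The direction dependence is easy to see numerically. The sketch below assumes that the slope of a secant means rise divided by the horizontal distance |(dx, dy)|. The function f(x, y) = x^2 + 3y and the point (1, 1) are illustrative choices, not taken from the notes.

```python
import math

def f(x, y):
    # Illustrative function (an assumption made for this example).
    return x * x + 3.0 * y

def secant_slope(f, x, y, dx, dy):
    """Rise of f between (x, y) and (x+dx, y+dy) per unit horizontal distance."""
    return (f(x + dx, y + dy) - f(x, y)) / math.hypot(dx, dy)

h = 1e-6  # small displacement length
slopes = {}
for deg in (0, 45, 90):
    t = math.radians(deg)
    slopes[deg] = secant_slope(f, 1.0, 1.0, h * math.cos(t), h * math.sin(t))
    print(f"direction {deg:3d} degrees: slope = {slopes[deg]:.6f}")
```

The three directions give three different slopes (about 2, 3.54 and 3), so no single number can serve as "the" limit of secant slopes at (1, 1).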

What we really want then is a generalization of the tangent line to the tangent plane (or a higher dimensional linear manifold for functions of more variables).

What does tangent mean?

The calculus answer is through linear approximation. The tangent line should be a ``good'' linear approximation to f near x.

For functions of, say, 2 variables the tangent plane should be a ``good'' linear approximation to f near (x, y).

What does ``good'' mean?

``Good'' means that the graph approaches the tangent line the way a parabola approaches the x axis near the origin: really fast. In other words, the error of approximation gets small ``really'' fast.

Linear approximation

Suppose L is a line through the point (x, f(x)).

If L is not vertical, then L(x+dx)=f(x)+m dx, where m is its slope.

The error between L and f is \epsilon = L(x+dx)-f(x+dx) = f(x)+m dx-f(x+dx).

As dx -> 0, if f(x+dx) -> f(x), i.e. if f is continuous at x, then \epsilon -> 0.

To see how fast \epsilon goes to 0 compared to dx, we can take the ratio of the two: \epsilon/dx = m-[f(x+dx)-f(x)]/dx.

We can say that \epsilon goes to zero ``faster'' than dx whenever the ratio between them goes to 0 as dx -> 0.

Definition: We say that f is differentiable at x, if \epsilon/dx -> 0 as dx -> 0.

(For functions of one variable this is equivalent to the previous definition.)

If \epsilon/dx -> 0 as dx -> 0, then m=f'(x), and with this slope L is exactly the tangent line L(x+dx)=f(x)+f'(x) dx.

Thus, f(x+dx)=f(x)+f'(x) dx+\epsilon.
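To see that only the tangent slope makes \epsilon/dx vanish, the sketch below compares two lines through the same point of the parabola f(x) = x^2 at x = 1. The parabola, the competing slope 1.5 and the name error_ratio are illustrative assumptions.

```python
def square(x):
    return x * x

def error_ratio(f, x, m, dx):
    """epsilon/dx for the line through (x, f(x)) with slope m."""
    eps = f(x) + m * dx - f(x + dx)
    return eps / dx

steps = [10.0 ** -k for k in range(1, 6)]
tangent = [error_ratio(square, 1.0, 2.0, dx) for dx in steps]  # m = f'(1) = 2
other = [error_ratio(square, 1.0, 1.5, dx) for dx in steps]    # some other slope

for dx, a, b in zip(steps, tangent, other):
    print(f"dx = {dx:.0e}: m = 2 gives {a:+.6f}, m = 1.5 gives {b:+.6f}")
```

For this f the ratio equals (m-2)-dx, so with m = 2 it tends to 0, while with m = 1.5 it tends to -0.5: every line through the point has \epsilon -> 0, but only the tangent line has \epsilon/dx -> 0.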

Definition: The function f'(x) dx is linear in dx and is called the differential of f (written df).

The same idea works for functions of several variables.

Suppose f: R^2 -> R and L is a plane through the point (x, y, f(x, y)).

If L is not vertical, L(x+dx, y+dy)=f(x, y)+m dx+n dy, where m and n are the slopes in the x and y directions.

The error is \epsilon=L(x+dx, y+dy)-f(x+dx, y+dy)=f(x, y)+m dx+n dy-f(x+dx, y+dy).

To express the notion of tangency we look at the ratio of error to the magnitude of displacement dr=|(dx, dy)|.

Definition: We say that f is differentiable at (x, y), if \epsilon/dr -> 0 as dr -> 0.

We can determine the values of m and n in the equation for L which make L tangent to the graph of f by holding y or x constant in turn, i.e. by letting the displacement (dx, dy) approach (0, 0) along the horizontal or the vertical direction.

(For functions with more variables, keep all but one variable constant at a time.)

If y is constant, i.e. dy=0, then dr=|dx|=\pm dx, so \pm \epsilon/dr=\epsilon/dx=m-[f(x+dx, y)-f(x, y)]/dx.

If \epsilon/dr -> 0, then m=lim [f(x+dx, y)-f(x, y)]/dx, which is known as the partial derivative of f with respect to x (written fx).

Similarly, by keeping x constant we can deduce that n=fy(x, y).

Thus, f(x+dx, y+dy)=f(x, y)+fx(x, y) dx+fy(x, y) dy+\epsilon.
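The formula above can be checked numerically. The sketch below is hedged: the function f(x, y) = x^2 y and the point (1, 2) are illustrative, and the partial derivatives are estimated with central differences (a swapped-in numerical technique, not the one-sided limit used in the text). It then evaluates \epsilon/dr along several directions, not only the horizontal and vertical ones.

```python
import math

def f(x, y):
    # Illustrative function (an assumption made for this example).
    return x * x * y

def partial_x(f, x, y, h=1e-6):
    # Hold y constant and vary x only (central difference).
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def partial_y(f, x, y, h=1e-6):
    # Hold x constant and vary y only (central difference).
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

x0, y0 = 1.0, 2.0
m = partial_x(f, x0, y0)  # exact value is 2*x0*y0 = 4
n = partial_y(f, x0, y0)  # exact value is x0**2 = 1

def error_ratio(dx, dy):
    """epsilon/dr for the plane through (x0, y0, f(x0, y0)) with slopes m, n."""
    eps = f(x0, y0) + m * dx + n * dy - f(x0 + dx, y0 + dy)
    return eps / math.hypot(dx, dy)

ratios = {}
for deg in (0, 30, 60, 200):
    t = math.radians(deg)
    ratios[deg] = [error_ratio(dr * math.cos(t), dr * math.sin(t))
                   for dr in (1e-1, 1e-2, 1e-3)]
    print(deg, ["%.5f" % r for r in ratios[deg]])
```

In every direction tried, the ratio shrinks roughly in proportion to dr, which is what the definition of differentiability asks for.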

Again, the differential df is a linear function of dx and dy:

df=fx dx+fy dy=(fx, fy)·(dx, dy).

Definition: The vector (fx, fy) is called the gradient of f (written \nabla f).

We may write the formula for df also in matrix notation:

df = [ fx fy ] [ dx ]
               [ dy ]

Definition: The matrix [ fx fy ] is called the derivative matrix (or the Jacobian matrix), written D(f).

Matrix notation is particularly useful when the range of f is not just R but a higher dimensional space, say R^2. In this case the derivative matrix has one row for each component of f, and that row is the gradient of the component.
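As an illustration of a derivative matrix with more than one row, the sketch below approximates D(F) for a map F: R^2 -> R^2. The map (the polar-to-Cartesian change of coordinates), the helper name jacobian and the use of central differences are all assumptions made for this example.

```python
import math

def polar_to_cartesian(r, theta):
    # Illustrative map R^2 -> R^2 with components (r cos(theta), r sin(theta)).
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(F, x, y, h=1e-6):
    """Approximate derivative matrix of F at (x, y).

    Row i holds the two partial derivatives of the i-th component of F,
    i.e. the gradient of that component, estimated by central differences.
    """
    x_plus, x_minus = F(x + h, y), F(x - h, y)
    y_plus, y_minus = F(x, y + h), F(x, y - h)
    rows = []
    for i in range(len(x_plus)):
        d_dx = (x_plus[i] - x_minus[i]) / (2.0 * h)
        d_dy = (y_plus[i] - y_minus[i]) / (2.0 * h)
        rows.append([d_dx, d_dy])
    return rows

J = jacobian(polar_to_cartesian, 2.0, math.pi / 6.0)
for row in J:
    print(["%.6f" % entry for entry in row])
```

The exact matrix here is [[cos(theta), -r sin(theta)], [sin(theta), r cos(theta)]], so the printed rows should be close to [0.866025, -1.000000] and [0.500000, 1.732051].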

Theorem: If the partial derivatives of f exist and are continuous in a neighborhood of (x, y), then f is differentiable at (x, y).
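The continuity hypothesis cannot simply be dropped. A standard example, used here only as an illustration, is g(x, y) = xy/(x^2+y^2) with g(0, 0) = 0: both partial derivatives exist at the origin, but they are not continuous there, and \epsilon/dr fails to go to 0 along the diagonal. The names g and error_ratio are assumptions made for this sketch.

```python
import math

def g(x, y):
    # Standard example: partial derivatives exist at the origin,
    # yet g is not differentiable there.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

# g vanishes on both axes, so both difference quotients at the origin are 0.
gx0 = (g(1e-6, 0.0) - g(0.0, 0.0)) / 1e-6
gy0 = (g(0.0, 1e-6) - g(0.0, 0.0)) / 1e-6

def error_ratio(t):
    """epsilon/dr at the origin for the displacement (t, t).

    The only candidate tangent plane has m = n = 0 and passes through
    height g(0, 0) = 0, so L is identically 0.
    """
    eps = 0.0 - g(t, t)
    return eps / math.hypot(t, t)

diag = [error_ratio(10.0 ** -k) for k in range(1, 5)]
print("partials at the origin:", gx0, gy0)
print("epsilon/dr along the diagonal:", diag)
```

Along the diagonal g stays at 1/2, so \epsilon/dr grows without bound instead of tending to 0.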

Last updated: Jun 17 18:53