Week 2: Differentiability#
Key Terms#
Vector functions of multiple variables
Directional derivative
Differentiability
The Jacobian matrix, the gradient vector
The chain rule
The Hessian matrix
Preparation and Syllabus#
Long Day: The rest of Chapter 3
Short Day: Theme Exercise 1
Python demo for week 2
Exercises – Long Day#
1: Level Curves and Directional Derivative for Scalar Functions#
A function \(f:\mathbb{R}^2\rightarrow\mathbb{R}\) is given by the expression
\[f(x,y)=x^2+y^2.\]
Another function \(g:\mathbb{R}^2\rightarrow\mathbb{R}\) is given by the expression
\[g(x,y)=x^2-4x+y^2.\]
Question a#
Describe the level curves given by \(f(x,y)=c\) for the values \(c\in\{1,2,3,4,5\}\).
Hint
Remember the circle equation: \((x-a)^2+(y-b)^2=r^2\).
Answer
The level curves are circles that all are centred at \((0,0)\). Their radii are, respectively, \(1,\,\sqrt 2,\,\sqrt 3,\,2,\,\sqrt{5}\,\).
Question b#
Determine the gradient of \(f\) at the point \((1,1)\) and determine the directional derivative of \(f\) at the point \((1,1)\) in the direction that is given by the unit direction vector \(\pmb{e}=(1,0)\).
Answer
\(\nabla f(1,1)=(2,2)\). The directional derivative is the inner product (the dot product) of the gradient and the given direction vector:
\[\nabla f(1,1)\cdot\pmb{e}=(2,2)\cdot(1,0)=2.\]
Question c#
Describe the level curves given by \(g(x,y)=c\) for the values \(c \in\{-3,-2,-1,0,1\}\).
Hint
Again, remember the circle equation: \((x-a)^2+(y-b)^2=r^2\).
Answer
We give the result for the first one, \(c=-3\): Since
\[g(x,y)=-3 \quad\Leftrightarrow\quad (x-2)^2+y^2=1,\]
the level curve is a circle centred at \((2,0)\) with radius 1. The other level curves are also circles with the same centre but with different radii.
Question d#
Compute the gradient of \(g\) at the point \((1,2)\) and compute the directional derivative of \(g\) at the point \((1,2)\) in the direction towards the origin \((0,0)\).
Answer
We begin with the gradient:
\[\nabla g(x,y)=(2x-4,\,2y), \quad\text{so}\quad \nabla g(1,2)=(-2,4).\]
Hint
Now, we need a unit vector pointing from \((1,2)\) towards the origin.
Hint
We can use the direction vector \((-1,-2)\) but it has to be normed, meaning it must be shortened to a length/norm of 1.
Answer
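The wanted unit vector is obtained by dividing the suggested direction vector by its own norm, meaning:
\[\pmb{v}=\frac{(-1,-2)}{\Vert(-1,-2)\Vert}=\left(-\frac{1}{\sqrt 5},\,-\frac{2}{\sqrt 5}\right).\]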
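When we then calculate the inner product of \(\pmb{v}\) and the gradient \(\nabla g(1,2)\), we get the directional derivative
\[\pmb{v}\cdot\nabla g(1,2)=\left(-\frac{1}{\sqrt 5},\,-\frac{2}{\sqrt 5}\right)\cdot(-2,4)=-\frac{6}{\sqrt 5}.\]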
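The steps in Question d can be checked with a short SymPy sketch. Note that the expression used for \(g\) below is an assumption: it is chosen because it reproduces the level curve described in Question c (a circle centred at \((2,0)\) with radius 1 for \(c=-3\)). The variable names are illustrative only.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)

# ASSUMPTION: this expression for g is chosen to match the level curve
# (x - 2)^2 + y^2 = 1 for c = -3 described in Question c.
g = x**2 - 4*x + y**2

point = {x: 1, y: 2}

# Gradient of g as a column vector, evaluated at the point (1, 2)
grad_g = sp.Matrix([sp.diff(g, x), sp.diff(g, y)])
grad_at_point = grad_g.subs(point)

# Unit vector pointing from (1, 2) towards the origin
direction = sp.Matrix([-1, -2])
v = direction / direction.norm()

# Directional derivative = inner product of the gradient and the unit vector
directional_derivative = sp.simplify(grad_at_point.dot(v))
print(grad_at_point.T, v.T, directional_derivative)
```

Under this assumption the directional derivative is negative, so \(g\) decreases when moving from \((1,2)\) towards the origin.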
2: Jacobian Matrices for Different Functions#
We define functions below of the form \(\pmb{f}: \mathbb{R}^n \to \mathbb{R}^k\), where \(n\) and \(k\) can be read from the functional expression.
Question a#
Let \({f}(x_1, x_2, x_3) = x_1^2x_2 + 2x_3\). Compute the Jacobian matrix \(J_{f}(\pmb{x})\) and evaluate it at the point \(\pmb{x} = (1, -1, 3)\). Confirm that the Jacobian matrix of a scalar function of multiple variables only has one row.
Let \(\pmb{f}(x) = (3x, x^2, \sin(2x))\). Compute the Jacobian matrix \(J_{\pmb{f}}(x)\) and evaluate it at the point \(x = 2\). Confirm that the Jacobian matrix of a vector function of one variable only has one column.
Let \(\pmb{f}(x_1, x_2) = (x_1^2, -3x_2, 12x_1)\). Compute the Jacobian matrix \(J_{\pmb{f}}(\pmb{x})\) and evaluate it at the point \(\pmb{x} = (2, 0)\).
Let \(\pmb{f}(x_1, x_2, x_3) = (x_2 \sin(x_3), 3x_1x_2 \ln(x_3))\). Compute the Jacobian matrix \(J_{\pmb{f}}(\pmb{x})\) and evaluate it at the point \(\pmb{x} = (-1, 3, 2)\).
Let \(\pmb{f}(x_1, x_2, x_3) = (x_1 e^{x_2}, 3x_2 \sin(x_2), -x_1^2 \ln(x_2 + x_3))\). Compute the Jacobian matrix \(J_{\pmb{f}}(\pmb{x})\) and evaluate it at the point \(\pmb{x} = (1, 0, 1)\).
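As a sketch of how such a computation might look in SymPy, the block below treats the third function in the list, \(\pmb{f}(x_1, x_2) = (x_1^2, -3x_2, 12x_1)\). It uses SymPy's `Matrix.jacobian`; the variable names are illustrative only, and the same pattern should carry over to the other functions.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)

# The vector function f(x1, x2) = (x1^2, -3*x2, 12*x1) from the list above
f = sp.Matrix([x1**2, -3*x2, 12*x1])

# Matrix.jacobian puts one coordinate function per row
# and one variable per column, so J has shape 3 x 2 here.
J = f.jacobian([x1, x2])

# Evaluate at the point x = (2, 0)
J_at_point = J.subs({x1: 2, x2: 0})
print(J)
print(J_at_point)
```

The shape of the result, \(k \times n\) for a function \(\mathbb{R}^n \to \mathbb{R}^k\), is a quick sanity check before comparing with a hand computation.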
Question b#
All functions from the previous question are differentiable. How can one argue for this? For which of the functions can we determine the Hessian matrix? Compute the Hessian matrix of the functions for which it is defined.
Question c#
Let \(\pmb{v} = (1,1,1)\). Normalise the vector \(\pmb{v}\) and denote the result by \(\pmb{e}\). Check that \(||\pmb{e}||=1\). Compute the directional derivative of the scalar function \({f}(x_1, x_2, x_3) = x_1^2x_2 + 2x_3\) at the point \(\pmb{x} = (1, -1, 3)\) in the direction along \(\pmb{v}\). Then compute \(J_f(\pmb{x}) \pmb{e}\). Compare with the directional derivative. Are they equal? If so, is that a coincidence?
3: Description of Sets in the Plane#
In each of the four cases below, draw a sketch of the given set \(\,A\,\), its interior \(\,A^{\circ}\,\), its boundary \(\,\partial A\,\) and its closure \(\,\bar{A}\,\). Then investigate whether \(\,A\,\) is open, closed or neither. Finally, state whether \(\,A\,\) is bounded or not.
\(\{(x,y)\,\vert\, xy\neq 0\}\)
\(\{(x,y)\,\vert\, 0<x<1\,\,\,\mathrm{and}\,\,\,1\leq y\leq 3\}\)
\(\{(x,y)\,\vert\, y\geq x^2 \,\,\,\mathrm{and}\,\,\,y<2 \}\)
\(\{(x,y)\,\vert\, x^2+y^2-2x+6y\leq 15 \}\)
Hint
Begin by focusing on the axes: How is the set of points limited along the directions of the \(x\) and \(y\) axes, respectively?
The variables may be mutually dependent, but only well-known curve shapes are involved.
Answer
\(\{(x,y)\,\vert\, xy\neq 0\}\) constitutes the real plane (\(\mathbb{R}^2\)), but excluding the coordinate axes. This region also constitutes the interior of the set, while its boundary is given by the coordinate axes. Its closure is the entire real plane. The set is open and not bounded.
\(\{(x,y)\,\vert\, 0<x<1\wedge 1\leq y\leq 3\}\) is a rectangle delimited by the lines \(x=0\), \(x=1\), \(y=1\), and \(y=3\), where \(x=0\) and \(x=1\) do not belong to the set whereas \(y=1\) and \(y=3\) do belong to the set. The interior of the set is the rectangle, excluding the line segments, the boundary is all four line segments, and the closure is the rectangle including all four line segments. The set is neither open nor closed, but it is bounded.
\(\{(x,y)\,\vert\, y\geq x^2 \,\,\,\mathrm{and}\,\,\,y<2 \}\) is the set intersection of the region located above the parabola with the equation \(\,y=x^2\,\) and the region located below the line \(\,y=2\,\). Note that the parabola segment stretching from the point \(\,(- \sqrt{2},2)\,\) to the point \(\,( \sqrt{2},2)\,\) is included in the set, though excluding its end points, while the line segment from the point \(\,(- \sqrt{2},2)\,\) to the point \(\,( \sqrt{2},2)\,\) is not included. The interior of the set is this set intersection excluding the parabola segment from the point \(\,(- \sqrt{2},2)\,\) to the point \(\,( \sqrt{2},2)\,.\) The boundary consists of this parabola segment and the line segment from the point \(\,(- \sqrt{2},2)\,\) to the point \(\,( \sqrt{2},2)\,.\) Finally, the closure is the region including the line segment and the parabola segment. The given set is neither open nor closed, but it is bounded.
\(\{(x,y)\,\vert\, x^2+y^2-2x+6y\leq 15 \}\) constitutes the region within a circle centred at \((1,-3)\) with a radius of 5. Its interior is this region excluding the circle periphery, its boundary is the circle periphery, and its closure is the region including the circle periphery. That is, the closure is the set itself. The set is closed and bounded.
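The centre and radius in the last case come from completing the square. A small SymPy sketch (illustrative variable names) confirms that the two forms of the inequality agree:

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)

# The set is described by: original <= 0
original = x**2 + y**2 - 2*x + 6*y - 15

# Circle form with centre (1, -3) and radius 5: completed <= 0
completed = (x - 1)**2 + (y + 3)**2 - 5**2

# If the two expressions are identical, their difference expands to 0
difference = sp.expand(original - completed)
print(difference)
```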
4: All Linear Maps from \(\mathbb{R}^n\) to \(\mathbb{R}\)#
Let \(L: \mathbb{R}^n \to \mathbb{R}\) be an (arbitrary) linear map. Let \(e = \pmb{e}_1, \pmb{e}_2, \dots, \pmb{e}_n\) be the standard basis of \(\mathbb{R}^n\), and let \(\beta\) be the standard basis of \(\mathbb{R}\). Remember the standard basis from Mathematics 1a. Note that since the dimension of \(\mathbb{R}\) (over \(\mathbb{R}\)) is 1, the standard basis of \(\mathbb{R}\) is just the number \(1\).
Show that a column vector \(\pmb{c} \in \mathbb{R}^n\) exists such that
\[L(\pmb{x}) = \pmb{c}^T \pmb{x} = \langle \pmb{x}, \pmb{c} \rangle \quad \text{for all } \pmb{x} \in \mathbb{R}^n,\]
where \(\langle \cdot, \cdot \rangle\) denotes the usual inner product on \(\mathbb{R}^n\). (The column vector is uniquely determined, but it is not part of this exercise to argue for that.)
Hint
What is the mapping matrix \({}_\beta[L]_e\) of \(L\) with respect to the two bases?
Answer
\(\pmb{c}^T = {}_\beta[L]_e = [L(\pmb{e}_1), L(\pmb{e}_2), \dots, L(\pmb{e}_n)]\)
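A small numerical illustration of the answer, in plain Python. The map `L` below is a hypothetical example chosen only for this sketch; the point is that the entries of \(\pmb{c}\) are the images of the standard basis vectors, and that the inner product with \(\pmb{c}\) then reproduces \(L\).

```python
def L(x):
    """A hypothetical linear map R^3 -> R, chosen only for illustration."""
    return 2 * x[0] - x[1] + 5 * x[2]

n = 3

# Standard basis vectors e_1, ..., e_n of R^n
basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]

# The entries of c are the images L(e_1), ..., L(e_n)
c = [L(e) for e in basis]

def inner(u, v):
    """Usual inner product on R^n."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Compare L(x) with <x, c> for a sample vector
x = [4, -7, 3]
print(c, L(x), inner(x, c))
```

One sample vector does not prove the statement, but it shows what has to hold for every \(\pmb{x}\).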
5: Linear(?) Vector Functions#
We consider the following two functions:
\(f: \mathbb{R}^{2 \times 2} \to \mathbb{R}^{2 \times 2}, f(X) = C X B\), where \(C = \mathrm{diag}(2,1) \in \mathbb{R}^{2 \times 2}\) and \(B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\).
\(g: \mathbb{R}^n \to \mathbb{R}, g(\pmb{x}) = \pmb{x}^T A \pmb{x}\), where \(A\) is an \(n \times n\) matrix (and not the zero matrix).
Determine for each function whether it is a linear map. If the map is linear, then find the mapping matrix with respect to, respectively:
the standard basis \(E=\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\) in \(\mathbb{R}^{2 \times 2}\). Remember this example from Math1a.
the standard basis \(e\) in \(\mathbb{R}^n\). Remember this result from Math1a.
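For the first function, matrix multiplication distributes over sums and commutes with scalars, so \(f(X) = CXB\) is linear in \(X\). A mapping matrix can then be assembled column by column from the images of the basis matrices. The sketch below does this in SymPy; the helper names `coords`, `E` and `F` are illustrative, and coordinates are read row by row to match the order of the basis \(E\) given above.

```python
import sympy as sp

C = sp.diag(2, 1)
B = sp.Matrix([[1, 1], [0, 1]])

def f(X):
    """The map f(X) = C X B."""
    return C * X * B

# Standard basis of R^{2x2}, in the order listed above
E = [sp.Matrix(2, 2, [1 if k == j else 0 for k in range(4)]) for j in range(4)]

def coords(X):
    """Coordinate vector of a 2x2 matrix w.r.t. E (entries read row by row)."""
    return sp.Matrix(list(X))

# Column j of the mapping matrix is the coordinate vector of f(E_j)
F = sp.Matrix.hstack(*[coords(f(Ej)) for Ej in E])

# Sanity check on a sample matrix: F * coords(X) should equal coords(f(X))
X = sp.Matrix([[1, 2], [3, 4]])
print(F)
print(F * coords(X) == coords(f(X)))
```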
6: The Chain Rule for an Altitude Function#
We consider a real function of two real variables given by the expression
\[f(x,y)=\ln(9-x^2-y^2).\]
Question a#
Determine the domain of \(f\), and characterise it using terms such as open, closed, bounded, unbounded.
Answer
The logarithm is only defined for positive input values, and thus we must require that
\[9-x^2-y^2>0.\]
But this can be rewritten as \(x^2+y^2<3^2\), which means that \(\mathrm{dom}(f)=\{(x,y)\,|\,x^2+y^2<3^2\}\). This is a circular disc centred at the origin with a radius of 3, excluding the periphery. The set is open and bounded.
We now consider a parametrized curve \(\pmb{r}\) in the \(\,(x,y)\) plane given by
\[\pmb{r}(u)=(u,u^3)\,,\,\,u\in\left[-1.2\,,\,1.2\right].\]
Question b#
Which curve are we dealing with here (you are familiar with its equation!)?
Answer
This is the graph of the third-degree polynomial \(p(x)=x^3\,,\,\,x\in \left[-1.2\,,\,1.2\right]\).
We now consider the composite function
\[h(u)=f(\pmb{r}(u)).\]
Question c#
Why is it reasonable to call \(h\) an altitude function?
Question d#
Determine \(h'(1)\,\) using two different methods:
Determine a functional expression for \(h(u)\), and differentiate it as usual.
Use the chain rule in Section 3.7.
Answer
We get \(h(u)=\ln(-u^6-u^2+9)\,\) and \(h'(1)=-\frac{8}{7}\).
The tangent vector must be determined. Since \(\pmb{r}'(u)=(1,3\,u^2)\), we get \(\pmb{r}'(1)=(1,3)\). The gradient \(\nabla f(x,y)\) must be computed. Then \(\nabla f(\pmb{r}(1))=\nabla f(1,1)=(-\frac{2}{7},-\frac{2}{7})\) can be computed. The inner product (dot product) of the two vectors we have found is \(-\frac{8}{7}\).
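Both methods can be compared in SymPy. The sketch below assumes \(f(x,y)=\ln(9-x^2-y^2)\) and \(\pmb{r}(u)=(u,u^3)\), which are the expressions consistent with \(h(u)=\ln(-u^6-u^2+9)\) and \(\pmb{r}'(u)=(1,3u^2)\) in the answer above; the variable names are illustrative.

```python
import sympy as sp

u, x, y = sp.symbols("u x y", real=True)

# ASSUMPTION: f and r as inferred from the answers above
f = sp.log(9 - x**2 - y**2)
r = sp.Matrix([u, u**3])

# Method 1: form h(u) = f(r(u)) and differentiate as usual
h = f.subs({x: r[0], y: r[1]})
h_prime_direct = sp.diff(h, u).subs(u, 1)

# Method 2: chain rule, h'(u) = <grad f(r(u)), r'(u)>
grad_f = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])
grad_at_r1 = grad_f.subs({x: r[0], y: r[1]}).subs(u, 1)
r_prime_at_1 = r.diff(u).subs(u, 1)
h_prime_chain = grad_at_r1.dot(r_prime_at_1)

print(h_prime_direct, h_prime_chain)
```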
7: Partial Derivatives but not Differentiable#
The function \(f:\mathbb{R}^2 \to \mathbb{R}\) where
\[f(x_1,x_2)=x_1^2-4x_1+x_2^2\]
is given.
Question a#
Let \(\pmb{x}_0 = (x_1,x_2) \in \mathbb{R}^2\) be an arbitrary point. Justify that \(f\) is differentiable at \(\pmb{x}_0\), and compute the gradient of \(f\) at \(\pmb{x}_0\).
Hard version: Solve the task directly from the definition of differentiability in Section 3.6.
Soft version: Use the result in this theorem.
Hint
As was done in high school, we must consider the connection between \(\Delta f = f(\pmb{x}_0+\pmb{h}) - f(\pmb{x}_0)\) and \(\pmb{h}\) with regard to the limit \(\pmb{h}\longrightarrow\pmb{0}\), but note that \(\pmb{h}\) is now a vector.
Hint
Let \(\pmb{h}=(h_1,h_2)\). Compute \(\Delta f\).
Hint
\(\Delta f=f(x_1+h_1,x_2+h_2)-f(x_1,x_2)\).
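Hint
\(\Delta f=(2x_1-4)\,h_1+2x_2\,h_2+h_1^2+h_2^2\).
Answer
Since \(\varepsilon (\pmb{h}) = ||\pmb{h}||\) is an epsilon function, we can in one go write this as:
\[\Delta f=\langle \pmb{h},\pmb{c}\rangle+||\pmb{h}||\,\varepsilon(\pmb{h}),\]
where \(\pmb{c} =\begin{bmatrix} 2x_1-4 \\ 2x_2 \end{bmatrix}\).
We conclude that \(f\) is differentiable according to the definition, and that
\[\nabla f(\pmb{x}_0)=\pmb{c}=(2x_1-4,\,2x_2).\]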
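The split of \(\Delta f\) into a linear part and a remainder can be checked symbolically. The expression for \(f\) below is an assumption, chosen to be consistent with \(\pmb{c}=(2x_1-4,\,2x_2)\) (any added constant would give the same \(\Delta f\)); the names are illustrative.

```python
import sympy as sp

x1, x2, h1, h2 = sp.symbols("x1 x2 h1 h2", real=True)

# ASSUMPTION: an expression for f consistent with c = (2*x1 - 4, 2*x2)
def f(a, b):
    return a**2 - 4*a + b**2

# Increment of f when moving from x0 = (x1, x2) to x0 + h
delta_f = f(x1 + h1, x2 + h2) - f(x1, x2)

c = sp.Matrix([2*x1 - 4, 2*x2])
h = sp.Matrix([h1, h2])

# What is left after removing the linear part <c, h>
remainder = sp.expand(delta_f - c.dot(h))
print(remainder)
```

The remainder equals \(||\pmb{h}||^2 = ||\pmb{h}||\cdot||\pmb{h}||\), which is the form \(||\pmb{h}||\,\varepsilon(\pmb{h})\) with \(\varepsilon(\pmb{h})=||\pmb{h}||\).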
Question b#
To conclude differentiability based on the partial derivatives, according to this theorem, it is required that the partial derivatives are continuous. Why is it not enough that the partial derivatives exist? We will investigate this question via a concrete example. But first we generalize a statement that is well known from high school for functions of one variable: If a function is differentiable at a point, then it is also continuous at that point.
Show that if a function of two variables is differentiable at a point \(\pmb{x}_0\), then it is also continuous at that point.
Hint
The two definitions can be used directly (the proof can also be found in the Notes).
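And now for the example. We consider the function
\[f(x_1,x_2)=\begin{cases} \dfrac{x_1^2\,x_2}{x_1^4+x_2^2} & \text{if } (x_1,x_2)\neq(0,0),\\[2mm] 0 & \text{if } (x_1,x_2)=(0,0).\end{cases}\]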
Question c#
Show that the partial derivatives of \(f\) exist at \((0,0)\), but that \(f\) is not differentiable at this point.
Hint
The first part of the question should not be too hard, because the two auxiliary functions \(f_1(x_1)\) and \(f_2(x_2)\) are constant on the entire \(\,x_1\) axis, respectively the entire \(x_2\) axis. OK?
Hint
The second part of the question: We saw that if the function is differentiable at a point, then it is also continuous at that point. By contraposition, if the function is not continuous at the point, then it cannot be differentiable at the point. So now we just have to show that \(f\) is not continuous at \((0,0)\)!
Hint
From the expression, we see that \(f(0,0)=0\). But what does the restriction of \(f\) to the parabola arc \(x_2=x_1^2\) go towards when \(x_1\) goes towards \(0\)?
Answer
It goes towards \(\frac{1}{2}\). And then \(f\) is not continuous at \((0,0)\). It is a good idea to think through this entire example one more time.
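The two observations in the hints can be reproduced in SymPy. The expression below is an assumption: it is a standard example that matches the hints (zero on both axes away from the origin, limit \(\frac{1}{2}\) along \(x_2=x_1^2\)), with \(f(0,0)=0\) defined separately.

```python
import sympy as sp

x1, x2, t = sp.symbols("x1 x2 t", real=True)

# ASSUMPTION: expression for f away from the origin; f(0, 0) = 0 by definition
f_away = x1**2 * x2 / (x1**4 + x2**2)

# Restrictions to the axes (away from the origin): both are the zero function,
# so together with f(0, 0) = 0 both partial derivatives at (0, 0) are 0.
on_x1_axis = f_away.subs(x2, 0)
on_x2_axis = f_away.subs(x1, 0)

# Restriction to the parabola x2 = x1^2, parametrized by x1 = t
along_parabola = sp.simplify(f_away.subs({x1: t, x2: t**2}))
limit_parabola = sp.limit(along_parabola, t, 0)

print(on_x1_axis, on_x2_axis, along_parabola, limit_parabola)
```

Since the limit along the parabola differs from \(f(0,0)=0\), the function cannot be continuous at the origin under this assumption.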
8: The Generalized Chain Rule#
In this exercise we will be using the theorem: Generalized chain rule.
Given functions:
\(\pmb{g} : \mathbb{R}^3 \to \mathbb{R}^2\) defined by \(\pmb{g}(x_1, x_2, x_3) = (g_1(x_1, x_2, x_3), g_2(x_1, x_2, x_3))\), where:
\[\begin{align*} g_1(x_1, x_2, x_3) &= x_1^2 + x_2^2 + x_3^2, \\ g_2(x_1, x_2, x_3) &= e^{x_1 + x_2} \, \cos(x_3). \end{align*}\]
\(f : \mathbb{R}^2 \to \mathbb{R}\) defined by \(f(y_1, y_2) = y_1 \, \sin(y_2)\).
The composition of these functions: \(h = f \circ \pmb{g}\).
We will in this exercise compute the Jacobian matrix of \(h\) (with respect to the variables \(x_1, x_2,\) and \(x_3\)) using the generalized chain rule. You may carry out the computations in SymPy.
Question a#
Find the functional expression of \(h\) as well as the domain and co-domain. Compute the gradient of \(h\).
Question b#
Compute the Jacobian matrix of \(\pmb{g}\). Compute the Jacobian matrix of \(f\). What is the connection between the gradient and the Jacobian matrix of \(f\)?
Hint
For scalar functions, the Jacobian matrix is a row vector, that being the transposed gradient (column) vector.
Question c#
Now use the chain rule and the Jacobian matrices from the previous question to find the Jacobian matrix of \(h\). Compare with the answer to Question a.
Hint
Your use of the generalized chain rule must involve a matrix-matrix product of \(1 \times 2\) and \(2 \times 3\) matrices. (The \(1 \times 2\) matrix can of course be considered a row vector).
Hint
Remember that you must evaluate the Jacobian matrix, respectively the gradient, of \(f\) at the right point, that being at \((y_1,y_2) = \pmb{g}(x_1, x_2, x_3)\). This is just as in the usual chain rule known from high school, \((f(g(x)))' = f'(g(x))\, g'(x)\), where \(f'\) on the right-hand side is evaluated at \(g(x)\).
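One possible way to carry out this exercise in SymPy is sketched below: the chain-rule product \(J_f(\pmb{g}(\pmb{x}))\,J_{\pmb{g}}(\pmb{x})\) is compared with the Jacobian matrix of \(h\) computed directly. The functions are those given in the exercise; the variable names are illustrative.

```python
import sympy as sp

x1, x2, x3, y1, y2 = sp.symbols("x1 x2 x3 y1 y2", real=True)
xs = [x1, x2, x3]

# g : R^3 -> R^2 and f : R^2 -> R as given in the exercise
g = sp.Matrix([x1**2 + x2**2 + x3**2, sp.exp(x1 + x2) * sp.cos(x3)])
f = sp.Matrix([y1 * sp.sin(y2)])

J_g = g.jacobian(xs)         # 2 x 3 matrix
J_f = f.jacobian([y1, y2])   # 1 x 2 matrix (a row vector)

# Evaluate J_f at the right point (y1, y2) = g(x) before multiplying
at_g = {y1: g[0], y2: g[1]}
J_h_chain = J_f.subs(at_g) * J_g   # 1 x 3 matrix

# Direct computation: substitute first, then differentiate
h = f.subs(at_g)
J_h_direct = h.jacobian(xs)

# Entrywise difference; all zeros if the two computations agree
difference = (J_h_chain - J_h_direct).applyfunc(sp.simplify)
print(difference)
```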
9: Gradient Vector Field and Hessian Matrix#
Question a#
The gradient vector of \(f(x_1, x_2) = x_1^2 \sin(x_2)\) is \(\nabla f(\pmb{x})=(2x_1 \sin(x_2),x_1^2 \cos(x_2))\). The gradient vector can thus be considered as a map \(\nabla f : \mathrm{dom}(f) \to \mathbb{R}^2\). Write down the map as a function (where you state \(\mathrm{dom}(f)\)) and plot it as a vector field.
Question b#
Now compute the Jacobian matrix of \(\nabla f : \mathbb{R}^2 \to \mathbb{R}^2\) at the point \((x_1,x_2)\).
Question c#
Compute the Hessian matrix of \(f : \mathbb{R}^2 \to \mathbb{R}\) at the point \((x_1,x_2)\), and compare with the answer to the previous question.
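The comparison in Question c can be sketched in SymPy using `Matrix.jacobian` and `sympy.hessian`; the variable names are illustrative.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)

# The scalar function from Question a
f = x1**2 * sp.sin(x2)

# Gradient as a vector function R^2 -> R^2
grad_f = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])

# Jacobian matrix of the gradient (Question b)
J_grad = grad_f.jacobian([x1, x2])

# Hessian matrix of f (Question c)
H = sp.hessian(f, [x1, x2])

print(J_grad)
print(H)
print(J_grad == H)
```

For this \(f\) the two matrices coincide, and the matrix is symmetric because the mixed second-order partial derivatives agree.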
Theme Exercise – Short Day#
Today we will work through Theme Exercise 1.