Locally Lipschitz implies Lipschitz on Compact Set Proof

Assume $\phi$ is locally Lipschitz on $\mathbb{R}^n$, that is, for any $x\in \mathbb{R}^n$, there exist $\delta, L>0$ (depending on $x$) such that $|\phi(z)-\phi(y)|\leq L|z-y|$ for all $z,y\in B_\delta(x)=\{t\in\mathbb{R}^n: |x-t|<\delta\}$.

Then, for any compact set $K\subset\mathbb{R}^n$ , there exists a constant $M>0$ (depending on $K$ ) such that $|\phi(x)-\phi(y)|\leq M|x-y|$ for all $x,y\in K$ . That is, $\phi$ is Lipschitz on $K$ .

Proof

Suppose to the contrary that $\phi$ is not Lipschitz on $K$, so that for every $M>0$ there exist $x,y\in K$ with $x\neq y$ such that $\displaystyle \frac{|\phi(x)-\phi(y)|}{|x-y|}>M.$

Taking $M=n$ for $n\in\mathbb{N}$, we obtain two sequences $x_n, y_n\in K$ with $x_n\neq y_n$ such that $\displaystyle \frac{|\phi(x_n)-\phi(y_n)|}{|x_n-y_n|}\to\infty.$

Since a locally Lipschitz function is continuous, $\phi$ is bounded on $K$ by the Extreme Value Theorem, so the numerators $|\phi(x_n)-\phi(y_n)|$ are bounded. Hence $|x_n-y_n|\to 0$.

By sequential compactness of $K$ , there exists a convergent subsequence $x_{n_k}\to x$ , and thus $y_{n_k}\to x$ .

Let $\delta, L>0$ be the constants from the local Lipschitz condition at $x$. For all sufficiently large $k$ we have $x_{n_k},y_{n_k}\in B_\delta(x)$, and for some such $k$ also $\displaystyle \frac{|\phi(x_{n_k})-\phi(y_{n_k})|}{|x_{n_k}-y_{n_k}|}>L$, which contradicts the Lipschitz bound on $B_\delta(x)$.

Lp Interpolation

It turns out that there are two types of Lp interpolation: One is called “Lyapunov’s inequality” which is addressed in this previous blog post.

The other one is called Littlewood’s inequality: If $f\in L^p\cap L^q$ , then $f\in L^r$ for any intermediate $p<r<q$ .

The proof is here:

Case 1) $q<\infty$ . We have $\frac{1}{q}<\frac{1}{r}<\frac{1}{p}$ . There exists $0<\lambda<1$ such that $\frac{1}{r}=\frac{\lambda}{p}+\frac{1-\lambda}{q}$ .

$\begin{aligned} \|f\|_r^r&=\int |f|^r\\ &=\int |f|^{\lambda r}|f|^{(1-\lambda)r}\\ &\leq\left(\int |f|^{\lambda r(\frac{p}{\lambda r})}\right)^{\lambda r/p}\left(\int |f|^{(1-\lambda)r(\frac{q}{(1-\lambda)r})}\right)^{r(1-\lambda)/q}\\ &=\left(\int |f|^p\right)^{\lambda r/p}\left(\int |f|^q\right)^{r(1-\lambda)/q}\\ &=\|f\|_p^{\lambda r}\|f\|_q^{r(1-\lambda)}. \end{aligned}$

Thus $\|f\|_r\leq\|f\|_p^\lambda\|f\|_q^{1-\lambda}$ .

Case 2) $q=\infty$ . We have $0<\frac{1}{r}<\frac{1}{p}$ . There exists $0<\lambda<1$ with $\frac{1}{r}=\frac{\lambda}{p}$ . Then
$\begin{aligned} \|f\|_r^r&=\||f|^{\lambda r}|f|^{(1-\lambda)r}\|_1\\ &\leq\||f|^{\lambda r}\|_{\frac{p}{\lambda r}}\||f|^{(1-\lambda)r}\|_\infty\\ &=\left(\int |f|^p\right)^{\lambda r/p}\|f\|_\infty^{(1-\lambda)r}\\ &=\|f\|_p^{\lambda r}\|f\|_\infty^{(1-\lambda)r}. \end{aligned}$

Hence $\|f\|_r\leq\|f\|_p^\lambda\|f\|_\infty^{1-\lambda}$ .
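As a quick sanity check of both cases, here is a minimal Python sketch. It assumes the counting measure on a finite set (so integrals become finite sums), which is a simplification not made in the post; the helper names `lp_norm` and `interpolation_sides` are made up for illustration.

```python
def lp_norm(values, p):
    """l^p norm of a finite sequence (counting measure); p may be float('inf')."""
    if p == float("inf"):
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def interpolation_sides(values, p, r, q):
    """Return (lhs, rhs) of ||f||_r <= ||f||_p^lam * ||f||_q^(1 - lam)."""
    inv_q = 0.0 if q == float("inf") else 1.0 / q
    # Solve 1/r = lam/p + (1 - lam)/q for lam.
    lam = (1.0 / r - inv_q) / (1.0 / p - inv_q)
    lhs = lp_norm(values, r)
    rhs = lp_norm(values, p) ** lam * lp_norm(values, q) ** (1.0 - lam)
    return lhs, rhs


f = [3.0, -1.0, 0.5, 2.0, -4.0]          # arbitrary sample data
lhs, rhs = interpolation_sides(f, p=1, r=2, q=4)                    # Case 1
lhs_inf, rhs_inf = interpolation_sides(f, p=1, r=3, q=float("inf"))  # Case 2
print(lhs <= rhs, lhs_inf <= rhs_inf)
```

The same check can be repeated with any other data and any exponents satisfying $p<r<q$.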

Generalized Riemann-Stieltjes Integral

The generalized Riemann-Stieltjes integral $\int_a^b f\,d\phi$ is a number $\gamma$ such that: for every $\epsilon>0$ there exists a partition $P_\epsilon$ of $[a,b]$ such that if $\dot{P}=(P,\xi)$ , $P=\{a=x_0<x_1<\dots<x_n=b\}$ , $\xi=\{\xi_i: i=1,\dots, n\}$ with $\xi_i\in[x_{i-1},x_i]$ is a tagged partition of $[a,b]$ such that $P$ is a refinement of $P_\epsilon$ , then $\displaystyle |S(\dot{P},f,\phi)-\gamma|=|\sum_{i=1}^n f(\xi_i)(\phi(x_i)-\phi(x_{i-1}))-\gamma|<\epsilon.$

We will write $\displaystyle \lim_{P\to 0}S(\dot{P},f,\phi)=\lim_{P\to 0}\sum_{i=1}^nf(\xi_i)(\phi(x_i)-\phi(x_{i-1}))=\gamma$ and $f\in \mathcal{R}_\phi[a,b]$ .
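To make the sum $S(\dot{P},f,\phi)$ concrete, here is a small Python sketch that evaluates it for a tagged partition. The example $f(x)=x$, $\phi(x)=x^2$ on $[0,1]$ is an assumption chosen for illustration; since $\phi$ is continuously differentiable there, the integral equals $\int_0^1 x\cdot 2x\,dx=2/3$.

```python
def riemann_stieltjes_sum(f, phi, points, tags):
    """S(P, f, phi) = sum_i f(xi_i) * (phi(x_i) - phi(x_{i-1}))."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one tag per subinterval")
    return sum(
        f(tag) * (phi(right) - phi(left))
        for tag, left, right in zip(tags, points[:-1], points[1:])
    )


n = 1000
points = [i / n for i in range(n + 1)]   # uniform partition of [0, 1]
tags = points[:-1]                       # left endpoints as tags
approx = riemann_stieltjes_sum(lambda x: x, lambda x: x * x, points, tags)
print(approx)
```

Refining the partition (larger `n`) should bring the sum closer to $2/3$, in line with the definition above.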

Sufficient condition for “Weak Convergence”

This is a sufficient condition for something that resembles “Weak convergence”: $\int f_kg\to \int fg$ for all $g\in L^{p'}$.
Suppose that $f_k\to f$ a.e. and that $f_k, f\in L^p$, $1<p\leq\infty$. If $\|f_k\|_p\leq M<\infty$, we have $\int f_kg\to\int fg$ for all $g\in L^{p'}$, $1/p+1/p'=1$. Note that the result is false if $p=1$.

Proof:
(Case: $|E|<\infty$ , where $E$ is the domain of integration).

We may assume $|E|>0$ , $M>0$ , $\|g\|_{p'}>0$ otherwise the result is trivially true. Also, by Fatou’s Lemma, $\displaystyle \|f\|_p\leq\liminf_{k\to\infty}\|f_k\|_p\leq M.$

Let $\epsilon>0$. Since $g\in L^{p'}$, we have $|g|^{p'}\in L^1$, so by absolute continuity of the integral there exists $\delta>0$ such that for any measurable subset $A\subseteq E$ with $|A|<\delta$, $\int_A |g|^{p'}<\epsilon^{p'}$.

Since $f_k\to f$ a.e. ($f$ is finite a.e. since $f\in L^p$), by Egorov’s Theorem there exists a closed set $F\subseteq E$ such that $|E\setminus F|<\delta$ and $\{f_k\}$ converges uniformly to $f$ on $F$. That is, there exists $N(\epsilon)$ such that for $k\geq N$, $|f_k(x)-f(x)|<\epsilon$ for all $x\in F$.

Then for $k\geq N$ ,
$\begin{aligned} \left|\int_E f_kg-fg\right|&\leq\int_E|f_k-f||g|\\ &=\int_{E\setminus F}|f_k-f||g|+\int_F|f_k-f||g|\\ &\leq\left(\int_{E\setminus F}|f_k-f|^p\right)^\frac{1}{p}\left(\int_{E\setminus F}|g|^{p'}\right)^\frac{1}{p'}+\epsilon\int_F |g|\\ &<\|f_k-f\|_p(\epsilon)+\epsilon\left(\int_F|g|^{p'}\right)^\frac{1}{p'}\left(\int_F |1|^p\right)^\frac{1}{p}\\ &\leq 2M\epsilon+\epsilon\|g\|_{p'}|E|^\frac{1}{p}\\ &=\epsilon(2M+\|g\|_{p'}|E|^\frac{1}{p}). \end{aligned}$

Since $\epsilon>0$ is arbitrary, this means $\int_E f_kg\to \int_E fg$.

(Case: $|E|=\infty$ ). Error: See correction below.

Define $E_N=E\cap B_N(0)$ , where $B_N(0)$ is the ball with radius $N$ centered at the origin. Then $|E_N|<\infty$ , so there exists $N_1>0$ such that for $N\geq N_1$ , $\int_{E_N}|f_k-f||g|<\epsilon$ .

Since $|g|^{p'}\chi_{E_N}\nearrow|g|^{p'}$ on $E$ , by Monotone Convergence Theorem, $\displaystyle \lim_{N\to\infty}\int_{E_N}|g|^{p'}=\int_E |g|^{p'}<\infty.$
Thus there exists $N_2>0$ such that for $N\geq N_2$ , $\int_{E\setminus E_N} |g|^{p'}<\epsilon^{p'}$ .

Then for $N\geq\max\{N_1, N_2\}$ ,
$\begin{aligned} \int_E |f_kg-fg|&=\int_{E_N}|f_k-f||g|+\int_{E\setminus E_N}|f_k-f||g|\\ &<\epsilon+\left(\int_{E\setminus E_N}|f_k-f|^p\right)^\frac{1}{p}\left(\int_{E\setminus E_N}|g|^{p'}\right)^\frac{1}{p'}\\ &<\epsilon+\|f_k-f\|_p(\epsilon)\\ &\leq\epsilon+2M\epsilon\\ &=\epsilon(1+2M). \end{aligned}$
so that $\int_E f_kg\to\int_E fg$ .

(Show that the result is false if $p=1$ ).

Let $f_k:=k\chi_{[0,\frac 1k]}$ . Then $f_k\to f$ a.e., where $f\equiv 0$ . Note that $\int_\mathbb{R} |f_k|=1$ , $\int_\mathbb{R} |f|=0$ so that $f_k, f\in L^1(\mathbb{R})$ . Similarly, $\|f_k\|_1\leq M=1$ .

However if $g\equiv 1\in L^\infty$ , $\int_\mathbb{R} f_kg=1$ for all $k$ but $\int_\mathbb{R} fg=0$ .
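The counterexample can be checked with exact arithmetic. The sketch below is only an illustration: the function names are made up, and the integral is computed by the "height times width" formula for an indicator function rather than by numerical integration.

```python
from fractions import Fraction


def f_k(k, x):
    """f_k = k * chi_[0, 1/k] evaluated at the point x."""
    return k if 0 <= x <= Fraction(1, k) else 0


def integral_f_k_times_one(k):
    """Exact integral of f_k * g over R with g = 1: height k times width 1/k."""
    return k * Fraction(1, k)


x = Fraction(1, 10)                       # any fixed point x > 0
pointwise = [f_k(k, x) for k in (5, 10, 11, 100, 1000)]
integrals = [integral_f_k_times_one(k) for k in (5, 10, 11, 100, 1000)]
print(pointwise, integrals)
```

For a fixed $x>0$ the values $f_k(x)$ are eventually $0$, while every integral stays equal to $1$, so $\int f_kg\not\to\int fg$.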

Correction for the case $|E|=\infty$ :

Define $E_N=E\cap B_N(0)$ , where $B_N(0)$ is the ball with radius $N$ centered at the origin.

Since $|g|^{p'}\chi_{E_N}\nearrow |g|^{p'}$ on $E$ , by Monotone Convergence Theorem, $\displaystyle \lim_{N\to\infty}\int_{E_N}|g|^{p'}=\int_E|g|^{p'}<\infty.$

Thus there exists $N_1>0$ such that $\int_{E\setminus E_{N_1}}|g|^{p'}<\epsilon^{p'}$ .

Since $|E_{N_1}|<\infty$ , by the finite measure case there exists $N_2$ such that for $k\geq N_2$ , $\displaystyle \int_{E_{N_1}}|f_k-f||g|<\epsilon.$

So for $k\geq N_2$ ,
$\begin{aligned} \int_E|f_kg-fg|&=\int_{E_{N_1}}|f_k-f||g|+\int_{E\setminus E_{N_1}}|f_k-f||g|\\ &<\epsilon+\left(\int_{E\setminus E_{N_1}}|f_k-f|^p\right)^{1/p}\left(\int_{E\setminus E_{N_1}}|g|^{p'}\right)^{1/p'}\\ &<\epsilon+\|f_k-f\|_p(\epsilon)\\ &\leq\epsilon+2M\epsilon\\ &=\epsilon(1+2M). \end{aligned}$

so that $\int_Ef_kg\to\int_E fg$ .

Square root x is not Lipschitz on [0,1]

$f(x)=\sqrt x$ is not Lipschitz on $[0,1]$ :

Suppose there exists $C\geq 0$ such that for all $x,y\in [0,1]$ , $x\neq y$ , $\displaystyle \frac{|f(x)-f(y)|}{|x-y|}\leq C.$

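By the Mean Value Theorem, for $0<x<y\leq 1$ the quotient equals $f'(\xi)$ for some $\xi\in(x,y)$, so $|f'(\xi)|\leq C$.

However, $f'(x)=\frac{1}{2}x^{-1/2}$ is decreasing and unbounded on $(0,1]$: since $\xi<y$, we get $C\geq f'(\xi)\geq\frac{1}{2}y^{-1/2}$, which exceeds $C$ once $y$ is small enough, a contradiction.

Note however, that $\sqrt x$ is absolutely continuous on $[0,1]$. So Lipschitz is a strictly stronger condition than absolutely continuous.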
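The failure of the Lipschitz condition is easy to see numerically. This short Python sketch (function name hypothetical) evaluates the difference quotient of $\sqrt x$ between $0$ and points $x=10^{-2j}$, where it equals $1/\sqrt{x}=10^j$ up to rounding.

```python
import math


def difference_quotient(x, y):
    """|sqrt(x) - sqrt(y)| / |x - y| for distinct x, y in [0, 1]."""
    return abs(math.sqrt(x) - math.sqrt(y)) / abs(x - y)


# Quotients between 0 and x = 10^(-2j); no single constant C bounds them all.
quotients = [difference_quotient(10.0 ** (-2 * j), 0.0) for j in range(1, 6)]
print(quotients)
```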

Young’s Convolution Theorem

Let $1\leq p,q\leq \infty$ and $1/p+1/q\geq 1$ , and let $1/r=1/p+1/q-1$ . If $f\in L^p(\mathbb{R}^n)$ and $g\in L^q(\mathbb{R}^n)$ , then $f*g\in L^r(\mathbb{R}^n)$ and $\displaystyle \|f*g\|_r\leq\|f\|_p\|g\|_q.$

Amazing Theorem! In particular, if $q=1$, then $r=p$ and $\|f*g\|_p\leq\|f\|_p\|g\|_1$.
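A discrete analogue makes the inequality easy to test. The sketch below assumes finitely supported sequences on $\mathbb{Z}$ with the counting measure, which is a different setting from $\mathbb{R}^n$ above (Young's inequality with constant 1 also holds there); all function names are made up for illustration.

```python
def lp_norm(values, p):
    """l^p norm of a finite sequence; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def convolve(f, g):
    """Discrete convolution (f*g)[n] = sum_i f[i] * g[n - i]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def young_sides(f, g, p, q):
    """Return (||f*g||_r, ||f||_p * ||g||_q) with 1/r = 1/p + 1/q - 1."""
    inv_r = 1.0 / p + 1.0 / q - 1.0
    r = float("inf") if inv_r <= 0.0 else 1.0 / inv_r
    return lp_norm(convolve(f, g), r), lp_norm(f, p) * lp_norm(g, q)


f = [1.0, -2.0, 0.5, 3.0]
g = [0.5, 0.25, -1.0]
lhs_q1, rhs_q1 = young_sides(f, g, p=2.0, q=1.0)             # q = 1, so r = p = 2
lhs_43, rhs_43 = young_sides(f, g, p=4.0 / 3.0, q=4.0 / 3.0)  # r = 2
print(lhs_q1 <= rhs_q1, lhs_43 <= rhs_43)
```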

Relationship between L^p convergence and a.e. convergence

It turns out that convergence in Lp implies that the norms converge. Conversely, a.e. convergence and the fact that norms converge implies Lp convergence. Amazing!

Relationship between $L^p$ convergence and a.e. convergence:
Let $f, \{f_k\}\in L^p$, $0<p\leq\infty$. If $\|f-f_k\|_p\to 0$, then $\|f_k\|_p\to\|f\|_p$. Conversely, if $f_k\to f$ a.e. and $\|f_k\|_p\to\|f\|_p$, $0<p<\infty$, then $\|f-f_k\|_p\to 0$. Note that the converse may fail for $p=\infty$.

Proof:
Assume $\|f-f_k\|_p\to 0$ .

(Case: $0<p<1$ ).
Lemma 1:
If $0<p<1$ , $|a+b|^p\leq|a|^p+|b|^p$ for all $a,b\in\mathbb{R}$ .
Proof of Lemma 1:
If $a=b=0$ the inequality is trivial. Otherwise, since $t\leq t^p$ for $0\leq t\leq 1$, $\displaystyle 1=\frac{|a|}{|a|+|b|}+\frac{|b|}{|a|+|b|}\leq\left(\frac{|a|}{|a|+|b|}\right)^p+\left(\frac{|b|}{|a|+|b|}\right)^p=\frac{|a|^p+|b|^p}{(|a|+|b|)^p}.$
Hence $|a+b|^p\leq(|a|+|b|)^p\leq|a|^p+|b|^p$.
End Proof of Lemma 1.
Hence, using $|a|^p\leq|a-b|^p+|b|^p$ and $|b|^p\leq|a-b|^p+|a|^p$ we see that $\displaystyle ||a|^p-|b|^p|\leq|a-b|^p.$

Thus
$\begin{aligned} \left|\|f_k\|_p^p-\|f\|_p^p\right|&=\left|\int(|f_k|^p-|f|^p)\right|\\ &\leq\int\left||f_k|^p-|f|^p\right|\\ &\leq\int|f_k-f|^p\\ &=\|f-f_k\|_p^p\to 0\ \ \ \text{as}\ k\to\infty. \end{aligned}$

Hence $\|f_k\|_p\to\|f\|_p$ .

(Case: $1\leq p\leq\infty$ .)

By Minkowski’s inequality, $\|f\|_p\leq\|f-f_k\|_p+\|f_k\|_p$ and $\|f_k\|_p\leq\|f-f_k\|_p+\|f\|_p$ so that $\displaystyle \left|\|f_k\|_p-\|f\|_p\right|\leq\|f-f_k\|_p\to 0$ as $k\to\infty$ . Done.

Converse:

Assume $f_k\to f$ a.e. and $\|f_k\|_p\to\|f\|_p$, $0<p<\infty$.
Lemma 2:
For $a,b\in\mathbb{R}$ , $|a+b|^p\leq 2^{p-1}(|a|^p+|b|^p)$ for $1\leq p<\infty$ .
Proof of Lemma 2:
By convexity of $|x|^p$ for $1\leq p<\infty$ , $\displaystyle \left|\frac 12 a+\frac 12 b\right|^p\leq\frac 12 |a|^p+\frac 12 |b|^p.$
Multiplying throughout by $2^p$ gives $\displaystyle |a+b|^p\leq 2^{p-1}(|a|^p+|b|^p).$

Thus together with Lemma 1, for $0<p<\infty$ we have $|f-f_k|^p\leq c(|f|^p+|f_k|^p)$ with $c=\max\{2^{p-1}, 1\}$ .

Note that $|f-f_k|^p\to 0$ a.e. and $\phi_k:=c(|f|^p+|f_k|^p)\to\phi:=2c|f|^p$ a.e., where $\phi$ is integrable. Also, $\int\phi_k\to\int\phi$ since $\|f_k\|_p^p\to\|f\|_p^p$. By the Generalized Lebesgue DCT, we have $\int |f-f_k|^p\to 0$, thus $\displaystyle \|f-f_k\|_p\to 0.$

(Show that the converse may fail for $p=\infty$ ):

Consider $f_k=\chi_{[-k,k]}\in L^\infty(\mathbb{R})$. Then $f_k\to f$ a.e., where $f(x)\equiv 1$, and $\|f_k\|_\infty\to\|f\|_\infty=1$. However $\|f-f_k\|_\infty=1\not\to 0$.

Simple Vitali Lemma

Let $E$ be a subset of $\mathbb{R}^n$ with $|E|_e<\infty$ , and let $K$ be a collection of cubes $Q$ covering $E$ . Then there exist a positive constant $\beta$ (depending only on $n$ ), and a finite number of disjoint cubes $Q_1,\dots,Q_N$ in $K$ such that $\displaystyle \sum_{j=1}^N|Q_j|\geq\beta|E|_e.$
(We may take $0<\beta<5^{-n}$ .)

Wheeden Zygmund Measure and Integration Solutions

Here are some solutions to exercises in the book: Measure and Integral, An Introduction to Real Analysis by Richard L. Wheeden and Antoni Zygmund.

Chapter 1,2: analysis1

Chapter 3: analysis2

Chapter 4, 5: analysis3

Chapter 5,6: analysis4

Chapter 6,7: analysis5

Chapter 8: analysis6

Chapter 9: analysis7

Other than this book by Wheeden, also check out other highly recommended undergraduate/graduate math books.

Books to Transition from Math to Data Science

Graduating soon and interested in transitioning to data science (dubbed the sexiest job of the 21st century)? We recommend two books which are very suitable for students with a strong math background, but little or no background in data science / machine learning.

Do check out the following data science / machine learning book (rated 4.5/5 on Amazon) Pattern Recognition and Machine Learning (Information Science and Statistics) which is an in-depth book on the fundamentals of machine learning. The author Christopher M. Bishop has a PhD in theoretical physics, and is the Deputy Director of Microsoft Research Cambridge.

The above book is good for building a solid, theoretical foundation for a data scientist job. The next book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems is ideal for learning hands-on practical coding for building machine learning (including deep learning) models. The author Aurélien Géron is a former Googler who was the tech lead for YouTube video classification.

Absolute Continuity of Lebesgue Integral

The following is a wonderful property of the Lebesgue Integral, also known as absolute continuity of Lebesgue Integral. Basically, it means that whenever the domain of integration has small enough measure, then the integral will be arbitrarily small.

Suppose $f$ is integrable on a measurable set $E$.
Given $\epsilon>0$, there exists $\delta>0$ such that for all measurable sets $B\subseteq E$ with $|B|<\delta$, $|\int_B f\,dx|<\epsilon$.

Proof:
Define $A_k=\{x\in E: \frac 1k\leq|f(x)|<k\}$ for $k\in\mathbb{N}$. Each $A_k$ is measurable and $A_k\nearrow A:=\bigcup_{k=1}^\infty A_k=\{0<|f|<\infty\}$. Since $f$ is integrable, $|\{|f|=\infty\}|=0$, so $\displaystyle \int_E |f|=\int_{\{f=0\}}|f|+\int_A |f|+\int_{\{|f|=\infty\}}|f|=\int_A |f|.$

Let $f_k=|f|\chi_{A_k}$ . Then $\{f_k\}$ is a sequence of non-negative functions such that $f_k\nearrow |f|\chi_A$ . By Monotone Convergence Theorem, $\lim_{k\to\infty}\int_E f_k=\int_E |f|\chi_A$ , that is, $\displaystyle \lim_{k\to\infty}\int_{A_k}|f|\,dx=\int_A |f|\,dx=\int_E |f|\,dx.$

Let $N>0$ be sufficiently large such that $\int_{E\setminus A_N}|f|\,dx<\epsilon/2$ .

Let $\delta=\frac{\epsilon}{2N}$ , and suppose $|B|<\delta$ . Then
$\begin{aligned} |\int_B f\,dx|&\leq\int_B |f|\,dx\\ &=\int_{(E\setminus A_N)\cap B}|f|\,dx+\int_{A_N\cap B}|f|\,dx\\ &\leq\int_{E\setminus A_N}|f|\,dx+\int_{A_N\cap B}N\,dx\\ &<\epsilon/2+N\cdot|A_N\cap B|\\ &\leq\epsilon/2+N\cdot|B|\\ &<\epsilon/2+N\cdot\frac{\epsilon}{2N}\\ &=\epsilon. \end{aligned}$

Inequalities for pth powers, where 0<p<infinity

There are some useful inequalities for $|x+y|^p$, where $p$ is a number ranging from 0 to infinity. These are the top 3 useful inequalities (note some of them only work for $p$ less than 1, or $p$ at least 1).

1)
For $a,b\in\mathbb{R}$ , $|a+b|^p\leq 2^p(|a|^p+|b|^p)$ , where $0<p<\infty$ .

Proof:
$\begin{aligned} |a+b|^p&\leq(|a|+|b|)^p\\ &\leq(2\max\{|a|,|b|\})^p\\ &=2^p(\max\{|a|,|b|\})^p\\ &\leq 2^p(|a|^p+|b|^p). \end{aligned}$

2)
If $0<p<1$ , $|a+b|^p\leq|a|^p+|b|^p$ for all $a,b\in\mathbb{R}$ .

Proof:
If $a=b=0$ the inequality is trivial. Otherwise, since $t\leq t^p$ for $0\leq t\leq 1$, $\displaystyle 1=\frac{|a|}{|a|+|b|}+\frac{|b|}{|a|+|b|}\leq\left(\frac{|a|}{|a|+|b|}\right)^p+\left(\frac{|b|}{|a|+|b|}\right)^p=\frac{|a|^p+|b|^p}{(|a|+|b|)^p}.$
Hence $|a+b|^p\leq(|a|+|b|)^p\leq|a|^p+|b|^p$.

3)
For $a,b\in\mathbb{R}$ , $|a+b|^p\leq 2^{p-1}(|a|^p+|b|^p)$ for $1\leq p<\infty$ .

Proof:
By convexity of $|x|^p$ for $1\leq p<\infty$ , $\displaystyle \left|\frac 12 a+\frac 12 b\right|^p\leq\frac 12 |a|^p+\frac 12 |b|^p.$
Multiplying throughout by $2^p$ gives $\displaystyle |a+b|^p\leq 2^{p-1}(|a|^p+|b|^p).$
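The three inequalities can be spot-checked numerically. The following Python sketch tests them on an arbitrary grid of sample values (the grid, the tolerance, and the function name are assumptions made for illustration); a passing check is of course not a proof.

```python
import itertools


def check_power_inequalities(p, a, b, tol=1e-9):
    """Check whichever of inequalities 1), 2), 3) applies to the exponent p."""
    lhs = abs(a + b) ** p
    total = abs(a) ** p + abs(b) ** p
    results = {"1) general": lhs <= 2 ** p * total + tol}
    if p < 1:
        results["2) subadditive"] = lhs <= total + tol
    if p >= 1:
        results["3) convexity"] = lhs <= 2 ** (p - 1) * total + tol
    return results


samples = [-3.0, -1.0, -0.25, 0.0, 0.5, 2.0, 7.0]
exponents = [0.25, 0.5, 0.9, 1.0, 1.5, 2.0, 3.0]
all_hold = all(
    all(check_power_inequalities(p, a, b).values())
    for p in exponents
    for a, b in itertools.product(samples, repeat=2)
)
print(all_hold)
```

Note that inequality 3) is sharp when $a=b$, which is why a small tolerance is used for the floating-point comparison.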

Composition of Continuously Differentiable Function and Function of Bounded Variation

Assume $\phi$ is a continuously differentiable function on $\mathbb{R}$ and $f$ is a function of bounded variation on $[0,1]$ . Then $\phi(f)$ is also a function of bounded variation on $[0,1]$ .

Proof:

$\displaystyle V_a^b(\phi(f))=\sup_{P\in\mathcal{P}}\sum_{i=0}^{N_P-1}|\phi(f(x_{i+1}))-\phi(f(x_i))|$ where $\displaystyle \mathcal{P}=\{P|P:a=x_0<x_1<\dots<x_{N_P}=b\ \text{is a partition of}\ [a,b]\}.$

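By the Mean Value Theorem, $\displaystyle |\phi(f(x_{i+1}))-\phi(f(x_i))|=|f(x_{i+1})-f(x_i)||\phi'(c)|$ for some $c$ between $f(x_i)$ and $f(x_{i+1})$.

Since $f$ is of bounded variation on $[0,1]$, it is bounded, say $|f(x)|\leq R$ for all $x\in[0,1]$. Since $\phi'$ is continuous, it is bounded on the compact interval $[-R,R]$, say $|\phi'(t)|\leq K$ for all $t\in[-R,R]$. Thus, with $a=0$ and $b=1$,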
$\begin{aligned} V_a^b(\phi(f))&=\sup_{P\in\mathcal{P}}\sum_{i=0}^{N_P-1}|\phi(f(x_{i+1}))-\phi(f(x_i))|\\ &\leq K\sup_{P\in\mathcal{P}}\sum_{i=0}^{N_P-1}|f(x_{i+1})-f(x_i)|\\ &=KV_a^b(f)\\ &<\infty. \end{aligned}$

Fatou’s Lemma for Convergence in Measure

Suppose $f_k\to f$ in measure on a measurable set $E$ such that $f_k\geq 0$ for all $k$ , then $\displaystyle\int_E f\,dx\leq\liminf_{k\to\infty}\int_E f_k\,dx$ .

The proof is short but slightly tricky:

Suppose to the contrary $\int_E f\,dx>\liminf_{k\to\infty}\int_E f_k\,dx$ . Let $\{f_{k_l}\}$ be a subsequence such that $\displaystyle \lim_{l\to\infty}\int f_{k_l}=\liminf_{k\to\infty}\int_E f_k<\int_E f$
(using the fact that for any sequence there is a subsequence converging to $\liminf$ ).

Since $f_{k_l}\xrightarrow{m}f$ , there exists a further subsequence $f_{k_{l_m}}\to f$ a.e. By Fatou’s Lemma, $\displaystyle \int_E f\leq\liminf_{m\to\infty}\int_E f_{k_{l_m}}=\lim_{l\to\infty}\int f_{k_l}<\int_E f,$ a contradiction.

The last equation above uses the fact that if a sequence converges, all subsequences converge to the same limit.

Summation by parts / Abel’s Lemma

This is an amazing identity by Abel.

Let $\{f_k\}$ and $\{g_k\}$ be two sequences. Then,
$\displaystyle \sum_{k=m}^n f_k(g_{k+1}-g_k)=[f_{n+1}g_{n+1}-f_mg_m]-\sum_{k=m}^n g_{k+1}(f_{k+1}-f_k).$
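Since the identity is purely algebraic, it can be verified exactly on sample data. The Python sketch below uses rational arithmetic; the two sequences and the index range are arbitrary choices made for illustration.

```python
from fractions import Fraction


def abel_sides(f, g, m, n):
    """Both sides of the summation by parts identity for indices m..n."""
    lhs = sum(f[k] * (g[k + 1] - g[k]) for k in range(m, n + 1))
    boundary = f[n + 1] * g[n + 1] - f[m] * g[m]
    rhs = boundary - sum(g[k + 1] * (f[k + 1] - f[k]) for k in range(m, n + 1))
    return lhs, rhs


f = [Fraction(1, k + 1) for k in range(12)]   # f_k = 1/(k+1)
g = [Fraction(k * k - 3) for k in range(12)]  # g_k = k^2 - 3
lhs, rhs = abel_sides(f, g, m=2, n=9)
print(lhs == rhs)
```

The identity holds because $f_k(g_{k+1}-g_k)+g_{k+1}(f_{k+1}-f_k)=f_{k+1}g_{k+1}-f_kg_k$, which telescopes when summed from $k=m$ to $n$.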

sin(1/x)/x is improper Riemann Integrable but not Lebesgue Integrable on (0,1]

Consider $f(x)=\frac{\sin(\frac 1x)}{x}$ .
$\begin{aligned} \int_0^1 \frac{\sin(\frac 1x)}{x}\,dx&=\int_\infty^1\frac{\sin u}{\frac 1u}(-\frac{1}{u^2})\,du\\ &=\int_1^\infty\frac{\sin u}{u}\,du. \end{aligned}$
$\begin{aligned} \int_1^R\frac{\sin u}{u}\,du&=[\frac{-\cos u}{u}]_1^R+\int_1^R\frac{-\cos u}{u^2}\,du\\ &=\frac{-\cos R}{R}+\frac{\cos 1}{1}-\int_1^R\frac{\cos u}{u^2}\,du. \end{aligned}$
Note that $\lim_{R\to\infty}\frac{\cos R}{R}=0$ , and $\displaystyle |\int_1^\infty\frac{\cos u}{u^2}\,du|\leq\int_1^\infty\frac{1}{u^2}\,du=1<\infty.$

Thus the improper integral $\int_1^\infty\frac{\sin u}{u}\,du$ converges, and $f$ is improper Riemann integrable.

However note that
$\begin{aligned} \int_1^\infty |\frac{\sin u}{u}|\,du&\geq\int_\pi^{(N+1)\pi}|\frac{\sin u}{u}|\,du\\ &=\sum_{k=1}^N\int_{k\pi}^{(k+1)\pi}|\frac{\sin u}{u}|\,du\\ &=\sum_{k=1}^N\int_0^\pi|\frac{\sin(t+k\pi)}{t+k\pi}|\,dt\\ &=\sum_{k=1}^N\int_0^\pi\frac{|\sin t|}{t+k\pi}\,dt\\ &\geq\sum_{k=1}^N\frac{1}{(k+1)\pi}\int_0^\pi\sin t\,dt\\ &=\frac{2}{\pi}\sum_{k=1}^N\frac{1}{(k+1)} \end{aligned}$
which diverges as $N\to\infty$ (harmonic series).

Thus $f$ is not Lebesgue integrable on $(0,1]$ .
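The lower bound used above can be compared with a numerical approximation of the integral. The Python sketch below uses a simple midpoint rule (an approximation chosen here for illustration, with a hypothetical step count) on each interval $[k\pi,(k+1)\pi]$.

```python
import math


def abs_sinc_integral(k, steps=400):
    """Midpoint-rule approximation of the integral of |sin u|/u over [k*pi, (k+1)*pi]."""
    width = math.pi / steps
    start = k * math.pi
    total = 0.0
    for j in range(steps):
        u = start + (j + 0.5) * width
        total += abs(math.sin(u)) / u
    return total * width


def compare(N):
    """Return (approximate integral over [pi, (N+1)pi], harmonic lower bound)."""
    integral = sum(abs_sinc_integral(k) for k in range(1, N + 1))
    lower_bound = (2.0 / math.pi) * sum(1.0 / (k + 1) for k in range(1, N + 1))
    return integral, lower_bound


for N in (5, 50, 500):
    print(N, compare(N))
```

Both columns keep growing with $N$, roughly like $\frac{2}{\pi}\log N$, which is consistent with the divergence of the harmonic series.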

Lebesgue’s Dominated Convergence Theorem for Convergence in Measure

If $\{f_k\}$ satisfies $f_k\xrightarrow{m}f$ on $E$ and $|f_k|\leq\phi\in L(E)$ , then $f\in L(E)$ and $\int_E f_k\to\int_E f$ .

Proof

Let $\{f_{k_j}\}$ be any subsequence of $\{f_k\}$. Then $f_{k_j}\xrightarrow{m}f$ on $E$. Thus there is a subsequence $f_{k_{j_l}}\to f$ a.e. in $E$. Clearly $|f_{k_{j_l}}|\leq\phi\in L(E)$.

By the usual Lebesgue’s DCT, $f\in L(E)$ and $\int_E f_{k_{j_l}}\to\int_E f$ .

Since every subsequence of $\{\int_E f_k\}$ has a further subsequence that converges to $\int_E f$ , we have $\int_E f_k\to\int_E f$ .

Generalized Lebesgue Dominated Convergence Theorem Proof

This key theorem showcases the full power of Lebesgue Integration Theory.

Generalized Lebesgue Dominated Convergence Theorem

Let $\{f_k\}$ and $\{\phi_k\}$ be sequences of measurable functions on $E$ satisfying $f_k\to f$ a.e. in $E$ , $\phi_k\to \phi$ a.e. in $E$ , and $|f_k|\leq\phi_k$ a.e. in $E$ . If $\phi\in L(E)$ and $\int_E \phi_k\to\int_E \phi$ , then $\int_E |f_k-f|\to 0$ .

Proof

We have $|f_k-f|\leq|f_k|+|f|\leq\phi_k+\phi$ . Applying Fatou’s lemma to the non-negative sequence $\displaystyle h_k=\phi_k+\phi-|f_k-f|,$ we get $\displaystyle 2\int_E\phi\leq\liminf_{k\to\infty}\int_E (\phi_k+\phi-|f_k-f|).$
That is, $\displaystyle 2\int_E \phi\leq2\int_E\phi-\limsup_{k\to\infty}\int_E |f_k-f|.$

Since $\int_E\phi<\infty$ , we get $\limsup_{k\to\infty}\int_E |f_k-f|\leq 0$ . Since $\liminf_{k\to\infty}\int_E |f_k-f|\geq 0$ , this implies $\lim_{k\to\infty}\int_E |f_k-f|=0$ .

Leibniz Integral Rule (Differentiating under Integral) + Proof

“Differentiating under the Integral” is a useful trick, and here we describe and prove a sufficient condition where we can use the trick. This is the Measure-Theoretic version, which is more general than the usual version stated in calculus books.

Let $X$ be an open subset of $\mathbb{R}$ , and $\Omega$ be a measure space. Suppose $f:X\times\Omega\to\mathbb{R}$ satisfies the following conditions:
1) $f(x,\omega)$ is a Lebesgue-integrable function of $\omega$ for each $x\in X$ .
2) For almost all $\omega\in\Omega$, the derivative $\frac{\partial f}{\partial x}(x,\omega)$ exists for all $x\in X$.
3) There is an integrable function $\Theta: \Omega\to\mathbb{R}$ such that $\displaystyle \left|\frac{\partial f}{\partial x}(x,\omega)\right|\leq\Theta(\omega)$ for all $x\in X$ and almost all $\omega\in\Omega$.

Then for all $x\in X$ , $\displaystyle \frac{d}{dx}\int_\Omega f(x,\omega)\,d\omega=\int_\Omega\frac{\partial}{\partial x} f(x,\omega)\,d\omega.$

Proof:
By definition, $\displaystyle \frac{\partial f}{\partial x}(x,\omega)=\lim_{h\to 0}\frac{f(x+h,\omega)-f(x,\omega)}{h}.$

Let $h_n$ be a sequence tending to 0, and define $\displaystyle \phi_n(x,\omega)=\frac{f(x+h_n,\omega)-f(x,\omega)}{h_n}.$

It follows that $\displaystyle \frac{\partial f}{\partial x}(x,\omega)=\lim_{n\to\infty}\phi_n(x,\omega)$ is measurable in $\omega$, being a pointwise limit of measurable functions.

Using the Mean Value Theorem (for $n$ large enough that the segment between $x$ and $x+h_n$ lies in $X$), we have $\displaystyle |\phi_n(x,\omega)|\leq\sup_{y\in X}\left|\frac{\partial f}{\partial x}(y,\omega)\right|\leq\Theta(\omega)$ for each $x\in X$.

Thus for each $x\in X$, by the Dominated Convergence Theorem, we have $\displaystyle \lim_{n\to\infty}\int_\Omega \phi_n(x,\omega)\,d\omega=\int_\Omega\lim_{n\to\infty}\phi_n(x,\omega)\,d\omega$ which implies $\displaystyle \lim_{h_n\to 0}\frac{\int_\Omega f(x+h_n,\omega)\,d\omega-\int_\Omega f(x,\omega)\,d\omega}{h_n}=\int_\Omega \frac{\partial f}{\partial x}(x,\omega)\,d\omega.$

That is, $\displaystyle \frac{d}{dx}\int_\Omega f(x,\omega)\,d\omega=\int_\Omega \frac{\partial}{\partial x}f(x,\omega)\,d\omega.$

Laurent Series with WolframAlpha

WolframAlpha can compute (simple) Laurent series:
https://www.wolframalpha.com/input/?i=series+sin(z%5E-1)

Series[Sin[z^(-1)], {z, 0, 5}]

1/z-1/(6 z^3)+1/(120 z^5)+O((1/z)^6)
(Laurent series)
(converges everywhere away from origin)

Unfortunately, more “complex” (pun intended) Laurent series are not possible for WolframAlpha.

Increasing sequence of simple functions to a bounded measurable function f

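Assume $|f(x)|\leq M$, where $M\in\mathbb{N}$; replacing $M$ by $M+1$ if necessary, we may assume $-M<f(x)\leq M$, so that the sets below cover the whole domain. Consider $\displaystyle g_k=\sum_{i=-M2^k+1}^{M2^k}\frac{i-1}{2^k}\chi_{\{\frac{i-1}{2^k}<f\leq\frac{i}{2^k}\}}.$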
Thus $\displaystyle g_{k+1}=\sum_{i=-M2^{k+1}+1}^{M2^{k+1}}\frac{i-1}{2^{k+1}}\chi_{\{\frac{i-1}{2^{k+1}}<f\leq\frac{i}{2^{k+1}}\}}.$

For $x\in\{\frac{i-1}{2^k}<f\leq\frac{i}{2^k}\}$ , $g_k(x)=\frac{i-1}{2^k}$ .

The above set is equal to $\{\frac{2i-2}{2^{k+1}}<f\leq\frac{2i}{2^{k+1}}\}$ , so $g_{k+1}(x)\geq\frac{2i-2}{2^{k+1}}=g_k(x)$ .

$|g_k(x)-f(x)|\leq\frac{1}{2^k}\to 0$ as $k\to\infty$ . Hence $g_k$ converges to $f$ everywhere.
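Here is a minimal Python sketch of the construction, working pointwise: given the value $f(x)$, it returns $g_k(x)$. Exact rational arithmetic is used so the comparisons are not affected by rounding; the function name and the sample values are assumptions made for illustration.

```python
from fractions import Fraction
import math


def dyadic_lower_approximation(value, k):
    """Value of g_k at a point where f equals `value`."""
    # i is the unique integer with (i - 1)/2^k < value <= i/2^k.
    i = math.ceil(value * 2 ** k)
    return Fraction(i - 1, 2 ** k)


sample_values = [Fraction(3, 10), Fraction(-3, 4), Fraction(1), Fraction(-1, 3), Fraction(0)]
for value in sample_values:
    approximations = [dyadic_lower_approximation(value, k) for k in range(6)]
    print(value, [str(g) for g in approximations])
```

For each sample value the printed approximations are non-decreasing in $k$ and stay within $2^{-k}$ below the value, matching the two claims above.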

Laurent Series (Example)

The Laurent series is something like the Taylor series, but with terms with negative exponents, e.g. $z^{-1}$. The Laurent series formula below may not be the most practical way to compute the coefficients; usually we use known series expansions, as the example below shows.

Laurent Series

The Laurent series for a complex function $f(z)$ about a point $c$ is given by: $\displaystyle f(z)=\sum_{n=-\infty}^\infty a_n(z-c)^n$ where $\displaystyle a_n=\frac{1}{2\pi i}\oint_\gamma\frac{f(z)\, dz}{(z-c)^{n+1}}.$

The path of integration $\gamma$ is anticlockwise around a closed, rectifiable path containing no self-intersections, enclosing $c$ and lying in an annulus $A$ in which $f(z)$ is holomorphic. The expansion for $f(z)$ will then be valid anywhere inside the annulus.

Example

Consider $f(z)=\frac{e^z}{z}+e^\frac{1}{z}$ . This function is holomorphic everywhere except at $z=0$ . Using the Taylor series of the exponential function $\displaystyle e^z=\sum_{k=0}^\infty\frac{z^k}{k!},$ we get
$\begin{aligned} \frac{e^z}{z}&=z^{-1}+1+\frac{z}{2!}+\frac{z^2}{3!}+\dots\\ e^\frac{1}{z}&=1+z^{-1}+\frac{1}{2!}z^{-2}+\frac{1}{3!}z^{-3}+\dots\\ \therefore f(z)&=\dots+(\frac{1}{3!})z^{-3}+(\frac{1}{2!})z^{-2}+2z^{-1}+2+(\frac{1}{2!})z+(\frac{1}{3!})z^2+\dots \end{aligned}$
Note that the residue (coefficient of $z^{-1}$ ) is 2.
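The coefficients in this example can be tabulated directly from the two exponential series. The Python sketch below (function name hypothetical) returns the coefficient $a_n$ of $z^n$ as an exact fraction: $e^z/z$ contributes $1/(n+1)!$ for $n\geq -1$ and $e^{1/z}$ contributes $1/(-n)!$ for $n\leq 0$.

```python
from fractions import Fraction
from math import factorial


def laurent_coefficient(n):
    """Coefficient a_n of z^n in the Laurent series of e^z/z + e^(1/z) about 0."""
    coefficient = Fraction(0)
    if n >= -1:   # e^z / z = sum over n >= -1 of z^n / (n + 1)!
        coefficient += Fraction(1, factorial(n + 1))
    if n <= 0:    # e^(1/z) = sum over n <= 0 of z^n / (-n)!
        coefficient += Fraction(1, factorial(-n))
    return coefficient


for n in range(-3, 3):
    print(n, laurent_coefficient(n))
```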

Implicit Function Theorem

The implicit function theorem is a strong theorem that allows us to express a variable as a function of another variable. For instance, if $x^2y+y^3x+9xy=0$, can we make $y$ the subject, i.e. write $y$ as a function of $x$? The implicit function theorem allows us to answer such questions, though like most Pure Math theorems, it only guarantees existence; the theorem does not explicitly tell us how to write out such a function.

The material below is taken from Wikipedia.

Implicit function theorem

Let $f:\mathbb{R}^{n+m}\to\mathbb{R}^m$ be a continuously differentiable function, and let $\mathbb{R}^{n+m}$ have coordinates $(\mathbf{x},\mathbf{y})=(x_1,\dots,x_n,y_1,\dots,y_m)$ . Fix a point $(\mathbf{a},\mathbf{b})=(a_1,\dots,a_n,b_1,\dots,b_m)$ with $f(\mathbf{a},\mathbf{b})=\mathbf{c}$ , where $\mathbf{c}\in\mathbb{R}^m$ . If the matrix $\displaystyle [(\partial f_i/\partial y_j)(\mathbf{a},\mathbf{b})]$ is invertible, then there exists an open set $U$ containing $\mathbf{a}$ , an open set $V$ containing $\mathbf{b}$ , and a unique continuously differentiable function $g:U\to V$ such that $\displaystyle \{(\mathbf{x},g(\mathbf{x}))\mid\mathbf{x}\in U\}=\{(\mathbf{x},\mathbf{y})\in U\times V\mid f(\mathbf{x},\mathbf{y})=\mathbf{c}\}.$

Elaboration:

Abbreviating $(a_1,\dots,a_n,b_1,\dots,b_m)$ to $(\mathbf{a},\mathbf{b})$ , the Jacobian matrix is
$\displaystyle (Df)(\mathbf{a},\mathbf{b})=\begin{pmatrix} \frac{\partial f_1}{\partial x_1}(\mathbf{a},\mathbf{b}) & \dots &\frac{\partial f_1}{\partial x_n}(\mathbf{a},\mathbf{b}) & \frac{\partial f_1}{\partial y_1}(\mathbf{a},\mathbf{b}) & \dots & \frac{\partial f_1}{\partial y_m}(\mathbf{a},\mathbf{b})\\ \vdots & \ddots &\vdots & \vdots & \ddots &\vdots\\ \frac{\partial f_m}{\partial x_1}(\mathbf{a},\mathbf{b}) & \dots & \frac{\partial f_m}{\partial x_n}(\mathbf{a}, \mathbf{b}) & \frac{\partial f_m}{\partial y_1}(\mathbf{a}, \mathbf{b}) & \dots & \frac{\partial f_m}{\partial y_m}(\mathbf{a}, \mathbf{b}) \end{pmatrix} =(X\mid Y)$
where $X$ is the matrix of partial derivatives in the variables $x_i$ and $Y$ is the matrix of partial derivatives in the variables $y_j$ .

The implicit function theorem says that if $Y$ is an invertible matrix, then there are $U$ , $V$ , and $g$ as desired.

Example (Unit circle)

In this case $n=m=1$ and $f(x,y)=x^2+y^2-1$ .

$\displaystyle (Df)(a,b)=(\frac{\partial f}{\partial x}(a,b)\ \frac{\partial f}{\partial y}(a,b))=(2a\ 2b).$

Note that $Y=(2b)$ is invertible iff $b\neq 0$ . By the implicit function theorem, we see that we can locally write the circle in the form $y=g(x)$ for all points where $y\neq 0$ .
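For the unit circle the local function can be written down explicitly, which allows a numerical check. The Python sketch below assumes the base point $(a,b)=(0.6,0.8)$ and uses the standard formula $g'(x)=-\frac{\partial f/\partial x}{\partial f/\partial y}$, which is a well-known consequence of the theorem but is not stated in the text above.

```python
import math


def g(x):
    """Upper branch y = g(x) of the unit circle, valid near points with y > 0."""
    return math.sqrt(1.0 - x * x)


def implicit_slope(x, y):
    """-f_x / f_y for f(x, y) = x^2 + y^2 - 1; requires y != 0."""
    return -(2.0 * x) / (2.0 * y)


a = 0.6
b = g(a)                                            # the point (a, b) lies on the circle
h = 1e-6
numeric_slope = (g(a + h) - g(a - h)) / (2.0 * h)   # central difference
print(numeric_slope, implicit_slope(a, b))
```

At points with $b=0$ the call `implicit_slope(a, b)` would divide by zero, mirroring the failure of the invertibility condition on $Y$.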

lim sup & lim inf of Sets

The concept of lim sup and lim inf can be applied to sets too. Here is a nice characterisation of lim sup and lim inf of sets:

For a sequence of sets $\{E_k\}$ , $\limsup E_k$ consists of those points that belong to infinitely many $E_k$ , and $\liminf E_k$ consists of those points that belong to all $E_k$ from some $k$ on (i.e. belong to all but finitely many $E_k$ ).

Proof:
Note that
$\begin{aligned} x\in\limsup E_k&\iff x\in\bigcup_{k=j}^\infty E_k\ \text{for all}\ j\in\mathbb{N}\\ &\iff\text{For all}\ j\in\mathbb{N}, \text{there exists}\ i\geq j\ \text{such that}\ x\in E_i\\ &\iff x\ \text{belongs to infinitely many}\ E_k. \end{aligned}$
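$\begin{aligned} x\in\liminf E_k&\iff x\in\bigcap_{k=j}^\infty E_k\ \text{for some}\ j\in\mathbb{N}\\ &\iff \text{there exists}\ j\in\mathbb{N}\ \text{such that}\ x\in E_k\ \text{for all}\ k\geq j\\ &\iff x\ \text{belongs to all but finitely many}\ E_k. \end{aligned}$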
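A toy example illustrates the characterisation. In the Python sketch below, the sequence $E_k$ is a made-up example that is periodic from $k=5$ onwards, so truncating the infinite unions and intersections at a finite horizon gives the exact answer for this particular sequence (this would not be valid for a general sequence).

```python
def E(k):
    """Hypothetical example: 0 is in every E_k, 1 only for even k, 2 only for k < 5."""
    members = {0}
    if k % 2 == 0:
        members.add(1)
    if k < 5:
        members.add(2)
    return members


def tail_union(j, horizon):
    return set().union(*(E(k) for k in range(j, horizon)))


def tail_intersection(j, horizon):
    return set.intersection(*(E(k) for k in range(j, horizon)))


HORIZON, CUTOFF = 40, 20
limsup = set.intersection(*(tail_union(j, HORIZON) for j in range(CUTOFF)))
liminf = set().union(*(tail_intersection(j, HORIZON) for j in range(CUTOFF)))
print(limsup, liminf)
```

The point $1$ lies in infinitely many $E_k$ but not in all but finitely many, so it belongs to the lim sup only; the point $2$ lies in only finitely many $E_k$ and belongs to neither.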

Fundamental Theorem of Calculus

The Fundamental Theorem of Calculus is one of the most amazing and important theorems in analysis. It is a non-trivial result that links the concept of area and gradient, two seemingly unrelated concepts.

Fundamental Theorem of Calculus

The first part deals with the derivative of an antiderivative, while the second part deals with the relationship between antiderivatives and definite integrals.

First part

Let $f$ be a continuous real-valued function defined on a closed interval $[a,b]$ . Let $F$ be the function defined, for all $x$ in $[a,b]$ , by $\displaystyle F(x)=\int_a^x f(t)\,dt.$

Then $F$ is uniformly continuous on $[a,b]$ , differentiable on the open interval $(a,b)$ , and $\displaystyle F'(x)=f(x)$ for all $x$ in $(a,b)$ .

Second part

Let $f$ and $F$ be real-valued functions defined on $[a,b]$ such that $F$ is continuous and for all $x\in (a,b)$ , $\displaystyle F'(x)=f(x).$

If $f$ is Riemann integrable on $[a,b]$ , then $\displaystyle \int_a^b f(x)\,dx=F(b)-F(a).$

Gradient Theorem (Proof)

This amazing theorem is also called the Fundamental Theorem of Calculus for Line Integrals. It is quite a powerful theorem that sometimes allows fast computations of line integrals.

Gradient Theorem (Fundamental Theorem of Calculus for Line Integrals)

Let $C$ be a differentiable curve given by the vector function $\mathbf{r}(t)$ , $a\leq t\leq b$ .

Let $f$ be a differentiable function of $n$ variables whose gradient vector $\nabla f$ is continuous on $C$ . Then $\displaystyle \int_C \nabla f\cdot d\mathbf{r}=f(\mathbf{r}(b))-f(\mathbf{r}(a)).$

Proof

$\begin{aligned} \int_C\nabla f\cdot d\mathbf{r}&=\int_a^b\nabla f(\mathbf{r}(t))\cdot \mathbf{r}'(t)\,dt\ \ \ \text{(Definition of line integral)}\\ &=\int_a^b (\frac{\partial f}{\partial x_1}\frac{dx_1}{dt}+\frac{\partial f}{\partial x_2}\frac{dx_2}{dt}+\dots+\frac{\partial f}{\partial x_n}\frac{dx_n}{dt})\,dt\\ &=\int_a^b \frac{d}{dt}f(\mathbf{r}(t))\,dt\ \ \ \text{(by Multivariate Chain Rule)}\\ &=f(\mathbf{r}(b))-f(\mathbf{r}(a))\ \ \ \text{(by Fundamental Theorem of Calculus)} \end{aligned}$
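The theorem can be checked numerically on a simple example. The Python sketch below assumes $f(x,y)=x^2y$ and the curve $\mathbf{r}(t)=(t,t^2)$, $0\leq t\leq 1$ (both chosen only for illustration), and approximates the line integral with a midpoint rule.

```python
def f(x, y):
    return x * x * y


def grad_f(x, y):
    return (2.0 * x * y, x * x)


def r(t):
    return (t, t * t)


def r_prime(t):
    return (1.0, 2.0 * t)


def line_integral(steps=2000):
    """Midpoint-rule approximation of the integral of grad f(r(t)) . r'(t) over [0, 1]."""
    width = 1.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * width
        gx, gy = grad_f(*r(t))
        dx, dy = r_prime(t)
        total += (gx * dx + gy * dy) * width
    return total


endpoint_difference = f(*r(1.0)) - f(*r(0.0))
print(line_integral(), endpoint_difference)
```

Here the integrand simplifies to $4t^3$, so both sides equal $1$.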

Liouville’s Theorem

Every bounded entire function must be constant.

That is, every holomorphic function $f$ for which there exists $M>0$ such that $|f(z)|\leq M$ for all $z\in\mathbb{C}$ is constant.

Multivariable Version of Taylor’s Theorem

Multivariable calculus is an interesting topic that is often neglected in the curriculum. Furthermore it is hard to learn since the existing textbooks are either too basic/computational (e.g. Multivariable Calculus, 7th Edition by Stewart) or too advanced. Many analysis books skip multivariable calculus altogether and just focus on measure and integration.

If anyone has a good book that covers multivariable calculus (preferably rigorously with proofs), do post it in the comments!

The following is a useful multivariable version of Taylor’s Theorem, using the multi-index notation which is regarded as the most efficient way of writing the formula.

Multivariable Version of Taylor’s Theorem

Let $f:\mathbb{R}^n\to\mathbb{R}$ be a $k$ times differentiable function at the point $\mathbf{a}\in\mathbb{R}^n$ . Then there exists $h_\alpha:\mathbb{R}^n\to\mathbb{R}$ such that $\displaystyle f(\mathbf{x})=\sum_{|\alpha|\leq k}\frac{D^\alpha f(\mathbf{a})}{\alpha!}(\mathbf{x}-\mathbf{a})^\alpha+\sum_{|\alpha|=k}h_\alpha(\mathbf{x})(\mathbf{x}-\mathbf{a})^\alpha,$ and $\lim_{\mathbf{x}\to\mathbf{a}}h_\alpha(\mathbf{x})=0$ .

Example ( $n=2$ , $k=1$ )

Write $\mathbf{x}-\mathbf{a}=\mathbf{v}$ .
$\displaystyle f(x,y)=f(\mathbf{a})+\frac{\partial f}{\partial x}(\mathbf{a})v_1+\frac{\partial f}{\partial y}(\mathbf{a})v_2+h_{(1,0)}(x,y)v_1+h_{(0,1)}(x,y)v_2.$
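
For comparison, the same formula with $n=2$ , $k=2$ can be written out as follows (a sketch in the notation above, using $(2,0)!=(0,2)!=2$ and $(1,1)!=1$ ):
$\begin{aligned} f(x,y)&=f(\mathbf{a})+\frac{\partial f}{\partial x}(\mathbf{a})v_1+\frac{\partial f}{\partial y}(\mathbf{a})v_2\\ &\quad+\frac{1}{2}\frac{\partial^2 f}{\partial x^2}(\mathbf{a})v_1^2+\frac{\partial^2 f}{\partial x\partial y}(\mathbf{a})v_1v_2+\frac{1}{2}\frac{\partial^2 f}{\partial y^2}(\mathbf{a})v_2^2\\ &\quad+h_{(2,0)}(x,y)v_1^2+h_{(1,1)}(x,y)v_1v_2+h_{(0,2)}(x,y)v_2^2. \end{aligned}$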

Cauchy Product Definition

Cauchy Product:
The Cauchy product of two infinite series is defined by $\displaystyle (\sum_{i=0}^\infty a_i)\cdot(\sum_{j=0}^\infty b_j)=\sum_{k=0}^\infty c_k$ where $c_k=\sum_{l=0}^k a_lb_{k-l}$ .
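
For finitely many coefficients the definition is a discrete convolution. A minimal sketch, where the helper name `cauchy_product` is an illustrative choice, applied to the geometric series $\sum x^k$ multiplied by itself (whose product has coefficients $k+1$ ):

```python
def cauchy_product(a, b):
    """Return c_0..c_{n-1} with c_k = sum_{l=0}^{k} a_l * b_{k-l}, where n = min(len(a), len(b))."""
    n = min(len(a), len(b))
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(n)]

# Coefficients of the geometric series 1 + x + x^2 + ... (first six terms).
ones = [1] * 6
coeffs = cauchy_product(ones, ones)
print(coeffs)  # [1, 2, 3, 4, 5, 6]
```

Only the first $\min(\text{len}(a),\text{len}(b))$ terms are returned, since later $c_k$ would need coefficients that were not supplied.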

Pasting Lemma (Elaboration of Wikipedia’s proof)

The proof of the Pasting Lemma at Wikipedia is correct, but a bit unclear. In particular, it does not clearly show how the hypothesis that X, Y are both closed is being used. It actually has something to do with subspace topology.

I have added some clarifications here:

Pasting Lemma (Statement)

Let $X$ , $Y$ be both closed (or both open) subsets of a topological space $A$ such that $A=X\cup Y$ , and let $B$ also be a topological space. If both $f|_X: X\to B$ and $f|_Y: Y\to B$ are continuous, then $f:A \to B$ is continuous.

Proof:

Let $U$ be a closed subset of $B$ . Then $f^{-1}(U)\cap X$ is closed in $X$ since it is the preimage of $U$ under the function $f|_X:X\to B$ , which is continuous. Hence $f^{-1}(U)\cap X=F\cap X$ for some set $F$ closed in $A$ . Since $X$ is closed in $A$ , $f^{-1}(U)\cap X$ is closed in $A$ .

Similarly, $f^{-1}(U)\cap Y$ is closed (in $A$ ). Then, their union $f^{-1}(U)$ is also closed (in $A$ ), being a finite union of closed sets.

One-Sided Limit that Does Not Exist

Offhand, it is hard to think of a function that does not have even a one-sided limit. This video shows one!

Algebra and Analysis Theorems

The following are two lists of useful algebra and analysis theorems that are covered during university.

Algebra Theorems Mathtuition88

Analysis Theorems Mathtuition88

Mertens’ Theorem

Let $(a_n)$ and $(b_n)$ be real or complex sequences.

If the series $\sum_{n=0}^\infty a_n$ converges to $A$ and $\sum_{n=0}^\infty b_n$ converges to $B$ , and at least one of them converges absolutely, then their Cauchy product converges to $AB$ .

An immediate corollary of Mertens’ Theorem is that if a power series $f(x)=\sum a_kx^k$ has radius of convergence $R_a$ , and another power series $g(x)=\sum b_kx^k$ has radius of convergence $R_b$ , then their Cauchy product converges to $f\cdot g$ and has radius of convergence at least the minimum of $R_a, R_b$ .

Note that a power series converges absolutely within its radius of convergence so Mertens’ Theorem applies.
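
A standard example shows that the absolute convergence hypothesis cannot simply be dropped. Take $a_n=b_n=\frac{(-1)^n}{\sqrt{n+1}}$ ; both series converge by the alternating series test, but neither converges absolutely. Here $c_k=(-1)^k\sum_{l=0}^k\frac{1}{\sqrt{(l+1)(k-l+1)}}$ , and since $\sqrt{(l+1)(k-l+1)}\leq\frac{k+2}{2}$ by the AM-GM inequality,
$\begin{aligned} |c_k|&=\sum_{l=0}^k\frac{1}{\sqrt{(l+1)(k-l+1)}}\\ &\geq\sum_{l=0}^k\frac{2}{k+2}\\ &=\frac{2(k+1)}{k+2}\\ &\geq 1. \end{aligned}$
Thus $c_k\not\to 0$ , so the Cauchy product diverges.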

Tietze Extension Theorem and Pasting Lemma

Tietze Extension Theorem

If $X$ is a normal topological space and $\displaystyle f:A\to\mathbb{R}$ is a continuous map from a closed subset $A\subseteq X$ , then there exists a continuous map $\displaystyle F:X\to\mathbb{R}$ with $F(a)=f(a)$ for all $a$ in $A$ .

Moreover, $F$ may be chosen such that $\sup\{|f(a)|:a\in A\}=\sup\{|F(x)|:x\in X\}$ , i.e., if $f$ is bounded, $F$ may be chosen to be bounded (with the same bound as $f$ ). $F$ is called a continuous extension of $f$ .

Pasting Lemma

Proof:

Let $U$ be a closed subset of $B$ . Then $f^{-1}(U)\cap X$ is closed since it is the preimage of $U$ under the function $f|_X:X\to B$ , which is continuous. Similarly, $f^{-1}(U)\cap Y$ is closed. Then, their union $f^{-1}(U)$ is also closed, being a finite union of closed sets.

Lusin’s Theorem and Egorov’s Theorem

Lusin’s Theorem and Egorov’s Theorem are the second and third of Littlewood’s famous Three Principles.

There are many variations and generalisations, the most basic of which I think are found in Royden’s book.

Lusin’s Theorem:

Informally, “every measurable function is nearly continuous.”

(Royden) Let $f$ be a real-valued measurable function on $E$ . Then for each $\epsilon>0$ , there is a continuous function $g$ on $\mathbb{R}$ and a closed set $F\subseteq E$ for which $\displaystyle f=g\ \text{on}\ F\ \text{and}\ m(E\setminus F)<\epsilon.$

Egorov’s Theorem

Informally, “every convergent sequence of functions is nearly uniformly convergent.”

(Royden) Assume $m(E)<\infty$ . Let $\{f_n\}$ be a sequence of measurable functions on $E$ that converges pointwise on $E$ to the real-valued function $f$ .

Then for each $\epsilon>0$ , there is a closed set $F\subseteq E$ for which $\displaystyle f_n\to f\ \text{uniformly on}\ F\ \text{and}\ m(E\setminus F)<\epsilon.$

A holomorphic and injective function has nonzero derivative

This post proves that if $f:U\to V$ is a function that is holomorphic (analytic) and injective, then $f'(z)\neq 0$ for all $z$ in $U$ . Having nonzero derivative is equivalent to being conformal (angle-preserving). Hence, this result can be stated as “A holomorphic and injective function is conformal.”

(Proof modified from Stein-Shakarchi Complex Analysis)

We prove by contradiction. Suppose to the contrary $f'(z_0)=0$ for some $z_0\in U$ . Using Taylor series, $\displaystyle f(z)=f(z_0)+f'(z_0)(z-z_0)+\frac{f''(z_0)}{2!}(z-z_0)^2+\dots$

Since $f'(z_0)=0$ , $\displaystyle f(z)-f(z_0)=a(z-z_0)^k+G(z)$ for all $z$ near $z_0$ , with $a\neq 0$ , $k\geq 2$ and $G(z)=(z-z_0)^{k+1}H(z)$ where $H$ is analytic.

For sufficiently small $w\neq 0$ , we write $\displaystyle f(z)-f(z_0)-w=F(z)+G(z),$ where $F(z)=a(z-z_0)^k-w$ .

Since $|G(z)|<|F(z)|$ on a small circle centered at $z_0$ , and $F$ has at least two zeroes inside that circle, Rouche’s theorem implies that $f(z)-f(z_0)-w$ has at least two zeroes there.

Since the zeroes of a non-constant holomorphic function are isolated, $f'(z)\neq 0$ for all $z\neq z_0$ but sufficiently close to $z_0$ .

Let $z_1$ , $z_2$ be two roots of $f(z)-f(z_0)-w$ inside the circle, counted with multiplicity. Note that since $w\neq 0$ , $z_1\neq z_0$ , $z_2\neq z_0$ . If $z_1=z_2$ , then $f(z)-f(z_0)-w=(z-z_1)^2h(z)$ for some analytic function $h$ . Differentiating gives $f'(z_1)=0$ , which contradicts $f'(z)\neq 0$ for $z\neq z_0$ near $z_0$ .

Thus $z_1\neq z_2$ , which implies that $f$ is not injective.

Underrated Complex Analysis Theorem: Schwarz Lemma

The Schwarz Lemma is a relatively basic lemma in Complex Analysis that is of greater importance than it seems. There is a whole article written on it.

The conditions and results of the Schwarz Lemma are rather difficult to memorize offhand. Some tips I gathered from the net on how to memorize it are:

Conditions: $f:D\to D$ holomorphic and fixes zero.

Result 1: $|f(z)|\leq|z|$ can be remembered as “Range of f” subset of “Domain”.

$|f'(0)|\leq 1$ can be remembered as some sort of “Contraction Mapping”.

Result 2: If $|f(z)|=|z|$ , or $|f'(0)|=1$ , then $f=az$ where $|a|=1$ . Remember it as “ $f$ is a rotation”.

If you have other tips on how to remember or intuitively understand Schwarz Lemma, please let me know by posting in the comments below.

Finally, we proceed to prove the Schwarz Lemma.

Schwarz Lemma

Let $D=\{z:|z|<1\}$ be the open unit disk in the complex plane $\mathbb{C}$ centered at the origin and let $f:D\to D$ be a holomorphic map such that $f(0)=0$ .

Then, $|f(z)|\leq |z|$ for all $z\in D$ and $|f'(0)|\leq 1$ .

Moreover, if $|f(z)|=|z|$ for some non-zero $z$ or $|f'(0)|=1$ , then $f(z)=az$ for some $a\in\mathbb{C}$ with $|a|=1$ (i.e. $f$ is a rotation).

Proof

Consider $g(z)=\begin{cases} \dfrac{f(z)}{z} &\text{if }z\neq 0,\\ f'(0) &\text{if }z=0. \end{cases}$
Since $f$ is analytic, $f(z)=0+a_1z+a_2z^2+\dots$ on $D$ , and $f'(0)=a_1$ . Note that $g(z)=a_1+a_2z+\dots$ on $D$ , so $g$ is analytic on $D$ .

Let $D_r=\{z:|z|\leq r\}$ denote the closed disk of radius $r$ centered at the origin. The Maximum Modulus Principle implies that, for $r<1$ , given any $z\in D_r$ , there exists $z_r$ on the boundary of $D_r$ such that $\displaystyle |g(z)|\leq|g(z_r)|=\frac{|f(z_r)|}{|z_r|}\leq\frac{1}{r}.$

As $r\to 1$ we get $|g(z)|\leq 1$ , thus $|f(z)|\leq|z|$ . Thus
$\begin{aligned} |f'(0)|&=|\lim_{z\to 0}\frac{f(z)}{z}|\\ &=\lim_{z\to 0}|\frac{f(z)}{z}|\\ &\leq1. \end{aligned}$
Moreover, if $|f(z)|=|z|$ for some non-zero $z\in D$ or $|f'(0)|=1$ , then $|g(z)|=1$ at some point of $D$ . By the Maximum Modulus Principle, $g(z)\equiv a$ where $|a|=1$ . Therefore, $f(z)=az$ .
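
As a numerical illustration (not part of the proof), the sketch below tests the two inequalities on a hypothetical example, $f(z)=z\cdot\frac{z+A}{1+Az}$ with $A=0.5$ . This is $z$ times a Blaschke factor, so it maps $D$ into $D$ and fixes zero; the sample grid and the finite-difference step are arbitrary choices.

```python
import cmath

A = 0.5  # hypothetical real parameter with |A| < 1

def f(z):
    # z times a Blaschke factor: holomorphic on the unit disk, maps it into itself, f(0) = 0.
    return z * (z + A) / (1 + A * z)

def sample_disk(radii=20, angles=36):
    """Nonzero sample points in the open unit disk on a polar grid."""
    points = []
    for i in range(1, radii + 1):
        r = 0.99 * i / radii
        for j in range(angles):
            points.append(cmath.rect(r, 2 * cmath.pi * j / angles))
    return points

# Largest observed value of |f(z)| / |z|; the Schwarz Lemma says it cannot exceed 1.
worst_ratio = max(abs(f(z)) / abs(z) for z in sample_disk())

# Central-difference estimate of f'(0); the exact value for this f is A.
h = 1e-6
derivative_at_0 = (f(h) - f(-h)) / (2 * h)

print(worst_ratio, abs(derivative_at_0))
```

Here $|f'(0)|=0.5<1$ , consistent with the fact that this $f$ is not a rotation.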

Rouche’s Theorem

If the complex-valued functions $f$ and $g$ are holomorphic inside and on some closed contour $K$ , with $|g(z)|<|f(z)|$ on $K$ , then $f$ and $f+g$ have the same number of zeroes inside $K$ , where each zero is counted as many times as its multiplicity.

Example

Consider the polynomial $z^5+3z^3+7$ in the disk $|z|<2$ . Let $g(z)=3z^3+7$ , $f(z)=z^5$ , then

$\begin{aligned} |3z^3+7|&\leq 3(8)+7\\ &=31\\ &<32\\ &=|z^5| \end{aligned}$
for every $|z|=2$ .
Then $f+g$ has the same number of zeroes as $f(z)=z^5$ in the disk $|z|<2$ , which is exactly 5 zeroes.
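
The count can be cross-checked numerically with the argument principle: the number of zeroes inside $|z|=2$ equals the winding number of the image curve around the origin. The sketch below is an independent check rather than part of Rouche's theorem; the helper name and the number of sample points are arbitrary choices.

```python
import cmath

def p(z):
    return z**5 + 3 * z**3 + 7

def zeros_inside_circle(poly, radius, samples=20000):
    """Winding number of poly(z) around 0 as z traverses |z| = radius once.

    Assumes poly has no zero on the circle and that samples is large enough
    for consecutive values to differ in phase by less than pi.
    """
    total = 0.0
    prev = poly(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = cmath.rect(radius, 2 * cmath.pi * k / samples)
        cur = poly(z)
        total += cmath.phase(cur / prev)  # small phase increment between samples
        prev = cur
    return round(total / (2 * cmath.pi))

count = zeros_inside_circle(p, 2.0)
print(count)  # 5
```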

The most Striking Theorem in Real Analysis

Lebesgue’s Theorem (see below) has been called one of the most striking theorems in real analysis. Indeed it is a very surprising result.

Lebesgue’s Theorem (Monotone functions)

If the function $f$ is monotone on the open interval $(a,b)$ , then it is differentiable almost everywhere on $(a,b)$ .

Absolutely Continuous Functions

Definition

A real-valued function $f$ on a closed, bounded interval $[a,b]$ is said to be absolutely continuous on $[a,b]$ provided for each $\epsilon>0$ , there is a $\delta>0$ such that for every finite disjoint collection $\{(a_k,b_k)\}_{k=1}^n$ of open intervals in $(a,b)$ , if $\displaystyle \sum_{k=1}^n(b_k-a_k)<\delta,$ then $\displaystyle \sum_{k=1}^n|f(b_k)-f(a_k)|<\epsilon.$

Equivalent Conditions

The following conditions on a real-valued function $f$ on a compact interval $[a,b]$ are equivalent:
(i) $f$ is absolutely continuous;

(ii) $f$ has a derivative $f'$ almost everywhere, the derivative is Lebesgue integrable, and $\displaystyle f(x)=f(a)+\int_a^x f'(t)\,dt$ for all $x$ on $[a,b]$ ;

(iii) there exists a Lebesgue integrable function $g$ on $[a,b]$ such that $\displaystyle f(x)=f(a)+\int_a^x g(t)\,dt$ for all $x$ on $[a,b]$ .

Equivalence between (i) and (iii) is known as the Fundamental Theorem of Lebesgue integral calculus.

Inner and Outer Approximation of Lebesgue Measurable Sets

Let $E\subseteq\mathbb{R}$ . Then each of the following four assertions is equivalent to the measurability of $E$ .

(Outer Approximation by Open Sets and $G_\delta$ Sets)

(i) For each $\epsilon>0$ , there is an open set $G$ containing $E$ for which $m^*(G\setminus E)<\epsilon$ .

(ii) There is a $G_\delta$ set $G$ containing $E$ for which $m^*(G\setminus E)=0$ .

(Inner Approximation by Closed Sets and $F_\sigma$ Sets)

(iii) For each $\epsilon>0$ , there is a closed set $F$ contained in $E$ for which $m^*(E\setminus F)<\epsilon$ .

(iv) There is an $F_\sigma$ set $F$ contained in $E$ for which $m^*(E\setminus F)=0$ .

Proof:
( $E$ measurable implies (i)):

Assume $E$ is measurable. Let $\epsilon>0$ . First we consider the case where $m^*(E)<\infty$ . By the definition of outer measure, there is a countable collection of open intervals $\{I_k\}_{k=1}^\infty$ which covers $E$ and satisfies $\displaystyle \sum_{k=1}^\infty l(I_k)<m^*(E)+\epsilon.$

Define $G=\bigcup_{k=1}^\infty I_k$ . Then $G$ is an open set containing $E$ . By definition of the outer measure of $G$ , $\displaystyle m^*(G)\leq\sum_{k=1}^\infty l(I_k)<m^*(E)+\epsilon.$

Since $E$ is measurable and has finite outer measure, by the excision property, $\displaystyle m^*(G\setminus E)=m^*(G)-m^*(E)<\epsilon.$

Now consider the case that $m^*(E)=\infty$ . Since $\mathbb{R}$ is $\sigma$ -finite, $E$ may be expressed as the disjoint union of a countable collection $\{E_k\}_{k=1}^\infty$ of measurable sets, each of which has finite outer measure.

By the finite measure case, for each $k\in\mathbb{N}$ , there is an open set $G_k$ containing $E_k$ for which $m^*(G_k\setminus E_k)<\epsilon/2^k$ . The set $G=\bigcup_{k=1}^\infty G_k$ is open, it contains $E$ and $\displaystyle G\setminus E=(\bigcup_{k=1}^\infty G_k)\setminus E\subseteq\bigcup_{k=1}^\infty (G_k\setminus E_k).$

Therefore
$\begin{aligned} m^*(G\setminus E)&\leq\sum_{k=1}^\infty m^*(G_k\setminus E_k)\\ &<\sum_{k=1}^\infty\epsilon/2^k\\ &=\epsilon. \end{aligned}$
Thus property (i) holds for $E$ .

((i) implies (ii)):

Assume property (i) holds for $E$ . For each $k\in\mathbb{N}$ , choose an open set $O_k$ that contains $E$ such that $m^*(O_k\setminus E)<1/k$ . Define $G=\bigcap_{k=1}^\infty O_k$ . Then $G$ is a $G_\delta$ set that contains $E$ . Note that for each $k$ , $\displaystyle G\setminus E\subseteq O_k\setminus E.$

By monotonicity of outer measure, $\displaystyle m^*(G\setminus E)\leq m^*(O_k\setminus E)<1/k.$

Thus $m^*(G\setminus E)=0$ and hence (ii) holds.

((ii) $\implies E$ is measurable):

Now assume property (ii) holds for $E$ . Since a set of measure zero is measurable, $G\setminus E$ is measurable. $G$ is a $G_\delta$ set and thus measurable. Since measurable sets form a $\sigma$ -algebra, $E=G\cap(G\setminus E)^c$ is measurable.

((i) $\implies$ (iii)):

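Assume condition (i) holds, so that $E$ is measurable by the implications above. Since $E^c$ is measurable iff $E$ is measurable, property (i) also holds for $E^c$ . Thus for each $\epsilon>0$ there exists an open set $G\supseteq E^c$ such that $m^*(G\setminus E^c)<\epsilon$ .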

Define $F=\mathbb{R}\setminus G$ which is closed. Note that $F\subseteq E$ , and $m^*(E\setminus F)=m^*(G\setminus E^c)<\epsilon$ .

((iii) $\implies$ (i)):

Similar.

((ii) $\iff$ (iv)):

Similar idea. Note that a set is $G_\delta$ iff its complement is $F_\sigma$ .

Excision Property in Measure Theory

Excision property of measurable sets (Proof)

If $A$ is a measurable set of finite outer measure that is contained in $B$ , then $\displaystyle m^*(B\setminus A)=m^*(B)-m^*(A).$

Proof:

By the measurability of $A$ ,
$\begin{aligned} m^*(B)&=m^*(B\cap A)+m^*(B\cap A^c)\\ &=m^*(A)+m^*(B\setminus A). \end{aligned}$
Since $m^*(A)<\infty$ , we have the result.

How to remember the Divergence Theorem

The Divergence Theorem:
$\displaystyle \int_U\nabla\cdot\mathbf{F}\,dV_n=\oint_{\partial U}\mathbf{F}\cdot\mathbf{n}\,dS_{n-1}$

is a rather formidable looking formula that is not so easy to memorise.

One trick to remember it is to recall the simpler-looking generalized Stokes’ Theorem.

One can use the generalized Stokes’ Theorem ( $\int_{\Omega}d\omega=\int_{\partial\Omega}\omega$ ) to equate the $n$ -dimensional volume integral of the divergence of a vector field $\mathbf{F}$ over a region $U$ to the $(n-1)$ -dimensional surface integral of $\mathbf{F}$ over the boundary of $U$ .
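
As a sketch of how the two sides match up (assuming $\mathbf{F}=(F_1,\dots,F_n)$ is smooth and $\partial U$ is oriented by the outward normal $\mathbf{n}$ ), take the $(n-1)$ -form $\displaystyle \omega=\sum_{i=1}^n(-1)^{i-1}F_i\,dx_1\wedge\dots\wedge\widehat{dx_i}\wedge\dots\wedge dx_n,$ where the hat means that factor is omitted. Then
$\begin{aligned} d\omega&=\sum_{i=1}^n(-1)^{i-1}\frac{\partial F_i}{\partial x_i}\,dx_i\wedge dx_1\wedge\dots\wedge\widehat{dx_i}\wedge\dots\wedge dx_n\\ &=\sum_{i=1}^n\frac{\partial F_i}{\partial x_i}\,dx_1\wedge\dots\wedge dx_n\\ &=(\nabla\cdot\mathbf{F})\,dV_n, \end{aligned}$
while the restriction of $\omega$ to $\partial U$ is $\mathbf{F}\cdot\mathbf{n}\,dS_{n-1}$ . Substituting into $\int_U d\omega=\int_{\partial U}\omega$ gives the Divergence Theorem.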

Why Differentiability in Higher Dimensions is defined as it is?

Source: http://www.math.caltech.edu/~dinakar/08-Ma1cAnalytical-Notes-chap.2.pdf

The above paragraph describes nicely the intuitive meaning of the idea behind the definition of differentiability in higher dimensions! It is a very neat idea.

Borel measurability

A function $f$ is said to be Borel measurable provided its domain $E$ is a Borel set and for each $c$ , the set $\displaystyle \{x\in E\mid f(x)>c\}=f^{-1}(c,\infty)$ is a Borel set.

Borel set

A Borel set is any set in a topological space that can be formed from open sets through the operations of countable union, countable intersection, and relative complement.

How to Change Order of Integration

Check out this video by “patrickJMT”. His videos are excellent.

In this case, the reason why we can change the order of integration is Tonelli’s Theorem, since the integrand is non-negative.

Lebesgue’s Dominated Convergence Theorem (Continuous Version)

This is a basic but very useful corollary of the usual Lebesgue’s Dominated Convergence Theorem.

From what I see, it is basically the Sequential Criterion plus the usual Dominated Convergence Theorem.

From the book: Basic Partial Differential Equations

BV (Bounded Variation) functions

BV functions of one variable

Total variation

The total variation of a real-valued function $f$ , defined on an interval $[a,b]$ , is the quantity $\displaystyle V_a^b(f)=\sup_{P\in\mathcal{P}}\sum_{i=0}^{n_P-1}|f(x_{i+1})-f(x_i)|$ where the supremum is taken over the set $\mathcal{P}=\{P=\{x_0,\dots,x_{n_P}\}\mid P\ \text{is a partition of }[a,b]\}$ .

BV function

$f\in BV([a,b])\iff V_a^b(f)<\infty$ .
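
For a concrete feel, the sum inside the supremum can be evaluated on a fixed partition. The sketch below uses $f=\sin$ on $[0,2\pi]$ and a uniform partition, both hypothetical choices for illustration. Any single partition only gives a lower bound for $V_a^b(f)$ , but here the partition contains the turning points $\pi/2$ and $3\pi/2$ , so the sum already equals the total variation $4$ up to rounding.

```python
import math

def variation_sum(f, partition):
    """Sum of |f(x_{i+1}) - f(x_i)| over consecutive points of the partition."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal subintervals (n + 1 points)."""
    return [a + (b - a) * i / n for i in range(n + 1)]

v = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 1000))
print(v)  # approximately 4
```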

Jordan decomposition of a function

A real function $f$ is of bounded variation in $[a,b]$ iff it can be written as the difference $f=f_1-f_2$ of two non-decreasing functions on $[a,b]$ .

Markov’s Inequality: No more than 1/5 of the population can have more than 5 times the average income

One way to remember Markov’s Inequality (also called Chebyshev’s Inequality) is to remember this application: No more than 1/5 of the population can have more than 5 times the average income. For instance, if the average income of a certain country is USD $3000 per month, no more than 20% of the citizens can earn more than $15 000!

Brief Explanation

$\mu(\{x\in X: f(x)\geq\epsilon\})\leq\frac{1}{\epsilon}\int_X f\,d\mu$ is Markov’s Inequality, where $\mu$ is a probability measure and $f$ is the (non-negative) income. Taking $\epsilon=5A$ to be 5 times the average income $A=\int_X f\,d\mu$ , the left hand side represents the probability of having at least 5 times the average income. The right hand side is $\frac{1}{5A}\cdot A=\frac 15$ .

Chebyshev’s/Markov’s Inequality (Proof):
If $(X,\Sigma,\mu)$ is a measure space, $f$ is a non-negative measurable extended real-valued function, and $\epsilon>0$ , then $\displaystyle \mu(\{x\in X: f(x)\geq\epsilon\})\leq\frac{1}{\epsilon}\int_X f\,d\mu.$

Proof:
Define $\displaystyle s(x)=\begin{cases} \epsilon, &\text{if}\ f(x)\geq\epsilon\\ 0, &\text{if}\ f(x)<\epsilon. \end{cases}$
Then $0\leq s(x)\leq f(x)$ . Thus $\int_X f(x)\,d\mu\geq\int_X s(x)\,d\mu=\epsilon\mu(\{x\in X: f(x)\geq\epsilon\})$ . Dividing both sides by $\epsilon>0$ gives the result.
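
On a finite population with the uniform probability measure, the inequality becomes a statement about fractions and averages. A minimal sketch with made-up income figures (the numbers and the helper name `markov_check` are purely illustrative):

```python
def markov_check(values, threshold):
    """Return (fraction of values >= threshold, Markov bound mean / threshold)."""
    n = len(values)
    mean = sum(values) / n
    fraction = sum(1 for v in values if v >= threshold) / n
    return fraction, mean / threshold

# Hypothetical population: 90 people earn 1000, 10 people earn 50000.
incomes = [1000] * 90 + [50000] * 10
average = sum(incomes) / len(incomes)

# Threshold is 5 times the average, so the Markov bound is 1/5.
fraction, bound = markov_check(incomes, 5 * average)
print(average, fraction, bound)
```

In this example 10% of the population earns at least 5 times the average, which is within the 20% allowed by the bound.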

Riesz-Fischer Theorem Proof

Bartle’s proof of the Riesz-Fischer theorem ( $L^p$ spaces are complete) is quite nice. Royden uses a concept called “rapidly Cauchy”, which may complicate things unnecessarily.

Proof from Bartle’s Book Elements of Integration:

Fatou’s Lemma

Fatou’s Lemma
Let $(f_n)$ be a sequence of nonnegative measurable functions, then $\displaystyle\int\liminf_{n\to\infty}f_n\,d\mu\leq\liminf_{n\to\infty}\int f_n\,d\mu.$

A brilliant graphical way to remember Fatou’s Lemma (taken from the site http://math.stackexchange.com/questions/242920/what-are-some-tricks-to-remember-fatous-lemma).

The first two are $\int f_1$ and $\int f_2$ respectively, but even the smaller of these is larger than the area in the third picture, which is $\int\inf f_n$ .
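
A standard example where the inequality is strict: on $\mathbb{R}$ with Lebesgue measure, let $f_n=\chi_{[n,n+1]}$ be the indicator function of $[n,n+1]$ . For each fixed $x$ , $f_n(x)=0$ once $n>x$ , so $\liminf_{n\to\infty}f_n=0$ pointwise, while $\int f_n\,d\mu=1$ for every $n$ . Hence
$\displaystyle \int\liminf_{n\to\infty}f_n\,d\mu=0<1=\liminf_{n\to\infty}\int f_n\,d\mu.$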

Proof
