Sobolev spaces on Euclidean space

1 Introduction

The purpose of these notes is to outline the basic definitions and theorems for Sobolev spaces defined on open subsets of Euclidean space. There are already many good references on this topic; rather than duplicate them, the goal here is to give examples where possible to illustrate the theory, and to orient the reader towards the different approaches contained in the literature. In addition, there is an appendix containing some basic results from measure theory (again with examples and references to some of the literature on the subject).

There are a number of more advanced topics that have been left to future versions of these notes; for example, complete proofs of the embedding and compactness theorems (as well as examples where embeddings don’t exist), the chain rule and the behaviour of weak derivatives under co-ordinate transformations. Good references for this material include the book [1] by Adams (a classic on the subject), and Ziemer’s book [16]. Sobolev spaces on manifolds and their use in gauge theory would also be good topics for an expanded version of these notes. Future versions of these notes will also contain more examples.

Notation. First-order partial derivatives $\frac{\partial u}{\partial x_{i}}$ are denoted $\partial _{{x_{i}}}u$ or $\partial _{i}u$. Higher-order partial derivatives use the standard notation for multi-indices (see [4, Appendix A]): Given a multi-index $\alpha=(\alpha _{1},\ldots,\alpha _{n})\in\mathbb{Z}_{{\geq 0}}^{n}$ we write $|\alpha|=\alpha _{1}+\cdots+\alpha _{n}$ for the order of $\alpha$, and define the $\alpha^{{th}}$ order partial derivative by

 $D^{\alpha}u=\frac{\partial^{{|\alpha|}}u}{\partial x_{1}^{{\alpha _{1}}}\cdots\partial x_{n}^{{\alpha _{n}}}}=\partial _{{x_{1}}}^{{\alpha _{1}}}\cdots\partial _{{x_{n}}}^{{\alpha _{n}}}u.$
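
As a concrete illustration of this notation (the choice of $n=3$ and $\alpha=(2,0,1)$ is made here only for the sake of example), we have $|\alpha|=2+0+1=3$ and

 $D^{\alpha}u=\frac{\partial^{3}u}{\partial x_{1}^{2}\,\partial x_{3}}=\partial _{{x_{1}}}^{2}\partial _{{x_{3}}}u.$

For instance, if $u(x)=x_{1}^{3}x_{3}^{2}$ then $\partial _{{x_{3}}}u=2x_{1}^{3}x_{3}$, and so $D^{\alpha}u=12x_{1}x_{3}$.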

2 Definition of Sobolev spaces

This section contains all of the necessary definitions needed to define Sobolev spaces on open subsets of $\mathbb{R}^{n}$. In order to get a feel for distributional derivatives and Sobolev spaces, some basic examples are given throughout the section.

The approach taken in these notes is to follow the historical definition of Sobolev spaces. First, in this section, we define the distributional and weak derivatives, and then define Sobolev spaces in terms of these. Later on, in Section 3.2, we prove the Meyers-Serrin theorem, which says that Sobolev spaces are the completion of the space of smooth functions in the Sobolev norm.

2.1 Distributions and test functions

Let $\Omega\subseteq\mathbb{R}^{n}$ be open and non-empty, and let $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ denote the space of smooth complex-valued functions with compact support in $\Omega$.

Definition 2.1.

The space of test functions on $\Omega$, denoted $\mathcal{D}(\Omega)$, is the locally convex topological vector space (more precisely the LF-space) consisting of all the functions in $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$, with the following notion of convergence: A sequence $(\phi _{m})_{{m\in\mathbb{N}}}\subset\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ converges in $\mathcal{D}(\Omega)$ to the function $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ if and only if there is some fixed compact set $K$ such that the support of $\phi _{m}-\phi$ is in $K$ for all $m$, and that $D^{\alpha}\phi _{m}\rightarrow D^{\alpha}\phi$ uniformly for each $\alpha$.
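
To see why the condition on the supports matters, consider the following sketch (the choice $\Omega=\mathbb{R}$ and the particular sequence are assumptions made here purely for illustration). Fix a non-zero $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\mathbb{R})$ and define $\phi _{m}(x)=\frac{1}{m}\phi(x-m)$. For each $\alpha$ we have

 $\sup _{{x\in\mathbb{R}}}\left|D^{\alpha}\phi _{m}(x)\right|=\frac{1}{m}\sup _{{x\in\mathbb{R}}}\left|D^{\alpha}\phi(x)\right|\rightarrow 0\quad\text{as }m\rightarrow\infty,$

so every derivative of $\phi _{m}$ converges uniformly to zero. However, $\supp(\phi _{m})$ is the translate of $\supp(\phi)$ by $m$, so no fixed compact set $K$ contains the supports of all the $\phi _{m}$, and therefore $(\phi _{m})_{{m\in\mathbb{N}}}$ does not converge to zero in $\mathcal{D}(\mathbb{R})$.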

Remark 2.2.
1. The notation $\mathcal{D}(\Omega)$ is used to emphasise the topology on $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ described above.

2. Note that the definition does not require the convergence to be uniform in $\alpha$: the rate at which $D^{\alpha}\phi _{m}\rightarrow D^{\alpha}\phi$ uniformly may depend on $\alpha$.

3. To see that $\mathcal{D}(\Omega)$ is indeed a locally convex topological vector space, we construct a family of seminorms which induces the topology on $\mathcal{D}(\Omega)$. To this end, denote by $E_{\textup{cpt}}(\Omega)$ the set of compact exhaustions of $\Omega$, that is, the set of all families $K=(K_{i})_{{i\in\mathbb{N}}}$ such that $K_{i}\subset\Omega$ is compact, $\bigcup _{{i\in\mathbb{N}}}K_{i}=\Omega$ and $K_{i}\subset K_{{i+1}}^{\circ}$ for all $i\in\mathbb{N}$. For every compact exhaustion $K$ and every pair $M=(m_{i})_{{i\in\mathbb{N}}}$ and $N=(n_{i})_{{i\in\mathbb{N}}}$ of sequences of natural numbers, denote by $p_{{K,M,N}}:\mathcal{D}(\Omega)\rightarrow\mathbb{R}_{{\geq 0}}$ the map defined by

 $p_{{K,M,N}}(\phi)=\sum _{{i=0}}^{\infty}\>\sup _{{x\in K_{{i+1}}\setminus K_{i}^{\circ}}}\>\sup _{{0\leq|\alpha|\leq m_{i}}}\> n_{i}\,|D^{\alpha}\phi(x)|,\quad\phi\in\mathcal{D}(\Omega).$

Note that the sum in this formula has only finitely many non-zero terms, and is therefore always finite, since the support of $\phi$ is compact.

It is straightforward to check that $p_{{K,M,N}}$ is a seminorm on $\mathcal{D}(\Omega)$ and that the family $\big(p_{{K,M,N}}\big)_{{K\in E_{\textup{cpt}}(\Omega),\> M,N\in\mathbb{N}^{\mathbb{N}}}}$ defines a locally convex topology on $\mathcal{D}(\Omega)$ that exactly recaptures the convergence in $\mathcal{D}(\Omega)$ as defined above (see [14, Chapter 13] for more details on the LF-space structure of $\mathcal{D}(\Omega)$). We present this explicit description of a family of seminorms describing the locally convex topology on $\mathcal{D}(\Omega)$ here, since we do not know of a reference to this in the literature.

Definition 2.3.

A distribution is a continuous complex-valued linear functional on the space of test functions $\mathcal{D}(\Omega)$. The space of distributions is the dual

 $\mathcal{D}(\Omega)^{*}=\left\{ T:\mathcal{D}(\Omega)\rightarrow\mathbb{C}\,\mid\, T\,\,\text{linear and continuous}\right\}.$
Remark 2.4.

In the above definition, linearity simply means that if $T\in\mathcal{D}(\Omega)^{*}$ and $\phi,\psi\in\mathcal{D}(\Omega)$, then

 $T(\lambda\phi+\mu\psi)=\lambda T(\phi)+\mu T(\psi)\quad\text{for all }\lambda,\mu\in\mathbb{C}.$

Continuity means that whenever a sequence $(\phi _{m})_{{m\in\mathbb{N}}}\subset\mathcal{D}(\Omega)$ converges in $\mathcal{D}(\Omega)$ (in the sense of Definition 2.1) to $\phi\in\mathcal{D}(\Omega)$, then $T(\phi _{m})\rightarrow T(\phi)$ as a sequence in $\mathbb{C}$.

The space $\mathcal{D}(\Omega)^{*}$ is also equipped with a notion of convergence, defined as follows.

Definition 2.5.

A sequence $(T_{n})_{{n\in\mathbb{N}}}\subset\mathcal{D}(\Omega)^{*}$ converges in $\mathcal{D}(\Omega)^{*}$ to $T\in\mathcal{D}(\Omega)^{*}$ if for every $\phi\in\mathcal{D}(\Omega)$ we have $T_{n}(\phi)\rightarrow T(\phi)$ in $\mathbb{C}$.

Remark 2.6.

This is the usual notion of convergence in the $\text{weak}^{*}$ topology on a dual space.

The following gives some examples of distributions (recall the definition of $L_{{loc}}^{p}(\Omega)$ from Appendix A).

Example 2.7.
1. Given $x\in\Omega$, the delta functional is the distribution

 $\delta _{x}(\phi)=\phi(x).$

Clearly this is linear. To see that it is continuous, note that if $\phi _{m}\rightarrow\phi$ in $\mathcal{D}(\Omega)$, then $\phi _{m}(x)\rightarrow\phi(x)$, and so $\delta _{x}(\phi _{m})\rightarrow\delta _{x}(\phi)$.

2. The functional

 $T(\phi)=\int _{\Omega}\phi(x)\, dx$

is a distribution. (Note that since $\phi$ is continuous with compact support, it is also integrable.) Again, this is clearly linear. It is also continuous, since if $\phi _{m}\rightarrow\phi$ in $\mathcal{D}(\Omega)$, then, by definition, there exists a fixed compact set $K$ such that $\supp(\phi _{m}-\phi)\subseteq K$ for all $m$. Therefore

 $T(\phi)-T(\phi _{m})=\int _{\Omega}(\phi(x)-\phi _{m}(x))\, dx=\int _{K}(\phi(x)-\phi _{m}(x))\, dx,$

and since $\phi _{m}(x)\rightarrow\phi(x)$ uniformly on the compact set $K$, then $\int _{K}(\phi(x)-\phi _{m}(x))\, dx\rightarrow 0$. Note that it is essential that $K$ has finite measure for this argument to work.

3. Given $g\in L_{{loc}}^{1}(\Omega)$, let $T_{g}$ be the functional

 $T_{g}(\phi)=\int _{\Omega}\phi(x)g(x)\, dx.$ (2.1)

Note that Hölder’s inequality shows that $\left|T_{g}(\phi)\right|\leq\|\phi\| _{{L^{\infty}(K)}}\| g\| _{{L^{1}(K)}}$, where $K$ denotes the (compact) support of $\phi\in\mathcal{D}(\Omega)$. Therefore $T_{g}(\phi)$ is always finite since $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}$ implies that $\|\phi\| _{{L^{\infty}}}$ is finite, and so $T_{g}(\phi)\in\mathbb{C}$ for all $\phi\in\mathcal{D}(\Omega)$. As for the previous examples, clearly $T_{g}$ is linear, and it only remains to show that it is continuous. Note that if $\phi _{m}\rightarrow\phi$, then there is a fixed compact set $K$ with $\supp(\phi _{m}-\phi)\subseteq K$, and so

 $T_{g}(\phi)-T_{g}(\phi _{m})=\int _{K}(\phi(x)-\phi _{m}(x))g\, dx\rightarrow 0$

since $g\in L_{{loc}}^{1}(\Omega)$ and $\phi _{m}(x)\rightarrow\phi(x)$ uniformly on $K$. Therefore $T_{g}\in\mathcal{D}(\Omega)^{*}$.

We will revisit this example later, since it appears in the definition of weak derivative in Section 2.2.

The last example above is an important one: it shows that there is a linear map $L_{{loc}}^{1}(\Omega)\rightarrow\mathcal{D}(\Omega)^{*}$ given by $g\mapsto T_{g}$ (recall that all $L^{p}$ and $L_{{loc}}^{p}$ spaces are defined to be equivalence classes of functions that are equal almost everywhere, and note that this map is well-defined on equivalence classes of functions in $L_{{loc}}^{1}(\Omega)$, since $f=g$ a.e. implies that $\int _{\Omega}f\phi\, dx=\int _{\Omega}g\phi\, dx$ for any test function $\phi$). In fact, since Hölder’s inequality shows that there is an inclusion $L_{{loc}}^{p}(\Omega)\hookrightarrow L_{{loc}}^{1}(\Omega)$ for all $1<p\leq\infty$, there is also a map $L_{{loc}}^{p}(\Omega)\rightarrow\mathcal{D}(\Omega)^{*}$. The next theorem says that this map is injective.
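
Before stating the theorem, it may help to make the inclusion $L_{{loc}}^{p}(\Omega)\hookrightarrow L_{{loc}}^{1}(\Omega)$ explicit. As a sketch, let $K\subset\Omega$ be compact, let $1<p<\infty$ with conjugate exponent $q$ (so that $\frac{1}{p}+\frac{1}{q}=1$), and let $g\in L_{{loc}}^{p}(\Omega)$. Applying Hölder’s inequality to the product $|g|\cdot 1$ on $K$ gives

 $\int _{K}|g(x)|\, dx\leq\| g\| _{{L^{p}(K)}}\| 1\| _{{L^{q}(K)}}=|K|^{{1-\frac{1}{p}}}\| g\| _{{L^{p}(K)}}<\infty,$

where $|K|$ denotes the Lebesgue measure of $K$. The case $p=\infty$ is similar, with the bound $|K|\,\| g\| _{{L^{\infty}(K)}}$.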

Theorem 2.8.

Let $\Omega\subset\mathbb{R}^{n}$ be open, and let $f$ and $g$ be functions in $L_{{loc}}^{1}(\Omega)$. Suppose that the distributions $T_{f}$ and $T_{g}$ are equal, i.e. $T_{f}(\phi)=T_{g}(\phi)$ for all test functions $\phi\in\mathcal{D}(\Omega)$. Then $f=g$ a.e. in $\Omega$.

Proof.

This is proved in [8, Theorem 6.5] using convolutions; for variety, we give a slightly different proof here. Firstly note that it is sufficient to prove the result for real-valued functions $f$ and $g$, since we can take real and imaginary parts. Suppose, for contradiction, that there exists a set $K$ whose Lebesgue measure is finite and non-zero, and which satisfies $f(x)\neq g(x)$ for all $x\in K$. Since the Lebesgue measure is Borel regular (see Lemmas B.20 and B.21), we can assume without loss of generality that $K$ is compact. Define $K_{+}\subseteq K$ to be the subset on which $f(x)>g(x)$, and again note that without loss of generality (interchanging $f$ and $g$ if necessary) we can assume that $K_{+}$ is compact with non-zero measure. Define the constant

 $C:=\int _{{K_{+}}}\left(f(x)-g(x)\right)\, dx>0.$

Now let $\{ V_{n}\} _{{n\in\mathbb{N}}}$ be a collection of open sets such that

1. $K_{+}\subset V_{n}$ and $V_{{n+1}}\subset V_{n}$ for each $n\in\mathbb{N}$, and

2. the Lebesgue measure of $V_{n}\setminus K_{+}$ satisfies $|V_{n}\setminus K_{+}|<\frac{1}{n}$.

The existence of each $V_{n}$ is guaranteed since the Lebesgue measure is Borel regular. Now use Urysohn’s lemma (see Appendix A.3) to construct a smooth non-negative function $\phi _{n}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $0\leq\phi _{n}(x)\leq 1$ for all $x\in\Omega$, $\phi _{n}(x)=1$ for all $x\in K_{+}$, and $\phi _{n}(x)=0$ for all $x\in\Omega\setminus V_{n}$. Therefore

 $\displaystyle\begin{split}\int _{\Omega}(f-g)\phi _{n}\, dx&=\int _{{K_{+}}}(f-g)\phi _{n}\, dx+\int _{{V_{n}\setminus K_{+}}}(f-g)\phi _{n}\, dx+\int _{{\Omega\setminus V_{n}}}(f-g)\phi _{n}\, dx\\ &=\int _{{K_{+}}}(f-g)\phi _{n}\, dx+\int _{{V_{n}\setminus K_{+}}}(f-g)\phi _{n}\, dx\\ &=C+\int _{{V_{n}\setminus K_{+}}}(f-g)\phi _{n}\, dx.\end{split}$

The last term in the above equation satisfies the estimate

 $\displaystyle\begin{split}\left|\int _{{V_{n}\setminus K_{+}}}(f-g)\phi _{n}\, dx\right|&\leq\int _{{V_{n}\setminus K_{+}}}|f-g|\phi _{n}\, dx\\ &\leq\int _{{V_{n}\setminus K_{+}}}|f-g|\, dx,\end{split}$

and, since $\left|V_{n}\setminus K_{+}\right|\rightarrow 0$ as $n\rightarrow\infty$, then

 $\int _{{V_{n}\setminus K_{+}}}|f-g|\, dx\rightarrow 0\quad\text{as }n\rightarrow\infty,$

since the integral of a fixed measurable function is an absolutely continuous set function (see for example [15, Corollary 10.41]).

Therefore there exists an $n$ such that

 $\int _{\Omega}(f-g)\phi _{n}\, dx=C+\int _{{V_{n}\setminus K_{+}}}(f-g)\phi _{n}\, dx>0,$

which contradicts the assumption that $T_{f}(\phi _{n})=T_{g}(\phi _{n})$. Therefore $f=g$ almost everywhere. ∎

A consequence of this theorem is that the distribution $T_{f}$ associated to a function $f\in L_{{loc}}^{1}(\Omega)$ uniquely determines an equivalence class in $L_{{loc}}^{1}(\Omega)$. Therefore, the following definition makes sense.

Definition 2.9.

A distribution $T\in\mathcal{D}(\Omega)^{*}$ represents the function $f\in L_{{loc}}^{1}(\Omega)$ if

 $T(\phi)=\int _{\Omega}f(x)\phi(x)\, dx=:T_{f}(\phi)$

for all test functions $\phi\in\mathcal{D}(\Omega)$. A function $f\in L_{{loc}}^{1}(\Omega)$ is represented by the distribution $T_{f}\in\mathcal{D}(\Omega)^{*}$.

Theorem 2.8 shows that each distribution can represent at most one element of $L_{{loc}}^{1}(\Omega)$, i.e. the map $L_{{loc}}^{1}(\Omega)\rightarrow\mathcal{D}(\Omega)^{*}$ given by $g\mapsto T_{g}$ is injective. The following example shows that not all distributions represent functions in $L_{{loc}}^{1}(\Omega)$, i.e. the map $L_{{loc}}^{1}(\Omega)\rightarrow\mathcal{D}(\Omega)^{*}$ given by $g\mapsto T_{g}$ is not surjective.

Example 2.10.

Given any $x\in\Omega$, let $\delta _{x}\in\mathcal{D}(\Omega)^{*}$ be the delta functional defined in Example 2.7. We claim that this does not represent any function in $L_{{loc}}^{1}(\Omega)$. To see this, suppose for contradiction that $\displaystyle{\delta _{x}(\phi)=\int _{\Omega}f(y)\phi(y)\, dy}$ for some $f\in L_{{loc}}^{1}(\Omega)$ and every $\phi\in\mathcal{D}(\Omega)$. Fix $n_{0}\in\mathbb{N}$ such that $\overline{B(x,\frac{1}{n_{0}})}\subset\Omega$, and consider a sequence of bump functions $\phi _{n}\in\mathcal{D}(\Omega)$ such that for all $n\geq n_{0}$ we have

1. $\phi _{n}(x)=1$,

2. $\supp(\phi _{n})\subseteq\overline{B(x,\frac{1}{n})}$, and

3. $0\leq\phi _{n}(y)\leq 1$ for all $y\in\Omega$.

Then $\phi _{n}(y)\rightarrow 0$ as $n\rightarrow\infty$ for every $y\neq x$, and $\left|f(y)\phi _{n}(y)\right|\leq\left|f(y)\right|$, where each $f\phi _{n}$ has support in the compact set $\overline{B(x,\frac{1}{n_{0}})}$, on which $f$ is integrable. Dominated convergence therefore shows that $\displaystyle{\int _{\Omega}f(y)\phi _{n}(y)\, dy\rightarrow 0}$ as $n\rightarrow\infty$, which contradicts $\delta _{x}(\phi _{n})=\phi _{n}(x)=1$ for all $n\geq n_{0}$.

Therefore the delta functional is an example of a distribution that cannot be represented by a function in $L_{{loc}}^{1}(\Omega)$. It can, however, be represented by a measure (see the measure in Example 3.28), and in Section 3.4 we will show that positive distributions can always be represented by measures (see Theorem 3.35).
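
Although the delta functional is not represented by a locally integrable function, it is a limit, in the sense of Definition 2.5, of distributions that are. As a sketch (taking $\Omega=\mathbb{R}$ and the point $0$, and writing $\chi _{{[0,1/m]}}$ for the characteristic function of the interval $[0,\frac{1}{m}]$; these choices are assumptions made here for illustration), let $g_{m}=m\,\chi _{{[0,1/m]}}\in L_{{loc}}^{1}(\mathbb{R})$. Then for any $\phi\in\mathcal{D}(\mathbb{R})$

 $\left|T_{{g_{m}}}(\phi)-\delta _{0}(\phi)\right|=\left|m\int _{0}^{{1/m}}\left(\phi(y)-\phi(0)\right)\, dy\right|\leq\sup _{{0\leq y\leq 1/m}}\left|\phi(y)-\phi(0)\right|\rightarrow 0\quad\text{as }m\rightarrow\infty,$

since $\phi$ is continuous at $0$. Therefore $T_{{g_{m}}}\rightarrow\delta _{0}$ in $\mathcal{D}(\mathbb{R})^{*}$.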

2.2 Distributional derivatives and Sobolev spaces

Before defining Sobolev spaces, first we have to define the notion of the derivative of a distribution.

Definition 2.11.

Let $\Omega\subseteq\mathbb{R}^{n}$ be open, let $T\in\mathcal{D}(\Omega)^{*}$, and let $\alpha\in\mathbb{Z}_{{\geq 0}}^{n}$. The $\alpha^{{th}}$ distributional derivative of $T$ is the distribution $D^{\alpha}T$ defined by

 $(D^{\alpha}T)(\phi):=(-1)^{{|\alpha|}}T(D^{\alpha}\phi)$

for all test functions $\phi\in\mathcal{D}(\Omega)$. The distributional gradient, denoted $\nabla T$, is the $n$-tuple of distributions

 $\nabla T=\left(\partial _{1}T,\ldots,\partial _{n}T\right).$

If $T$ and $D^{\alpha}T$ both represent functions in $L_{{loc}}^{1}(\Omega)$ (i.e. $T=T_{f}$ and $D^{\alpha}T=T_{g}$ for some $f,g\in L_{{loc}}^{1}(\Omega)$) then we say that $g$ is a weak derivative of $f$, and write $g=D^{\alpha}f$. In this case we say that the weak derivative of $f$ exists.

Remark 2.12.
1. Since the weak derivative is defined by the relation

 $\int _{\Omega}f(x)D^{\alpha}\phi(x)\, dx=(-1)^{{|\alpha|}}\int _{\Omega}g(x)\phi(x)\, dx$

then it is only defined up to equivalence almost everywhere.

2. The distributional derivative always exists for any multi-index $\alpha$, since the definition only involves differentiating test functions, which are smooth. Since partial derivatives of smooth functions commute, then distributional derivatives also commute, i.e.

 $\partial _{i}\partial _{j}T=\partial _{j}\partial _{i}T.$
3. As we will see in the examples below, the weak derivative does not always exist, and, in fact, may fail to exist for every $\alpha$ with $|\alpha|\geq 1$ (in Example 2.19 we show that the step function is an example of such a function).
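
For a concrete instance of the distributional derivative, take $\Omega=\mathbb{R}$ and the delta functional $\delta _{0}$ of Example 2.7 (a choice made here only for illustration). Applying Definition 2.11 gives

 $(\partial _{x}\delta _{0})(\phi)=-\delta _{0}(\partial _{x}\phi)=-\partial _{x}\phi(0),$

and more generally, for $x\in\Omega\subseteq\mathbb{R}^{n}$ and any multi-index $\alpha$,

 $(D^{\alpha}\delta _{x})(\phi)=(-1)^{{|\alpha|}}D^{\alpha}\phi(x).$

Each of these is again a distribution, even though $\delta _{x}$ itself is not represented by a function.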

The following lemma shows that the weak derivative extends the notion of classical derivative of differentiable functions. It says that the distributional derivative of the distribution associated to a differentiable function $g$ is the distribution associated to the classical derivative of $g$.

Lemma 2.13.

Let $g\in C^{{|\alpha|}}(\Omega)$. Then for all $\phi\in\mathcal{D}(\Omega)$ we have

 $D^{\alpha}T_{g}(\phi)=(-1)^{{|\alpha|}}\int _{\Omega}(D^{\alpha}\phi(x))g(x)\, dx=\int _{\Omega}\phi(x)(D^{\alpha}g(x))\, dx=T_{{D^{\alpha}g}}(\phi)$ (2.2)
Proof.

The proof simply involves applying the definitions and the integration by parts formula from Section A.2. ∎
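
As a sketch of this argument in the simplest case (assuming $n=1$, $\Omega=(a,b)$ and $|\alpha|=1$, which are choices made here for illustration), let $g\in C^{1}((a,b))$ and $\phi\in\mathcal{D}((a,b))$, and choose $a<c<d<b$ with $\supp(\phi)\subset(c,d)$. Then

 $\partial _{x}T_{g}(\phi)=-\int _{c}^{d}g(x)\,\partial _{x}\phi(x)\, dx=-\left[g(x)\phi(x)\right]_{c}^{d}+\int _{c}^{d}\partial _{x}g(x)\,\phi(x)\, dx=T_{{\partial _{x}g}}(\phi),$

where the boundary term vanishes because $\phi(c)=\phi(d)=0$. In general, each of the $|\alpha|$ integrations by parts contributes a factor of $-1$, which cancels the factor $(-1)^{{|\alpha|}}$ in (2.2).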

Remark 2.14.
1. It is important to emphasise that $D^{\alpha}$ is used to denote both the distributional derivative and the classical derivative in the statement of the lemma: $D^{\alpha}T_{g}$ is the distributional derivative of the distribution $T_{g}$ associated to the function $g$, and $T_{{D^{\alpha}g}}$ is the distribution associated to the classical derivative $D^{\alpha}g$.

2. It is an important exercise to think through the precise meaning of all of the statements above, to understand the distinction between a weak derivative and a distributional derivative, and to understand the meaning of each term in (2.2).

The next lemma shows that functions that are equal almost everywhere have the same distributional derivatives. As a consequence, when defining Sobolev spaces in Definition 2.16, we can define them as subsets of the $L^{p}$ and $L_{{loc}}^{p}$ spaces (i.e. we consider equivalence classes of functions that are equal almost everywhere).

Lemma 2.15.

If $f=g$ almost everywhere, then $D^{\alpha}T_{f}=D^{\alpha}T_{g}$ as distributions.

Proof.

The proof is another straightforward application of the definition of distributional derivative. For any test function $\phi\in\mathcal{D}(\Omega)$ we have

 $\displaystyle\begin{split}D^{\alpha}T_{f}(\phi)&=(-1)^{{|\alpha|}}T_{f}(D^{\alpha}\phi)\\ &=(-1)^{{|\alpha|}}\int _{\Omega}f(x)\, D^{\alpha}\phi(x)\, dx\\ &=(-1)^{{|\alpha|}}\int _{\Omega}g(x)\, D^{\alpha}\phi(x)\, dx\quad\text{(since }f=g\text{ a.e.)}\\ &=(-1)^{{|\alpha|}}T_{g}(D^{\alpha}\phi)\\ &=D^{\alpha}T_{g}(\phi).\end{split}$

Therefore, $D^{\alpha}T_{f}(\phi)=D^{\alpha}T_{g}(\phi)$ for all $\phi\in\mathcal{D}(\Omega)$, and so $D^{\alpha}T_{f}=D^{\alpha}T_{g}$ as elements of $\mathcal{D}(\Omega)^{*}$. ∎

Now that we have developed the necessary machinery, we are ready to define Sobolev spaces.

Definition 2.16.

The Sobolev space $W^{{k,p}}(\Omega)$ is the space of equivalence classes of all functions $f\in L^{p}(\Omega)$ such that the weak derivative $D^{\alpha}f$ exists and is in $L^{p}(\Omega)$ for all $\alpha$ such that $|\alpha|\leq k$.

The Sobolev space $W_{{loc}}^{{k,p}}(\Omega)$ is the space of all functions $f\in L_{{loc}}^{p}(\Omega)$ such that the weak derivative $D^{\alpha}f$ exists and is in $L_{{loc}}^{p}(\Omega)$ for all $\alpha$ such that $|\alpha|\leq k$.

The space $W^{{k,p}}(\Omega)$ has a norm given by

 $\left\| f\right\| _{{W^{{k,p}}(\Omega)}}=\sum _{{j=0}}^{k}\left(\sum _{{\alpha:|\alpha|=j}}\| D^{\alpha}f\| _{{L^{p}(\Omega)}}\right),$

and we define $W_{0}^{{k,p}}(\Omega)$ to be the closure of the space $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ in the topology induced by this norm.

For a compact subset $K\subset\Omega$ we define the norm

 $\left\| f\right\| _{{W^{{k,p}}(K)}}=\sum _{{j=0}}^{k}\left(\sum _{{\alpha:|\alpha|=j}}\| D^{\alpha}f\| _{{L^{p}(K)}}\right),$

where the weak derivatives $D^{\alpha}f$ are defined on $\Omega$.
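
As a worked illustration of this norm (the function and domain are chosen here purely as an example), let $\Omega=(0,1)\subset\mathbb{R}$, $k=1$, $1\leq p<\infty$ and $f(x)=x$. Since $f$ is smooth, Lemma 2.13 shows that its weak derivative is the classical derivative $\partial _{x}f=1$, and so

 $\| f\| _{{W^{{1,p}}(\Omega)}}=\left(\int _{0}^{1}x^{p}\, dx\right)^{{1/p}}+\left(\int _{0}^{1}1\, dx\right)^{{1/p}}=(p+1)^{{-1/p}}+1.$

For example, when $p=2$ this gives $\| f\| _{{W^{{1,2}}(\Omega)}}=1+\frac{1}{\sqrt{3}}$.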

Lemma 2.17.

The norm $\|\cdot\| _{{W^{{k,p}}(\Omega)}}$ gives $W^{{k,p}}(\Omega)$ the structure of a normed linear space for $1\leq p\leq\infty$.

Proof.

Recall that we have to check

1. The space has a unique element of zero norm, i.e. $f=0$ if and only if $\| f\| _{{W^{{k,p}}(\Omega)}}=0$.

2. The norm is absolutely homogeneous with respect to scalar multiplication, i.e. $\| cf\| _{{W^{{k,p}}(\Omega)}}=|c|\| f\| _{{W^{{k,p}}(\Omega)}}$ for all $c\in\mathbb{C}$ and $f\in W^{{k,p}}(\Omega)$.

3. The triangle inequality holds, i.e.

 $\| f+g\| _{{W^{{k,p}}(\Omega)}}\leq\| f\| _{{W^{{k,p}}(\Omega)}}+\| g\| _{{W^{{k,p}}(\Omega)}}$

for all $f,g\in W^{{k,p}}(\Omega)$.

It is easy to check (1): the result is true for $L^{p}(\Omega)$, we have $W^{{k,p}}(\Omega)\subseteq L^{p}(\Omega)$, and $\| f\| _{{L^{p}(\Omega)}}\leq\| f\| _{{W^{{k,p}}(\Omega)}}$ for all $f\in W^{{k,p}}(\Omega)$, so $\| f\| _{{W^{{k,p}}(\Omega)}}=0$ implies $f=0$.

The weak derivative commutes with scalar multiplication, i.e. $D^{\alpha}(cf)=cD^{\alpha}f$ for all $c\in\mathbb{C}$, and so we also have $\| D^{\alpha}(cf)\| _{{L^{p}(\Omega)}}=|c|\| D^{\alpha}f\| _{{L^{p}(\Omega)}}$. Therefore (2) is satisfied by definition of the Sobolev norm.

The triangle inequality for $W^{{k,p}}(\Omega)$ follows from the definition of the Sobolev norm and the triangle inequality for $L^{p}(\Omega)$ (which is Minkowski’s inequality, see for example [15, Theorem 8.10]). ∎

It is worth recalling that $W^{{k,p}}(\Omega)$ can never be a normed linear space for $0<p<1$, since the triangle inequality fails in this case. See for example the remark on p130 of [15], and also [15, Theorem 8.16]. For more discussion of $L^{p}$ spaces for $0<p<1$, see [13, pp35-36].

Remark 2.18.

We will see later, in Section 3.1, that $W^{{k,p}}(\Omega)$ is a Banach space with this norm.

It is worth studying some examples of distributional and weak derivatives. The first example is the step function, for which the distributional derivative is the delta functional from Example 2.7. This is an important example, since it shows that the step function is not in any Sobolev space $W^{{k,p}}(\Omega)$ or $W_{{loc}}^{{k,p}}(\Omega)$ for $k\geq 1$, because the delta functional cannot be represented by a function.

Example 2.19.

Let $g:\mathbb{R}\rightarrow\mathbb{R}$ be the step function

 $g(x)=\left\{\begin{matrix}1&x\geq 0,\\ 0&x<0.\end{matrix}\right.$

Given a test function $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\mathbb{R})$, consider the integral

 $\int _{\mathbb{R}}g(x)\partial _{x}\phi(x)\, dx=\int _{0}^{\infty}\partial _{x}\phi(x)\, dx=\left[\phi(x)\right]_{0}^{\infty}=-\phi(0).$

(Recall that $\phi$ vanishes at infinity since it has compact support.) Therefore the distributional derivative of $T_{g}$ is the linear functional $\partial _{x}T_{{g}}\in\mathcal{D}(\mathbb{R})^{*}$ given by $\partial _{x}T_{g}(\phi)=-T_{g}(\partial _{x}\phi)=\phi(0)$, i.e. $\partial _{x}T_{{g}}$ is the delta functional $\delta _{0}$. Example 2.10 shows that this cannot be represented by a function, and therefore the weak derivative of the step function does not exist, so the step function is not in $W^{{1,p}}(\mathbb{R})$ or $W_{{loc}}^{{1,p}}(\mathbb{R})$ for any $p$.

Example 2.20.

Let $f(x)=|x|$. To compute the weak derivative we first consider

 $\displaystyle\begin{split}\int _{\mathbb{R}}f(x)\partial _{1}\phi(x)\, dx&=\int _{{-\infty}}^{0}(-x)\partial _{1}\phi(x)\, dx+\int _{0}^{\infty}x\partial _{1}\phi(x)\, dx\\ &=\left[-x\phi(x)\right]_{{-\infty}}^{0}+\int _{{-\infty}}^{0}\phi(x)\, dx+\left[x\phi(x)\right]_{0}^{\infty}-\int _{0}^{\infty}\phi(x)\, dx\\ &=\int _{{-\infty}}^{0}\phi(x)\, dx-\int _{0}^{\infty}\phi(x)\, dx.\end{split}$ (2.3)

Let

 $g(x)=\left\{\begin{matrix}1&x\geq 0\\ -1&x<0\end{matrix}\right.,$

and note that the previous calculation (2.3) shows that $\int _{\mathbb{R}}f(x)\partial _{1}\phi(x)\, dx=-\int _{\mathbb{R}}g(x)\phi(x)\, dx$. Therefore the weak derivative of $f(x)=|x|$ is the step function $g(x)$.
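
Continuing with this example (the following computation is included here as a further illustration), note that $f$ and $g$ are both locally bounded, so $f\in W_{{loc}}^{{1,p}}(\mathbb{R})$ for every $1\leq p\leq\infty$. However, differentiating once more in the sense of distributions gives

 $\partial _{1}T_{g}(\phi)=-\int _{\mathbb{R}}g(x)\partial _{1}\phi(x)\, dx=\int _{{-\infty}}^{0}\partial _{1}\phi(x)\, dx-\int _{0}^{\infty}\partial _{1}\phi(x)\, dx=\phi(0)+\phi(0)=2\phi(0),$

so that $\partial _{1}^{2}T_{f}=2\delta _{0}$. By Example 2.10 this is not represented by a function in $L_{{loc}}^{1}(\mathbb{R})$, and therefore $f(x)=|x|$ is not in $W_{{loc}}^{{2,p}}(\mathbb{R})$ for any $p$.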

The next example generalises the method of the previous example to locally Lipschitz functions.

Example 2.21.

In this example we show that if $f$ is locally Lipschitz on $\Omega$ then $f\in W_{{loc}}^{{1,\infty}}(\Omega)$. Rademacher’s theorem shows that the partial derivatives of $f$ exist almost everywhere (see Corollary A.17), and the goal of this example is to show that these partial derivatives are equal almost everywhere to the weak derivative of $f$ in each co-ordinate direction.

For each compact set $K$, let $M_{K}$ be the associated Lipschitz constant, i.e. for all $x,y\in K$ we have

 $\left|f(x)-f(y)\right|\leq M_{K}\left|x-y\right|.$ (2.4)

(Note that this differs slightly from Definition A.14; however, we can easily extend this to compact sets $K$ by taking an open cover of $K$.)

Equation (2.4) implies that $f\in L_{{loc}}^{\infty}(\Omega)$. Therefore the integral

 $\int _{\Omega}f(x)\phi(x)\, dx$

is defined for any $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$. To show that $f$ has a weak derivative, we need to show that there exists $g$ such that

 $\int _{\Omega}f(x)\partial _{j}\phi(x)\, dx=-\int _{\Omega}g(x)\phi(x)\, dx$

for all test functions $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$.

Let $K=\supp(\phi)$. Since $K$ is compact then there exists $\varepsilon>0$ such that if $|h|<\varepsilon$ then $x+he_{j}\in\Omega$ for all $x\in K$, and so $\phi(x+he_{j})$ is well-defined for small values of $|h|$. Therefore

 $\int _{\Omega}f(x)\partial _{j}\phi(x)\, dx=\int _{\Omega}f(x)\lim _{{h\rightarrow 0}}\frac{\phi(x+he_{j})-\phi(x)}{h}\, dx.$

The next step involves using dominated convergence to interchange the limit and the integral. Since this is a standard technique that is used in many examples, we include all of the details here. First note that since $\phi$ is smooth with compact support, it is uniformly Lipschitz, and so the absolute value of the difference quotients $\left|\frac{\phi(x+he_{j})-\phi(x)}{h}\right|$ is uniformly bounded by a constant (call it $\tilde{M}$) for $|h|<\varepsilon$. Moreover, for $|h|\leq\frac{1}{2}\varepsilon$ the difference quotients are all supported in the fixed compact set $K^{\prime}=\{ x+te_{j}\,\mid\, x\in K,\,|t|\leq\frac{1}{2}\varepsilon\}\subset\Omega$. Since $f\in L_{{loc}}^{1}(\Omega)$, then, writing $\chi _{{K^{\prime}}}$ for the characteristic function of $K^{\prime}$, we have

 $\left|f(x)\frac{\phi(x+he_{j})-\phi(x)}{h}\right|\leq\tilde{M}\,|f(x)|\,\chi _{{K^{\prime}}}(x)\in L^{1}(\Omega),$

and so we can use dominated convergence to write

 $\int _{\Omega}f(x)\lim _{{h\rightarrow 0}}\frac{\phi(x+he_{j})-\phi(x)}{h}\, dx=\lim _{{h\rightarrow 0}}\int _{\Omega}f(x)\frac{\phi(x+he_{j})-\phi(x)}{h}\, dx.$

Changing variables, and recalling that the upper bound on $h$ was chosen so that $x-he_{j}\in\Omega$ for all $x\in\supp(\phi)$, gives us

 $\displaystyle\begin{split}\lim _{{h\rightarrow 0}}\int _{\Omega}f(x)\frac{\phi(x+he_{j})-\phi(x)}{h}\, dx&=\lim _{{h\rightarrow 0}}\left(\int _{\Omega}\frac{1}{h}f(x)\phi(x+he_{j})\, dx-\int _{\Omega}\frac{1}{h}f(x)\phi(x)\, dx\right)\\ &=\lim _{{h\rightarrow 0}}\left(\int _{\Omega}\frac{1}{h}f(x-he_{j})\phi(x)\, dx-\int _{\Omega}\frac{1}{h}f(x)\phi(x)\, dx\right)\\ &=\lim _{{h\rightarrow 0}}\int _{\Omega}\frac{f(x-he_{j})-f(x)}{h}\phi(x)\, dx.\end{split}$ (2.5)

(Even though $x-he_{j}$ may not be in $\Omega$ for arbitrary $x\in\Omega$, we do have $x-he_{j}\in\Omega$ for all $x\in K$. Since the support of $\phi$ is $K\subset\subset\Omega$, then we can define

 $\int _{\Omega}\frac{1}{h}f(x-he_{j})\phi(x)\, dx:=\int _{K}\frac{1}{h}f(x-he_{j})\phi(x)\, dx,$

and therefore the integral in the above calculation makes sense.)

The quantity $\frac{f(x-he_{j})-f(x)}{h}\phi(x)$ is uniformly bounded for $|h|\leq\frac{1}{2}\varepsilon$ (since $f$ is locally Lipschitz), and so another application of dominated convergence gives us

 $\lim _{{h\rightarrow 0}}\int _{\Omega}\frac{f(x-he_{j})-f(x)}{h}\phi(x)\, dx=\int _{\Omega}\lim _{{h\rightarrow 0}}\frac{f(x-he_{j})-f(x)}{h}\phi(x)\, dx.$ (2.6)

Rademacher’s theorem shows that for each $j=1,\ldots,n$, the partial derivative $\partial _{j}f$ exists almost everywhere in $\Omega$, and, on the compact set $K=\supp(\phi)$ it is bounded above by the Lipschitz constant $M_{K}$. Let $g_{j}(x)$ be a function defined on all of $\Omega$ that is equal almost everywhere to $\partial _{j}f(x)$. Therefore

 $\int _{\Omega}\lim _{{h\rightarrow 0}}\frac{f(x-he_{j})-f(x)}{h}\phi(x)\, dx=-\int _{\Omega}\lim _{{h\rightarrow 0}}\frac{f(x-he_{j})-f(x)}{(-h)}\phi(x)\, dx=-\int _{\Omega}g_{j}(x)\phi(x)\, dx,$

and so we have shown that

 $\int _{\Omega}f(x)\partial _{j}\phi(x)\, dx=-\int _{\Omega}g_{j}(x)\phi(x)\, dx.$

Therefore the weak derivative exists and is equal almost everywhere to $\partial _{j}f(x)$. Since $|g_{j}(x)|\leq M_{K}$ almost everywhere on each compact set $K$, then $f\in W_{{loc}}^{{1,\infty}}(\Omega)$.

Remark 2.22.

The part of the above proof that requires the Lipschitz condition on $f$ is the application of dominated convergence in (2.6). The fact that the derivative of $f$ exists almost everywhere is not sufficient for a weak derivative to exist. For example, the derivative of the step function is zero almost everywhere, but we showed in Example 2.19 that the step function does not have a weak derivative. The reason is that (2.6) fails for the step function (the rest of the proof does go through for the step function).
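
To make the failure of (2.6) explicit, here is a sketch of the computation for the step function $g$ of Example 2.19 with $h>0$ (the case $h<0$ is similar). Since $g(x-h)-g(x)=-1$ for $0\leq x<h$ and is zero otherwise, we have

 $\lim _{{h\rightarrow 0^{+}}}\int _{\mathbb{R}}\frac{g(x-h)-g(x)}{h}\phi(x)\, dx=\lim _{{h\rightarrow 0^{+}}}\left(-\frac{1}{h}\int _{0}^{h}\phi(x)\, dx\right)=-\phi(0),$

whereas for each fixed $x\neq 0$ the difference quotient vanishes as soon as $|h|<|x|$, so that

 $\int _{\mathbb{R}}\lim _{{h\rightarrow 0}}\frac{g(x-h)-g(x)}{h}\phi(x)\, dx=0.$

The two sides of (2.6) therefore differ whenever $\phi(0)\neq 0$. The dominated convergence argument used in Example 2.21 does not apply here, because the difference quotients have absolute value $\frac{1}{|h|}$ on an interval of length $|h|$, and so are not bounded uniformly in $h$.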

3 Basic properties of Sobolev spaces

In this section we prove some basic results about Sobolev spaces. The results of Sections 3.1 and 3.3 describe basic functional analytic properties of Sobolev spaces, while Section 3.2 gives an alternative characterisation of Sobolev spaces as the completion of the space of smooth functions. Section 3.4 provides an answer to an earlier question by showing that, although distributions cannot always be represented by locally integrable functions, the positive distributions can always be represented by regular Borel measures.

3.1 Banach and Hilbert space structure of Sobolev spaces

It is well-known that $L^{p}(\Omega)$ (with the $L^{p}$ norm) is a Banach space, and that $L^{2}(\Omega)$ (with the $L^{2}$ inner product) is a Hilbert space. In a similar way, we can show that the Sobolev spaces $W^{{k,p}}(\Omega)$ have the structure of a Banach space, and that $W^{{k,2}}(\Omega)$ has the structure of a Hilbert space, and it is the goal of this section to give the details of this proof. This is a useful theorem, since it allows us to use theorems from functional analysis to study sequences of functions in Sobolev spaces.

Firstly, recall that the space $L^{p}(\Omega)$, together with the $L^{p}$ norm, is complete when $1\leq p\leq\infty$ (see for example [8, Theorem 2.7] or [15, Theorem 8.14]). To extend this to the Sobolev space $W^{{k,p}}(\Omega)$, we use an inductive argument. The proof of the following lemma gives the basic idea of this argument for $k=1$.

Lemma 3.1.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set and $1\leq p\leq\infty$. Then the space $W^{{1,p}}(\Omega)$ is complete in the norm $\|\cdot\| _{{W^{{1,p}}(\Omega)}}$.

Proof.

Let $(u_{m})_{{m\in\mathbb{N}}}$ be a Cauchy sequence in $W^{{1,p}}(\Omega)$. Then, by definition of the Sobolev norm, $\| u_{m}-u_{l}\| _{{L^{p}}}\leq\| u_{m}-u_{l}\| _{{W^{{1,p}}}}$ for all $m,l\in\mathbb{N}$, and so $(u_{m})_{{m\in\mathbb{N}}}$ is also Cauchy in $L^{p}$. Similarly, since $\|\partial _{j}u_{m}-\partial _{j}u_{l}\| _{{L^{p}}}\leq\| u_{m}-u_{l}\| _{{W^{{1,p}}}}$ (again this follows from the definition of the Sobolev norm), we have that $(\partial _{j}u_{m})_{{m\in\mathbb{N}}}$ is a Cauchy sequence in $L^{p}$ for each $j=1,\ldots,n$.

Since $L^{p}(\Omega)$ is complete, then there are functions $v_{0},v_{1},\ldots,v_{n}$ such that

 $\displaystyle\begin{split}u_{m}&\stackrel{L^{p}}{\longrightarrow}v_{0},\\ \partial _{j}u_{m}&\stackrel{L^{p}}{\longrightarrow}v_{j},\quad\quad j=1,\ldots,n.\end{split}$

Hölder’s inequality shows that $L^{p}(\Omega)\subset L_{{loc}}^{1}(\Omega)$, and so each $u_{m}$ determines a distribution $T_{{u_{m}}}\in\mathcal{D}(\Omega)^{*}$ given by

 $T_{{u_{m}}}(\phi)=\int _{\Omega}u_{m}\phi\, dx$

for all test functions $\phi\in\mathcal{D}(\Omega)$.

Another application of Hölder’s inequality gives the following estimate for any $\phi\in\mathcal{D}(\Omega)$

 $\left|T_{{u_{m}}}(\phi)-T_{{v_{0}}}(\phi)\right|\leq\int _{\Omega}\left|u_{m}(x)-v_{0}(x)\right|\left|\phi(x)\right|\, dx\leq\|\phi\| _{{L^{q}}}\| u_{m}-v_{0}\| _{{L^{p}}},$

where $q$ is the conjugate Hölder exponent of $p$. (Note that the integral exists since $\phi$ is bounded with compact support, and $u_{m}-v_{0}\in L_{{loc}}^{1}(\Omega)$.) Since $u_{m}\rightarrow v_{0}$ in $L^{p}$, this shows that $T_{{u_{m}}}(\phi)\rightarrow T_{{v_{0}}}(\phi)$ for every test function $\phi$, i.e. $T_{{u_{m}}}\rightarrow T_{{v_{0}}}$ in $\mathcal{D}(\Omega)^{*}$.

The same argument with $u_{m}$ replaced by $\partial _{j}u_{m}$ and $v_{0}$ replaced by $v_{j}$ shows that $T_{{\partial _{j}u_{m}}}\rightarrow T_{{v_{j}}}$. We then have for every test function $\phi\in\mathcal{D}(\Omega)$

 $\begin{aligned}T_{{v_{j}}}(\phi)&=\lim _{{m\rightarrow\infty}}T_{{\partial _{j}u_{m}}}(\phi)\\ &=-\lim _{{m\rightarrow\infty}}T_{{u_{m}}}(\partial _{j}\phi)\\ &=-T_{{v_{0}}}(\partial _{j}\phi)\\ &=T_{{\partial _{j}v_{0}}}(\phi)\quad\text{(by definition of distributional derivative)}.\end{aligned}$

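Therefore, by Theorem 2.8, the weak derivative $\partial _{j}v_{0}$ exists and satisfies $v_{j}=\partial _{j}v_{0}$ almost everywhere. Since $v_{0}$ and each $v_{j}$ lie in $L^{p}(\Omega)$, this shows that $v_{0}\in W^{{1,p}}(\Omega)$. Combined with the convergence $u_{m}\rightarrow v_{0}$ and $\partial _{j}u_{m}\rightarrow\partial _{j}v_{0}$ in $L^{p}$, we have shown that $u_{m}\rightarrow v_{0}$ in $W^{{1,p}}(\Omega)$, and so $W^{{1,p}}(\Omega)$ is complete. ∎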

Using this technique we can now prove the following theorem, which, together with Lemma 2.17, says that $W^{{k,p}}(\Omega)$ is a Banach space.

Theorem 3.2.

Let $\Omega\subseteq\mathbb{R}^{n}$ be open and $1\leq p\leq\infty$. Then $W^{{k,p}}(\Omega)$ is complete in the norm $\|\cdot\| _{{W^{{k,p}}}}$ for all $k\geq 0$. In particular, $W^{{k,p}}(\Omega)$ is a Banach space for all $1\leq p\leq\infty$ and $k\in\mathbb{Z}_{{\geq 0}}$.

Proof.

The proof uses induction on $k$. The case $k=0$ follows from standard results about $L^{p}$ spaces (see for example [15, Theorem 8.14]). Suppose that $k\geq 1$ and that $W^{{k-1,p}}(\Omega)$ is complete, and let $(u_{m})_{{m\in\mathbb{N}}}$ be a Cauchy sequence in $W^{{k,p}}(\Omega)$. Then the sequences $(u_{m})_{{m\in\mathbb{N}}}$ and $(\partial _{j}u_{m})_{{m\in\mathbb{N}}}$ (for $j=1,\ldots,n$) are Cauchy in $W^{{k-1,p}}(\Omega)$, and the completeness of $W^{{k-1,p}}(\Omega)$ shows that there exist functions $v_{0},v_{1},\ldots,v_{n}\in W^{{k-1,p}}(\Omega)$ such that

 $u_{m}\stackrel{W^{{k-1,p}}}{\relbar\mathrel{\mkern-4.0mu}\relbar\mathrel{\mkern-4.0mu}\longrightarrow}v_{0},\qquad\partial _{j}u_{m}\stackrel{W^{{k-1,p}}}{\relbar\mathrel{\mkern-4.0mu}\relbar\mathrel{\mkern-4.0mu}\longrightarrow}v_{j}\quad\text{for all}\ j=1,\ldots,n.$

Note that the inductive hypothesis shows that $D^{\alpha}(\partial _{j}u_{m})\rightarrow D^{\alpha}v_{j}$ in $L^{p}$ for all multi-indices $\alpha$ such that $|\alpha|\leq k-1$, and so it only remains to show that $\partial _{j}v_{0}=v_{j}$ for each $j=1,\ldots,n$.

As in the previous proof we can show that

 $\left|T_{{\partial _{j}u_{m}}}(\phi)-T_{{v_{j}}}(\phi)\right|\leq\|\phi\| _{{L^{q}}}\|\partial _{j}u_{m}-v_{j}\| _{{L^{p}}},$

and so for all test functions $\phi\in\mathcal{D}(\Omega)$ we have

 $T_{{v_{j}}}(\phi)=\lim _{{m\rightarrow\infty}}T_{{\partial _{j}u_{m}}}(\phi)=-\lim _{{m\rightarrow\infty}}T_{{u_{m}}}(\partial _{j}\phi)=-T_{{v_{0}}}(\partial _{j}\phi)=T_{{\partial _{j}v_{0}}}(\phi),$

and so Theorem 2.8 shows that $v_{j}=\partial _{j}v_{0}$ almost everywhere. This, together with the previous statement that $D^{\alpha}(\partial _{j}u_{m})\rightarrow D^{\alpha}v_{j}$ in $L^{p}$ for all multi-indices $\alpha$ such that $|\alpha|\leq k-1$, shows that $D^{\alpha}u_{m}\rightarrow D^{\alpha}v_{0}$ in $L^{p}$ for all $\alpha$ such that $|\alpha|\leq k$.

Therefore, we have shown that there exists $v_{0}\in W^{{k,p}}(\Omega)$ such that $u_{m}\stackrel{W^{{k,p}}}{\relbar\mathrel{\mkern-4.0mu}\relbar\mathrel{\mkern-4.0mu}\longrightarrow}v_{0}$, and so $W^{{k,p}}(\Omega)$ is complete. ∎

In the case $p=2$, the previous theorem, together with the following inner product, gives $W^{{k,2}}(\Omega)$ the structure of a Hilbert space.

Definition 3.3.

The inner product on $W^{{k,2}}(\Omega)$ is defined to be

 $\left<f,g\right>_{{W^{{k,2}}(\Omega)}}:=\sum _{{0\leq|\alpha|\leq k}}\int _{\Omega}D^{\alpha}f\,\overline{D^{{\alpha}}g}\, dx.$ (3.1)
Remark 3.4.

The Sobolev norm on $W^{{k,2}}(\Omega)$ is the same as the norm induced by the inner product

 $\| f\| _{{W^{{k,2}}(\Omega)}}=\left(\left<f,f\right>_{{W^{{k,2}}(\Omega)}}\right)^{{\frac{1}{2}}}.$
Theorem 3.5.

$\left(W^{{k,2}}(\Omega),\left<\cdot,\cdot\right>_{{W^{{k,2}}(\Omega)}}\right)$ is a Hilbert space.

Remark 3.6.
1. In view of Theorem 3.2, the proof of Theorem 3.5 only requires checking that the axioms for an inner product are satisfied.

2. In order to emphasise the Hilbert space structure, the space $W^{{k,2}}(\Omega)$ is often denoted $H^{k}(\Omega)$.
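
To make Definition 3.3 concrete, the following sketch writes out the case $k=1$ and checks positive-definiteness. It assumes, as in (3.1), that the sum runs over all multi-indices of order at most $k$, so that for $k=1$ the multi-indices are $0$ and the $n$ unit vectors.

 $\left<f,g\right>_{{W^{{1,2}}(\Omega)}}=\int _{\Omega}f\,\overline{g}\, dx+\sum _{{j=1}}^{n}\int _{\Omega}\partial _{j}f\,\overline{\partial _{j}g}\, dx.$

Taking $g=f$ gives

 $\left<f,f\right>_{{W^{{1,2}}(\Omega)}}=\| f\| _{{L^{2}(\Omega)}}^{2}+\sum _{{j=1}}^{n}\|\partial _{j}f\| _{{L^{2}(\Omega)}}^{2}\geq\| f\| _{{L^{2}(\Omega)}}^{2},$

so $\left<f,f\right>_{{W^{{1,2}}(\Omega)}}=0$ forces $f=0$ almost everywhere. Linearity in the first slot and conjugate symmetry are inherited term by term from the $L^{2}$ inner product.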

3.2 Sobolev spaces are the completion of the space of smooth functions in the Sobolev norm (the Meyers-Serrin theorem)

In this section we prove the Meyers-Serrin theorem, which says that the Sobolev spaces defined in Section 2 are the completion of the space of smooth functions in the Sobolev norm. Therefore we now have two equivalent definitions of Sobolev spaces, which gives us a broader range of techniques to draw upon when proving theorems.

First recall the following well-known theorem that says that a normed linear space has a unique completion (see for example [11, Theorem I.3]).

Theorem 3.7.

If $(V,\|\cdot\| _{V})$ is a normed linear space, then there exists a complete normed linear space $(\tilde{V},\|\cdot\| _{{\tilde{V}}})$, unique up to isometric isomorphism, such that $V$ is isometric to a dense subset of $\tilde{V}$.

Let $\mathcal{C}^{k}(\Omega)$ be the space of $k$-times continuously differentiable functions $f:\Omega\rightarrow\mathbb{C}$. Since the weak derivative of a differentiable function is just the classical derivative (Lemma 2.13), the weak derivatives of any $\phi\in\mathcal{C}^{k}(\Omega)$ exist up to order $k$, and we can define the subspace

 $S^{{k,p}}(\Omega)=\{\phi\in\mathcal{C}^{k}(\Omega)\,:\,\|\phi\| _{{W^{{k,p}}(\Omega)}}<\infty\}\subseteq W^{{k,p}}(\Omega).$

Let $B^{{k,p}}(\Omega)$ denote the completion of $S^{{k,p}}(\Omega)$ in the $W^{{k,p}}(\Omega)$-norm. Since $W^{{k,p}}(\Omega)$ is complete by Theorem 3.2, and $S^{{k,p}}(\Omega)\subseteq W^{{k,p}}(\Omega)$, then we have proved

Lemma 3.8.

For $1\leq p\leq\infty$ we have

 $B^{{k,p}}(\Omega)\subseteq W^{{k,p}}(\Omega).$

It turns out that the converse is also true for $1\leq p<\infty$; this is known as the Meyers-Serrin theorem, and the proof will occupy the rest of this section.

Example 3.9.

To see that the converse of the previous lemma can never be true for $p=\infty$, in this example we show that $B^{{k,\infty}}(\Omega)\neq W^{{k,\infty}}(\Omega)$. Consider first the case $k=0$ and $\Omega=\mathbb{R}$, where the step function

 $f(x)=\left\{\begin{matrix}-1&\text{if}\, x<0\\ 1&\text{if}\, x\geq 0\end{matrix}\right.$

is not in the completion of $S^{{0,\infty}}(\mathbb{R})$, since for any continuous function $g\in S^{{0,\infty}}(\mathbb{R})$ we have $\| f-g\| _{{L^{\infty}(\mathbb{R})}}\geq 1$. To extend this example to $W^{{k,\infty}}(\mathbb{R})$ for $k>0$, simply consider the function

 $f(x)=\left\{\begin{matrix}-x^{k}&\text{if}\, x<0\\ x^{k}&\text{if}\, x\geq 0,\end{matrix}\right.$

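and note that $\frac{d^{k}f}{dx^{k}}$ is $k!$ times the step function above. Since $x^{k}$ is unbounded on $\mathbb{R}$ for $k>0$, this function should be restricted to a bounded interval such as $(-1,1)$ in order to lie in $W^{{k,\infty}}$. It is easy then to extend this idea to the case where the domain is an open subset of $\mathbb{R}^{n}$.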
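
The bound $\| f-g\| _{{L^{\infty}(\mathbb{R})}}\geq 1$ used for the step function in Example 3.9 can be checked as follows (a sketch; $\delta$ and $\eta$ are auxiliary symbols introduced only for this computation). Let $g$ be continuous and fix $\delta>0$. By continuity at $0$ there exists $\eta>0$ such that $|g(x)-g(0)|<\delta$ whenever $|x|<\eta$. Then

 $|f(x)-g(x)|\geq|1-g(0)|-\delta\;\text{for}\; 0<x<\eta,\qquad|f(x)-g(x)|\geq|1+g(0)|-\delta\;\text{for}\;-\eta<x<0.$

Since $|1-g(0)|+|1+g(0)|\geq 2$, at least one of $|1-g(0)|$ and $|1+g(0)|$ is at least $1$. Both intervals have positive measure, so $\| f-g\| _{{L^{\infty}(\mathbb{R})}}\geq 1-\delta$, and letting $\delta\rightarrow 0$ gives the claim.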

Next, we recall some basic facts needed in the proof of Theorem 3.15. The first is the existence of partitions of unity.

Theorem 3.10.

Let $A$ be an arbitrary subset of $\mathbb{R}^{n}$, and let $\mathcal{O}=\{ U_{\alpha}\} _{{\alpha\in I}}$ be a collection of open sets in $\mathbb{R}^{n}$ that cover $A$. Then there exists a collection $\Psi=\{\psi _{\beta}\} _{{\beta\in J}}\subset C_{0}^{\infty}(\mathbb{R}^{n})$ such that

1. For every $\beta\in J$ and every $x\in\mathbb{R}^{n}$, we have $0\leq\psi _{\beta}(x)\leq 1$.

2. If $K\subset\subset A$ then all but at most finitely many $\psi _{\beta}\in\Psi$ vanish identically on $K$.

3. For every $\beta\in J$ there exists $\alpha\in I$ such that $\supp\psi _{\beta}\subset U_{\alpha}$.

4. For every $x\in A$ we have $\displaystyle{\sum _{{\beta\in J}}\psi _{\beta}(x)=1}$ (note that the sum makes sense because of the local finiteness condition (2)).

The collection $\Psi$ is called a partition of unity of $A$ subordinate to $\mathcal{O}$.

Proof.

The case where $A$ is compact is given in [12, Theorem 2.13]. If $A$ is open, then for each $j\in\mathbb{N}$ define

 $A_{j}:=\left\{ x\in A\,:\,|x|\leq j\;\text{and}\;\mathop{\rm dist}\nolimits(x,\partial A)\geq\frac{1}{j}\right\},$

and note that $A_{j}$ is compact and satisfies $A_{j}\subset\interior A_{{j+1}}$ for each $j\in\mathbb{N}$. Moreover, we can also write $A$ as the union of compact sets

 $A=\bigcup _{{j\in\mathbb{N}}}A_{j}=\bigcup _{{j\in\mathbb{N}}}(A_{j}\setminus\interior A_{{j-1}}).$

Also, for notational convenience in what follows, define $A_{0}=A_{{-1}}=\emptyset$.

Given an open cover $\mathcal{O}=\{ U_{\alpha}\} _{{\alpha\in I}}$ of $A$, for each $j\in\mathbb{N}$ we can define an open cover of the compact set $A_{j}\setminus\interior A_{{j-1}}$ by

 $\mathcal{O}_{j}:=\left\{ U_{\alpha}\cap\left(\interior(A_{{j+1}})\setminus A_{{j-2}}\right)\,:\,\alpha\in I\right\}.$

By the result for compact sets, for each $j\in\mathbb{N}$ there exists a partition of unity $\Psi _{j}=\{\psi _{{j,n}}\} _{{n=1}}^{{N_{j}}}$ for the compact set $A_{j}\setminus\interior A_{{j-1}}$ that is subordinate to $\mathcal{O}_{j}$, and has finitely many elements. Moreover, since $U_{\alpha}\cap\left(\interior(A_{{j+1}})\setminus A_{{j-2}}\right)\subseteq A$ for each $\alpha\in I$ and $j\in\mathbb{N}$, then $\supp\psi _{{j,n}}\subseteq A$ for each $\psi _{{j,n}}\in\Psi _{j}$. Therefore, since each $x\in A$ satisfies $x\in A_{j}\setminus\interior A_{{j-1}}$ for at most finitely many $j\in\mathbb{N}$, then the sum

 $\sigma(x)=\sum _{{j\in\mathbb{N}}}\sum _{{\psi\in\Psi _{j}}}\psi(x)$

has at most finitely many terms for each $x$, and also satisfies $\sigma(x)\geq 1$ for each $x\in A$. Now define the collection of functions

 $\Psi:=\left\{ f_{{j,n}}(x)=\left\{\begin{matrix}\frac{\psi _{{j,n}}(x)}{\sigma(x)}&x\in A\\ 0&x\notin A\end{matrix}\right.\,:\, j\in\mathbb{N},1\leq n\leq N_{j}\right\}.$

This is now a partition of unity of $A$ subordinate to $\mathcal{O}$.

In the case where $A$ is an arbitrary subset of $\mathbb{R}^{n}$ with an open cover $\mathcal{O}=\{ U_{\alpha}\} _{{\alpha\in I}}$, define the open set $\displaystyle{B=\bigcup _{{\alpha\in I}}U_{\alpha}}$, note that $\mathcal{O}$ is an open cover of $B$, and apply the previous result to find a partition of unity $\Psi$ of $B$ subordinate to $\mathcal{O}$. Since $A\subset B$ then $\Psi$ is also a partition of unity of $A$ subordinate to $\mathcal{O}$. ∎

The second basic fact needed is the convergence of sequences of mollified functions. Let $J$ be a non-negative real-valued function in $C_{0}^{\infty}(\mathbb{R}^{n})$ such that

1. $J(x)=0$ if $|x|\geq 1$.

2. $\displaystyle{\int _{{\mathbb{R}^{n}}}J(x)\, dx=1}$.

For example we can choose

 $J(x)=\left\{\begin{matrix}k\exp\left(-\frac{1}{1-|x|^{2}}\right)&\text{if}\,|x|<1\\ 0&\text{if}\,|x|\geq 1\end{matrix}\right.,$

where $k$ is chosen so that $\displaystyle{\int _{{\mathbb{R}^{n}}}J(x)\, dx=1}$. The function $J(x)$ is called a mollifier. For any $\varepsilon>0$, let $J_{\varepsilon}(x)=\frac{1}{\varepsilon^{n}}J\left(\frac{x}{\varepsilon}\right)$, and define the mollification of $u\in L^{p}(\Omega)$ to be the convolution

 $(J_{\varepsilon}*u)(x)=\int _{{\mathbb{R}^{n}}}J_{\varepsilon}(x-y)u(y)\, dy.$
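
As a quick check of the normalisation (a sketch using the substitution $y=x/\varepsilon$, so that $dx=\varepsilon^{n}\, dy$), the rescaled function $J_{\varepsilon}$ still has total integral one and vanishes outside the ball of radius $\varepsilon$:

 $\int _{{\mathbb{R}^{n}}}J_{\varepsilon}(x)\, dx=\frac{1}{\varepsilon^{n}}\int _{{\mathbb{R}^{n}}}J\left(\frac{x}{\varepsilon}\right)dx=\int _{{\mathbb{R}^{n}}}J(y)\, dy=1,\qquad J_{\varepsilon}(x)=0\;\text{if}\;|x|\geq\varepsilon.$

Consequently $(J_{\varepsilon}*u)(x)$ only depends on the values of $u$ in the ball of radius $\varepsilon$ centred at $x$.
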
Lemma 3.11.

If $u\in W^{{k,p}}(\Omega)$ then $J_{\varepsilon}*u$ is smooth for all $\varepsilon>0$.

Since $J_{\varepsilon}$ is smooth for all $\varepsilon>0$, then this follows from [15, Theorem 9.3].

Theorem 3.12.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $\Omega^{{\prime}}\subset\Omega$ be an open subset whose closure is a compact subset of $\Omega$. If $1\leq p<\infty$ and $u\in W^{{k,p}}(\Omega)$, then

 $\lim _{{\varepsilon\rightarrow 0^{+}}}J_{\varepsilon}*u=u$

in $W^{{k,p}}(\Omega^{{\prime}})$.

Proof.

When $k=0$ this is a standard result for $L^{p}$ spaces (see for example [15, Theorem 9.6] for a proof). The general case follows by reducing to the $k=0$ case.

First we show that for any $\varepsilon<\mathop{\rm dist}\nolimits(\Omega^{{\prime}},\partial\Omega)$ we have $D^{\alpha}(J_{\varepsilon}*u)=J_{\varepsilon}*D^{\alpha}u$ in the distributional sense on $\Omega^{{\prime}}$. To see this, let $\tilde{u}$ denote the zero extension of $u$ from $\Omega$ to all of $\mathbb{R}^{n}$, and note that for any test function $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega^{{\prime}})$ we have

 $\begin{aligned}\int _{{\Omega^{{\prime}}}}(J_{\varepsilon}*u)(x)\, D^{\alpha}\phi(x)\, dx&=\int _{{\mathbb{R}^{n}}}\int _{{\mathbb{R}^{n}}}\tilde{u}(x-y)J_{\varepsilon}(y)D^{\alpha}\phi(x)\, dx\, dy\\ &=(-1)^{{|\alpha|}}\int _{{\mathbb{R}^{n}}}\int _{{\Omega^{{\prime}}}}D^{\alpha}u(x-y)J_{\varepsilon}(y)\phi(x)\, dx\, dy\\ &=(-1)^{{|\alpha|}}\int _{{\Omega^{{\prime}}}}(J_{\varepsilon}*D^{\alpha}u)(x)\,\phi(x)\, dx.\end{aligned}$

(All of the derivatives above are taken with respect to the variable $x$.)

Since $D^{\alpha}u\in L^{p}(\Omega)$ for each $0\leq|\alpha|\leq k$, then the result for $L^{p}$ spaces shows that

 $\lim _{{\varepsilon\rightarrow 0^{+}}}\left\| D^{\alpha}(J_{\varepsilon}*u)-D^{\alpha}u\right\| _{{L^{p}(\Omega^{{\prime}})}}=\lim _{{\varepsilon\rightarrow 0^{+}}}\left\| J_{\varepsilon}*D^{\alpha}u-D^{\alpha}u\right\| _{{L^{p}(\Omega^{{\prime}})}}=0.$

This is true for all $\alpha$ such that $0\leq|\alpha|\leq k$, and so $J_{\varepsilon}*u$ converges to $u$ in the $W^{{k,p}}(\Omega^{{\prime}})$ norm. ∎

Next, we introduce the notion of a nested open cover, which will be used in the sequel.

Definition 3.13.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$. A nested open cover of $\Omega$ is a collection of open sets $\{\Omega _{j}\} _{{j\in\mathbb{N}}}$ such that

1. $\Omega _{j}\subseteq\Omega _{{j+1}}\subseteq\Omega$ for all $j\in\mathbb{N}$.

2. For all $x\in\Omega$ there exists $j\in\mathbb{N}$ such that $x\in\Omega _{j}$.

Lemma 3.14.

Let $\Omega$ be an open set in $\mathbb{R}^{n}$, and let $\{\Omega _{j}\} _{{j\in\mathbb{N}}}$ be a nested open cover of $\Omega$. If $f\in W^{{k,p}}(\Omega)$ satisfies $\| f\| _{{W^{{k,p}}(\Omega _{j})}}\leq C$ for all $j\in\mathbb{N}$, then $\| f\| _{{W^{{k,p}}(\Omega)}}\leq C$.

Proof.

The inclusion $\Omega _{j}\hookrightarrow\Omega$ induces an inclusion $\mathcal{D}(\Omega _{j})\hookrightarrow\mathcal{D}(\Omega)$. Therefore the weak derivative of $f$ on $\Omega _{j}$ is just the restriction of the weak derivative of $f$ on $\Omega$, since for all test functions $\phi\in\mathcal{D}(\Omega _{j})$ we have

 $\int _{{\Omega _{j}}}fD^{\alpha}\phi\, dx=\int _{{\Omega}}fD^{\alpha}\phi\, dx=(-1)^{{|\alpha|}}\int _{{\Omega}}D^{\alpha}f\,\phi\, dx=(-1)^{{|\alpha|}}\int _{{\Omega _{j}}}D^{\alpha}f\,\phi\, dx.$

The dominated convergence theorem shows that $\lim _{{j\rightarrow\infty}}\| D^{\alpha}f\| _{{L^{p}(\Omega _{j})}}=\| D^{\alpha}f\| _{{L^{p}(\Omega)}}$ for each $\alpha$, and as a consequence we have

 $\| D^{\alpha}f\| _{{L^{p}(\Omega)}}\leq\sup _{{j\in\mathbb{N}}}\| D^{\alpha}f\| _{{L^{p}(\Omega _{j})}}\quad\text{for each}\ \alpha.$

Therefore $\| f\| _{{W^{{k,p}}(\Omega)}}\leq C$. ∎

Now we are ready to prove that the space of smooth functions is dense in $W^{{k,p}}$.

Theorem 3.15 (Meyers-Serrin).

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $1\leq p<\infty$. Then for any $u\in W^{{k,p}}(\Omega)$, and for every $\varepsilon>0$, there exists $\phi\in\mathcal{C}^{\infty}(\Omega)$ such that $\| u-\phi\| _{{W^{{k,p}}(\Omega)}}<\varepsilon$.

Proof.

Fix $\varepsilon>0$. For each $j\in\mathbb{N}$, define the open sets

 $\Omega _{j}:=\left\{ x\in\Omega\,:\,|x|<j\;\text{and}\;\mathop{\rm dist}\nolimits(x,\partial\Omega)>\frac{1}{j}\right\},\qquad U_{j}:=\Omega _{{j+1}}\cap\left(\Omega\setminus\bar{\Omega}_{{j-1}}\right).$

Then $\{\Omega _{j}\} _{{j\in\mathbb{N}}}$ is a nested open cover of $\Omega$, and, in particular, we can apply Lemma 3.14 (we will use this at the end of the proof). Moreover, each $\Omega _{j}$ has compact closure in $\Omega$, and so Theorem 3.12 applies. Define $\mathcal{O}=\{ U_{j}\} _{{j\in\mathbb{N}}}$, and note that $\mathcal{O}$ is also an open cover of $\Omega$ (although it is not nested).

Let $\Psi=\{\psi _{j}\} _{{j\in\mathbb{N}}}$ be a partition of unity for $\Omega$ subordinate to $\mathcal{O}$, and note that the local finiteness property of partitions of unity shows that $\psi _{j}\in\mathcal{C}^{\infty}(U_{j})$ for all $j$, and we also have

 $\sum _{{j=1}}^{\infty}\psi _{j}(x)=1$

for all $x\in\Omega$.

From the definition of $U_{j}$, if $0<\varepsilon _{j}<\frac{1}{(j+1)(j+2)}=\frac{1}{j+1}-\frac{1}{j+2}$ then $J_{{\varepsilon _{j}}}*(\psi _{j}u)$ has support in the set

 $V_{j}:=\Omega _{{j+2}}\cap\left(\Omega\setminus\bar{\Omega}_{{j-2}}\right)\subset\subset\Omega.$
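
The containment of the support in $\Omega _{{j+2}}$ can be verified directly (a sketch; the points $y$ and $z$ are auxiliary, and the only fact assumed is that $x\mapsto\mathop{\rm dist}\nolimits(x,\partial\Omega)$ is $1$-Lipschitz). Any point $x$ in the support of $J_{{\varepsilon _{j}}}*(\psi _{j}u)$ can be written as $x=y+z$ with $y\in\supp\psi _{j}\subset U_{j}\subset\Omega _{{j+1}}$ and $|z|\leq\varepsilon _{j}$. Since $\varepsilon _{j}<\frac{1}{(j+1)(j+2)}<1$, we have

 $|x|\leq|y|+|z|<(j+1)+1=j+2,\qquad\mathop{\rm dist}\nolimits(x,\partial\Omega)\geq\mathop{\rm dist}\nolimits(y,\partial\Omega)-|z|>\frac{1}{j+1}-\frac{1}{(j+1)(j+2)}=\frac{1}{j+2}.$

Moreover $|z|<\mathop{\rm dist}\nolimits(y,\partial\Omega)$, so $x$ stays inside $\Omega$, and therefore $x\in\Omega _{{j+2}}$. A similar estimate, not written out here, keeps the support away from $\bar{\Omega}_{{j-2}}$.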

Since $\psi _{j}u\in W^{{k,p}}(\Omega)$, then by Theorem 3.12 we can find $\varepsilon _{j}$ such that $0<\varepsilon _{j}<\frac{1}{(j+1)(j+2)}$ and

 $\| J_{{\varepsilon _{j}}}*(\psi _{j}u)-\psi _{j}u\| _{{W^{{k,p}}(\Omega _{{j+2}})}}<\frac{\varepsilon}{2^{{j+1}}}.$

Define

 $\phi=\sum _{{j=1}}^{\infty}J_{{\varepsilon _{j}}}*(\psi _{j}u).$

On any compact subset $K\subset\subset\Omega$, all but finitely many terms in the sum vanish, and so $\phi\in\mathcal{C}^{\infty}(\Omega)$. Now note that if $x\in\Omega _{\ell}$, then

 $u(x)=\sum _{{j=1}}^{{\ell+2}}\psi _{j}(x)u(x),\quad\phi(x)=\sum _{{j=1}}^{{\ell+2}}J_{{\varepsilon _{j}}}*(\psi _{j}u)(x),$

and so for each $\ell\in\mathbb{N}$

 $\| u-\phi\| _{{W^{{k,p}}(\Omega _{\ell})}}\leq\sum _{{j=1}}^{{\ell+2}}\| J_{{\varepsilon _{j}}}*(\psi _{j}u)-\psi _{j}u\| _{{W^{{k,p}}(\Omega _{{j+2}})}}<\frac{1}{2}\varepsilon.$

An application of Lemma 3.14 then shows that $\| u-\phi\| _{{W^{{k,p}}(\Omega)}}\leq\frac{1}{2}\varepsilon<\varepsilon$, as required. ∎
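
The constant $\frac{1}{2}\varepsilon$ in the last estimate of the proof comes from a geometric series. Written out (with $i$ an auxiliary summation index),

 $\sum _{{j=1}}^{{\ell+2}}\frac{\varepsilon}{2^{{j+1}}}<\sum _{{j=1}}^{\infty}\frac{\varepsilon}{2^{{j+1}}}=\frac{\varepsilon}{4}\sum _{{i=0}}^{\infty}\frac{1}{2^{i}}=\frac{\varepsilon}{4}\cdot 2=\frac{\varepsilon}{2}.$

The bound is independent of $\ell$, which is what allows Lemma 3.14 to be applied with $C=\frac{1}{2}\varepsilon$.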

This theorem shows that $W^{{k,p}}(\Omega)\subseteq B^{{k,p}}(\Omega)$. Combining this with Lemma 3.8 gives us the following corollary, which states that $W^{{k,p}}(\Omega)$ is the completion of the space of $\mathcal{C}^{k}(\Omega)$ functions in the Sobolev norm $\|\cdot\| _{{W^{{k,p}}(\Omega)}}$.

Corollary 3.16.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $1\leq p<\infty$. Then

 $B^{{k,p}}(\Omega)=W^{{k,p}}(\Omega)$

for any $k\geq 0$.

Remark 3.17.

The statement of the corollary above is that $W^{{k,p}}(\Omega)$ is the completion of the space of $\mathcal{C}^{k}$ functions with respect to the $W^{{k,p}}$-norm. Since $\mathcal{C}^{\infty}(\Omega)\subset\mathcal{C}^{k}(\Omega)$ and Theorem 3.15 is stated for smooth functions, then we also have that $W^{{k,p}}(\Omega)$ is the completion of the space of smooth functions in the Sobolev norm.

3.3 The dual space of a Sobolev space

In this section $\Omega$ denotes an open subset of $\mathbb{R}^{n}$, $1\leq p<\infty$, and $q$ denotes the conjugate exponent to $p$, i.e. $q=\frac{p}{p-1}$ if $1<p<\infty$ and $q=\infty$ if $p=1$.

First recall Theorem A.4, which says that the dual of $L^{p}(\Omega)$ is isomorphic to $L^{q}(\Omega)$ if $1\leq p<\infty$. The proof of this theorem involves showing that for each bounded linear functional $\Lambda:L^{p}(\Omega)\rightarrow\mathbb{R}$ there exists a function $v\in L^{q}(\Omega)$ (unique up to equivalence in $L^{q}(\Omega)$) such that

 $\Lambda(u)=\int _{\Omega}uv\, dx$

for all $u\in L^{p}(\Omega)$. Moreover, as part of the construction, the proof also shows that $\| v\| _{{L^{q}(\Omega)}}=\|\Lambda\| _{{L^{p}(\Omega)^{*}}}$. The converse is also true, so we have an isometric isomorphism $L^{p}(\Omega)^{*}\cong L^{q}(\Omega)$.

The goal of this section is to provide a description of the dual space to the Sobolev space $W^{{k,p}}(\Omega)$. It is important to point out that most of the hard work is done in proving the previous theorem for $L^{p}$ spaces, and that the proofs given below rely heavily on this construction. More details can be found in [1, Chapter 3].

Let $\left<\cdot,\cdot\right>:L^{p}(\Omega)\times L^{q}(\Omega)\rightarrow\mathbb{R}$ denote the dual pairing

 $\left<u,v\right>:=\int _{\Omega}uv\, dx$

for $u\in L^{p}(\Omega)$ and $v\in L^{q}(\Omega)$, and, for $N\in\mathbb{N}$, let $L^{q}(\Omega)^{N}:=L^{q}(\Omega)\times\cdots\times L^{q}(\Omega)$ denote the product of $N$ copies of $L^{q}(\Omega)$.

There is a map $F:L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*}$ that takes a vector of functions $(v_{\alpha})$ to the linear functional $\displaystyle{\Lambda(u)=\sum _{{0\leq|\alpha|\leq k}}\left<D^{\alpha}u,v_{\alpha}\right>}$. The first theorem below shows that this map is surjective, and therefore we can characterise elements of the dual $W^{{k,p}}(\Omega)^{*}$ in terms of elements of $L^{q}(\Omega)^{N}$.

Theorem 3.18.

Given $k\geq 0$, let $N$ be the number of multi-indices $\alpha$ such that $0\leq|\alpha|\leq k$. For every functional $\Lambda\in W^{{k,p}}(\Omega)^{*}$ there exists $(v_{\alpha})_{{0\leq|\alpha|\leq k}}\in L^{q}(\Omega)^{N}$ such that for all $u\in W^{{k,p}}(\Omega)$ we have

 $\Lambda(u)=\sum _{{0\leq|\alpha|\leq k}}\left<D^{\alpha}u,v_{\alpha}\right>.$

Moreover, if we define $V$ to be the set of all $(v_{\alpha})_{{0\leq|\alpha|\leq k}}\in L^{q}(\Omega)^{N}$ satisfying the previous equation, then

 $\left\|\Lambda\right\| _{{W^{{k,p}}(\Omega)^{*}}}=\inf _{{(v_{\alpha})\in V}}\left\|(v_{\alpha})\right\| _{{L^{q}(\Omega)^{N}}},$ (3.2)

and this infimum is attained by some $(v_{\alpha})\in L^{q}(\Omega)^{N}$.

Proof.

First note that, by the definition of $W^{{k,p}}(\Omega)$, there exists a linear map

 $\displaystyle P:W^{{k,p}}(\Omega)$ $\displaystyle\rightarrow L^{p}(\Omega)^{N}$ $\displaystyle u$ $\displaystyle\mapsto(D^{\alpha}u)_{{0\leq|\alpha|\leq k}}.$

By the definition of the norms on $W^{{k,p}}(\Omega)$ and $L^{p}(\Omega)^{N}$, the map $P$ is an isometry, and therefore $P$ is an isometric isomorphism onto its image.

Given $\Lambda\in W^{{k,p}}(\Omega)^{*}$ define $\Lambda^{*}\in P\left(W^{{k,p}}(\Omega)\right)^{*}$, a linear functional on the image of $P$, by

 $\Lambda^{*}(Pu)=\Lambda(u)\quad\text{for all}\ u\in W^{{k,p}}(\Omega).$

Since $P$ is an isometric isomorphism, then

 $\left\|\Lambda^{*}\right\| _{{P\left(W^{{k,p}}(\Omega)\right)^{*}}}=\left\|\Lambda\right\| _{{W^{{k,p}}(\Omega)^{*}}}.$

The Hahn-Banach theorem (see for example [11, p76]) shows that there is a norm-preserving extension $\tilde{\Lambda}$ of $\Lambda^{*}$ to all of $L^{p}(\Omega)^{N}$, and, together with the characterisation of the dual of $L^{p}(\Omega)$, this shows that there exists $(v_{\alpha})\in L^{q}(\Omega)^{N}$ such that

 $\tilde{\Lambda}(w)=\sum _{{0\leq|\alpha|\leq k}}\left<w_{\alpha},v_{\alpha}\right>$

for any $w=(w_{\alpha})\in L^{p}(\Omega)^{N}$. Moreover, we also have

 $\|\tilde{\Lambda}\| _{{\left(L^{p}(\Omega)^{N}\right)^{*}}}=\left(\sum _{{0\leq|\alpha|\leq k}}\| v_{\alpha}\| _{{L^{q}(\Omega)}}^{q}\right)^{{\frac{1}{q}}}.$

Therefore, we have shown that for any $\Lambda\in W^{{k,p}}(\Omega)^{*}$ there exists $v=(v_{\alpha})_{{0\leq|\alpha|\leq k}}\in L^{q}(\Omega)^{N}$ such that for all $u\in W^{{k,p}}(\Omega)$ we have

 $\Lambda(u)=\Lambda^{*}(Pu)=\tilde{\Lambda}(Pu)=\sum _{{0\leq|\alpha|\leq k}}\left<D^{\alpha}u,v_{\alpha}\right>.$

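Moreover, at each stage of the construction, we also showed that

 $\|\Lambda\| _{{W^{{k,p}}(\Omega)^{*}}}=\|\Lambda^{*}\| _{{P\left(W^{{k,p}}(\Omega)\right)^{*}}}=\|\tilde{\Lambda}\| _{{\left(L^{p}(\Omega)^{N}\right)^{*}}}=\left\|(v_{\alpha})\right\| _{{L^{q}(\Omega)^{N}}}.$

Conversely, Hölder’s inequality shows that $\|\Lambda\| _{{W^{{k,p}}(\Omega)^{*}}}\leq\left\|(w_{\alpha})\right\| _{{L^{q}(\Omega)^{N}}}$ for every $(w_{\alpha})\in V$, and so the infimum in (3.2) is attained by $(v_{\alpha})$. ∎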
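
The inequality $\|\Lambda\| _{{W^{{k,p}}(\Omega)^{*}}}\leq\left\|(v_{\alpha})\right\| _{{L^{q}(\Omega)^{N}}}$, valid for every vector $(v_{\alpha})$ representing $\Lambda$, is what links the equality above to the infimum in (3.2). The following is a sketch of this estimate for $1<p<\infty$. It assumes that the norms are $\| u\| _{{W^{{k,p}}(\Omega)}}=\left(\sum _{{0\leq|\alpha|\leq k}}\| D^{\alpha}u\| _{{L^{p}(\Omega)}}^{p}\right)^{{\frac{1}{p}}}$ and $\left\|(v_{\alpha})\right\| _{{L^{q}(\Omega)^{N}}}=\left(\sum _{{0\leq|\alpha|\leq k}}\| v_{\alpha}\| _{{L^{q}(\Omega)}}^{q}\right)^{{\frac{1}{q}}}$, which is the convention under which $P$ is an isometry.

 $\begin{aligned}\left|\Lambda(u)\right|&\leq\sum _{{0\leq|\alpha|\leq k}}\left|\left<D^{\alpha}u,v_{\alpha}\right>\right|\leq\sum _{{0\leq|\alpha|\leq k}}\| D^{\alpha}u\| _{{L^{p}(\Omega)}}\| v_{\alpha}\| _{{L^{q}(\Omega)}}\\ &\leq\left(\sum _{{0\leq|\alpha|\leq k}}\| D^{\alpha}u\| _{{L^{p}(\Omega)}}^{p}\right)^{{\frac{1}{p}}}\left(\sum _{{0\leq|\alpha|\leq k}}\| v_{\alpha}\| _{{L^{q}(\Omega)}}^{q}\right)^{{\frac{1}{q}}}=\| u\| _{{W^{{k,p}}(\Omega)}}\left\|(v_{\alpha})\right\| _{{L^{q}(\Omega)^{N}}}.\end{aligned}$

The second inequality is Hölder’s inequality for integrals and the third is Hölder’s inequality for finite sums. When $p=1$ the same chain holds with the last factor replaced by $\max _{\alpha}\| v_{\alpha}\| _{{L^{\infty}(\Omega)}}$.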

Unfortunately this map $F:L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*}$ is not an isomorphism, since it may have a non-trivial kernel, as the next example shows.

Example 3.19.

Let $\Omega$ be an open subset of $\mathbb{R}$, and let $\varphi$ be a smooth function on $\Omega$ with compact support. Then

 $\int _{\Omega}\partial _{x}u\,\varphi\, dx=-\int _{\Omega}u\,\partial _{x}\varphi\, dx$ (3.3)

by the definition of weak derivative. Now consider the vector $(\partial _{x}\varphi,\varphi)\in L^{q}(\Omega)^{2}$. The linear functional $\Lambda\in W^{{1,p}}(\Omega)^{*}$ associated to this vector is

 $\Lambda(u)=\left<u,\partial _{x}\varphi\right>+\left<\partial _{x}u,\varphi\right>,$

which is zero by (3.3). Therefore, for every non-zero smooth function $\varphi$ with compact support contained in $\Omega$, the vector $(\partial _{x}\varphi,\varphi)\in L^{q}(\Omega)^{2}$ is a non-trivial element of the kernel of the map $F:L^{q}(\Omega)^{2}\rightarrow W^{{1,p}}(\Omega)^{*}$.

Remark 3.20.

More generally, if the functional $\Lambda$ is represented by a vector of smooth functions, i.e. $(v_{\alpha})\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)^{N}\subset L^{q}(\Omega)^{N}$, then we can write

 $\Lambda(u)=\sum _{{0\leq|\alpha|\leq k}}\left<D^{\alpha}u,v_{\alpha}\right>=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}\left<u,D^{\alpha}v_{\alpha}\right>.$

Therefore $\Lambda(u)=\left<u,f\right>$, where $f=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}D^{\alpha}v_{\alpha}$. In particular, we see that $\Lambda$ is the zero functional if $f\equiv 0$.

The next lemma shows that each element of the dual of a Sobolev space can be regarded as an extension of some distribution.

Lemma 3.21.

Let $\Lambda\in W^{{k,p}}(\Omega)^{*}$. Then there exists $T\in\mathcal{D}(\Omega)^{*}$ such that $\Lambda(\phi)=T(\phi)$ for all $\phi\in\mathcal{D}(\Omega)$.

Proof.

Using the previous theorem, there exists $v=(v_{\alpha})_{{0\leq|\alpha|\leq k}}\in L^{q}(\Omega)^{N}$ such that

 $\Lambda(u)=\sum _{{0\leq|\alpha|\leq k}}\left<D^{\alpha}u,v_{\alpha}\right>$

for every $u\in W^{{k,p}}(\Omega)$. Note that if $\phi\in\mathcal{D}(\Omega)$, then

 $\begin{aligned}\Lambda(\phi)&=\sum _{{0\leq|\alpha|\leq k}}\left<D^{\alpha}\phi,v_{\alpha}\right>=\sum _{{0\leq|\alpha|\leq k}}\int _{\Omega}D^{\alpha}\phi\, v_{\alpha}\, dx\\ &=\sum _{{0\leq|\alpha|\leq k}}T_{{v_{\alpha}}}(D^{\alpha}\phi)=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}D^{\alpha}T_{{v_{\alpha}}}(\phi),\end{aligned}$

where, in the last term, $D^{\alpha}T_{{v_{\alpha}}}$ refers to the distributional derivative of $T_{{v_{\alpha}}}$.

Define

 $T=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}D^{\alpha}T_{{v_{\alpha}}}\in\mathcal{D}(\Omega)^{*}.$

Then we have shown that $T(\phi)=\Lambda(\phi)$ for all $\phi\in\mathcal{D}(\Omega)$. ∎
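
For instance, when $n=1$ and $k=1$ the representing vector is $(v_{0},v_{1})\in L^{q}(\Omega)^{2}$, and the construction in the proof of Lemma 3.21 reads as follows (a sketch; $D$ denotes the distributional derivative, with the sign convention $(DT)(\phi)=-T(\partial _{x}\phi)$ used in the proof of Lemma 3.1):

 $\Lambda(\phi)=\int _{\Omega}\phi\, v_{0}\, dx+\int _{\Omega}\partial _{x}\phi\, v_{1}\, dx=T_{{v_{0}}}(\phi)+T_{{v_{1}}}(\partial _{x}\phi)=T_{{v_{0}}}(\phi)-(DT_{{v_{1}}})(\phi),$

so that $T=T_{{v_{0}}}-DT_{{v_{1}}}$, in agreement with the general formula $T=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}D^{\alpha}T_{{v_{\alpha}}}$.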

The previous theorems give different characterisations of elements of the dual of $W^{{k,p}}(\Omega)$: Theorem 3.18 shows that there is a surjective map $F:L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*}$, while Lemma 3.21 shows that the restriction of each linear functional to $\mathcal{D}(\Omega)$ is a distribution. Therefore we have maps $L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*}$ and $W^{{k,p}}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}$.

Unfortunately, these results do not give a nice description of the kernel of the first map and the image of the second map. In addition, the second map may have a non-trivial kernel (see Remark 3.24). It turns out that $W_{0}^{{k,p}}(\Omega)$ has better properties with respect to the second map, and the next theorem describes the image of the restriction map $W_{0}^{{k,p}}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}$.

Theorem 3.22.

The dual space $W_{0}^{{k,p}}(\Omega)^{*}$ is isometrically isomorphic to the Banach space consisting of those distributions $T\in\mathcal{D}(\Omega)^{*}$ that satisfy

 $T=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}D^{\alpha}T_{{v_{\alpha}}}$ (3.4)

for some $v=(v_{\alpha})\in L^{q}(\Omega)^{N}$, and whose norm is given by

 $\| T\|:=\inf\left\{\| v\| _{{L^{q}(\Omega)^{N}}}\,:\, v\in L^{q}(\Omega)^{N}\;\text{and}\;\; T=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}D^{\alpha}T_{{v_{\alpha}}}\right\}.$ (3.5)
Proof.

Let $V^{{\prime}}\subseteq\mathcal{D}(\Omega)^{*}$ be the space of distributions satisfying (3.4) for some $v=(v_{\alpha})\in L^{q}(\Omega)^{N}$, and let $T\in V^{{\prime}}$. The goal of the proof is to show that $T$ has a unique extension to some $\Lambda\in W_{0}^{{k,p}}(\Omega)^{*}$, and, moreover, that this map $T\mapsto\Lambda$ is the inverse of the restriction map from the previous lemma.

Given $u\in W_{0}^{{k,p}}(\Omega)$, let $(\phi _{n})_{{n\in\mathbb{N}}}$ be a sequence of test functions converging to $u$ in the $W^{{k,p}}(\Omega)$-norm (note that this is not the same as convergence in the topology on the space of test functions). Such a sequence exists by the definition of $W_{0}^{{k,p}}(\Omega)$. We claim that $(T(\phi _{n}))_{{n\in\mathbb{N}}}$ is a Cauchy sequence in $\mathbb{C}$, which is a consequence of the following calculation

 $\begin{aligned}\left|T(\phi _{m})-T(\phi _{n})\right|&\leq\sum _{{0\leq|\alpha|\leq k}}\left|T_{{v_{\alpha}}}(D^{\alpha}\phi _{m}-D^{\alpha}\phi _{n})\right|\\ &\leq\sum _{{0\leq|\alpha|\leq k}}\left\| D^{\alpha}(\phi _{m}-\phi _{n})\right\| _{{L^{p}(\Omega)}}\left\| v_{\alpha}\right\| _{{L^{q}(\Omega)}}\quad\text{(H\"{o}lder's inequality)}\\ &\leq\left\|\phi _{m}-\phi _{n}\right\| _{{W^{{k,p}}(\Omega)}}\sum _{{0\leq|\alpha|\leq k}}\|(v_{\alpha})\| _{{L^{q}(\Omega)^{N}}}\quad\text{(definition of the}\ W^{{k,p}}\ \text{norm)},\end{aligned}$

which converges to zero, since $(\phi _{n})_{{n\in\mathbb{N}}}$ is a Cauchy sequence in $W^{{k,p}}(\Omega)$. Therefore $\displaystyle{\lim _{{n\rightarrow\infty}}T(\phi _{n})}$ exists, and we claim that the limit only depends on $u$. To see this, consider another sequence $(\varphi _{n})_{{n\in\mathbb{N}}}$ of test functions converging to $u$ in the $W^{{k,p}}(\Omega)$ norm, and note that the same calculation as above shows that

 $\begin{aligned}\left|T(\phi _{n})-T(\varphi _{n})\right|&\leq\left\|\phi _{n}-\varphi _{n}\right\| _{{W^{{k,p}}(\Omega)}}\sum _{{0\leq|\alpha|\leq k}}\|(v_{\alpha})\| _{{L^{q}(\Omega)^{N}}}\\ &\leq\left(\left\|\phi _{n}-u\right\| _{{W^{{k,p}}(\Omega)}}+\left\|\varphi _{n}-u\right\| _{{W^{{k,p}}(\Omega)}}\right)\sum _{{0\leq|\alpha|\leq k}}\|(v_{\alpha})\| _{{L^{q}(\Omega)^{N}}},\end{aligned}$

which converges to zero as $n\rightarrow\infty$. Therefore, we can define

 $\Lambda(u):=\displaystyle{\lim _{{n\rightarrow\infty}}T(\phi _{n})}.$

Clearly $\Lambda$ is linear, since both $T$ and the operation of taking the limit in $W^{{k,p}}(\Omega)$ are linear. To see that $\Lambda$ is bounded, we compute

 $\left|\Lambda(u)\right|=\lim _{{n\rightarrow\infty}}\left|T(\phi _{n})\right|\leq\lim _{{n\rightarrow\infty}}\|\phi _{n}\| _{{W^{{k,p}}(\Omega)}}\sum _{{0\leq|\alpha|\leq k}}\|(v_{\alpha})\| _{{L^{q}(\Omega)^{N}}}=\| u\| _{{W^{{k,p}}(\Omega)}}\sum _{{0\leq|\alpha|\leq k}}\|(v_{\alpha})\| _{{L^{q}(\Omega)^{N}}},$

and so $\|\Lambda\| _{{W_{0}^{{k,p}}(\Omega)^{*}}}\leq\sum _{{0\leq|\alpha|\leq k}}\|(v_{\alpha})\| _{{L^{q}(\Omega)^{N}}}$.

Therefore, we have shown that $T$ has an extension to $\Lambda\in W_{0}^{{k,p}}(\Omega)^{*}$, and, moreover, this extension is unique since $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ is dense in $W_{0}^{{k,p}}(\Omega)$. More precisely, any other bounded linear functional $\Lambda^{{\prime}}$ that restricts to $T$ on $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ must satisfy

 $\begin{aligned}\Lambda(u)-\Lambda^{{\prime}}(u)&=\lim _{{n\rightarrow\infty}}\Lambda(\phi _{n})-\lim _{{n\rightarrow\infty}}\Lambda^{{\prime}}(\phi _{n})\quad\text{(since}\ \Lambda\ \text{and}\ \Lambda^{{\prime}}\ \text{are both continuous)}\\ &=\lim _{{n\rightarrow\infty}}T(\phi _{n})-\lim _{{n\rightarrow\infty}}T(\phi _{n})=0.\end{aligned}$

By construction, $\Lambda(\phi)=T(\phi)$ for every test function $\phi$, and so the map $V^{{\prime}}\rightarrow W_{0}^{{k,p}}(\Omega)^{*}$ is the inverse of the restriction map from Lemma 3.21. To see that this is an isometry, note that Theorem 3.18 (whose proof applies equally to the closed subspace $W_{0}^{{k,p}}(\Omega)$) shows that the norm on $V^{{\prime}}$ given by (3.5) is the same as the norm on $W_{0}^{{k,p}}(\Omega)^{*}$ given by (3.2). Therefore $V^{{\prime}}$ is isometrically isomorphic to $W_{0}^{{k,p}}(\Omega)^{*}$, which also implies that $V^{{\prime}}$ is a Banach space. ∎

Remark 3.23.

The space $V^{{\prime}}$ is a strict subset of $\mathcal{D}(\Omega)^{*}$, since there are many distributions that cannot be written as

 $T=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}T_{{v_{\alpha}}}$

for some $(v_{\alpha})\in L^{q}(\Omega)^{N}$. For example, the delta functional can never be written in this form, since Example 2.10 shows that it cannot be represented by a function.

Remark 3.24.
1. As part of the previous proof we showed that the restriction map

 $W_{0}^{{k,p}}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}$

is injective. It is natural to ask whether these results can be extended to $W^{{k,p}}(\Omega)$. However, the previous proof will not work, since it depends on the fact that, by definition, $W_{0}^{{k,p}}(\Omega)$ is the completion of $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ in the $W^{{k,p}}$ norm (the first step is to approximate an element of $W_{0}^{{k,p}}(\Omega)$ by a sequence of smooth functions with compact support).

2. One could still ask whether there is an alternative proof that works for $W^{{k,p}}(\Omega)$. It turns out that in general the answer is no, since the extension of a linear functional $T\in\mathcal{D}(\Omega)^{*}$ to a linear functional $\Lambda\in W^{{k,p}}(\Omega)^{*}$ may be non-unique. When the domain $\Omega$ is bounded and the boundary has good properties, one can construct examples using the trace operator $W^{{1,p}}(\Omega)\rightarrow L^{p}(\partial\Omega)$ (see [4, Section 5.5] for the construction), which is zero on $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ but non-zero in general. Composing the trace operator with a bounded linear functional on $L^{p}(\partial\Omega)$ gives an element of $W^{{1,p}}(\Omega)^{*}$ that vanishes on $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ but is non-zero in general. Therefore the restriction map $W^{{k,p}}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}$ has non-zero kernel, and so we cannot identify $W^{{k,p}}(\Omega)^{*}$ with a subspace of $\mathcal{D}(\Omega)^{*}$ in this case. Note that the trace operator is zero precisely on the subspace $W_{0}^{{1,p}}(\Omega)$ (see [4, Theorem 2, Section 5.5] for more details).

3.4 Positive distributions can be represented by measures (the Riesz representation theorem)

Given the results of the previous section on the dual space of a Sobolev space, it is natural to ask whether there is a nice characterisation of $\mathcal{D}(\Omega)^{*}$ in terms of familiar objects, and it is the goal of this section to answer this question for positive, real-valued distributions.

As we have seen from Theorem 2.8, there is an injective map $L_{{loc}}^{1}(\Omega)\hookrightarrow\mathcal{D}(\Omega)^{*}$. Unfortunately, as explained in Example 2.10, the set $L_{{loc}}^{1}(\Omega)$ is too small to provide a unique representative for every distribution. In Theorem 3.35 we show that regular Borel measures are the right class of objects to represent distributions.

This theorem is also proved in [12, Theorem 2.14] (for the dual of the space of continuous functions with compact support) and [8, Theorem 6.22] (for the dual of the space of smooth functions with compact support). Both proofs follow a similar strategy, which involves first using the distribution to define an outer measure, and then showing that open sets are all measurable with respect to this outer measure. Rudin also considers the case of complex-valued distributions in [12, Theorem 6.19], and a more general proof (for the dual of the space $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega,\mathbb{R}^{m})$) is given in [5, Section 1.8].

Note that in [8] the proof only uses the Riemann integral, and in particular it does not involve Lebesgue measure. Since we are assuming the construction of Lebesgue measure (and a construction using outer measure is also given in Definition B.18), then we are free to use it here where it simplifies the proof.

For this entire section we use the following notation: let $\Omega$ be an open subset of $\mathbb{R}^{n}$, let $\mathcal{O}(\Omega)$ denote the collection of open subsets of $\Omega$, and let $\mathcal{B}$ denote the Borel $\sigma$-algebra generated by the open subsets of $\Omega$.

Definition 3.25.

Let $T\in\mathcal{D}(\Omega)^{*}$. The distribution $T$ is a positive distribution if $T(\phi)\geq 0$ for all $\phi\in\mathcal{D}(\Omega)$ such that $\phi(x)\geq 0$ for all $x$.

In the following, let $U\subseteq\Omega$ be an open set, and define $\mathcal{C}(U)$ to be the set of functions $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ with $0\leq\phi\leq 1$ and $\supp(\phi)\subset U$ (note that Urysohn’s lemma shows that this set is nonempty if $U$ is nonempty).

Lemma 3.26.

Let $T\in\mathcal{D}(\Omega)^{*}$ be a positive distribution. Then the function $\mu:\mathcal{O}(\Omega)\rightarrow\mathbb{R}$ defined by

 $\mu(U):=\left\{\begin{matrix}\sup _{{\phi\in\mathcal{C}(U)}}\{ T(\phi)\}&\text{if $U$ is nonempty}\\ 0&\text{if $U=\emptyset$}\end{matrix}\right.$ (3.6)

satisfies

1. $\mu(U_{1})\leq\mu(U_{2})$ if $U_{1}\subseteq U_{2}$ are open sets,

2. $\displaystyle{\mu\left(\bigcup _{{n\in\mathbb{N}}}U_{n}\right)\leq\sum _{{n\in\mathbb{N}}}\mu(U_{n})}$ for every countable collection of open subsets $\{ U_{n}\} _{{n\in\mathbb{N}}}\subset\mathcal{O}$.

Proof.

The first property follows from the fact that $U_{1}\subseteq U_{2}$ implies that $\mathcal{C}(U_{1})\subseteq\mathcal{C}(U_{2})$.

To prove the second property, we first show that

 $\mu(U_{1}\cup U_{2})\leq\mu(U_{1})+\mu(U_{2})$

for any open sets $U_{1},U_{2}\in\mathcal{O}$. Given any $\phi\in\mathcal{C}(U_{1}\cup U_{2})$, let $K=\supp(\phi)$, and apply Lemma A.12 to show that there exist functions $\phi _{1}$ and $\phi _{2}$ such that $\phi\cdot\phi _{1}\in\mathcal{C}(U_{1})$, $\phi\cdot\phi _{2}\in\mathcal{C}(U_{2})$, and $\phi\cdot\phi _{1}+\phi\cdot\phi _{2}=\phi$. Therefore

 $T(\phi)=T(\phi\cdot\phi _{1})+T(\phi\cdot\phi _{2})\leq\mu(U_{1})+\mu(U_{2})$

for all $\phi\in\mathcal{C}(U_{1}\cup U_{2})$, and so $\mu(U_{1}\cup U_{2})\leq\mu(U_{1})+\mu(U_{2})$. Induction then shows that for any $N\in\mathbb{N}$

 $\mu\left(\bigcup _{{n=1}}^{N}U_{n}\right)\leq\sum _{{n=1}}^{N}\mu(U_{n}),$ (3.7)

and so it only remains to extend this to countable collections of open sets. To do this, note that any $\displaystyle{\phi\in\mathcal{C}\left(\bigcup _{{n\in\mathbb{N}}}U_{n}\right)}$ has compact support in $\Omega$, and so there exists a finite collection of sets (re-order so that these are $U_{1},\ldots,U_{N}$) such that $\displaystyle{\supp(\phi)\subset\bigcup _{{n=1}}^{N}U_{n}}$. Equation (3.7) then gives us

 $T(\phi)\leq\sum _{{n=1}}^{N}\mu(U_{n})\leq\sum _{{n\in\mathbb{N}}}\mu(U_{n}),$

which completes the proof. ∎

Now extend $\mu$ to a function $\mu^{*}$ on the set of all subsets of $\Omega$ by

 $\mu^{*}(A):=\inf\{\mu(U)\,:\, A\subset U\,\text{and}\, U\in\mathcal{O}\}.$ (3.8)
Lemma 3.27.

The function $\mu^{*}$ is an outer measure on $\Omega$.

Proof.

Recall that we have to prove that each of the following conditions hold.

1. $\mu^{*}(A)\geq 0$ for all $A\subseteq\Omega$ and $\mu^{*}(\emptyset)=0$,

2. $\mu^{*}(A_{1})\leq\mu^{*}(A_{2})$ if $A_{1}\subseteq A_{2}$, and

3. $\displaystyle{\mu^{*}\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)\leq\sum _{{n\in\mathbb{N}}}\mu^{*}(A_{n})}$ for any countable collection of sets $\{ A_{n}\} _{{n\in\mathbb{N}}}\subset\mathcal{P}(\Omega)$.

The first two of the above properties follow easily from the respective definitions of $\mu^{*}$ and $\mu$, and so it only remains to show countable subadditivity. For any $\varepsilon>0$, let $\{ U_{n}\} _{{n\in\mathbb{N}}}$ be a collection of open subsets of $\Omega$ such that $A_{n}\subseteq U_{n}$ and $\mu^{*}(U_{n})=\mu(U_{n})\leq\mu^{*}(A_{n})+2^{{-n}}\varepsilon$ (these sets exist since $\mu^{*}$ is defined using the infimum). Then monotonicity of $\mu^{*}$ and the countable subadditivity of $\mu$ on open sets (Lemma 3.26) give

 $\mu^{*}\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)\leq\mu^{*}\left(\bigcup _{{n\in\mathbb{N}}}U_{n}\right)\leq\sum _{{n\in\mathbb{N}}}\mu(U_{n})\leq\sum _{{n\in\mathbb{N}}}\mu^{*}(A_{n})+\varepsilon.$

Since we can do this for any $\varepsilon>0$, then we have

 $\mu^{*}\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)\leq\sum _{{n\in\mathbb{N}}}\mu^{*}(A_{n}),$

as required. ∎

It is worth pausing at this stage to consider some examples.

Example 3.28.
1. Given $x\in\Omega$, let $T=\delta _{x}$, the delta functional at $x$. Then for any subset $A\subseteq\Omega$ we have

 $\mu(A)=\left\{\begin{matrix}1&x\in A\\ 0&x\notin A\end{matrix}\right.$
2. Let $\Omega=\mathbb{R}^{n}$ with co-ordinates $(x_{1},\ldots,x_{n})$, and let $T$ be the distribution defined by integration on the subspace $\mathbb{R}_{1}=\{ x_{2}=\cdots=x_{n}=0\}$, i.e.

 $T(\phi)=\int _{{-\infty}}^{\infty}\phi(t,0,\ldots,0)\, dt.$

Then for any subset $A\subseteq\Omega$ we have $\mu(A)=\left|\mathbb{R}_{1}\cap A\right|$, where $|\cdot|$ denotes the one-dimensional Lebesgue measure on $\mathbb{R}_{1}\cong\mathbb{R}$.
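
The first item can be spelled out as a short worked computation (a sketch, assuming the bump functions supplied by Urysohn's lemma and taking $T$ to be the delta functional at $x$, so that $T(\phi)=\phi(x)$). Let $U\subseteq\Omega$ be open. If $x\in U$, then Urysohn's lemma applied to the compact set $\{ x\}$ gives some $\phi\in\mathcal{C}(U)$ with $\phi(x)=1$, while every $\phi\in\mathcal{C}(U)$ satisfies $\phi(x)\leq 1$. If $x\notin U$, then $\supp(\phi)\subset U$ forces $\phi(x)=0$ for every $\phi\in\mathcal{C}(U)$. Hence

 $\mu(U)=\sup _{{\phi\in\mathcal{C}(U)}}\phi(x)=\left\{\begin{matrix}1&x\in U\\ 0&x\notin U.\end{matrix}\right.$

Taking the infimum over open sets containing $A$ in (3.8) then gives the same two values for an arbitrary subset $A$: if $x\in A$ then every open set containing $A$ contains $x$, and if $x\notin A$ then the open set $\Omega\setminus\{ x\}$ contains $A$ and has $\mu(\Omega\setminus\{ x\})=0$.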

Theorem B.16 shows that to construct a measure $\mu$ from $\mu^{*}$ we need to restrict to the $\sigma$-algebra of measurable subsets. The next lemma shows that, for the outer measure constructed above, this $\sigma$-algebra contains the Borel $\sigma$-algebra $\mathcal{B}$.

Lemma 3.29.

All open sets $U\in\mathcal{O}(\Omega)$ are measurable with respect to $\mu^{*}$, i.e. for every set $A\subseteq\Omega$ we have

 $\mu^{*}(A)=\mu^{*}(A\cap U)+\mu^{*}(A\cap(\Omega\setminus U)).$
Proof.

Since $A=(A\cap U)\cup(A\cap(\Omega\setminus U))$, then the inequality

 $\mu^{*}(A)\leq\mu^{*}(A\cap U)+\mu^{*}(A\cap(\Omega\setminus U))$

follows from the previous lemma, and so it only remains to show the reverse inequality. First consider the case where $A$ is an open subset of $\Omega$. Given any open set $U\subset\Omega$ and any $\varepsilon>0$, choose $\phi\in\mathcal{C}(A\cap U)$ such that $T(\phi)\geq\mu^{*}(A\cap U)-\frac{1}{2}\varepsilon$ (such a $\phi$ exists since $\mu^{*}(A\cap U)=\mu(A\cap U)$ is defined using the supremum). Let $K=\supp(\phi)$. Then $\Omega\setminus K$ is open, and $K\subset A\cap U\subseteq U$ implies that $\Omega\setminus U\subset\Omega\setminus K$.

Now choose $\psi\in\mathcal{C}\left((\Omega\setminus K)\cap A\right)$ such that $T(\psi)\geq\mu^{*}\left((\Omega\setminus K)\cap A\right)-\frac{1}{2}\varepsilon$ (again, such a $\psi$ exists since $\mu^{*}\left((\Omega\setminus K)\cap A\right)=\mu\left((\Omega\setminus K)\cap A\right)$ is defined using the supremum). Since $\supp(\phi)=K$ and $\supp(\psi)\subset(\Omega\setminus K)\cap A\subseteq\Omega\setminus K$, the functions $\phi$ and $\psi$ have disjoint support. Hence $\phi+\psi\in\mathcal{C}(A)$, and so

 $\displaystyle\mu^{*}(A)=\mu(A)$ $\displaystyle\geq T(\phi)+T(\psi)$ $\displaystyle\geq\mu^{*}\left(A\cap U\right)-\frac{1}{2}\varepsilon+\mu^{*}\left((\Omega\setminus K)\cap A\right)-\frac{1}{2}\varepsilon$ $\displaystyle\geq\mu^{*}\left(A\cap U\right)+\mu^{*}\left(A\cap(\Omega\setminus U)\right)-\varepsilon,$

where the last step follows from the monotonicity of $\mu^{*}$ (Lemma 3.27) and the fact that $\Omega\setminus U\subset\Omega\setminus K$. We can do this for any $\varepsilon>0$, and so $\mu^{*}(A)\geq\mu^{*}\left(A\cap U\right)+\mu^{*}\left(A\cap(\Omega\setminus U)\right)$ for any open set $U\subseteq\Omega$.

Now consider the case where $A$ is an arbitrary subset of $\Omega$. Then for any open set $U\subseteq\Omega$ containing $A$ and any open set $V\subseteq\Omega$ we have from Lemma 3.27

 $\displaystyle\mu^{*}(U)\geq\mu^{*}(A)\quad\text{since $A\subseteq U$},$

 $\displaystyle\mu^{*}(U\cap V)\geq\mu^{*}(A\cap V)\quad\text{since $A\cap V\subseteq U\cap V$},$

 $\displaystyle\text{and}\quad\mu^{*}\left(U\cap(\Omega\setminus V)\right)\geq\mu^{*}\left(A\cap(\Omega\setminus V)\right)\quad\text{since $A\cap(\Omega\setminus V)\subseteq U\cap(\Omega\setminus V)$}.$

Therefore

 $\mu^{*}(U)=\mu^{*}(U\cap V)+\mu^{*}\left(U\cap(\Omega\setminus V)\right)\geq\mu^{*}(A\cap V)+\mu^{*}\left(A\cap(\Omega\setminus V)\right)$

for every open set $U\subseteq\Omega$ containing $A$, and any open set $V\subseteq\Omega$. Therefore, since $\mu^{*}(A)$ is defined using the infimum, then

 $\mu^{*}(A)\geq\mu^{*}(A\cap V)+\mu^{*}\left(A\cap(\Omega\setminus V)\right),$

which completes the proof. ∎

Therefore, by Theorem B.16, the function $\mu^{*}$ restricts to a measure (call it $\mu$) on the Borel $\sigma$-algebra $\mathcal{B}$. Note that this measure $\mu$ is given by (3.6) on open sets. The next two lemmas give a characterisation of $\mu$ on compact sets.

Lemma 3.30.

Given any compact set $K\subset\Omega$, and any $\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\psi\equiv 1$ on $K$ and $0\leq\psi\leq 1$ on $\Omega$, we have $\mu(K)\leq T(\psi)$.

Proof.

(See also [12, p43].) For all $\alpha$ such that $0<\alpha<1$, let $V_{\alpha}=\{ x\,:\,\psi(x)>\alpha\}$. Then each $V_{\alpha}$ is open, and since $\psi\equiv 1$ on $K$ we have $K\subset V_{\alpha}$. Moreover, if $\phi\in\mathcal{C}(V_{\alpha})$ then $\alpha\phi(x)\leq\psi(x)$ for all $x\in V_{\alpha}$. Therefore $T(\phi)\leq\frac{1}{\alpha}T(\psi)$ (since $T$ is a positive distribution) and we have

 $\mu(K)\leq\mu(V_{\alpha})=\sup\{ T(\phi)\,:\,\phi\in\mathcal{C}(V_{\alpha})\}\leq\frac{1}{\alpha}T(\psi)$

for all $\alpha$ such that $0<\alpha<1$. Letting $\alpha\rightarrow 1$ gives $\mu(K)\leq T(\psi)$. ∎

Corollary 3.31.

If $K$ is compact, then $\mu(K)$ is finite.

Lemma 3.32.

Let $K\subset\Omega$ be a compact set. Then

 $\mu(K)=\inf\left\{ T(\psi)\,:\,\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega),\, 0\leq\psi\leq 1,\,\text{and $\psi\equiv 1$ on $K$}\right\}.$ (3.9)
Proof.

Firstly note that compact sets are closed and therefore elements of the Borel $\sigma$-algebra. Given any $\varepsilon>0$, let $U$ be an open set such that $K\subset U\subseteq\Omega$ and $\mu(U)\leq\mu(K)+\varepsilon$ (the existence of $U$ follows from outer regularity of $\mu$, which is a direct consequence of the definition of $\mu^{*}$ in (3.8)). Recall from Urysohn’s lemma (Theorem A.11) that there exists $\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\supp(\psi)\subset U$, $0\leq\psi(x)\leq 1$ for all $x\in U$, and $\psi\equiv 1$ on $K$. Then $\psi\in\mathcal{C}(U)$, and so $T(\psi)\leq\mu(U)$ by (3.6). Therefore Lemma 3.30 implies that

 $\mu(K)\leq T(\psi)\leq\mu(U)\leq\mu(K)+\varepsilon.$

We can do this for any $\varepsilon>0$ and any compact set $K\subset\Omega$, therefore (3.9) holds for any compact set $K$. ∎

Lemma 3.33.

Given any $\varepsilon>0$ and any measurable set $A$ there exists an open set $U$ with $A\subset U$ and $\mu(U\setminus A)<\varepsilon$.

Proof.

If $\mu(A)$ is finite then the result follows easily, since $A$ is measurable and $\mu^{*}(A)$ is defined to be the infimum of $\mu(U)$ for $U\supset A$ open.

If $\mu(A)$ is infinite, then we first write the open set $\Omega$ as the countable union of compact sets

 $\Omega=\bigcup _{{\ell\in\mathbb{N}}}K_{\ell}$

(for example we could take each $K_{\ell}$ to be a closed ball), and note that

 $A=\bigcup _{{\ell\in\mathbb{N}}}A\cap K_{\ell}.$

Each $A\cap K_{\ell}$ is a subset of a compact set, and therefore has finite measure, so we can find an open set $U_{\ell}$ such that $A\cap K_{\ell}\subset U_{\ell}$ and

 $\mu(U_{\ell}\setminus(A\cap K_{\ell}))<2^{{-\ell}}\varepsilon.$

Then $\displaystyle{U=\bigcup _{{\ell\in\mathbb{N}}}U_{\ell}}$ is an open set containing $A$, and

 $\mu(U\setminus A)\leq\mu\left(\bigcup _{{\ell\in\mathbb{N}}}\left(U_{\ell}\setminus(A\cap K_{\ell})\right)\right)\leq\sum _{{\ell\in\mathbb{N}}}\mu(U_{\ell}\setminus(A\cap K_{\ell}))<\varepsilon.$ ∎

We can now show that the measure $\mu$ is Borel regular (recall Definition B.8).

Lemma 3.34.

$\mu$ is a regular Borel measure on $\Omega$.

Proof.

Outer regularity of $\mu$ follows easily from the definition of $\mu^{*}$, and therefore it only remains to show that it is inner regular, i.e. for any measurable set $A\subseteq\Omega$ we have

 $\mu(A)=\sup\left\{\mu(K)\,:\, K\subset A\,\text{and K is compact}\right\}.$ (3.10)

Given $\varepsilon>0$, Lemma 3.33 applied to the measurable set $\Omega\setminus A$ shows that there exists an open set $U$ such that $\Omega\setminus A\subset U$ and $\mu\left(U\setminus(\Omega\setminus A)\right)<\varepsilon$. Since we also have $\Omega\setminus U\subset A$, it follows that

 $U\setminus(\Omega\setminus A)=U\cap A=A\setminus(\Omega\setminus U),$

and so the previous lemma shows that there exists a closed set $F=\Omega\setminus U$ such that

 $\mu\left(A\setminus F\right)<\varepsilon.$

Any closed set $F\subset\mathbb{R}^{n}$ is the countable union of compact sets; for example we can take $K_{\ell}=F\cap\overline{B(0,\ell)}$ for each $\ell\in\mathbb{N}$ and write $\displaystyle{F=\bigcup _{{\ell\in\mathbb{N}}}K_{\ell}}$. For $F=\Omega\setminus U$ as above, let $\displaystyle{F_{n}=\bigcup _{{\ell=1}}^{n}K_{\ell}}$. If $\mu(A)$ is infinite, then $\lim _{{n\rightarrow\infty}}\mu(F_{n})$ is infinite also. If $\mu(A)$ is finite, then so is $\mu(F)$, therefore there exists $N$ such that $n\geq N$ implies that $\mu(F_{n})>\mu(F)-\varepsilon$.

In both of these cases we see that $\mu(A)$ can be approximated by the measure of compact sets contained in $A$, which completes the proof of (3.10). ∎

We are now ready to prove the main theorem of this section.

Theorem 3.35.

Given a positive distribution $T$ there is a unique, positive, regular Borel measure $\mu$ on $\Omega$ such that

1. $\mu(K)<\infty$ for all compact $K\subset\Omega$, and

2. for all $\phi\in\mathcal{D}(\Omega)$ we have

 $T(\phi)=\int _{\Omega}\phi(x)\, d\mu.$ (3.11)
Proof.

Given such a distribution $T$, we have already constructed a positive regular Borel measure $\mu$, which is defined on $\Omega$ and is finite on compact sets, and so it only remains to show (3.11) for all $\phi\in\mathcal{D}(\Omega)$.

First note that we can reduce to the case of $\phi\geq 0$, since both $T$ and the integral with respect to $\mu$ are linear, and any test function $\phi$ can be written $\phi=\phi _{+}-\phi _{-}$ for non-negative test functions $\phi _{+},\phi _{-}\geq 0$ (Lemma A.13).

For each $j,n\in\mathbb{N}$, define compact sets $K_{j}^{n}=\{ x\in\Omega\,:\,\phi(x)\geq\frac{j}{n}\}$ (these are compact since $\phi$ is continuous with compact support), and define $K_{0}^{n}=\supp(\phi)$. Let $\chi _{j}^{n}$ be the characteristic function of $K_{j}^{n}$. Then

 $\phi(x)\leq f_{n}(x):=\frac{1}{n}\sum _{{j\geq 0}}\chi _{j}^{n}(x)\quad\text{for all $x\in\Omega$}.$

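Moreover, $f_{n}(x)$ converges pointwise to $\phi(x)$, since $0\leq f_{n}(x)-\phi(x)\leq\frac{1}{n}$ for each $n\in\mathbb{N}$. Since $f_{n}(x)\leq\sup _{{x\in\Omega}}\phi(x)+\frac{1}{n}$, and $\supp(\phi)$ is compact, we can construct a function in $L^{1}(\mu)$ that dominates $f_{n}$ (for example, $\sup _{{x\in\Omega}}\phi(x)+1$ times the characteristic function of $\supp(\phi)$, which is integrable since $\mu(\supp(\phi))$ is finite by Corollary 3.31), and so the dominated convergence theorem shows that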

 $\lim _{{n\rightarrow\infty}}\int _{\Omega}f_{n}(x)\, d\mu=\int _{\Omega}\phi(x)\, d\mu.$
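
This limit can also be checked by hand. The following is a sketch, assuming that $f_{n}=\frac{1}{n}\sum _{{j\geq 0}}\chi _{j}^{n}$ (the simple function whose integral is $\frac{1}{n}\sum _{{j\geq 0}}\mu(K_{j}^{n})$) and using that $f_{n}$ and $\phi$ both vanish outside $K_{0}^{n}=\supp(\phi)$. On $\supp(\phi)$ we have $0\leq f_{n}-\phi\leq\frac{1}{n}$, and so

 $0\leq\int _{\Omega}f_{n}(x)\, d\mu-\int _{\Omega}\phi(x)\, d\mu\leq\frac{1}{n}\mu\left(\supp(\phi)\right)\longrightarrow 0\quad\text{as $n\rightarrow\infty$},$

where $\mu\left(\supp(\phi)\right)$ is finite by Corollary 3.31.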

Therefore it only remains to show that the integral of $f_{n}$ with respect to $\mu$ converges to $T(\phi)$. To see this, note that for each $\varepsilon>0$ outer regularity of $\mu$ shows that we can choose $U_{j}^{n}$ to be an open set containing $K_{j}^{n}$ such that $\mu(U_{j}^{n})<\mu(K_{j}^{n})+\varepsilon$, and use Urysohn’s lemma (Theorem A.11) to find $\psi _{j}^{n}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $0\leq\psi _{j}^{n}\leq 1$, $\psi _{j}^{n}\equiv 1$ on $K_{j}^{n}$ and $\supp\psi _{j}^{n}\subset U_{j}^{n}$. Then by construction, we have

 $\phi(x)\leq f_{n}(x)\leq\frac{1}{n}\sum _{{j\geq 0}}\psi _{j}^{n}(x),$

and therefore $\displaystyle{T(\phi)\leq\frac{1}{n}\sum _{{j\geq 0}}T(\psi _{j}^{n})}$. By the definition of $\mu$ on open sets, we also have

 $\frac{1}{n}\sum _{{j\geq 0}}T(\psi _{j}^{n})\leq\frac{1}{n}\sum _{{j\geq 0}}\mu(U_{j}^{n})<\frac{1}{n}\sum _{{j\geq 0}}\left(\mu(K_{j}^{n})+\varepsilon\right).$

This is true for all $\varepsilon>0$, and so

 $T(\phi)\leq\frac{1}{n}\sum _{{j\geq 0}}\mu(K_{j}^{n})=\int _{\Omega}f_{n}(x)\, d\mu$

for all $n\in\mathbb{N}$. Taking the limit as $n\rightarrow\infty$ gives us $\displaystyle{T(\phi)\leq\int _{\Omega}\phi(x)\, d\mu}$.

Similarly, we can approximate $\phi$ from below by simple functions to obtain the opposite inequality. Since the idea is the same as above then we only sketch the details here.

For each $j,n\in\mathbb{N}$, let $O_{j}^{n}=\{ x\in\Omega\,:\,\phi(x)>\frac{j}{n}\}$, and let $\xi _{j}^{n}$ be the characteristic function of $O_{j}^{n}$. Then

 $g_{n}(x):=\frac{1}{n}\sum _{{j\geq 1}}\xi _{j}^{n}(x)\leq\phi(x)$

for all $n\in\mathbb{N}$, and $g_{n}$ converges pointwise to $\phi$ since $\phi(x)-g_{n}(x)\leq\frac{1}{n}$. Dominated convergence then shows that

 $\lim _{{n\rightarrow\infty}}\int _{\Omega}g_{n}(x)\, d\mu=\int _{\Omega}\phi(x)\, d\mu.$

For each $j,n$, inner regularity of $\mu$ implies that we can find a compact set $C_{j}^{n}$ such that $C_{j}^{n}\subset O_{j}^{n}$ and $\mu(C_{j}^{n})>\mu(O_{j}^{n})-\varepsilon$. Then use Urysohn’s lemma to find $\psi _{j}^{n}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $0\leq\psi _{j}^{n}\leq 1$, $\psi _{j}^{n}\equiv 1$ on $C_{j}^{n}$ and $\supp\psi _{j}^{n}\subset O_{j}^{n}$. Then the same argument as before shows that

 $T(\phi)\geq\frac{1}{n}\sum _{{j\geq 1}}T(\psi _{j}^{n})\geq\frac{1}{n}\sum _{{j\geq 1}}\mu(C_{j}^{n})\geq\frac{1}{n}\sum _{{j\geq 1}}\left(\mu(O_{j}^{n})-\varepsilon\right).$

This is true for all $\varepsilon>0$, and so

 $T(\phi)\geq\frac{1}{n}\sum _{{j\geq 1}}\mu(O_{j}^{n})=\int _{\Omega}g_{n}(x)\, d\mu\longrightarrow\int _{\Omega}\phi\, d\mu.$

Therefore $\displaystyle{T(\phi)=\int _{\Omega}\phi(x)\, d\mu}$, as required. ∎

Remark 3.36.

Recall that $f\in L_{{loc}}^{1}(\Omega)$ defines a distribution $\displaystyle{T_{f}(\phi)=\int _{\Omega}f\phi\, dx}$. Conversely, the Radon-Nikodym theorem and the Lebesgue decomposition (Theorems B.28 and B.30 respectively) show that a positive distribution can be represented by a function in $L_{{loc}}^{1}(\Omega)$ if and only if the measure $\mu$ from Theorem 3.35 is absolutely continuous with respect to Lebesgue measure. We have already seen that there exist distributions that cannot be represented by functions in $L_{{loc}}^{1}(\Omega)$, for example the delta functional from Example 2.10. For these distributions, the measure constructed in Theorem 3.35 will not be absolutely continuous with respect to Lebesgue measure, i.e. it will have a non-trivial singular component with respect to the Lebesgue decomposition (B.7).
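
As an illustration (a sketch with a hypothetical choice of $T$, not taken from the text above), let $\Omega=\mathbb{R}$ and define $\displaystyle{T(\phi)=\int _{\mathbb{R}}\phi(x)\, dx+\phi(0)}$. This is a positive distribution, and the measure satisfying (3.11) is

 $\mu(A)=|A|+\delta _{0}(A),$

where $|A|$ denotes the Lebesgue measure of the Borel set $A$, and $\delta _{0}(A)=1$ if $0\in A$ and $\delta _{0}(A)=0$ otherwise. The absolutely continuous part of $\mu$ has density $f\equiv 1$, and the singular part $\delta _{0}$ is concentrated on the Lebesgue null set $\{ 0\}$. Since the singular part is non-zero, $T$ is not represented by a locally integrable function.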

4 Embedding and compactness theorems

The goal of this section is to state the Sobolev Embedding Theorem and the Rellich-Kondrachov compactness theorem. For now, the proofs have been postponed until a future version of these notes. An excellent source for the embedding and compactness theorems is [1], which also contains many examples that show the bounds from the theorems are sharp.

First, we have to define the class of domains under consideration. Given an open subset $\Omega\subseteq\mathbb{R}^{n}$, let

 $\Omega _{\delta}:=\left\{ x\in\Omega\,:\,\mathop{\rm dist}\nolimits(x,\partial\Omega)<\delta\right\}.$
Definition 4.1.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$. We say that $\Omega$ satisfies the cone condition if there exists a finite cone $C$ such that each $x\in\Omega$ is the vertex of a finite cone $C_{x}$ contained in $\Omega$ and congruent to $C$.

We say that $\Omega$ satisfies the uniform cone condition if there exists a locally finite open cover $\{ U_{j}\}$ of the boundary of $\Omega$ and a corresponding sequence $(C_{j})_{{j\in\mathbb{N}}}$ of finite cones, each congruent to some fixed finite cone $C$, such that

1. there exists $M<\infty$ such that every $U_{j}$ has diameter less than $M$,

2. $\displaystyle{\Omega _{\delta}\subset\bigcup _{{j=1}}^{\infty}U_{j}}$ for some $\delta>0$,

3. $\displaystyle{Q_{j}\equiv\bigcup _{{x\in\Omega\cap U_{j}}}(x+C_{j})\subset\Omega}$ for every $j$, and

4. for some $R>1$, every collection of $R$ of the sets $Q_{j}$ has empty intersection.

Since $\Omega$ is open, a continuously differentiable function on $\Omega$ need not be bounded. For $j\geq 0$, define $C_{B}^{j}(\Omega)$ to be the space of functions in $C^{j}(\Omega)$ that are bounded and have bounded partial derivatives up to $j^{{th}}$ order.

This is a Banach space with norm

 $\left\| f\right\| _{{C_{B}^{j}(\Omega)}}=\max _{{0\leq|\alpha|\leq j}}\sup _{{x\in\Omega}}\left|D^{\alpha}f(x)\right|.$
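
Two quick examples may help here (both are standard facts, included only as illustrations). On $\Omega=(0,1)$ the function $f(x)=\frac{1}{x}$ is smooth, but $\sup _{{x\in\Omega}}|f(x)|=\infty$, and so $f\notin C_{B}^{0}(\Omega)$. On $\Omega=\mathbb{R}$ the function $f(x)=\sin x$ has derivatives $\pm\sin x$ and $\pm\cos x$ of every order, each with supremum $1$, and so

 $\left\| f\right\| _{{C_{B}^{j}(\mathbb{R})}}=\max _{{0\leq\alpha\leq j}}\sup _{{x\in\mathbb{R}}}\left|D^{\alpha}f(x)\right|=1\quad\text{for every $j\geq 0$}.$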

Recall that a linear map $T:A\rightarrow B$ of normed linear spaces is an embedding if $T$ is bounded with respect to the norms on $A$ and $B$. Since the elements of $W^{{k,p}}(\Omega)$ are equivalence classes of functions defined almost everywhere, then the meaning of an inclusion map from $W^{{k,p}}(\Omega)$ into $C_{B}^{j}(\Omega)$ is that each equivalence class in $W^{{k,p}}(\Omega)$ contains a function in $C_{B}^{j}(\Omega)$.

The meaning of an inclusion map from $W^{{k,p}}(\Omega)$ into $W^{{j,q}}(\Omega _{k})$ (where $\Omega _{k}$ is the intersection of $\Omega$ with a plane of dimension $k$ in $\mathbb{R}^{n}$) is that each function in $W^{{k,p}}(\Omega)$ is the limit of a sequence of $C^{\infty}$ functions (see Section 3.2) and the restriction of these smooth functions to $\Omega _{k}$ converges to a limit in $W^{{j,q}}(\Omega _{k})$. For the map to be well-defined then this limit needs to be independent of the original choice of sequence, however this is guaranteed if the norm on $W^{{j,q}}(\Omega _{k})$ is bounded by a constant times the norm on $W^{{k,p}}(\Omega)$ (which always occurs in the cases considered below).

Theorem 4.2 (Sobolev Embedding Theorem).

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set satisfying the cone condition, and, for $1\leq k\leq n$, let $\Omega _{k}$ be the intersection of $\Omega$ with a plane of dimension $k$ in $\mathbb{R}^{n}$. Let $j\geq 0$ and $m\geq 1$ be integers, and let $1\leq p<\infty$. Then

1. If either $m-\frac{n}{p}>0$, or $m=n$ and $p=1$, then

 $W^{{j+m,p}}(\Omega)\hookrightarrow C_{B}^{j}(\Omega)$

and

 $W^{{j+m,p}}(\Omega)\hookrightarrow W^{{j,q}}(\Omega _{k}),\quad W^{{m,p}}(\Omega)\hookrightarrow L^{q}(\Omega)\quad\text{for}\quad p\leq q<\infty.$
2. If $1\leq k\leq n$ and $m-\frac{n}{p}=0$, then

 $W^{{j+m,p}}(\Omega)\hookrightarrow W^{{j,q}}(\Omega _{k})\quad\text{for}\quad p\leq q<\infty.$
3. If $m-\frac{n}{p}<0$ and either $m-\frac{n}{p}>-\frac{k}{p}$, or $p=1$ and $m-\frac{n}{p}\geq-\frac{k}{p}$, then

 $W^{{j+m,p}}(\Omega)\hookrightarrow W^{{j,q}}(\Omega _{k})\quad\text{when}\quad p\leq q\quad\text{and}\quad m-\frac{n}{p}\geq-\frac{k}{q}.$

Note that in each case, it is the quantity $m-\frac{n}{p}$ that determines the allowed embeddings. Increasing this quantity by either (a) giving up more derivatives, or (b) increasing the power $p$, allows for “better” embeddings in the following sense: when $m-\frac{n}{p}>0$ then we get an embedding into the space of continuously differentiable functions (the first case above), and when $k_{1}-\frac{n}{p_{1}}>k_{2}-\frac{n}{p_{2}}$ then we get an embedding $W^{{k_{1},p_{1}}}(\Omega)\hookrightarrow W^{{k_{2},p_{2}}}(\Omega)$ (the third case above). The same philosophy applies to the compactness theorem below, as well as the Sobolev multiplication theorem (which has been postponed until a future version of the notes).
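
To see how the numerology works in a concrete case, here is a worked instance of the theorem. It is a sketch under the assumptions that $\Omega\subseteq\mathbb{R}^{3}$ satisfies the cone condition, that $k=n=3$ (so $\Omega _{k}=\Omega$), and that $W^{{0,q}}(\Omega)$ is read as $L^{q}(\Omega)$. Take $j=0$, $m=1$ and $p=2$. Then

 $m-\frac{n}{p}=1-\frac{3}{2}=-\frac{1}{2}<0\quad\text{and}\quad m-\frac{n}{p}=-\frac{1}{2}>-\frac{3}{2}=-\frac{k}{p},$

so the third case applies. The condition on $q$ is

 $-\frac{1}{2}\geq-\frac{3}{q}\quad\Longleftrightarrow\quad q\leq 6,$

and so $W^{{1,2}}(\Omega)\hookrightarrow L^{q}(\Omega)$ for $2\leq q\leq 6$. By contrast, for $n=1$, $m=1$ and $p=2$ we have $m-\frac{n}{p}=\frac{1}{2}>0$, and the first case gives $W^{{1,2}}(\Omega)\hookrightarrow C_{B}^{0}(\Omega)$.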

Theorem 4.3 (Rellich-Kondrachov compactness theorem).

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set satisfying the cone condition, let $\Omega _{0}$ be a bounded open subset of $\Omega$, and let $\Omega _{0}^{k}$ be the intersection of $\Omega _{0}$ with a $k$-dimensional plane in $\mathbb{R}^{n}$. Let $j\geq 0$ and $m\geq 1$ be integers, and let $1\leq p<\infty$. Then

1. If $m-\frac{n}{p}>0$ then the following embeddings are compact

 $\displaystyle W^{{j+m,p}}(\Omega)$ $\displaystyle\hookrightarrow C_{B}^{j}(\Omega _{0}),$ $\displaystyle\text{and}\quad W^{{j+m,p}}(\Omega)$ $\displaystyle\hookrightarrow W^{{j,q}}(\Omega _{0})\quad\text{if}\quad 1\leq q<\infty.$
2. If $m-\frac{n}{p}\leq 0$, then the following embeddings are compact

 $\displaystyle W^{{j+m,p}}(\Omega)$ $\displaystyle\hookrightarrow W^{{j,q}}(\Omega _{0}^{k})\quad\text{if}\quad 0>m-\frac{n}{p}>-\frac{k}{p},\quad q\geq 1,\quad\text{and}\quad m-\frac{n}{p}>-\frac{k}{q}.$ $\displaystyle W^{{j+m,p}}(\Omega)$ $\displaystyle\hookrightarrow W^{{j,q}}(\Omega _{0}^{k})\quad\text{if}\quad m-\frac{n}{p}=0\quad\text{and}\quad 1\leq q<\infty.$
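
The same worked instance as for the embedding theorem shows what compactness costs. This is a sketch assuming $n=k=3$ (so $\Omega _{0}^{k}=\Omega _{0}$), $j=0$, $m=1$, $p=2$, and again reading $W^{{0,q}}$ as $L^{q}$. The conditions in the second case become

 $0>-\frac{1}{2}>-\frac{3}{2},\quad q\geq 1,\quad\text{and}\quad-\frac{1}{2}>-\frac{3}{q}\quad\Longleftrightarrow\quad q<6,$

so the embedding $W^{{1,2}}(\Omega)\hookrightarrow L^{q}(\Omega _{0})$ is compact for $1\leq q<6$. The endpoint $q=6$, which is allowed in Theorem 4.2, is excluded here by the strict inequality.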

Appendix A Notation and basic definitions

A.1 $L^{p}$ spaces and $L_{{loc}}^{p}$ spaces

The spaces $L^{p}(\Omega)$ and $L_{{loc}}^{p}(\Omega)$ form the basis for the definition of the Sobolev spaces $W^{{k,p}}(\Omega)$ and $W_{{loc}}^{{k,p}}(\Omega)$ in Definition 2.16, and so we review some of their basic properties here.

Definition A.1.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set, let $0<p<\infty$, and let $\mathcal{F}_{\Omega}$ denote the space of Lebesgue measurable functions on $\Omega$. Define

 $L^{p}(\Omega)=\left\{ f\in\mathcal{F}_{\Omega}\,:\,\int _{\Omega}|f|^{p}\, dx<\infty\right\}/\sim$

where $f\sim g$ if $f=g$ almost everywhere. When $p=\infty$, define

 $L^{\infty}(\Omega)=\left\{ f\in\mathcal{F}_{\Omega}\,:\,\esssup _{\Omega}|f|<\infty\right\}/\sim,$

where

 $\esssup _{\Omega}f=\inf\left\{\alpha\,:\,\left|\{ x\in\Omega\,:\, f(x)>\alpha\}\right|=0\right\}$

is the essential supremum of $f$ on $\Omega$.

It is well-known that when $1\leq p\leq\infty$ the spaces $L^{p}(\Omega)$ are Banach spaces, with the norm $\displaystyle{\| f\| _{{L^{p}(\Omega)}}=\left(\int _{\Omega}|f|^{p}\, dx\right)^{{\frac{1}{p}}}}$ for $1\leq p<\infty$ and $\| f\| _{{L^{\infty}(\Omega)}}=\esssup _{\Omega}|f|$ for $p=\infty$ (see for example [12, Theorem 3.11] or [15, Theorem 8.14]).

Definition A.2.

Let $1<p<\infty$. The conjugate exponent of $p$ is the real number $1<q<\infty$ such that

 $\frac{1}{p}+\frac{1}{q}=1.$

If $p=1$ then the conjugate exponent of $p$ is $q=\infty$, and if $p=\infty$ then the conjugate exponent of $p$ is $q=1$.
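
For quick reference, solving the defining relation for $q$ gives an explicit formula (a routine rearrangement, stated here for $1<p<\infty$):

 $q=\frac{p}{p-1},\quad\text{for example}\quad p=2\mapsto q=2,\quad p=3\mapsto q=\frac{3}{2},\quad p=\frac{4}{3}\mapsto q=4.$

Note that $q\rightarrow\infty$ as $p\rightarrow 1$ and $q\rightarrow 1$ as $p\rightarrow\infty$, which is consistent with the conventions for $p=1$ and $p=\infty$.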

One of the most important inequalities for $L^{p}$ spaces is Hölder’s inequality. For a proof, see for example [12, Theorem 3.5].

Theorem A.3 (Hölder’s inequality).

Let $\Omega\subseteq\mathbb{R}^{n}$ be open, and let $f\in L^{p}(\Omega)$, $g\in L^{q}(\Omega)$, where $p$ and $q$ are conjugate exponents. Then

 $\| fg\| _{{L^{1}(\Omega)}}\leq\| f\| _{{L^{p}(\Omega)}}\| g\| _{{L^{q}(\Omega)}}.$
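
A small worked check may be useful (the functions below are hypothetical choices made for illustration). Take $\Omega=(0,1)$ and $p=q=2$. For $f(x)=x$ and $g(x)=1$ we have

 $\| fg\| _{{L^{1}(\Omega)}}=\int _{0}^{1}x\, dx=\frac{1}{2},\quad\| f\| _{{L^{2}(\Omega)}}\| g\| _{{L^{2}(\Omega)}}=\left(\int _{0}^{1}x^{2}\, dx\right)^{{\frac{1}{2}}}\cdot 1=\frac{1}{\sqrt{3}},$

and indeed $\frac{1}{2}<\frac{1}{\sqrt{3}}$. For $f(x)=g(x)=x$ both sides are equal to $\frac{1}{3}$, which shows that the inequality cannot be improved in general.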

The following theorem characterises the dual space of $L^{p}(\Omega)$. It is also well-known; for a proof see for example [2, Chapter IV], [8, Theorem 2.14], or [15, Theorem 10.44].

Theorem A.4.

Let $\Omega\subseteq\mathbb{R}^{n}$ be open, let $1\leq p<\infty$, and let $q$ be the conjugate exponent of $p$. Then

 $(L^{p}(\Omega))^{*}\cong L^{q}(\Omega).$
Remark A.5.
1. The isomorphism $L^{q}(\Omega)\cong L^{p}(\Omega)^{*}$ has an explicit form

 $L^{q}(\Omega)\ni f\mapsto\left(T_{f}:g\mapsto\int _{\Omega}f(x)g(x)\, dx\right)\in L^{p}(\Omega)^{*}$
2. It is not true that $L^{\infty}(\Omega)^{*}\cong L^{1}(\Omega)$, since, for any $x\in\Omega$, the Hahn-Banach theorem shows that the delta functional $\delta _{x}(f)=f(x)$ defined on $\mathcal{C}^{\infty}(\Omega)$ extends to a bounded linear functional (call it $\tilde{\delta}_{x}$) on $L^{\infty}(\Omega)$. A similar argument to Example 2.10 shows that $\tilde{\delta}_{x}$ cannot be represented by a function in $L^{1}(\Omega)$, i.e. there is no $g\in L^{1}(\Omega)$ such that

 $\tilde{\delta}_{x}(f)=\int _{\Omega}fg\, dx$

for all $f\in L^{\infty}(\Omega)$.

More generally, this theorem is true for any $\sigma$-finite measure space (see [15, pp182-185]). Since the proof uses the Radon-Nikodym theorem then the result may not be true if the measure is not $\sigma$-finite (see Appendix B for the relevant definitions and statements of the theorems). The following example illustrates this for a simple case.

Example A.6.

Let $\Sigma=\{\emptyset,X\}$ be the trivial $\sigma$-algebra on a space $X$, and let $\mu$ be the measure $\mu(X)=\infty$, $\mu(\emptyset)=0$. Then the measurable functions $f:X\rightarrow\mathbb{C}$ are constants, and so we see that $L^{1}(X,d\mu)\cong\{ 0\}$ consists of only the zero function. Therefore the dual is $L^{1}(X,d\mu)^{*}\cong\{ 0\}$, however $L^{\infty}(X,d\mu)\cong\mathbb{C}$, and so the dual of $L^{1}(X,d\mu)$ is not isomorphic to $L^{\infty}(X,d\mu)$ in this case.

Next, we define the space $L_{{loc}}^{p}(\Omega)$, which consists of locally integrable functions, in the sense that their integral is finite on compact sets.

Definition A.7.

Let $\Omega$ be an open set in $\mathbb{R}^{n}$, and let $\mathcal{F}$ be the set of Lebesgue measurable functions on $\Omega$. Then

 $L_{{loc}}^{p}(\Omega)=\left\{ f\in\mathcal{F}\,:\, f\in L^{p}(K)\,\,\text{for all compact $K\subset\Omega$}\right\}.$
Remark A.8.

One can easily extend this definition to arbitrary measure spaces that also have a topology (and hence a notion of compactness).

Clearly we have an inclusion $L^{p}(\Omega)\hookrightarrow L_{{loc}}^{p}(\Omega)$. The following examples show that this is not surjective.

Example A.9.
1. $\Omega=\mathbb{R}^{n}$. Let $f\equiv 1$, and note that

 $\int _{K}|f(x)|^{p}\, dx=m(K)<\infty,$

where $m(K)$ denotes the Lebesgue measure of $K$. Therefore $f\in L_{{loc}}^{p}(\Omega)$ for any $p>0$, even though $f\notin L^{p}(\Omega)$.

2. $\Omega=(0,\varepsilon)\subset\mathbb{R}$. Let $f(x)=\frac{1}{x}$, and note that $f$ is bounded on any compact subset $K\subset(0,\varepsilon)$. Therefore $\displaystyle{\int _{K}|f(x)|^{p}\, dx<\infty}$, and so $f\in L_{{loc}}^{p}(\Omega)$, but $f\notin L^{p}(\Omega)$ for any $p\geq 1$. (Note that we can extend this to any $p>0$ by choosing $f(x)=\frac{1}{x^{n}}$ for some $n$, or even a function that grows faster at the origin, such as $f(x)=\exp(\frac{1}{x^{2}})$.)
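
To make the second example concrete, here is a short worked computation (a sketch added for illustration; the cut-off $\delta\in(0,\varepsilon)$ is auxiliary notation that is not used elsewhere in these notes). For $p>1$ we have

 $\int _{\delta}^{\varepsilon}\frac{1}{x^{p}}\, dx=\frac{\delta^{{1-p}}-\varepsilon^{{1-p}}}{p-1}\rightarrow\infty\quad\text{as }\delta\rightarrow 0,$

while for $p=1$ the integral equals $\log(\varepsilon/\delta)$, which also diverges as $\delta\rightarrow 0$. By monotone convergence, $\displaystyle{\int _{0}^{\varepsilon}|f(x)|^{p}\, dx=\infty}$ for every $p\geq 1$. On the other hand, every compact set $K\subset(0,\varepsilon)$ is contained in $[\delta,\varepsilon)$ for some $\delta>0$, where $|f|\leq\frac{1}{\delta}$, which is the local bound used above.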

The spaces $L^{p}$ and $L_{{loc}}^{p}$ for $0<p<1$ have radically different properties from those described above for other values of $p$. These properties are discussed further in [13, pp35-36].

A.2 Integration by parts

Since we use integration by parts on open subsets of $\mathbb{R}^{n}$ in Section 2.2, we recall the formula here. See [4, Appendix C.1] for a more complete description.

Given an open set $\Omega\subset\mathbb{R}^{n}$ with a $C^{1}$ boundary, define

 ${\bf\nu}=(\nu^{1},\ldots,\nu^{n})$

to be the outward pointing unit normal at each point of the boundary $\partial\Omega$, and let $dS$ denote the volume element on the boundary.

Theorem A.10 (Integration by parts).

Let $\Omega$ be a bounded open subset of $\mathbb{R}^{n}$ with a $C^{1}$ boundary, and let $u,v\in C^{1}(\bar{\Omega})$. Then for all $i=1,\ldots,n$ we have

 $\int _{\Omega}\partial _{{x_{i}}}u\, v\, dx=-\int _{\Omega}u\,\partial _{{x_{i}}}v\, dx+\int _{{\partial\Omega}}uv\nu^{i}\, dS.$

If $u$ has compact support in $\Omega$, then the boundary term disappears, and we have

 $\int _{\Omega}\partial _{{x_{i}}}u\, v\, dx=-\int _{\Omega}u\,\partial _{{x_{i}}}v\, dx.$
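
As a sanity check of the first formula in Theorem A.10, consider the following one-dimensional instance (the choices $\Omega=(0,1)$, $u(x)=x$ and $v(x)=x^{2}$ are assumptions made purely for illustration). Here $\partial\Omega=\{ 0,1\}$, the outward normal is $\nu=-1$ at $0$ and $\nu=+1$ at $1$, and we interpret the boundary integral as a sum over these two points. The left-hand side is

 $\int _{0}^{1}u^{\prime}(x)\, v(x)\, dx=\int _{0}^{1}x^{2}\, dx=\frac{1}{3},$

and the right-hand side is

 $-\int _{0}^{1}u(x)\, v^{\prime}(x)\, dx+\left(u(1)v(1)\cdot(+1)+u(0)v(0)\cdot(-1)\right)=-\frac{2}{3}+1=\frac{1}{3},$

so the two sides agree. Note that $u$ does not have compact support in $\Omega$, and the boundary term is needed for the equality to hold.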

A.3 The smooth Urysohn lemma and partitions of unity

The goal of this section is to give some consequences of the smooth Urysohn lemma and the existence of partitions of unity on open subsets of $\mathbb{R}^{n}$.

Theorem A.11.

Let $\Omega\subset\mathbb{R}^{n}$ be an open set, let $K\subset\Omega$ be compact, and let $U$ be an open set such that $K\subset U\subset\Omega$. Then there exists a smooth function $f\in\mathcal{C}^{\infty}(\Omega)$ such that $0\leq f(x)\leq 1$ for all $x\in\Omega$, $f\equiv 1$ on $K$, and $f\equiv 0$ on $\Omega\setminus U$. Moreover, there also exists $f\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $0\leq f(x)\leq 1$ for all $x\in\Omega$, and $f\equiv 1$ on $K$.

See [3, Theorem 2.6.1] for a proof.
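
In the simplest situation the function in Theorem A.11 can be written down explicitly. The following is a sketch of one standard construction, under the assumptions (made only for this illustration) that $\Omega=\mathbb{R}^{n}$, $K=\overline{B(0,1)}$ and $U=B(0,2)$; the auxiliary functions $h$ and $g$ are not used elsewhere in these notes. Define

 $h(t)=\begin{cases}e^{{-1/t}}&\text{if }t>0\\ 0&\text{if }t\leq 0\end{cases}\qquad\text{and}\qquad g(t)=\frac{h(t)}{h(t)+h(1-t)}.$

The function $h$ is smooth on $\mathbb{R}$, and the denominator $h(t)+h(1-t)$ never vanishes, so $g$ is smooth with $0\leq g\leq 1$, $g(t)=0$ for $t\leq 0$ and $g(t)=1$ for $t\geq 1$. Setting

 $f(x)=g(2-|x|)$

gives a function with $0\leq f\leq 1$, $f\equiv 1$ on $K$ and $f\equiv 0$ on $\Omega\setminus U$. It is smooth at the origin, since it is constant on $B(0,1)$, and its support is the compact set $\overline{B(0,2)}$.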

As a consequence of Urysohn’s lemma, we have the following useful results.

Lemma A.12.

Let $U_{1},U_{2}$ be open sets in $\mathbb{R}^{n}$, and let $K\subset U_{1}\cup U_{2}$ be compact. Then there exist non-negative functions $\phi _{1},\phi _{2}$ which are smooth on $U_{1}\cup U_{2}$ and satisfy

1. $\phi _{1}(x)+\phi _{2}(x)=1$ for all $x\in K$,

2. $\phi _{1}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1})$ and $\phi _{2}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{2})$.

Proof.

Urysohn’s lemma shows that there exists $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1}\cup U_{2})$ such that $0\leq\phi\leq 1$ and $\phi\equiv 1$ on $K$. Apply Urysohn’s lemma again to find $\tilde{\phi}_{1}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1})$ such that $0\leq\tilde{\phi}_{1}\leq 1$ and $\tilde{\phi}_{1}\equiv 1$ on a neighbourhood of the compact set $\supp(\phi)\cap U_{2}^{c}$. Let $\phi _{1}=\phi\cdot\tilde{\phi}_{1}$, and note that

1. $\phi _{1}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1})$,

2. $0\leq\phi _{1}\leq\phi$,

3. $\phi _{1}\equiv 1$ on $K\cap U_{2}^{c}$,

4. $\supp(\phi _{1})\subseteq\supp(\phi)$ and $\phi _{1}=\phi$ on $U_{2}^{c}$.

Define $\phi _{2}=\phi-\phi _{1}$, and note that

1. $0\leq\phi _{2}\leq 1$,

2. $\phi _{2}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{2})$, and

3. $\phi _{1}+\phi _{2}\equiv 1$ on $K$.

Therefore $\phi _{1}$ and $\phi _{2}$ satisfy the stated conditions. ∎

Any smooth function can be written as the difference of two non-negative continuous functions, just by taking the positive and negative parts of the original function. The next lemma shows that a smooth function can also be written as the difference of two non-negative smooth functions.

Lemma A.13.

Let $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$. Then there exist functions $\phi _{+},\phi _{-}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\phi _{+}(x)\geq 0$ and $\phi _{-}(x)\geq 0$ for all $x\in\Omega$ and $\phi(x)=\phi _{+}(x)-\phi _{-}(x)$ for all $x\in\Omega$.

Proof.

Using Urysohn’s lemma, construct a non-negative function $\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\psi\equiv\sup _{{y\in\Omega}}|\phi(y)|$ on $\supp(\phi)$ (for example, multiply the compactly supported function from Theorem A.11 with $K=\supp(\phi)$ by this constant). Then both $\psi$ and $\psi-\phi$ are non-negative smooth functions with compact support in $\Omega$, and so we can define $\phi _{+}=\psi$, $\phi _{-}=\psi-\phi$. ∎

A.4 A corollary of Rademacher’s theorem

Rademacher’s theorem states that a Lipschitz function $f:\Omega\subseteq\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$ is differentiable almost everywhere (with respect to Lebesgue measure). This is used in Example 2.21 as part of the proof that a Lipschitz continuous function is in $W_{{loc}}^{{1,\infty}}(\Omega)$. The actual statement used in Example 2.21 is that the partial derivatives of $f$ exist almost everywhere (a slightly weaker statement than Rademacher’s theorem). The purpose of this section is to recall the basic definitions and state the theorem. A proof of Rademacher’s theorem can be found in [5].

First, recall the following definition.

Definition A.14.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set. A function $f:\Omega\rightarrow\mathbb{R}^{m}$ is locally Lipschitz continuous on $\Omega$ if for every $x\in\Omega$ there exist a constant $C(x)$ and a neighbourhood $U$ of $x$ such that the following inequality is satisfied for all $y\in U$

 $\left|f(x)-f(y)\right|\leq C(x)|x-y|.$ (A.1)

If there exists a uniform constant $C$ such that $\left|f(x)-f(y)\right|\leq C|x-y|$ for all $x,y\in\Omega$ then we say that $f$ is uniformly Lipschitz on $\Omega$. The smallest value of the constant $C$ is called the Lipschitz constant

 $\Lip(f)=\sup _{{x,y\in\Omega,\, x\neq y}}\frac{\left|f(x)-f(y)\right|}{|x-y|}.$ (A.2)
Theorem A.15.

Let $\Omega\subseteq\mathbb{R}^{n}$ be open, and let $f:\Omega\rightarrow\mathbb{R}$ be locally Lipschitz on $\Omega$. Then $f$ is differentiable almost everywhere in $\Omega$ (with respect to the Lebesgue measure on $\mathbb{R}^{n}$).

For a proof, see [5, Section 3.1.2].
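
A standard example to keep in mind (stated here as an illustration, not as part of the cited proof) is the Euclidean norm $f(x)=|x|$ on $\mathbb{R}^{n}$. The reverse triangle inequality gives

 $\left||x|-|y|\right|\leq|x-y|\quad\text{for all }x,y\in\mathbb{R}^{n},$

with equality when $y=0$, so $f$ is uniformly Lipschitz with $\Lip(f)=1$. Away from the origin $f$ is smooth with gradient $\frac{x}{|x|}$, while at the origin $f$ is not differentiable. The exceptional set $\{ 0\}$ has Lebesgue measure zero, in agreement with Theorem A.15.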

Remark A.16.

Uniformly Lipschitz implies absolutely continuous, and so we know that the theorem is true for functions $f:\Omega\subseteq\mathbb{R}\rightarrow\mathbb{R}$ with one-dimensional domains by general theory of absolutely continuous functions (see for example [15, Theorem 7.29]). This fact is used in the proof for $n\geq 2$, however some further analysis is also necessary (see [5, Section 3.1.2] for the details).

As a consequence, we have the following.

Corollary A.17.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $f:\Omega\rightarrow\mathbb{R}$ be a locally Lipschitz function. Then the partial derivatives of $f$ exist almost everywhere on $\Omega$ (with respect to the Lebesgue measure on $\mathbb{R}^{n}$).

Appendix B Basic results from measure theory

Since Section 3.4 deals with measures, for completeness we review the basic definitions here. In particular, this includes the definition of a complex measure. These notes assume knowledge of Lebesgue integration and the basic theorems associated with it (monotone convergence, dominated convergence, etc.), so that material is not included here; the purpose is just to recall the important definitions that are used elsewhere in the notes. Of course, there are already good sources for this material such as [12], [15], [7], [10] (and many more!), so only the material relevant to the rest of the notes is covered in this section. Examples are included wherever possible in order to clarify the theory.

Since we want to deal with sets of infinite measure ($\mathbb{R}^{n}$ is the standard example), we first have to define arithmetic in $\mathbb{R}\cup\{-\infty,+\infty\}$. This is an extension of the usual operations of addition and multiplication on $\mathbb{R}$, together with the following definitions.

 $a\cdot\infty=\infty\cdot a=\left\{\begin{matrix}\infty&\text{if }a\in\mathbb{R}\setminus\{ 0\}\\ 0&\text{if }a=0\end{matrix}\right.$

Care must be taken when cancelling terms from an equation: $a+b=a+c$ implies $b=c$ only if $a\in\mathbb{R}$, and $ab=ac$ implies $b=c$ only if $a\in\mathbb{R}\setminus\{ 0\}$. One consequence of these definitions is that the integral of any function over a set of measure zero will be zero, and the integral of the zero function over any set will also be zero.
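
The following short computations (added as an illustration of the conventions above) show how cancellation fails. We have

 $\infty+1=\infty=\infty+2\quad\text{and}\quad\infty\cdot 1=\infty=\infty\cdot 2,$

even though $1\neq 2$, and similarly $0\cdot 1=0=0\cdot 2$. The convention $0\cdot\infty=0$ is what makes the integral of a constant function behave as expected: if $\mu(E)=0$ and $f\equiv\infty$ on $E$, then $\displaystyle{\int _{E}f\, d\mu=\infty\cdot 0=0}$, and if $\mu(E)=\infty$ and $f\equiv 0$ on $E$, then $\displaystyle{\int _{E}f\, d\mu=0\cdot\infty=0}$.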

Definition B.1.

A collection $\Sigma$ of subsets of a set $X$ is a $\sigma$-algebra on $X$ if all of the following hold.

1. $X\in\Sigma$.

2. If $A\in\Sigma$, then $X\setminus A\in\Sigma$.

3. If $A_{n}\in\Sigma$ for all $n\in\mathbb{N}$, then $\displaystyle{\bigcup _{{n\in\mathbb{N}}}A_{n}\in\Sigma}$.

Example B.2.
1. The set of all subsets of $X$ forms a $\sigma$-algebra.

2. $\Sigma=\{\emptyset,X\}$ is a $\sigma$-algebra, called the trivial $\sigma$-algebra on $X$.

3. The set of all subsets of $\mathbb{R}$ that are open is not a $\sigma$-algebra, since the complement of an open set is not necessarily open.

4. The set of all subsets of $\mathbb{R}^{n}$ that are either open or closed is not a $\sigma$-algebra, since it is not closed under the operation of countable unions. For example, in the case $n=1$

 $[a,b)=\bigcup _{{n\in\mathbb{N}}}\left[a,b-\frac{1}{n}\right]$

is neither open nor closed.

5. If $\Sigma$ is a $\sigma$-algebra on $X$ and $U\subset X$, then the collection

 $\Sigma _{U}=\{ A\cap U\,:\, A\in\Sigma\}$

is a $\sigma$-algebra on $U$.

Since open and closed subsets of $\mathbb{R}^{n}$ are of fundamental importance, it would be useful to have a $\sigma$-algebra that contains all of these sets. The $\sigma$-algebra of all subsets of $\mathbb{R}^{n}$ is too large for interesting measures to exist (see [10, Section 5] for more insight into why this is the case), so it would also be useful for this $\sigma$-algebra to have some minimality property, i.e. it is the “smallest” $\sigma$-algebra that contains all of the open and closed subsets of $\mathbb{R}^{n}$. The next theorem shows that such a $\sigma$-algebra exists.

Theorem B.3.

Let $\mathcal{F}$ be a collection of subsets of a set $X$. Then there exists a unique $\sigma$-algebra on $X$, call it $\Sigma$, such that

1. $\mathcal{F}\subseteq\Sigma$, and

2. any other $\sigma$-algebra $\Sigma^{{\prime}}$ containing $\mathcal{F}$ satisfies $\Sigma\subseteq\Sigma^{{\prime}}$ (i.e. $\Sigma$ is the smallest $\sigma$-algebra containing $\mathcal{F}$).

This $\sigma$-algebra is called the $\sigma$-algebra generated by $\mathcal{F}$.

Proof of Theorem B.3.

Consider the family of all $\sigma$-algebras on $X$ that contain $\mathcal{F}$. Since the set of all subsets of $X$ is a $\sigma$-algebra containing $\mathcal{F}$, then this family is non-empty. We claim that the intersection $\Sigma$ of all $\sigma$-algebras containing $\mathcal{F}$ is also a $\sigma$-algebra, and the result will then follow, since such a $\sigma$-algebra clearly satisfies both of the conditions of the theorem.

Firstly note that the set $X$ is in every $\sigma$-algebra containing $\mathcal{F}$, and so $X\in\Sigma$ also. If a subset $A\subseteq X$ is in $\Sigma$, then it is in every $\sigma$-algebra containing $\mathcal{F}$, and so $X\setminus A$ is in every $\sigma$-algebra containing $\mathcal{F}$, therefore $X\setminus A\in\Sigma$ also. Therefore it only remains to check that $\Sigma$ is closed under countable unions. To see this, let $\{ A_{n}\} _{{n\in\mathbb{N}}}$ be a countable collection of sets in $\Sigma$. Then $\{ A_{n}\} _{{n\in\mathbb{N}}}\subseteq\Sigma^{{\prime}}$ for every $\sigma$-algebra $\Sigma^{{\prime}}$ containing $\mathcal{F}$, and so $A:=\displaystyle{\bigcup _{{n\in\mathbb{N}}}A_{n}\in\Sigma^{{\prime}}}$ also. Therefore $A\in\Sigma$, and we have shown that $\Sigma$ is a $\sigma$-algebra. ∎
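
For a finite set the generated $\sigma$-algebra can be listed explicitly. As a small illustration (the set $X$ and the collection $\mathcal{F}$ below are chosen only for this example), let $X=\{ 1,2,3,4\}$ and $\mathcal{F}=\{\{ 1\},\{ 2\}\}$. Closing $\mathcal{F}$ under complements and unions gives

 $\Sigma=\{\emptyset,\{ 1\},\{ 2\},\{ 1,2\},\{ 3,4\},\{ 1,3,4\},\{ 2,3,4\},X\}.$

This collection consists of all unions of the three blocks $\{ 1\}$, $\{ 2\}$ and $\{ 3,4\}$, so it is closed under complements and unions, and any $\sigma$-algebra containing $\mathcal{F}$ must contain each of these eight sets. Note that the points $3$ and $4$ are never separated: neither $\{ 3\}$ nor $\{ 4\}$ belongs to $\Sigma$.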

Definition B.4.

The Borel $\sigma$-algebra $\mathcal{B}$ on $\mathbb{R}^{n}$ is the smallest $\sigma$-algebra that contains the collection of open and closed subsets of $\mathbb{R}^{n}$. The sets in $\mathcal{B}$ are called the Borel subsets of $\mathbb{R}^{n}$.

Definition B.5.

Let $\Sigma$ be a $\sigma$-algebra on a set $X$. A positive measure on $\Sigma$ is a function $\mu:\Sigma\rightarrow[0,\infty]$ such that

 $\mu\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)=\sum _{{n\in\mathbb{N}}}\mu(A_{n})$ (B.1)

for any disjoint collection $\{ A_{n}\} _{{n\in\mathbb{N}}}\subseteq\Sigma$. A function $\mu$ satisfying (B.1), but without the restriction that the range is $[0,\infty]$, is called a countably additive set function.

A complex measure on $\Sigma$ is a countably additive function $\mu:\Sigma\rightarrow\mathbb{C}$ (see [12, Chapter 6] for more about complex measures).

Definition B.6.

A measure space $(X,\Sigma,\mu)$ consists of a set $X$, a $\sigma$-algebra $\Sigma$ of subsets of $X$, and a measure $\mu$ on $\Sigma$.

A measure space $(X,\Sigma,\mu)$ is finite if $\mu(X)$ is finite. A measure space $(X,\Sigma,\mu)$ is $\sigma$-finite if $X$ is the countable union of sets $X_{n}\in\Sigma$ with $\mu(X_{n})$ finite for each $n\in\mathbb{N}$.

Remark B.7.

A $\sigma$-finite measure is the countable sum of finite measures. To see this, let $(X,\Sigma,\mu)$ be $\sigma$-finite, with $\displaystyle{X=\bigcup _{{n\in\mathbb{N}}}X_{n}}$ and $\mu(X_{n})$ finite for each $n\in\mathbb{N}$. Replacing each $X_{n}$ by $X_{n}\setminus\bigcup _{{k<n}}X_{k}$ if necessary, we may assume that the sets $X_{n}$ are pairwise disjoint. Define measures

 $\mu _{n}(E)=\mu(E\cap X_{n}),$

and note that $\mu(E)=\displaystyle{\sum _{{n\in\mathbb{N}}}\mu _{n}(E)}$ by countable additivity.

An important class of measures on $\mathbb{R}^{n}$ are those defined on the Borel $\sigma$-algebra.

Definition B.8.

A Borel measure on $\mathbb{R}^{n}$ is a measure defined on the Borel $\sigma$-algebra.

A Borel measure $\mu$ is inner regular (resp. outer regular) if for every Borel set $E\subseteq\mathbb{R}^{n}$ we have

 $\mu(E)=\sup\{\mu(K)\,:\, K\text{ compact and }K\subset E\}\quad\text{(resp. }\,\mu(E)=\inf\{\mu(U)\,:\, U\text{ open and }E\subset U\}).$

If a Borel measure $\mu$ is both inner and outer regular, then we say that $\mu$ is Borel regular.

Remark B.9.
1. Again, this notion can be extended to measures on a locally compact Hausdorff space (see [12]).

2. The definition of a Borel regular measure given above is equivalent to the requirement that every measurable set $E$ has the same measure as some Borel sets $B_{1}\supseteq E$ and $B_{2}\subseteq E$. To see this, let $B_{1}$ be the intersection of open sets $\{ U_{n}\} _{{n\in\mathbb{N}}}$ such that $U_{n}\supset E$ and $\mu(U_{n}\setminus E)<\frac{1}{n}$, and let $B_{2}$ be the union of compact sets $\{ K_{n}\} _{{n\in\mathbb{N}}}$ such that $K_{n}\subset E$ and $\mu(E\setminus K_{n})<\frac{1}{n}$.

A useful way to construct a measure with certain desired properties is to start with an outer measure. For example, Lebesgue measure and Hausdorff measure can both be constructed using outer measures (see also [12] for a construction of Lebesgue measure that doesn’t use outer measure), and the construction in Section 3.4 of a measure associated to a distribution also uses outer measure.

Definition B.10.

A function $\mu^{*}:\mathcal{P}(X)\rightarrow[0,\infty]$ defined on the power set $\mathcal{P}(X)$ of a space $X$ is called an outer measure on $X$ if it satisfies all of the following.

1. $\mu^{*}(A)\geq 0$, $\mu^{*}(\emptyset)=0$.

2. $\mu^{*}(A_{1})\leq\mu^{*}(A_{2})$ if $A_{1}\subseteq A_{2}$.

3. $\displaystyle{\mu^{*}\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)\leq\sum _{{n\in\mathbb{N}}}\mu^{*}(A_{n})}$ for any countable collection of sets $\{ A_{n}\} _{{n\in\mathbb{N}}}\subset\mathcal{P}(X)$.

The point of studying outer measures is that it is easy to construct an outer measure with certain properties.

To get a feel for outer measure we recall here two main examples: Lebesgue outer measure and Hausdorff outer measure.

Example B.11 (Lebesgue outer measure).

The Lebesgue outer measure is defined on compact rectangular subsets

 $R=[a_{1},b_{1}]\times\cdots\times[a_{n},b_{n}]\subset\mathbb{R}^{n}$

by

 $m^{*}(R):=\prod _{{j=1}}^{n}(b_{j}-a_{j}).$

For an arbitrary subset $E\subseteq\mathbb{R}^{n}$, consider the collection $\mathcal{K}$ of all countable covers $\{ R_{n}\} _{{n\in\mathbb{N}}}$ of $E$ by rectangular sets $R_{n}$, and define

 $m^{*}(E):=\inf _{{\mathcal{K}}}\sum _{{n\in\mathbb{N}}}m^{*}(R_{n}).$ (B.2)

It is easy to check that this definition satisfies the conditions of an outer measure (see for example [15, Theorems 3.3 & 3.4]).
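
As a first computation with this definition (a sketch for illustration; the enumeration $\{ x_{k}\}$ and the cubes $Q_{k}$ are notation introduced only here), let $E=\{ x_{k}\} _{{k\geq 1}}\subset\mathbb{R}^{n}$ be a countable set. Given $\varepsilon>0$, let $Q_{k}$ be a closed cube centred at $x_{k}$ with volume $m^{*}(Q_{k})=\varepsilon 2^{{-k}}$. Then $\{ Q_{k}\}$ is a countable cover of $E$ by rectangular sets, and so

 $m^{*}(E)\leq\sum _{{k=1}}^{\infty}m^{*}(Q_{k})=\sum _{{k=1}}^{\infty}\varepsilon 2^{{-k}}=\varepsilon.$

Since $\varepsilon$ was arbitrary, $m^{*}(E)=0$. In particular, the set of points in $\mathbb{R}^{n}$ with rational co-ordinates has Lebesgue outer measure zero, even though it is dense.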

The next theorem is a useful characterisation of the Lebesgue outer measure.

Theorem B.12.

Let $E\subseteq\mathbb{R}^{n}$. Then for each $\varepsilon>0$, there exists an open set $U\subseteq\mathbb{R}^{n}$ such that $E\subset U$ and $m^{*}(U)\leq m^{*}(E)+\varepsilon$.

In particular, we have

 $m^{*}(E)=\inf\{ m^{*}(U)\,:\, U\text{ open and }E\subset U\}.$

For a proof, see [15, Theorem 3.6].

The Hausdorff outer measure and the associated Hausdorff measure are useful for studying certain subsets of $\mathbb{R}^{n}$ that have Lebesgue measure zero. For example, the $d$-dimensional Hausdorff measure of a $d$-dimensional ball in $\mathbb{R}^{n}$ with $d<n$ is non-trivial, even though its Lebesgue measure is zero. Furthermore, the Hausdorff measure can be used to distinguish fractal sets (sets of fractional Hausdorff dimension), and the study of the properties of Hausdorff measure is a major component of Geometric Measure Theory (see [5], [6], [9]).

Example B.13 (Hausdorff outer measure).

The diameter of a set $E\subseteq\mathbb{R}^{n}$ is defined to be

 $\delta(E):=\sup _{{x,y\in E}}|x-y|.$

Fix $\alpha>0$ (not necessarily an integer), and let $E\subseteq\mathbb{R}^{n}$. Given $\varepsilon>0$, let $\mathcal{K}_{\varepsilon}$ denote the collection of countable covers $\{ E_{n}\} _{{n\in\mathbb{N}}}$ of $E$ such that $\delta(E_{n})<\varepsilon$ for each $n\in\mathbb{N}$, and define

 $\mathcal{H}_{\alpha}^{\varepsilon}(E)=\inf _{{\mathcal{K}_{\varepsilon}}}\sum _{{n\in\mathbb{N}}}\delta(E_{n})^{\alpha}.$

If $\varepsilon _{1}<\varepsilon _{2}$, then $\mathcal{K}_{{\varepsilon _{1}}}\subset\mathcal{K}_{{\varepsilon _{2}}}$, and so $\mathcal{H}_{\alpha}^{{\varepsilon _{1}}}(E)\geq\mathcal{H}_{\alpha}^{{\varepsilon _{2}}}(E)$. Therefore

 $\mathcal{H}_{\alpha}(E):=\lim _{{\varepsilon\rightarrow 0}}\mathcal{H}_{\alpha}^{\varepsilon}(E)\quad\text{exists}.$

$\mathcal{H}_{\alpha}(E)$ is called the $\alpha$-dimensional Hausdorff outer measure of $E$. Again, it is easy to check that this is an outer measure (see for example [15, Theorem 11.12]).
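
To see how the exponent interacts with the size of the covering sets, consider the following sketch of a standard computation (the choice $E=[0,1]\subset\mathbb{R}$ and the integer $N$ are assumptions made for illustration; $\alpha$ denotes the exponent appearing in the formulas above). For any integer $N>\frac{1}{\varepsilon}$, the $N$ intervals $E_{k}=\left[\frac{k-1}{N},\frac{k}{N}\right]$, $k=1,\ldots,N$, cover $E$ and each has diameter $\frac{1}{N}<\varepsilon$ (take the remaining sets in the countable cover to be empty). Therefore

 $\mathcal{H}_{\alpha}^{\varepsilon}(E)\leq\sum _{{k=1}}^{N}\left(\frac{1}{N}\right)^{\alpha}=N^{{1-\alpha}}.$

If $\alpha>1$, then letting $N\rightarrow\infty$ shows that $\mathcal{H}_{\alpha}^{\varepsilon}(E)=0$ for every $\varepsilon>0$, and hence $\mathcal{H}_{\alpha}([0,1])=0$. For $\alpha=1$ the same bound gives $\mathcal{H}_{1}([0,1])\leq 1$.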

The next definition and theorem show that each outer measure has an associated $\sigma$-algebra and that the restriction of the outer measure to this $\sigma$-algebra is a measure.

Definition B.14.

Let $\mu^{*}$ be an outer measure on $X$. A subset $E\subset X$ is $\mu^{*}$-measurable if and only if

 $\mu^{*}(A)=\mu^{*}(A\cap E)+\mu^{*}(A\setminus(A\cap E))$ (B.3)

for every subset $A\subseteq X$.

Remark B.15.
1. An equivalent definition is that $E$ is $\mu^{*}$-measurable if and only if

 $\mu^{*}(A_{1}\cup A_{2})=\mu^{*}(A_{1})+\mu^{*}(A_{2})$ (B.4)

whenever $A_{1}\subseteq E$ and $A_{2}\subseteq X\setminus E$. To see that the first definition implies the second, given any sets $A_{1}\subseteq E$ and $A_{2}\subseteq X\setminus E$, let $A=A_{1}\cup A_{2}$. Then $A\cap E=A_{1}$ and $A\setminus(A\cap E)=A_{2}$, so (B.3) implies (B.4). Conversely, given any set $A\subseteq X$, let $A_{1}=A\cap E$ and $A_{2}=A\setminus(A\cap E)$. Clearly these satisfy the requirements $A_{1}\subseteq E$ and $A_{2}\subseteq X\setminus E$, and we have $A=A_{1}\cup A_{2}$. Again, it is clear that (B.4) implies (B.3).

2. Both of these definitions have the same basic idea: the $\mu^{*}$-measurable subsets of $X$ are those for which $\mu^{*}$ is additive on arbitrary decompositions into disjoint subsets.

The next theorem justifies the use of the term “measurable” in the previous definition.

Theorem B.16 (Caratheodory).

Let $\mu^{*}$ be an outer measure on $X$. Then the collection of $\mu^{*}$-measurable subsets of $X$ forms a $\sigma$-algebra, and the restriction of $\mu^{*}$ to this $\sigma$-algebra is a measure.

For a proof, see for example [8, Theorem 1.15].
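
A two-point example (added here as an illustration; the set $X=\{ a,b\}$ and the function $\mu^{*}$ below are chosen only for this purpose) shows that the $\sigma$-algebra of measurable sets can be much smaller than the power set. Define

 $\mu^{*}(\emptyset)=0,\quad\mu^{*}(\{ a\})=\mu^{*}(\{ b\})=\mu^{*}(X)=1.$

This function is monotone and countably subadditive, so it is an outer measure on $X$. Taking $A=X$ and $E=\{ a\}$ in (B.3) gives

 $\mu^{*}(X)=1\neq 2=\mu^{*}(\{ a\})+\mu^{*}(\{ b\}),$

so $\{ a\}$ is not $\mu^{*}$-measurable, and by symmetry neither is $\{ b\}$. The $\mu^{*}$-measurable sets are therefore $\emptyset$ and $X$, which form the trivial $\sigma$-algebra, and the restriction of $\mu^{*}$ to this $\sigma$-algebra is a measure with total mass $1$, as Theorem B.16 predicts.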

Remark B.17.

This theorem is used in Section 3.4 to construct the measure associated to a positive distribution.

Definition B.18.
1. The Lebesgue measure, denoted $m(\cdot)$, is the measure associated to the Lebesgue outer measure from Example B.11.

2. The $\alpha$-dimensional Hausdorff measure, denoted $\mathcal{H}_{\alpha}$, is the measure associated to the $\alpha$-dimensional Hausdorff outer measure from Example B.13.

Remark B.19.

Open and closed sets are Lebesgue measurable, and therefore the $\sigma$-algebra of Lebesgue measurable sets contains the Borel $\sigma$-algebra.

Using this definition of Lebesgue measure, together with Theorem B.12, we see that Lebesgue measure is Borel outer regular.

Lemma B.20.

For any Lebesgue measurable set $E\subseteq\mathbb{R}^{n}$ we have

 $m(E)=\inf\{ m(U)\,:\, U\text{ open and }E\subset U\}.$

The proof follows by restricting the result of Theorem B.12 to the $\sigma$-algebra of Lebesgue-measurable sets. By taking complements, we also see that Lebesgue measure is Borel inner regular.

Lemma B.21.

For any Lebesgue measurable set $E\subseteq\mathbb{R}^{n}$ we have

 $m(E)=\sup\{ m(K)\,:\, K\text{ compact and }K\subset E\}.$

This is a consequence of [15, Lemma 3.22], which states that $E$ is measurable if and only if for all $\varepsilon>0$ there exists a closed set $F\subset E$ such that $m(E\setminus F)<\varepsilon$. The lemma above then follows by taking a sequence of compact sets $K_{n}=F\cap\overline{B(0,n)}$.

A natural question arising from Theorem B.3 is whether two measures that agree on the open and closed subsets of $\mathbb{R}^{n}$ also agree on the Borel $\sigma$-algebra (the minimal $\sigma$-algebra generated by the open and closed subsets). This question is answered in more generality by the Caratheodory-Hahn Extension Theorem, for which we first need the following definitions.

Definition B.22.

An algebra of subsets of $X$ is a non-empty collection $\mathcal{A}$ of subsets of $X$ that is closed under the operations of taking complements and finite unions.

Note that, as a consequence, an algebra is also closed under finite intersections, and therefore $X$ and the empty set are both in $\mathcal{A}$. The difference between this definition and that of a $\sigma$-algebra is that a $\sigma$-algebra is also closed under countable unions. For example, the collection of subsets of $\mathbb{N}$ that are either finite or have finite complement is an algebra, but not a $\sigma$-algebra, since the set of even numbers is a countable union of one-point sets but is neither finite nor has finite complement.

Definition B.23.

A measure on an algebra $\mathcal{A}$ is a function $\mu:\mathcal{A}\rightarrow[0,\infty]$ such that $\mu(\emptyset)=0$, and

 $\mu\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)=\sum _{{n\in\mathbb{N}}}\mu(A_{n})$

whenever $\{ A_{n}\}$ is a countable collection of disjoint sets in $\mathcal{A}$ whose union also belongs to $\mathcal{A}$.

Given a measure on an algebra $\mathcal{A}$, we can construct an outer measure $\mu^{*}$ on $X$ as follows. For each subset $A\subset X$, let $\mathcal{C}=\left\{\{ A_{n}\} _{{n\in\mathbb{N}}}\right\}$ be the collection of countable covers of $A$ by sets in $\mathcal{A}$. Define

 $\mu^{*}(A)=\inf _{\mathcal{C}}\sum _{{n\in\mathbb{N}}}\mu(A_{n}).$ (B.5)
Theorem B.24.

Let $\mu$ be a measure on an algebra $\mathcal{A}$, and let $\mu^{*}$ be as defined in (B.5). Then

1. $\mu^{*}$ is an outer measure,

2. $\mu^{*}(A)=\mu(A)$ for all $A\in\mathcal{A}$, and

3. $A$ is $\mu^{*}$-measurable for all $A\in\mathcal{A}$.

For a proof, see [15, Theorems 11.18 and 11.19].

Definition B.25.

Let $\mu$ be a measure on an algebra $\mathcal{A}$. If $\tilde{\mu}$ is a measure on a $\sigma$-algebra $\Sigma$ containing $\mathcal{A}$, and $\tilde{\mu}(A)=\mu(A)$ for all $A\in\mathcal{A}$, then we say that $\tilde{\mu}$ is an extension of the measure $\mu$ to the $\sigma$-algebra $\Sigma$.

Theorem B.16 shows that the outer measure $\mu^{*}$ defined in (B.5) restricts to a measure on the $\sigma$-algebra $\mathcal{A}^{*}$ of $\mu^{*}$-measurable sets, and Theorem B.24 shows that $\mathcal{A}^{*}$ contains $\mathcal{A}$. The next theorem shows that, when $\mu$ is $\sigma$-finite, this is the unique extension of $\mu$ to any $\sigma$-algebra lying between $\mathcal{A}$ and $\mathcal{A}^{*}$.

Theorem B.26 (Caratheodory-Hahn Extension Theorem).

Let $\mu$ be a measure on an algebra $\mathcal{A}$, let $\mu^{*}$ be the corresponding outer measure, and let $\mathcal{A}^{*}$ be the $\sigma$-algebra of $\mu^{*}$-measurable sets. Then the restriction of $\mu^{*}$ to $\mathcal{A}^{*}$ is an extension of $\mu$. Moreover, if $\mu$ is $\sigma$-finite with respect to $\mathcal{A}$, and if $\Sigma$ is any $\sigma$-algebra with $\mathcal{A}\subseteq\Sigma\subseteq\mathcal{A}^{*}$, then $\mu^{*}$ is the only measure on $\Sigma$ that is an extension of $\mu$.

For a proof see [15, Theorem 11.20].

Sobolev spaces are defined in terms of distributions, and in many of the examples from Sections 2 and 3 we consider distributions $T\in\mathcal{D}(\Omega)^{*}$ that are represented by a function $f\in L_{{loc}}^{1}(\Omega)$, i.e. $T=T_{f}$ where

 $T_{f}(\phi):=\int _{\Omega}f\phi\, dx.$

Many distributions cannot be represented by a function, for example the delta functional from Example 2.7. The main theorem of Section 3.4 shows that instead of using functions to represent distributions, the right class of objects to look at is the class of regular Borel measures (see Theorem 3.35). A natural question is to ask when a measure can be represented by a function, and, if not, then how can this failure be expressed in terms of properties of the measure. This is the content of the Lebesgue decomposition and Radon-Nikodym theorem.

Definition B.27.

Let $\mu$ and $\nu$ be measures on the same $\sigma$-algebra $\Sigma$ on a space $X$. The measure $\nu$ is absolutely continuous with respect to $\mu$ if $\nu(E)=0$ for every set $E\in\Sigma$ with $\mu(E)=0$. The measure $\nu$ is singular with respect to $\mu$ if there is a set $Z\in\Sigma$ with $\mu(Z)=0$, and $\nu(E)=0$ for every $E\in\Sigma$ such that $E\subseteq X\setminus Z$.

In other words, if sets of $\mu$-measure zero are also sets of $\nu$-measure zero, then $\nu$ is absolutely continuous with respect to $\mu$. If $\nu$ is supported on a set of $\mu$-measure zero then it is singular with respect to $\mu$.

Theorem B.28 (Radon-Nikodym).

Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space, and let $\alpha$ be a finite measure on $\Sigma$ that is absolutely continuous with respect to $\mu$. Then there exists $f\in L^{1}(X,d\mu)$ such that

 $\alpha(E)=\int _{E}f\, d\mu$ (B.6)

for each $E\in\Sigma$.

Theorem B.29.

Let $(X,\Sigma,\mu)$ be a measure space, and let $\sigma$ be a measure on $\Sigma$ that is singular with respect to $\mu$. Then there exists a set $Z$ with $\mu(Z)=0$, and

 $\sigma(E)=\sigma(E\cap Z)$

for each $E\in\Sigma$.

Theorem B.30 (Lebesgue Decomposition).

Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\Sigma$, and let $\nu$ be a finite measure on $\Sigma$. Then there is a unique decomposition

 $\nu=\alpha+\sigma,$ (B.7)

where $\alpha$ and $\sigma$ are measures on $\Sigma$ such that $\alpha$ is absolutely continuous with respect to $\mu$, and $\sigma$ is singular with respect to $\mu$.

See [12] or [15] for different proofs of the above statements. Note that Rudin in [12] considers the more general case of a complex measure.
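
The following example (an illustration added for orientation; the measure $\nu$, the point mass $\delta _{0}$ at the origin, and the characteristic function $\chi _{{[0,1]}}$ are notation introduced only for this example) shows both parts of the decomposition. Let $\mu=m$ be Lebesgue measure on the Borel subsets of $\mathbb{R}$, and define a finite measure by

 $\nu(E)=m(E\cap[0,1])+\delta _{0}(E),\quad\text{where }\delta _{0}(E)=1\text{ if }0\in E\text{ and }\delta _{0}(E)=0\text{ otherwise}.$

Then (B.7) holds with

 $\alpha(E)=\int _{E}\chi _{{[0,1]}}\, dm\quad\text{and}\quad\sigma=\delta _{0}.$

The measure $\alpha$ is absolutely continuous with respect to $m$, with density $f=\chi _{{[0,1]}}\in L^{1}(\mathbb{R},dm)$ in the sense of (B.6), and $\sigma$ is singular with respect to $m$, since the set $Z=\{ 0\}$ satisfies $m(Z)=0$ and $\sigma(E)=\sigma(E\cap Z)$ for every Borel set $E$.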

The following simple example shows that $\sigma$-finiteness is a necessary condition in the Radon-Nikodym theorem. Another example using the counting measure is described in [12, pp123-124].

Example B.31.

Let $\Sigma=\{\emptyset,X\}$ be the trivial $\sigma$-algebra on a set $X$, and let $\mu$ and $\nu$ be measures on $X$ with $\mu(\emptyset)=0$, $\mu(X)=\infty$, $\nu(\emptyset)=0$, and $\nu(X)=1$. Note that $\nu$ is absolutely continuous with respect to $\mu$, and that $\mu$ is not a $\sigma$-finite measure on $X$. Then the $\Sigma$-measurable functions $f:X\rightarrow\mathbb{C}$ are the constants (since $f$ measurable implies that $f^{{-1}}(U)\in\Sigma$ for all open sets $U\subseteq\mathbb{C}$), and so $\displaystyle{\int _{X}f\, d\mu=\infty}$ for any non-zero measurable function. Since $\nu(X)=1$, then there cannot exist any measurable function $f$ such that $\displaystyle{\int _{X}f\, d\mu=\nu(X)}$, and therefore the Radon-Nikodym theorem does not hold in this case.

The next lemma is a consequence of the well-known Vitali covering lemma.

Lemma B.32.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set. Then for all $\delta>0$ there exists a countable collection $\{ B_{n}\} _{{n\in\mathbb{N}}}$ of disjoint closed balls in $\Omega$ such that

1. $\diam B_{n}\leq\delta$ for all $n\in\mathbb{N}$, and

2. $\displaystyle{\Omega\setminus\bigcup _{{n\in\mathbb{N}}}B_{n}}$ has Lebesgue measure zero.

See [5, Corollary 2, p28] for a proof.

References

• 1
Robert A. Adams.
Sobolev spaces.
Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1975.
Pure and Applied Mathematics, Vol. 65.
• 2
S. Banach.
Theory of linear operations, volume 38 of North-Holland Mathematical Library.
North-Holland Publishing Co., Amsterdam, 1987.
Translated from the French by F. Jellett, With comments by A. Pełczyński and Cz. Bessaga.
• 3
Lawrence Conlon.
Differentiable manifolds: a first course.
Birkhäuser Advanced Texts: Basler Lehrbücher. [Birkhäuser Advanced Texts: Basel Textbooks]. Birkhäuser Boston Inc., Boston, MA, 1993.
• 4
Lawrence C. Evans.
Partial differential equations, volume 19 of Graduate Studies in Mathematics.
American Mathematical Society, Providence, RI, 1998.
• 5
Lawrence C. Evans and Ronald F. Gariepy.
Measure theory and fine properties of functions.
Studies in Advanced Mathematics. CRC Press, Boca Raton, FL, 1992.
• 6
Herbert Federer.
Geometric measure theory.
Die Grundlehren der mathematischen Wissenschaften, Band 153. Springer-Verlag New York Inc., New York, 1969.
• 7
Paul R. Halmos.
Measure Theory.
D. Van Nostrand Company, Inc., New York, N. Y., 1950.
• 8
Elliott H. Lieb and Michael Loss.
Analysis, volume 14 of Graduate Studies in Mathematics.
American Mathematical Society, Providence, RI, second edition, 2001.
• 9
Frank Morgan.
Geometric measure theory.
Academic Press Inc., San Diego, CA, third edition, 2000.
A beginner’s guide.
• 10
John C. Oxtoby.
Measure and category, volume 2 of Graduate Texts in Mathematics.
Springer-Verlag, New York, second edition, 1980.
A survey of the analogies between topological and measure spaces.
• 11
Michael Reed and Barry Simon.
Methods of modern mathematical physics. I.
Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, second edition, 1980.
Functional analysis.
• 12
Walter Rudin.
Real and complex analysis.
McGraw-Hill Book Co., New York, third edition, 1987.
• 13
Walter Rudin.
Functional analysis.
International Series in Pure and Applied Mathematics. McGraw-Hill Inc., New York, second edition, 1991.
• 14
François Trèves.
Topological vector spaces, distributions and kernels.