Sobolev spaces on Euclidean space

Graeme Wilkin
Department of Mathematics
University of Colorado
Boulder, CO 80309



1 Introduction

The purpose of these notes is to outline the basic definitions and theorems for Sobolev spaces defined on open subsets of Euclidean space. There are already many good references on this topic, so rather than duplicate them, the goal here is to give examples wherever possible to illustrate the theory, and to orient the reader towards the different approaches found in the literature. In addition, there is an appendix containing some basic results from measure theory (again with examples and references to some of the literature on the subject).

There are a number of more advanced topics that have been left to future versions of these notes; for example, complete proofs of the embedding and compactness theorems (as well as examples where embeddings don’t exist), the chain rule and the behaviour of weak derivatives under co-ordinate transformations. Good references for this material include the book [1] by Adams (a classic on the subject), and Ziemer’s book [16]. Sobolev spaces on manifolds and their use in gauge theory would also be good topics for an expanded version of these notes. Future versions of these notes will also contain more examples.

Notation. First-order partial derivatives $\frac{\partial u}{\partial x_i}$ are denoted $\partial_{x_i} u$ or $\partial_i u$. Higher-order partial derivatives use the standard notation for multi-indices (see [4, Appendix A]): given a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{Z}_{\geq 0}^n$ we write $|\alpha| = \alpha_1 + \cdots + \alpha_n$ for the order of $\alpha$, and define the $\alpha^{th}$ order partial derivative by

\[ D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} = \partial_{x_1}^{\alpha_1} \cdots \partial_{x_n}^{\alpha_n} u. \]
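As a small illustration of the multi-index notation, the following Python sketch enumerates all multi-indices of order at most $k$ in $n$ variables. The function name `multi_indices` is a hypothetical helper introduced only for this illustration.

```python
from itertools import product
from math import comb


def multi_indices(n, k):
    """Return all multi-indices alpha in Z_{>=0}^n with |alpha| <= k.

    Hypothetical helper for illustration: each alpha is a tuple
    (alpha_1, ..., alpha_n) and |alpha| is the sum of its entries.
    """
    candidates = product(range(k + 1), repeat=n)
    admissible = (a for a in candidates if sum(a) <= k)
    # Sort by order |alpha| first, then lexicographically.
    return sorted(admissible, key=lambda a: (sum(a), a))


if __name__ == "__main__":
    for alpha in multi_indices(2, 2):
        print(alpha, "has order", sum(alpha))
    # The number of multi-indices with |alpha| <= k in n variables is the
    # binomial coefficient C(n + k, k).
    print(len(multi_indices(2, 2)), comb(2 + 2, 2))
```

For example, with $n = 2$ the multi-index $(1, 1)$ corresponds to the mixed derivative $\partial_{x_1} \partial_{x_2} u$ of order two.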

2 Definition of Sobolev spaces

This section contains the definitions needed to define Sobolev spaces on open subsets of $\mathbb{R}^n$. In order to get a feel for distributional derivatives and Sobolev spaces, some basic examples are given throughout the section.

The approach taken in these notes is to follow the historical definition of Sobolev spaces. First, in this section, we define the distributional and weak derivatives, and then define Sobolev spaces in terms of these. Later on, in Section 3.2, we prove the Meyers-Serrin theorem, which says that Sobolev spaces are the completion of the space of smooth functions in the Sobolev norm.

2.1 Distributions and test functions

Let $\Omega \subseteq \mathbb{R}^n$ be open and non-empty, and let $\mathcal{C}_{\textup{cpt}}^\infty(\Omega)$ denote the space of smooth complex-valued functions with compact support in $\Omega$.

Definition 2.1.

The space of test functions on $\Omega$, denoted $\mathcal{D}(\Omega)$, is the locally convex topological vector space (more precisely, the LF-space) consisting of all the functions in $\mathcal{C}_{\textup{cpt}}^\infty(\Omega)$, with the following notion of convergence: a sequence $(\phi_m)_{m \in \mathbb{N}} \subset \mathcal{C}_{\textup{cpt}}^\infty(\Omega)$ converges in $\mathcal{D}(\Omega)$ to the function $\phi \in \mathcal{C}_{\textup{cpt}}^\infty(\Omega)$ if and only if there is some fixed compact set $K \subset \Omega$ such that the support of $\phi_m - \phi$ is contained in $K$ for all $m$, and $D^\alpha \phi_m \rightarrow D^\alpha \phi$ uniformly for each $\alpha$.

Remark 2.2.
  1. The notation $\mathcal{D}(\Omega)$ is used to emphasise the topology on $\mathcal{C}_{\textup{cpt}}^\infty(\Omega)$ described above.

  2. Note that the definition does not require the rate of uniform convergence to be independent of $\alpha$.

  3. To see that $\mathcal{D}(\Omega)$ is indeed a locally convex topological vector space, we construct a family of seminorms which induces the topology on $\mathcal{D}(\Omega)$. To this end, denote by $E_{\textup{cpt}}(\Omega)$ the set of compact exhaustions of $\Omega$, that is, the set of all families $K = (K_i)_{i \in \mathbb{N}}$ such that $K_i \subset \Omega$ is compact, $\bigcup_{i \in \mathbb{N}} K_i = \Omega$ and $K_i \subset K_{i+1}^\circ$ for all $i \in \mathbb{N}$. For every compact exhaustion $K$ and every pair $M = (m_i)_{i \in \mathbb{N}}$ and $N = (n_i)_{i \in \mathbb{N}}$ of sequences of natural numbers, denote by $p_{K,M,N} : \mathcal{D}(\Omega) \rightarrow \mathbb{R}_{\geq 0}$ the map defined by

    \[ p_{K,M,N}(\phi) = \sum_{i=0}^\infty \, \sup_{x \in K_{i+1} \setminus K_i^\circ} \, \sup_{0 \leq |\alpha| \leq m_i} \, n_i \, |D^\alpha \phi(x)|, \quad \phi \in \mathcal{D}(\Omega). \]

    Note that the sum in this formula is always finite since the support of $\phi$ is compact.

    It is straightforward to check that $p_{K,M,N}$ is a seminorm on $\mathcal{D}(\Omega)$ and that the family $\big(p_{K,M,N}\big)_{K \in E_{\textup{cpt}}(\Omega),\, M, N \in \mathbb{N}^{\mathbb{N}}}$ defines a locally convex topology on $\mathcal{D}(\Omega)$ that exactly recaptures the convergence in $\mathcal{D}(\Omega)$ as defined above (see [14, Chapter 13] for more details on the LF-space structure of $\mathcal{D}(\Omega)$). We present this explicit family of seminorms here, since we do not know of a reference to it in the literature.

Definition 2.3.

A distribution is a continuous complex-valued linear functional on the space of test functions $\mathcal{D}(\Omega)$. The space of distributions is the dual

\[ \mathcal{D}(\Omega)^* = \left\{ T : \mathcal{D}(\Omega) \rightarrow \mathbb{C} \,\mid\, T \text{ linear and continuous} \right\}. \]

Remark 2.4.

In the above definition, linearity simply means that if $T \in \mathcal{D}(\Omega)^*$ and $\phi, \psi \in \mathcal{D}(\Omega)$, then

\[ T(\lambda \phi + \mu \psi) = \lambda T(\phi) + \mu T(\psi) \quad \text{for all $\lambda, \mu \in \mathbb{C}$}. \]

Continuity means that whenever a sequence $(\phi_m)_{m \in \mathbb{N}} \subset \mathcal{D}(\Omega)$ converges in $\mathcal{D}(\Omega)$ (in the sense of Definition 2.1) to $\phi \in \mathcal{D}(\Omega)$, then $T(\phi_m) \rightarrow T(\phi)$ as a sequence in $\mathbb{C}$.

The space $\mathcal{D}(\Omega)^*$ is also equipped with a notion of convergence, defined as follows.

Definition 2.5.

A sequence $(T_n)_{n \in \mathbb{N}} \subset \mathcal{D}(\Omega)^*$ converges in $\mathcal{D}(\Omega)^*$ to $T \in \mathcal{D}(\Omega)^*$ if for every $\phi \in \mathcal{D}(\Omega)$ we have $T_n(\phi) \rightarrow T(\phi)$ in $\mathbb{C}$.

Remark 2.6.

This is the usual notion of convergence in the $\text{weak}^*$ topology on a dual space.

The following gives some examples of distributions (recall the definition of $L_{loc}^p(\Omega)$ from Appendix A).

Example 2.7.
  1. Given $x \in \Omega$, the delta functional is the distribution

    \[ \delta_x(\phi) = \phi(x). \]

    Clearly this is linear. To see that it is continuous, note that if $\phi_m \rightarrow \phi$ in $\mathcal{D}(\Omega)$, then $\phi_m(x) \rightarrow \phi(x)$, and so $\delta_x(\phi_m) \rightarrow \delta_x(\phi)$.

  2. The functional

    \[ T(\phi) = \int_\Omega \phi(x) \, dx \]

    is a distribution. (Note that since $\phi$ is continuous with compact support, it is also integrable.) Again, this is clearly linear. It is also continuous, since if $\phi_m \rightarrow \phi$ in $\mathcal{D}(\Omega)$, then, by definition, there exists a fixed compact set $K$ such that $\operatorname{supp}(\phi_m - \phi) \subseteq K$ for all $m$. Therefore

    \[ T(\phi) - T(\phi_m) = \int_\Omega (\phi(x) - \phi_m(x)) \, dx = \int_K (\phi(x) - \phi_m(x)) \, dx, \]

    and since $\phi_m \rightarrow \phi$ uniformly on the compact set $K$, we have $\int_K (\phi(x) - \phi_m(x)) \, dx \rightarrow 0$. Note that it is essential that $K$ has finite measure for this argument to work.

  3. Given $g \in L_{loc}^1(\Omega)$, let $T_g$ be the functional

    \[ T_g(\phi) = \int_\Omega \phi(x) g(x) \, dx. \tag{2.1} \]

    Note that Hölder’s inequality shows that $\left| T_g(\phi) \right| \leq \| \phi \|_{L^\infty(K)} \| g \|_{L^1(K)}$, where $K$ denotes the (compact) support of $\phi \in \mathcal{D}(\Omega)$. Therefore $T_g(\phi)$ is always finite since $\phi \in \mathcal{C}_{\textup{cpt}}^\infty$ implies that $\| \phi \|_{L^\infty}$ is finite, and so $T_g(\phi) \in \mathbb{C}$ for all $\phi \in \mathcal{D}(\Omega)$. As for the previous examples, clearly $T_g$ is linear, and it only remains to show that it is continuous. Note that if $\phi_m \rightarrow \phi$, then there is a fixed compact set $K$ with $\operatorname{supp}(\phi_m - \phi) \subseteq K$ for all $m$, and so

    \[ T_g(\phi) - T_g(\phi_m) = \int_K (\phi(x) - \phi_m(x)) g(x) \, dx \rightarrow 0 \]

    since $g \in L_{loc}^1(\Omega)$ and $\phi_m \rightarrow \phi$ uniformly on $K$. Therefore $T_g \in \mathcal{D}(\Omega)^*$.

    We will revisit this example later, since it appears in the definition of weak derivative in Section 2.2.
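The Hölder bound in the third example can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes $\Omega = \mathbb{R}$, the unbounded but locally integrable function $g(x) = \log|x|$, and a standard bump function as the test function. The names `bump`, `g` and `midpoint_sum` are hypothetical helpers, and the integrals are approximated by a midpoint rule.

```python
import math


def bump(x):
    """Standard smooth bump: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def g(x):
    """An unbounded but locally integrable function on R (undefined at 0)."""
    return math.log(abs(x))


def midpoint_sum(func, a, b, cells):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


# With an even number of cells on [-1, 1], no midpoint lands on the
# singularity of g at x = 0.
CELLS = 20000
T_g_phi = midpoint_sum(lambda x: bump(x) * g(x), -1.0, 1.0, CELLS)
sup_phi = bump(0.0)  # the bump attains its maximum at 0
g_l1_norm = midpoint_sum(lambda x: abs(g(x)), -1.0, 1.0, CELLS)

print("T_g(phi)                     ~", T_g_phi)
print("||phi||_inf * ||g||_{L^1(K)} ~", sup_phi * g_l1_norm)
```

Even though $g$ is unbounded near the origin, the pairing $T_g(\phi)$ is finite and is controlled by the product of the two norms, as the estimate predicts.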

The last example above is an important one: it shows that there is a linear map $L_{loc}^1(\Omega) \rightarrow \mathcal{D}(\Omega)^*$ given by $g \mapsto T_g$. (Recall that all $L^p$ and $L_{loc}^p$ spaces are defined to be spaces of equivalence classes of functions that are equal almost everywhere; the map is well-defined on equivalence classes in $L_{loc}^1(\Omega)$, since $f = g$ a.e. implies that $\int_\Omega f \phi \, dx = \int_\Omega g \phi \, dx$ for any test function $\phi$.) In fact, since Hölder’s inequality shows that there is an inclusion $L_{loc}^p(\Omega) \hookrightarrow L_{loc}^1(\Omega)$ for all $1 < p \leq \infty$, there is also a map $L_{loc}^p(\Omega) \rightarrow \mathcal{D}(\Omega)^*$. The next theorem says that this map is injective.

Theorem 2.8.

Let $\Omega \subset \mathbb{R}^n$ be open, and let $f$ and $g$ be functions in $L_{loc}^1(\Omega)$. Suppose that the distributions $T_f$ and $T_g$ are equal, i.e. $T_f(\phi) = T_g(\phi)$ for all test functions $\phi \in \mathcal{D}(\Omega)$. Then $f = g$ a.e. in $\Omega$.


Proof. This is proved in [8, Theorem 6.5] using convolutions; for variety, we give a slightly different proof here. Firstly, note that it is sufficient to prove the result for real-valued functions $f$ and $g$, since we can take real and imaginary parts. Suppose, for contradiction, that there exists a measurable set $K \subset \Omega$ whose Lebesgue measure is finite and non-zero, and which satisfies $f(x) \neq g(x)$ for all $x \in K$. Since the Lebesgue measure is Borel regular (see Lemmas B.20 and B.21), we can assume without loss of generality that $K$ is compact. Define $K_+ \subseteq K$ to be the subset on which $f(x) > g(x)$. After exchanging $f$ and $g$ if necessary, $K_+$ has non-zero measure, and again we can assume without loss of generality that $K_+$ is compact. Define the constant

\[ C := \int_{K_+} (f(x) - g(x)) \, dx > 0. \]

Now let $\{ V_n \}_{n \in \mathbb{N}}$ be a collection of open sets such that

  1. $K_+ \subset V_n$ and $V_{n+1} \subset V_n$ for each $n \in \mathbb{N}$, and

  2. the Lebesgue measure of $V_n \setminus K_+$ satisfies $|V_n \setminus K_+| < \frac{1}{n}$.

The existence of each $V_n$ is guaranteed since the Lebesgue measure is Borel regular; shrinking the sets if necessary, we may also assume that the closure of $V_1$ is a compact subset of $\Omega$, so that $f - g$ is integrable on every $V_n$. Now use Urysohn’s lemma (see Appendix A.3) to construct a smooth non-negative function $\phi_n \in \mathcal{C}_{\textup{cpt}}^\infty(\Omega)$ such that $0 \leq \phi_n(x) \leq 1$ for all $x \in \Omega$, $\phi_n(x) = 1$ for all $x \in K_+$, and $\phi_n(x) = 0$ for all $x \in \Omega \setminus V_n$. Therefore

\begin{align*}
\int_\Omega (f-g) \phi_n \, dx &= \int_{K_+} (f-g) \phi_n \, dx + \int_{V_n \setminus K_+} (f-g) \phi_n \, dx + \int_{\Omega \setminus V_n} (f-g) \phi_n \, dx \\
&= \int_{K_+} (f-g) \phi_n \, dx + \int_{V_n \setminus K_+} (f-g) \phi_n \, dx \\
&= C + \int_{V_n \setminus K_+} (f-g) \phi_n \, dx.
\end{align*}

The last term in the above equation satisfies the estimate

\begin{align*}
\left| \int_{V_n \setminus K_+} (f-g) \phi_n \, dx \right| &\leq \int_{V_n \setminus K_+} |f-g| \phi_n \, dx \\
&\leq \int_{V_n \setminus K_+} |f-g| \, dx,
\end{align*}

and, since $\left| V_n \setminus K_+ \right| \rightarrow 0$ as $n \rightarrow \infty$, we have

\[ \int_{V_n \setminus K_+} |f-g| \, dx \rightarrow 0 \quad \text{as $n \rightarrow \infty$}, \]

because the integral of a fixed integrable function is an absolutely continuous set function (see for example [15, Corollary 10.41]).

Therefore there exists an $n$ such that

\[ \int_\Omega (f-g) \phi_n \, dx = C + \int_{V_n \setminus K_+} (f-g) \phi_n \, dx > 0, \]

which contradicts the assumption that $T_f(\phi_n) = T_g(\phi_n)$. Therefore $f = g$ almost everywhere. ∎

A consequence of this theorem is that the distribution $T_f$ associated to a function $f \in L_{loc}^1(\Omega)$ uniquely determines an equivalence class in $L_{loc}^1(\Omega)$. Therefore, the following definition makes sense.

Definition 2.9.

A distribution $T \in \mathcal{D}(\Omega)^*$ represents the function $f \in L_{loc}^1(\Omega)$ if

\[ T(\phi) = \int_\Omega f(x) \phi(x) \, dx =: T_f(\phi) \]

for all test functions $\phi \in \mathcal{D}(\Omega)$. A function $f \in L_{loc}^1(\Omega)$ is represented by the distribution $T_f \in \mathcal{D}(\Omega)^*$.

Theorem 2.8 shows that each distribution can represent at most one element of $L_{loc}^1(\Omega)$, i.e. the map $L_{loc}^1(\Omega) \rightarrow \mathcal{D}(\Omega)^*$ given by $g \mapsto T_g$ is injective. The following example shows that not all distributions represent functions in $L_{loc}^1(\Omega)$, i.e. this map is not surjective.

Example 2.10.

Given any $x \in \Omega$, let $\delta_x \in \mathcal{D}(\Omega)^*$ be the delta functional defined in Example 2.7. We claim that this does not represent any function in $L_{loc}^1(\Omega)$. To see this, suppose for contradiction that $\delta_x(\phi) = \int_\Omega f(y) \phi(y) \, dy$ for some $f \in L_{loc}^1(\Omega)$ and every $\phi \in \mathcal{D}(\Omega)$. Consider a sequence of bump functions $\phi_n$ such that for all $n$ satisfying $\overline{B(x, \frac{1}{n})} \subset \Omega$ we have

  1. $\phi_n(x) = 1$,

  2. $\operatorname{supp}(\phi_n) \subseteq \overline{B(x, \frac{1}{n})}$,

  3. $0 \leq \phi_n(y) \leq 1$ for all $y \in B(x, \frac{1}{n})$, and

  4. $\int_\Omega \phi_n(y) \, dy \leq \left| B(x, \frac{1}{n}) \right|$.

Let $n_0$ be the first index with $\overline{B(x, \frac{1}{n_0})} \subset \Omega$. Then $f \phi_n \rightarrow 0$ pointwise on $\Omega \setminus \{x\}$, and for all $n \geq n_0$ the function $\left| f \phi_n \right| \leq \left| f \right|$ has support in $\overline{B(x, \frac{1}{n_0})}$, on which $f$ is integrable. Therefore dominated convergence shows that $\int_\Omega f(y) \phi_n(y) \, dy \rightarrow 0$ as $n \rightarrow \infty$, which contradicts $\delta_x(\phi_n) = 1$ for all $n$.

Therefore the delta functional is an example of a distribution that cannot be represented by a function in $L_{loc}^1(\Omega)$. It can, however, be represented by a measure (see the measure in Example 3.28), and in Section 3.4 we will show that positive distributions can always be represented by measures (see Theorem 3.35).
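The argument of Example 2.10 can be illustrated numerically. The following Python sketch assumes $\Omega = \mathbb{R}$, $x = 0$, and the particular locally integrable function $f(y) = |y|^{-1/2}$; the helper names `bump`, `phi_n` and `pairing` are hypothetical, and the integrals are approximated by a midpoint rule.

```python
import math


def bump(x):
    """Standard smooth bump: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def phi_n(y, n):
    """Bump function supported in [-1/n, 1/n], normalised so phi_n(0) = 1."""
    return bump(n * y) / bump(0.0)


def f(y):
    """A locally integrable function on R that is unbounded near 0."""
    return 1.0 / math.sqrt(abs(y))


def pairing(n, cells=20000):
    """Midpoint-rule approximation of the integral of f * phi_n over [-1/n, 1/n]."""
    a, b = -1.0 / n, 1.0 / n
    h = (b - a) / cells
    # An even cell count keeps every midpoint away from the singularity at 0.
    return h * sum(
        f(a + (i + 0.5) * h) * phi_n(a + (i + 0.5) * h, n) for i in range(cells)
    )


for n in (1, 10, 100, 1000):
    print(f"n = {n:4d}: delta_0(phi_n) = {phi_n(0.0, n)}, integral of f * phi_n ~ {pairing(n):.5f}")
```

The value $\delta_0(\phi_n) = \phi_n(0)$ stays equal to $1$, while the integrals of $f \phi_n$ shrink as $n$ grows, so no such $f$ can reproduce the delta functional.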

2.2 Distributional derivatives and Sobolev spaces

Before defining Sobolev spaces, first we have to define the notion of the derivative of a distribution.

Definition 2.11.

Let $\Omega \subseteq \mathbb{R}^n$ be open, let $T \in \mathcal{D}(\Omega)^*$, and let $\alpha \in \mathbb{Z}_{\geq 0}^n$. The $\alpha^{th}$ distributional derivative of $T$ is the distribution $D^\alpha T$ defined by

\[ D^\alpha T(\phi) = (-1)^{|\alpha|} T(D^\alpha \phi) \]

for all test functions $\phi \in \mathcal{D}(\Omega)$. The distributional gradient, denoted $\nabla T$, is the $n$-tuple of distributions

\[ \nabla T = \left( \partial_1 T, \ldots, \partial_n T \right). \]

If $T$ and $D^\alpha T$ both represent functions in $L_{loc}^1(\Omega)$ (i.e. $T = T_f$ and $D^\alpha T = T_g$ for some $f, g \in L_{loc}^1(\Omega)$) then we say that $g$ is a weak derivative of $f$, and write $g = D^\alpha f$. In this case we say that the weak derivative of $f$ exists.

Remark 2.12.
  1. Since the weak derivative is defined by the relation

    \[ \int_\Omega f(x) D^\alpha \phi(x) \, dx = (-1)^{|\alpha|} \int_\Omega g(x) \phi(x) \, dx \]

    for all test functions $\phi$, it is only defined up to equivalence almost everywhere.

  2. The distributional derivative always exists for any multi-index $\alpha$, since the definition only involves differentiating test functions, which are smooth. Since partial derivatives of smooth functions commute, distributional derivatives also commute, i.e.

    \[ \partial_i \partial_j T = \partial_j \partial_i T. \]

  3. As we will see in the examples below, the weak derivative does not always exist, and, in fact, may not exist for any non-zero multi-index $\alpha$ (in Example 2.19 we show that the step function is an example of such a function).

The following lemma shows that the weak derivative extends the notion of the classical derivative of differentiable functions. It says that the distributional derivative of the distribution associated to a differentiable function $g$ is the distribution associated to the classical derivative of $g$.

Lemma 2.13.

Let $g \in C^{|\alpha|}(\Omega)$. Then for all $\phi \in \mathcal{D}(\Omega)$ we have

\[ D^\alpha T_g(\phi) = (-1)^{|\alpha|} \int_\Omega (D^\alpha \phi(x)) g(x) \, dx = \int_\Omega \phi(x) (D^\alpha g(x)) \, dx = T_{D^\alpha g}(\phi). \tag{2.2} \]

Proof. The proof simply involves applying the definitions and the integration by parts formula from Section A.2. ∎

Remark 2.14.
  1. It is important to emphasise that $D^\alpha$ is used to denote both the distributional derivative and the classical derivative in the statement of the lemma: $D^\alpha T_g$ is the distributional derivative of the distribution $T_g$ associated to the function $g$, and $T_{D^\alpha g}$ is the distribution associated to the classical derivative $D^\alpha g$.

  2. It is an important exercise to think through the precise meaning of all of the statements above, to understand the distinction between a weak derivative and a distributional derivative, and to understand the meaning of each term in (2.2).

The next lemma shows that functions that are equal almost everywhere have the same distributional derivatives. As a consequence, when defining Sobolev spaces in Definition 2.16, we can define them as subsets of the $L^p$ and $L_{loc}^p$ spaces (i.e. we consider equivalence classes of functions that are equal almost everywhere).

Lemma 2.15.

If $f = g$ almost everywhere, then $D^\alpha T_f = D^\alpha T_g$ as distributions.

Proof. The proof is another straightforward application of the definition of distributional derivative. For any test function $\phi \in \mathcal{D}(\Omega)$ we have

\begin{align*}
D^\alpha T_f(\phi) = (-1)^{|\alpha|} T_f(D^\alpha \phi) &= (-1)^{|\alpha|} \int_\Omega f(x) \, D^\alpha \phi(x) \, dx \\
&= (-1)^{|\alpha|} \int_\Omega g(x) \, D^\alpha \phi(x) \, dx \quad \text{(since $f = g$ a.e.)} \\
&= (-1)^{|\alpha|} T_g(D^\alpha \phi) = D^\alpha T_g(\phi).
\end{align*}

Therefore, $D^\alpha T_f(\phi) = D^\alpha T_g(\phi)$ for all $\phi \in \mathcal{D}(\Omega)$, and so $D^\alpha T_f = D^\alpha T_g$ as elements of $\mathcal{D}(\Omega)^*$. ∎

Now that we have developed the necessary machinery, we are ready to define Sobolev spaces.

Definition 2.16.

The Sobolev space $W^{k,p}(\Omega)$ is the space of equivalence classes of all functions $f \in L^p(\Omega)$ such that the weak derivative $D^\alpha f$ exists and is in $L^p(\Omega)$ for all $\alpha$ such that $|\alpha| \leq k$.

The Sobolev space $W_{loc}^{k,p}(\Omega)$ is the space of equivalence classes of all functions $f \in L_{loc}^p(\Omega)$ such that the weak derivative $D^\alpha f$ exists and is in $L_{loc}^p(\Omega)$ for all $\alpha$ such that $|\alpha| \leq k$.

The space $W^{k,p}(\Omega)$ has a norm given by

\[ \left\| f \right\|_{W^{k,p}(\Omega)} = \sum_{j=0}^k \left( \sum_{\alpha : |\alpha| = j} \| D^\alpha f \|_{L^p(\Omega)} \right), \]

and we define $W_0^{k,p}(\Omega)$ to be the closure of the space $\mathcal{C}_{\textup{cpt}}^\infty(\Omega)$ in the topology induced by this norm.

For a compact subset $K \subset \Omega$ we define the norm

\[ \left\| f \right\|_{W^{k,p}(K)} = \sum_{j=0}^k \left( \sum_{\alpha : |\alpha| = j} \| D^\alpha f \|_{L^p(K)} \right), \]

where the weak derivatives $D^\alpha f$ are defined on $\Omega$.
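To make the norm concrete, the following Python sketch approximates the $W^{1,p}$ norm of $f(x) = |x|$ on the interval $(-1, 1)$, using the fact (shown in Example 2.20) that its weak derivative is the sign function. The function names are hypothetical helpers and the integrals are approximated by a midpoint rule; this is an illustration under those assumptions rather than a general-purpose routine.

```python
import math


def lp_norm(func, p, a=-1.0, b=1.0, cells=20000):
    """Midpoint-rule approximation of the L^p norm of func on (a, b), 1 <= p < infinity."""
    h = (b - a) / cells
    total = h * sum(abs(func(a + (i + 0.5) * h)) ** p for i in range(cells))
    return total ** (1.0 / p)


def f(x):
    return abs(x)


def weak_derivative_of_f(x):
    """Weak derivative of |x|, namely the sign function (see Example 2.20)."""
    return 1.0 if x >= 0.0 else -1.0


def sobolev_norm_1p(p):
    """W^{1,p}(-1, 1) norm of f: the sum of the L^p norms of f and its weak derivative."""
    return lp_norm(f, p) + lp_norm(weak_derivative_of_f, p)


print("W^{1,2} norm of |x| on (-1, 1) ~", sobolev_norm_1p(2))
```

In this case the exact value is $\left( \frac{2}{p+1} \right)^{1/p} + 2^{1/p}$, which gives a simple check on the numerical approximation.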

Lemma 2.17.

The norm $\| \cdot \|_{W^{k,p}(\Omega)}$ gives $W^{k,p}(\Omega)$ the structure of a normed linear space for $1 \leq p \leq \infty$.

Proof. Recall that we have to check the following.

  1. The space has a unique element of zero norm, i.e. $f = 0$ if and only if $\| f \|_{W^{k,p}(\Omega)} = 0$.

  2. The norm is absolutely homogeneous with respect to scalar multiplication, i.e. $\| cf \|_{W^{k,p}(\Omega)} = |c| \| f \|_{W^{k,p}(\Omega)}$ for all $c \in \mathbb{C}$ and $f \in W^{k,p}(\Omega)$.

  3. The triangle inequality holds, i.e.

    \[ \| f + g \|_{W^{k,p}(\Omega)} \leq \| f \|_{W^{k,p}(\Omega)} + \| g \|_{W^{k,p}(\Omega)} \]

    for all $f, g \in W^{k,p}(\Omega)$.

It is easy to check (1): the result is true for $L^p(\Omega)$, we have $W^{k,p}(\Omega) \subseteq L^p(\Omega)$, and $\| f \|_{L^p(\Omega)} \leq \| f \|_{W^{k,p}(\Omega)}$ for all $f \in W^{k,p}(\Omega)$, so $\| f \|_{W^{k,p}(\Omega)} = 0$ implies $f = 0$.

The weak derivative commutes with scalar multiplication, i.e. $D^\alpha(cf) = c D^\alpha f$ for all $c \in \mathbb{C}$, and so we also have $\| D^\alpha(cf) \|_{L^p(\Omega)} = |c| \| D^\alpha f \|_{L^p(\Omega)}$. Therefore (2) is satisfied by definition of the Sobolev norm.

The triangle inequality for $W^{k,p}(\Omega)$ follows from the definition of the Sobolev norm and the triangle inequality for $L^p(\Omega)$ (which is Minkowski’s inequality, see for example [15, Theorem 8.10]). ∎

It is worth recalling that $W^{k,p}(\Omega)$ can never be a normed linear space for $0 < p < 1$, since the triangle inequality fails in this case. See for example the remark on p. 130 of [15], and also [15, Theorem 8.16]. For more discussion of $L^p$ spaces for $0 < p < 1$, see [13, pp. 35-36].

Remark 2.18.

We will see later, in Section 3.1, that $W^{k,p}(\Omega)$ is a Banach space with this norm.

It is worth studying some examples of distributional and weak derivatives. The first example is the step function, for which the distributional derivative is the delta functional from Example 2.7. This is an important example, since it shows that the step function is not in any Sobolev space $W^{k,p}(\Omega)$ or $W_{loc}^{k,p}(\Omega)$ for $k \geq 1$, because the delta functional cannot be represented by a function.

Example 2.19.

Let $g : \mathbb{R} \rightarrow \mathbb{R}$ be the step function

\[ g(x) = \begin{cases} 1 & x \geq 0, \\ 0 & x < 0. \end{cases} \]

Given a test function $\phi \in \mathcal{C}_{\textup{cpt}}^\infty(\mathbb{R})$, consider the integral

\[ \int_{\mathbb{R}} g(x) \partial_x \phi(x) \, dx = \int_0^\infty \partial_x \phi(x) \, dx = \left[ \phi(x) \right]_0^\infty = -\phi(0). \]

(Recall that $\phi$ vanishes at infinity since it has compact support.) Therefore the distributional derivative of $T_g$ is the linear functional $\partial_x T_g \in \mathcal{D}(\mathbb{R})^*$ given by $\partial_x T_g(\phi) = -T_g(\partial_x \phi) = \phi(0)$, i.e. $\partial_x T_g$ is the delta functional $\delta_0$. Example 2.10 shows that this cannot be represented by a function, and therefore the weak derivative of the step function does not exist, so the step function is not in $W^{1,p}(\mathbb{R})$ or $W_{loc}^{1,p}(\mathbb{R})$ for any $p$.
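The identity $\int_{\mathbb{R}} g \, \partial_x \phi \, dx = -\phi(0)$ can be checked numerically for a particular test function. In the following Python sketch the test function is a standard bump shifted to be centred at $0.3$ (an arbitrary choice made so that the example is not symmetric about the jump); the helper names are hypothetical and the integral is approximated by a midpoint rule.

```python
import math


def bump(x):
    """Standard smooth bump: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def phi(x):
    """Test function: a bump centred at 0.3, supported in [-0.7, 1.3]."""
    return bump(x - 0.3)


def dphi(x):
    """Classical derivative of phi, computed by the chain rule."""
    s = x - 0.3
    value = bump(s)
    if value == 0.0:
        return 0.0
    t = 1.0 - s * s
    return value * (-2.0 * s) / (t * t)


def step(x):
    return 1.0 if x >= 0.0 else 0.0


def midpoint_sum(func, a, b, cells=40000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


# [-2, 2] contains the support of phi; an even cell count puts the jump of
# the step function at x = 0 on a cell boundary.
lhs = midpoint_sum(lambda x: step(x) * dphi(x), -2.0, 2.0)
print("integral of g * phi' ~", lhs)
print("-phi(0)              =", -phi(0.0))
```

The two printed values agree up to the quadrature error, consistent with $\partial_x T_g = \delta_0$.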

Example 2.20.

Let $f(x) = |x|$. To compute the weak derivative we first consider

\begin{equation}
\begin{split}
\int_{\mathbb{R}} f(x) \partial_1 \phi(x) \, dx &= \int_{-\infty}^0 (-x) \partial_1 \phi(x) \, dx + \int_0^\infty x \partial_1 \phi(x) \, dx \\
&= \left[ -x \phi(x) \right]_{-\infty}^0 + \int_{-\infty}^0 \phi(x) \, dx + \left[ x \phi(x) \right]_0^\infty - \int_0^\infty \phi(x) \, dx \\
&= \int_{-\infty}^0 \phi(x) \, dx - \int_0^\infty \phi(x) \, dx.
\end{split}
\tag{2.3}
\end{equation}

Now define

\[ g(x) = \begin{cases} 1 & x \geq 0, \\ -1 & x < 0, \end{cases} \]

and note that the previous calculation (2.3) shows that $\int_{\mathbb{R}} f(x) \partial_1 \phi(x) \, dx = -\int_{\mathbb{R}} g(x) \phi(x) \, dx$. Therefore the weak derivative of $f(x) = |x|$ is the step function $g(x)$.
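As a numerical sanity check of the defining relation for the weak derivative of $|x|$, the following Python sketch compares $\int_{\mathbb{R}} |x| \, \partial_1 \phi \, dx$ with $-\int_{\mathbb{R}} g \phi \, dx$ for one test function. The test function (a bump centred at $0.3$) and the helper names are assumptions made for this illustration; both integrals are approximated by a midpoint rule.

```python
import math


def bump(x):
    """Standard smooth bump: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def phi(x):
    """Test function: a bump centred at 0.3, supported in [-0.7, 1.3]."""
    return bump(x - 0.3)


def dphi(x):
    """Classical derivative of phi, computed by the chain rule."""
    s = x - 0.3
    value = bump(s)
    if value == 0.0:
        return 0.0
    t = 1.0 - s * s
    return value * (-2.0 * s) / (t * t)


def sign(x):
    """The candidate weak derivative g of |x|."""
    return 1.0 if x >= 0.0 else -1.0


def midpoint_sum(func, a, b, cells=40000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


# Both integrands have their only irregular point (x = 0) on a cell boundary.
lhs = midpoint_sum(lambda x: abs(x) * dphi(x), -2.0, 2.0)
rhs = -midpoint_sum(lambda x: sign(x) * phi(x), -2.0, 2.0)
print("integral of |x| * phi' ~", lhs)
print("-integral of g * phi   ~", rhs)
```

A test function that is symmetric about the origin would make both sides vanish, which is why the bump is shifted here.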

The next example generalises the method of the previous example to locally Lipschitz functions.

Example 2.21.

In this example we show that if $f$ is locally Lipschitz on $\Omega$ then $f \in W_{loc}^{1,\infty}(\Omega)$. Rademacher’s theorem shows that the partial derivatives of $f$ exist almost everywhere (see Corollary A.17), and the goal of this example is to show that these partial derivatives are equal almost everywhere to the weak derivative of $f$ in each co-ordinate direction.

For each compact set $K \subset \Omega$, let $M_K$ be the associated Lipschitz constant, i.e. for all $x, y \in K$ we have

\[ \left| f(x) - f(y) \right| \leq M_K \left| x - y \right|. \tag{2.4} \]

(Note that this differs slightly from Definition A.14; however, we can easily extend that definition to compact sets $K$ by taking an open cover of $K$.)

Equation (2.4) implies that $f \in L_{loc}^\infty(\Omega)$. Therefore the integral

\[ \int_\Omega f(x) \phi(x) \, dx \]

is defined for any $\phi \in \mathcal{C}_{\textup{cpt}}^\infty(\Omega)$. To show that $f$ has a weak derivative in the $j$-th co-ordinate direction, we need to show that there exists $g$ such that

\[ \int_\Omega f(x) \partial_j \phi(x) \, dx = -\int_\Omega g(x) \phi(x) \, dx \]

for all test functions $\phi \in \mathcal{C}_{\textup{cpt}}^\infty(\Omega)$.

Let $K = \operatorname{supp}(\phi)$. Since $K$ is compact, there exists $\varepsilon > 0$ such that if $|h| < \varepsilon$ then $x + he_j \in \Omega$ for all $x \in K$, and so $\phi(x + he_j)$ is well-defined for small values of $|h|$. Therefore

\[ \int_\Omega f(x) \partial_j \phi(x) \, dx = \int_\Omega f(x) \lim_{h \rightarrow 0} \frac{\phi(x + he_j) - \phi(x)}{h} \, dx. \]

The next step involves using dominated convergence to interchange the order of integration and differentiation. Since this is a standard technique that is used in many examples, we include all of the details here. First note that since $\phi$ is smooth with compact support, it is uniformly Lipschitz, and so the absolute value of the difference quotients $\left| \frac{\phi(x + he_j) - \phi(x)}{h} \right|$ is uniformly bounded by a constant (call it $\tilde{M}$) for $|h| < \varepsilon$. Since $f \in L_{loc}^1(\Omega)$ and, for $|h| \leq \frac{1}{2}\varepsilon$, the difference quotients are all supported in a fixed compact set $K' \subset \Omega$, we have

\[ \left| f(x) \frac{\phi(x + he_j) - \phi(x)}{h} \right| \leq \tilde{M} \left| f(x) \right| \chi_{K'}(x) \in L^1(\Omega), \]

and so we can use dominated convergence to write

\[ \int_\Omega f(x) \lim_{h \rightarrow 0} \frac{\phi(x + he_j) - \phi(x)}{h} \, dx = \lim_{h \rightarrow 0} \int_\Omega f(x) \frac{\phi(x + he_j) - \phi(x)}{h} \, dx. \]

Changing variables, and recalling that the upper bound on $h$ was chosen so that $x - he_j \in \Omega$ for all $x \in \operatorname{supp}(\phi)$, gives us

\begin{equation}
\begin{split}
\lim_{h \rightarrow 0} \int_\Omega f(x) \frac{\phi(x + he_j) - \phi(x)}{h} \, dx &= \lim_{h \rightarrow 0} \left( \int_\Omega \frac{1}{h} f(x) \phi(x + he_j) \, dx - \int_\Omega \frac{1}{h} f(x) \phi(x) \, dx \right) \\
&= \lim_{h \rightarrow 0} \left( \int_\Omega \frac{1}{h} f(x - he_j) \phi(x) \, dx - \int_\Omega \frac{1}{h} f(x) \phi(x) \, dx \right) \\
&= \lim_{h \rightarrow 0} \int_\Omega \frac{f(x - he_j) - f(x)}{h} \phi(x) \, dx.
\end{split}
\tag{2.5}
\end{equation}

(Even though $x - he_j$ may not be in $\Omega$ for arbitrary $x \in \Omega$, we do have $x - he_j \in \Omega$ for all $x \in K$. Since the support of $\phi$ is $K \subset\subset \Omega$, we can define

\[ \int_\Omega \frac{1}{h} f(x - he_j) \phi(x) \, dx := \int_K \frac{1}{h} f(x - he_j) \phi(x) \, dx, \]

and therefore the integral in the above calculation makes sense.)

The quantity $\frac{f(x - he_j) - f(x)}{h} \phi(x)$ is uniformly bounded for $|h| \leq \frac{1}{2}\varepsilon$ (since $f$ is locally Lipschitz), and so another application of dominated convergence gives us

\[ \lim_{h \rightarrow 0} \int_\Omega \frac{f(x - he_j) - f(x)}{h} \phi(x) \, dx = \int_\Omega \lim_{h \rightarrow 0} \frac{f(x - he_j) - f(x)}{h} \phi(x) \, dx. \tag{2.6} \]

Rademacher’s theorem shows that for each $j = 1, \ldots, n$, the partial derivative $\partial_j f$ exists almost everywhere in $\Omega$, and, on the compact set $K = \operatorname{supp}(\phi)$, it is bounded above by the Lipschitz constant $M_K$. Let $g_j(x)$ be a function defined on all of $\Omega$ that is equal almost everywhere to $\partial_j f(x)$. Therefore

\[ \int_\Omega \lim_{h \rightarrow 0} \frac{f(x - he_j) - f(x)}{h} \phi(x) \, dx = -\int_\Omega \lim_{h \rightarrow 0} \frac{f(x - he_j) - f(x)}{(-h)} \phi(x) \, dx = -\int_\Omega g_j(x) \phi(x) \, dx, \]

and so we have shown that

\[ \int_\Omega f(x) \partial_j \phi(x) \, dx = -\int_\Omega g_j(x) \phi(x) \, dx. \]

Therefore the weak derivative exists and is equal almost everywhere to $\partial_j f(x)$. Since $|g_j(x)| \leq M_K$ almost everywhere on each compact set $K$, we conclude that $f \in W_{loc}^{1,\infty}(\Omega)$.

Remark 2.22.

The part of the above proof that requires the Lipschitz condition on $f$ is the application of dominated convergence in (2.6). The fact that the derivative of $f$ exists almost everywhere is not sufficient for a weak derivative to exist. For example, the derivative of the step function is zero almost everywhere, but we showed in Example 2.19 that the step function does not have a weak derivative. The reason is that (2.6) fails for the step function (the rest of the proof does go through for the step function).
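The failure of (2.6) for the step function can be seen numerically. For the step function $g$ of Example 2.19, the difference quotient $\frac{g(x - h) - g(x)}{h}$ tends to zero for every $x \neq 0$, so the right-hand side of (2.6) is zero, whereas the following Python sketch suggests that the left-hand side tends to $-\phi(0)$. The test function (a bump centred at $0.3$) and the helper names are assumptions made for this illustration, and the integrals are approximated by a midpoint rule.

```python
import math


def bump(x):
    """Standard smooth bump: exp(-1/(1-x^2)) on (-1, 1), zero elsewhere."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def phi(x):
    """Test function: a bump centred at 0.3, supported in [-0.7, 1.3]."""
    return bump(x - 0.3)


def step(x):
    return 1.0 if x >= 0.0 else 0.0


def midpoint_sum(func, a, b, cells):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))


def quotient_pairing(h, cells=40000):
    """Integral over [-2, 2] of ((g(x - h) - g(x)) / h) * phi(x) for the step function g.

    The cell width is 1e-4, so the values of h used below are multiples of it
    and the jumps of the integrand fall on cell boundaries.
    """
    return midpoint_sum(lambda x: (step(x - h) - step(x)) / h * phi(x), -2.0, 2.0, cells)


for h in (0.1, 0.01, 0.001):
    print(f"h = {h}: integral of difference quotient * phi ~ {quotient_pairing(h):.6f}")
print("-phi(0) =", -phi(0.0))
```

The integrals approach $-\phi(0)$ rather than $0$, so the limit and the integral cannot be interchanged in this case; the difference quotients of the step function are not uniformly bounded near the jump.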

3 Basic properties of Sobolev spaces

In this section we prove some basic results about Sobolev spaces. The results of Sections 3.1 and 3.3 describe basic functional analytic properties of Sobolev spaces, while Section 3.2 gives an alternative characterisation of Sobolev spaces as the completion of the space of smooth functions. Section 3.4 provides an answer to an earlier question by showing that, although distributions cannot always be represented by locally integrable functions, the positive distributions can always be represented by regular Borel measures.

3.1 Banach and Hilbert space structure of Sobolev spaces

It is well-known that Lp(Ω)L^{p}(\Omega) (with the LpL^{p} norm) is a Banach space, and that L2(Ω)L^{2}(\Omega) (with the L2L^{2} inner product) is a Hilbert space. In a similar way, we can show that the Sobolev spaces Wk,p(Ω)W^{{k,p}}(\Omega) have the structure of a Banach space, and that Wk,2(Ω)W^{{k,2}}(\Omega) has the structure of a Hilbert space, and it is the goal of this section to give the details of this proof. This is a useful theorem, since it allows us to use theorems from functional analysis to study sequences of functions in Sobolev spaces.

Firstly, recall that the space Lp(Ω)L^{p}(\Omega), together with the LpL^{p} norm, is complete when 1p1\leq p\leq\infty (see for example [8, Theorem 2.7] or [15, Theorem 8.14]). To extend this to the Sobolev space Wk,p(Ω)W^{{k,p}}(\Omega), we use an inductive argument. The proof of the following lemma gives the basic idea of this argument for k=1k=1.

Lemma 3.1.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set and $1\leq p\leq\infty$. Then the space $W^{1,p}(\Omega)$ is complete in the norm $\|\cdot\|_{W^{1,p}(\Omega)}$.


Proof. Let $(u_{m})_{m\in\mathbb{N}}$ be a Cauchy sequence in $W^{1,p}(\Omega)$. By the definition of the Sobolev norm, $\|u_{m}-u_{l}\|_{L^{p}}\leq\|u_{m}-u_{l}\|_{W^{1,p}}$ for all $m,l\in\mathbb{N}$, and so $(u_{m})_{m\in\mathbb{N}}$ is also Cauchy in $L^{p}$. Similarly, since $\|\partial_{j}u_{m}-\partial_{j}u_{l}\|_{L^{p}}\leq\|u_{m}-u_{l}\|_{W^{1,p}}$ (again this follows from the definition of the Sobolev norm), each sequence $(\partial_{j}u_{m})_{m\in\mathbb{N}}$ is Cauchy in $L^{p}$.

Since Lp(Ω)L^{p}(\Omega) is complete, then there are functions v0,v1,,vnv_{0},v_{1},\ldots,v_{n} such that

\begin{align*}
u_{m}&\stackrel{L^{p}}{\longrightarrow}v_{0}\\
\partial_{j}u_{m}&\stackrel{L^{p}}{\longrightarrow}v_{j},\quad\quad j=1,\ldots,n.
\end{align*}

Hölder’s inequality shows that $L^{p}(\Omega)\subset L_{loc}^{1}(\Omega)$, and so each $u_{m}$ determines a distribution $T_{u_{m}}\in\mathcal{D}(\Omega)^{*}$ given by

\[
T_{u_{m}}(\phi)=\int_{\Omega}u_{m}\phi\,dx
\]

for all test functions $\phi\in\mathcal{D}(\Omega)$.

Another application of Hölder’s inequality gives the following estimate for any $\phi\in\mathcal{D}(\Omega)$:

\[
\left|T_{u_{m}}(\phi)-T_{v_{0}}(\phi)\right|\leq\int_{\Omega}\left|u_{m}(x)-v_{0}(x)\right|\left|\phi(x)\right|\,dx\leq\|\phi\|_{L^{q}}\|u_{m}-v_{0}\|_{L^{p}},
\]

where $q$ is the conjugate Hölder exponent of $p$. (Note that the integral exists since $\sup|\phi|$ is finite, $\phi$ has compact support, and $u_{m}-v_{0}\in L_{loc}^{1}(\Omega)$.) Since $u_{m}\rightarrow v_{0}$ in $L^{p}$, this shows that $T_{u_{m}}(\phi)\rightarrow T_{v_{0}}(\phi)$ for every test function $\phi$, i.e. $T_{u_{m}}\rightarrow T_{v_{0}}$ in $\mathcal{D}(\Omega)^{*}$.

The same argument with $u_{m}$ replaced by $\partial_{j}u_{m}$ and $v_{0}$ replaced by $v_{j}$ shows that $T_{\partial_{j}u_{m}}\rightarrow T_{v_{j}}$. We then have for every test function $\phi\in\mathcal{D}(\Omega)$

\begin{align*}
T_{v_{j}}(\phi)&=\lim_{m\rightarrow\infty}T_{\partial_{j}u_{m}}(\phi)\\
&=-\lim_{m\rightarrow\infty}T_{u_{m}}(\partial_{j}\phi)\\
&=-T_{v_{0}}(\partial_{j}\phi)\\
&=T_{\partial_{j}v_{0}}(\phi)\quad\text{(by definition of distributional derivative)}.
\end{align*}

Therefore $v_{j}$ is a weak derivative of $v_{0}$, and, by Theorem 2.8, we have $v_{j}=\partial_{j}v_{0}$ almost everywhere, where $\partial_{j}$ is the weak derivative. In particular $v_{0}\in W^{1,p}(\Omega)$, since $v_{0}$ and each $v_{j}$ lie in $L^{p}(\Omega)$. Since $u_{m}\rightarrow v_{0}$ and $\partial_{j}u_{m}\rightarrow\partial_{j}v_{0}$ in $L^{p}$, we have shown that $u_{m}\rightarrow v_{0}$ in $W^{1,p}(\Omega)$, and so $W^{1,p}(\Omega)$ is complete. ∎
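The norm comparison used at the start of the proof can be written out explicitly. The following sketch assumes that $1\leq p<\infty$ and that the Sobolev norm from Section 2 has the form $\|u\|_{W^{1,p}}=\bigl(\|u\|_{L^{p}}^{p}+\sum_{j}\|\partial_{j}u\|_{L^{p}}^{p}\bigr)^{1/p}$; this form is an assumption here, since the definition is not repeated in this section.

```latex
% Assumes 1 <= p < infinity and the l^p-type Sobolev norm described in the lead-in.
\begin{align*}
\|u_m-u_l\|_{L^p}^p
  &\leq \|u_m-u_l\|_{L^p}^p+\sum_{j=1}^{n}\|\partial_j u_m-\partial_j u_l\|_{L^p}^p
   =\|u_m-u_l\|_{W^{1,p}}^p,\\
\|\partial_j u_m-\partial_j u_l\|_{L^p}^p
  &\leq \|u_m-u_l\|_{W^{1,p}}^p \qquad (j=1,\ldots,n).
\end{align*}
```

Taking $p$-th roots shows that a sequence which is Cauchy in $W^{1,p}$ is Cauchy in $L^{p}$ together with each of its first-order weak derivatives.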

Using this technique we can now prove the following theorem, which, together with Lemma 2.17, says that Wk,p(Ω)W^{{k,p}}(\Omega) is a Banach space.

Theorem 3.2.

Let $\Omega\subseteq\mathbb{R}^{n}$ be open and $1\leq p\leq\infty$. Then $W^{k,p}(\Omega)$ is complete in the norm $\|\cdot\|_{W^{k,p}}$ for all $k\geq 0$. In particular, $W^{k,p}(\Omega)$ is a Banach space for all $1\leq p\leq\infty$ and $k\in\mathbb{Z}_{\geq 0}$.


Proof. The proof uses induction on $k$. The case $k=0$ follows from standard results about $L^{p}$ spaces (see for example [15, Theorem 8.14]). Suppose that $W^{k-1,p}(\Omega)$ is complete, and let $(u_{m})_{m\in\mathbb{N}}$ be a Cauchy sequence in $W^{k,p}(\Omega)$. Then the sequences $(u_{m})_{m\in\mathbb{N}}$ and $(\partial_{j}u_{m})_{m\in\mathbb{N}}$ (for $j=1,\ldots,n$) are Cauchy in $W^{k-1,p}(\Omega)$, and the completeness of $W^{k-1,p}(\Omega)$ shows that there exist functions $v_{0},v_{1},\ldots,v_{n}\in W^{k-1,p}(\Omega)$ such that

\begin{align*}
u_{m}&\stackrel{W^{k-1,p}}{\longrightarrow}v_{0}\\
\partial_{j}u_{m}&\stackrel{W^{k-1,p}}{\longrightarrow}v_{j}\quad\text{for all $j=1,\ldots,n$}.
\end{align*}

Note that the inductive hypothesis shows that $D^{\alpha}(\partial_{j}u_{m})\rightarrow D^{\alpha}v_{j}$ in $L^{p}$ for all multi-indices $\alpha$ such that $|\alpha|\leq k-1$, and so it only remains to show that $\partial_{j}v_{0}=v_{j}$ for each $j=1,\ldots,n$.

As in the previous proof we can show that

\[
\left|T_{\partial_{j}u_{m}}(\phi)-T_{v_{j}}(\phi)\right|\leq\|\phi\|_{L^{q}}\|\partial_{j}u_{m}-v_{j}\|_{L^{p}},
\]

and so for all test functions $\phi\in\mathcal{D}(\Omega)$ we have

\[
T_{v_{j}}(\phi)=\lim_{m\rightarrow\infty}T_{\partial_{j}u_{m}}(\phi)=-\lim_{m\rightarrow\infty}T_{u_{m}}(\partial_{j}\phi)=-T_{v_{0}}(\partial_{j}\phi)=T_{\partial_{j}v_{0}}(\phi),
\]

and so Theorem 2.8 shows that $v_{j}=\partial_{j}v_{0}$ almost everywhere. This, together with the previous statement that $D^{\alpha}(\partial_{j}u_{m})\rightarrow D^{\alpha}v_{j}$ in $L^{p}$ for all multi-indices $\alpha$ such that $|\alpha|\leq k-1$, shows that $D^{\alpha}u_{m}\rightarrow D^{\alpha}v_{0}$ in $L^{p}$ for all $\alpha$ such that $|\alpha|\leq k$.

Therefore, we have shown that there exists $v_{0}\in W^{k,p}(\Omega)$ such that $u_{m}\stackrel{W^{k,p}}{\longrightarrow}v_{0}$, and so $W^{k,p}(\Omega)$ is complete. ∎

In the case p=2p=2, the previous theorem, together with the following inner product, gives Wk,2(Ω)W^{{k,2}}(\Omega) the structure of a Hilbert space.

Definition 3.3.

The inner product on $W^{k,2}(\Omega)$ is defined to be

\[
\left<f,g\right>_{W^{k,2}(\Omega)}:=\sum_{0\leq|\alpha|\leq k}\int_{\Omega}D^{\alpha}f\,\overline{D^{\alpha}g}\,dx.\tag{3.1}
\]

Remark 3.4.

The Sobolev norm on $W^{k,2}(\Omega)$ is the same as the norm induced by the inner product:

\[
\|f\|_{W^{k,2}(\Omega)}=\left(\left<f,f\right>_{W^{k,2}(\Omega)}\right)^{\frac{1}{2}}.
\]

Theorem 3.5.

$\left(W^{k,2}(\Omega),\left<\cdot,\cdot\right>_{W^{k,2}(\Omega)}\right)$ is a Hilbert space.

Remark 3.6.
  1. In view of Theorem 3.2, the proof of Theorem 3.5 only requires checking that the axioms for an inner product are satisfied.

  2. In order to emphasise the Hilbert space structure, the space $W^{k,2}(\Omega)$ is often denoted $H^{k}(\Omega)$.
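As a concrete illustration of Definition 3.3 and Remark 3.4, the following worked computation evaluates the $W^{1,2}$ norm of a specific function. The choices $\Omega=(0,1)$ and $f(x)=\sin(\pi x)$ are assumptions made only for this example and do not come from the text.

```latex
% Example choice (not from the text): Omega = (0,1), f(x) = sin(pi x), k = 1.
\begin{align*}
\left<f,f\right>_{W^{1,2}(0,1)}
  &=\int_0^1 \sin^2(\pi x)\,dx+\int_0^1 \pi^2\cos^2(\pi x)\,dx
   =\frac{1}{2}+\frac{\pi^2}{2},\\
\|f\|_{W^{1,2}(0,1)}&=\sqrt{\frac{1+\pi^2}{2}}.
\end{align*}
```

The first integral is the squared $L^{2}$ norm of $f$ and the second is the squared $L^{2}$ norm of its derivative, so the $W^{1,2}$ norm is strictly larger than the $L^{2}$ norm whenever $f$ is non-constant.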

3.2 Sobolev spaces are the completion of the space of smooth functions in the Sobolev norm (the Meyers-Serrin theorem)

In this section we prove the Meyers-Serrin theorem, which says that the Sobolev spaces defined in Section 2 are the completion of the space of smooth functions in the Sobolev norm. Therefore we now have two equivalent definitions of Sobolev spaces, which gives us a broader range of techniques to draw upon when proving theorems.

First recall the following well-known theorem that says that a normed linear space has a unique completion (see for example [11, Theorem I.3]).

Theorem 3.7.

If (V,V)(V,\|\cdot\| _{V}) is a normed linear space, then there exists a unique complete normed linear space (V~,V~)(\tilde{V},\|\cdot\| _{{\tilde{V}}}) such that VV is isometric to a dense subset of V~\tilde{V}.

Let $\mathcal{C}^{k}(\Omega)$ be the space of $k$-times continuously differentiable functions $f:\Omega\rightarrow\mathbb{C}$. Since the weak derivative of a differentiable function is just the classical derivative (Lemma 2.13), the weak derivatives of any $\phi\in\mathcal{C}^{k}(\Omega)$ exist up to order $k$, and we can define the subspace

\[
S^{k,p}(\Omega)=\{\phi\in\mathcal{C}^{k}(\Omega)\,:\,\|\phi\|_{W^{k,p}(\Omega)}<\infty\}\subseteq W^{k,p}(\Omega).
\]

Let $B^{k,p}(\Omega)$ denote the completion of $S^{k,p}(\Omega)$ in the $W^{k,p}(\Omega)$-norm. Since $W^{k,p}(\Omega)$ is complete by Theorem 3.2, and $S^{k,p}(\Omega)\subseteq W^{k,p}(\Omega)$, the completion can be identified with the closure of $S^{k,p}(\Omega)$ in $W^{k,p}(\Omega)$, and so we have proved

Lemma 3.8.

For $1\leq p\leq\infty$ we have

\[
B^{k,p}(\Omega)\subseteq W^{k,p}(\Omega).
\]

It turns out that the reverse inclusion also holds for $1\leq p<\infty$. This is known as the Meyers-Serrin theorem, and the proof will occupy the rest of this section.

Example 3.9.

To see that the reverse inclusion to the previous lemma can never hold for $p=\infty$, in this example we show that $B^{k,\infty}(\Omega)\neq W^{k,\infty}(\Omega)$. Consider first the case $k=0$ and $\Omega=\mathbb{R}$, where the step function

\[
f(x)=\begin{cases}-1&\text{if $x<0$}\\ 1&\text{if $x\geq 0$}\end{cases}
\]

is not in the completion of $S^{0,\infty}(\mathbb{R})$, since for any continuous function $g\in S^{0,\infty}(\mathbb{R})$ we have $\|f-g\|_{L^{\infty}(\mathbb{R})}\geq 1$. To extend this example to $k>0$, simply consider the function

\[
f(x)=\begin{cases}-x^{k}&\text{if $x<0$}\\ x^{k}&\text{if $x\geq 0$,}\end{cases}
\]

restricted to a bounded interval such as $(-1,1)$ so that $f$ and its derivatives are bounded, and note that $\frac{d^{k}f}{dx^{k}}$ is a constant multiple of a step function. It is easy then to extend this idea to the case where the domain is an open subset of $\mathbb{R}^{n}$.
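The inequality $\|f-g\|_{L^{\infty}(\mathbb{R})}\geq 1$ used in the case $k=0$ can be checked directly. The following sketch writes $c=g(0)$ (a symbol introduced only here) and uses the continuity of $g$ at $0$.

```latex
% c := g(0) is notation introduced for this sketch; f is the step function (k = 0).
\begin{align*}
\lim_{x\to 0^-}|f(x)-g(x)|&=|-1-c|=|1+c|,\qquad
\lim_{x\to 0^+}|f(x)-g(x)|=|1-c|,\\
\max\{|1+c|,|1-c|\}&\geq\tfrac{1}{2}\left(|1+c|+|1-c|\right)
  \geq\tfrac{1}{2}\,\bigl|(1+c)+(1-c)\bigr|=1.
\end{align*}
```

Since $|f-g|$ is continuous on each of $(-\infty,0)$ and $(0,\infty)$, for every $\delta>0$ it exceeds $1-\delta$ on an interval adjacent to $0$, which has positive measure. Hence the essential supremum of $|f-g|$ is at least $1$.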

Next, we recall some basic facts needed in the proof of Theorem 3.15. The first is the existence of partitions of unity.

Theorem 3.10.

Let $A$ be an arbitrary subset of $\mathbb{R}^{n}$, and let $\mathcal{O}=\{U_{\alpha}\}_{\alpha\in I}$ be a collection of open sets in $\mathbb{R}^{n}$ that cover $A$. Then there exists a collection $\Psi=\{\psi_{\beta}\}_{\beta\in J}\subset C_{0}^{\infty}(\mathbb{R}^{n})$ such that

  1. For every $\beta\in J$ and every $x\in\mathbb{R}^{n}$, we have $0\leq\psi_{\beta}(x)\leq 1$.

  2. If $K\subset\subset A$ then all but at most finitely many $\psi_{\beta}\in\Psi$ vanish identically on $K$.

  3. For every $\beta\in J$ there exists $\alpha\in I$ such that $\operatorname{supp}\psi_{\beta}\subset U_{\alpha}$.

  4. For every $x\in A$ we have $\displaystyle{\sum_{\beta\in J}\psi_{\beta}(x)=1}$ (note that the sum makes sense because of the local finiteness condition (2)).

The collection Ψ\Psi is called a partition of unity of AA subordinate to O\mathcal{O}.


Proof. The case where $A$ is compact is given in [12, Theorem 2.13]. If $A$ is open, then for each $j\in\mathbb{N}$ define

\[
A_{j}:=\left\{x\in A\,:\,|x|\leq j\;\text{and}\;\operatorname{dist}(x,\partial A)\geq\frac{1}{j}\right\},
\]

and note that $A_{j}$ is compact and satisfies $A_{j}\subset\operatorname{int}A_{j+1}$ for each $j\in\mathbb{N}$. Moreover, we can also write $A$ as the union of compact sets

\[
A=\bigcup_{j\in\mathbb{N}}A_{j}=\bigcup_{j\in\mathbb{N}}(A_{j}\setminus\operatorname{int}A_{j-1}).
\]

Also, for notational convenience in what follows, define $A_{0}=A_{-1}=\emptyset$.

Given an open cover $\mathcal{O}=\{U_{\alpha}\}_{\alpha\in I}$ of $A$, for each $j\in\mathbb{N}$ we can define an open cover of the compact set $A_{j}\setminus\operatorname{int}A_{j-1}$ by

\[
\mathcal{O}_{j}:=\left\{U_{\alpha}\cap\left(\operatorname{int}(A_{j+1})\setminus A_{j-2}\right)\,:\,\alpha\in I\right\}.
\]

By the result for compact sets, for each $j\in\mathbb{N}$ there exists a partition of unity $\Psi_{j}=\{\psi_{j,n}\}_{n=1}^{N_{j}}$ for the compact set $A_{j}\setminus\operatorname{int}A_{j-1}$ that is subordinate to $\mathcal{O}_{j}$, and has finitely many elements. Moreover, since $U_{\alpha}\cap\left(\operatorname{int}(A_{j+1})\setminus A_{j-2}\right)\subseteq A$ for each $\alpha\in I$ and $j\in\mathbb{N}$, we have $\operatorname{supp}\psi_{j,n}\subseteq A$ for each $\psi_{j,n}\in\Psi_{j}$. Each $x\in A$ lies in $\operatorname{int}(A_{j+1})\setminus A_{j-2}$ (which contains the supports of the functions in $\Psi_{j}$) for at most finitely many $j\in\mathbb{N}$, and therefore the sum

\[
\sigma(x)=\sum_{j\in\mathbb{N}}\sum_{\psi\in\Psi_{j}}\psi(x)
\]

has at most finitely many non-zero terms for each $x$, and also satisfies $\sigma(x)\geq 1$ for each $x\in A$. Now define the collection of functions

\[
\Psi:=\left\{f_{j,n}\,:\,j\in\mathbb{N},\;1\leq n\leq N_{j}\right\},\qquad f_{j,n}(x)=\begin{cases}\dfrac{\psi_{j,n}(x)}{\sigma(x)}&x\in A\\ 0&x\notin A.\end{cases}
\]

This is now a partition of unity of $A$ subordinate to $\mathcal{O}$.

In the case where AA is an arbitrary subset of n\mathbb{R}^{n} with an open cover O={Uα}αI\mathcal{O}=\{ U_{\alpha}\} _{{\alpha\in I}}, define the open set B=αIUα\displaystyle{B=\bigcup _{{\alpha\in I}}U_{\alpha}}, note that O\mathcal{O} is an open cover of BB, and apply the previous result to find a partition of unity Ψ\Psi of BB subordinate to O\mathcal{O}. Since ABA\subset B then Ψ\Psi is also a partition of unity of AA subordinate to O\mathcal{O}. ∎
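To make the exhaustion by compact sets $A_{j}$ used in the proof concrete, consider the following example. The choice $A=(0,1)\subset\mathbb{R}$ is an assumption made only for illustration; in this case the condition $|x|\leq j$ holds automatically.

```latex
% Example choice (not from the text): A = (0,1), so dist(x, \partial A) = min{x, 1-x}.
\begin{align*}
A_1&=\emptyset,\qquad A_2=\left\{\tfrac{1}{2}\right\},\qquad
A_j=\left[\tfrac{1}{j},\,1-\tfrac{1}{j}\right]\quad (j\geq 2),\\
A_j&\subset\left(\tfrac{1}{j+1},\,1-\tfrac{1}{j+1}\right)=\operatorname{int}A_{j+1},\qquad
\bigcup_{j\in\mathbb{N}}A_j=(0,1).
\end{align*}
```

Each $A_{j}$ is compact, the sets increase with $j$, and every point of $(0,1)$ eventually lies in some $A_{j}$, which is exactly what the construction of the covers $\mathcal{O}_{j}$ requires.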

The second basic fact needed is the convergence of sequences of mollified functions. Let $J$ be a non-negative real-valued function in $C_{0}^{\infty}(\mathbb{R}^{n})$ such that

  1. $J(x)=0$ if $|x|\geq 1$.

  2. $\displaystyle{\int_{\mathbb{R}^{n}}J(x)\,dx=1}$.

For example we can choose

\[
J(x)=\begin{cases}k\exp\left(-\frac{1}{1-|x|^{2}}\right)&\text{if $|x|<1$}\\ 0&\text{if $|x|\geq 1$,}\end{cases}
\]

where $k$ is chosen so that $\displaystyle{\int_{\mathbb{R}^{n}}J(x)\,dx=1}$. The function $J$ is called a mollifier. For any $\varepsilon>0$, let $J_{\varepsilon}(x)=\frac{1}{\varepsilon^{n}}J\left(\frac{x}{\varepsilon}\right)$, and define the mollification of $u\in L^{p}(\Omega)$ to be the convolution

\[
(J_{\varepsilon}*u)(x)=\int_{\mathbb{R}^{n}}J_{\varepsilon}(x-y)u(y)\,dy,
\]

where $u$ is extended by zero outside $\Omega$.
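Two properties of the rescaled function $J_{\varepsilon}$ are used repeatedly in what follows, and both come from the substitution $y=x/\varepsilon$. The computation below is a routine check added for convenience rather than part of the original argument.

```latex
% Substitution y = x / epsilon, so dx = epsilon^n dy.
\begin{align*}
\int_{\mathbb{R}^n}J_\varepsilon(x)\,dx
  &=\int_{\mathbb{R}^n}\frac{1}{\varepsilon^n}J\!\left(\frac{x}{\varepsilon}\right)dx
   =\int_{\mathbb{R}^n}J(y)\,dy=1,\\
J_\varepsilon(x)&=0\quad\text{if $|x|\geq\varepsilon$, since then $\left|\tfrac{x}{\varepsilon}\right|\geq 1$.}
\end{align*}
```

In particular, $(J_{\varepsilon}*u)(x)$ depends only on the values of $u$ in the ball of radius $\varepsilon$ about $x$.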
Lemma 3.11.

If $u\in W^{k,p}(\Omega)$ then $J_{\varepsilon}*u$ is smooth for all $\varepsilon>0$.

Proof. Since $J_{\varepsilon}$ is smooth for all $\varepsilon>0$, this follows from [15, Theorem 9.3]. ∎

Theorem 3.12.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $\Omega'\subset\Omega$ be an open subset whose closure is a compact subset of $\Omega$. If $1\leq p<\infty$ and $u\in W^{k,p}(\Omega)$, then

\[
\lim_{\varepsilon\rightarrow 0^{+}}J_{\varepsilon}*u=u
\]

in $W^{k,p}(\Omega')$.


Proof. When $k=0$ this is a standard result for $L^{p}$ spaces (see for example [15, Theorem 9.6] for a proof). The general case follows by reducing to the $k=0$ case.

First we show that for any $\varepsilon<\operatorname{dist}(\Omega',\partial\Omega)$ we have $D^{\alpha}(J_{\varepsilon}*u)=J_{\varepsilon}*D^{\alpha}u$ in the distributional sense on $\Omega'$. To see this, let $\tilde{u}$ denote the zero extension of $u$ from $\Omega$ to all of $\mathbb{R}^{n}$, and note that for any test function $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega')$ we have

\begin{align*}
\int_{\Omega'}J_{\varepsilon}*u(x)\,D^{\alpha}\phi(x)\,dx&=\int_{\mathbb{R}^{n}}\int_{\mathbb{R}^{n}}\tilde{u}(x-y)J_{\varepsilon}(y)D^{\alpha}\phi(x)\,dx\,dy\\
&=(-1)^{|\alpha|}\int_{\mathbb{R}^{n}}\int_{\Omega'}D^{\alpha}u(x-y)J_{\varepsilon}(y)\phi(x)\,dx\,dy\\
&=(-1)^{|\alpha|}\int_{\Omega'}J_{\varepsilon}*D^{\alpha}u(x)\,\phi(x)\,dx.
\end{align*}

(All of the derivatives above are taken with respect to the variable $x$.)

Since $D^{\alpha}u\in L^{p}(\Omega)$ for each $0\leq|\alpha|\leq k$, the result for $L^{p}$ spaces shows that

\[
\lim_{\varepsilon\rightarrow 0^{+}}\left\|D^{\alpha}(J_{\varepsilon}*u)-D^{\alpha}u\right\|_{L^{p}(\Omega')}=\lim_{\varepsilon\rightarrow 0^{+}}\left\|J_{\varepsilon}*D^{\alpha}u-D^{\alpha}u\right\|_{L^{p}(\Omega')}=0.
\]

This is true for all $\alpha$ such that $0\leq|\alpha|\leq k$, and so $J_{\varepsilon}*u$ converges to $u$ in the $W^{k,p}(\Omega')$ norm. ∎

Next, we introduce the notion of a nested open cover, which will be used in the sequel.

Definition 3.13.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$. A nested open cover of $\Omega$ is a collection of open sets $\{\Omega_{j}\}_{j\in\mathbb{N}}$ such that

  1. $\Omega_{j}\subseteq\Omega_{j+1}\subseteq\Omega$ for all $j\in\mathbb{N}$.

  2. For all $x\in\Omega$ there exists $j\in\mathbb{N}$ such that $x\in\Omega_{j}$.

Lemma 3.14.

Let $\Omega$ be an open set in $\mathbb{R}^{n}$, and let $\{\Omega_{j}\}_{j\in\mathbb{N}}$ be a nested open cover of $\Omega$. If $f\in W^{k,p}(\Omega)$ satisfies $\|f\|_{W^{k,p}(\Omega_{j})}\leq C$ for all $j\in\mathbb{N}$, then $\|f\|_{W^{k,p}(\Omega)}\leq C$.


Proof. The inclusion $\Omega_{j}\hookrightarrow\Omega$ induces an inclusion $\mathcal{D}(\Omega_{j})\hookrightarrow\mathcal{D}(\Omega)$ (extend each test function by zero). Therefore the weak derivative of $f$ on $\Omega_{j}$ is just the restriction of the weak derivative of $f$ on $\Omega$, since for all test functions $\phi\in\mathcal{D}(\Omega_{j})$ we have

\[
\int_{\Omega_{j}}fD^{\alpha}\phi\,dx=\int_{\Omega}fD^{\alpha}\phi\,dx=(-1)^{|\alpha|}\int_{\Omega}D^{\alpha}f\,\phi\,dx=(-1)^{|\alpha|}\int_{\Omega_{j}}D^{\alpha}f\,\phi\,dx.
\]

The dominated convergence theorem shows that $\lim_{j\rightarrow\infty}\|D^{\alpha}f\|_{L^{p}(\Omega_{j})}=\|D^{\alpha}f\|_{L^{p}(\Omega)}$ for each $\alpha$, and as a consequence we have

\[
\|D^{\alpha}f\|_{L^{p}(\Omega)}\leq\sup_{j\in\mathbb{N}}\|D^{\alpha}f\|_{L^{p}(\Omega_{j})}\quad\text{for each $\alpha$}.
\]

Since each term of the Sobolev norm converges, we have $\|f\|_{W^{k,p}(\Omega)}=\lim_{j\rightarrow\infty}\|f\|_{W^{k,p}(\Omega_{j})}\leq C$. ∎

Now we are ready to prove that the space of smooth functions is dense in Wk,pW^{{k,p}}.

Theorem 3.15 (Meyers-Serrin).

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $1\leq p<\infty$. Then for any $u\in W^{k,p}(\Omega)$, and for every $\varepsilon>0$, there exists $\phi\in\mathcal{C}^{\infty}(\Omega)$ such that $\|u-\phi\|_{W^{k,p}(\Omega)}<\varepsilon$.


Proof. Fix $\varepsilon>0$. For each $j\in\mathbb{N}$, define the open sets

\begin{align*}
\Omega_{j}&:=\left\{x\in\Omega\,:\,|x|<j\;\text{and}\;\operatorname{dist}(x,\partial\Omega)>\frac{1}{j}\right\}\\
U_{j}&:=\Omega_{j+1}\cap\left(\Omega\setminus\bar{\Omega}_{j-1}\right),
\end{align*}

with the convention that $\Omega_{0}=\Omega_{-1}=\emptyset$.

Then $\{\Omega_{j}\}_{j\in\mathbb{N}}$ is a nested open cover of $\Omega$, and, in particular, we can apply Lemma 3.14 (we will use this at the end of the proof). Moreover, each $\Omega_{j}$ has compact closure in $\Omega$, and so Theorem 3.12 applies. Define $\mathcal{O}=\{U_{j}\}_{j\in\mathbb{N}}$, and note that $\mathcal{O}$ is also an open cover of $\Omega$ (although it is not nested).

Let $\Psi=\{\psi_{j}\}_{j\in\mathbb{N}}$ be a partition of unity for $\Omega$ subordinate to $\mathcal{O}$, and note that the local finiteness property of partitions of unity shows that $\psi_{j}\in\mathcal{C}^{\infty}(U_{j})$ for all $j$, and we also have

\[
\sum_{j=1}^{\infty}\psi_{j}(x)=1
\]

for all $x\in\Omega$.

From the definition of $U_{j}$, if $0<\varepsilon_{j}<\frac{1}{(j+1)(j+2)}=\frac{1}{j+1}-\frac{1}{j+2}$ then $J_{\varepsilon_{j}}*(\psi_{j}u)$ has support in the set

\[
V_{j}:=\Omega_{j+2}\cap\left(\Omega\setminus\bar{\Omega}_{j-2}\right)\subset\subset\Omega.
\]

Since $\psi_{j}u\in W^{k,p}(\Omega)$, by Theorem 3.12 we can find $\varepsilon_{j}$ such that $0<\varepsilon_{j}<\frac{1}{(j+1)(j+2)}$ and

\[
\|J_{\varepsilon_{j}}*(\psi_{j}u)-\psi_{j}u\|_{W^{k,p}(\Omega_{j+2})}<\frac{\varepsilon}{2^{j+1}}.
\]

Define

\[
\phi=\sum_{j=1}^{\infty}J_{\varepsilon_{j}}*(\psi_{j}u).
\]

On any compact subset $K\subset\subset\Omega$, all but finitely many terms in the sum vanish, and so $\phi\in\mathcal{C}^{\infty}(\Omega)$. Now note that if $x\in\Omega_{\ell}$, then

\[
u(x)=\sum_{j=1}^{\ell+2}\psi_{j}(x)u(x),\quad\phi(x)=\sum_{j=1}^{\ell+2}J_{\varepsilon_{j}}*(\psi_{j}u)(x),
\]

and so for each $\ell\in\mathbb{N}$

\[
\|u-\phi\|_{W^{k,p}(\Omega_{\ell})}\leq\sum_{j=1}^{\ell+2}\|J_{\varepsilon_{j}}*(\psi_{j}u)-\psi_{j}u\|_{W^{k,p}(\Omega_{j+2})}<\frac{1}{2}\varepsilon.
\]

An application of Lemma 3.14 then shows that $\|u-\phi\|_{W^{k,p}(\Omega)}\leq\frac{1}{2}\varepsilon<\varepsilon$, as required. ∎
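The final estimate in the proof rests on a finite geometric sum, which can be written out as follows. This is a routine check added for convenience.

```latex
% Each term of the sum is bounded by epsilon / 2^{j+1}, as chosen via Theorem 3.12.
\begin{align*}
\sum_{j=1}^{\ell+2}\frac{\varepsilon}{2^{j+1}}
  =\frac{\varepsilon}{2}\sum_{j=1}^{\ell+2}\frac{1}{2^{j}}
  =\frac{\varepsilon}{2}\left(1-\frac{1}{2^{\ell+2}}\right)
  <\frac{\varepsilon}{2}.
\end{align*}
```

The bound does not depend on $\ell$, which is what allows Lemma 3.14 to be applied with the constant $C=\frac{1}{2}\varepsilon$.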

This theorem shows that $W^{k,p}(\Omega)\subseteq B^{k,p}(\Omega)$ for $1\leq p<\infty$. Combining this with Lemma 3.8 gives us the following corollary, which states that $W^{k,p}(\Omega)$ is the completion of the space of $\mathcal{C}^{k}(\Omega)$ functions of finite Sobolev norm in the Sobolev norm $\|\cdot\|_{W^{k,p}(\Omega)}$.

Corollary 3.16.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$, and let $1\leq p<\infty$. Then

\[
B^{k,p}(\Omega)=W^{k,p}(\Omega)
\]

for any $k\geq 0$.

Remark 3.17.

The statement of the corollary above is that Wk,p(Ω)W^{{k,p}}(\Omega) is the completion of the space of 𝒞k\mathcal{C}^{k} functions with respect to the Wk,pW^{{k,p}}-norm. Since 𝒞(Ω)𝒞k(Ω)\mathcal{C}^{\infty}(\Omega)\subset\mathcal{C}^{k}(\Omega) and Theorem 3.15 is stated for smooth functions, then we also have that Wk,p(Ω)W^{{k,p}}(\Omega) is the completion of the space of smooth functions in the Sobolev norm.

3.3 The dual space of a Sobolev space

In this section Ω\Omega denotes an open subset of n\mathbb{R}^{n}, 1p<1\leq p<\infty, and qq denotes the conjugate exponent to pp, i.e. q=pp-1q=\frac{p}{p-1} if 1<p<1<p<\infty and q=q=\infty if p=1p=1.

First recall Theorem A.4, which says that the dual of $L^{p}(\Omega)$ is isomorphic to $L^{q}(\Omega)$ if $1\leq p<\infty$. The proof of this theorem involves showing that for each bounded linear functional $\Lambda:L^{p}(\Omega)\rightarrow\mathbb{R}$ there exists a function $v\in L^{q}(\Omega)$ (unique up to equivalence in $L^{q}(\Omega)$) such that

\[
\Lambda(u)=\int_{\Omega}uv\,dx
\]

for all $u\in L^{p}(\Omega)$. Moreover, as part of the construction, the proof also shows that $\|v\|_{L^{q}(\Omega)}=\|\Lambda\|_{L^{p}(\Omega)^{*}}$. The converse is also true, so we have an isometric isomorphism $L^{p}(\Omega)^{*}\cong L^{q}(\Omega)$.

The goal of this section is to provide a description of the dual space to the Sobolev space $W^{k,p}(\Omega)$. It is important to point out that most of the hard work is done in proving the previous theorem for $L^{p}$ spaces, and that the proofs given below rely heavily on this construction. More details can be found in [1, Chapter 3].

Let ,:Lp(Ω)×Lq(Ω)\left<\cdot,\cdot\right>:L^{p}(\Omega)\times L^{q}(\Omega)\rightarrow\mathbb{R} denote the dual pairing

u,v:=Ωuvdx\left<u,v\right>:=\int _{\Omega}uv\, dx

for $u\in L^{p}(\Omega)$ and $v\in L^{q}(\Omega)$, and, for $N\in\mathbb{N}$, let $L^{q}(\Omega)^{N}:=L^{q}(\Omega)\times\cdots\times L^{q}(\Omega)$ denote the product of $N$ copies of $L^{q}(\Omega)$.

There is a map $F:L^{q}(\Omega)^{N}\rightarrow W^{k,p}(\Omega)^{*}$ that takes a vector of functions $(v_{\alpha})$ to the linear functional $\displaystyle{\Lambda(u)=\sum_{0\leq|\alpha|\leq k}\left<D^{\alpha}u,v_{\alpha}\right>}$. The first theorem below shows that this map is surjective, and therefore we can characterise elements of the dual $W^{k,p}(\Omega)^{*}$ in terms of elements of $L^{q}(\Omega)^{N}$.

Theorem 3.18.

Given $k\geq 0$, let $N$ be the number of multi-indices $\alpha$ such that $0\leq|\alpha|\leq k$. For every functional $\Lambda\in W^{k,p}(\Omega)^{*}$ there exists $(v_{\alpha})_{0\leq|\alpha|\leq k}\in L^{q}(\Omega)^{N}$ such that for all $u\in W^{k,p}(\Omega)$ we have

\[
\Lambda(u)=\sum_{0\leq|\alpha|\leq k}\left<D^{\alpha}u,v_{\alpha}\right>.
\]

Moreover, if we define $V$ to be the set of all $(v_{\alpha})_{0\leq|\alpha|\leq k}\in L^{q}(\Omega)^{N}$ satisfying the previous equation, then

\[
\left\|\Lambda\right\|_{W^{k,p}(\Omega)^{*}}=\inf_{(v_{\alpha})\in V}\left\|(v_{\alpha})\right\|_{L^{q}(\Omega)^{N}},\tag{3.2}
\]

and this infimum is attained by some $(v_{\alpha})\in L^{q}(\Omega)^{N}$.
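The number $N$ in the theorem can be computed explicitly. The count below is the standard stars-and-bars identity, stated here as a supplement rather than taken from the text; the two numerical cases are chosen only for illustration.

```latex
% N = number of multi-indices alpha in Z_{>=0}^n with |alpha| <= k.
\begin{align*}
N=\#\{\alpha\in\mathbb{Z}_{\geq 0}^{n}\,:\,|\alpha|\leq k\}=\binom{n+k}{k},
\qquad\text{e.g.}\quad n=2,\ k=1:\ N=3,\qquad n=3,\ k=2:\ N=10.
\end{align*}
```

For $n=2$ and $k=1$ the three multi-indices are $(0,0)$, $(1,0)$ and $(0,1)$, so a functional on $W^{1,p}(\Omega)$ is represented by a triple of functions in $L^{q}(\Omega)$.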


Proof. First note that, by the definition of $W^{k,p}(\Omega)$, there exists a linear map

\begin{align*}
P:W^{k,p}(\Omega)&\rightarrow L^{p}(\Omega)^{N}\\
u&\mapsto(D^{\alpha}u)_{0\leq|\alpha|\leq k}.
\end{align*}

By the definition of the norms on $W^{k,p}(\Omega)$ and $L^{p}(\Omega)^{N}$, the map $P$ is an isometry, and therefore $P$ is an isometric isomorphism onto its image.

Given $\Lambda\in W^{k,p}(\Omega)^{*}$ define $\Lambda^{*}\in P\left(W^{k,p}(\Omega)\right)^{*}$, a linear functional on the image of $P$, by

\[
\Lambda^{*}(Pu)=\Lambda(u)\quad\text{for all $u\in W^{k,p}(\Omega)$}.
\]

Since $P$ is an isometric isomorphism, we have

\[
\left\|\Lambda^{*}\right\|_{P\left(W^{k,p}(\Omega)\right)^{*}}=\left\|\Lambda\right\|_{W^{k,p}(\Omega)^{*}}.
\]

The Hahn-Banach theorem (see for example [11, p76]) shows that there is a norm-preserving extension $\tilde{\Lambda}$ of $\Lambda^{*}$ to all of $L^{p}(\Omega)^{N}$, and, together with the characterisation of the dual of $L^{p}(\Omega)$, this shows that there exists $(v_{\alpha})\in L^{q}(\Omega)^{N}$ such that

\[
\tilde{\Lambda}(w)=\sum_{0\leq|\alpha|\leq k}\left<w_{\alpha},v_{\alpha}\right>
\]

for any $w=(w_{\alpha})\in L^{p}(\Omega)^{N}$. Moreover, we also have

\[
\|\tilde{\Lambda}\|_{\left(L^{p}(\Omega)^{N}\right)^{*}}=\left(\sum_{0\leq|\alpha|\leq k}\|v_{\alpha}\|_{L^{q}(\Omega)}^{q}\right)^{\frac{1}{q}}=\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}}
\]

(with the sum replaced by a maximum when $q=\infty$).

Therefore, we have shown that for any $\Lambda\in W^{k,p}(\Omega)^{*}$ there exists $v=(v_{\alpha})_{0\leq|\alpha|\leq k}\in L^{q}(\Omega)^{N}$ such that for all $u\in W^{k,p}(\Omega)$ we have

\[
\Lambda(u)=\Lambda^{*}(Pu)=\tilde{\Lambda}(Pu)=\sum_{0\leq|\alpha|\leq k}\left<D^{\alpha}u,v_{\alpha}\right>.
\]

Moreover, at each stage of the construction, we also showed that

\[
\|\Lambda\|_{W^{k,p}(\Omega)^{*}}=\|\Lambda^{*}\|_{P\left(W^{k,p}(\Omega)\right)^{*}}=\|\tilde{\Lambda}\|_{\left(L^{p}(\Omega)^{N}\right)^{*}}=\left\|(v_{\alpha})\right\|_{L^{q}(\Omega)^{N}}.
\]

Unfortunately this map F:Lq(Ω)NWk,p(Ω)*F:L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*} is not an isomorphism, since it may have a non-trivial kernel, as the next example shows.

Example 3.19.

Let $\Omega$ be an open subset of $\mathbb{R}$, and let $\varphi$ be a smooth function on $\Omega$ with compact support. Then for every $u\in W^{1,p}(\Omega)$ we have

\[
\int_{\Omega}\partial_{x}u\,\varphi\,dx=-\int_{\Omega}u\,\partial_{x}\varphi\,dx\tag{3.3}
\]

by the definition of weak derivative. Now consider the vector $(\partial_{x}\varphi,\varphi)\in L^{q}(\Omega)^{2}$. The linear functional $\Lambda\in W^{1,p}(\Omega)^{*}$ associated to this vector is

\[
\Lambda(u)=\left<u,\partial_{x}\varphi\right>+\left<\partial_{x}u,\varphi\right>,
\]

which is zero by (3.3). Therefore, for every non-zero smooth function $\varphi$ with compact support contained in $\Omega$, the vector $(\partial_{x}\varphi,\varphi)\in L^{q}(\Omega)^{2}$ is a non-trivial element of the kernel of the map $F:L^{q}(\Omega)^{2}\rightarrow W^{1,p}(\Omega)^{*}$.

Remark 3.20.

More generally, if the functional $\Lambda$ is represented by a vector of smooth compactly supported functions, i.e. $(v_{\alpha})\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)^{N}\subset L^{q}(\Omega)^{N}$, then we can write

\[
\Lambda(u)=\sum_{0\leq|\alpha|\leq k}\left<D^{\alpha}u,v_{\alpha}\right>=\sum_{0\leq|\alpha|\leq k}\left<u,(-1)^{|\alpha|}D^{\alpha}v_{\alpha}\right>.
\]

Therefore $\Lambda(u)=\left<u,f\right>$, where $f=\sum_{0\leq|\alpha|\leq k}(-1)^{|\alpha|}D^{\alpha}v_{\alpha}$. In particular, we see that $\Lambda$ is the zero functional if $f\equiv 0$.
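In the simplest case the formula of Remark 3.20 can be written out in full. The sketch below takes $n=1$ and $k=1$ (so $N=2$) and writes $(v_{0},v_{1})$ for the vector of compactly supported smooth functions; this notation is introduced only here. It recovers Example 3.19 as a special case.

```latex
% n = 1, k = 1, with v_0, v_1 smooth and compactly supported in Omega.
\begin{align*}
\Lambda(u)&=\left<u,v_0\right>+\left<\partial_x u,v_1\right>
           =\left<u,\,v_0-\partial_x v_1\right>,\\
f&=v_0-\partial_x v_1,\qquad
(v_0,v_1)=(\partial_x\varphi,\varphi)\ \Longrightarrow\ f=\partial_x\varphi-\partial_x\varphi=0.
\end{align*}
```

So any pair of the form $(\partial_{x}\varphi,\varphi)$ represents the zero functional, which is exactly the kernel element found in Example 3.19.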

The next lemma shows that each element of the dual of a Sobolev space can be regarded as an extension of some distribution.

Lemma 3.21.

Let ΛWk,p(Ω)*\Lambda\in W^{{k,p}}(\Omega)^{*}. Then there exists T𝒟(Ω)*T\in\mathcal{D}(\Omega)^{*} such that Λ(ϕ)=T(ϕ)\Lambda(\phi)=T(\phi) for all ϕ𝒟(Ω)\phi\in\mathcal{D}(\Omega).


Proof. Using Theorem 3.18, there exists $v=(v_{\alpha})_{0\leq|\alpha|\leq k}\in L^{q}(\Omega)^{N}$ such that

\[
\Lambda(u)=\sum_{0\leq|\alpha|\leq k}\left<D^{\alpha}u,v_{\alpha}\right>
\]

for every $u\in W^{k,p}(\Omega)$. Note that if $\phi\in\mathcal{D}(\Omega)$, then

\begin{align*}
\Lambda(\phi)=\sum_{0\leq|\alpha|\leq k}\left<D^{\alpha}\phi,v_{\alpha}\right>&=\sum_{0\leq|\alpha|\leq k}\int_{\Omega}D^{\alpha}\phi\,v_{\alpha}\,dx\\
&=\sum_{0\leq|\alpha|\leq k}T_{v_{\alpha}}(D^{\alpha}\phi)=\sum_{0\leq|\alpha|\leq k}(-1)^{|\alpha|}D^{\alpha}T_{v_{\alpha}}(\phi),
\end{align*}

where, in the last term, $D^{\alpha}T_{v_{\alpha}}$ refers to the distributional derivative of $T_{v_{\alpha}}$. Define

\[
T=\sum_{0\leq|\alpha|\leq k}(-1)^{|\alpha|}D^{\alpha}T_{v_{\alpha}}\in\mathcal{D}(\Omega)^{*}.
\]

Then we have shown that $T(\phi)=\Lambda(\phi)$ for all $\phi\in\mathcal{D}(\Omega)$. ∎

The previous theorems give different characterisations of elements of the dual of Wk,p(Ω)W^{{k,p}}(\Omega): Theorem 3.18 shows that there is a surjective map F:Lq(Ω)NWk,p(Ω)*F:L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*}, while Lemma 3.21 shows that the restriction of each linear functional to 𝒟(Ω)\mathcal{D}(\Omega) is a distribution. Therefore we have maps Lq(Ω)NWk,p(Ω)*L^{q}(\Omega)^{N}\rightarrow W^{{k,p}}(\Omega)^{*} and Wk,p(Ω)*𝒟(Ω)*W^{{k,p}}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}.

Unfortunately, these results do not give a nice description of the kernel of the first map and the image of the second map. In addition, the second map may have a non-trivial kernel (see Remark 3.24). It turns out that W0k,p(Ω)W_{0}^{{k,p}}(\Omega) has better properties with respect to the second map, and the next theorem describes the image of the subspace W0k,p(Ω)*Wk,p(Ω)*𝒟(Ω)*W_{0}^{{k,p}}(\Omega)^{*}\hookrightarrow W^{{k,p}}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}.

Theorem 3.22.

The dual space $W_{0}^{k,p}(\Omega)^{*}$ is isometrically isomorphic to the Banach space consisting of those distributions $T\in\mathcal{D}(\Omega)^{*}$ that satisfy

$$T=\sum_{0\leq|\alpha|\leq k}(-1)^{|\alpha|}D^{\alpha}T_{v_{\alpha}}\tag{3.4}$$

for some $v=(v_{\alpha})\in L^{q}(\Omega)^{N}$, and whose norm is given by

$$\|T\|:=\inf\left\{\|v\|_{L^{q}(\Omega)^{N}}\,:\, v\in L^{q}(\Omega)^{N}\;\text{and}\;\; T=\sum_{0\leq|\alpha|\leq k}(-1)^{|\alpha|}D^{\alpha}T_{v_{\alpha}}\right\}.\tag{3.5}$$

Let $V^{\prime}\subseteq\mathcal{D}(\Omega)^{*}$ be the space of distributions satisfying (3.4) for some $v=(v_{\alpha})\in L^{q}(\Omega)^{N}$. Let $T\in V^{\prime}$, and fix such a $v$. The goal of the proof is to show that $T$ has a unique extension to some $\Lambda\in W_{0}^{k,p}(\Omega)^{*}$, and, moreover, that this map $T\mapsto\Lambda$ is the inverse of the restriction map from the previous lemma.

Given $u\in W_{0}^{k,p}(\Omega)$, let $(\phi_{n})_{n\in\mathbb{N}}$ be a sequence of test functions converging to $u$ in the $W^{k,p}(\Omega)$-norm (note that this is not the same as convergence in the topology on the space of test functions). Such a sequence exists by the definition of $W_{0}^{k,p}(\Omega)$. We claim that $(T(\phi_{n}))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{C}$, which is a consequence of the following calculation

$$\begin{aligned}
\left|T(\phi_{m})-T(\phi_{n})\right|&\leq\sum_{0\leq|\alpha|\leq k}\left|T_{v_{\alpha}}(D^{\alpha}\phi_{m}-D^{\alpha}\phi_{n})\right|\\
&\leq\sum_{0\leq|\alpha|\leq k}\left\|D^{\alpha}(\phi_{m}-\phi_{n})\right\|_{L^{p}(\Omega)}\left\|v_{\alpha}\right\|_{L^{q}(\Omega)}\quad\text{(H\"{o}lder's inequality)}\\
&\leq\left\|\phi_{m}-\phi_{n}\right\|_{W^{k,p}(\Omega)}\sum_{0\leq|\alpha|\leq k}\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}}\quad\text{(definition of the $W^{k,p}$ norm)},
\end{aligned}$$

which converges to zero, since $(\phi_{n})_{n\in\mathbb{N}}$ is a Cauchy sequence in $W^{k,p}(\Omega)$. Therefore $\lim_{n\rightarrow\infty}T(\phi_{n})$ exists, and we claim that the limit only depends on $u$. To see this, consider another sequence $(\varphi_{n})_{n\in\mathbb{N}}$ of test functions converging to $u$ in the $W^{k,p}(\Omega)$ norm, and note that the same calculation as above shows that

$$\begin{aligned}
\left|T(\phi_{n})-T(\varphi_{n})\right|&\leq\left\|\phi_{n}-\varphi_{n}\right\|_{W^{k,p}(\Omega)}\sum_{0\leq|\alpha|\leq k}\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}}\\
&\leq\left(\left\|\phi_{n}-u\right\|_{W^{k,p}(\Omega)}+\left\|\varphi_{n}-u\right\|_{W^{k,p}(\Omega)}\right)\sum_{0\leq|\alpha|\leq k}\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}},
\end{aligned}$$

which converges to zero as $n\rightarrow\infty$. Therefore, we can define

$$\Lambda(u):=\lim_{n\rightarrow\infty}T(\phi_{n}).$$

Clearly $\Lambda$ is linear, since both $T$ and the operation of taking the limit in $W^{k,p}(\Omega)$ are linear. To see that $\Lambda$ is bounded, we compute

$$\left|\Lambda(u)\right|=\lim_{n\rightarrow\infty}\left|T(\phi_{n})\right|\leq\lim_{n\rightarrow\infty}\|\phi_{n}\|_{W^{k,p}(\Omega)}\sum_{0\leq|\alpha|\leq k}\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}}=\|u\|_{W^{k,p}(\Omega)}\sum_{0\leq|\alpha|\leq k}\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}},$$

and so $\|\Lambda\|_{W_{0}^{k,p}(\Omega)^{*}}\leq\sum_{0\leq|\alpha|\leq k}\|(v_{\alpha})\|_{L^{q}(\Omega)^{N}}$.

Therefore, we have shown that $T$ has an extension to $\Lambda\in W_{0}^{k,p}(\Omega)^{*}$, and, moreover, this extension is unique since $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ is dense in $W_{0}^{k,p}(\Omega)$. More precisely, any other bounded linear functional $\Lambda^{\prime}$ that restricts to $T$ on $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ must satisfy

$$\begin{aligned}
\Lambda(u)-\Lambda^{\prime}(u)&=\lim_{n\rightarrow\infty}\Lambda(\phi_{n})-\lim_{n\rightarrow\infty}\Lambda^{\prime}(\phi_{n})\quad\text{(since $\Lambda$ and $\Lambda^{\prime}$ are both continuous)}\\
&=\lim_{n\rightarrow\infty}T(\phi_{n})-\lim_{n\rightarrow\infty}T(\phi_{n})=0.
\end{aligned}$$

By construction, $\Lambda(\phi)=T(\phi)$ for every test function $\phi$, and so the map $V^{\prime}\rightarrow W_{0}^{k,p}(\Omega)^{*}$ is the inverse of the restriction map from Lemma 3.21. To see that this is an isometry, note that Theorem 3.18 shows that the norm on $V^{\prime}$ given by (3.5) is the same as the norm on $W^{k,p}(\Omega)^{*}$ given by (3.2). Therefore $V^{\prime}$ is isometrically isomorphic to $W_{0}^{k,p}(\Omega)^{*}$, which also implies that $V^{\prime}$ is a Banach space. ∎
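To make the estimates in the proof concrete, here is a sketch of the simplest case. The choices $n=1$, $k=1$ and the labels $v_{0},v_{1}$ are illustrative assumptions, and the last step assumes (as is standard) that the $W^{1,p}$ norm dominates both $\|\phi\|_{L^{p}}$ and $\|\phi'\|_{L^{p}}$.

```latex
% Illustrative case (assumption): n = 1, k = 1, v = (v_0, v_1) \in L^q(\Omega)^2.
\[
T(\phi) = T_{v_0}(\phi) - (D T_{v_1})(\phi)
        = \int_\Omega \phi\, v_0 \, dx + \int_\Omega \phi'\, v_1 \, dx .
\]
% Hölder's inequality applied to each integral, then the norm comparison:
\[
|T(\phi)| \le \|\phi\|_{L^p}\,\|v_0\|_{L^q} + \|\phi'\|_{L^p}\,\|v_1\|_{L^q}
          \le \|\phi\|_{W^{1,p}(\Omega)}\left(\|v_0\|_{L^q} + \|v_1\|_{L^q}\right).
\]
```

Applying this bound to $\phi=\phi_{m}-\phi_{n}$ gives the Cauchy estimate used in the proof, and applying it to $\phi_{n}$ and passing to the limit gives the bound on $\Lambda$.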

Remark 3.23.

The space VV^{{\prime}} is a strict subset of 𝒟(Ω)*\mathcal{D}(\Omega)^{*}, since there are many distributions that cannot be written as

T=0|α|k(-1)|α|TvαT=\sum _{{0\leq|\alpha|\leq k}}(-1)^{{|\alpha|}}T_{{v_{\alpha}}}

for some (vα)Lq(Ω)N(v_{\alpha})\in L^{q}(\Omega)^{N}. For example, the delta functional can never be written in this form, since Example 2.10 shows that it cannot be represented by a function.

Remark 3.24.
  1. As part of the previous proof we showed that the restriction map

    $$W_{0}^{k,p}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}$$

    is injective. It is natural to ask whether these results can be extended to $W^{k,p}(\Omega)$. However, the previous proof will not work, since it depends on the fact that, by definition, $W_{0}^{k,p}(\Omega)$ is the completion of $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ in the $W^{k,p}$ norm (the first step is to approximate an element of $W_{0}^{k,p}(\Omega)$ by a sequence of smooth functions with compact support).

  2. One could still ask whether there is an alternative proof that works for $W^{k,p}(\Omega)$. It turns out that in general the answer is no, since the extension of a linear functional $T\in\mathcal{D}(\Omega)^{*}$ to a linear functional $\Lambda\in W^{k,p}(\Omega)^{*}$ may be non-unique. When the domain $\Omega$ is bounded and the boundary has good properties, one can construct examples using the trace operator $W^{1,p}(\Omega)\rightarrow L^{p}(\partial\Omega)$ (see [4, Section 5.5] for the construction), which is zero on $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ but non-zero in general. Therefore the restriction map $W^{k,p}(\Omega)^{*}\rightarrow\mathcal{D}(\Omega)^{*}$ has non-zero kernel, and so we cannot identify $W^{k,p}(\Omega)^{*}$ with a subspace of $\mathcal{D}(\Omega)^{*}$ in this case. Note that the trace operator is zero precisely on the subspace $W_{0}^{1,p}(\Omega)$ (see [4, Theorem 2, Section 5.5] for more details).
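A concrete instance of this non-uniqueness can be sketched in one dimension without mentioning the trace operator explicitly. The choices $\Omega=(0,1)$, $k=1$ and the functional $\Lambda$ below are illustrative assumptions rather than a construction taken from the references. The bound again assumes that the $W^{1,p}$ norm dominates $\|u\|_{L^{p}}$ and $\|u'\|_{L^{p}}$.

```latex
% Illustrative assumption: \Omega = (0,1), k = 1, 1 \le p < \infty, 1/p + 1/q = 1.
\[
\Lambda(u) := \int_0^1 u(x)\,dx + \int_0^1 x\,u'(x)\,dx ,
\qquad u \in W^{1,p}(0,1).
\]
% Boundedness: Hölder with v_0 = 1 and v_1(x) = x, both of L^q(0,1)-norm at most 1.
\[
|\Lambda(u)| \le \|u\|_{L^p}\,\|1\|_{L^q} + \|u'\|_{L^p}\,\|x\|_{L^q}
             \le 2\,\|u\|_{W^{1,p}(0,1)} .
\]
% Vanishing on test functions: integrate by parts, using \phi(0) = \phi(1) = 0.
\[
\Lambda(\phi) = \int_0^1 \phi\,dx + \Big[x\,\phi(x)\Big]_0^1 - \int_0^1 \phi\,dx = 0 ,
\qquad \phi \in \mathcal{C}_{\textup{cpt}}^{\infty}(0,1).
\]
% Non-zero on W^{1,p}(0,1): evaluate on the constant function u = 1.
\[
\Lambda(1) = \int_0^1 1\,dx + \int_0^1 x \cdot 0\,dx = 1 \neq 0 .
\]
```

So $\Lambda$ is a non-zero element of $W^{1,p}(0,1)^{*}$ whose restriction to test functions is the zero distribution. Both $\Lambda$ and $0$ extend the same distribution.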

3.4 Positive distributions can be represented by measures (the Riesz representation theorem)

Given the results of the previous section on the dual space of a Sobolev space, it is natural to ask whether there is a nice characterisation of $\mathcal{D}(\Omega)^{*}$ in terms of familiar objects, and it is the goal of this section to answer this question for positive, real-valued distributions.

As we have seen from Theorem 2.8, there is an injective map $L_{loc}^{1}(\Omega)\hookrightarrow\mathcal{D}(\Omega)^{*}$. Unfortunately, as explained in Example 2.10, the set $L_{loc}^{1}(\Omega)$ is too small to provide a unique representative for every distribution. In Theorem 3.35 we show that regular Borel measures are the right class of objects to represent positive distributions.

This theorem is also proved in [12, Theorem 2.14] (for the dual of the space of continuous functions with compact support) and [8, Theorem 6.22] (for the dual of the space of smooth functions with compact support). Both proofs follow a similar strategy, which involves first using the distribution to define an outer measure, and then showing that open sets are all measurable with respect to this outer measure. Rudin also considers the case of complex-valued distributions in [12, Theorem 6.19], and a more general proof (for the dual of the space $\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega,\mathbb{R}^{m})$) is given in [5, Section 1.8].

Note that in [8] the proof only uses the Riemann integral, and in particular it does not involve Lebesgue measure. Since we are assuming the construction of Lebesgue measure (a construction using outer measure is also given in Definition B.18), we are free to use it here where it simplifies the proof.

For this entire section we use the following notation: let $\Omega$ be an open subset of $\mathbb{R}^{n}$, let $\mathcal{O}(\Omega)$ denote the collection of open subsets of $\Omega$, and let $\mathcal{B}$ denote the Borel $\sigma$-algebra generated by the open subsets of $\Omega$.

Definition 3.25.

Let $T\in\mathcal{D}(\Omega)^{*}$. The distribution $T$ is a positive distribution if $T(\phi)\geq 0$ for all $\phi\in\mathcal{D}(\Omega)$ such that $\phi(x)\geq 0$ for all $x$.

In the following, let $U\subseteq\Omega$ be an open set, and define $\mathcal{C}(U)$ to be the set of functions $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ with $0\leq\phi\leq 1$ and $\operatorname{supp}(\phi)\subset U$ (note that Urysohn’s lemma shows that this set is nonempty if $U$ is nonempty).

Lemma 3.26.

Let $T\in\mathcal{D}(\Omega)^{*}$ be a positive distribution. Then the function $\mu:\mathcal{O}(\Omega)\rightarrow\mathbb{R}$ defined by

$$\mu(U):=\begin{cases}\sup_{\phi\in\mathcal{C}(U)}\{T(\phi)\}&\text{if $U$ is nonempty}\\ 0&\text{if $U=\emptyset$}\end{cases}\tag{3.6}$$

satisfies

  1. $\mu(U_{1})\leq\mu(U_{2})$ if $U_{1}\subseteq U_{2}$ are open sets,

  2. $\displaystyle{\mu\left(\bigcup_{n\in\mathbb{N}}U_{n}\right)\leq\sum_{n\in\mathbb{N}}\mu(U_{n})}$ for every countable collection of open subsets $\{U_{n}\}_{n\in\mathbb{N}}\subset\mathcal{O}(\Omega)$.


The first property follows from the fact that $U_{1}\subseteq U_{2}$ implies that $\mathcal{C}(U_{1})\subseteq\mathcal{C}(U_{2})$.

To prove the second property, we first show that

$$\mu(U_{1}\cup U_{2})\leq\mu(U_{1})+\mu(U_{2})$$

for any open sets $U_{1},U_{2}\in\mathcal{O}(\Omega)$. Given any $\phi\in\mathcal{C}(U_{1}\cup U_{2})$, let $K=\operatorname{supp}(\phi)$, and apply Lemma A.12 to show that there exist functions $\phi_{1}$ and $\phi_{2}$ such that $\phi\cdot\phi_{1}\in\mathcal{C}(U_{1})$, $\phi\cdot\phi_{2}\in\mathcal{C}(U_{2})$, and $\phi\cdot\phi_{1}+\phi\cdot\phi_{2}=\phi$. Therefore

$$T(\phi)=T(\phi\cdot\phi_{1})+T(\phi\cdot\phi_{2})\leq\mu(U_{1})+\mu(U_{2})$$

for all $\phi\in\mathcal{C}(U_{1}\cup U_{2})$, and so $\mu(U_{1}\cup U_{2})\leq\mu(U_{1})+\mu(U_{2})$. Induction then shows that for any $N\in\mathbb{N}$

$$\mu\left(\bigcup_{n=1}^{N}U_{n}\right)\leq\sum_{n=1}^{N}\mu(U_{n}),\tag{3.7}$$

and so it only remains to extend this to countable collections of open sets. To do this, note that any $\displaystyle{\phi\in\mathcal{C}\left(\bigcup_{n\in\mathbb{N}}U_{n}\right)}$ has compact support in $\Omega$, and so there exists a finite collection of sets (re-order so that these are $U_{1},\ldots,U_{N}$) such that $\displaystyle{\operatorname{supp}(\phi)\subset\bigcup_{n=1}^{N}U_{n}}$. Equation (3.7) then gives us

$$T(\phi)\leq\sum_{n=1}^{N}\mu(U_{n})\leq\sum_{n\in\mathbb{N}}\mu(U_{n}),$$

which completes the proof. ∎

Now extend $\mu$ to a function $\mu^{*}$ on the set of all subsets of $\Omega$ by

$$\mu^{*}(A):=\inf\{\mu(U)\,:\, A\subset U\,\text{and}\, U\in\mathcal{O}(\Omega)\}.\tag{3.8}$$

Lemma 3.27.

The function $\mu^{*}$ is an outer measure on $\Omega$.


Recall that we have to prove that each of the following conditions holds.

  1. $\mu^{*}(A)\geq 0$ for all $A\subseteq\Omega$ and $\mu^{*}(\emptyset)=0$,

  2. $\mu^{*}(A_{1})\leq\mu^{*}(A_{2})$ if $A_{1}\subseteq A_{2}$, and

  3. $\displaystyle{\mu^{*}\left(\bigcup_{n\in\mathbb{N}}A_{n}\right)\leq\sum_{n\in\mathbb{N}}\mu^{*}(A_{n})}$ for any countable collection of sets $\{A_{n}\}_{n\in\mathbb{N}}\subset\mathcal{P}(\Omega)$.

The first two of the above properties follow easily from the respective definitions of $\mu^{*}$ and $\mu$, and so it only remains to show countable subadditivity. For any $\varepsilon>0$, let $\{U_{n}\}_{n\in\mathbb{N}}$ be a collection of open subsets of $\Omega$ such that $A_{n}\subset U_{n}$ and $\mu^{*}(U_{n})=\mu(U_{n})\leq\mu^{*}(A_{n})+2^{-n}\varepsilon$ (these sets exist since $\mu^{*}$ is defined using the infimum). Then

$$\mu^{*}\left(\bigcup_{n\in\mathbb{N}}A_{n}\right)\leq\mu^{*}\left(\bigcup_{n\in\mathbb{N}}U_{n}\right)\leq\sum_{n\in\mathbb{N}}\mu^{*}(A_{n})+\varepsilon.$$

Since we can do this for any $\varepsilon>0$, we have

$$\mu^{*}\left(\bigcup_{n\in\mathbb{N}}A_{n}\right)\leq\sum_{n\in\mathbb{N}}\mu^{*}(A_{n}),$$

as required. ∎
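The middle inequality in this proof compresses several steps. Written out as a sketch, and assuming the index set is $\mathbb{N}=\{1,2,\ldots\}$ so that $\sum_{n}2^{-n}=1$, the chain reads as follows.

```latex
% Assumption: \mathbb{N} = \{1, 2, \ldots\}, so \sum_n 2^{-n} = 1.
\[
\begin{aligned}
\mu^{*}\Big(\bigcup_{n} A_n\Big)
  &\le \mu^{*}\Big(\bigcup_{n} U_n\Big) = \mu\Big(\bigcup_{n} U_n\Big)
     && \text{(monotonicity; the union is open)}\\
  &\le \sum_{n} \mu(U_n)
     && \text{(Lemma 3.26, part 2)}\\
  &\le \sum_{n} \big(\mu^{*}(A_n) + 2^{-n}\varepsilon\big)
     = \sum_{n} \mu^{*}(A_n) + \varepsilon
     && \text{(choice of } U_n\text{)}.
\end{aligned}
\]
```

The equality $\mu^{*}(U)=\mu(U)$ for open $U$ holds because $\mu$ is monotone on open sets, so the infimum in (3.8) is attained at $U$ itself.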

It is worth pausing at this stage to consider some examples.

Example 3.28.
  1. Given $x\in\Omega$, let $T=\delta_{x}$, the delta functional at $x$. Then for any subset $A\subseteq\Omega$ we have

    $$\mu(A)=\begin{cases}1&x\in A\\ 0&x\notin A\end{cases}$$
  2. Let $\Omega=\mathbb{R}^{n}$ with co-ordinates $(x_{1},\ldots,x_{n})$, and let $T$ be the distribution defined by integration on the subspace $\mathbb{R}_{1}=\{x_{2}=\cdots=x_{n}=0\}$, i.e.

    $$T(\phi)=\int_{-\infty}^{\infty}\phi(t,0,\ldots,0)\, dt.$$

    Then for any subset $A\subseteq\Omega$ we have $\mu(A)=\left|\mathbb{R}_{1}\cap A\right|$, where $|\cdot|$ denotes the one-dimensional Lebesgue measure on $\mathbb{R}_{1}\cong\mathbb{R}$.
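The formula for the delta functional at a point $x\in\Omega$ can be checked directly from (3.6) and (3.8). The following is a sketch of that check. It uses the smooth form of Urysohn's lemma with the compact set $\{x\}$ to produce a function in $\mathcal{C}(U)$ taking the value $1$ at $x$.

```latex
% Delta functional at x: T(\phi) = \phi(x), and every \phi \in \mathcal{C}(U) has 0 \le \phi \le 1.
\[
\mu(U) = \sup_{\phi \in \mathcal{C}(U)} \phi(x) =
\begin{cases}
1 & x \in U \quad (\text{some } \phi \in \mathcal{C}(U) \text{ has } \phi(x) = 1),\\
0 & x \notin U \quad (\operatorname{supp}(\phi) \subset U \text{ forces } \phi(x) = 0).
\end{cases}
\]
% Passing to an arbitrary subset A via (3.8):
\[
\mu^{*}(A) = \inf\{\mu(U) : A \subset U,\ U \text{ open}\} =
\begin{cases}
1 & x \in A \quad (\text{every such } U \text{ contains } x),\\
0 & x \notin A \quad (\text{take } U = \Omega \setminus \{x\}).
\end{cases}
\]
```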

Theorem B.16 shows that to construct a measure $\mu$ from $\mu^{*}$ we need to restrict to the $\sigma$-algebra of measurable subsets. The next lemma shows that, for the outer measure constructed above, this $\sigma$-algebra contains the Borel $\sigma$-algebra $\mathcal{B}$.

Lemma 3.29.

All open sets $U\in\mathcal{O}(\Omega)$ are measurable with respect to $\mu^{*}$, i.e. for every set $A\subseteq\Omega$ we have

$$\mu^{*}(A)=\mu^{*}(A\cap U)+\mu^{*}(A\cap(\Omega\setminus U)).$$

Since $A=(A\cap U)\cup(A\cap(\Omega\setminus U))$, the inequality

$$\mu^{*}(A)\leq\mu^{*}(A\cap U)+\mu^{*}(A\cap(\Omega\setminus U))$$

follows from the previous lemma, and so it only remains to show the reverse inequality. First consider the case where $A$ is an open subset of $\Omega$. Given any open set $U\subset\Omega$ and any $\varepsilon>0$, choose $\phi\in\mathcal{C}(A\cap U)$ such that $T(\phi)\geq\mu^{*}(A\cap U)-\frac{1}{2}\varepsilon$ (such a $\phi$ exists since $\mu^{*}(A\cap U)=\mu(A\cap U)$ is defined using the supremum). Let $K=\operatorname{supp}(\phi)$. Then $\Omega\setminus K$ is open, and $K\subset A\cap U\subseteq U$ implies that $\Omega\setminus U\subset\Omega\setminus K$.

Now choose $\psi\in\mathcal{C}\left((\Omega\setminus K)\cap A\right)$ such that $T(\psi)\geq\mu^{*}\left((\Omega\setminus K)\cap A\right)-\frac{1}{2}\varepsilon$ (again, such a $\psi$ exists since $\mu^{*}\left((\Omega\setminus K)\cap A\right)=\mu\left((\Omega\setminus K)\cap A\right)$ is defined using the supremum). Since $\operatorname{supp}(\phi)=K$ and $\operatorname{supp}(\psi)\subset(\Omega\setminus K)\cap A\subseteq\Omega\setminus K$, the functions $\phi$ and $\psi$ have disjoint support. Hence $\phi+\psi\in\mathcal{C}(A)$, and so

$$\begin{aligned}
\mu^{*}(A)=\mu(A)&\geq T(\phi)+T(\psi)\\
&\geq\mu^{*}\left(A\cap U\right)-\frac{1}{2}\varepsilon+\mu^{*}\left((\Omega\setminus K)\cap A\right)-\frac{1}{2}\varepsilon\\
&\geq\mu^{*}\left(A\cap U\right)+\mu^{*}\left(A\cap(\Omega\setminus U)\right)-\varepsilon,
\end{aligned}$$

where the last step follows from Lemma 3.26 and the fact that $\Omega\setminus U\subset\Omega\setminus K$. We can do this for any $\varepsilon>0$, and so $\mu^{*}(A)\geq\mu^{*}\left(A\cap U\right)+\mu^{*}\left(A\cap(\Omega\setminus U)\right)$ for any open set $U\subseteq\Omega$.

Now consider the case where $A$ is an arbitrary subset of $\Omega$. Then for any open set $U\subseteq\Omega$ containing $A$ and any open set $V\subseteq\Omega$ we have from Lemma 3.27

$$\begin{aligned}
\mu^{*}(U)&\geq\mu^{*}(A)&&\text{since $A\subseteq U$}\\
\mu^{*}(U\cap V)&\geq\mu^{*}(A\cap V)&&\text{since $A\cap V\subseteq U\cap V$}\\
\text{and}\quad\mu^{*}\left(U\cap(\Omega\setminus V)\right)&\geq\mu^{*}\left(A\cap(\Omega\setminus V)\right)&&\text{since $A\cap(\Omega\setminus V)\subseteq U\cap(\Omega\setminus V)$}.
\end{aligned}$$

Since $U$ is open, the first case (combined with subadditivity) gives

$$\mu^{*}(U)=\mu^{*}(U\cap V)+\mu^{*}\left(U\cap(\Omega\setminus V)\right)\geq\mu^{*}(A\cap V)+\mu^{*}\left(A\cap(\Omega\setminus V)\right)$$

for every open set $U\subseteq\Omega$ containing $A$, and any open set $V\subseteq\Omega$. Therefore, since $\mu^{*}(A)$ is defined using the infimum,

$$\mu^{*}(A)\geq\mu^{*}(A\cap V)+\mu^{*}\left(A\cap(\Omega\setminus V)\right),$$

which completes the proof. ∎

Therefore, by Theorem B.16, the function $\mu^{*}$ restricts to a measure (call it $\mu$) on the Borel $\sigma$-algebra $\mathcal{B}$. Note that this measure $\mu$ is given by (3.6) on open sets. The next two lemmas give a characterisation of $\mu$ on compact sets.

Lemma 3.30.

Given any compact set $K\subset\Omega$, and any $\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\psi\equiv 1$ on $K$ and $0\leq\psi\leq 1$ on $\Omega$, we have $\mu(K)\leq T(\psi)$.

(See also [12, p43].) For all $\alpha$ such that $0<\alpha<1$, let $V_{\alpha}=\{x\,:\,\psi(x)>\alpha\}$. Then each $V_{\alpha}$ is open, and since $\psi\equiv 1$ on $K$ we have $K\subset V_{\alpha}$. Moreover, if $\phi\in\mathcal{C}(V_{\alpha})$ then $\alpha\phi(x)\leq\psi(x)$ for all $x\in V_{\alpha}$. Therefore $T(\phi)\leq\frac{1}{\alpha}T(\psi)$ (since $T$ is a positive distribution) and we have

$$\mu(K)\leq\mu(V_{\alpha})=\sup\{T(\phi)\,:\,\phi\in\mathcal{C}(V_{\alpha})\}\leq\frac{1}{\alpha}T(\psi)$$

for all $\alpha$ such that $0<\alpha<1$. Letting $\alpha\rightarrow 1$ gives $\mu(K)\leq T(\psi)$. ∎

Corollary 3.31.

If $K$ is compact, then $\mu(K)$ is finite.

Lemma 3.32.

Let $K\subset\Omega$ be a compact set. Then

$$\mu(K)=\inf\left\{T(\psi)\,:\,\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega),\,\text{and $\psi\equiv 1$ on $K$}\right\}.\tag{3.9}$$

Firstly note that compact sets are closed and therefore elements of the Borel $\sigma$-algebra. Given any $\varepsilon>0$, let $U$ be an open set such that $K\subset U\subseteq\Omega$ and $\mu(U)\leq\mu(K)+\varepsilon$ (the existence of $U$ follows from outer regularity of $\mu$, which is a direct consequence of the definition of $\mu^{*}$ in (3.8)). Recall from Urysohn’s lemma (Theorem A.11) that there exists $\psi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\operatorname{supp}(\psi)\subset U$, $0\leq\psi(x)\leq 1$ for all $x\in U$, and $\psi\equiv 1$ on $K$. Then $\psi\in\mathcal{C}(U)$, and so $T(\psi)\leq\mu(U)$ by (3.6). Therefore Lemma 3.30 implies that

$$\mu(K)\leq T(\psi)\leq\mu(U)\leq\mu(K)+\varepsilon.$$

We can do this for any $\varepsilon>0$ and any compact set $K\subset\Omega$, therefore (3.9) holds for any compact set $K$. ∎

Lemma 3.33.

Given any $\varepsilon>0$ and any measurable set $A$ there exists an open set $U$ with $A\subset U$ and $\mu(U\setminus A)<\varepsilon$.

If $\mu(A)$ is finite then the result follows easily, since $A$ is measurable and $\mu^{*}(A)$ is defined to be the infimum of $\mu(U)$ for $U\supset A$ open.

If $\mu(A)$ is infinite, then we first write the open set $\Omega$ as the countable union of compact sets

$$\Omega=\bigcup_{\ell\in\mathbb{N}}K_{\ell}$$

(for example we could take each $K_{\ell}$ to be a closed ball), and note that

$$A=\bigcup_{\ell\in\mathbb{N}}(A\cap K_{\ell}).$$

Each $A\cap K_{\ell}$ is a subset of a compact set, and therefore has finite measure, so we can find an open set $U_{\ell}$ such that $A\cap K_{\ell}\subset U_{\ell}$ and

$$\mu(U_{\ell}\setminus(A\cap K_{\ell}))<2^{-\ell}\varepsilon.$$

Then $\displaystyle{U=\bigcup_{\ell\in\mathbb{N}}U_{\ell}}$ is an open set containing $A$, and

$$\mu(U\setminus A)\leq\mu\left(\bigcup_{\ell\in\mathbb{N}}U_{\ell}\setminus(A\cap K_{\ell})\right)\leq\sum_{\ell\in\mathbb{N}}\mu(U_{\ell}\setminus(A\cap K_{\ell}))<\varepsilon.$$

∎

We can now show that the measure $\mu$ is Borel regular (recall Definition B.8).

Lemma 3.34.

$\mu$ is a regular Borel measure on $\Omega$.


Outer regularity of $\mu$ follows easily from the definition of $\mu^{*}$, and therefore it only remains to show that it is inner regular, i.e. for any measurable set $A\subseteq\Omega$ we have

$$\mu(A)=\sup\left\{\mu(K)\,:\, K\subset A\,\text{and $K$ is compact}\right\}.\tag{3.10}$$

Given $\varepsilon>0$, outer regularity of $\mu$ shows that there exists an open set $U$ such that $\Omega\setminus A\subset U$ and $\mu\left(U\setminus(\Omega\setminus A)\right)<\varepsilon$. Since we also have $\Omega\setminus U\subset A$, it follows that

$$U\setminus(\Omega\setminus A)=U\cap A=A\setminus(\Omega\setminus U),$$

and so the previous lemma shows that there exists a closed set $F=\Omega\setminus U$ such that

$$\mu\left(A\setminus F\right)<\varepsilon.$$

Any closed set $F\subset\mathbb{R}^{n}$ is the countable union of compact sets; for example we can take $K_{\ell}=F\cap\overline{B(0,\ell)}$ for each $\ell\in\mathbb{N}$ and write $\displaystyle{F=\bigcup_{\ell\in\mathbb{N}}K_{\ell}}$. For $F=\Omega\setminus U$ as above, let $\displaystyle{F_{n}=\bigcup_{\ell=1}^{n}K_{\ell}}$. If $\mu(A)$ is infinite, then $\lim_{n\rightarrow\infty}\mu(F_{n})$ is infinite also. If $\mu(A)$ is finite, then so is $\mu(F)$, therefore there exists $N$ such that $n\geq N$ implies that $\mu(F_{n})>\mu(F)-\varepsilon$.

In both of these cases we see that $\mu(A)$ can be approximated by the measure of compact sets contained in $A$, which completes the proof of (3.10). ∎

We are now ready to prove the main theorem of this section.

Theorem 3.35.

Given a positive distribution $T$ there is a unique, positive, regular Borel measure $\mu$ on $\Omega$ such that

  1. $\mu(K)<\infty$ for all compact $K\subset\Omega$, and

  2. for all $\phi\in\mathcal{D}(\Omega)$ we have

    $$T(\phi)=\int_{\Omega}\phi(x)\, d\mu.\tag{3.11}$$

Given such a distribution $T$, we have already constructed a positive regular Borel measure $\mu$, which is defined on $\Omega$ and is finite on compact sets, and so it only remains to show (3.11) for all $\phi\in\mathcal{D}(\Omega)$.

First note that we can reduce to the case of $\phi\geq 0$, since both $T$ and the integral with respect to $\mu$ are linear, and any test function $\phi$ can be written $\phi=\phi_{+}-\phi_{-}$ for non-negative test functions $\phi_{+},\phi_{-}\geq 0$ (Lemma A.13).

For each $j,n\in\mathbb{N}$, define compact sets $K_{j}^{n}=\{x\in\Omega\,:\,\phi(x)\geq\frac{j}{n}\}$ (these are compact since $\phi$ is continuous with compact support), and define $K_{0}^{n}=\operatorname{supp}(\phi)$. Let $\chi_{j}^{n}$ be the characteristic function of $K_{j}^{n}$. Then

$$\phi(x)\leq f_{n}(x):=\frac{1}{n}\sum_{j\geq 0}\chi_{j}^{n}(x),$$

with strict inequality on $\operatorname{supp}(\phi)$ and equality (both sides are zero) elsewhere. Moreover, $f_{n}(x)$ converges pointwise to $\phi(x)$, since $f_{n}(x)-\phi(x)\leq\frac{1}{n}$ for each $n\in\mathbb{N}$. Since $f_{n}(x)\leq\sup_{x\in\Omega}\phi(x)+\frac{1}{n}$, and $\operatorname{supp}(\phi)$ is compact, we can construct a function in $L^{1}(\mu)$ that dominates $f_{n}$, and so the dominated convergence theorem shows that

$$\lim_{n\rightarrow\infty}\int_{\Omega}f_{n}(x)\, d\mu=\int_{\Omega}\phi(x)\, d\mu.$$

Therefore it only remains to show that the integral of $f_{n}$ with respect to $\mu$ converges to $T(\phi)$. To see this, note that for each $\varepsilon>0$ outer regularity of $\mu$ shows that we can choose $U_{j}^{n}$ to be an open set containing $K_{j}^{n}$ such that $\mu(U_{j}^{n})<\mu(K_{j}^{n})+\varepsilon$, and use Urysohn’s lemma (Theorem A.11) to find $\psi_{j}^{n}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\psi_{j}^{n}\equiv 1$ on $K_{j}^{n}$ and $\operatorname{supp}\psi_{j}^{n}\subset U_{j}^{n}$. Then by construction, we have

$$\phi(x)\leq f_{n}(x)\leq\frac{1}{n}\sum_{j\geq 0}\psi_{j}^{n}(x),$$

and therefore $\displaystyle{T(\phi)\leq\frac{1}{n}\sum_{j\geq 0}T(\psi_{j}^{n})}$. By the definition of $\mu$ on open sets, we also have

$$\frac{1}{n}\sum_{j\geq 0}T(\psi_{j}^{n})\leq\frac{1}{n}\sum_{j\geq 0}\mu(U_{j}^{n})<\frac{1}{n}\sum_{j\geq 0}\left(\mu(K_{j}^{n})+\varepsilon\right).$$

This is true for all $\varepsilon>0$, and so

$$T(\phi)\leq\frac{1}{n}\sum_{j\geq 0}\mu(K_{j}^{n})=\int_{\Omega}f_{n}(x)\, d\mu$$

for all $n\in\mathbb{N}$. Taking the limit as $n\rightarrow\infty$ gives us $\displaystyle{T(\phi)\leq\int_{\Omega}\phi(x)\, d\mu}$.

Similarly, we can approximate $\phi$ from below by simple functions to obtain the opposite inequality. Since the idea is the same as above, we only sketch the details here.

For each $j,n\in\mathbb{N}$, let $O_{j}^{n}=\{x\in\Omega\,:\,\phi(x)>\frac{j}{n}\}$, and let $\xi_{j}^{n}$ be the characteristic function of $O_{j}^{n}$. Then

$$g_{n}(x):=\frac{1}{n}\sum_{j\geq 1}\xi_{j}^{n}(x)\leq\phi(x)$$

for all $n\in\mathbb{N}$, and $g_{n}$ converges pointwise to $\phi$ since $\phi(x)-g_{n}(x)\leq\frac{1}{n}$. Dominated convergence then shows that

$$\lim_{n\rightarrow\infty}\int_{\Omega}g_{n}(x)\, d\mu=\int_{\Omega}\phi(x)\, d\mu.$$

For each $j,n$, inner regularity of $\mu$ implies that we can find a compact set $C_{j}^{n}$ such that $C_{j}^{n}\subset O_{j}^{n}$ and $\mu(C_{j}^{n})>\mu(O_{j}^{n})-\varepsilon$. Then use Urysohn’s lemma to find $\psi_{j}^{n}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega)$ such that $\psi_{j}^{n}\equiv 1$ on $C_{j}^{n}$ and $\operatorname{supp}\psi_{j}^{n}\subset O_{j}^{n}$. Then the same argument as before shows that

$$T(\phi)\geq\frac{1}{n}\sum_{j\geq 1}T(\psi_{j}^{n})\geq\frac{1}{n}\sum_{j\geq 1}\mu(C_{j}^{n})\geq\frac{1}{n}\sum_{j\geq 1}\left(\mu(O_{j}^{n})-\varepsilon\right).$$

This is true for all $\varepsilon>0$, and so

$$T(\phi)\geq\frac{1}{n}\sum_{j\geq 1}\mu(O_{j}^{n})=\int_{\Omega}g_{n}(x)\, d\mu\longrightarrow\int_{\Omega}\phi\, d\mu.$$

Therefore $\displaystyle{T(\phi)=\int_{\Omega}\phi(x)\, d\mu}$, as required. ∎
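As a sanity check on the upper approximation, it may help to run the argument for the delta functional at a point $x\in\Omega$. The following sketch takes the measure for this case from Example 3.28 (so $\mu(K)=1$ if $x\in K$ and $0$ otherwise) and assumes $\phi\geq 0$ and $x\in\operatorname{supp}(\phi)$.

```latex
% f_n is a simple function, so its integral is a finite sum of measures:
\[
\int_\Omega f_n \, d\mu
  = \frac{1}{n}\sum_{j \ge 0} \int_\Omega \chi_j^n \, d\mu
  = \frac{1}{n}\sum_{j \ge 0} \mu(K_j^n).
\]
% Delta functional at x with x \in supp(\phi):
% x \in K_j^n exactly when j = 0, or j \ge 1 and j/n \le \phi(x),
% so exactly \lfloor n\phi(x) \rfloor + 1 of the sets K_j^n contain x.
\[
\frac{1}{n}\sum_{j \ge 0} \mu(K_j^n)
  = \frac{\lfloor n\,\phi(x) \rfloor + 1}{n}
  \;\longrightarrow\; \phi(x) = T(\phi)
  \qquad (n \to \infty).
\]
```

If $x\notin\operatorname{supp}(\phi)$ then every term vanishes and both sides are zero, so in either case the limit agrees with (3.11).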

Remark 3.36.

Recall that $f\in L_{loc}^{1}(\Omega)$ defines a distribution $\displaystyle{T_{f}(\phi)=\int_{\Omega}f\phi\, dx}$. Conversely, the Radon-Nikodym theorem and the Lebesgue decomposition (Theorems B.28 and B.30 respectively) show that the distribution can be represented by a function in $L^{1}(\Omega)$ if and only if the measure $\mu$ from Theorem 3.35 is absolutely continuous with respect to Lebesgue measure. We have already seen that there exist distributions that cannot be represented by functions in $L^{1}(\Omega)\subseteq L_{loc}^{1}(\Omega)$, for example the delta functional from Example 2.10. For these distributions, the measure constructed in Theorem 3.35 will not be absolutely continuous with respect to Lebesgue measure, i.e. it will have a non-trivial singular component with respect to the Lebesgue decomposition (B.7).

4 Embedding and compactness theorems

The goal of this section is to state the Sobolev Embedding Theorem and the Rellich-Kondrachov compactness theorem. For now, the proofs have been postponed until a future version of these notes. An excellent source for the embedding and compactness theorems is [1], which also contains many examples that show the bounds from the theorems are sharp.

First, we have to define the class of domains under consideration. Given an open subset $\Omega\subseteq\mathbb{R}^{n}$, let

$$\Omega_{\delta}:=\left\{x\in\Omega\,:\,\operatorname{dist}(x,\partial\Omega)<\delta\right\}.$$

Definition 4.1.

Let $\Omega$ be an open subset of $\mathbb{R}^{n}$. We say that $\Omega$ satisfies the cone condition if there exists a finite cone $C$ such that each $x\in\Omega$ is the vertex of a finite cone $C_{x}$ contained in $\Omega$ and congruent to $C$.

We say that $\Omega$ satisfies the uniform cone condition if there exists a locally finite open cover $\{U_{j}\}$ of the boundary of $\Omega$ and a corresponding sequence $(C_{j})_{j\in\mathbb{N}}$ of finite cones, each congruent to some fixed finite cone $C$, such that

  1. there exists $M<\infty$ such that every $U_{j}$ has diameter less than $M$,

  2. $\displaystyle{\Omega_{\delta}\subset\bigcup_{j=1}^{\infty}U_{j}}$ for some $\delta>0$,

  3. $\displaystyle{Q_{j}\equiv\bigcup_{x\in\Omega\cap U_{j}}(x+C_{j})\subset\Omega}$ for every $j$, and

  4. for some $R>1$, every collection of $R$ of the sets $Q_{j}$ has empty intersection.

Since $\Omega$ is open, a function that is continuously differentiable on $\Omega$ need not be bounded. For $j\geq 0$, define $C_{B}^{j}(\Omega)$ to be the space of functions in $C^{j}(\Omega)$ that are bounded and have bounded partial derivatives up to $j^{th}$ order.

This is a Banach space with norm

$$\left\|f\right\|_{C_{B}^{j}(\Omega)}=\max_{0\leq|\alpha|\leq j}\sup_{x\in\Omega}\left|D^{\alpha}f(x)\right|.$$
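A few one-dimensional examples may help to fix the definition. The domain $\Omega=(0,1)$ and the functions $f$, $g$, $h$ below are illustrative choices, not taken from the references.

```latex
% Illustrative assumption: \Omega = (0,1).
% f is smooth on \Omega but unbounded near 0, so it lies in no C_B^j(\Omega):
\[
f(x) = \frac{1}{x}, \qquad \sup_{x \in (0,1)} |f(x)| = \infty .
\]
% g is bounded with unbounded derivative, so g \in C_B^0(\Omega) but g \notin C_B^1(\Omega):
\[
g(x) = \sqrt{x}, \qquad \|g\|_{C_B^0(\Omega)} = 1, \qquad
\sup_{x \in (0,1)} |g'(x)| = \sup_{x \in (0,1)} \frac{1}{2\sqrt{x}} = \infty .
\]
% h and h' are both bounded, so h \in C_B^1(\Omega):
\[
h(x) = x^2, \qquad
\|h\|_{C_B^1(\Omega)} = \max\Big\{ \sup_{(0,1)} x^2,\ \sup_{(0,1)} 2x \Big\} = \max\{1, 2\} = 2 .
\]
```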

Recall that a linear map $T:A\rightarrow B$ of normed linear spaces is an embedding if $T$ is bounded with respect to the norms on $A$ and $B$. Since the elements of $W^{k,p}(\Omega)$ are equivalence classes of functions defined almost everywhere, the meaning of an inclusion map from $W^{k,p}(\Omega)$ into $C_{B}^{j}(\Omega)$ is that each equivalence class in $W^{k,p}(\Omega)$ contains a function in $C_{B}^{j}(\Omega)$.

The meaning of an inclusion map from $W^{k,p}(\Omega)$ into $W^{j,q}(\Omega_{k})$ (where $\Omega_{k}$ is the intersection of $\Omega$ with a plane of dimension $k$ in $\mathbb{R}^{n}$) is that each function in $W^{k,p}(\Omega)$ is the limit of a sequence of $C^{\infty}$ functions (see Section 3.2) and the restriction of these smooth functions to $\Omega_{k}$ converges to a limit in $W^{j,q}(\Omega_{k})$. For the map to be well-defined, this limit needs to be independent of the original choice of sequence. This is guaranteed if the norm on $W^{j,q}(\Omega_{k})$ is bounded by a constant times the norm on $W^{k,p}(\Omega)$ (which always occurs in the cases considered below).

Theorem 4.2 (Sobolev Embedding Theorem).

Let Ωn\Omega\subseteq\mathbb{R}^{n} be an open set satisfying the cone condition, and, for 1kn1\leq k\leq n, let Ωk\Omega _{k} be the intersection of Ω\Omega with a plane of dimension kk in n\mathbb{R}^{n}. Let j0j\geq 0 and m1m\geq 1 be integers, and let 1p<1\leq p<\infty. Then

  1. If either $m-\frac{n}{p}>0$, or $m=n$ and $p=1$, then

    \[ W^{j+m,p}(\Omega)\hookrightarrow C_{B}^{j}(\Omega) \]

    and

    \[ W^{j+m,p}(\Omega)\hookrightarrow W^{j,q}(\Omega_{k}),\quad W^{m,p}(\Omega)\hookrightarrow L^{q}(\Omega)\quad\text{for}\quad p\leq q<\infty. \]

  2. If $1\leq k\leq n$ and $m-\frac{n}{p}=0$, then

    \[ W^{j+m,p}(\Omega)\hookrightarrow W^{j,q}(\Omega_{k})\quad\text{for}\quad p\leq q<\infty. \]

  3. If $m-\frac{n}{p}<0$ and either $m-\frac{n}{p}>-\frac{k}{p}$, or $p=1$ and $m-\frac{n}{p}\geq-\frac{k}{p}$, then

    \[ W^{j+m,p}(\Omega)\hookrightarrow W^{j,q}(\Omega_{k})\quad\text{when}\quad p\leq q\quad\text{and}\quad m-\frac{n}{p}\geq-\frac{k}{q}. \]

Note that in each case, it is the quantity m-npm-\frac{n}{p} that determines the allowed embeddings. Increasing this quantity by either (a) giving up more derivatives, or (b) increasing the power pp, allows for “better” embeddings in the following sense: when m-np>0m-\frac{n}{p}>0 then we get an embedding into the space of continuously differentiable functions (the first case above), and when k1-np1>k2-np2k_{1}-\frac{n}{p_{1}}>k_{2}-\frac{n}{p_{2}} then we get an embedding Wk1,p1(Ω)Wk2,p2(Ω)W^{{k_{1},p_{1}}}(\Omega)\hookrightarrow W^{{k_{2},p_{2}}}(\Omega) (the third case above). The same philosophy applies to the compactness theorem below, as well as the Sobolev multiplication theorem (which has been postponed until a future version of the notes).
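
As a rough illustration of how the quantity $m-\frac{n}{p}$ selects the relevant case of Theorem 4.2, the following worked computation takes $n=3$, $p=2$, $j=0$ and $k=n$ (so that $\Omega_{k}=\Omega$). These particular values are chosen only for illustration and are not taken from the notes.

```latex
% Illustrative values (an assumption, not from the notes): n = 3, p = 2, j = 0, k = n = 3.
\begin{align*}
m=2:&\quad m-\tfrac{n}{p}=2-\tfrac{3}{2}=\tfrac{1}{2}>0
  &&\Longrightarrow\quad W^{2,2}(\Omega)\hookrightarrow C_{B}^{0}(\Omega)\quad\text{(case 1)},\\
m=1:&\quad m-\tfrac{n}{p}=1-\tfrac{3}{2}=-\tfrac{1}{2},\quad -\tfrac{k}{p}=-\tfrac{3}{2}<-\tfrac{1}{2}<0
  &&\Longrightarrow\quad\text{case 3 applies},\\
&\quad -\tfrac{1}{2}\geq-\tfrac{3}{q}\iff q\leq 6
  &&\Longrightarrow\quad W^{1,2}(\Omega)\hookrightarrow L^{q}(\Omega)\quad\text{for}\quad 2\leq q\leq 6.
\end{align*}
```

So, under these assumptions, giving up two derivatives in $L^{2}$ on a three-dimensional domain yields bounded continuous functions, while giving up only one derivative yields integrability up to the exponent $q=6$.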

Theorem 4.3 (Rellich-Kondrachov compactness theorem).

Let Ωn\Omega\subseteq\mathbb{R}^{n} be an open set satisfying the cone condition, let Ω0\Omega _{0} be a bounded open subset of Ω\Omega, and let Ω0k\Omega _{0}^{k} be the intersection of Ω0\Omega _{0} with a kk-dimensional plane in n\mathbb{R}^{n}. Let j0j\geq 0 and m1m\geq 1 be integers, and let 1p<1\leq p<\infty. Then

  1. If $m-\frac{n}{p}>0$, then the following embeddings are compact:

    \[ W^{j+m,p}(\Omega)\hookrightarrow C_{B}^{j}(\Omega_{0}), \]

    \[ W^{j+m,p}(\Omega)\hookrightarrow W^{j,q}(\Omega_{0})\quad\text{if}\quad 1\leq q<\infty. \]

  2. If $m-\frac{n}{p}\leq 0$, then the following embeddings are compact:

    \[ W^{j+m,p}(\Omega)\hookrightarrow W^{j,q}(\Omega_{0}^{k})\quad\text{if}\quad 0>m-\frac{n}{p}>-\frac{k}{p},\quad q\geq 1,\quad\text{and}\quad m-\frac{n}{p}>-\frac{k}{q}, \]

    \[ W^{j+m,p}(\Omega)\hookrightarrow W^{j,q}(\Omega_{0}^{k})\quad\text{if}\quad m-\frac{n}{p}=0\quad\text{and}\quad 1\leq q<\infty. \]
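
To see how the strict inequality in the second case of Theorem 4.3 differs from the non-strict one in Theorem 4.2, here is a short worked computation. The values $n=3$, $p=2$, $m=1$, $j=0$ and $k=n$ are assumptions made only for this sketch.

```latex
% Illustrative values (an assumption, not from the notes): n = 3, p = 2, m = 1, j = 0, k = 3.
\begin{align*}
m-\tfrac{n}{p}&=1-\tfrac{3}{2}=-\tfrac{1}{2},
  \qquad 0>-\tfrac{1}{2}>-\tfrac{k}{p}=-\tfrac{3}{2},\\
m-\tfrac{n}{p}>-\tfrac{k}{q}
  &\iff -\tfrac{1}{2}>-\tfrac{3}{q}
  \iff q<6,\\
\text{so}\quad W^{1,2}(\Omega)&\hookrightarrow L^{q}(\Omega_{0})
  \quad\text{is compact for}\quad 1\leq q<6.
\end{align*}
```

With these values the endpoint $q=6$ satisfies the condition $m-\frac{n}{p}\geq-\frac{k}{q}$ of Theorem 4.2 but not the strict inequality required here, so the statement above gives no compactness at $q=6$.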

Appendix A Notation and basic definitions

A.1 LpL^{p} spaces and LlocpL_{{loc}}^{p} spaces

The spaces Lp(Ω)L^{p}(\Omega) and Llocp(Ω)L_{{loc}}^{p}(\Omega) form the basis for the definition of the Sobolev spaces Wk,p(Ω)W^{{k,p}}(\Omega) and Wlock,p(Ω)W_{{loc}}^{{k,p}}(\Omega) in Definition 2.16, and so we review some of their basic properties here.

Definition A.1.

Let Ωn\Omega\subseteq\mathbb{R}^{n} be an open set, let 0<p<0<p<\infty, and let Ω\mathcal{F}_{\Omega} denote the space of Lebesgue measurable functions on Ω\Omega. Define

Lp(Ω)={fΩ:Ω|f|pdx<}/L^{p}(\Omega)=\left\{ f\in\mathcal{F}_{\Omega}\,:\,\int _{\Omega}|f|^{p}\, dx<\infty\right\}/\sim

where $f\sim g$ if $f=g$ almost everywhere. When $p=\infty$, define

\[ L^{\infty}(\Omega)=\left\{ f\in\mathcal{F}_{\Omega}\,:\,\esssup_{\Omega}|f|<\infty\right\}/\sim, \]

where

\[ \esssup_{\Omega}f=\inf\left\{\alpha\,:\,\left|\{ x\in\Omega\,:\, f(x)>\alpha\}\right|=0\right\} \]

is the essential supremum of $f$ on $\Omega$ (here $|\cdot|$ applied to a set denotes its Lebesgue measure).

It is well-known that when $1\leq p<\infty$ the spaces $L^{p}(\Omega)$ are Banach spaces with the norm $\displaystyle{\| f\|_{L^{p}(\Omega)}=\left(\int_{\Omega}|f|^{p}\, dx\right)^{\frac{1}{p}}}$, and that $L^{\infty}(\Omega)$ is a Banach space with the norm $\| f\|_{L^{\infty}(\Omega)}=\esssup_{\Omega}|f|$ (see for example [12, Theorem 3.11] or [15, Theorem 8.14]).

Definition A.2.

Let $1<p<\infty$. The conjugate exponent of $p$ is the real number $1<q<\infty$ such that

\[ \frac{1}{p}+\frac{1}{q}=1. \]

If p=1p=1 then the conjugate exponent of pp is q=q=\infty, and if p=p=\infty then the conjugate exponent of pp is q=1q=1.

One of the most important inequalities for $L^{p}$ spaces is Hölder’s inequality. For a proof, see for example [12, Theorem 3.5].

Theorem A.3 (Hölder’s inequality).

Let Ωn\Omega\subseteq\mathbb{R}^{n} be open, and let fLp(Ω)f\in L^{p}(\Omega), gLq(Ω)g\in L^{q}(\Omega), where pp and qq are conjugate exponents. Then

fgL1(Ω)fLp(Ω)gLq(Ω).\| fg\| _{{L^{1}(\Omega)}}\leq\| f\| _{{L^{p}(\Omega)}}\| g\| _{{L^{q}(\Omega)}}.
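
One routine consequence of Hölder’s inequality is obtained by taking $g\equiv 1$. The sketch below makes the extra assumption (not part of Theorem A.3) that $\Omega$ has finite Lebesgue measure $|\Omega|$, and takes $1<p<\infty$.

```latex
% Assumptions for this sketch: |\Omega| < \infty, 1 < p < \infty, and 1/p + 1/q = 1.
\begin{align*}
\| f\|_{L^{1}(\Omega)}
  =\| f\cdot 1\|_{L^{1}(\Omega)}
  \leq\| f\|_{L^{p}(\Omega)}\,\| 1\|_{L^{q}(\Omega)}
  =|\Omega|^{\frac{1}{q}}\,\| f\|_{L^{p}(\Omega)}
  =|\Omega|^{1-\frac{1}{p}}\,\| f\|_{L^{p}(\Omega)}.
\end{align*}
```

In particular, under this assumption every function in $L^{p}(\Omega)$ is also in $L^{1}(\Omega)$.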

The following theorem characterises the dual space of $L^{p}(\Omega)$. It is also well-known; for a proof see for example [2, Chapter IV], [8, Theorem 2.14], or [15, Theorem 10.44].

Theorem A.4.

Let Ωn\Omega\subseteq\mathbb{R}^{n} be open, let 1p<1\leq p<\infty, and let qq be the conjugate exponent of pp. Then

(Lp(Ω))*Lq(Ω).(L^{p}(\Omega))^{*}\cong L^{q}(\Omega).
Remark A.5.
  1. The isomorphism Lq(Ω)Lp(Ω)*L^{q}(\Omega)\cong L^{p}(\Omega)^{*} has an explicit form

    Lq(Ω)f(Tf:gΩf(x)g(x)dx)Lp(Ω)*L^{q}(\Omega)\ni f\mapsto\left(T_{f}:g\mapsto\int _{\Omega}f(x)g(x)\, dx\right)\in L^{p}(\Omega)^{*}
  2. It is not true that $L^{\infty}(\Omega)^{*}\cong L^{1}(\Omega)$, since, for any $x\in\Omega$, the Hahn-Banach theorem shows that the delta functional $\delta_{x}(f)=f(x)$, defined on the subspace $C_{B}^{0}(\Omega)\subset L^{\infty}(\Omega)$ of bounded continuous functions, extends to a bounded linear functional (call it $\tilde{\delta}_{x}$) on $L^{\infty}(\Omega)$. An argument similar to Example 2.10 shows that $\tilde{\delta}_{x}$ cannot be represented by a function in $L^{1}(\Omega)$, i.e. there is no $g\in L^{1}(\Omega)$ such that

    δ~x(f)=Ωfgdx\tilde{\delta}_{x}(f)=\int _{\Omega}fg\, dx

    for all fL(Ω)f\in L^{\infty}(\Omega).

More generally, Theorem A.4 is true for any $\sigma$-finite measure space (see [15, pp182-185]). Since the proof uses the Radon-Nikodym theorem, the result may fail if the measure is not $\sigma$-finite (see Appendix B for the relevant definitions and statements of the theorems). The following example illustrates this for a simple case.

Example A.6.

Let $\Sigma=\{\emptyset,X\}$ be the trivial $\sigma$-algebra on a space $X$, and let $\mu$ be the measure $\mu(X)=\infty$, $\mu(\emptyset)=0$. Then the measurable functions $f:X\rightarrow\mathbb{C}$ are constants, and so we see that $L^{1}(X,d\mu)\cong\{ 0\}$ consists of only the zero function. Therefore the dual is $L^{1}(X,d\mu)^{*}\cong\{ 0\}$. However, $L^{\infty}(X,d\mu)\cong\mathbb{C}$, and so the dual of $L^{1}(X,d\mu)$ is not isomorphic to $L^{\infty}(X,d\mu)$ in this case.

Next, we define the space $L_{loc}^{p}(\Omega)$, which consists of functions that are locally $p$-integrable, in the sense that the integral of $|f|^{p}$ is finite on compact sets.

Definition A.7.

Let Ω\Omega be an open set in n\mathbb{R}^{n}, and let \mathcal{F} be the set of Lebesgue measurable functions on Ω\Omega. Then

\[ L_{loc}^{p}(\Omega)=\left\{ f\in\mathcal{F}\,:\, f\in L^{p}(K)\,\,\text{for all compact $K\subset\Omega$}\right\}. \]
Remark A.8.

One can easily extend this definition to arbitrary measure spaces that also have a topology (and hence a notion of compactness).

Clearly we have an inclusion Lp(Ω)Llocp(Ω)L^{p}(\Omega)\hookrightarrow L_{{loc}}^{p}(\Omega). The following examples show that this is not surjective.

Example A.9.
  1. Ω=n\Omega=\mathbb{R}^{n}. Let f1f\equiv 1, and note that

    K|f(x)|pdx=m(K)<,\int _{K}|f(x)|^{p}\, dx=m(K)<\infty,

    where m(K)m(K) denotes the Lebesgue measure of KK. Therefore fLlocp(Ω)f\in L_{{loc}}^{p}(\Omega) for any p>0p>0, even though fLp(Ω)f\notin L^{p}(\Omega).

  2. Ω=(0,ε)\Omega=(0,\varepsilon)\subset\mathbb{R}. Let f(x)=1xf(x)=\frac{1}{x}, and note that ff is bounded on any compact subset K(0,ε)K\subset(0,\varepsilon). Therefore K|f(x)|pdx<\displaystyle{\int _{K}|f(x)|^{p}\, dx<\infty}, and so fLlocp(Ω)f\in L_{{loc}}^{p}(\Omega), but fLp(Ω)f\notin L^{p}(\Omega) for any p1p\geq 1. (Note that we can extend this to any p>0p>0 by choosing f(x)=1xnf(x)=\frac{1}{x^{n}} for some nn, or even a function that grows faster at the origin, such as f(x)=exp(1x2)f(x)=\exp(\frac{1}{x^{2}}).)
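
The claim in the second example, and the reason a faster-growing function is needed when $0<p<1$, can be checked by a direct computation. The following is a sketch in which $\delta\in(0,\varepsilon)$ is an auxiliary parameter introduced only for this illustration.

```latex
% Auxiliary parameter (an assumption for this sketch): 0 < \delta < \varepsilon.
\begin{align*}
\int_{\delta}^{\varepsilon}\frac{dx}{x^{p}}
  &=\frac{\varepsilon^{1-p}-\delta^{1-p}}{1-p}
  \;\xrightarrow{\;\delta\rightarrow 0\;}\;\infty
  &&\text{if } p>1,\\
\int_{\delta}^{\varepsilon}\frac{dx}{x}
  &=\log\frac{\varepsilon}{\delta}
  \;\xrightarrow{\;\delta\rightarrow 0\;}\;\infty
  &&\text{if } p=1,\\
\int_{\delta}^{\varepsilon}\frac{dx}{x^{p}}
  &=\frac{\varepsilon^{1-p}-\delta^{1-p}}{1-p}
  \;\xrightarrow{\;\delta\rightarrow 0\;}\;\frac{\varepsilon^{1-p}}{1-p}<\infty
  &&\text{if } 0<p<1.
\end{align*}
```

So $\frac{1}{x}$ fails to be in $L^{p}((0,\varepsilon))$ exactly when $p\geq 1$, which is why a function with faster growth at the origin is needed for $0<p<1$.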

The spaces LpL^{p} and LlocpL_{{loc}}^{p} for 0<p<10<p<1 have radically different properties to those described above for other values of pp. These properties are discussed further in [13, pp35-36].

A.2 Integration by parts

Since we use integration by parts on open subsets of $\mathbb{R}^{n}$ in Section 2.2, we recall the formula here. See [4, Appendix C.1] for a more complete description.

Given an open set $\Omega\subset\mathbb{R}^{n}$ with a $C^{1}$ boundary, define

\[ \nu=(\nu^{1},\ldots,\nu^{n}) \]

to be the outward pointing unit normal at each point of the boundary $\partial\Omega$, and let $dS$ denote the volume element on the boundary.

Theorem A.10 (Integration by parts).

Let Ω\Omega be a bounded open subset of n\mathbb{R}^{n} with a C1C^{1} boundary, and let u,vC1(Ω¯)u,v\in C^{1}(\bar{\Omega}). Then for all i=1,,ni=1,\ldots,n we have

Ωxiuvdx=-Ωuxivdx+ΩuvνidS.\int _{\Omega}\partial _{{x_{i}}}u\, v\, dx=-\int _{\Omega}u\,\partial _{{x_{i}}}v\, dx+\int _{{\partial\Omega}}uv\nu^{i}\, dS.

If uu has compact support in Ω\Omega, then the boundary term disappears, and we have

Ωxiuvdx=-Ωuxivdx.\int _{\Omega}\partial _{{x_{i}}}u\, v\, dx=-\int _{\Omega}u\,\partial _{{x_{i}}}v\, dx.
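
As a sanity check on the signs in Theorem A.10, here is a one-dimensional worked example. It assumes $n=1$ and $\Omega=(0,1)$, with the boundary integral interpreted as a sum over the two endpoints, where $\nu=+1$ at $x=1$ and $\nu=-1$ at $x=0$. The functions $u(x)=x$ and $v(x)=x^{2}$ are chosen only for illustration.

```latex
% Assumptions for this sketch: n = 1, \Omega = (0,1), u(x) = x, v(x) = x^2,
% and the boundary term is the sum over the endpoints with \nu(1) = +1, \nu(0) = -1.
\begin{align*}
\int_{0}^{1}u'(x)\,v(x)\,dx&=\int_{0}^{1}x^{2}\,dx=\frac{1}{3},\\
-\int_{0}^{1}u(x)\,v'(x)\,dx&=-\int_{0}^{1}2x^{2}\,dx=-\frac{2}{3},\\
\int_{\partial\Omega}uv\,\nu\,dS&=u(1)v(1)\cdot(+1)+u(0)v(0)\cdot(-1)=1-0=1,\\
-\frac{2}{3}+1&=\frac{1}{3}.
\end{align*}
```

The two sides agree, and the boundary term is exactly the familiar $\left[uv\right]_{0}^{1}$ from single-variable calculus.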

A.3 The smooth Urysohn lemma and partitions of unity

The goal of this section is to give some consequences of the smooth Urysohn lemma and the existence of partitions of unity on open subsets of n\mathbb{R}^{n}.

Theorem A.11.

Let Ωn\Omega\subset\mathbb{R}^{n} be an open set, let KΩK\subset\Omega be compact, and let UU be an open set such that KUΩK\subset U\subset\Omega. Then there exists a smooth function f𝒞(Ω)f\in\mathcal{C}^{\infty}(\Omega) such that 0f(x)10\leq f(x)\leq 1 for all xΩx\in\Omega, f1f\equiv 1 on KK, and f0f\equiv 0 on ΩU\Omega\setminus U. Moreover, there also exists f𝒞cpt(Ω)f\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega) such that 0f(x)10\leq f(x)\leq 1 for all xΩx\in\Omega, and f1f\equiv 1 on KK.

See [3, Theorem 2.6.1] for a proof.

As a consequence of Urysohn’s lemma, we have the following useful results.

Lemma A.12.

Let U1,U2U_{1},U_{2} be open sets in n\mathbb{R}^{n}, and let KU1U2K\subset U_{1}\cup U_{2} be compact. Then there exist non-negative functions ϕ1,ϕ2\phi _{1},\phi _{2} which are smooth on U1U2U_{1}\cup U_{2} and satisfy

  1. ϕ1(x)+ϕ2(x)=1\phi _{1}(x)+\phi _{2}(x)=1 for all xKx\in K,

  2. ϕ1𝒞cpt(U1)\phi _{1}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1}) and ϕ2𝒞cpt(U2)\phi _{2}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{2}).


Proof. Urysohn’s lemma shows that there exists $\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1}\cup U_{2})$ such that $0\leq\phi\leq 1$ and $\phi\equiv 1$ on $K$. Apply Urysohn’s lemma again to find $\tilde{\phi}_{1}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1})$ such that $0\leq\tilde{\phi}_{1}\leq 1$ and $\tilde{\phi}_{1}\equiv 1$ on a neighbourhood of the compact set $\supp(\phi)\cap U_{2}^{c}\subset U_{1}$. Let $\phi_{1}=\phi\cdot\tilde{\phi}_{1}$, and note that

  1. $\phi_{1}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{1})$,

  2. 0ϕ1ϕ0\leq\phi _{1}\leq\phi,

  3. ϕ11\phi _{1}\equiv 1 on KU2cK\cap U_{2}^{c},

  4. supp(ϕ1)supp(ϕ)\supp(\phi _{1})\subseteq\supp(\phi) and ϕ1=ϕ\phi _{1}=\phi on U2cU_{2}^{c}.

Define ϕ2=ϕ-ϕ1\phi _{2}=\phi-\phi _{1}, and note that

  1. 0ϕ210\leq\phi _{2}\leq 1,

  2. ϕ2𝒞cpt(U2)\phi _{2}\in\mathcal{C}_{\textup{cpt}}^{\infty}(U_{2}), and

  3. ϕ1+ϕ21\phi _{1}+\phi _{2}\equiv 1 on KK.

Therefore ϕ1\phi _{1} and ϕ2\phi _{2} satisfy the stated conditions. ∎

Any smooth function can be written as the difference of two non-negative continuous functions, just by taking the positive and negative parts of the original function. The next lemma shows that a smooth function can also be written as the difference of two non-negative smooth functions.

Lemma A.13.

Let ϕ𝒞cpt(Ω)\phi\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega). Then there exist functions ϕ+,ϕ-𝒞cpt(Ω)\phi _{+},\phi _{-}\in\mathcal{C}_{\textup{cpt}}^{\infty}(\Omega) such that ϕ+(x)0\phi _{+}(x)\geq 0 and ϕ-(x)0\phi _{-}(x)\geq 0 for all xΩx\in\Omega and ϕ(x)=ϕ+(x)-ϕ-(x)\phi(x)=\phi _{+}(x)-\phi _{-}(x) for all xΩx\in\Omega.


Proof. Using Urysohn’s lemma, construct a non-negative smooth function $\psi$ with compact support in $\Omega$ such that $\psi(x)=\sup_{y\in\Omega}|\phi(y)|$ for all $x\in\supp(\phi)$ (for example, multiply the function given by Theorem A.11 with $K=\supp(\phi)$ by the constant $\sup_{y\in\Omega}|\phi(y)|$). Then both $\psi$ and $\psi-\phi$ are non-negative smooth functions with compact support in $\Omega$, and so we can define $\phi_{+}=\psi$, $\phi_{-}=\psi-\phi$. ∎

A.4 A corollary of Rademacher’s theorem

Rademacher’s theorem states that a Lipschitz function f:Ωnmf:\Omega\subseteq\mathbb{R}^{n}\rightarrow\mathbb{R}^{m} is differentiable almost everywhere (with respect to Lebesgue measure). This is used in Example 2.21 as part of the proof that a Lipschitz continuous function is in Wloc1,(Ω)W_{{loc}}^{{1,\infty}}(\Omega). The actual statement used in Example 2.21 is that the partial derivatives of ff exist almost everywhere (a slightly weaker statement than Rademacher’s theorem). The purpose of this section is to recall the basic definitions and state the theorem. A proof of Rademacher’s theorem can be found in [5].

First, recall the following definition.

Definition A.14.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set. A function $f:\Omega\rightarrow\mathbb{R}^{m}$ is locally Lipschitz continuous on $\Omega$ if for every $x\in\Omega$ there exists a constant $C(x)$ and a neighbourhood $U$ of $x$ such that the following inequality is satisfied

\[ \left|f(x)-f(y)\right|\leq C(x)|x-y|\quad\text{for all $y\in U$}. \tag{A.1} \]

If there exists a uniform constant $C$ such that $\left|f(x)-f(y)\right|\leq C|x-y|$ for all $x,y\in\Omega$ then we say that $f$ is uniformly Lipschitz on $\Omega$. The smallest value of the constant $C$ is called the Lipschitz constant

\[ \Lip(f)=\sup_{x,y\in\Omega,\, x\neq y}\frac{\left|f(x)-f(y)\right|}{|x-y|}. \tag{A.2} \]
Theorem A.15.

Let Ωn\Omega\subseteq\mathbb{R}^{n} be open, and let f:Ωf:\Omega\rightarrow\mathbb{R} be locally Lipschitz on Ω\Omega. Then ff is differentiable almost everywhere in Ω\Omega (with respect to the Lebesgue measure on n\mathbb{R}^{n}).

For a proof, see [5, Section 3.1.2].

Remark A.16.

Uniformly Lipschitz implies absolutely continuous, and so we know that the theorem is true for functions $f:\Omega\subseteq\mathbb{R}\rightarrow\mathbb{R}$ with one-dimensional domains by the general theory of absolutely continuous functions (see for example [15, Theorem 7.29]). This fact is used in the proof for $n\geq 2$; however, some further analysis is also necessary (see [5, Section 3.1.2] for the details).

As a consequence of this, we have the following

Corollary A.17.

Let Ω\Omega be an open subset of n\mathbb{R}^{n}, and let f:Ωf:\Omega\rightarrow\mathbb{R} be a locally Lipschitz function. Then the partial derivatives of ff exist almost everywhere on Ω\Omega (with respect to the Lebesgue measure on n\mathbb{R}^{n}).
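
A standard illustration of Corollary A.17 is the Euclidean norm. The following is a sketch in which the choice $f(x)=|x|$ on $\Omega=\mathbb{R}^{n}$ is made only for this example.

```latex
% Illustrative choice (not from the notes): f(x) = |x| on \Omega = \mathbb{R}^n.
\begin{align*}
\bigl|\,|x|-|y|\,\bigr|&\leq|x-y|
  &&\text{(reverse triangle inequality), so } \Lip(f)\leq 1,\\
\frac{\bigl|\,|x|-|0|\,\bigr|}{|x-0|}&=1\quad\text{for } x\neq 0,
  &&\text{so } \Lip(f)=1,\\
\partial_{x_{i}}f(x)&=\frac{x_{i}}{|x|}\quad\text{for } x\neq 0,
  &&\text{while } \lim_{t\rightarrow 0^{\pm}}\frac{f(te_{i})-f(0)}{t}=\pm 1.
\end{align*}
```

The partial derivatives therefore exist everywhere except at the origin, which is a set of Lebesgue measure zero, in agreement with the corollary.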

Appendix B Basic results from measure theory

Since Section 3.4 deals with measures, for completeness we review the basic definitions here. In particular, this includes the definition of a complex measure. These notes assume knowledge of Lebesgue integration and the basic theorems associated to it (monotone convergence, dominated convergence, etc.), so this material is not included here; the purpose is just to recall the important definitions that are used elsewhere in the notes. Of course, there are already good sources for this material such as [12], [15], [7], [10] (and many more!), so only the material relevant to the rest of the notes is covered in this section. Examples are included wherever possible in order to clarify the theory.

Since we want to deal with sets of infinite measure ($\mathbb{R}^{n}$ is the standard example), we first have to define arithmetic in $\mathbb{R}\cup\{-\infty,+\infty\}$. This is an extension of the usual operations of addition and multiplication on $\mathbb{R}$, together with the following definitions.

\[ a\cdot\infty=\infty\cdot a=\begin{cases}\infty&\text{if $a>0$}\\ 0&\text{if $a=0$}\\ -\infty&\text{if $a<0$}\end{cases} \]

Care must be taken when cancelling terms from an equation: $a+b=a+c$ implies $b=c$ only if $a\in\mathbb{R}$, and $ab=ac$ implies $b=c$ only if $a\in\mathbb{R}\setminus\{ 0\}$. The consequence of these definitions is that the integral of any function over a set of measure zero will be zero, and the integral of the zero function over any set will also be zero.

Definition B.1.

A collection Σ\Sigma of subsets of a set XX is a σ\sigma-algebra on XX if all of the following hold.

  1. XΣX\in\Sigma.

  2. If AΣA\in\Sigma, then XAΣX\setminus A\in\Sigma.

  3. If AnΣA_{n}\in\Sigma for all nn\in\mathbb{N}, then nAnΣ\displaystyle{\bigcup _{{n\in\mathbb{N}}}A_{n}\in\Sigma}.

Example B.2.
  1. The set of all subsets of XX forms a σ\sigma-algebra.

  2. Σ={,X}\Sigma=\{\emptyset,X\} is a σ\sigma-algebra, called the trivial σ\sigma-algebra on XX.

  3. The set of all subsets of \mathbb{R} that are open is not a σ\sigma-algebra, since the complement of an open set is not necessarily open.

  4. The set of all subsets of n\mathbb{R}^{n} that are either open or closed is not a σ\sigma-algebra, since it is not closed under the operation of countable unions. For example, in the case n=1n=1

    [a,b)=n[a,b-1n][a,b)=\bigcup _{{n\in\mathbb{N}}}\left[a,b-\frac{1}{n}\right]

    is neither open nor closed.

  5. If Σ\Sigma is a σ\sigma-algebra on XX and UXU\subset X, then the collection

    ΣU={AU:AΣ}\Sigma _{U}=\{ A\cap U\,:\, A\in\Sigma\}

    is a σ\sigma-algebra on UU.

Since open and closed subsets of $\mathbb{R}^{n}$ are of fundamental importance, it would be useful to have a $\sigma$-algebra that contains all of these sets. The $\sigma$-algebra of all subsets of $\mathbb{R}^{n}$ is too large for interesting measures to exist (see [10, Section 5] for more insight into why this is the case), so it would also be useful for this $\sigma$-algebra to have some minimality property, i.e. it is the “smallest” $\sigma$-algebra that contains all of the open and closed subsets of $\mathbb{R}^{n}$. The next theorem shows that such a $\sigma$-algebra exists.

Theorem B.3.

Let \mathcal{F} be a collection of subsets of a set XX. Then there exists a unique σ\sigma-algebra on XX, call it Σ\Sigma, such that

  1. Σ\mathcal{F}\subseteq\Sigma, and

  2. any other σ\sigma-algebra Σ\Sigma^{{\prime}} containing \mathcal{F} satisfies ΣΣ\Sigma\subseteq\Sigma^{{\prime}} (i.e. Σ\Sigma is the smallest σ\sigma-algebra containing \mathcal{F}).

This σ\sigma-algebra is called the σ\sigma-algebra generated by \mathcal{F}.

Proof of Theorem B.3.

Consider the family of all σ\sigma-algebras on XX that contain \mathcal{F}. Since the set of all subsets of XX is a σ\sigma-algebra containing \mathcal{F}, then this family is non-empty. We claim that the intersection Σ\Sigma of all σ\sigma-algebras containing \mathcal{F} is also a σ\sigma-algebra, and the result will then follow, since such a σ\sigma-algebra clearly satisfies both of the conditions of the theorem.

Firstly note that the set XX is in every σ\sigma-algebra containing \mathcal{F}, and so XΣX\in\Sigma also. If a subset AXA\subseteq X is in Σ\Sigma, then it is in every σ\sigma-algebra containing \mathcal{F}, and so XAX\setminus A is in every σ\sigma-algebra containing \mathcal{F}, therefore XAΣX\setminus A\in\Sigma also. Therefore it only remains to check that Σ\Sigma is closed under countable unions. To see this, let {An}n\{ A_{n}\} _{{n\in\mathbb{N}}} be a countable collection of sets in Σ\Sigma. Then {An}nΣ\{ A_{n}\} _{{n\in\mathbb{N}}}\subseteq\Sigma^{{\prime}} for every σ\sigma-algebra Σ\Sigma^{{\prime}} containing \mathcal{F}, and so A:=nAnΣA:=\displaystyle{\bigcup _{{n\in\mathbb{N}}}A_{n}\in\Sigma^{{\prime}}} also. Therefore AΣA\in\Sigma, and we have shown that Σ\Sigma is a σ\sigma-algebra. ∎
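
For a finite set the intersection in the proof can be written out by hand. The following small example is a sketch in which the set $X=\{1,2,3\}$ and the collection $\mathcal{F}=\{\{1\}\}$ are chosen only for illustration.

```latex
% Illustrative choice (not from the notes): X = \{1,2,3\}, \mathcal{F} = \{\{1\}\}.
\begin{align*}
\text{$\sigma$-algebras on $X$ containing $\{1\}$:}\quad
  &\Sigma_{1}=\bigl\{\emptyset,\{1\},\{2,3\},X\bigr\}
  \quad\text{and}\quad\mathcal{P}(X),\\
\text{their intersection:}\quad
  &\Sigma=\Sigma_{1}\cap\mathcal{P}(X)=\bigl\{\emptyset,\{1\},\{2,3\},X\bigr\}.
\end{align*}
```

So in this example the $\sigma$-algebra generated by $\mathcal{F}$ consists of $\{1\}$, its complement, $\emptyset$ and $X$, which is what closure under complements and unions forces.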

Definition B.4.

The Borel σ\sigma-algebra \mathcal{B} on n\mathbb{R}^{n} is the smallest σ\sigma-algebra that contains the collection of open and closed subsets of n\mathbb{R}^{n}. The sets in \mathcal{B} are called the Borel subsets of n\mathbb{R}^{n}.

Definition B.5.

Let Σ\Sigma be a σ\sigma-algebra on a set XX. A positive measure on Σ\Sigma is a function μ:Σ[0,]\mu:\Sigma\rightarrow[0,\infty] such that

\[ \mu\left(\bigcup_{n\in\mathbb{N}}A_{n}\right)=\sum_{n\in\mathbb{N}}\mu(A_{n}) \tag{B.1} \]

for any disjoint collection {An}nΣ\{ A_{n}\} _{{n\in\mathbb{N}}}\subseteq\Sigma. A function μ\mu satisfying (B.1), but without the restriction that the range is [0,][0,\infty], is called a countably additive set function.

A complex measure on Σ\Sigma is a countably additive function μ:Σ\mu:\Sigma\rightarrow\mathbb{C} (see [12, Chapter 6] for more about complex measures).

Definition B.6.

A measure space (X,Σ,μ)(X,\Sigma,\mu) consists of a set XX, a σ\sigma-algebra Σ\Sigma of subsets of XX, and a measure μ\mu on Σ\Sigma.

A measure space (X,Σ,μ)(X,\Sigma,\mu) is finite if μ(X)\mu(X) is finite. A measure space (X,Σ,μ)(X,\Sigma,\mu) is σ\sigma-finite if XX is the countable union of sets XnΣX_{n}\in\Sigma with μ(Xn)\mu(X_{n}) finite for each nn\in\mathbb{N}.

Remark B.7.

A $\sigma$-finite measure is the countable sum of finite measures. To see this, let $(X,\Sigma,\mu)$ be $\sigma$-finite, with $\displaystyle{X=\bigcup_{n\in\mathbb{N}}X_{n}}$ and $\mu(X_{n})$ finite for each $n\in\mathbb{N}$. Replacing each $X_{n}$ by $X_{n}\setminus\bigcup_{m<n}X_{m}$ if necessary, we may assume that the sets $X_{n}$ are pairwise disjoint. Define measures

μn(E)=μ(EXn),\mu _{n}(E)=\mu(E\cap X_{n}),

and note that μ(E)=nμn(E)\mu(E)=\displaystyle{\sum _{{n\in\mathbb{N}}}\mu _{n}(E)}.
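
For Lebesgue measure $m$ on $\mathbb{R}^{n}$ the decomposition can be made explicit. The following is a sketch in which the sets $X_{j}$ (annuli centred at the origin) are one possible choice among many, and $\mathbb{N}$ is assumed to start at $1$.

```latex
% One possible choice (an assumption for this sketch), with \mathbb{N} = \{1,2,3,\ldots\}:
\begin{align*}
X_{j}&=\{ x\in\mathbb{R}^{n}\,:\, j-1\leq|x|<j\},
  &&\text{pairwise disjoint, with}\quad\bigcup_{j\in\mathbb{N}}X_{j}=\mathbb{R}^{n},\\
m(X_{j})&\leq m\bigl(B(0,j)\bigr)<\infty,
  &&m_{j}(E):=m(E\cap X_{j}),\\
m(E)&=m\Bigl(\bigcup_{j\in\mathbb{N}}(E\cap X_{j})\Bigr)
  =\sum_{j\in\mathbb{N}}m_{j}(E)
  &&\text{by countable additivity.}
\end{align*}
```

The disjointness of the sets $X_{j}$ is what allows countable additivity to be applied in the last line.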

An important class of measures on n\mathbb{R}^{n} are those defined on the Borel σ\sigma-algebra.

Definition B.8.

A Borel measure on n\mathbb{R}^{n} is a measure defined on the Borel σ\sigma-algebra.

A Borel measure $\mu$ is inner regular (resp. outer regular) if for every Borel set $E\subseteq\mathbb{R}^{n}$ we have

\[ \mu(E)=\sup\{\mu(K)\,:\,\text{$K$ compact and $K\subset E$}\}\quad\text{(resp. }\,\mu(E)=\inf\{\mu(U)\,:\,\text{$U$ open and $E\subset U$}\}\text{)}. \]

If a Borel measure μ\mu is both inner and outer regular, then we say that μ\mu is Borel regular.

Remark B.9.
  1. Again, this notion can be extended to measures on a locally compact Hausdorff space (see [12]).

  2. The definition of a Borel regular measure given above is equivalent to the requirement that every measurable set EE has the same measure as some Borel sets B1EB_{1}\supseteq E and B2EB_{2}\subseteq E. To see this, let B1B_{1} be the intersection of open sets {Un}n\{ U_{n}\} _{{n\in\mathbb{N}}} such that UnEU_{n}\supset E and μ(UnE)<1n\mu(U_{n}\setminus E)<\frac{1}{n}, and let B2B_{2} be the union of compact sets {Kn}n\{ K_{n}\} _{{n\in\mathbb{N}}} such that KnEK_{n}\subset E and μ(EKn)<1n\mu(E\setminus K_{n})<\frac{1}{n}.

A useful way to construct a measure with certain desired properties is to start with an outer measure. For example, Lebesgue measure and Hausdorff measure can both be constructed using outer measures (see also [12] for a construction of Lebesgue measure that doesn’t use outer measure), and the construction in Section 3.4 of a measure associated to a distribution also uses outer measure.

Definition B.10.

A function μ*:P(X)[0,]\mu^{*}:\mathcal{P}(X)\rightarrow[0,\infty] defined on the power set P(X)\mathcal{P}(X) of a space XX is called an outer measure on XX if it satisfies all of the following.

  1. μ*(A)0\mu^{*}(A)\geq 0, μ*()=0\mu^{*}(\emptyset)=0.

  2. μ*(A1)μ*(A2)\mu^{*}(A_{1})\leq\mu^{*}(A_{2}) if A1A2A_{1}\subseteq A_{2}.

  3. μ*(nAn)nμ*(An)\displaystyle{\mu^{*}\left(\bigcup _{{n\in\mathbb{N}}}A_{n}\right)\leq\sum _{{n\in\mathbb{N}}}\mu^{*}(A_{n})} for any countable collection of sets {An}nP(X)\{ A_{n}\} _{{n\in\mathbb{N}}}\subset\mathcal{P}(X).

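The point of studying outer measures is that it is often easy to construct an outer measure with certain desired properties, since one only has to check countable subadditivity rather than countable additivity.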

To get a feel for outer measure we recall here two main examples: Lebesgue outer measure and Hausdorff outer measure.

Example B.11 (Lebesgue outer measure).

The Lebesgue outer measure is defined on compact rectangular subsets

\[ R=[a_{1},b_{1}]\times\cdots\times[a_{n},b_{n}]\subset\mathbb{R}^{n} \]

by

\[ m^{*}(R):=\prod_{j=1}^{n}(b_{j}-a_{j}). \]

For an arbitrary subset $E\subseteq\mathbb{R}^{n}$, consider the collection $\mathcal{K}$ of all countable covers $\{ R_{i}\}_{i\in\mathbb{N}}$ of $E$ by rectangular sets $R_{i}$, and define

\[ m^{*}(E):=\inf_{\mathcal{K}}\sum_{i\in\mathbb{N}}m^{*}(R_{i}). \tag{B.2} \]

It is easy to check that this definition satisfies the conditions of an outer measure (see for example [15, Theorems 3.3 & 3.4]).
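
A quick computation with this definition shows that every countable set has Lebesgue outer measure zero. In the sketch below, the enumeration $\{ x_{i}\}$ and the cubes $R_{i}$ are notation introduced only for this example.

```latex
% Notation for this sketch (not from the notes): E = \{x_1, x_2, \ldots\} \subset \mathbb{R}^n countable,
% and R_i the closed cube centred at x_i with side length (\varepsilon 2^{-i})^{1/n}.
\begin{align*}
m^{*}(R_{i})&=\Bigl((\varepsilon 2^{-i})^{\frac{1}{n}}\Bigr)^{n}=\varepsilon 2^{-i},\\
m^{*}(E)&\leq\sum_{i\in\mathbb{N}}m^{*}(R_{i})
  =\varepsilon\sum_{i=1}^{\infty}2^{-i}=\varepsilon
  \quad\text{for every } \varepsilon>0,\\
\text{hence}\quad m^{*}(E)&=0.
\end{align*}
```

For instance, the set of points in $\mathbb{R}^{n}$ with rational coordinates is dense but has outer measure zero.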

The next theorem is a useful characterisation of the Lebesgue outer measure.

Theorem B.12.

Let EnE\subseteq\mathbb{R}^{n}. Then for each ε>0\varepsilon>0, there exists an open set UnU\subseteq\mathbb{R}^{n} such that EUE\subset U and m*(U)m*(E)+εm^{*}(U)\leq m^{*}(E)+\varepsilon.

In particular, we have

\[ m^{*}(E)=\inf\{ m^{*}(U)\,:\,\text{$U$ open and $E\subset U$}\}. \]

For a proof, see [15, Theorem 3.6].

The Hausdorff outer measure and the associated Hausdorff measure are useful for studying certain subsets of $\mathbb{R}^{n}$ that have Lebesgue measure zero. For example, when $d<n$ the $d$-dimensional Hausdorff measure of a $d$-dimensional ball in $\mathbb{R}^{n}$ is non-trivial, even though its Lebesgue measure is zero. Furthermore, the Hausdorff measure can be used to distinguish fractal sets (sets of fractional Hausdorff dimension), and the study of the properties of Hausdorff measure is a major component of Geometric Measure Theory (see [5], [6], [9]).

Example B.13 (Hausdorff outer measure).

The diameter of a set EnE\subseteq\mathbb{R}^{n} is defined to be

δ(E):=supx,yE|x-y|.\delta(E):=\sup _{{x,y\in E}}|x-y|.

Fix $\alpha>0$ (not necessarily an integer), and let $E\subseteq\mathbb{R}^{n}$. Given $\varepsilon>0$, let $\mathcal{K}_{\varepsilon}$ denote the collection of countable covers $\{ E_{n}\}_{n\in\mathbb{N}}$ of $E$ such that $\delta(E_{n})<\varepsilon$ for each $n\in\mathbb{N}$, and define

Hαε(E)=infKεnδ(En)α.\mathcal{H}_{\alpha}^{\varepsilon}(E)=\inf _{{\mathcal{K}_{\varepsilon}}}\sum _{{n\in\mathbb{N}}}\delta(E_{n})^{\alpha}.

If $\varepsilon_{1}<\varepsilon_{2}$, then $\mathcal{K}_{\varepsilon_{1}}\subset\mathcal{K}_{\varepsilon_{2}}$, and so $\mathcal{H}_{\alpha}^{\varepsilon_{1}}(E)\geq\mathcal{H}_{\alpha}^{\varepsilon_{2}}(E)$. Therefore the limit

\[ \mathcal{H}_{\alpha}(E):=\lim_{\varepsilon\rightarrow 0}\mathcal{H}_{\alpha}^{\varepsilon}(E)\quad\text{exists in $[0,\infty]$}. \]

Hα(E)\mathcal{H}_{\alpha}(E) is called the α\alpha-dimensional Hausdorff outer measure of EE. Again, it is easy to check that this is an outer measure (see for example [15, Theorem 11.12]).
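
To get a feel for how the exponent $\alpha$ enters, consider the unit interval. The following is a sketch in which the set $E=[0,1]\subset\mathbb{R}$ and the particular cover are chosen only for illustration.

```latex
% Illustrative choice (not from the notes): E = [0,1] \subset \mathbb{R}, covered by the N intervals
% E_i = [(i-1)/N, i/N] for 1 \le i \le N, padded with E_i = \{0\} for i > N, where N > 1/\varepsilon.
\begin{align*}
\delta(E_{i})&=\tfrac{1}{N}<\varepsilon\quad(1\leq i\leq N),
  \qquad\delta(E_{i})=0\quad(i>N),\\
\mathcal{H}_{\alpha}^{\varepsilon}([0,1])
  &\leq\sum_{i=1}^{N}\Bigl(\tfrac{1}{N}\Bigr)^{\alpha}=N^{1-\alpha}
  \quad\text{for every } N>\tfrac{1}{\varepsilon},\\
\alpha>1:&\quad N^{1-\alpha}\rightarrow 0\ \text{as}\ N\rightarrow\infty,
  \quad\text{so}\quad\mathcal{H}_{\alpha}([0,1])=0,\\
\alpha=1:&\quad\mathcal{H}_{1}([0,1])\leq 1.
\end{align*}
```

This is consistent with the interval being one-dimensional: its $\alpha$-dimensional Hausdorff outer measure vanishes for every $\alpha>1$.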

The next definition and theorem show that each outer measure has an associated σ\sigma-algebra and that the restriction of the outer measure to this σ\sigma-algebra is a measure.

Definition B.14.

Let μ*\mu^{*} be an outer measure on XX. A subset EXE\subset X is μ*\mu^{*}-measurable if and only if

\[ \mu^{*}(A)=\mu^{*}(A\cap E)+\mu^{*}(A\setminus(A\cap E)) \tag{B.3} \]

for every subset AXA\subseteq X.

Remark B.15.
  1. An equivalent definition is that EE is μ*\mu^{*}-measurable if and only if

    \[ \mu^{*}(A_{1}\cup A_{2})=\mu^{*}(A_{1})+\mu^{*}(A_{2}) \tag{B.4} \]

    whenever $A_{1}\subseteq E$ and $A_{2}\subseteq X\setminus E$. To see that the first definition implies the second, given any sets $A_{1}\subseteq E$ and $A_{2}\subseteq X\setminus E$, let $A=A_{1}\cup A_{2}$. Then $A\cap E=A_{1}$ and $A\setminus(A\cap E)=A_{2}$, so (B.3) implies (B.4). Conversely, given any set $A\subseteq X$, let $A_{1}=A\cap E$ and $A_{2}=A\setminus(A\cap E)$. These satisfy the requirements $A_{1}\subseteq E$ and $A_{2}\subseteq X\setminus E$, and we have $A=A_{1}\cup A_{2}$, so (B.4) implies (B.3).

  2. Both of these definitions have the same basic idea: the μ*\mu^{*}-measurable subsets of XX are those for which μ*\mu^{*} is additive on arbitrary decompositions into disjoint subsets.

The next theorem justifies the use of the term “measurable” in the previous definition.

Theorem B.16 (Caratheodory).

Let μ*\mu^{*} be an outer measure on XX. Then the collection of μ*\mu^{*}-measurable subsets of XX forms a σ\sigma-algebra, and the restriction of μ*\mu^{*} to this σ\sigma-algebra is a measure.

For a proof, see for example [8, Theorem 1.15].
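
The $\sigma$-algebra of $\mu^{*}$-measurable sets can be much smaller than the power set. The following toy example is a sketch in which the two-point space and the outer measure are chosen only for illustration.

```latex
% Illustrative choice (not from the notes): X = \{a,b\}, with
% \mu^*(\emptyset) = 0 and \mu^*(A) = 1 for every non-empty A \subseteq X.
\begin{align*}
\text{Test } E=\{a\} \text{ with } A=X:\quad
  &\mu^{*}(X)=1,\\
  &\mu^{*}(X\cap E)+\mu^{*}(X\setminus(X\cap E))
   =\mu^{*}(\{a\})+\mu^{*}(\{b\})=2\neq 1.
\end{align*}
```

So $\{a\}$ fails condition (B.3), and by symmetry so does $\{b\}$. The $\mu^{*}$-measurable sets are therefore only $\emptyset$ and $X$, and the restriction of $\mu^{*}$ to this trivial $\sigma$-algebra is the measure with $\mu(X)=1$.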

Remark B.17.

This theorem is used in Section 3.4 to construct the measure associated to a positive distribution.

Definition B.18.
  1. The Lebesgue measure, denoted m()m(\cdot), is the measure associated to the Lebesgue outer measure from Example B.11.

  2. The α\alpha-dimensional Hausdorff measure, denoted Hα\mathcal{H}_{\alpha}, is the measure associated to the α\alpha-dimensional Hausdorff outer measure from Example B.13.

Remark B.19.

Open and closed sets are Lebesgue measurable, and therefore the σ\sigma-algebra of Lebesgue measurable sets contains the Borel σ\sigma-algebra.

Using this definition of Lebesgue measure, together with Theorem B.12, we see that Lebesgue measure is Borel outer regular.

Lemma B.20.

For any Lebesgue measurable set EnE\subseteq\mathbb{R}^{n} we have

\[ m(E)=\inf\{ m(U)\,:\,\text{$U$ open and $E\subset U$}\}. \]

The proof follows by restricting the result of Theorem B.12 to the $\sigma$-algebra of Lebesgue-measurable sets. By taking complements, we also see that Lebesgue measure is inner regular.

Lemma B.21.

For any Lebesgue measurable set EnE\subseteq\mathbb{R}^{n} we have

\[ m(E)=\sup\{ m(K)\,:\,\text{$K$ compact and $K\subset E$}\}. \]

This is a consequence of [15, Lemma 3.22], which states that EE is measurable if and only if for all ε>0\varepsilon>0 there exists a closed set FEF\subset E such that m(EF)<εm(E\setminus F)<\varepsilon. The lemma above then follows by taking a sequence of compact sets Kn=FB(0,n)¯K_{n}=F\cap\overline{B(0,n)}.

A natural question arising from Theorem B.3 is whether two measures that agree on the open and closed subsets of $\mathbb{R}^{n}$ also agree on the Borel $\sigma$-algebra (the smallest $\sigma$-algebra containing these subsets). This question is answered in more generality by the Caratheodory-Hahn Extension Theorem, for which we first need the following definitions.

Definition B.22.

An algebra of subsets of XX is a non-empty collection 𝒜\mathcal{A} of subsets of XX that is closed under the operations of taking complements and finite unions.

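Note that, as a consequence, an algebra is also closed under finite intersections, and therefore both $X$ and the empty set are in $\mathcal{A}$. The difference between this definition and that of a $\sigma$-algebra is that a $\sigma$-algebra is also closed under countable unions. For example, the collection of all finite unions of intervals of the form $[a,b)$, $(-\infty,b)$ or $[a,\infty)$ in $\mathbb{R}$, together with the empty set and $\mathbb{R}$ itself, is an algebra but not a $\sigma$-algebra, since the countable union $\bigcup_{n\geq 2}\left[\frac{1}{n},1\right)=(0,1)$ is not of this form.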

Definition B.23.

A measure on an algebra $\mathcal{A}$ is a function $\mu:\mathcal{A}\rightarrow[0,\infty]$ such that $\mu(\emptyset)=0$, and

$$\mu\left(\bigcup_{n\in\mathbb{N}}A_{n}\right)=\sum_{n\in\mathbb{N}}\mu(A_{n})$$

whenever $\{ A_{n}\}$ is a countable collection of disjoint sets in $\mathcal{A}$ whose union also belongs to $\mathcal{A}$.
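
The countable additivity required in Definition B.23 is genuinely stronger than finite additivity, even though $\mathcal{A}$ itself is only closed under finite unions. A small illustration follows; the set function $\mu_{0}$ is not taken from these notes and is introduced only for this example.

```latex
% Illustration (assumed example): a finitely additive set function on an
% algebra that fails the countable additivity of Definition B.23.
% Requires amsmath for the cases environment.
Let $X = \mathbb{N}$ and let $\mathcal{A}$ be the algebra of subsets of
$\mathbb{N}$ that are finite or have finite complement. Define
\[
  \mu_0(A) =
  \begin{cases}
    0 & \text{if $A$ is finite}, \\
    1 & \text{if $\mathbb{N} \setminus A$ is finite}.
  \end{cases}
\]
Then $\mu_0$ is finitely additive, because two disjoint sets in
$\mathcal{A}$ cannot both have finite complement. However, the singletons
$\{k\}$ are disjoint sets in $\mathcal{A}$ whose union $\mathbb{N}$ also
belongs to $\mathcal{A}$, and
\[
  \mu_0\Bigl( \bigcup_{k \in \mathbb{N}} \{k\} \Bigr)
  = \mu_0(\mathbb{N}) = 1
  \neq 0 = \sum_{k \in \mathbb{N}} \mu_0(\{k\}),
\]
so $\mu_0$ is not a measure on $\mathcal{A}$.
```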

Given a measure $\mu$ on an algebra $\mathcal{A}$ of subsets of $X$, we can construct an outer measure $\mu^{*}$ on $X$ as follows. For each subset $A\subset X$, let $\mathcal{C}=\left\{\{ A_{n}\}_{n\in\mathbb{N}}\right\}$ be the collection of countable covers of $A$ by sets in $\mathcal{A}$. Define

$$\mu^{*}(A)=\inf_{\mathcal{C}}\sum_{n\in\mathbb{N}}\mu(A_{n}). \tag{B.5}$$

Theorem B.24.

Let $\mu$ be a measure on an algebra $\mathcal{A}$, and let $\mu^{*}$ be as defined in (B.5). Then

  1. $\mu^{*}$ is an outer measure,

  2. $\mu^{*}(A)=\mu(A)$ for all $A\in\mathcal{A}$, and

  3. $A$ is $\mu^{*}$-measurable for all $A\in\mathcal{A}$.

For a proof, see [15, Theorems 11.18 and 11.19].
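
As an illustration of (B.5), consider the usual construction of Lebesgue measure on $\mathbb{R}$ from lengths of half-open intervals. The algebra and set function below are assumptions chosen for this example, and the countable additivity of length on this algebra is a standard fact that is not proved here.

```latex
% Example (assumed setup): X = R, A = finite disjoint unions of half-open
% intervals, mu = total length. Countable additivity of mu on A is taken
% for granted.
Let $\mathcal{A}$ consist of $\emptyset$ and all finite disjoint unions of
intervals of the form $(a,b]$ with $-\infty \leq a < b < \infty$ or
$(a,\infty)$ with $-\infty \leq a < \infty$, and let $\mu$ be the total
length. For $x \in \mathbb{R}$ and $\varepsilon > 0$ the single set
$(x - \varepsilon, x]$ covers $\{x\}$, so
\[
  \mu^{*}(\{x\}) \leq \mu\bigl( (x - \varepsilon, x] \bigr) = \varepsilon ,
  \qquad \text{hence } \mu^{*}(\{x\}) = 0 .
\]
For a countable set $C = \{x_1, x_2, \ldots\}$, the cover
$\{ (x_k - \varepsilon 2^{-k}, x_k] \}_{k \geq 1}$ gives
\[
  \mu^{*}(C) \leq \sum_{k \geq 1} \varepsilon 2^{-k} = \varepsilon ,
  \qquad \text{hence } \mu^{*}(C) = 0 .
\]
% In particular mu^*(Q) = 0, although Q is not in A. By contrast, part 2 of
% Theorem B.24 gives mu^*((a,b]) = b - a for the sets already in A.
```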

Definition B.25.

Let $\mu$ be a measure on an algebra $\mathcal{A}$. If $\tilde{\mu}$ is a measure on a $\sigma$-algebra $\Sigma$ containing $\mathcal{A}$, and $\tilde{\mu}(A)=\mu(A)$ for all $A\in\mathcal{A}$, then we say that $\tilde{\mu}$ is an extension of the measure $\mu$ to the $\sigma$-algebra $\Sigma$.

Theorem B.16 shows that the outer measure $\mu^{*}$ defined in (B.5) restricts to a measure on a $\sigma$-algebra $\mathcal{A}^{*}$, which contains $\mathcal{A}$ by part 3 of Theorem B.24. The next theorem shows that, when $\mu$ is $\sigma$-finite, this is the unique extension of $\mu$ to any $\sigma$-algebra lying between $\mathcal{A}$ and $\mathcal{A}^{*}$.

Theorem B.26 (Caratheodory-Hahn Extension Theorem).

Let $\mu$ be a measure on an algebra $\mathcal{A}$, let $\mu^{*}$ be the corresponding outer measure, and let $\mathcal{A}^{*}$ be the $\sigma$-algebra of $\mu^{*}$-measurable sets. Then the restriction of $\mu^{*}$ to $\mathcal{A}^{*}$ is an extension of $\mu$. Moreover, if $\mu$ is $\sigma$-finite with respect to $\mathcal{A}$, and if $\Sigma$ is any $\sigma$-algebra with $\mathcal{A}\subseteq\Sigma\subseteq\mathcal{A}^{*}$, then $\mu^{*}$ is the only measure on $\Sigma$ that is an extension of $\mu$.

For a proof see [15, Theorem 11.20].
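
The $\sigma$-finiteness hypothesis in the uniqueness statement cannot simply be omitted. The following standard counterexample is not taken from these notes and is only a sketch; the symbols $\mu_{1}$ and $\mu_{2}$ are introduced for this illustration.

```latex
% Counterexample sketch (assumed example): non-uniqueness of the extension
% when mu is not sigma-finite.
Let $X = \mathbb{Q}$, and let $\mathcal{A}$ consist of $\emptyset$ and all
finite unions of sets $(a,b] \cap \mathbb{Q}$ with
$-\infty \leq a < b \leq \infty$. Define $\mu(\emptyset) = 0$ and
$\mu(A) = \infty$ for every non-empty $A \in \mathcal{A}$. This is a
measure on $\mathcal{A}$, and it is not $\sigma$-finite since only
$\emptyset$ has finite measure.

Every singleton is a countable intersection of sets in $\mathcal{A}$,
\[
  \{q\} = \bigcap_{k \geq 1} \bigl( (q - \tfrac{1}{k}, q] \cap \mathbb{Q} \bigr),
\]
so the $\sigma$-algebra $\Sigma$ generated by $\mathcal{A}$ is the
collection of all subsets of $\mathbb{Q}$. Let $\mu_1$ be counting measure
on $\Sigma$ and let $\mu_2 = 2 \mu_1$. Every non-empty set in $\mathcal{A}$
is infinite, so $\mu_1$, $\mu_2$ and $\mu^{*}$ all agree with $\mu$ on
$\mathcal{A}$, but
\[
  \mu_1(\{q\}) = 1, \qquad \mu_2(\{q\}) = 2, \qquad \mu^{*}(\{q\}) = \infty .
\]
% Three different extensions of mu to the same sigma-algebra.
```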

Sobolev spaces are defined in terms of distributions, and in many of the examples from Sections 2 and 3 we consider distributions $T\in\mathcal{D}(\Omega)^{*}$ that are represented by a function $f\in L_{loc}^{1}(\Omega)$, i.e. $T=T_{f}$ where

$$T_{f}(\phi):=\int_{\Omega}f\phi\, dx.$$

Many distributions cannot be represented by a function, for example the delta functional from Example 2.7. The main theorem of Section 3.4 shows that instead of using functions to represent distributions, the right class of objects to look at is the class of regular Borel measures (see Theorem 3.35). A natural question is when a measure can be represented by a function and, when it cannot, how this failure can be expressed in terms of properties of the measure. This is the content of the Lebesgue decomposition and the Radon-Nikodym theorem.

Definition B.27.

Let $\mu$ and $\nu$ be measures on the same $\sigma$-algebra $\Sigma$ on a space $X$. The measure $\nu$ is absolutely continuous with respect to $\mu$ if $\nu(E)=0$ for every set $E\in\Sigma$ with $\mu(E)=0$. The measure $\nu$ is singular with respect to $\mu$ if there is a set $Z\in\Sigma$ with $\mu(Z)=0$, and $\nu(E)=0$ for every $E\in\Sigma$ such that $E\subseteq X\setminus Z$.

In other words, if sets of $\mu$-measure zero are also sets of $\nu$-measure zero, then $\nu$ is absolutely continuous with respect to $\mu$. If $\nu$ is supported on a set of $\mu$-measure zero then it is singular with respect to $\mu$.
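
Two standard examples on $X=\mathbb{R}$, with $\mu=m$ the Lebesgue measure on the Lebesgue measurable sets, may help to fix the two definitions. The measures $\nu_{1}$ and $\delta_{0}$ below are chosen purely for illustration.

```latex
% Illustration (assumed examples): absolute continuity and singularity
% with respect to Lebesgue measure m on R. Requires amsmath.
\[
  \nu_1(E) = m(E \cap [0,1]), \qquad
  \delta_0(E) =
  \begin{cases}
    1 & \text{if } 0 \in E, \\
    0 & \text{if } 0 \notin E.
  \end{cases}
\]
If $m(E) = 0$ then $\nu_1(E) \leq m(E) = 0$, so $\nu_1$ is absolutely
continuous with respect to $m$. For $\delta_0$ take $Z = \{0\}$: then
$m(Z) = 0$ and $\delta_0(E) = 0$ whenever $E \subseteq \mathbb{R} \setminus Z$,
so $\delta_0$ is singular with respect to $m$.

The sum $\nu_1 + \delta_0$ is neither. It is not absolutely continuous,
since $(\nu_1 + \delta_0)(\{0\}) = 1$ while $m(\{0\}) = 0$. It is not
singular, since for any $Z$ with $m(Z) = 0$ the set $E = [0,1] \setminus Z$
lies in $\mathbb{R} \setminus Z$ and has $\nu_1(E) = 1$.
```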

Theorem B.28 (Radon-Nikodym theorem).

Let $(X,\Sigma,\mu)$ be a $\sigma$-finite measure space, and let $\alpha$ be a finite measure on $\Sigma$ that is absolutely continuous with respect to $\mu$. Then there exists $f\in L^{1}(X,d\mu)$ such that

$$\alpha(E)=\int_{E}f\, d\mu \tag{B.6}$$

for each $E\in\Sigma$.
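
In the simplest cases the function $f$ in (B.6) can be written down directly. The following sketch takes $X=\mathbb{N}=\{1,2,\ldots\}$ with counting measure, an example chosen here for illustration.

```latex
% Illustration (assumed example): the function f of (B.6) for counting
% measure on N = {1, 2, ...}.
Let $\Sigma$ be the collection of all subsets of $\mathbb{N}$, let $\mu$ be
counting measure (which is $\sigma$-finite, since $\mathbb{N}$ is a
countable union of singletons), and let
\[
  \alpha(E) = \sum_{k \in E} 2^{-k} .
\]
The only set of $\mu$-measure zero is $\emptyset$, so $\alpha$ is
absolutely continuous with respect to $\mu$. Integration against $\mu$ is
summation, so $f(k) = 2^{-k}$ satisfies
\[
  \int_E f \, d\mu = \sum_{k \in E} f(k) = \alpha(E),
  \qquad
  \| f \|_{L^1(X, d\mu)} = \sum_{k \geq 1} 2^{-k} = 1 = \alpha(\mathbb{N}) .
\]
```

Here $f(k)=\alpha(\{k\})/\mu(\{k\})$, which is the discrete analogue of a derivative of $\alpha$ with respect to $\mu$.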

Theorem B.29.

Let $(X,\Sigma,\mu)$ be a measure space, and let $\sigma$ be a measure on $\Sigma$ that is singular with respect to $\mu$. Then there exists a set $Z$ with $\mu(Z)=0$, and

$$\sigma(E)=\sigma(E\cap Z)$$

for each $E\in\Sigma$.

Theorem B.30 (Lebesgue Decomposition).

Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\Sigma$, and let $\nu$ be a finite measure on $\Sigma$. Then there is a unique decomposition

$$\nu=\alpha+\sigma,$$

where $\alpha$ and $\sigma$ are measures on $\Sigma$ such that $\alpha$ is absolutely continuous with respect to $\mu$, and $\sigma$ is singular with respect to $\mu$.

See [12] or [15] for different proofs of the above statements. Note that Rudin in [12] considers the more general case of a complex measure.
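
On a finite set the decomposition can be computed by hand. The following sketch uses a three-point space with weights chosen purely for illustration; none of the numbers come from the references.

```latex
% Illustration (assumed example): Lebesgue decomposition and the function f
% of (B.6) on the three-point space X = {1, 2, 3}.
Let $X = \{1,2,3\}$ with $\Sigma$ the collection of all subsets, so that a
measure is determined by its values on the three points:
\[
  \begin{array}{c|ccc}
            & \{1\} & \{2\} & \{3\} \\ \hline
    \mu     & 1 & 2 & 0 \\
    \nu     & 3 & 0 & 5 \\
    \alpha  & 3 & 0 & 0 \\
    \sigma  & 0 & 0 & 5
  \end{array}
\]
Then $\nu = \alpha + \sigma$. The sets of $\mu$-measure zero are $\emptyset$
and $\{3\}$, and $\alpha$ vanishes on both, so $\alpha$ is absolutely
continuous with respect to $\mu$. With $Z = \{3\}$ we have $\mu(Z) = 0$ and
$\sigma(E) = \sigma(E \cap Z)$ for every $E$, so $\sigma$ is singular with
respect to $\mu$. Finally,
\[
  \alpha(E) = \sum_{k \in E} f(k) \, \mu(\{k\}),
  \qquad f(1) = 3, \quad f(2) = 0,
\]
with $f(3)$ arbitrary, since $\{3\}$ has $\mu$-measure zero.
```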

The following simple example shows that the $\sigma$-finiteness hypothesis cannot be dropped from the Radon-Nikodym theorem. Another example using the counting measure is described in [12, pp. 123-124].

Example B.31.

Let $\Sigma=\{\emptyset,X\}$ be the trivial $\sigma$-algebra on a non-empty set $X$, and let $\mu$ and $\nu$ be the measures on $\Sigma$ with $\mu(\emptyset)=0$, $\mu(X)=\infty$, $\nu(\emptyset)=0$, and $\nu(X)=1$. Note that $\nu$ is absolutely continuous with respect to $\mu$, and that $\mu$ is not a $\sigma$-finite measure on $X$. The $\Sigma$-measurable functions $f:X\rightarrow\mathbb{C}$ are the constants (since $f$ measurable implies that $f^{-1}(U)\in\Sigma$ for all open sets $U\subseteq\mathbb{C}$), and so $\displaystyle{\int_{X}|f|\, d\mu=\infty}$ for any non-zero measurable function, while $\displaystyle{\int_{X}f\, d\mu=0}$ for $f=0$. Since $\nu(X)=1$, there cannot exist any measurable function $f$ such that $\displaystyle{\int_{X}f\, d\mu=\nu(X)}$, and therefore the conclusion of the Radon-Nikodym theorem does not hold in this case.

The next lemma is a consequence of the well-known Vitali covering lemma.

Lemma B.32.

Let $\Omega\subseteq\mathbb{R}^{n}$ be an open set. Then for all $\delta>0$ there exists a countable collection $\{ B_{n}\}_{n\in\mathbb{N}}$ of disjoint closed balls in $\Omega$ such that

  1. $\operatorname{diam} B_{n}\leq\delta$ for all $n\in\mathbb{N}$, and

  2. $\displaystyle{\Omega\setminus\bigcup_{n\in\mathbb{N}}B_{n}}$ has Lebesgue measure zero.

See [5, Corollary 2, p. 28] for a proof.

References


  • 1
    Robert A. Adams.
    Sobolev spaces.
    Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1975.
    Pure and Applied Mathematics, Vol. 65.
  • 2
    S. Banach.
    Theory of linear operations, volume 38 of North-Holland Mathematical Library.
    North-Holland Publishing Co., Amsterdam, 1987.
    Translated from the French by F. Jellett, With comments by A. Pełczyński and Cz. Bessaga.
  • 3
    Lawrence Conlon.
    Differentiable manifolds: a first course.
    Birkhäuser Advanced Texts: Basler Lehrbücher. [Birkhäuser Advanced Texts: Basel Textbooks]. Birkhäuser Boston Inc., Boston, MA, 1993.
  • 4
    Lawrence C. Evans.
    Partial differential equations, volume 19 of Graduate Studies in Mathematics.
    American Mathematical Society, Providence, RI, 1998.
  • 5
    Lawrence C. Evans and Ronald F. Gariepy.
    Measure theory and fine properties of functions.
    Studies in Advanced Mathematics. CRC Press, Boca Raton, FL, 1992.
  • 6
    Herbert Federer.
    Geometric measure theory.
    Die Grundlehren der mathematischen Wissenschaften, Band 153. Springer-Verlag New York Inc., New York, 1969.
  • 7
    Paul R. Halmos.
    Measure Theory.
    D. Van Nostrand Company, Inc., New York, N. Y., 1950.
  • 8
    Elliott H. Lieb and Michael Loss.
    Analysis, volume 14 of Graduate Studies in Mathematics.
    American Mathematical Society, Providence, RI, second edition, 2001.
  • 9
    Frank Morgan.
    Geometric measure theory.
    Academic Press Inc., San Diego, CA, third edition, 2000.
    A beginner’s guide.
  • 10
    John C. Oxtoby.
    Measure and category, volume 2 of Graduate Texts in Mathematics.
    Springer-Verlag, New York, second edition, 1980.
    A survey of the analogies between topological and measure spaces.
  • 11
    Michael Reed and Barry Simon.
    Methods of modern mathematical physics. I.
    Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, second edition, 1980.
    Functional analysis.
  • 12
    Walter Rudin.
    Real and complex analysis.
    McGraw-Hill Book Co., New York, third edition, 1987.
  • 13
    Walter Rudin.
    Functional analysis.
    International Series in Pure and Applied Mathematics. McGraw-Hill Inc., New York, second edition, 1991.
  • 14
    François Trèves.
    Topological vector spaces, distributions and kernels.
    Academic Press, New York, 1967.
  • 15
    Richard L. Wheeden and Antoni Zygmund.
    Measure and integral.
    Marcel Dekker Inc., New York, 1977.
    An introduction to real analysis, Pure and Applied Mathematics, Vol. 43.
  • 16
    William P. Ziemer.
    Weakly differentiable functions, volume 120 of Graduate Texts in Mathematics.
    Springer-Verlag, New York, 1989.
    Sobolev spaces and functions of bounded variation.