
Distribution Theory in a Nutshell

Introduction

As shown in the Laurent/Dirac post, the problematic Dirac function clearly works when modeling physical systems, but it lacks mathematical rigor. Schwartz’s distribution theory provides that rigor. A high-level overview is given below, though truly understanding it requires a background in functional analysis. Schwartz was awarded the Fields Medal in 1950, not just for fixing Dirac’s oddity, but for moving the needle of mathematical analysis with his novel framework. This post seeks to highlight some of its features.

It’s worth observing that in the past few decades, physicists and engineers have implicitly acknowledged his work by changing the wording from “Dirac functions” to “Dirac distributions” in most textbooks.

Terminology: Functions, Test Functions and Dual Spaces

A functional $L$ is a mapping from a linear space (vector space) to its field of scalars, $L: V \rightarrow \mathbb{R}$ (or $\mathbb{C}$). The domain of a functional consists of functions.

For example, consider the linear space $L^1$ (the space of integrable functions) on a prescribed interval. One such functional maps each function to the area under its curve on that interval: the definite integral returns a scalar value.
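Concretely, taking the interval to be $[0,1]$:

$$\begin{eqnarray}
Lf = \int_0^1 f(x)\,dx, \qquad L(x^2) = \int_0^1 x^2\,dx = \frac{1}{3}
\end{eqnarray}$$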

Another example (written in operator form) is an integral of the form $Lf = \int_{\Sigma}K(s)f(s)\,ds$. The factor $K(s)$ in the integrand is called the kernel of this functional. Again, this integral maps a function $f$ to a scalar, i.e., a value in $\mathbb{R}$ or $\mathbb{C}$. This type of integral will appear again when studying Green’s functions.
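For example (purely as an illustration), evaluating the Laplace transform of $f$ at a fixed point $s_0$ fits this form, with $\Sigma = [0,\infty)$ and kernel $K(t) = e^{-s_0 t}$:

$$\begin{eqnarray}
Lf = \int_0^{\infty} e^{-s_0 t} f(t)\,dt
\end{eqnarray}$$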

A test function, an element of $C_0^{\infty}(\mathbb{R})$, is an infinitely differentiable (“smooth”) function that is zero everywhere outside some finite closed interval. More technically, it is smooth with compact support.

A canonical example is the bump function, $\rho(x)$:

$$\begin{eqnarray}
\rho(x)=\begin{cases} e^{-1/(1-x^2)} & \text{if } |x|<1 \\ 0 & \text{if } |x|\ge 1 \end{cases}
\end{eqnarray}$$
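As a quick numerical sketch (plain Python; the function name `rho` mirrors the notation above and is only for illustration), evaluating $\rho(x)$ shows the compact support directly:

```python
import math

def rho(x):
    """Bump function: smooth everywhere, identically zero outside (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

# rho peaks at x = 0 (value e^{-1}) and vanishes for |x| >= 1.
for x in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    print(f"rho({x:+.1f}) = {rho(x):.4f}")
```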

The space of all such test functions is denoted $D(\mathbb{R})$. It can be shown that this forms a linear (or vector) space, and it is equipped with a topology in which convergence is strong: a sequence converges only if all its members are supported in a common compact set and every derivative converges uniformly.

The dual space of this linear space, denoted $D'(\mathbb{R})$, is simply the set of all continuous linear functionals that map its elements to a scalar in $\mathbb{R}$ (or $\mathbb{C}$; any of the scalar fields above may be complex).

The Test Function Sequence (Mollifiers)

As mentioned above, this space $D(\mathbb{R})$ has an induced topology in which sequences of test functions converge: $\psi_n \rightarrow \psi$.

To illustrate, define a sequence of test functions $\{\psi_n(x)\}_{n \ge 1}$, where each function is a rescaled version of the bump function $\rho(x)$, normalized to integrate to unity ($1$) over $\mathbb{R}$. Such normalized test functions are called mollifiers. The indexed sequence $\psi_n(x)$ is then:

$$\begin{eqnarray}
\psi_n(x) = C_n \rho(nx) = C_n \begin{cases} e^{-1/(1-(nx)^2)} & \text{if } |x|<1/n \\ 0 & \text{if } |x|\ge 1/n \end{cases}
\end{eqnarray}$$

Here, $C_n$ is a normalization constant defined such that the integral of each function is unity:

$$\begin{eqnarray}
C_n = n \left( \int_{-1}^1 \rho(t) dt \right)^{-1}
\end{eqnarray}$$
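A small numerical check (a sketch using SciPy’s `quad`; the variable names are only for this example) confirms that each $\psi_n$ integrates to unity while its peak grows with $n$:

```python
import math
from scipy.integrate import quad

def rho(x):
    """Bump function, supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

# rho_mass is the integral of rho over [-1, 1]; C_n = n / rho_mass.
rho_mass, _ = quad(rho, -1, 1)

def psi(n, x):
    """Mollifier: rescaled bump supported on [-1/n, 1/n] with unit integral."""
    return (n / rho_mass) * rho(n * x)

for n in (1, 5, 25):
    total, _ = quad(lambda x: psi(n, x), -1.0 / n, 1.0 / n)
    print(n, round(total, 6), round(psi(n, 0.0), 3))  # integral stays 1, peak grows ~ 0.83 * n
```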

Notice above how, as $n$ gets larger, the width of $\psi_n(x)$ shrinks while its peak value grows. It should be plausible that this sequence converges (in the distributional sense made precise below) to Dirac’s $\delta(x)$. This sequence is, of course, not unique.

Distributions ($D'(\mathbb{R})$)

With all of this in place, a distribution $T$ is simply an element of $D'(\mathbb{R})$, the dual space of the test function space $D(\mathbb{R})$.

That is, a distribution is a continuous linear functional (i.e., if $\varphi_n \rightarrow \varphi$ in $D(\mathbb{R})$ then $T(\varphi_n) \rightarrow T(\varphi)$) that maps test functions to scalars. The action of $T$ on $\varphi$ is typically written $\langle T, \varphi \rangle$, and distributions are also referred to as generalized functions.

There are two classes of distributions, regular and singular.

A regular distribution is one that can be defined as an inner product of a locally integrable function $f(x)$ with a test function $\varphi \in C_0^{\infty}(\mathbb{R})$:

$$\begin{eqnarray}
\langle T_f, \varphi \rangle = \int_{\mathbb{R}} f(x) \varphi(x) dx
\end{eqnarray}$$

Notation varies here; this is also written $f[\varphi]$ to indicate that $f$ is acting on a test function. Either way, the integral maps a function to a scalar, much like an inner product of two functions. And, consistent with the view that function values at a single point are nonsense, a regular distribution is a kind of weighted average of $f$ over a range whose size is simply the support of the test function.
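A simple example: the Heaviside step function $H(x)$ (equal to $0$ for $x<0$ and $1$ for $x\ge 0$) is locally integrable, so it defines the regular distribution

$$\begin{eqnarray}
\langle T_H, \varphi \rangle = \int_{\mathbb{R}} H(x)\varphi(x)\,dx = \int_0^{\infty} \varphi(x)\,dx
\end{eqnarray}$$

This example will reappear below when weak derivatives are introduced.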

A singular distribution is any distribution that’s not regular.

Singular Distributions and Dirac’s Delta

Suppose we define a functional that simply implements the sifting property. It satisfies the conditions of a distribution: it maps test functions to scalars, linearly and continuously.

$$\begin{eqnarray}\langle T_{\delta_{x_0}}, \varphi \rangle = \langle\delta_{x_0}, \varphi\rangle = \varphi(x_0)\end{eqnarray}$$

As shown earlier, there is no conventional function $\delta(x)$ for which this holds. But it is a perfectly valid distribution, a singular one in fact: it delivers everything Dirac required of the delta while satisfying the definition of a distribution.
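A numerical sanity check (a sketch using SciPy’s `quad`; the widened bump used as the test function is just a convenient choice for this example) shows the mollifier pairings from the previous section converging to $\varphi(x_0)$, exactly the value the delta distribution assigns:

```python
import math
from scipy.integrate import quad

def rho(x):
    """Bump function, supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

rho_mass, _ = quad(rho, -1, 1)

def psi(n, x):
    """Mollifier: unit-mass bump of width 2/n."""
    return (n / rho_mass) * rho(n * x)

phi = lambda x: rho(x / 2.0)   # a test function supported on [-2, 2]
x0 = 0.0
for n in (1, 10, 100):
    val, _ = quad(lambda x: psi(n, x - x0) * phi(x), x0 - 1.0 / n, x0 + 1.0 / n)
    print(n, round(val, 6))    # -> approaches phi(x0) = rho(0) = e^{-1} ~ 0.3679
```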

One detail remains. The definition above provides the sifting property for test functions, but what about for all continuous functions, as Dirac’s sifting property requires?

$$\begin{eqnarray}\int_{\mathbb{R}} \delta(x)f(x)\,dx = f(0)\end{eqnarray}$$

Due to the continuity of the distribution and the density of test functions within the space of continuous functions with compact support, this property can be rigorously extended to any continuous function, $f(x)$. Details not shown.

The unit-integral requirement for the delta, $\int\delta(x)\,dx = 1$, is justified similarly. Basically, the theory matches the intuition.

Differentiation (Weak Derivatives)

Schwartz also details how distributions can be operated on through differentiation and integration, à la Dirac’s theory. As shown below, this is accomplished by using integration by parts to shift the derivative onto the test function $\varphi(x)$. Start with a regular distribution built from a differentiable function $f$:

$$\begin{eqnarray}
\langle {d f \over dx}, \varphi \rangle &=& \int_{\mathbb{R}} f'(x) \varphi(x)\, dx \\
&=& \left. f(x)\varphi(x) \right|^{x=\infty}_{x=-\infty} - \int_{\mathbb{R}} f(x) \varphi'(x)\, dx \\
&=& -\int_{\mathbb{R}} f(x) \varphi'(x)\, dx
\end{eqnarray}$$

The boundary term disappears because the test function vanishes outside a compact set! What remains is taken as the definition of the derivative of any distribution $T$:

$$\begin{eqnarray}
\langle {d T \over dx}, \varphi \rangle = - \langle T, {d \varphi \over dx} \rangle
\end{eqnarray}$$

Iterating on this, the following pattern emerges for higher derivatives:

$$\begin{eqnarray}
\langle D^k T , \varphi \rangle = (-1)^{|k|} \langle T, D^k \varphi \rangle
\end{eqnarray}$$

This integral-based definition of a derivative is known as a weak derivative.
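A classic worked example ties the last few sections together: the weak derivative of the Heaviside step function $H(x)$, viewed as the regular distribution $T_H$ introduced earlier, is the delta distribution, since for any test function $\varphi$

$$\begin{eqnarray}
\langle {d T_H \over dx}, \varphi \rangle = - \langle T_H, {d \varphi \over dx} \rangle = -\int_0^{\infty} \varphi'(x)\,dx = \varphi(0) = \langle \delta, \varphi \rangle
\end{eqnarray}$$

A regular distribution differentiates into a singular one.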

And with that, our friend the delta function acquired a mathematically precise meaning and a new moniker: the generalized function known as the delta distribution.

Applications

We’ll start with a simple problem, $f(x)g(x)=0$, which is in the domain of algebra, not calculus. Suppose the roots of $g(x)$ are known. At these roots $x_0, x_1, \ldots$ the value of $f(x)$ can be non-zero, whereas everywhere else $f(x)$ must be zero. Sound familiar? One way to cast this problem, then, is with a delta distribution:

$f(x) = C\delta(x-x_0)$, where $f(x)$ is only “active” at $x_0$.
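For instance, take $g(x)=x$, with a single root at $x_0=0$. Then $f(x)=C\,\delta(x)$ solves $f(x)g(x)=0$ in the distributional sense, because multiplying $\delta$ by the smooth function $x$ gives the zero distribution:

$$\begin{eqnarray}
\langle x\,\delta, \varphi \rangle = \langle \delta, x\varphi \rangle = \left. x\varphi(x)\right|_{x=0} = 0
\end{eqnarray}$$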

The fundamental solution of Laplace’s equation is another motivating case: $L \Phi(x|\xi) = \delta(x-\xi)$, where $L$ is the Laplacian and the delta source is centered at $\xi$.

This fundamental solution, $\Phi(x|\xi)$, is the most basic form of what is known as a Green’s function, which serves as the kernel of the inverse differential operator. Lots more on that here and here.
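For concreteness (sign conventions vary between references), in three dimensions this fundamental solution is the familiar $1/r$ potential:

$$\begin{eqnarray}
\nabla^2 \Phi(x|\xi) = \delta(x-\xi), \qquad \Phi(x|\xi) = -\frac{1}{4\pi\,|x-\xi|}
\end{eqnarray}$$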

You Can’t Multiply

A notable problem with distributions: there is no general multiplication operation, a limitation known as Schwartz’s impossibility theorem. In particular, the product of a distribution with itself (and hence anything like a norm built from it) may not be defined, so the space of distributions $D'(\mathbb{R})$ is not a Hilbert space.

But there are workarounds. One is called a Colombeau algebra, which allows for multiplication of certain generalized functions. There’s also this thing called the Rigged Hilbert Space. I’m sure each has its merits.

Highly recommended: Robert Strichartz’s A Guide to Distribution Theory and Fourier Transforms

Some examples are courtesy of the appendix of Waves and Structures in Nonlinear Nondispersive Media, available here.

 
