## The Dirac equation (2): heuristic derivation and basic properties of the Klein-Gordon equation

For more than a century now, we have known that light exhibits some properties typical of particles (for example, in the photoelectric effect, explained in 1905 by Albert Einstein), and that matter can behave as waves (this was first conjectured by Louis de Broglie in 1924, and then observed, for example, in electron diffraction experiments). This phenomenon is known as wave-particle duality, and it suggests that the equations of motion for matter should at least resemble Maxwell’s equations for light. This is the starting point for our first approximation to relativistic quantum dynamics.

## 3. The Klein-Gordon equation

We agree that the motion of a particle should be described in terms of the wave function $\psi(t, \vec{x})$, possibly taking vector values. We are looking for something Lorentz-invariant, so a good guess is

$\partial_t^2 \psi - \Delta \psi + c_1 \psi + c_2 = 0. \hspace{\stretch{1}} (3.1)$

In principle, the coefficients $c_1$, $c_2$ may depend both on time and position. However, we first consider a free particle, that is, a particle which does not interact with any external force. The equation is therefore expected to be isotropic (invariant under translations and rotations of space), autonomous (invariant under translations of time) and homogeneous (invariant under multiplication of the unknown function $\psi$ by constants) — just as free Maxwell’s equations, that is, Maxwell’s equations with no electric charges and currents. (Homogeneity is not obvious at all here: one cannot ‘multiply’ particles by fractions, and we also have Pauli’s exclusion principle. On the other hand, light is quantized too, but this fact cannot be seen directly from Maxwell’s equations. Hence, a similar behavior is acceptable, or perhaps even expected, in our first approximation to quantum mechanics.)

The above considerations suggest that in equation (3.1) we should take $c_2 = 0$, and $c_1$ should be a constant. Furthermore, $c_1$ should be nonnegative: otherwise, if at some initial time $t_0$ the wave function $\psi$ were constant (this idea can be localized, but consider a global constant here), it would either grow or decay exponentially with time, violating any reasonable energy conservation principle. This leads to the Klein-Gordon equation:

$\partial_t^2 \psi - \Delta \psi + m^2 \psi = 0. \hspace{\stretch{1}} (3.2)$

Here $\psi(t, \vec{x})$ is a square-integrable function of $\vec{x}$ for each $t$, or a vector of such functions. Although in principle some regularity of $\psi$ is required for the derivatives of $\psi$ to be well-defined, we will see later in this post that (3.2) also makes sense in a more general context.

The choice of the constant $m^2$ is completely arbitrary here. However, it turns out that $m$ in (3.2) corresponds to the rest mass of a particle. If $m = 0$, we recover the potential formulation of free Maxwell’s equations (2.9′), describing the motion of massless photons; mathematically, this is just the classical wave equation. For general $m \ge 0$, a plane wave $\psi(t, \vec{x}) = \exp(\mathrm{i} \omega t - \mathrm{i} \vec{k} \cdot \vec{x})$ (with wave vector $\vec{k}$ and angular frequency $\omega$; this function is clearly not square-integrable, but it is interesting as a basic building block of the Fourier transform) satisfies (3.2) if and only if $\omega^2 - k^2 = m^2$. This is very similar to the relativistic relation between mass, momentum and energy:

$E^2 - p^2 = m^2, \hspace{\stretch{1}} (3.3)$

better known in its formulation in non-natural units:

$E^2 - c^2 p^2 = (m c^2)^2 . \hspace{\stretch{1}} (3.3')$

We will come back to this relation in the next post.
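The dispersion relation $\omega^2 - k^2 = m^2$ can be checked numerically. The following is only a minimal sketch: the function name `kg_residual`, the sample point and the step size `h` are our own choices, and the derivatives in (3.2) are replaced by central finite differences, so the residual is small rather than exactly zero.

```python
import cmath

def kg_residual(omega, k, m, t=0.3, x=(0.1, -0.2, 0.4), h=1e-3):
    """Finite-difference residual of (3.2) for psi = exp(i*omega*t - i*k.x)."""
    def psi(t_, x_):
        phase = omega * t_ - sum(ki * xi for ki, xi in zip(k, x_))
        return cmath.exp(1j * phase)

    centre = psi(t, x)
    # Second time derivative by central differences.
    d2t = (psi(t + h, x) - 2 * centre + psi(t - h, x)) / h**2
    # Laplacian: sum of second central differences along each axis.
    lap = 0j
    for axis in range(3):
        xp = list(x)
        xm = list(x)
        xp[axis] += h
        xm[axis] -= h
        lap += (psi(t, xp) - 2 * centre + psi(t, xm)) / h**2
    return d2t - lap + m**2 * centre

k = (1.0, 2.0, 2.0)   # k^2 = 9
m = 4.0               # m^2 = 16, so omega = 5 satisfies omega^2 - k^2 = m^2
print(abs(kg_residual(5.0, k, m)))   # small: only discretization error remains
print(abs(kg_residual(4.0, k, m)))   # not small: the relation is violated
```

With $\omega = 5$ the residual is of the order of the discretization error, while for $\omega = 4$ it is close to $|{-16} + 9 + 16| = 9$.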

The Klein-Gordon equation (3.2) is fundamental for the relativistic quantum theory. It is believed that every relativistic quantum model describing a system without external interactions (a free system) is, in a sense, a special case of the Klein-Gordon equation; in particular, every solution of the potential formulation of free Maxwell’s equations satisfies (3.2), and the same is true for the solutions of the free Dirac equation, which will be introduced in the next post. For this reason, we briefly discuss some basic features of the Klein-Gordon equation.

In the Fourier (or momentum) space, (3.2) reads

$\partial_t^2 \mathcal{F} \psi(t, \vec{p}) + (p^2 + m^2) \mathcal{F} \psi(t, \vec{p}) = 0.$

For a fixed $\vec{p}$, this is an ordinary differential equation in time. The solution is given by a linear combination of two functions, $\exp(\pm \mathrm{i} t \sqrt{p^2 + m^2})$. Hence, the general solution of the Klein-Gordon equation (3.2) is given by

$\psi(t, \vec{x}) = \psi_+(t, \vec{x}) + \psi_-(t, \vec{x}) , \hspace{\stretch{1}} (3.4)$

where

$\mathcal{F} \psi_\pm(t, \vec{p}) = e^{\pm\mathrm{i} t \sqrt{p^2 + m^2}} \mathcal{F} \psi_\pm(0, \vec{p}) . \hspace{\stretch{1}} (3.5)$

The functions $\psi_+$ and $\psi_-$ will correspond to the spaces of positive and negative energies (and negative energy here is related to anti-matter). Observe that each of the evolution equations for $\psi_+$ and $\psi_-$ is the time-reversal of the other one. This gives the charge-time symmetry of quantum systems, our first approximation to the fundamental charge-parity-time (CPT) symmetry of physical laws. (Parity here corresponds to the orientation of space; for example, reflections are parity transformations.)

The components $\psi_\pm$ can be easily found from the initial conditions

$\displaystyle \begin{aligned} \mathcal{F} \psi(0, \vec{p}) & = \mathcal{F} \psi_+(0, \vec{p}) + \mathcal{F} \psi_-(0, \vec{p}) , \\ \partial_t \mathcal{F} \psi(0, \vec{p}) & = \mathrm{i} \sqrt{p^2 + m^2} (\mathcal{F} \psi_+(0, \vec{p}) - \mathcal{F} \psi_-(0, \vec{p})). \end{aligned}$

Indeed, we have

$\displaystyle \mathcal{F} \psi_{\pm}(0, \vec{p}) = \frac{1}{2} \left( \mathcal{F} \psi(0, \vec{p}) \pm \frac{1}{\mathrm{i} \sqrt{p^2 + m^2}} \, \partial_t \mathcal{F} \psi(0, \vec{p}) \right) . \hspace{\stretch{1}} (3.6)$
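For a single fixed $\vec{p}$, formulas (3.4), (3.5) and (3.6) involve only complex numbers, so they are easy to test. The sketch below uses hypothetical names (`split_initial_data`, `evolve`, `evolve_cos_sin`); `a` and `b` stand for $\mathcal{F}\psi(0, \vec{p})$ and $\partial_t \mathcal{F}\psi(0, \vec{p})$, and we assume $p^2 + m^2 > 0$ so that the division is legitimate.

```python
import cmath
import math

def split_initial_data(a, b, p, m):
    """Formula (3.6) at one fixed p: a = F psi(0, p), b = d/dt F psi(0, p)."""
    energy = math.sqrt(p * p + m * m)   # assumed to be positive
    plus = 0.5 * (a + b / (1j * energy))
    minus = 0.5 * (a - b / (1j * energy))
    return plus, minus

def evolve(a, b, p, m, t):
    """Formulas (3.4) and (3.5): evolve both components and add them."""
    energy = math.sqrt(p * p + m * m)
    plus, minus = split_initial_data(a, b, p, m)
    return cmath.exp(1j * t * energy) * plus + cmath.exp(-1j * t * energy) * minus

def evolve_cos_sin(a, b, p, m, t):
    """The same solution written as cos(tE) a + sin(tE)/E b."""
    energy = math.sqrt(p * p + m * m)
    return math.cos(t * energy) * a + math.sin(t * energy) / energy * b

a, b = 1 + 2j, -0.5 + 0.3j
print(evolve(a, b, 1.2, 0.7, 2.0))
print(evolve_cos_sin(a, b, 1.2, 0.7, 2.0))
```

The two printed values agree up to rounding, which is just Euler's formula applied to (3.5). The cosine and sine form is the one that reappears in the symbols (5.8).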

We remark that for the free Maxwell’s equations, the four-potential $\psi = \vec{\mathrm{A}}$ is a real function, and therefore $\psi_+$ is conjugate to $\psi_-$. In particular, the positive and negative energy components of a photon have equal ‘weight’.

Later in this post we rewrite formulas (3.5) and (3.6) in integral form. In the zero mass case, the formula is explicit, but for massive particles we are not able to avoid using the Fourier transform completely. First, however, we make a short digression about the spectral theorem.

## 4. Spectral theorem

Recall that by $\psi_\pm(t)$ we denote $\psi_\pm(t, \vec{x})$ as a function of $\vec{x}$. By (3.5), $\psi_\pm(t)$ is the image of $\psi_\pm(0)$ under a Fourier multiplier: an operator which acts in the Fourier space as a multiplication operator. The symbol of this Fourier multiplier is $\exp(\pm \mathrm{i} t \sqrt{p^2 + m^2})$. Since the Laplace operator $\Delta$ is the Fourier multiplier with symbol $-p^2$, it seems reasonable to write

$\psi_\pm(t) = \exp(\pm \mathrm{i} t \sqrt{m^2 - \Delta}) \psi_\pm(0) . \hspace{\stretch{1}} (4.1)$

For a more general (say, bounded and measurable) function $f$, the operator $f(\Delta)$ can be defined using the Fourier transform,

$\displaystyle f(\Delta) \psi(\vec{x}) = \frac{1}{(2 \pi)^{3/2}} \iiint \mathcal{F} \psi(\vec{p}) f(-p^2) e^{\mathrm{i} \vec{p} \cdot \vec{x}} \mathrm{d}\vec{p} \hspace{\stretch{1}} (4.2)$

for smooth, rapidly decreasing functions $\psi$, and then extended continuously to $L^2(\mathbf{R}^3)$. Formula (4.1) corresponds to $f(-p^2) = \exp(\pm \mathrm{i} t \sqrt{p^2 + m^2})$. The definition given in (4.2) is a particular case of a more general construction of a function of an operator, which requires the spectral theorem.

Spectral Theorem
If $\boldsymbol{A}$ is a unitary or (possibly unbounded) self-adjoint operator (or, more generally, a normal operator) on a Hilbert space $\mathcal{H}$, then there is a corresponding spectral measure (aka resolution of identity): a family of orthogonal projectors $\pi_{\boldsymbol{A}}(E)$ for Borel sets $E \subseteq \mathbf{C}$, such that $\pi_{\boldsymbol{A}}(\mathbf{C})$ is the identity operator, and $\langle \pi_{\boldsymbol{A}}(E) \psi, \varphi \rangle$ is a countably additive function of $E$ (a complex-valued measure) for any $\psi, \varphi \in \mathcal{H}$, and furthermore we have $\langle \boldsymbol{A} \psi, \varphi \rangle = \int_{\mathbf{C}} \lambda \, \langle \pi_{\boldsymbol{A}}(\mathrm{d}\lambda) \psi, \varphi \rangle$ for all $\psi$ in the domain of $\boldsymbol{A}$ and all $\varphi$.

The smallest closed set $E$ such that $\pi_{\boldsymbol{A}}(E)$ is the identity operator is the spectrum of $\boldsymbol{A}$, denoted $\mathrm{sp}(\boldsymbol{A})$ (this is equivalent to the classical definition). And for any measurable function $f$ defined on $\mathrm{sp}(\boldsymbol{A})$, the operator $f(\boldsymbol{A})$ is given by the identity

$\langle f(\boldsymbol{A}) \psi, \varphi \rangle = \int_{\mathbf{C}} f(\lambda) \langle \pi_{\boldsymbol{A}}(\mathrm{d}\lambda) \psi, \varphi \rangle \hspace{\stretch{1}} (4.3)$

whenever $\varphi \in \mathcal{H}$ and $\psi$ is in

$\mathrm{dom}(f(\boldsymbol{A})) = \left\{ \psi \in \mathcal{H} : \int_{\mathbf{C}} |f(\lambda)|^2 \langle \pi_{\boldsymbol{A}}(\mathrm{d}\lambda) \psi, \psi \rangle < \infty \right\} , \hspace{\stretch{1}} (4.4)$

the domain of $f(\boldsymbol{A})$. Note that if $f$ is a bounded function, then $f(\boldsymbol{A})$ is a bounded operator, even if the original operator $\boldsymbol{A}$ was unbounded. In particular, formulas (3.4) and (3.5) (or (4.1)) define the unique solution of the Klein-Gordon equation (3.2) for arbitrary square-integrable initial data $\psi_+(0)$ and $\psi_-(0)$, with no further regularity assumptions. Clearly, we also have the uniqueness of the solution given the initial data $\psi(0)$ and $\partial_t \psi(0)$ (see (3.6)), but $\partial_t \psi(0)$ may fail to be square-integrable.

We give (4.3) and (4.4) here mainly to fix the notation; a proper introduction to the spectral theory of operators in Hilbert spaces would take too long (and there are many good textbooks covering this subject). Readers who are unfamiliar with spectral measures but unwilling to spend much time learning about them may find the following two examples helpful.

If there is a complete orthonormal set of eigenvectors $\psi_n$ of the operator $\boldsymbol{A}$, $\boldsymbol{A} \psi_n = \lambda_n \psi_n$, then $\pi_{\boldsymbol{A}}(E)$ is simply the orthogonal projection on the subspace spanned by those $\psi_n$ for which $\lambda_n \in E$, and

$\displaystyle f(\boldsymbol{A}) \psi = \sum_n f(\lambda_n) \langle \psi, \psi_n \rangle \psi_n , \hspace{\stretch{1}} (4.5)$

whenever the orthogonal series on the right is convergent. In particular, this is a rather standard construction in finite-dimensional spaces.
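Formula (4.5) is easy to try out in a finite-dimensional space. In the sketch below, the operator $\boldsymbol{A}$ (the symmetric $2 \times 2$ matrix with zeros on the diagonal and ones off it), its eigenpairs and the name `apply_function` are our own illustrative choices. Since $\boldsymbol{A}^2$ is the identity, $\exp(\mathrm{i} t \boldsymbol{A}) = \cos(t) \, \boldsymbol{I} + \mathrm{i} \sin(t) \, \boldsymbol{A}$, which gives an independent check.

```python
import cmath
import math

# Hypothetical example operator: A = [[0, 1], [1, 0]] acting on C^2.
s = 1 / math.sqrt(2)
eigenpairs = [(1.0, (s, s)), (-1.0, (s, -s))]   # pairs (lambda_n, psi_n)

def apply_function(f, psi):
    """Formula (4.5): f(A) psi = sum_n f(lambda_n) <psi, psi_n> psi_n."""
    result = [0j, 0j]
    for lam, vec in eigenpairs:
        # Inner product <psi, psi_n>; psi_n is real here, so no conjugation is needed.
        coeff = sum(c * v for c, v in zip(psi, vec))
        for j in range(2):
            result[j] += f(lam) * coeff * vec[j]
    return result

t = 0.8
out = apply_function(lambda lam: cmath.exp(1j * t * lam), [1.0 + 0j, 0j])
expected = [math.cos(t), 1j * math.sin(t)]   # first column of cos(t) I + i sin(t) A
print(out, expected)
```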

The second example is the Laplace operator $\Delta$, which actually motivated the above discussion. The spectral measure of $\Delta$ is easily shown to be the Fourier multiplier with symbol $\mathbf{1}_E(-p^2)$ (here and below $\mathbf{1}_E$ is the indicator function, equal to $1$ if the argument belongs to $E$ and $0$ otherwise), that is,

$\pi_\Delta(E) \psi = \mathcal{F}^{-1} (\mathbf{1}_{E}(-p^2) \mathcal{F} \psi(\vec{p})).$

This proves that formulas (4.2) and (4.3) indeed define the same operator.
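The notion of a Fourier multiplier has a simple discrete analogue, which may help build intuition. The sketch below is not the operator $\Delta$ on $\mathbf{R}^3$: it works with the second difference on a periodic grid of $n$ points, for which the discrete Fourier transform plays the role of $\mathcal{F}$ and $-4 \sin^2(\pi k / n)$ plays the role of the symbol $-p^2$. All names (`dft`, `apply_multiplier`, `laplacian_symbol`, `second_difference`) are hypothetical.

```python
import cmath
import math

def dft(values, sign):
    """Unitary discrete Fourier transform (sign=-1) or its inverse (sign=+1)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * j * k / n) for j, v in enumerate(values))
        / math.sqrt(n)
        for k in range(n)
    ]

def apply_multiplier(symbol, values):
    """Transform, multiply by the symbol, transform back."""
    n = len(values)
    hat = dft(values, -1)
    return dft([symbol(k, n) * c for k, c in enumerate(hat)], +1)

def laplacian_symbol(k, n):
    # Symbol of the periodic second difference; the discrete counterpart of -p^2.
    return -4 * math.sin(math.pi * k / n) ** 2

def second_difference(values):
    n = len(values)
    return [values[(j + 1) % n] - 2 * values[j] + values[(j - 1) % n] for j in range(n)]

samples = [math.sin(0.3 * j) + 0.1 * j for j in range(16)]
via_fourier = apply_multiplier(laplacian_symbol, samples)
direct = second_difference(samples)
print(max(abs(a - b) for a, b in zip(via_fourier, direct)))   # rounding error only
```

Replacing `laplacian_symbol` by a function of it, for instance `lambda k, n: f(laplacian_symbol(k, n))`, gives the discrete counterpart of $f(\Delta)$ in (4.2).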

## 5. Explicit solutions of the Klein-Gordon equation

We begin with rewriting (3.6) as

$\displaystyle \psi_{\pm}(0, \vec{x}) = \tfrac{1}{2} \bigl( \psi(0, \vec{x}) \mp \mathrm{i} (m^2 - \Delta)^{-1/2} \, \partial_t \psi(0, \vec{x}) \bigr) . \hspace{\stretch{1}} (5.1)$

The operator $(m^2 - \Delta)^{-1/2}$ is closely related to the Bessel potential operator, and for smooth $\psi \in L^2(\mathbf{R}^3)$ we have

$\displaystyle (m^2 - \Delta)^{-1/2} \psi(\vec{x}) = \frac{m}{2 \pi^2} \iiint \frac{K_1(m y)}{y} \, \psi(\vec{x} - \vec{y}) \mathrm{d}\vec{y} , \hspace{\stretch{1}} (5.2)$

where $K_1$ is the modified Bessel function of the second kind. Since this result will not be used in the sequel, we omit the proof, which can be found, for example, in the book (or the article) Theory of Bessel potentials by N. Aronszajn and K. T. Smith. Instead, we derive a formula for the evolution of $\psi_\pm(t, \vec{x})$ in the position representation.

The result is rather complicated, and we need another definition. By a principal value integral, denoted $\mathrm{pv}\!\!\int$, we mean the limit of integrals with symmetric intervals around singularities removed. The limit is taken here as the length of the removed intervals tends to zero. For example, in the statement of the theorem, one should remove the interval $[|t|-\varepsilon, |t|+\varepsilon]$ and let $\varepsilon \searrow 0$.
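To see why the removed intervals must be symmetric, consider the integrand $1 / (t^2 - r^2)$, whose principal value integral over $(0, \infty)$ is zero. The sketch below uses the explicit antiderivative $\log |(t + r) / (t - r)| / (2 t)$; the function name `excised_integral` and the chosen values of $\varepsilon$ and of the upper cutoff are our own.

```python
import math

def excised_integral(t, eps_left, eps_right, cutoff):
    """Integral of 1/(t^2 - r^2) over [0, t - eps_left] and [t + eps_right, cutoff]."""
    def antiderivative(r):
        # The derivative of log|(t + r)/(t - r)| / (2t) is 1/(t^2 - r^2).
        return math.log(abs((t + r) / (t - r))) / (2 * t)
    return (antiderivative(t - eps_left) - antiderivative(0.0)
            + antiderivative(cutoff) - antiderivative(t + eps_right))

t = 1.0
symmetric = excised_integral(t, 1e-6, 1e-6, 1e7)
lopsided = excised_integral(t, 1e-6, 2e-6, 1e7)
print(symmetric)   # close to 0, the principal value
print(lopsided)    # close to log(2) / 2, not a principal value
```

With an asymmetric excision the two logarithmic divergences no longer cancel exactly, and the limit depends on the ratio of the two lengths.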

Theorem 5.1 (propagation of positive and negative energy solutions of the Klein-Gordon equation)
Let $M_{\psi_\pm}(\vec{x}, r)$ denote the mean value of $\psi_\pm(0, \vec{y})$ on the sphere $|\vec{y} - \vec{x}| = r$, and let, as usual, $\partial_r M_{\psi_\pm}$ be the derivative of $M_{\psi_\pm}(\vec{x}, r)$ with respect to $r$. Suppose that $\psi$ is a smooth solution of the Klein-Gordon equation (3.2), and that $\psi_\pm$ are its positive and negative energy components. When $m = 0$, we have

$\displaystyle \begin{aligned} \psi_\pm(t, \vec{x}) & = M_{\psi_\pm}(\vec{x}, |t|) + |t| \partial_r M_{\psi_\pm}(\vec{x}, |t|) \\ & \quad \pm \frac{\mathrm{i}}{\pi} \, \mathrm{pv}\!\! \int_0^\infty \frac{4 t r^2 (M_{\psi_\pm}(\vec{x}, |t|) - M_{\psi_\pm}(\vec{x}, r))}{(t^2 - r^2)^2} \, \mathrm{d}r . \end{aligned} \hspace{\stretch{1}} (5.3)$

For $m > 0$, there is an additional term,

$\displaystyle \begin{aligned} \psi_\pm(t, \vec{x}) & = M_{\psi_\pm}(\vec{x}, |t|) + |t| \partial_r M_{\psi_\pm}(\vec{x}, |t|) + \tilde{K}_{m,\pm t} * \psi_\pm(0, \vec{x}) \\ & \quad \pm \frac{\mathrm{i}}{\pi} \, \mathrm{pv}\!\! \int_0^\infty \frac{4 t r^2 (M_{\psi_\pm}(\vec{x}, |t|) - M_{\psi_\pm}(\vec{x}, r))}{(t^2 - r^2)^2} \, \mathrm{d}r, \end{aligned} \hspace{\stretch{1}} (5.4)$

where $*$ denotes the convolution in spatial variables, and $\tilde{K}_{m, \pm t}$ is an $L^2(\mathbf{R}^3)$ function with Fourier transform

$\mathcal{F} \tilde{K}_{m,\pm t}(\vec{p}) = e^{\pm \mathrm{i} t \sqrt{p^2 + m^2}} - e^{\pm \mathrm{i} t p} . \hspace{\stretch{1}} (5.5)$

Before the proof, we note that the evolution of $\psi_\pm$ violates the causality principle of special relativity: the value of $\psi_\pm(t, \vec{x})$ is expected to depend only on the initial values $\psi_\pm(0, \vec{x} - \vec{y})$ for $\vec{y}$ in the light cone $y \le |t|$, but in Theorem 5.1 this is not the case. This is quite clear when $m = 0$, and for $m > 0$ the square-integrable term $\tilde{K}_{m,\pm t}$ cannot compensate for the highly singular kernel of the principal value integral; we omit the details. As a consequence, any physical solution of the Klein-Gordon equation must either comprise nonzero positive and negative parts, or present some additional symmetry, which makes possible a causal reformulation of (5.3) and (5.4). On the other hand, the evolution of $\psi$ agrees with the causality principle, as stated in the following result.

Theorem 5.2 (propagation of general solutions of the Klein-Gordon equation)
Let $M_\psi(\vec{x}, r)$ and $M_{\partial_t \psi}(\vec{x}, r)$ denote the mean values of $\psi(0, \vec{y})$ and $\partial_t \psi(0, \vec{y})$, respectively, on the sphere $|\vec{y} - \vec{x}| = r$, and let $\partial_r M_\psi$ be the derivative of $M_\psi(\vec{x}, r)$ with respect to $r$. Suppose that $\psi$ is a smooth solution of the Klein-Gordon equation (3.2). When $m = 0$, we have

$\psi(t, \vec{x}) = M_\psi(\vec{x}, |t|) + |t| \partial_r M_\psi(\vec{x}, |t|) + t M_{\partial_t \psi}(\vec{x}, |t|). \hspace{\stretch{1}} (5.6)$

For $m > 0$, there are two additional terms,

$\displaystyle \begin{aligned} \psi(t, \vec{x}) & = M_\psi(\vec{x}, |t|) + |t| \partial_r M_\psi(\vec{x}, |t|) + \tilde{K}_{m,t,0} * \psi(0, \vec{x}) \\ & \qquad + t M_{\partial_t \psi}(\vec{x}, |t|) + \tilde{K}_{m,t,1} * \partial_t \psi(0, \vec{x}) , \end{aligned} \hspace{\stretch{1}} (5.7)$

where $\tilde{K}_{m,t,0}(\vec{x})$ and $\tilde{K}_{m,t,1}(\vec{x})$ are square-integrable functions vanishing outside the ball $x < t$, and with Fourier transforms given by

$\displaystyle \begin{aligned} \mathcal{F} \tilde{K}_{m,t,0}(\vec{p}) & = \cos(t \sqrt{p^2 + m^2}) - \cos(t p) , \\ \mathcal{F} \tilde{K}_{m,t,1}(\vec{p}) & = \frac{\sin(t \sqrt{p^2 + m^2})}{\sqrt{p^2 + m^2}} - \frac{\sin(t p)}{p} \, . \end{aligned} \hspace{\stretch{1}} (5.8)$
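In the massless case, formula (5.6) can be tested on an explicit solution. The sketch below assumes the test function $\psi(t, \vec{x}) = t^2 + x^2 / 3 + t x_1$, which satisfies the wave equation but is not square-integrable, so this is only a sanity check of the formula and not an instance of the theorem as stated. The names `sphere_mean` and `kirchhoff`, the quadrature rule and the step `h` are our own choices.

```python
import math

def sphere_mean(g, centre, radius, n_u=50, n_phi=16):
    """Mean of g over the sphere |y - centre| = radius (midpoint rule in cos(theta), phi)."""
    total = 0.0
    for i in range(n_u):
        u = -1 + (2 * i + 1) / n_u          # cos(theta), uniform on [-1, 1]
        rho = math.sqrt(1 - u * u)
        for j in range(n_phi):
            phi = 2 * math.pi * (j + 0.5) / n_phi
            y = (centre[0] + radius * rho * math.cos(phi),
                 centre[1] + radius * rho * math.sin(phi),
                 centre[2] + radius * u)
            total += g(y)
    return total / (n_u * n_phi)

# Hypothetical test solution of (3.2) with m = 0: psi = t^2 + |x|^2 / 3 + t * x_1.
def exact(t, x):
    return t * t + sum(c * c for c in x) / 3 + t * x[0]

def psi0(y):     # psi(0, y)
    return sum(c * c for c in y) / 3

def dpsi0(y):    # d/dt psi(0, y)
    return y[0]

def kirchhoff(t, x, h=1e-4):
    """Right-hand side of (5.6); d/dr M is approximated by a central difference."""
    r = abs(t)
    mean = sphere_mean(psi0, x, r)
    d_mean = (sphere_mean(psi0, x, r + h) - sphere_mean(psi0, x, r - h)) / (2 * h)
    return mean + r * d_mean + t * sphere_mean(dpsi0, x, r)

x = (0.3, -0.5, 0.8)
print(kirchhoff(0.7, x), exact(0.7, x))
print(kirchhoff(-1.1, x), exact(-1.1, x))
```

Note that only values of the initial data on the sphere of radius $|t|$ (and, through the derivative in $r$, in its immediate neighbourhood) enter the computation, in agreement with the causality principle.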

In the proof, we need the following technical result from complex analysis.

Lemma 5.3
For any bounded function $f(r)$ on $(0, \infty)$ which is smooth near $t > 0$, we have

$\displaystyle \lim_{s \to 0^+} \int_0^\infty \left(\frac{1}{t + \mathrm{i} s - r} + \frac{1}{t + \mathrm{i} s + r} \right) f(r) \mathrm{d}r = -\int_0^\infty \frac{2 t (f(t) - f(r))}{t^2 - r^2} \, \mathrm{d}r - \mathrm{i} \pi f(t) ,$

and

$\displaystyle \begin{aligned} & \lim_{s \to 0^+} \int_0^\infty \left(\frac{t + \mathrm{i} s}{(t + \mathrm{i} s - r)^2} + \frac{t + \mathrm{i} s}{(t + \mathrm{i} s + r)^2} \right) f(r) \mathrm{d}r \\ & \qquad = -\mathrm{pv}\!\!\int_0^\infty \frac{2 t (t^2 + r^2) (f(t) - f(r))}{(t^2 - r^2)^2} \, \mathrm{d}r + \mathrm{i} t \pi \partial_r f(t) . \end{aligned}$

The convergence is dominated by a constant depending only on $t$, the supremum norm of $f$ on $(0, \infty)$, and the supremum norm of $\partial_r f$, $\partial_r^2 f$ on a fixed neighborhood of $t$.

The proof of Lemma 5.3 is given at the end of this section.

Proof of Theorem 5.1. According to (3.5), $\psi_+(t) = \boldsymbol{K}_t \psi_+(0)$, where $\boldsymbol{K}_t$ is the Fourier multiplier with symbol $k_t(\vec{p}) = \exp(\mathrm{i} t \sqrt{p^2 + m^2})$. Our goal is to find a more explicit description of $\boldsymbol{K}_t$.

For an $L^2(\mathbf{R}^3)$ symbol, the integral formula for the corresponding Fourier multiplier can be found using the convolution theorem. This method cannot be applied directly to $\boldsymbol{K}_t$, since its symbol is not in $L^2(\mathbf{R}^3)$. However, when $\mathrm{Im} \, z > 0$, we can use the convolution theorem for the operator $\boldsymbol{K}_z$ with a square-integrable symbol $k_z(\vec{p}) = \exp(\mathrm{i} z \sqrt{p^2 + m^2})$. The theorem will then be proved by taking an appropriate limit.

The explicit formula can only be found for massless particles: we assume that $m = 0$. We find that for $\psi \in L^2(\mathbf{R}^3)$,

$\displaystyle \boldsymbol{K}_z \psi(\vec{x}) = K_z * \psi(\vec{x}) , \qquad$ where $\displaystyle \qquad K_z(\vec{x}) = (2 \pi)^{-3/2} \mathcal{F}^{-1} k_z(\vec{x}) .$

Note that $k_z(\vec{p})$ is a radial function, that is, it depends on $\vec{p}$ only through its norm $p$. For this reason it is convenient to compute first the (inverse) Fourier transform of the surface measure $\sigma_r$ on the sphere with radius $r$, centered at the origin. Symmetry, integration in spherical coordinates (rotated appropriately, so that the vector $\vec{x}$ points upwards) and then substitution $s = \sin \vartheta$ give

$\displaystyle \begin{aligned} (2\pi)^{3/2} \mathcal{F}^{-1} \sigma_r(\vec{x}) & = \int e^{\mathrm{i} \vec{p} \cdot \vec{x}} \sigma_r(\mathrm{d}\vec{p}) = \int \cos(\vec{p} \cdot \vec{x}) \sigma_r(\mathrm{d}\vec{p}) \\ & = \int_{-\pi}^\pi \int_{-\pi/2}^{\pi/2} r^2 \cos \vartheta \, \cos(r x \sin \vartheta) \mathrm{d}\vartheta \mathrm{d}\varphi \\ & = 2 \pi r^2 \int_{-1}^1 \cos(r x s) \mathrm{d}s = 4 \pi r^2 \, \frac{\sin(r x)}{r x} \, . \end{aligned}$

When $f(\vec{p})$ is a radial function, then the function $f(r \vec{u})$ ($r \ge 0$), where $\vec{u}$ is an arbitrary unit vector, is called the profile of $f$, and it is sometimes denoted again by $f$ when this causes no confusion. Integration in spherical coordinates and Fubini’s theorem give

$\displaystyle \mathcal{F}^{-1} f(\vec{x}) = \int_0^\infty f(r) \mathcal{F}^{-1} \sigma_r(\vec{x}) \mathrm{d}r = \sqrt{\frac{2}{\pi}} \, \frac{1}{x} \int_0^\infty r f(r) \sin(r x) \mathrm{d}r .\hspace{\stretch{1}} (5.9)$
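Formula (5.9) can be tested on the Gaussian $f(\vec{p}) = e^{-p^2/2}$, which is its own (inverse) Fourier transform in the normalization used in (4.2). The sketch below evaluates the right-hand side of (5.9) by the midpoint rule; the name `radial_inverse_transform`, the truncation `r_max` and the number of steps are our own choices.

```python
import math

def radial_inverse_transform(profile, x, r_max=12.0, steps=12000):
    """Right-hand side of (5.9), with the integral in r computed by the midpoint rule."""
    h = r_max / steps
    integral = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        integral += r * profile(r) * math.sin(r * x)
    integral *= h
    return math.sqrt(2 / math.pi) * integral / x

def gaussian(r):
    return math.exp(-r * r / 2)

x = 1.3
approx = radial_inverse_transform(gaussian, x)
print(approx, gaussian(x))   # the two values should agree closely
```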

Therefore, the three-dimensional (inverse) Fourier transform of a radial function is again a radial function with profile equal to $1/x$ times the (inverse) Fourier sine transform of the profile of $p f(\vec{p})$.

By the above observation,

$\displaystyle K_z(\vec{x}) = \frac{1}{(2\pi)^{3/2}} \, \mathcal{F}^{-1} k_z(\vec{x}) = \frac{1}{2 \pi^2 x} \int_0^\infty e^{\mathrm{i} z r} r \sin(r x) \mathrm{d}r .$

An elementary calculation gives

$\displaystyle \begin{aligned} K_z(\vec{x}) & = \frac{\mathrm{i}}{4 \pi^2 x} \left(\frac{1}{(z + x)^2} - \frac{1}{(z - x)^2} \right) \\ & = \frac{\mathrm{i}}{4 \pi^2 x^2} \left( \frac{1}{z - x} + \frac{1}{z + x} - \frac{z}{(z - x)^2} - \frac{z}{(z + x)^2} \right) . \end{aligned}$
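The elementary calculation can be double-checked numerically for a sample $z$ in the upper half-plane. In the sketch below the sample values of $z$ and $x$, the truncation `r_max` and the function names are our own; the integral formula for $K_z$ is evaluated by the midpoint rule and compared with both lines of the closed-form expression.

```python
import cmath
import math

def kernel_by_quadrature(z, x, r_max=60.0, steps=60000):
    """K_z(x) from the integral formula, by the midpoint rule (requires Im z > 0)."""
    h = r_max / steps
    total = 0j
    for i in range(steps):
        r = (i + 0.5) * h
        total += cmath.exp(1j * z * r) * r * math.sin(r * x)
    return total * h / (2 * math.pi ** 2 * x)

def kernel_first_line(z, x):
    return 1j / (4 * math.pi ** 2 * x) * (1 / (z + x) ** 2 - 1 / (z - x) ** 2)

def kernel_second_line(z, x):
    return 1j / (4 * math.pi ** 2 * x ** 2) * (
        1 / (z - x) + 1 / (z + x) - z / (z - x) ** 2 - z / (z + x) ** 2)

z = 0.5 + 1.0j   # hypothetical sample point with Im z > 0
x = 0.7
print(kernel_by_quadrature(z, x))
print(kernel_first_line(z, x))
print(kernel_second_line(z, x))
```

The second form is the one to which Lemma 5.3 applies term by term.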

If $z = t + \mathrm{i} s$ and $s > 0$ converges to $0$, then $|k_z(\vec{p})| \le 1$ and $k_z(\vec{p})$ converges to $k_t(\vec{p})$. Therefore, by Plancherel’s theorem and dominated convergence, $\boldsymbol{K}_z \psi$ converges in $L^2(\mathbf{R}^3)$ to $\boldsymbol{K}_t \psi$. Furthermore, symmetry and integration in spherical coordinates give

$\displaystyle \boldsymbol{K}_z \psi(\vec{0}) = \iiint \psi(\vec{x}) K_z(\vec x) \mathrm{d}\vec{x} = \int \left( \int_0^\infty \psi(r \vec{u}) r^2 K_z(r \vec{u}) \mathrm{d}r \right) \sigma_1(\mathrm{d}\vec{u}) .$

By Lemma 5.3, it follows that for all unit vectors $\vec{u}$, all smooth, rapidly decreasing functions $\psi$, and all $t > 0$, we have

$\displaystyle \begin{aligned} & \lim_{s \to 0^+} \int_0^\infty \psi(r \vec{u}) r^2 K_{t + \mathrm{i} s}(r \vec{u}) \mathrm{d}r \\ & \qquad = \frac{1}{4 \pi} \, (\psi(t \vec{u}) + t \vec{u} \cdot \nabla \psi(t \vec{u})) + \frac{\mathrm{i}}{4 \pi^2} \mathrm{pv}\!\!\int_0^\infty \frac{4 t r^2 (\psi(t \vec{u}) - \psi(r \vec{u}))}{(t^2 - r^2)^2} \, \mathrm{d}r . \end{aligned}$

The last statement of Lemma 5.3 enables us to change the order of the integral and the limit, so that

$\displaystyle \begin{aligned} \boldsymbol{K}_t \psi(\vec{0}) & = \lim_{s \to 0^+} \boldsymbol{K}_{t + \mathrm{i} s} \psi(\vec{0}) = \frac{1}{4 \pi} \int (\psi(t \vec{u}) + t \vec{u} \cdot \nabla \psi(t \vec{u})) \sigma_1(\mathrm{d}\vec{u}) \\ & \hspace{4em} + \frac{\mathrm{i}}{\pi} \mathrm{pv}\!\!\int_0^\infty \frac{4 t r^2}{(t^2 - r^2)^2} \left( \frac{1}{4 \pi} \int (\psi(t \vec{u}) - \psi(r \vec{u})) \sigma_1(\mathrm{d}\vec{u}) \right) \mathrm{d}r . \end{aligned}$

Formula (5.3) for the function $\psi_+$ and $t > 0$ is proved. The other cases follow by symmetry.

The case $m > 0$ is now easy. Let $\boldsymbol{K}_{m,t}$ denote the operator $\boldsymbol{K}_t$ corresponding to mass $m$. The Fourier symbol of $\boldsymbol{K}_{m,t} - \boldsymbol{K}_{0,t}$, that is, $\exp(\mathrm{i} t \sqrt{p^2 + m^2}) - \exp(\mathrm{i} t p)$, is a square-integrable function of $\vec{p}$. Formulas (5.4) and (5.5) follow by applying the convolution theorem, as described in the first part of the proof. $\square$

Proof of Theorem 5.2. We use the notation introduced in the proof of Theorem 5.1. Recall that under the assumption that $\psi(t)$ is square-integrable for each $t$, the solution is uniquely determined by $\psi(0)$ and $\partial_t \psi(0)$. Since the real and imaginary parts of a solution of the Klein-Gordon equation (3.2) are again solutions of (3.2), it suffices to consider real-valued solutions. As in the proof of Theorem 5.1, we first consider $m = 0$.

Suppose first that $\partial_t \psi(0, \vec{x}) = 0$ for all $\vec{x}$. By Theorem 5.1,

$\mathrm{Re} \, \psi_+(t, \vec{x}) = M_{\psi_+}(\vec{x}, |t|) + |t| \partial_r M_{\psi_+}(\vec{x}, |t|) ,$

and $\mathrm{Re} \, \psi_+$ is a solution of the Klein-Gordon equation (3.2). Furthermore, by (3.6), $\mathrm{Re} \, \psi_+(0, \vec{x}) = (1/2) \psi(0, \vec{x})$, and $\partial_t (\mathrm{Re} \, \psi_+)(0, \vec{x}) = \mathrm{Re} (i \sqrt{m^2 - \Delta} \, \psi_+(0, \vec{x})) = 0$. Hence, the formula

$\psi(t, \vec{x}) = M_\psi(\vec{x}, |t|) + |t| \partial_r M_\psi(\vec{x}, |t|) \hspace{\stretch{1}} (5.10)$

defines a solution of (3.2) with given $\psi(0)$ and with $\partial_t \psi(0) = 0$.

Suppose now that $\psi(0, \vec{x}) = 0$ and that $\partial_t \psi(0, \vec{x})$ is real. Suppose for a moment that $\partial_t \psi(t) \in L^2(\mathbf{R}^3)$ for all $t$. Then $\tilde{\psi} = \partial_t \psi$ is again a solution of the Klein-Gordon equation (3.2), and $\partial_t \tilde{\psi}(0) = \partial_t^2 \psi(0) = \Delta \psi(0) = 0$. Therefore, for $t > 0$,

$\displaystyle \tilde{\psi}(t, \vec{x}) = M_{\tilde{\psi}}(\vec{x}, t) + t \partial_r M_{\tilde{\psi}}(\vec{x}, t) = \frac{\mathrm{d}}{\mathrm{d}t} \, (t M_{\tilde{\psi}}(\vec{x}, t)).$

Since $\psi(0) = 0$, integration in $t$ gives

$\displaystyle \psi(t, \vec{x}) = t M_{\tilde{\psi}}(\vec{x}, t), \hspace{\stretch{1}} (5.11)$

and a similar formula for $t < 0$ follows by symmetry. Either by a direct substitution into (3.2) or using an approximation argument, we conclude that (5.11) holds without the additional square-integrability assumption on $\partial_t \psi(t)$. By combining the two solutions given by (5.10) and (5.11), we obtain a solution for general initial data, and formula (5.6) follows by the uniqueness of the solution.

We can repeat the above argument when $m > 0$. In (5.10) we have an additional term $\tilde{K}_{m,t,0} * \psi(0, \vec{x})$, where $\tilde{K}_{m,t,0}(\vec{x}) = \mathrm{Re} \, \tilde{K}_{m,t}(\vec{x})$. By (5.5) and the properties of the Fourier transform, we have

$\mathcal{F} \tilde{K}_{m,t,0}(\vec{p}) = \cos(t \sqrt{p^2 + m^2}) - \cos(t p) .$

In a similar manner, in (5.11) we have an additional term $\tilde{K}_{m,t,1} * \partial_t \psi(0, \vec{x})$, where

$\displaystyle \tilde{K}_{m,t,1}(\vec{x}) = \int_0^t \tilde{K}_{m,s,0}(\vec{x}) \mathrm{d}s .$
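On the Fourier side, this relation says that integrating the first symbol in (5.8) over $s \in [0, t]$ gives the second one. A quick numerical check is sketched below; the function names and the sample values of $t$, $p$, $m$ are our own, and the integral is computed by the midpoint rule.

```python
import math

def symbol_k0(s, p, m):
    """Fourier symbol of K~_{m,s,0}: cos(s sqrt(p^2 + m^2)) - cos(s p)."""
    return math.cos(s * math.sqrt(p * p + m * m)) - math.cos(s * p)

def symbol_k1(t, p, m):
    """Fourier symbol of K~_{m,t,1}, as stated in (5.8)."""
    e = math.sqrt(p * p + m * m)
    return math.sin(t * e) / e - math.sin(t * p) / p

def integrate_symbol_k0(t, p, m, steps=2000):
    """Midpoint rule for the integral of symbol_k0 over s in [0, t]."""
    h = t / steps
    return h * sum(symbol_k0((i + 0.5) * h, p, m) for i in range(steps))

print(integrate_symbol_k0(2.5, 1.7, 0.9), symbol_k1(2.5, 1.7, 0.9))
```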

It remains to prove that $\tilde{K}_{m,t,0}(\vec{x}) = 0$ when $x \ge t$.

For $m \ge 0$, the holomorphic function $t \sqrt{z + m^2}$ has a branch cut along $z \in (-\infty, -m^2]$, but the boundary values of this function on $(-\infty, -m^2]$ approached from above and from below are opposite purely imaginary numbers. Since cosine is an even function, $\cos(t \sqrt{z + m^2})$ has a continuous extension to $(-\infty, -m^2]$, and so, by Morera’s theorem (or, more precisely, one of its corollaries), it is an entire function of $z$. It follows that $f(z) = \cos(t \sqrt{z + m^2}) - \cos(t \sqrt{z})$ is an entire function. Furthermore, since $|\sqrt{z + m^2}| \le \sqrt{|z|} + m$ and $|\cos(z)| \le e^{|z|}$, we have $|f(z)| \le (1 + e^{t m}) \exp(t \sqrt{|z|})$.
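Both properties of $f$ used here, continuity across the branch cut and the exponential bound with $\sqrt{|z|}$ in the exponent, can be spot-checked with complex arithmetic. The sample points and the values of $t$ and $m$ below are arbitrary choices of ours, and a finite sample is of course no substitute for the argument above.

```python
import cmath
import math

def f(z, t, m):
    """f(z) = cos(t sqrt(z + m^2)) - cos(t sqrt(z)); cosine is even, so the branch is irrelevant."""
    return cmath.cos(t * cmath.sqrt(z + m * m)) - cmath.cos(t * cmath.sqrt(z))

def bound(z, t, m):
    """The claimed bound (1 + e^{tm}) exp(t sqrt(|z|))."""
    return (1 + math.exp(t * m)) * math.exp(t * math.sqrt(abs(z)))

t, m = 1.5, 0.8
samples = [complex(re, im) for re in (-9, -2, -0.3, 0.5, 4) for im in (-3, -0.1, 0.1, 3)]
print(all(abs(f(z, t, m)) <= bound(z, t, m) for z in samples))

# Values just above and just below the cut (here at z = -2) nearly coincide.
print(abs(f(-2 + 1e-9j, t, m) - f(-2 - 1e-9j, t, m)))
```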

The Fourier transform of $\tilde{K}_{m,t,0}(\vec{x}) = \mathrm{Re} \, \tilde{K}_{m,t}(\vec{x})$ is equal to $f(p^2)$. But $f(p^2)$, $\vec{p} \in \mathbf{C}^3$, is an entire function of three complex variables, and $|f(p^2)| \le c e^{t p}$ for $c = 1 + e^{t m}$. By a multivariate version of the Paley-Wiener theorem (see a nice proof in the article by Y. Yang and T. Qian), we conclude that $\tilde{K}_{m,t,0}(\vec{x})$ vanishes outside the ball $x < t$, as desired. $\square$

The multivariate Paley-Wiener theorem is a rather advanced tool, and its proof is far beyond the scope of these notes. Although it would be difficult to avoid it completely in the proof of Theorem 5.2, we could have used only its one-dimensional version. Indeed, $\tilde{K}_{m,t,0}(\vec{x})$ is a radial function, and so, by (5.9), its Fourier transform is expressed in terms of the one-dimensional Fourier sine transform of the profile of $x \tilde{K}_{m,t,0}(\vec{x})$.

Proof of Lemma 5.3. Although we could simply refer to the Sokhotski-Plemelj formulas, we give an explicit proof. First, we decompose $f$ into the sum of two parts, one vanishing in a neighborhood of $t$, and the other smooth and vanishing outside a larger neighborhood of $t$. The result for the first part is just dominated convergence. For the other part, we use methods of complex analysis.

We therefore assume that $f$ is a smooth function supported in a small neighborhood of $t$. We extend $f$ to an even function on the real line, and define

$\displaystyle F(z) = \int_0^\infty \left(\frac{1}{z - r} + \frac{1}{z + r}\right) f(r) \mathrm{d}r = \int_{-\infty}^\infty \frac{f(r)}{z - r} \, \mathrm{d}r .$

Then $F$ is a holomorphic function in the half-plane $\mathrm{Im} \, z > 0$. For any $\varepsilon \in (0, t)$, we have

$\displaystyle \lim_{s \to 0^+} \int\limits_{\mathbf{R} \setminus [t-\varepsilon, t+\varepsilon]} \frac{f(r)}{(t + \mathrm{i} s) - r} \, \mathrm{d}r = \int\limits_{\mathbf{R} \setminus [t-\varepsilon, t+\varepsilon]} \frac{f(r)}{t - r} \, \mathrm{d}r ,$

and

$\displaystyle\begin{aligned} \int\limits_{[t-\varepsilon, t+\varepsilon]} \frac{f(r)}{(t + \mathrm{i} s) - r} \, \mathrm{d}r & = \int\limits_{[t-\varepsilon, t+\varepsilon]} \frac{((t-r) - \mathrm{i} s) f(r)}{s^2 + (t - r)^2} \, \mathrm{d}r \\ & = \int\limits_{[t-\varepsilon, t+\varepsilon]} \frac{(t-r) (f(r) - f(t))}{s^2 + (t - r)^2} \, \mathrm{d}r - \mathrm{i} \int\limits_{[t-\varepsilon, t+\varepsilon]} \frac{s f(r)}{s^2 + (t - r)^2} \, \mathrm{d}r . \end{aligned}$

For the first integral on the right-hand side, we simply use dominated convergence. The other one is a Poisson integral, which converges to $\pi f(t)$. (This is easy to prove directly, using just the continuity of $f$ at $t$ — an ‘approximate identity’ argument.) We conclude that

$\displaystyle \lim_{s \to 0^+} F(t + \mathrm{i} s) = \int\limits_{\mathbf{R} \setminus [t-\varepsilon, t+\varepsilon]} \frac{f(r)}{t - r} \, \mathrm{d}r + \int\limits_{[t-\varepsilon, t+\varepsilon]} \frac{f(r) - f(t)}{t - r} \, \mathrm{d}r - \mathrm{i} \pi f(t) ,$

and the first statement of the lemma follows by a simple rearrangement. The proof of the other statement is very similar: again $f$ is split into two parts, and for smooth $f$ supported in a neighborhood of $t$, we define

$\displaystyle G(z) = \int_0^\infty \left(\frac{z}{(z - r)^2} + \frac{z}{(z + r)^2}\right) f(r) \mathrm{d}r = \int_{-\infty}^\infty \frac{z f(r)}{(z - r)^2} \, \mathrm{d}r .$

Integration by parts gives

$\displaystyle G(z) = -\int_{-\infty}^\infty \frac{z \partial_r f(r)}{z - r} \, \mathrm{d}r = -\int_{-\infty}^\infty \frac{r \partial_r f(r)}{z - r} \, \mathrm{d}r ;$

the second equality is a consequence of $\int_{-\infty}^\infty \partial_r f(r) \mathrm{d} r = 0$. Hence, by the first part of the proof and the identity $\mathrm{pv}\!\!\int_0^\infty 1 / (t^2 - r^2) \mathrm{d}r = 0$,

$\displaystyle \begin{aligned} \lim_{s \to 0^+} G(t + \mathrm{i} s) & = \int_0^\infty \frac{2 t^2 \partial_r f(t) - 2 t r \partial_r f(r)}{t^2 - r^2} \, \mathrm{d}r + \mathrm{i} \pi t \partial_r f(t) \\ & = - \mathrm{pv}\!\!\int_0^\infty \frac{2 t r \partial_r f(r)}{t^2 - r^2} \, \mathrm{d}r + \mathrm{i} \pi t \partial_r f(t) . \end{aligned}$

Again integrating by parts (carefully: this is a principal value integral), we conclude that

$\displaystyle \lim_{s \to 0^+} G(t + \mathrm{i} s) = \mathrm{pv}\!\!\int_0^\infty \frac{2 t (t^2 + r^2) (f(r) - f(t))}{(t^2 - r^2)^2} \, \mathrm{d}r + \mathrm{i} \pi t \partial_r f(t) ,$

as desired. Finally, by inspecting the above argument, one proves the last statement of the lemma; we omit the details. $\square$
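The only analytic ingredient of the proof of Lemma 5.3 that is not pure algebra is the convergence of the Poisson integral $\int s f(r) / (s^2 + (t - r)^2) \, \mathrm{d}r$ to $\pi f(t)$ as $s \to 0^+$. This is easy to observe numerically. In the sketch below, the Gaussian test function, the truncation `half_width` and the name `poisson_integral` are our own choices; the integral is computed by the midpoint rule with a step proportional to $s$, so that the peak of width $s$ is resolved.

```python
import math

def poisson_integral(f, t, s, half_width=8.0):
    """Midpoint rule for the integral of s f(r) / (s^2 + (t - r)^2) over |r - t| < half_width."""
    h = s / 10                      # the step must be small compared with s
    steps = int(2 * half_width / h)
    total = 0.0
    for i in range(steps):
        r = t - half_width + (i + 0.5) * h
        total += s * f(r) / (s * s + (t - r) ** 2)
    return total * h

t = 1.0

def bump(r):
    # Hypothetical test function with bump(t) = 1.
    return math.exp(-(r - t) ** 2)

coarse = poisson_integral(bump, t, 0.1)
fine = poisson_integral(bump, t, 0.01)
print(coarse, fine, math.pi * bump(t))   # the values approach pi * f(t) as s decreases
```

The error decreases roughly linearly in $s$, which is the expected rate for a smooth $f$.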