Although it took me much more time than I expected, the introductory part of the notes on the Dirac equation is ready. I welcome all comments.
In order to properly understand the Dirac equation, one needs some background on the Lorentz transformation. In this post, we also briefly discuss some aspects of Maxwell’s equations, which will become important later when we couple the Dirac particle to the electromagnetic field.
To keep the notes readable for mathematicians, it is important to keep the notation as simple and as consistent as possible. In quantum mechanics, the state of a particle (or a system of particles) at time $t$ is described completely by an element $\psi(t)$ of a fixed Hilbert space $\mathcal{H}$ with norm $\|\psi(t)\| = 1$. (This definition is already not strictly Lorentz-invariant; we will discuss this later.) In the Dirac model, $\mathcal{H}$ is the space $L^2(\mathbf{R}^3; \mathbf{C}^4)$ of square-integrable $\mathbf{C}^4$-valued functions on $\mathbf{R}^3$. Measurable values in this theory correspond to self-adjoint, typically unbounded linear operators on $\mathcal{H}$, called observables. If, for example, $\mathbf{x}$ and $\mathbf{p}$ are the position and the momentum vectors of a classical particle, then we denote the corresponding observables by $\hat{\mathbf{x}}$ and $\hat{\mathbf{p}}$.
Three-dimensional vectors (space vectors) are denoted by $\mathbf{x}$, $\mathbf{p}$, etc. By $|\mathbf{x}|$, $|\mathbf{p}|$ we denote the length of the vectors $\mathbf{x}$, $\mathbf{p}$. Relativistic four-vectors are written in roman font, for example $\mathrm{x} = (x_0, x_1, x_2, x_3)$. For that reason, we sometimes write $\mathrm{x} = (t, \mathbf{x})$. Mathematically, a four-vector is just a four-dimensional vector. We use the name four-vector to emphasize that a natural class of transformations acting on four-vectors is the group of Lorentz transformations (see below), and not isometries. The set of four-vectors is called space-time.
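The partial derivative operators are denoted by $\partial_j = \partial / \partial x_j$ ($j = 1, 2, 3$). Furthermore, $\nabla = (\partial_1, \partial_2, \partial_3)$ is the vector of spatial partial derivatives and $\Delta = \partial_1^2 + \partial_2^2 + \partial_3^2$ is the spatial Laplace operator. The gradient of a function, divergence of a vector field, and its curl are denoted by $\nabla f$, $\nabla \cdot \mathbf{F}$ and $\nabla \times \mathbf{F}$. The time derivative is denoted by $\partial_t$, or $\partial_0$ when four-vectors are discussed. Sometimes we also use a dot placed over a symbol, like in $\dot{\mathbf{x}}$.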
The evolution of the state is described by the wave function $\psi(t, \mathbf{x})$. We often drop the arguments from the notation when they are clear from the context. Furthermore, we often write $\psi(t)$ for $\psi(t, \cdot)$, a function of the spatial variable. Perhaps it is worth noting that the wave function typically does not satisfy the usual wave equation, but a Schrödinger one. As already mentioned, in the Dirac approach the function $\psi$ takes values in $\mathbf{C}^4$, so that $\psi = (\psi_1, \psi_2, \psi_3, \psi_4)$.
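Physicists often use fixed variable names for fixed spaces, or representations. The symbol $\mathbf{x}$ corresponds to the position, and the wave function $\psi(t, \mathbf{x})$ is given in the so-called position (or standard) representation. However, it is often more convenient to work with the (spatial) Fourier transform of $\psi$, which corresponds to the so-called momentum representation, related to the momentum variable $\mathbf{p}$. Instead of writing the Fourier transform explicitly, it is customary to write simply $\psi(t, \mathbf{p})$ for the Fourier transform of $\psi(t, \mathbf{x})$. This is a very convenient shorthand, but at first it may seem very informal. For that reason, we try to avoid it, and write $\mathcal{F}\psi(t, \mathbf{p})$ instead. We use $\mathcal{F}$ for the Fourier transform normalized to be an isometry,
$$\mathcal{F}f(\mathbf{p}) = (2\pi)^{-3/2} \int_{\mathbf{R}^3} e^{-i \mathbf{p} \cdot \mathbf{x}} f(\mathbf{x}) \, d\mathbf{x} .$$
The inverse Fourier transform is then given by the formula
$$\mathcal{F}^{-1}g(\mathbf{x}) = (2\pi)^{-3/2} \int_{\mathbf{R}^3} e^{i \mathbf{p} \cdot \mathbf{x}} g(\mathbf{p}) \, d\mathbf{p} .$$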
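The isometry property is easy to test numerically. The sketch below is only a one-dimensional analogue, under the assumption that the transform is normalized as $(2\pi)^{-1/2} \int e^{-ipx} f(x) \, dx$ (in three dimensions the factor becomes $(2\pi)^{-3/2}$). The helper names `fourier_1d` and `l2_norm_squared` are made up for this illustration. With this normalization the Gaussian $e^{-x^2/2}$ should be mapped to $e^{-p^2/2}$, and the two squared $L^2$ norms should agree.

```python
import cmath
import math


def fourier_1d(f, p, half_width=12.0, n=2000):
    """Riemann-sum approximation of (2*pi)^(-1/2) * integral of exp(-i*p*x) * f(x) dx."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n + 1):
        x = -half_width + k * h
        total += cmath.exp(-1j * p * x) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)


def l2_norm_squared(g, half_width=12.0, n=400):
    """Riemann-sum approximation of the integral of |g|^2 over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return sum(abs(g(-half_width + k * h)) ** 2 for k in range(n + 1)) * h


def gaussian(x):
    return math.exp(-x * x / 2.0)


# Squared norm in the position variable and in the momentum variable.
norm_x = l2_norm_squared(gaussian)
norm_p = l2_norm_squared(lambda p: fourier_1d(gaussian, p))
print(norm_x, norm_p)
```

Both numbers should be close to $\sqrt{\pi}$, the exact value of $\int e^{-x^2} dx$. A plain Riemann sum is enough here because the Gaussian decays very quickly; for less regular functions a more careful quadrature would be needed.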
Finally, we choose to use natural units. That is, we choose the system of units in such a way that some chosen physical constants are equal to one, thanks to which we can drop them from the formulas. We will assume that the speed of light $c$, the electric and magnetic constants $\varepsilon_0$ and $\mu_0$, and the (reduced) Planck constant $\hbar$ are all equal to one.
One of the main features of the Dirac equation is its Lorentz invariance: the Dirac equation has the same form in all inertial (in the sense of special relativity) frames of reference. It is therefore reasonable to start with a short introduction to the Lorentz transformation. And since its discovery was mostly motivated by Maxwell’s equations, we shall begin with a brief introduction to classical electromagnetism.
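Maxwell’s equations for the electric field $\mathbf{E}$ and the magnetic field $\mathbf{B}$ read, in natural units,
$$\nabla \cdot \mathbf{E} = \rho, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\partial_t \mathbf{B}, \qquad \nabla \times \mathbf{B} = \mathbf{J} + \partial_t \mathbf{E}, \tag{2.1}$$
where $\rho$ is the density of electric charge and $\mathbf{J}$ is the density of electric current. These two objects are ruled by a rather complicated mechanism which depends on the type of medium filling the space: conducting or not, magnetic or not, etc. Here we will consider $\rho$ and $\mathbf{J}$ simply to be parameters, describing an external source of electromagnetic waves, and only note the general continuity equation:
$$\partial_t \rho + \nabla \cdot \mathbf{J} = 0. \tag{2.2}$$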
Intuitively, (2.2) is a form of electric charge conservation law: it says that $\mathbf{J}$ describes the flow of the electric charge $\rho$.
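The continuity equation is in fact a consequence of Maxwell’s equations. Here is a short sketch, assuming Maxwell’s equations in natural units in the form $\nabla \cdot \mathbf{E} = \rho$ and $\nabla \times \mathbf{B} = \mathbf{J} + \partial_t \mathbf{E}$, and using the fact that the divergence of a curl vanishes:

```latex
\begin{aligned}
\partial_t \rho
&= \partial_t (\nabla \cdot \mathbf{E})
 = \nabla \cdot (\partial_t \mathbf{E}) \\
&= \nabla \cdot (\nabla \times \mathbf{B} - \mathbf{J})
 = -\nabla \cdot \mathbf{J} .
\end{aligned}
```

In other words, sources $\rho$ and $\mathbf{J}$ that violate the continuity equation are not compatible with any solution of Maxwell’s equations.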
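When there is no electric charge and no current, all components of $\mathbf{E}$ and $\mathbf{B}$ satisfy the classical wave equation: $(\partial_t^2 - \Delta) \mathbf{E} = 0$ and $(\partial_t^2 - \Delta) \mathbf{B} = 0$. For example, using the identity $\nabla \times (\nabla \times \mathbf{E}) = \nabla (\nabla \cdot \mathbf{E}) - \Delta \mathbf{E}$, we obtain
$$\partial_t^2 \mathbf{E} = \partial_t (\nabla \times \mathbf{B}) = \nabla \times (\partial_t \mathbf{B}) = -\nabla \times (\nabla \times \mathbf{E}) = -\nabla (\nabla \cdot \mathbf{E}) + \Delta \mathbf{E} = \Delta \mathbf{E} .$$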
Maxwell’s equations agree perfectly with experimental data. However, they are not preserved by Galilean transformations. This is true even in the absence of electric charge and current, because Galilean transformations do not preserve the classical wave operator $\partial_t^2 - \Delta$. It is not difficult to find a linear transformation of coordinates preserving $\partial_t^2 - \Delta$: the frame of reference moving with constant speed $v$ in the $x_1$ direction should be described by the Lorentz transformation (a Lorentz boost):
$$t' = \gamma (t - v x_1), \qquad x_1' = \gamma (x_1 - v t), \qquad x_2' = x_2, \qquad x_3' = x_3, \tag{2.3}$$
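where $\gamma = (1 - v^2)^{-1/2}$ is the Lorentz factor. It is an easy (but very instructive) exercise to verify explicitly that the transformation (2.3) indeed preserves the classical wave operator. The Lorentz boost corresponding to the frame of reference moving with arbitrary constant velocity $\mathbf{v}$ can be obtained from (2.3) by rotations, but the result is not very simple:
$$t' = \gamma (t - \mathbf{v} \cdot \mathbf{x}), \qquad \mathbf{x}' = \mathbf{x} + \frac{\gamma - 1}{|\mathbf{v}|^2} (\mathbf{v} \otimes \mathbf{v}) \mathbf{x} - \gamma t \mathbf{v}, \tag{2.4}$$
again with $\gamma = (1 - |\mathbf{v}|^2)^{-1/2}$. Here $\mathbf{v} \otimes \mathbf{v}$ is a matrix with entries $v_i v_j$, and $|\mathbf{v}|^{-2} \, \mathbf{v} \otimes \mathbf{v}$ is the orthogonal projection on the line containing $\mathbf{v}$. Lorentz boosts, rotations, translations and their compositions form the group of Lorentz transformations of space-time (with translations included, this group is usually called the Poincaré, or inhomogeneous Lorentz, group). Since Lorentz transformations do not preserve time, they were first considered as a purely mathematical notion, and it was Albert Einstein who first considered them (in special relativity) to be the true physical description of inertial frames of reference.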
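The general boost can be checked numerically. The sketch below assumes that the boost with velocity $\mathbf{v}$ acts on $(t, \mathbf{x})$ by the matrix with time-time entry $\gamma$, mixed entries $-\gamma v_i$ and spatial block $I + (\gamma - 1) |\mathbf{v}|^{-2} \, \mathbf{v} \otimes \mathbf{v}$. It then tests the identity $\Lambda^T \eta \Lambda = \eta$ with $\eta = \mathrm{diag}(1, -1, -1, -1)$, which is equivalent to $\Lambda$ preserving the wave operator. The function names are hypothetical, chosen for this illustration only.

```python
import math


def boost_matrix(v):
    """4x4 matrix of the Lorentz boost with velocity v = (v1, v2, v3), in units with c = 1."""
    speed_sq = sum(c * c for c in v)
    if not speed_sq < 1.0:
        raise ValueError("the speed must be smaller than the speed of light")
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = gamma
    for i in range(3):
        m[0][i + 1] = -gamma * v[i]
        m[i + 1][0] = -gamma * v[i]
        for j in range(3):
            # Entry (i, j) of the orthogonal projection onto the line containing v.
            proj = v[i] * v[j] / speed_sq if speed_sq > 0.0 else 0.0
            m[i + 1][j + 1] = (1.0 if i == j else 0.0) + (gamma - 1.0) * proj
    return m


def minkowski_form(m):
    """Return the matrix m^T * eta * m, where eta = diag(1, -1, -1, -1)."""
    eta = [1.0, -1.0, -1.0, -1.0]
    return [
        [sum(m[k][i] * eta[k] * m[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


general = minkowski_form(boost_matrix((0.3, -0.4, 0.5)))
along_x1 = boost_matrix((0.6, 0.0, 0.0))
for row in general:
    print([round(entry, 12) for entry in row])
```

Up to rounding, the printed matrix should be $\eta$ itself. For a velocity along the first axis, `along_x1` reduces to the one-dimensional boost with $\gamma = 1.25$.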
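There are two types of Lorentz invariance. Suppose that two inertial frames are related by a Lorentz boost (2.3), with $v$ and $\gamma$ as in (2.3). A path $\mathrm{x}(s)$ (here $x_0$ corresponds to the time variable $t$) in the primed frame is given by $\mathrm{x}'(s)$, with
$$x_0' = \gamma (x_0 - v x_1), \qquad x_1' = \gamma (x_1 - v x_0), \qquad x_2' = x_2, \qquad x_3' = x_3, \tag{2.5}$$
and a similar formula can be written for the derivatives of $\mathrm{x}(s)$. Physicists say that four-vectors, such as $\mathrm{x}$ and $\dot{\mathrm{x}}$, transform in a contravariant way. On the other hand, a function $f(\mathrm{x})$ (again $x_0$ is the time coordinate) in the primed frame is given by the formula $f'(\mathrm{x}') = f(\mathrm{x})$, so that its derivatives are transformed in a covariant way:
$$\partial_0' f' = \gamma (\partial_0 f + v \partial_1 f), \qquad \partial_1' f' = \gamma (\partial_1 f + v \partial_0 f), \qquad \partial_2' f' = \partial_2 f, \qquad \partial_3' f' = \partial_3 f. \tag{2.6}$$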
This is a natural transformation for gradient-like operations. (The above discussion may seem completely trivial to most physicists. However, I always found this covariant and contravariant terminology quite confusing, so I hope such a lay explanation will help many mathematicians. At least, I needed it.)
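The covariant rule gives a quick way to see that a boost preserves the classical wave operator. The computation below is a sketch for the boost in the $x_1$ direction with speed $v$ and Lorentz factor $\gamma = (1 - v^2)^{-1/2}$, assuming the covariant rule takes the form $\partial_0' = \gamma (\partial_0 + v \partial_1)$, $\partial_1' = \gamma (\partial_1 + v \partial_0)$, $\partial_2' = \partial_2$, $\partial_3' = \partial_3$:

```latex
\begin{aligned}
(\partial_0')^2 - (\partial_1')^2
&= \gamma^2 (\partial_0 + v \partial_1)^2 - \gamma^2 (\partial_1 + v \partial_0)^2 \\
&= \gamma^2 (1 - v^2) (\partial_0^2 - \partial_1^2) \\
&= \partial_0^2 - \partial_1^2 .
\end{aligned}
```

The mixed terms $2 v \partial_0 \partial_1$ cancel, and the derivatives in the $x_2$ and $x_3$ directions are not affected at all. For a Galilean transformation the mixed terms do not cancel, which is why the wave operator is not preserved in that case.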
It is perhaps surprising (it was for me) to note that Maxwell’s equations (2.1) (or, more precisely, the electric and magnetic fields) are not strictly invariant under the Lorentz transformation. The easiest way to see this is to consider a single stationary charge. Since it is at rest, it generates no current, and so the magnetic field is constant zero. On the other hand, in a different inertial frame, the charge is no longer stationary, the electric current is no longer zero, and so $\mathbf{B}$ cannot vanish. Therefore, the magnetic field cannot be measured absolutely, without fixing an inertial frame of reference. There are, however, relatively simple (but different from (2.5) and (2.6)) transformation rules for $\mathbf{E}$ and $\mathbf{B}$, which is no longer true when Galilean transformations are considered.
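To make the example of a stationary charge concrete, here is a small sketch. It uses the standard textbook transformation rules for the fields under a boost with velocity $\mathbf{v}$ (in units with $c = 1$), namely $\mathbf{E}' = \gamma (\mathbf{E} + \mathbf{v} \times \mathbf{B}) - (\gamma - 1) (\mathbf{E} \cdot \mathbf{n}) \mathbf{n}$ and $\mathbf{B}' = \gamma (\mathbf{B} - \mathbf{v} \times \mathbf{E}) - (\gamma - 1) (\mathbf{B} \cdot \mathbf{n}) \mathbf{n}$ with $\mathbf{n} = \mathbf{v} / |\mathbf{v}|$. These rules are not derived in these notes, so treat them as an assumption; the function names are hypothetical. The code only compares the values of the fields at a single space-time event.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transform_fields(e, b, v):
    """Fields (E', B') at one event, seen from a frame moving with velocity v (c = 1, 0 < |v| < 1)."""
    speed = math.sqrt(dot(v, v))
    gamma = 1.0 / math.sqrt(1.0 - speed * speed)
    n = tuple(c / speed for c in v)
    v_cross_b = cross(v, b)
    v_cross_e = cross(v, e)
    e_new = tuple(gamma * (e[i] + v_cross_b[i]) - (gamma - 1.0) * dot(e, n) * n[i] for i in range(3))
    b_new = tuple(gamma * (b[i] - v_cross_e[i]) - (gamma - 1.0) * dot(b, n) * n[i] for i in range(3))
    return e_new, b_new


def coulomb_field(x, charge=1.0):
    """Electric field of a point charge at rest at the origin, in units with epsilon_0 = 1."""
    r = math.sqrt(dot(x, x))
    return tuple(charge * c / (4.0 * math.pi * r ** 3) for c in x)


# In the rest frame of the charge there is no magnetic field at all.
e_rest = coulomb_field((0.0, 1.0, 0.0))
b_rest = (0.0, 0.0, 0.0)
e_moving, b_moving = transform_fields(e_rest, b_rest, (0.6, 0.0, 0.0))
print(e_moving, b_moving)
```

The magnetic field in the moving frame is non-zero, while the two combinations $\mathbf{E} \cdot \mathbf{B}$ and $|\mathbf{E}|^2 - |\mathbf{B}|^2$ keep their values.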
What is Lorentz-invariant (well, contravariant) is the potential. To introduce this notion, we need the Helmholtz theorem. Informally, it states that any vector field $\mathbf{F}$ can be written in the form $\mathbf{F} = -\nabla \varphi + \nabla \times \mathbf{A}$ for some function $\varphi$ (the scalar potential) and some vector field $\mathbf{A}$ (the vector potential). The first summand in the decomposition has no curl (it is irrotational), the other one has zero divergence (it is solenoidal).
The main topic of these notes is the Dirac equation, which deals with square integrable functions. Therefore, we give the simplest, $L^2$ version of the Helmholtz theorem instead of the ‘continuous’ version typical in electrodynamics. This way we also introduce the notion of Sobolev spaces, a very important object in quantum mechanics. A function $f$ is said to belong to the Sobolev space $H^k(\mathbf{R}^3)$ if $f$ and partial derivatives of $f$ of order up to $k$ (defined in the distributional sense) are square integrable. Equivalently, $f \in H^k(\mathbf{R}^3)$ if and only if $(1 + |\mathbf{p}|^2)^{k/2} \mathcal{F}f(\mathbf{p})$ is square integrable. A vector field is said to be square integrable etc., if so are all of its components.
Suppose that is a square integrable vector field. Suppose furthermore that , and that is integrable. Then there exist a function and a vector field such that . Furthermore, and each component of are in the Sobolev space .
Sketch of the proof. By the assumptions, and are square integrable. Define the potentials using the Fourier transform, by the formulas
and verify the statements of the theorem.
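For orientation, here is one way the potentials can be written down in Fourier space. This is only a sketch under assumed conventions: the decomposition has the form $\mathbf{F} = -\nabla \varphi + \nabla \times \mathbf{A}$, and the Fourier transform $\mathcal{F}$ satisfies $\mathcal{F}(\partial_j f)(\mathbf{p}) = i p_j \mathcal{F}f(\mathbf{p})$. With the opposite sign convention, $i$ is replaced by $-i$ below.

```latex
\begin{aligned}
\mathcal{F}\varphi(\mathbf{p})
&= \frac{i \, \mathbf{p} \cdot \mathcal{F}\mathbf{F}(\mathbf{p})}{|\mathbf{p}|^2} ,
&
\mathcal{F}\mathbf{A}(\mathbf{p})
&= \frac{i \, \mathbf{p} \times \mathcal{F}\mathbf{F}(\mathbf{p})}{|\mathbf{p}|^2} , \\
-i \mathbf{p} \, \mathcal{F}\varphi + i \mathbf{p} \times \mathcal{F}\mathbf{A}
&= \frac{\mathbf{p} (\mathbf{p} \cdot \mathcal{F}\mathbf{F})
   - \mathbf{p} \times (\mathbf{p} \times \mathcal{F}\mathbf{F})}{|\mathbf{p}|^2}
 = \mathcal{F}\mathbf{F} .
\end{aligned}
```

The last equality uses $\mathbf{p} \times (\mathbf{p} \times \mathbf{w}) = \mathbf{p} (\mathbf{p} \cdot \mathbf{w}) - |\mathbf{p}|^2 \mathbf{w}$. Note that $\mathbf{p} \cdot \mathcal{F}\mathbf{A}(\mathbf{p}) = 0$, so this particular vector potential has zero divergence. The only delicate point is the singularity of $|\mathbf{p}|^{-1}$ at the origin, and this is where the additional assumptions of the theorem are needed.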
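In principle, the vector potential is not defined uniquely: the curl $\nabla \times \mathbf{A}$ is not changed when $\mathbf{A}$ is replaced by $\mathbf{A} + \nabla f$ for any function $f$. Choosing a particular vector potential is known as gauge fixing. When $\mathbf{A}_0$ is the vector potential constructed in the proof of the Helmholtz theorem, then $\nabla \cdot \mathbf{A}_0 = 0$. In this case we have $\nabla \cdot (\mathbf{A}_0 + \nabla f) = \Delta f$. Hence, the function $f$ can be recovered from $\mathbf{A} = \mathbf{A}_0 + \nabla f$ by the formula $\mathcal{F}f(\mathbf{p}) = -|\mathbf{p}|^{-2} \mathcal{F}(\nabla \cdot \mathbf{A})(\mathbf{p})$. This means that gauge fixing is equivalent to an arbitrary choice of the divergence of the vector potential.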
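We now come back to the electric and magnetic fields $\mathbf{E}$, $\mathbf{B}$. Since $\nabla \cdot \mathbf{B} = 0$, the magnetic field has zero scalar potential. Let $\mathbf{A}$ be a vector potential of $\mathbf{B}$, the magnetic potential. The vector field $\mathbf{E} + \partial_t \mathbf{A}$ has zero curl, and therefore its vector potential vanishes. Let $\varphi$ be the scalar potential of $\mathbf{E} + \partial_t \mathbf{A}$, the electric potential. When fixing $\mathbf{A}$, we use the Lorenz gauge (note that Ludvig Lorenz and Hendrik Lorentz were two different physicists): we require that
$$\partial_t \varphi + \nabla \cdot \mathbf{A} = 0. \tag{2.7}$$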
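Before we discuss why this condition can be satisfied, let us note that with the above definitions, we have
$$\mathbf{E} = -\nabla \varphi - \partial_t \mathbf{A}, \qquad \mathbf{B} = \nabla \times \mathbf{A}, \tag{2.8}$$
and Maxwell’s equations (after some manipulation) can be rewritten as
$$(\partial_t^2 - \Delta) \varphi = \rho, \qquad (\partial_t^2 - \Delta) \mathbf{A} = \mathbf{J}, \tag{2.9}$$
accompanied by the Lorenz gauge condition (2.7) and the continuity equation (2.2).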
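The manipulation is short enough to sketch here. The following assumes natural units, the relations $\mathbf{E} = -\nabla \varphi - \partial_t \mathbf{A}$ and $\mathbf{B} = \nabla \times \mathbf{A}$, and the Lorenz gauge condition in the form $\partial_t \varphi + \nabla \cdot \mathbf{A} = 0$. The first line starts from $\nabla \cdot \mathbf{E} = \rho$, the second from $\nabla \times \mathbf{B} = \mathbf{J} + \partial_t \mathbf{E}$:

```latex
\begin{aligned}
\rho
&= \nabla \cdot \mathbf{E}
 = -\Delta \varphi - \partial_t (\nabla \cdot \mathbf{A})
 = -\Delta \varphi + \partial_t^2 \varphi , \\
\mathbf{J}
&= \nabla \times (\nabla \times \mathbf{A}) - \partial_t \mathbf{E}
 = \nabla (\nabla \cdot \mathbf{A}) - \Delta \mathbf{A}
   + \nabla (\partial_t \varphi) + \partial_t^2 \mathbf{A} \\
&= \nabla (\nabla \cdot \mathbf{A} + \partial_t \varphi)
   + (\partial_t^2 - \Delta) \mathbf{A}
 = (\partial_t^2 - \Delta) \mathbf{A} .
\end{aligned}
```

The two remaining Maxwell’s equations, $\nabla \cdot \mathbf{B} = 0$ and $\nabla \times \mathbf{E} = -\partial_t \mathbf{B}$, hold automatically, because the divergence of a curl and the curl of a gradient both vanish.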
It was noted above that the classical wave operator $\partial_t^2 - \Delta$ is preserved by Lorentz transformations. This proves that Maxwell’s equations (2.9) are Lorentz-invariant. On the other hand, (2.8) and (2.2) are not strictly Lorentz-invariant. In fact, these formulas say that $(\rho, \mathbf{J})$ and $(\varphi, \mathbf{A})$ are vector fields on space-time and transform according to (2.5). For $(\rho, \mathbf{J})$ this is rather intuitive, and by (2.9), $(\varphi, \mathbf{A})$ should transform in the same way. For this reason, it is sometimes convenient to define the electromagnetic four-potential $\mathrm{A}$ by taking $\mathrm{A} = (\varphi, \mathbf{A})$. Then (2.9) reduces to a single equation
$$(\partial_t^2 - \Delta) \mathrm{A} = \mathrm{J}, \tag{2.10}$$
where $\mathrm{J}$ is the four-current, with $\mathrm{J} = (\rho, \mathbf{J})$. Furthermore, the relation (2.8) can be written in a more abstract form using the electromagnetic tensor (the word matrix may sound more familiar here, though):
Formula (2.8) says that , where and . Also (2.1) could be written in terms of , but it is no longer that elegant (see, for example, the Wikipedia article).
It remains to explain why the Lorenz gauge condition is in fact a gauge condition. The general magnetic and electric potentials, without fixing any gauge, can be described as follows. We start with the magnetic potential $\mathbf{A}_0$ with divergence zero and the corresponding electric potential $\varphi_0$. (This corresponds to the Coulomb gauge.) In general, we have $\mathbf{A} = \mathbf{A}_0 + \nabla f$, where $f$ is an arbitrary, sufficiently smooth function. It follows that $\varphi = \varphi_0 - \partial_t f$. The Lorenz gauge condition can be rewritten as $(\partial_t^2 - \Delta) f = \partial_t \varphi_0$, which transforms to an ordinary differential equation in the Fourier space, $(\partial_t^2 + |\mathbf{p}|^2) \mathcal{F}f(t, \mathbf{p}) = \partial_t \mathcal{F}\varphi_0(t, \mathbf{p})$. General theory gives existence of a solution. Note, however, that $f$ is not defined uniquely: all Lorenz gauge functions differ from each other by a solution $g$ of the classical wave equation $(\partial_t^2 - \Delta) g = 0$. Therefore, the pair of electric and magnetic potentials is defined uniquely up to the four-gradient of a solution of the classical wave equation: changing $\varphi$ and $\mathbf{A}$ to $\varphi - \partial_t g$ and $\mathbf{A} + \nabla g$ does not affect (2.7)–(2.9).
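The last claim can be verified directly. The sketch below assumes that the potentials are related to the fields by $\mathbf{E} = -\nabla \varphi - \partial_t \mathbf{A}$ and $\mathbf{B} = \nabla \times \mathbf{A}$, that the Lorenz gauge condition reads $\partial_t \varphi + \nabla \cdot \mathbf{A} = 0$, and that the potentials are changed to $\tilde{\varphi} = \varphi - \partial_t g$ and $\tilde{\mathbf{A}} = \mathbf{A} + \nabla g$ for a sufficiently smooth function $g$ (the tilde notation is introduced only for this computation):

```latex
\begin{aligned}
-\nabla \tilde{\varphi} - \partial_t \tilde{\mathbf{A}}
&= -\nabla \varphi + \nabla \partial_t g - \partial_t \mathbf{A} - \partial_t \nabla g
 = \mathbf{E} , \\
\nabla \times \tilde{\mathbf{A}}
&= \nabla \times \mathbf{A} + \nabla \times (\nabla g)
 = \mathbf{B} , \\
\partial_t \tilde{\varphi} + \nabla \cdot \tilde{\mathbf{A}}
&= \partial_t \varphi + \nabla \cdot \mathbf{A} - (\partial_t^2 - \Delta) g .
\end{aligned}
```

The fields are therefore unchanged for every $g$, while the Lorenz gauge condition survives exactly when $g$ solves the classical wave equation. In that case the wave equations for the potentials are also unaffected, because $\partial_t$ and $\nabla$ commute with $\partial_t^2 - \Delta$.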
One thing should be pointed out here: we do not discuss any regularity properties (smoothness, square integrability etc.) of the solutions of Maxwell’s equations. This issue will be partially addressed later, when we couple a wave function to the electromagnetic field. However, we will usually assume that the potentials are smooth enough and consider them as parameters of the environment.