Quantization of a field theory and the existence of a mass gap

Jorma Jormakka

Assume that you invent a nice new theory, say a generalization of general relativity, and you want to apply this theory at very small distances, such as the scale of the Planck length. Then most probably your colleagues will insist that you quantize your theory, because experiments indicate that at very small distances classical theories are not valid and physical interactions are described by quantum theories.

            You may of course reply that gravitation is such a weak force at atomic distances, and quantum theories are in any case usually solved only approximately (by evaluating the first few terms of a perturbation series), that gravitational effects surely play no role (unless you study, e.g., the Big Bang, where the temperature was so high that all forces were of comparable strength). But this reply is countered by a beauty argument: should not all interactions be described by somewhat similar theories? So, as all the other interactions (electromagnetic, weak, strong) are described by quantum field theories that are gauge theories with the gauge group SU(N) or U(N) for some N, should gravitation not have such a theory as well? Of course, you admit, the forces must be similar. God does not play dice and makes similar theories everywhere. But seriously, the beauty argument is quite good. Indeed, we would expect the theories to be of a similar type. Therefore you should try to quantize your theory.

            How to do it? It is not that difficult. Your classical equations, be they the Einstein equations for gravitation or the Maxwell equations for electromagnetic fields, must be modified a bit so that the solutions are discrete, while in the macroworld the solutions stay very close to the classical solutions. That is, your classical equations have a continuous set of solutions, and you add a small term to the equations. This term is so small that in the macroworld its effect is negligible, but at atomic distances it changes the equations so much that they have a discrete set of solutions.

            You could do this in many ways, but physicists follow Schrödinger. Based on empirical results for electrons, Schrödinger proposed an equation (Schrödinger’s equation) that should describe any elementary particle that has mass. (The equation was derived from empirical results for electrons, so why should it describe anything other than electrons?) This equation is somewhat similar to the differential equation describing a harmonic oscillator and, for a bound particle, has a discrete set of solutions. In order to describe more complicated situations than the movement of an elementary particle of mass m, you add the terms you want to Schrödinger’s equation. Thus, if the electron is in a Coulomb potential, you add a term of this kind to the equation. The easiest way is not to add the terms directly to the equation of motion but to proceed as in variational calculus: you find a function (the Lagrangian, from which the Hamiltonian H is obtained) such that when this function is put into the Euler-Lagrange equations, you get your equations of motion. This has naturally been done long ago for Schrödinger’s equation. In order to modify Schrödinger’s equation by adding a Coulomb force, you add a Coulomb potential to the Hamiltonian. In a similar way, if you want to modify the equations of motion so that they describe both electric and magnetic interactions, you add to the Hamiltonian a term corresponding to the Maxwell equations. This term is a certain (well-known) expression of the 4-dimensional vector potential A.
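
The discreteness of bound-state solutions can be checked numerically. The following is only a minimal sketch, not part of the original argument: it assumes units with hbar = m = omega = 1, a harmonic potential x^2/2, a finite-difference grid on [-8, 8], and hypothetical helper names (build_oscillator, count_below, kth_level). It finds the lowest energy levels of the discretized Schrödinger operator by bisection on a Sturm count.

```python
# Sketch only: units hbar = m = omega = 1 and the grid size are assumptions.

def build_oscillator(n_points=801, half_width=8.0):
    """Finite-difference form of H = -1/2 d^2/dx^2 + x^2/2 as a tridiagonal matrix."""
    h = 2.0 * half_width / (n_points - 1)
    xs = [-half_width + i * h for i in range(n_points)]
    diag = [1.0 / h**2 + 0.5 * x * x for x in xs]  # kinetic + potential part
    off = -0.5 / h**2                              # constant off-diagonal entry
    return diag, off

def count_below(diag, off, x):
    """Number of eigenvalues of the tridiagonal matrix that lie below x (Sturm count)."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off * off / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-30  # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def kth_level(diag, off, k, lo, hi, iters=60):
    """Bisection for the k-th eigenvalue (k = 0 is the ground state) inside [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid  # at least k + 1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

diag, off = build_oscillator()
levels = [kth_level(diag, off, k, 0.0, 10.0) for k in range(3)]
print(levels)  # approximately n + 1/2 for n = 0, 1, 2
```

The levels come out separated by about one unit of energy, with nothing in between, which is the sense in which the set of solutions is discrete.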

            Schrödinger’s equation is not relativistic, so in a fundamental theory you want to start with a relativistic version of it. Dirac derived such an equation (naturally called Dirac’s equation). The solutions to Dirac’s equation are a bit different from the solutions to Schrödinger’s equation, and the differences are thought to be relativistic corrections. There are many ways of deriving Dirac’s equation, but I especially liked the one given in Volker Heine’s book Group Theory in Quantum Mechanics. That derivation starts from considering the Lorentz transformation as a rotation in a 4-dimensional space. This rotation leaves invariant a term that turns out to be the equation of motion of a harmonic oscillator. This equation is of second order, but the equations of motion should not be of second order, as the state at a given time should determine the state in the immediate future. (In a second-order equation the initial values include both the value and the first derivative, whereas in quantum mechanics the state, which includes the place and the momentum, must by itself be enough to give the future state.) That is why Dirac expressed the equation as a product of two first-order (linear) terms. One of these terms is the Dirac equation. What is interesting is that the Dirac equation only describes electrons (and positrons). It does not describe every possible elementary particle, because one can deduce that the particle in Dirac’s equation has the properties of an electron (such as a magnetic moment). Schrödinger’s equation looks like it describes any elementary particle, but the relativistic (thus more correct) Dirac’s equation describes only electrons. Consequently, taking the leading term in the Hamiltonian of Dirac’s equation (this term is called the free propagator) and adding the electromagnetic field as the term involving the vector potential A, you get a reasonably good quantum mechanical theory of electromagnetism.

            However, this theory is not capable of describing many particles. There is a better formulation, called quantum field theory. In that theory you replace the wavefunction of Schrödinger’s equation by a field. This field is an operator composed of creation and annihilation operators (a†, a), and it acts on a state |n1 n2 n3 n4 …> which describes how many particles (nj) are in the jth energy state. This formalism gives a rather similar-looking Hamiltonian and applies to any number of elementary particles. The main difference between the classical (field or mechanical), quantum mechanical and quantum field descriptions is the entity that the differential operator in the equations of motion acts on. In the classical theories it is a function (or a set of functions), in the quantum mechanical description it is a wavefunction (that is, a probability amplitude of finding a particle in a given location), and in the quantum field description it is a field operator, which itself acts on the state of the multiparticle system. Although these descriptions act on different entities, the expressions in the Hamiltonian are not so different. In the quantum theories there is the free propagator term, which the classical theories do not have. This free propagator term can be thought of as a small modification to the classical theory, so in the macroworld the solutions are very close to the classical solution. In the microworld, however, they are not, and one cannot treat the solution as a deterministic function. It can be understood as a probability amplitude or as a field operator.
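
To make the occupation-number picture concrete, here is a small sketch of creation and annihilation operators acting on states |n1 n2 n3 …>. It is only an illustration under stated assumptions: the operators are taken to be bosonic (with the usual square-root factors), a state is stored as a dictionary from occupation tuples to amplitudes, and the function names create, annihilate and number are hypothetical.

```python
import math

# Sketch only: bosonic ladder operators; a state is {occupation tuple: amplitude}.

def create(j, state):
    """Apply the creation operator for mode j: adds one particle to mode j."""
    out = {}
    for occ, amp in state.items():
        new = list(occ)
        new[j] += 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + math.sqrt(occ[j] + 1) * amp
    return out

def annihilate(j, state):
    """Apply the annihilation operator for mode j: removes one particle from mode j."""
    out = {}
    for occ, amp in state.items():
        if occ[j] == 0:
            continue  # no particle in mode j, so this component is destroyed
        new = list(occ)
        new[j] -= 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + math.sqrt(occ[j]) * amp
    return out

def number(j, state):
    """Number operator for mode j, built as creation after annihilation."""
    return create(j, annihilate(j, state))

state = {(2, 0, 1): 1.0}    # two particles in mode 0, none in mode 1, one in mode 2
counted = number(0, state)  # same state, amplitude multiplied by the occupation of mode 0
print(counted)
```

Applying the number operator returns the same state multiplied by nj, which is how the field operator "reads off" how many particles sit in each energy state.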

            Considering all this, one would expect that electromagnetism can be well described by a quantum field theory that has the free propagator taken from the Dirac equation and includes the Maxwell equations as the vector potential term. This is exactly correct. This kind of theory, Quantum Electrodynamics, gives very good predictions and is considered a correct quantum theory of electromagnetism. It is a gauge field theory where the gauge symmetry group is U(1), the group of local phase rotations of the (complex) electron field, which are compensated by a change of the vector potential A.

            Weak interactions appear as a small set of reactions between light particles (leptons, like the electron) and heavy particles (hadrons, like the proton and the neutron). These interactions have one particular character which led to the modeling of this interaction as a gauge field theory: the reactions violate parity, that is, they break the left-right (chiral) symmetry. Normally, one would not expect symmetry breaking, but in nature there are some symmetry breakdowns: some materials become magnetic when cooled. Magnetism is caused by atomic magnetic moments being aligned in the same direction, so the rotation symmetry is broken. As this happens by itself when the material is cooled, it is spontaneous symmetry breaking. A similar mechanism was used to describe weak interactions: at high temperatures there is symmetry, but when the temperature decreases, the symmetry is spontaneously broken and we see parity violation. The mathematical mechanism of spontaneous symmetry breaking is the Higgs method: a Hamiltonian has a Mexican-hat-shaped potential energy. The Higgs method requires that there originally was symmetry. This was initially a problem, as any spontaneously broken continuous symmetry results in massless Goldstone bosons, which were not seen. But this problem was solved by the Higgs mechanism, which ’t Hooft later showed to give a renormalizable theory: the massless Goldstone bosons are eaten by the gauge bosons, which thereby become massive, and what we can see are the massive bosons of the weak interaction. This theory unified weak interactions with the electromagnetic interactions. It is a gauge field theory with the symmetry group SU(2) of weak isospin and U(1) of weak hypercharge. Considering that this theory was motivated by parity violation, I think it is a very reasonable theory. A phase transition by spontaneous symmetry breaking is a good way to explain chirality violation in weak interactions. There is enough empirical verification of this theory: the interaction bosons have all been found, and the Higgs particle was also found. We can conclude that this theory is fine.
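
As a standard textbook illustration of the Mexican hat shape (the field φ and the positive constants μ and λ are notation introduced here, not taken from the text above), one can write the potential and locate its minimum:

```latex
V(\phi) = -\mu^{2}\,|\phi|^{2} + \lambda\,|\phi|^{4},
\qquad
\frac{\partial V}{\partial |\phi|} = -2\mu^{2}|\phi| + 4\lambda|\phi|^{3} = 0
\;\Longrightarrow\;
|\phi|^{2} = \frac{\mu^{2}}{2\lambda},
\qquad
V_{\min} = -\frac{\mu^{4}}{4\lambda}.
```

The potential is symmetric under rotations of the phase of φ, but the lowest energy is reached on a circle of nonzero |φ| rather than at φ = 0. Picking one point on that circle as the ground state is what breaks the symmetry spontaneously.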

            Then there are the strong interactions. They are currently described by a similar quantum field theory. It is a gauge theory with SU(3) color symmetry. The theory has a Hamiltonian that is similar to the one in electroweak interactions, but this theory has a problem. It explains that hadrons consist of quarks, yet these quarks are not seen in particle detectors. This is explained as being caused by quark confinement. Quarks have asymptotic freedom (they move freely at short distances) and they are confined (a pair of quarks cannot be separated). Asymptotic freedom can be explained more or less logically, but to my knowledge there is no good explanation of quark confinement. That is, there are analogous situations (like the two ends of a string), but no actual explanation. For some time quantum string theory was expected to explain it, but currently string theory is not seen as so promising. (I guess Witten’s M-theory is still a possibility, but experiments do not support any string theory. It is 20 years since the second superstring revolution, so that one failed, let us face it.)

            So, this is the situation. How to quantize your theory?

            If you want to make physicists happy, you do as follows. First you rewrite the Hamiltonian of your theory in such a way that it can be inserted into a quantum field theory which has the Dirac equation free propagator in the Hamiltonian. The rewritten form relates to your classical form like the vector potential term in Quantum Electrodynamics relates to Maxwell’s equations, so you have a model of how to do it. Then you probably should consider what gauge symmetry there should be. The (global) Lorentz group symmetry can be turned into a local (gauge) symmetry, but that may not be what you want. Most probably you would like a symmetry group SU(N) or U(N), and this could be done if you assume that at a very high temperature all interactions are unified into one and at lower temperatures they become separated because of spontaneous symmetry breaking. (That is, you probably want Grand Unification in your theory.) At this high temperature you would like to calculate transition probabilities, and for that reason you look for a renormalizable theory, i.e., one where the perturbation series can be summed in some small region.

            Let me just make a small comment on renormalization. It seems that physicists imagine that if a series can be summed in some small open ball, then the solution can be analytically continued to the whole space. This is so with complex analytic functions: if the singularities are isolated, the solution can be analytically continued. But this applies only to a function of a single complex variable. If there are two or more complex variables, the derivative cannot be so well defined. I once made a mistake of this type when trying to prove the Riemann hypothesis: I made a function of two complex variables, so simple that it certainly should have been complex analytic, but when a point is approached from two different directions in the space of two complex variables, the limits need not be the same. This problem appears immediately when the complex dimension is higher than one, and quantum field theories are made in higher complex dimensions. It is not at all obvious that analytic continuation can motivate renormalization.

            That’s it! That is, if you want to make physicists happy. Assuming you do not, you also do not need to start with the free propagator from Dirac’s equation, which comes from an equation describing only electrons. You can basically do quantization in many ways. The only issue is to get discrete values for your solutions. Let us think of simply selecting a discrete set of classical solutions. These cannot be solutions to Schrödinger’s equation, because there the original equation of motion of a particle of mass m was given by Schrödinger’s equation and the external field was then included as an additional term. The solutions to Schrödinger’s equation are not exactly solutions to the classical equations, so by taking a discrete set of solutions to the classical equations we can only get approximations to the discrete set of solutions to the quantum equations. Can these approximations be close? It depends.

            Let us consider the problem of the mass gap. The existence of a mass gap means that the difference in energy between the zero solution and the lowest positive energy solution has a positive lower bound. The existence of a mass gap was a part of the question in one of the Clay Millennium Prize Problems. I solved it in the following way: I found classical solutions to the Yang-Mills equations such that these solutions could give arbitrarily small energies (masses). Then, in order to quantize the classical field, I selected a discrete set of solutions. This was legal in the problem setting, as the problem statement did not say that you should have a free propagator in the Hamiltonian or that your quantization should in some way correspond to what physicists do. Quantization in that problem statement was defined to mean fulfilling a set of requirements, which my quantization did fulfill. Much later some physicists wrote a paper against my solution, stating that if you start by quantizing the field (that is, start by building a Hamiltonian by adding terms to a free particle propagator), then you do not get the set of solutions that allow arbitrarily small energies. This is obvious, because if you start with the free propagator, your quantized solutions will be like those of a harmonic oscillator and they will show the mass gap. But this was not the Millennium problem, as it is well known that a harmonic oscillator does have a mass gap. The point was that quantization may be made in many ways, so quantization was only described by a set of requirements (axioms), and then it was asked if there is a mass gap.
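
The definition of a mass gap can be illustrated with a toy calculation. This sketch is only meant to show what "a positive lower bound" means; it does not reproduce the Yang-Mills solutions discussed above. The function names mass_gap and toy_family, and the family with energies 1/k, are hypothetical choices made for this example, and the oscillator levels are in units where the level spacing is 1.

```python
# Sketch only: a toy comparison, not the actual Yang-Mills solutions.

def mass_gap(levels, tol=1e-12):
    """Smallest energy difference between the lowest level and any higher level."""
    ground = min(levels)
    excitations = [e - ground for e in levels if e - ground > tol]
    return min(excitations) if excitations else None

def toy_family(n):
    """Hypothetical discrete set: zero energy plus energies 1/k for k = 1..n."""
    return [0.0] + [1.0 / k for k in range(1, n + 1)]

# Harmonic-oscillator-like ladder: levels n + 1/2, so the gap stays at 1.
oscillator = [n + 0.5 for n in range(6)]
print(mass_gap(oscillator))

# In the toy family the gap shrinks as more members of the set are included,
# so no positive lower bound holds for the whole (infinite) family.
for n in (10, 100, 1000):
    print(n, mass_gap(toy_family(n)))
```

Every finite selection has some smallest positive energy. The question of a mass gap is whether that smallest energy stays bounded away from zero for the full set.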

            Why do you necessarily get a mass gap if you quantize starting with the free particle propagator? The members of the discrete set of classical solutions have to be very close to each other in order to have no mass gap, but if they are very close, then the difference between two quantum solutions is essentially the difference of two solutions to Dirac’s equation, and those solutions are solutions to a harmonic oscillator and cannot be very close. What you would have to do is to insert a parameter into the free propagator and let this parameter get very small when the classical solutions in your discrete set come very close to each other. But this is not what physicists like to do. They take the free propagator without any weight parameter and they add the external field also without any weight parameter. Doing so, it is not possible to get quantum solutions that come very close to each other. The only way to get a situation where there is no mass gap would be to find a sequence of solutions that never get close to each other but give an energy that tends to zero. As the solutions are linear combinations of eigenfunctions, this seems to imply that the dimension of the solutions as a vector space must be infinite. But it is not infinite in these theories, so you will get the mass gap when proceeding this way.

Thus, obviously I could not first quantize the field and then try to produce my solutions to the Yang-Mills equations. My solutions are solutions to the classical Yang-Mills equations, and they are not solutions if you add a free propagator. Furthermore, adding the free propagator gives solutions similar to those of the harmonic oscillator (as that is what Dirac’s equation gives) and you will get a mass gap. But this result would prove nothing. You are required to prove that there is a mass gap for any possible quantization fulfilling the given axioms. There must be an infinite number of possible ways of fulfilling those axioms, so you cannot first quantize. I found one way to quantize so that there is no mass gap, and this is the easy way to disprove the claim. But I did it this way only because the Clay problem was posed this way. If your problem is to quantize your gravitation theory, then use one of the two ways of quantization that physicists approve for a physical theory: canonical quantization or functional integral quantization. Both lead to about the same result. Canonical quantization is the one where you start from the quantum mechanical Schrödinger or Heisenberg equation and then change the wavefunctions to field operators consisting of creation and annihilation operators. For the path (or functional) integral quantization, read the first ten pages of Bailin and Love, Introduction to Gauge Field Theory; it has a charming, completely intuitive derivation of the path integral method from the transition probability. In path integral quantization the fields are classical functions (this holds for boson fields; fermion fields are spinors, i.e., elements of a Grassmann algebra, not complex-valued functions), but there is a functional integration, so that the results you get are transition probabilities. You cannot get results as functions with deterministic values from that quantization.

            Well, that’s about all. Hope this helps somebody to quantize a gravitation theory.
