Hamiltonian systems with constraints

Audience: graduate students with prior exposure to Lagrangian/Hamiltonian mechanics and basic field theory.
Goal: learn the Dirac–Bergmann treatment of constrained Hamiltonian systems and see it in action for Maxwell theory, the Proca (massive vector) field, and general relativity (ADM).


In ordinary (regular) Lagrangian mechanics, the Legendre map

$$(q^a,\dot q^a)\mapsto (q^a,p_a),\qquad p_a=\frac{\partial L}{\partial\dot q^a}$$

is invertible because the Hessian

$$W_{ab}(q,\dot q)=\frac{\partial^2L}{\partial\dot q^a\,\partial\dot q^b}$$

has $\det W\neq 0$. One can solve $\dot q=\dot q(q,p)$ and define the Hamiltonian $H=p_a\dot q^a-L$.

A constrained system arises when the Hessian is singular: $\det W=0$. Then the momenta are not independent; they satisfy relations

$$\phi_\alpha(q,p)\approx 0,$$

called primary constraints. The symbol $\approx$ (“weak equality”) means the relation holds only on the constraint surface in phase space, so it must not be imposed inside a Poisson bracket: evaluate the bracket first, then apply the constraint.

A helpful geometric picture is this: the equations $\phi_\alpha=0$ define a submanifold of phase space, the constraint surface. A weak equality is an equality only after restricting to this submanifold. For example, even if $\phi\approx0$, the bracket $\{F,\phi\}$ may be nonzero and can generate a physically meaningful motion along or away from the constraint surface. This is why one computes Poisson brackets first and imposes constraints afterward.

In gauge theories the singularity is not an accident: it reflects redundancy in the description (gauge symmetry). In massive theories (e.g. Proca) singularity can also occur because some variables are nondynamical multipliers even without gauge symmetry.


2. Dirac–Bergmann algorithm in a nutshell

For canonical fields $(q^a(\mathbf{x}),p_a(\mathbf{x}))$,

$$\{q^a(\mathbf{x}),p_b(\mathbf{y})\}=\delta^a{}_b\,\delta^{(3)}(\mathbf{x}-\mathbf{y}), \qquad \{q^a,q^b\}=\{p_a,p_b\}=0.$$

For general functionals $F[q,p]$ and $G[q,p]$, this means

$$\{F,G\} = \int d^3x\left( \frac{\delta F}{\delta q^a(\mathbf{x})}\frac{\delta G}{\delta p_a(\mathbf{x})} - \frac{\delta F}{\delta p_a(\mathbf{x})}\frac{\delta G}{\delta q^a(\mathbf{x})} \right).$$

This formula is often safer than manipulating unsmeared delta functions directly.

Given a Lagrangian density $\mathcal{L}(q,\dot q,\nabla q)$:

  1. Define canonical momenta

    $$p_a=\frac{\partial\mathcal{L}}{\partial \dot q^a}.$$

    Relations among $(q,p)$ that follow from this definition without determining any velocities are primary constraints $\phi_\alpha\approx 0$.

  2. Canonical Hamiltonian (Legendre transform where possible)

    $$\mathcal{H}_c = p_a\dot q^a - \mathcal{L}.$$

  3. Total Hamiltonian

    $$H_T = \int d^3x\left(\mathcal{H}_c + u^\alpha(\mathbf{x})\,\phi_\alpha(\mathbf{x})\right),$$

    where $u^\alpha$ are Lagrange multipliers enforcing the primary constraints.

  4. Consistency conditions. Require that the constraints be preserved under time evolution:

    $$\dot\phi_\alpha(\mathbf{x})=\{\phi_\alpha(\mathbf{x}),H_T\}\approx 0.$$

    This may:

    • produce secondary constraints, tertiary, etc.; and/or
    • fix some multipliers $u^\alpha$.

Continue until no new constraints appear (closure).

A practical checklist is:

  • write all primary constraints before solving for any remaining velocities;
  • build $H_T$, not only $H_c$;
  • impose $\dot\phi\approx0$ for every new constraint;
  • separate equations that produce new constraints from equations that determine multipliers;
  • stop only when every constraint is preserved and every relevant multiplier is either fixed or remains arbitrary because of gauge freedom.

Let $\{\Phi_A\}$ be the complete set of constraints.

  • First-class constraint: $\{\Phi_A,\Phi_B\}\approx 0$ for all $B$.
    These generate gauge transformations (redundancies).

  • Second-class constraint: the matrix (kernel)

    $$C_{AB}(\mathbf{x},\mathbf{y})=\{\Phi_A(\mathbf{x}),\Phi_B(\mathbf{y})\}$$

    is (functionally) invertible on the constraint surface.
    These do not generate gauge transformations; they simply remove phase-space directions.

For many gauge systems, an individual first-class constraint is best understood as one member of a gauge-generator chain. The full gauge transformation is generated by a tuned combination of primary, secondary, and sometimes higher-stage constraints, with time derivatives of the gauge parameter. Maxwell theory in §3.6 is the simplest example.

A common but slightly dangerous slogan is:

First-class constraints leave Lagrange multipliers arbitrary, whereas second-class constraints determine them.

The correct version is a little more precise.

The total Hamiltonian contains arbitrary multipliers only for primary constraints,

$$H_T=H_c+\int d^3x\,u^\alpha\phi_\alpha,$$

where the $\phi_\alpha$ are primary. Once the full set of constraints $\Phi_I$ has been found, consistency requires

$$0\approx \dot\Phi_I = \{\Phi_I,H_c\} + \int d^3y\,u^\alpha(\mathbf y) \{\Phi_I,\phi_\alpha(\mathbf y)\}.$$

These equations can do three things: generate new constraints, determine some combinations of the multipliers $u^\alpha$, or impose no new condition.

The practical rule is:

  • A multiplier along a primary first-class direction remains arbitrary before gauge fixing. This arbitrariness is the Hamiltonian form of gauge freedom.
  • A multiplier along a primary second-class direction is fixed by consistency, provided the constraint-bracket matrix has the appropriate constant rank and inverse on the second-class sector.
  • After imposing gauge-fixing conditions, first-class constraints are converted into second-class pairs, and the formerly arbitrary multipliers are fixed by preserving the gauge conditions.

Two caveats are worth keeping in mind. First, secondary constraints do not come with independent multipliers in the ordinary total Hamiltonian, although one sometimes introduces an extended Hamiltonian for formal purposes. Second, words like “always” assume a regular system: no changing-rank constraint matrix, no unresolved boundary zero modes, and no hidden reducibility relations.

Maxwell and Proca illustrate the distinction sharply. In Maxwell theory, the multiplier $u$ in front of $\pi^0$ remains arbitrary until a gauge is chosen. In Proca theory, the same-looking primary constraint $\pi^0\approx0$ belongs to a second-class pair, and consistency fixes $u$.

2.5 Dirac bracket (for second-class constraints)

If $\chi_A\approx 0$ are second-class and $C_{AB}$ is invertible with inverse $C^{AB}$,

$$\{F,G\}_D = \{F,G\} - \int d^3x\,d^3y\; \{F,\chi_A(\mathbf{x})\}\,C^{AB}(\mathbf{x},\mathbf{y})\,\{\chi_B(\mathbf{y}),G\}.$$

Then $\{F,\chi_A\}_D=0$ for all $F$, so you may set $\chi_A=0$ strongly after switching to Dirac brackets.

Here $C^{AB}$ is an inverse kernel, meaning

$$\int d^3z\,C_{AC}(\mathbf{x},\mathbf{z})C^{CB}(\mathbf{z},\mathbf{y}) = \delta_A{}^B\delta^{(3)}(\mathbf{x}-\mathbf{y}).$$

If the inverse requires solving an elliptic equation, as in Coulomb gauge, one must specify boundary conditions. Zero modes of the operator are not harmless bookkeeping details: they may represent residual gauge transformations or global charges.
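The Dirac bracket is easiest to test in a finite-dimensional analogue, where the kernel $C^{AB}$ is an ordinary matrix. The sketch below is an illustration under stated assumptions, not part of the field-theory construction. It assumes a four-dimensional phase space $(q_1,q_2,p_1,p_2)$, restricts to linear functions (each represented by its constant gradient vector), and handles exactly two second-class constraints. The function names `poisson` and `dirac` are hypothetical.

```python
def poisson(f, g):
    """Poisson bracket of two linear phase-space functions.

    Each function is given by its gradient (dF/dq_1, ..., dF/dq_n, dF/dp_1, ..., dF/dp_n).
    """
    n = len(f) // 2
    return sum(f[a] * g[n + a] - f[n + a] * g[a] for a in range(n))


def dirac(f, g, constraints):
    """Dirac bracket for exactly two second-class constraints (linear functions)."""
    chi = list(constraints)
    c12 = poisson(chi[0], chi[1])
    if c12 == 0:
        raise ValueError("the two constraints commute, so they are not second-class")
    # C = [[0, c12], [-c12, 0]] has inverse [[0, -1/c12], [1/c12, 0]].
    inverse = [[0.0, -1.0 / c12], [1.0 / c12, 0.0]]
    correction = sum(
        poisson(f, chi[a]) * inverse[a][b] * poisson(chi[b], g)
        for a in range(2)
        for b in range(2)
    )
    return poisson(f, g) - correction


# Coordinate functions as gradient vectors in the order (q1, q2, p1, p2).
Q1, Q2 = [1, 0, 0, 0], [0, 1, 0, 0]
P1, P2 = [0, 0, 1, 0], [0, 0, 0, 1]

if __name__ == "__main__":
    pair = (Q2, P2)  # chi_1 = q2, chi_2 = p2 remove the second canonical pair
    print(dirac(Q1, P1, pair), dirac(Q2, P2, pair))
```

With the pair $(q_2,p_2)$ imposed as constraints, the bracket of the surviving pair $(q_1,p_1)$ is unchanged, while every function has vanishing Dirac bracket with both constraints. This is the finite-dimensional counterpart of setting $\chi_A=0$ strongly.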

Let $N$ be the number of configuration fields per space point (so phase space has dimension $2N$). Let $N_1$ be the number of first-class constraints and $N_2$ the number of second-class constraints (per point, in an appropriate local sense). Then

$$N_{\text{phys}} = N - N_1 - \frac12 N_2.$$

Equivalently, in phase space:

$$\dim \Gamma_{\text{phys}} = 2N - 2N_1 - N_2.$$
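As a quick sanity check, the counting rule can be written as a small helper. This is a minimal sketch: the function names are hypothetical, and the inputs are the per-point counts $N$, $N_1$, $N_2$ used on this page for Maxwell theory, the Proca field, and the two-variable toy model.

```python
from fractions import Fraction


def physical_dof(n_fields, n_first_class, n_second_class):
    """Configuration-space degrees of freedom: N - N1 - N2/2."""
    dof = Fraction(n_fields) - n_first_class - Fraction(n_second_class, 2)
    if dof < 0 or dof.denominator != 1:
        raise ValueError("inconsistent constraint count")
    return int(dof)


def reduced_phase_space_dim(n_fields, n_first_class, n_second_class):
    """Dimension of the reduced phase space: 2N - 2*N1 - N2."""
    return 2 * n_fields - 2 * n_first_class - n_second_class


if __name__ == "__main__":
    print("Maxwell:", physical_dof(4, 2, 0))    # A_mu with pi^0 and Gauss's law
    print("Proca:", physical_dof(4, 0, 2))      # A_mu with one second-class pair
    print("Toy model:", physical_dof(2, 2, 0))  # (x, y) with p_y and p_x
```

Each first-class constraint removes two phase-space dimensions (the constraint itself plus the gauge direction it generates), whereas each second-class constraint removes only one.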

Before the field-theory examples, it is useful to see the algorithm without delta functions. Consider

$$L=\frac12(\dot x-y)^2,$$

with configuration variables $(x,y)$. The momenta are

$$p_x=\dot x-y, \qquad p_y=0.$$

Thus

$$\phi_1\equiv p_y\approx0$$

is a primary constraint. Solving $\dot x=p_x+y$, the canonical Hamiltonian is

$$H_c=p_x\dot x-L=\frac12p_x^2+yp_x.$$

The total Hamiltonian is $H_T=H_c+u p_y$. Consistency of the primary constraint gives

$$\dot p_y=\{p_y,H_T\}=-\frac{\partial H_T}{\partial y}=-p_x\approx0,$$

so there is a secondary constraint

$$\phi_2\equiv p_x\approx0.$$

The two constraints have vanishing Poisson bracket,

$$\{p_y,p_x\}=0,$$

so they are first-class. The arbitrary multiplier $u$ reflects the gauge freedom

$$\delta x=\epsilon(t), \qquad \delta y=\dot\epsilon(t),$$

under which $\dot x-y$ is invariant. This toy model mirrors Maxwell theory: one primary constraint plus one secondary constraint together represent one gauge function.
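The toy model is small enough to check numerically. The sketch below evaluates Hamilton's equations for $H_T$ with central finite differences and verifies that the Lagrangian is unchanged by the gauge shift. All function names and the step size `h` are illustrative choices, not part of the original notes.

```python
def lagrangian(xdot, y):
    """L = (1/2) (xdot - y)^2."""
    return 0.5 * (xdot - y) ** 2


def total_hamiltonian(x, y, px, py, u):
    """H_T = (1/2) p_x^2 + y p_x + u p_y."""
    return 0.5 * px ** 2 + y * px + u * py


def py_dot(x, y, px, py, u, h=1e-6):
    """dot p_y = -dH_T/dy, computed by a central finite difference."""
    up = total_hamiltonian(x, y + h, px, py, u)
    down = total_hamiltonian(x, y - h, px, py, u)
    return -(up - down) / (2 * h)


def x_dot(x, y, px, py, u, h=1e-6):
    """dot x = dH_T/dp_x, computed by a central finite difference."""
    up = total_hamiltonian(x, y, px + h, py, u)
    down = total_hamiltonian(x, y, px - h, py, u)
    return (up - down) / (2 * h)


def gauge_transform(xdot, y, eps_dot):
    """delta x = eps(t) shifts xdot by eps_dot, and delta y = eps_dot."""
    return xdot + eps_dot, y + eps_dot


if __name__ == "__main__":
    print(py_dot(0.2, 0.4, 1.5, 0.0, 3.0))    # close to -p_x
    print(x_dot(0.2, 0.4, 1.5, 0.0, 3.0))     # close to p_x + y
    print(lagrangian(*gauge_transform(1.3, 0.4, 2.7)), lagrangian(1.3, 0.4))
```

The first output reproduces $\dot p_y=-p_x$, which is why $p_x\approx0$ appears as a secondary constraint. The last line shows that $L$ takes the same value before and after the gauge shift.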


3. Worked example I: Maxwell theory (massless spin-1)

Take vacuum Maxwell theory in flat space:

$$\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu}, \qquad F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu.$$

With our signature $(-+++)$,

$$F_{\mu\nu}F^{\mu\nu}=-2F_{0i}F_{0i}+F_{ij}F_{ij}.$$

Define

$$F_{0i}=\dot A_i-\partial_iA_0, \qquad B^k=\frac12\epsilon^{kij}F_{ij}, \qquad F_{ij}F_{ij}=2\mathbf{B}^2.$$

Then

$$\mathcal{L} = \frac12(\dot A_i-\partial_iA_0)^2-\frac12\mathbf{B}^2 = \frac12\mathbf{E}^2-\frac12\mathbf{B}^2, \quad E_i\equiv F_{0i}.$$

3.2 Canonical momenta and primary constraint

Define

$$\pi^\mu=\frac{\partial\mathcal{L}}{\partial \dot A_\mu}.$$

There is no $\dot A_0$ in $\mathcal{L}$, hence

$$\pi^0 = 0 \quad\Rightarrow\quad \phi_1(\mathbf{x})\equiv \pi^0(\mathbf{x})\approx 0 \qquad\text{(primary constraint).}$$

For $i=1,2,3$,

$$\pi^i=\frac{\partial\mathcal{L}}{\partial \dot A_i} =\dot A_i-\partial_iA_0 =F_{0i} =E_i.$$

Thus $\pi^i$ is the electric field.

Compute the Hamiltonian density

$$\mathcal{H}_c=\pi^i\dot A_i-\mathcal{L}.$$

Solve $\dot A_i=\pi^i+\partial_iA_0$ and write $\pi^2\equiv\pi^i\pi^i$:

$$\pi^i\dot A_i = \pi^2+\pi^i\partial_iA_0.$$

Since $\mathcal{L}=\tfrac12\pi^2-\tfrac12\mathbf{B}^2$,

$$\mathcal{H}_c = \frac12\pi^2+\frac12\mathbf{B}^2+\pi^i\partial_iA_0.$$

Integrating by parts (dropping boundary terms),

$$\int d^3x\,\pi^i\partial_iA_0 = -\int d^3x\,A_0\,\partial_i\pi^i,$$

so the canonical Hamiltonian is

$$H_c = \int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) - A_0\,\partial_i\pi^i \right].$$

3.4 Total Hamiltonian and Gauss constraint

Add the primary constraint with multiplier $u(\mathbf{x})$:

$$H_T=H_c+\int d^3x\,u(\mathbf{x})\,\pi^0(\mathbf{x}).$$

Preserve $\pi^0\approx 0$:

$$\dot\pi^0(\mathbf{x}) = \{\pi^0(\mathbf{x}),H_T\} = -\frac{\delta H_T}{\delta A_0(\mathbf{x})} = \partial_i\pi^i(\mathbf{x}) \approx 0.$$

This yields the secondary constraint

$$\phi_2(\mathbf{x})\equiv \partial_i\pi^i(\mathbf{x})\approx 0,$$

i.e. Gauss’s law in vacuum.

No further constraints arise; the multiplier $u(\mathbf{x})$ remains undetermined.

To see explicitly why the algorithm closes, use the Hamilton equation

$$\dot\pi^i=-\frac{\delta H_T}{\delta A_i}=\partial_jF_{ji}.$$

Then

$$\dot\phi_2=\partial_i\dot\pi^i=\partial_i\partial_jF_{ji}=0,$$

because $F_{ji}$ is antisymmetric while $\partial_i\partial_j$ is symmetric. Thus Gauss’s law is automatically preserved.

3.5 Constraint algebra and first-class nature

Using canonical brackets,

$$\{\pi^0(\mathbf{x}),\phi_2(\mathbf{y})\}=0, \qquad \{\phi_2(\mathbf{x}),\phi_2(\mathbf{y})\}=0.$$

Thus $\phi_1$ and $\phi_2$ are first-class.

The basic Hamilton equations are also instructive:

$$\dot A_i=\{A_i,H_T\}=\pi^i+\partial_iA_0, \qquad \dot A_0=\{A_0,H_T\}=u.$$

The first equation is just the definition $\pi^i=\dot A_i-\partial_iA_0$. The second says that the time evolution of $A_0$ is arbitrary; this is the Hamiltonian trace of gauge freedom.

3.6 Gauge generator: one gauge function, two constraints

A Maxwell gauge transformation is controlled by one spacetime function $\epsilon(t,\mathbf{x})$:

$$A_\mu\mapsto A_\mu-\partial_\mu\epsilon.$$

Thus

$$\delta A_0=-\dot\epsilon, \qquad \delta A_i=-\partial_i\epsilon.$$

The appropriate Hamiltonian generator can be written, using Castellani’s algorithm, as

$$G[\epsilon] = \int d^3x\Big(-\dot\epsilon(\mathbf{x})\,\pi^0(\mathbf{x})+\epsilon(\mathbf{x})\,\phi_2(\mathbf{x})\Big), \qquad \phi_2=\partial_i\pi^i.$$

Then

$$\delta A_0=\{A_0,G\}=-\dot\epsilon, \qquad \delta A_i=\{A_i,G\}=-\partial_i\epsilon,$$

so the canonical generator reproduces the spacetime $U(1)$ gauge symmetry.

The important point is that $\pi^0$ and $\partial_i\pi^i$ are not associated with two independent gauge functions. They are the two members of one primary-secondary chain:

$$\pi^0 \quad\longrightarrow\quad \partial_i\pi^i.$$

The arrow means that preserving the primary constraint produces the secondary constraint:

$$\dot\pi^0\approx0 \quad\Longrightarrow\quad \partial_i\pi^i\approx0.$$

If one tried to use only Gauss’s law,

$$G_{\rm Gauss}[\epsilon]=\int d^3x\,\epsilon\,\partial_i\pi^i,$$

then

$$\delta A_i=-\partial_i\epsilon, \qquad \delta A_0=0.$$

This is not the full gauge transformation for time-dependent $\epsilon$. The primary term $-\dot\epsilon\,\pi^0$ is needed to transform $A_0$ correctly.
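For readers who want the intermediate steps, the two brackets with $G[\epsilon]$ can be worked out as follows. This is a sketch that uses only the canonical brackets and one integration by parts, with boundary terms dropped as elsewhere on this page.

```latex
\begin{aligned}
\{A_i(\mathbf{x}),G[\epsilon]\}
 &= \int d^3y\,\epsilon(\mathbf{y})\,\partial^{\mathbf{y}}_j
    \{A_i(\mathbf{x}),\pi^j(\mathbf{y})\}
  = \int d^3y\,\epsilon(\mathbf{y})\,\partial^{\mathbf{y}}_i
    \delta^{(3)}(\mathbf{x}-\mathbf{y}) \\
 &= -\int d^3y\,\partial_i\epsilon(\mathbf{y})\,
    \delta^{(3)}(\mathbf{x}-\mathbf{y})
  = -\partial_i\epsilon(\mathbf{x}), \\[4pt]
\{A_0(\mathbf{x}),G[\epsilon]\}
 &= -\int d^3y\,\dot\epsilon(\mathbf{y})\,
    \{A_0(\mathbf{x}),\pi^0(\mathbf{y})\}
  = -\dot\epsilon(\mathbf{x}).
\end{aligned}
```

The Gauss-law term acts only on $A_i$ and the $\pi^0$ term acts only on $A_0$, so both members of the chain are needed to reproduce $\delta A_\mu=-\partial_\mu\epsilon$.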

3.7 Degrees of freedom and the need for gauge fixing

We have $N=4$ configuration fields $A_\mu$, so the full phase space

$$(A_0,A_i;\pi^0,\pi^i)$$

has dimension $8$ per spatial point. There are two first-class constraints and no second-class constraints:

$$\pi^0\approx0, \qquad \partial_i\pi^i\approx0.$$

Therefore

$$\dim\Gamma_{\rm phys}=8-2\times2=4, \qquad N_{\text{phys}}=2,$$

the two transverse photon polarizations.

A complete canonical gauge fixing of the full phase space usually adds two equal-time gauge conditions, for example

$$\chi_0=A_0\approx0, \qquad \chi_C=\partial_iA_i\approx0.$$

Together,

$$\pi^0, \quad \partial_i\pi^i, \quad A_0, \quad \partial_iA_i$$

form a second-class set, assuming the relevant Laplacian has no problematic zero modes after boundary conditions are specified. The pair $(A_0,\pi^0)$ removes the nondynamical scalar-potential sector, while $(\partial_iA_i,\partial_i\pi^i)$ removes the longitudinal spatial sector.

If one treats $A_0$ as a Lagrange multiplier from the beginning and works only with $(A_i,\pi^i)$, then there is only one first-class constraint, $\partial_i\pi^i\approx0$, and one spatial gauge condition such as $\partial_iA_i\approx0$ is enough. This is why different books sometimes appear to count gauge conditions differently.

3.8 Optional: explicit reduction in Coulomb gauge

The nontrivial part of the canonical reduction is the spatial longitudinal sector. To see the “constraint + gauge” removal explicitly, decompose

$$A_i = A_i^T + \partial_i\alpha, \qquad \partial_iA_i^T=0,$$

$$\pi^i = \pi_T^i + \partial^i\beta, \qquad \partial_i\pi_T^i=0.$$

Then Gauss’s constraint is $\partial_i\pi^i=\nabla^2\beta\approx 0$, which removes the longitudinal momentum $\beta$ (up to boundary conditions). The longitudinal coordinate $\alpha$ is removed by gauge.

If you impose Coulomb gauge $\chi\equiv \partial_iA_i\approx 0$, then $(\chi,\phi_2)$ form a second-class pair:

$$\{\chi(\mathbf{x}),\phi_2(\mathbf{y})\} = \{\partial_iA_i(\mathbf{x}),\partial_j\pi^j(\mathbf{y})\} = -\nabla^2\delta^{(3)}(\mathbf{x}-\mathbf{y}),$$

which is invertible (as an operator) after specifying boundary conditions. The reduced Hamiltonian becomes a Hamiltonian for the transverse fields only:

$$H_{\text{red}}=\int d^3x\,\frac12\big(\pi_T^2+\mathbf{B}^2\big).$$

With this gauge choice, the Dirac bracket of the unreduced variables projects onto the transverse part:

$$\{A_i(\mathbf{x}),\pi^j(\mathbf{y})\}_D = \left(\delta_i{}^j-\frac{\partial_i\partial^j}{\nabla^2}\right) \delta^{(3)}(\mathbf{x}-\mathbf{y}) \equiv P_i{}^j{}_{T}\,\delta^{(3)}(\mathbf{x}-\mathbf{y}),$$

where $P_T$ is the transverse projector. This formula makes the two physical photon polarizations visible directly in the bracket.
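In Fourier space the nonlocal operator $\partial_i\partial^j/\nabla^2$ becomes the algebraic expression $k_ik_j/|\mathbf{k}|^2$, so the projector properties can be checked with plain arithmetic. The sketch below assumes a single nonzero wave vector and uses hypothetical helper names; it illustrates the projector only and does not implement the full Dirac bracket.

```python
def transverse_projector(k):
    """Return P_ij = delta_ij - k_i k_j / |k|^2 for a nonzero wave vector k."""
    k2 = sum(c * c for c in k)
    if k2 == 0:
        # k = 0 is the zero mode of the Laplacian; the projector is undefined there.
        raise ValueError("transverse projector is undefined at k = 0")
    return [[(1.0 if i == j else 0.0) - k[i] * k[j] / k2 for j in range(3)]
            for i in range(3)]


def apply(matrix, vector):
    """Matrix-vector product for 3x3 nested lists."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def matmul(a, b):
    """Matrix-matrix product for 3x3 nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(3)) for j in range(3)]
            for i in range(3)]


def trace(matrix):
    return sum(matrix[i][i] for i in range(3))


if __name__ == "__main__":
    k = [1.0, 2.0, -0.5]
    P = transverse_projector(k)
    print(apply(P, k))   # longitudinal direction is annihilated (up to rounding)
    print(trace(P))      # number of transverse directions
```

The trace counts the surviving directions: three spatial components minus one longitudinal direction leaves two, matching the two photon polarizations. The `ValueError` at $\mathbf{k}=0$ is the Fourier-space face of the zero-mode caveat about inverting $\nabla^2$.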

3.9 Lorenz gauge vs canonical gauge conditions

In covariant field theory one often says, “impose the Lorenz gauge to eliminate gauge freedom.” The spelling is Lorenz, after Ludvig Lorenz; “Lorentz gauge” is a common misnomer.

The Lorenz condition is one spacetime differential condition,

$$\partial^\mu A_\mu=0.$$

With the conventions of this page,

$$\partial^\mu A_\mu=-\dot A_0+\partial_iA_i,$$

so Lorenz gauge says

$$\dot A_0=\partial_iA_i.$$

But in the Hamiltonian formulation,

$$\dot A_0=\{A_0,H_T\}=u,$$

where $u$ is the arbitrary multiplier of the primary constraint $\pi^0\approx0$. Therefore Lorenz gauge is, canonically,

$$-u+\partial_iA_i=0.$$

It is not simply one equal-time condition on the phase-space variables $(A_\mu,\pi^\mu)$; it is a velocity-dependent condition, or equivalently a condition that fixes the multiplier $u$.

This reconciles the two common statements:

  • In the full canonical phase space, complete gauge fixing is often represented by two equal-time conditions, e.g. $A_0\approx0$ and $\partial_iA_i\approx0$.
  • In the covariant Lagrangian formulation, Lorenz gauge is one spacetime condition imposed on entire histories $A_\mu(t,\mathbf x)$.

These are different gauge choices. The temporal-plus-Coulomb choice is noncovariant and stronger than Lorenz gauge. Lorenz gauge is covariant, but it leaves residual transformations

$$A_\mu\mapsto A_\mu+\partial_\mu\alpha$$

with

$$\Box\alpha=0,$$

because

$$\partial^\mu A_\mu\mapsto \partial^\mu A_\mu+\Box\alpha.$$

Thus Lorenz gauge fixes the gauge only after one also specifies boundary conditions or removes zero modes of $\Box$.

For a plane wave $A_\mu=a_\mu e^{ik\cdot x}$ with $k^2=0$, Lorenz gauge gives

$$k^\mu a_\mu=0,$$

which reduces four components to three. A residual transformation $\alpha=\alpha_0e^{ik\cdot x}$ shifts

$$a_\mu\mapsto a_\mu+i\alpha_0 k_\mu,$$

and removes one more unphysical component. Thus

$$4-1\ \text{Lorenz condition}-1\ \text{residual gauge freedom}=2$$

physical polarizations, in agreement with the Hamiltonian count.
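The plane-wave count can be made concrete for a wave travelling along the $z$ axis. The sketch below assumes the signature $(-+++)$ of this page, a null wave vector $k^\mu=(\omega,0,0,\omega)$, and hypothetical function names. It uses the residual shift $a_\mu\mapsto a_\mu+i\alpha_0k_\mu$ to set $a_0=0$ and then inspects what is left.

```python
def lower_index(k_upper):
    """Lower a spacetime index with the metric diag(-1, +1, +1, +1)."""
    return [-k_upper[0]] + list(k_upper[1:])


def lorenz_contraction(k_upper, a_lower):
    """Return k^mu a_mu, which vanishes in Lorenz gauge."""
    return sum(ku * al for ku, al in zip(k_upper, a_lower))


def remove_temporal_component(a_lower, k_upper):
    """Apply a_mu -> a_mu + i alpha0 k_mu with alpha0 chosen so that a_0 -> 0."""
    k_lower = lower_index(k_upper)
    if k_lower[0] == 0:
        raise ValueError("need k_0 != 0 to remove a_0 this way")
    alpha0 = -a_lower[0] / (1j * k_lower[0])
    return [al + 1j * alpha0 * kl for al, kl in zip(a_lower, k_lower)]


if __name__ == "__main__":
    omega = 2.0
    k = [omega, 0.0, 0.0, omega]            # null vector: k^mu k_mu = 0
    a = [0.7, 0.3 + 0.1j, -0.2, -0.7]       # chosen so that k^mu a_mu = 0
    print(lorenz_contraction(k, a))
    print(remove_temporal_component(a, k))  # only a_1 and a_2 remain nonzero
```

Because $k^\mu k_\mu=0$, the shift preserves the Lorenz condition, and for this wave the same shift that removes $a_0$ also removes $a_3$. Only the two transverse amplitudes $a_1$, $a_2$ survive.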

3.10 Maxwell path integral: where did the two canonical gauge conditions go?

The naive Lagrangian path integral

$$Z=\int \mathcal DA\,e^{iS[A]}, \qquad S[A]=- \frac14\int d^4x\,F_{\mu\nu}F^{\mu\nu},$$

overcounts gauge-equivalent configurations and has a noninvertible kinetic operator. The Faddeev–Popov procedure chooses one spacetime gauge condition, for instance

$$F[A]=\partial^\mu A_\mu-\omega(x)=0,$$

and inserts

$$1=\Delta_{\rm FP}[A]\int\mathcal D\alpha\, \delta\!\left(F[A^\alpha]\right), \qquad A_\mu^\alpha=A_\mu+\partial_\mu\alpha.$$

For Lorenz gauge,

$$\frac{\delta F[A^\alpha]}{\delta\alpha}=\Box,$$

so

$$\Delta_{\rm FP}=\det\Box$$

(up to an irrelevant sign convention, often written as $\det(-\partial^2)$). In abelian Maxwell theory this determinant is independent of $A_\mu$, so ghosts decouple.

Averaging over $\omega$ with a Gaussian weight gives the familiar covariant $\xi$-gauge action,

$$\mathcal L_{\rm eff} = -\frac14F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}\left(\partial^\mu A_\mu\right)^2.$$

For finite $\xi$, this does not impose $\partial^\mu A_\mu=0$ sharply; it weights violations of the gauge condition. The sharp Landau-gauge limit is $\xi\to0$.

The apparent mismatch with the Hamiltonian statement is resolved as follows. The Hamiltonian, phase-space version of a completely gauge-fixed path integral has the schematic Faddeev–Senjanovic form

$$Z_\chi = \int\mathcal DA_\mu\,\mathcal D\pi^\mu\; \delta[\pi^0]\, \delta[\partial_i\pi^i]\, \delta[\chi_0]\, \delta[\chi_C]\, \det\{\chi_a,\phi_b\} \exp\left[i\int dt\,d^3x\left(\pi^\mu\dot A_\mu-\mathcal H_c\right)\right].$$

For temporal plus Coulomb gauge,

$$\chi_0=A_0, \qquad \chi_C=\partial_iA_i,$$

one gets

$$\{\chi_a,\phi_b\} \sim \begin{pmatrix} 1&0\\ 0&-\nabla^2 \end{pmatrix}, \qquad \det\{\chi_a,\phi_b\}\sim\det(-\nabla^2).$$

Here the two equal-time gauge conditions are visible explicitly.

By contrast, the covariant Lorenz path integral fixes one spacetime gauge function $\alpha(t,\mathbf x)$ on whole field histories. Its Faddeev–Popov operator $\Box$ is second order in time, so with appropriate boundary conditions it controls both pieces of Cauchy data

$$\alpha(t_0,\mathbf x), \qquad \dot\alpha(t_0,\mathbf x).$$

These are the covariant-history counterpart of the two equal-time canonical gauge directions associated with

$$\pi^0 \quad\text{and}\quad \partial_i\pi^i.$$

Another way to say the same thing: once the covariant gauge-fixing term

$$-\frac{1}{2\xi}(\partial^\mu A_\mu)^2$$

is added, the gauge-fixed Lagrangian contains an $\dot A_0^2$ contribution. The original primary constraint $\pi^0=0$ is no longer a constraint of the gauge-fixed quadratic Lagrangian; the gauge quotient has already been implemented by the Faddeev–Popov construction.


4. Worked example II: Proca (massive spin-1)

With our signature $(-+++)$, a convenient Proca Lagrangian that leads to a positive-energy Hamiltonian is

$$\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu} -\frac12 m^2 A_\mu A^\mu.$$

The field equation is

$$\partial_\mu F^{\mu\nu} - m^2 A^\nu = 0,$$

and taking $\partial_\nu$ gives the on-shell constraint

$$\partial_\nu A^\nu = 0 \quad (m\neq 0).$$

Using $A_\mu A^\mu=-A_0^2+A_i^2$, we get

$$-\frac12 m^2 A_\mu A^\mu = +\frac12 m^2 A_0^2 - \frac12 m^2 A_i^2.$$

Therefore

$$\mathcal{L} = \frac12(\dot A_i-\partial_iA_0)^2 -\frac12\mathbf{B}^2 +\frac12 m^2 A_0^2 -\frac12 m^2 A_i^2.$$

The crucial difference from Maxwell is the $+\frac12m^2A_0^2$ term. The field $A_0$ still has no velocity, but it is no longer a pure Lagrange multiplier: the secondary constraint will solve for $A_0$ rather than impose a gauge-generating Gauss law.

4.3 Canonical momenta and primary constraint

Exactly as in Maxwell,

$$\pi^0=\frac{\partial\mathcal{L}}{\partial\dot A_0}=0 \quad\Rightarrow\quad \phi_1(\mathbf{x})\equiv \pi^0(\mathbf{x})\approx 0,$$

and

$$\pi^i=\frac{\partial\mathcal{L}}{\partial\dot A_i}=\dot A_i-\partial_iA_0.$$

Compute

$$\mathcal{H}_c=\pi^i\dot A_i-\mathcal{L}, \qquad \dot A_i=\pi^i+\partial_iA_0.$$

One finds

$$\mathcal{H}_c = \frac12\pi^2+\frac12\mathbf{B}^2 +\frac12 m^2 A_i^2 +\pi^i\partial_iA_0 -\frac12 m^2 A_0^2.$$

Integrating by parts,

$$H_c=\int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) +\frac12 m^2 A_i^2 - A_0\,\partial_i\pi^i -\frac12 m^2 A_0^2 \right].$$

4.5 Total Hamiltonian and secondary constraint

Total Hamiltonian:

$$H_T = H_c+\int d^3x\,u(\mathbf{x})\,\pi^0(\mathbf{x}).$$

Preserve $\pi^0\approx 0$:

$$\dot\pi^0(\mathbf{x}) = -\frac{\delta H_T}{\delta A_0(\mathbf{x})} = \partial_i\pi^i(\mathbf{x}) + m^2 A_0(\mathbf{x}) \approx 0.$$

So the secondary constraint is

$$\phi_2(\mathbf{x})\equiv \partial_i\pi^i(\mathbf{x})+m^2A_0(\mathbf{x})\approx 0.$$

4.6 Second-class structure (no gauge symmetry)

Compute the constraint bracket:

$$\{\phi_1(\mathbf{x}),\phi_2(\mathbf{y})\} = \{\pi^0(\mathbf{x}),\partial_i\pi^i(\mathbf{y})+m^2A_0(\mathbf{y})\} = -m^2\delta^{(3)}(\mathbf{x}-\mathbf{y})\neq 0.$$

Thus $\phi_1,\phi_2$ are second-class: there is no gauge symmetry. Consistency now fixes the multiplier $u$ rather than leaving it arbitrary.

It is useful to see what “fixes the multiplier” means. Using

$$\dot A_0=u, \qquad \dot\pi^i=\partial_jF_{ji}-m^2A_i,$$

and $\partial_i\partial_jF_{ji}=0$, we get

$$\dot\phi_2 = \partial_i\dot\pi^i+m^2\dot A_0 = -m^2\partial_iA_i+m^2u \approx0.$$

Therefore

$$u\approx\partial_iA_i.$$

Combining this with $\dot A_0=u$ gives the Hamiltonian version of the Proca transversality condition:

$$\partial_\mu A^\mu=-\dot A_0+\partial_iA_i\approx0.$$

Unlike Maxwell theory, no arbitrary gauge function remains.

4.7 Eliminating $A_0$ and the reduced Hamiltonian

The constraint $\phi_2=0$ is algebraic in $A_0$:

$$A_0 = -\frac{1}{m^2}\partial_i\pi^i.$$

Substitute into the Hamiltonian. The $A_0$-dependent terms combine as

$$- A_0\,\partial_i\pi^i - \frac12 m^2 A_0^2 = \frac{1}{2m^2}(\partial_i\pi^i)^2.$$

Hence the reduced Hamiltonian is

$$H_{\text{red}} = \int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) +\frac12 m^2 A_i^2 +\frac{1}{2m^2}(\partial_i\pi^i)^2 \right],$$

which is manifestly bounded below.
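The elimination of $A_0$ is a pointwise algebraic step, so it can be checked with ordinary numbers. In the sketch below `div_pi` stands for the value of $\partial_i\pi^i$ at one point; the function names and sample values are illustrative assumptions.

```python
def a0_terms(a0, div_pi, m):
    """A_0-dependent part of the Proca Hamiltonian density: -A_0 div_pi - (1/2) m^2 A_0^2."""
    return -a0 * div_pi - 0.5 * m ** 2 * a0 ** 2


def solve_a0(div_pi, m):
    """Solve the secondary constraint div_pi + m^2 A_0 = 0 for A_0."""
    if m == 0:
        raise ValueError("the constraint cannot be solved for A_0 when m = 0")
    return -div_pi / m ** 2


def reduced_term(div_pi, m):
    """The term (div_pi)^2 / (2 m^2) appearing in the reduced Hamiltonian."""
    return div_pi ** 2 / (2 * m ** 2)


if __name__ == "__main__":
    for div_pi, m in [(1.5, 2.0), (-3.0, 0.5)]:
        a0 = solve_a0(div_pi, m)
        print(a0_terms(a0, div_pi, m), reduced_term(div_pi, m))
```

The two printed columns agree. The failure of `solve_a0` at `m == 0` mirrors the Maxwell case, where the same equation is Gauss's law and does not determine $A_0$.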

For reference, the second-class matrix is ultralocal:

$$C_{AB}(\mathbf{x},\mathbf{y}) = \begin{pmatrix} 0 & -m^2\\ m^2 & 0 \end{pmatrix} \delta^{(3)}(\mathbf{x}-\mathbf{y}), \qquad (\chi_1,\chi_2)=(\pi^0,\partial_i\pi^i+m^2A_0).$$

Its inverse exists only for $m\neq0$. This is why the massless limit is structurally singular in the constrained-Hamiltonian sense: as $m\to0$, the second-class pair turns into the first-class Maxwell chain.

The spatial canonical bracket is unchanged by the Proca Dirac bracket,

$$\{A_i(\mathbf{x}),\pi^j(\mathbf{y})\}_D = \delta_i{}^j\delta^{(3)}(\mathbf{x}-\mathbf{y}),$$

while $A_0$ is no longer independent; equivalently,

$$\{A_0(\mathbf{x}),A_i(\mathbf{y})\}_D = \frac{1}{m^2}\partial_i^{\mathbf{x}}\delta^{(3)}(\mathbf{x}-\mathbf{y}),$$

consistent with $A_0=-(\partial_i\pi^i)/m^2$.

Here $N=4$, $N_1=0$, $N_2=2$. Thus

$$N_{\text{phys}}=4-\frac12\cdot 2 = 3,$$

corresponding to helicities $-1,0,+1$ of a massive spin-1 particle.

4.9 Optional: Stückelberg trick and “gauge vs second-class”

Introduce a scalar $\varphi$ and replace

$$A_\mu \to A_\mu + \frac{1}{m}\partial_\mu\varphi.$$

Then the Proca mass term becomes gauge invariant under

$$\delta A_\mu=\partial_\mu\lambda, \qquad \delta\varphi = -m\lambda.$$

The theory is now gauge invariant (first-class constraints reappear), but it contains an extra field $\varphi$. After gauge fixing (e.g. $\varphi=0$) you recover Proca with 3 physical DOF. This is a useful conceptual bridge: second-class constraints can be viewed as gauge-fixed first-class systems (under appropriate extensions).
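The invariance can be verified in one line. The following is a short check using only the transformation rules stated above: the shifted combination that enters the mass term does not change, and $F_{\mu\nu}$ is invariant because partial derivatives commute.

```latex
\delta\!\left(A_\mu+\frac{1}{m}\partial_\mu\varphi\right)
 = \partial_\mu\lambda+\frac{1}{m}\partial_\mu(-m\lambda)
 = 0,
\qquad
\delta F_{\mu\nu}
 = \partial_\mu\partial_\nu\lambda-\partial_\nu\partial_\mu\lambda
 = 0 .
```

Setting $\varphi=0$ uses up the freedom in $\lambda$ completely, which is why that gauge returns the original Proca Lagrangian.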


5. General relativity as a constrained Hamiltonian system (ADM)

The Hamiltonian formulation of GR is the prototype of a field theory with:

  • singular Lagrangian (lapse and shift are nondynamical),
  • first-class constraints (encoding diffeomorphism invariance),
  • a nontrivial constraint algebra with structure functions (hypersurface deformation algebra).

We sketch the derivation carefully enough to see where each ingredient comes from.

5.1 Einstein–Hilbert action and 3+1 split

Start from the Einstein–Hilbert action (with cosmological constant $\Lambda$)

$$S = \frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,(R-2\Lambda) + S_{\text{boundary}}.$$

The boundary term (e.g. Gibbons–Hawking–York) ensures a well-posed variational principle when fixing the induced metric on the boundary.

Assume spacetime is foliated by spacelike hypersurfaces $\Sigma_t$, with coordinates $x^i$ on each $\Sigma_t$. The spacetime metric can be written in ADM form:

$$ds^2 = -N^2dt^2 + h_{ij}(dx^i+N^i dt)(dx^j+N^j dt),$$

where:

  • $h_{ij}(t,\mathbf{x})$ is the induced spatial metric on $\Sigma_t$,
  • $N(t,\mathbf{x})$ is the lapse,
  • $N^i(t,\mathbf{x})$ is the shift (with $N_i=h_{ij}N^j$).

A useful identity:

$$\sqrt{-g}=N\sqrt{h},$$

where $h=\det(h_{ij})$.

Geometrically, $N\,dt$ is the proper time separation between neighboring slices along the unit normal, while $N^i\,dt$ tells how the spatial coordinates slide within the next slice. Thus lapse and shift describe how the foliation is threaded through spacetime; they are not local propagating gravitational degrees of freedom.

Define the covariant derivative $D_i$ compatible with $h_{ij}$: $D_k h_{ij}=0$.

The extrinsic curvature of $\Sigma_t$ embedded in spacetime is

$$K_{ij} = \frac{1}{2N}\left(\dot h_{ij}-D_iN_j-D_jN_i\right), \qquad K=h^{ij}K_{ij}.$$

This shows explicitly that $\dot h_{ij}$ appears linearly in $K_{ij}$, while $\dot N$ and $\dot N^i$ do not appear at all.

A standard result of the Gauss–Codazzi decomposition (up to total derivatives absorbed by $S_{\text{boundary}}$) is:

$$\sqrt{-g}\,R = N\sqrt{h}\left({}^{(3)}R + K_{ij}K^{ij} - K^2\right) +\text{(total derivative)}.$$

Therefore, dropping total derivatives already accounted for by boundary terms, the ADM Lagrangian density is

$$\mathcal{L}_{\text{ADM}} = \frac{\sqrt{h}}{16\pi G}\,N\left({}^{(3)}R + K_{ij}K^{ij} - K^2 - 2\Lambda\right).$$

The canonical momentum conjugate to $h_{ij}$ is

$$
\pi^{ij}(\mathbf{x}) = \frac{\partial \mathcal{L}_{\text{ADM}}}{\partial \dot h_{ij}(\mathbf{x})}.
$$

Since $\dot h_{ij}$ enters only through $K_{ij}$, and

$$
\frac{\partial K_{kl}}{\partial \dot h_{ij}}=\frac{1}{2N}\delta^i{}_{(k}\delta^j{}_{l)},
$$

one finds

$$
\pi^{ij} = \frac{\sqrt{h}}{16\pi G}\left(K^{ij}-h^{ij}K\right).
$$

Taking the trace $\pi\equiv h_{ij}\pi^{ij}$ and using $h_{ij}h^{ij}=3$ gives

$$
\pi = -\frac{\sqrt{h}}{16\pi G}\,2K \quad\Rightarrow\quad K = -\frac{8\pi G}{\sqrt{h}}\,\pi.
$$

You can invert to express $K_{ij}$ in terms of $\pi^{ij}$:

$$
K_{ij} = \frac{16\pi G}{\sqrt{h}}\left(\pi_{ij}-\frac12 h_{ij}\pi\right).
$$

The combination

$$
\pi_{ij}\pi^{ij}-\frac12\pi^2
$$

is the inverse-DeWitt-supermetric contraction of the momentum. Its minus sign in the trace direction is the origin of the familiar “conformal factor” indefiniteness of the gravitational kinetic term. This subtlety does not spoil the constraint analysis, but it is one reason canonical GR looks very different from a collection of ordinary positive-energy scalar fields.
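The momentum map and its inverse can be checked numerically at a single point. The sketch below is only a sanity check under stated assumptions: units with $16\pi G=1$, an arbitrary positive-definite sample metric, an arbitrary symmetric sample $K_{ij}$, and hypothetical helper names (`det3`, `inv3`, `move`, `contract`). It verifies the trace relation, the inversion formula, and the identity $K_{ij}K^{ij}-K^2=(16\pi G)^2 h^{-1}\left(\pi_{ij}\pi^{ij}-\tfrac12\pi^2\right)$ that follows from them.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via cyclic cofactors (adjugate / determinant)."""
    d = det3(m)
    return [[(m[(j + 1) % 3][(i + 1) % 3] * m[(j + 2) % 3][(i + 2) % 3]
              - m[(j + 1) % 3][(i + 2) % 3] * m[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def move(t, g):
    """Raise (g = inverse metric) or lower (g = metric) both indices of t."""
    return [[sum(g[i][a] * g[j][b] * t[a][b] for a in range(3) for b in range(3))
             for j in range(3)] for i in range(3)]

def contract(a, b):
    """Full contraction a_ij b^ij."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

h = [[2.0, 0.5, 0.0], [0.5, 3.0, 1.0 / 3.0], [0.0, 1.0 / 3.0, 1.0]]  # h_ij (sample)
K = [[0.3, -0.1, 0.2], [-0.1, 0.5, 0.0], [0.2, 0.0, -0.4]]           # K_ij (sample)
h_up = inv3(h)
c = math.sqrt(det3(h))          # sqrt(h)/(16 pi G) in units 16 pi G = 1 (assumption)

trK = contract(K, h_up)
K_up = move(K, h_up)
pi_up = [[c * (K_up[i][j] - h_up[i][j] * trK) for j in range(3)] for i in range(3)]
pi_tr = contract(h, pi_up)      # pi = h_ij pi^ij
pi_low = move(pi_up, h)

# Inversion formula: K_ij = (1/c) (pi_ij - h_ij pi / 2)
K_rec = [[(pi_low[i][j] - 0.5 * h[i][j] * pi_tr) / c for j in range(3)]
         for i in range(3)]
inversion_error = max(abs(K_rec[i][j] - K[i][j]) for i in range(3) for j in range(3))

kinetic_K = contract(K, K_up) - trK ** 2
kinetic_pi = (contract(pi_low, pi_up) - 0.5 * pi_tr ** 2) / c ** 2
print(inversion_error, abs(pi_tr + 2 * c * trK), abs(kinetic_K - kinetic_pi))
```

All three printed numbers should be at the level of floating-point round-off.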

Primary constraints: because $\dot N$ and $\dot N^i$ do not appear in $\mathcal{L}_{\text{ADM}}$,

$$
\pi_N(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial \dot N}=0,\qquad \pi_i(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial \dot N^i}=0.
$$

Thus,

$$
\pi_N\approx 0,\qquad \pi_i\approx 0
$$

are primary constraints.

5.5 Canonical Hamiltonian and the ADM constraints


The canonical Hamiltonian is

$$
H_c=\int d^3x\,\left(\pi^{ij}\dot h_{ij}-\mathcal{L}_{\text{ADM}}\right),
$$

where $\dot h_{ij}$ should be expressed using

$$
\dot h_{ij}=2NK_{ij}+D_iN_j+D_jN_i.
$$

A key computation uses integration by parts:

$$
\int d^3x\,\pi^{ij}(D_iN_j+D_jN_i) = -2\int d^3x\,N_j D_i\pi^{ij}
$$

(up to boundary terms). After rewriting $K_{ij}$ in terms of $\pi^{ij}$, one arrives at the standard ADM form

$$
H_c = \int d^3x\left( N\,\mathcal{H}_\perp + N^i\,\mathcal{H}_i\right) + H_{\partial\Sigma}.
$$

Here $H_{\partial\Sigma}$ is a boundary term (e.g. ADM energy for asymptotically flat spacetimes). The bulk constraint densities are:

Momentum (diffeomorphism) constraint

$$
\boxed{\;\mathcal{H}_i = -2 D_j \pi_i{}^{j}\;}
$$

Hamiltonian (scalar) constraint

$$
\boxed{\; \mathcal{H}_\perp = \frac{16\pi G}{\sqrt{h}}\left(\pi_{ij}\pi^{ij}-\frac12\pi^2\right) -\frac{\sqrt{h}}{16\pi G}\left({}^{(3)}R-2\Lambda\right) \;}
$$

(again, up to convention-dependent signs/factors).
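The step from the Legendre transform to these densities can be spelled out as follows. This is a sketch in the conventions used here, with boundary terms dropped; the shorthand $c\equiv\sqrt{h}/(16\pi G)$ is introduced only for this block.

```latex
\begin{aligned}
\pi^{ij}\dot h_{ij}-\mathcal{L}_{\text{ADM}}
 &= 2N\,\pi^{ij}K_{ij} + 2\pi^{ij}D_iN_j
    - cN\left({}^{(3)}R + K_{ij}K^{ij}-K^2-2\Lambda\right),\\[4pt]
\pi^{ij}K_{ij} &= c\left(K_{ij}K^{ij}-K^2\right),
\qquad
K_{ij}K^{ij}-K^2 = \frac{1}{c^{2}}\left(\pi_{ij}\pi^{ij}-\tfrac12\pi^2\right),\\[4pt]
\Rightarrow\quad
\pi^{ij}\dot h_{ij}-\mathcal{L}_{\text{ADM}}
 &= N\left[\frac{1}{c}\left(\pi_{ij}\pi^{ij}-\tfrac12\pi^2\right)
      - c\left({}^{(3)}R-2\Lambda\right)\right] + 2\pi^{ij}D_iN_j .
\end{aligned}
```

The bracket multiplying $N$ is $\mathcal{H}_\perp$, since $1/c=16\pi G/\sqrt{h}$. Integrating the last term by parts turns $2\pi^{ij}D_iN_j$ into $-2N_jD_i\pi^{ij}=N^i\mathcal{H}_i$ plus a boundary contribution.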

5.6 Total Hamiltonian and secondary constraints


The total Hamiltonian adds the primary constraints:

$$
H_T = H_c + \int d^3x\left(u\,\pi_N + u^i \pi_i\right).
$$

Preserving $\pi_N\approx 0$ gives

$$
\dot\pi_N(\mathbf{x}) = \{\pi_N(\mathbf{x}),H_T\} = -\frac{\delta H_T}{\delta N(\mathbf{x})} = -\mathcal{H}_\perp(\mathbf{x}) \approx 0,
$$

hence the Hamiltonian constraint

$$
\mathcal{H}_\perp(\mathbf{x})\approx 0.
$$

Preserving $\pi_i\approx 0$ gives

$$
\dot\pi_i(\mathbf{x}) = -\frac{\delta H_T}{\delta N^i(\mathbf{x})} = -\mathcal{H}_i(\mathbf{x}) \approx 0,
$$

hence the momentum constraints

$$
\mathcal{H}_i(\mathbf{x})\approx 0.
$$

No new independent constraints appear beyond these (for pure GR): the brackets of $\mathcal{H}_\perp$ and $\mathcal{H}_i$ with $H_T$ vanish weakly by the constraint algebra of §5.7, so consistency imposes no condition on the multipliers $u$ and $u^i$. They remain arbitrary, as expected for a gauge (diffeomorphism-invariant) theory.

5.7 Smeared constraints and the constraint algebra


It is cleaner to use smeared functionals:

$$
\mathcal{H}[N] \equiv \int d^3x\,N(\mathbf{x})\,\mathcal{H}_\perp(\mathbf{x}), \qquad \mathcal{D}[\vec{N}] \equiv \int d^3x\,N^i(\mathbf{x})\,\mathcal{H}_i(\mathbf{x}).
$$

Then (schematically) the Poisson brackets close as:

$$
\{\mathcal{D}[\vec{N}],\mathcal{D}[\vec{M}]\} = \mathcal{D}[\mathcal{L}_{\vec{N}}\vec{M}],
$$

$$
\{\mathcal{D}[\vec{N}],\mathcal{H}[M]\} = \mathcal{H}[\mathcal{L}_{\vec{N}}M],
$$

$$
\{\mathcal{H}[N],\mathcal{H}[M]\} = \mathcal{D}\!\left[h^{ij}(N\partial_j M - M\partial_j N)\right].
$$

This is the hypersurface deformation algebra (often called the “Dirac algebra”). The last bracket is the most characteristic one: its “structure coefficient” is $h^{ij}(\mathbf{x})$, which is itself a phase-space variable. The algebra is therefore not a Lie algebra with constant structure constants. It has structure functions, and it is the algebra of deformations of embedded hypersurfaces.
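A minimal illustration of the structure function, assuming local coordinates $x^i$ on the slice and the sign conventions of the brackets above: take a constant lapse $N=1$ and a linearly tilted lapse $M=x^1$.

```latex
N=1,\quad M=x^1
\;\Longrightarrow\;
N\partial_jM-M\partial_jN=\delta^1_j,
\qquad
\{\mathcal{H}[1],\mathcal{H}[x^1]\}=\mathcal{D}[\vec V],
\quad V^i=h^{i1}(\mathbf{x}).
```

The resulting shift vector is the first column of the inverse spatial metric, so the outcome of commuting two normal deformations depends on the phase-space point at which the bracket is evaluated.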

The closure implies that $\mathcal{H}_\perp$ and $\mathcal{H}_i$ are first-class (together with the primary constraints $\pi_N,\pi_i$).

  • $\mathcal{D}[\vec{N}]$ generates spatial diffeomorphisms on $\Sigma_t$:

    $$
    \{h_{ij},\mathcal{D}[\vec{N}]\}=\mathcal{L}_{\vec{N}}h_{ij}, \qquad \{\pi^{ij},\mathcal{D}[\vec{N}]\}=\mathcal{L}_{\vec{N}}\pi^{ij}.
    $$
  • $\mathcal{H}[N]$ generates normal deformations of the hypersurface (time reparametrizations / refoliations).

A precise mapping between $(\mathcal{H}_\perp,\mathcal{H}_i)$ and spacetime diffeomorphisms requires care, because the algebra closes with structure functions; nonetheless, the standard viewpoint is that these first-class constraints encode the redundancy under diffeomorphisms.

As in Maxwell theory, the complete spacetime-diffeomorphism generator also includes the primary constraints conjugate to lapse and shift. The secondary constraints $\mathcal{H}_\perp$ and $\mathcal{H}_i$ act on the canonical geometry $(h_{ij},\pi^{ij})$, while the primary constraints control how $N$ and $N^i$ transform.

5.9 GR counterpart of “one gauge function, two constraints”


Maxwell theory has one gauge function and one primary-secondary first-class chain:

$$
\epsilon: \qquad \pi^0\longrightarrow \partial_i\pi^i.
$$

The ADM analogue is that spacetime diffeomorphisms have four descriptors, which can be decomposed relative to the foliation into one normal deformation and three tangential deformations,

$$
\xi^\perp(t,\mathbf x), \qquad \xi^i(t,\mathbf x).
$$

In the full ADM phase space, these are associated with four primary-secondary chains:

$$
\xi^\perp: \qquad \pi_N\longrightarrow \mathcal H_\perp,
$$

$$
\xi^i: \qquad \pi_i\longrightarrow \mathcal H_i.
$$

Thus the full set

$$
\pi_N, \quad \pi_i, \quad \mathcal H_\perp, \quad \mathcal H_i
$$

contains eight first-class constraints, but they correspond to four spacetime gauge functions, not eight independent gauge functions.

Schematically, the diffeomorphism generator has the same architecture as Maxwell:

$$
G_{\rm GR}[\xi^\perp,\xi^i] \sim \int d^3x\left( \dot\xi^\perp\pi_N+ \dot\xi^i\pi_i+ \xi^\perp\mathcal H_\perp+ \xi^i\mathcal H_i+ \cdots \right).
$$

The dots are not cosmetic. In full GR the correct generator contains additional lapse-, shift-, and structure-function-dependent terms. This complication reflects the hypersurface deformation algebra rather than an ordinary Lie algebra with constant structure constants. Conceptually, however, the parallel with Maxwell is clear: the primary constraints transform the nondynamical variables $(N,N^i)$, while the secondary constraints transform the canonical geometry $(h_{ij},\pi^{ij})$.

If $N$ and $N^i$ are treated from the beginning as Lagrange multipliers rather than canonical variables, one usually discusses only the four secondary constraints $\mathcal H_\perp$ and $\mathcal H_i$. This is analogous to treating $A_0$ as a multiplier in Maxwell theory and focusing only on Gauss’s law in the spatial phase space.

In 3+1D, the configuration variable $h_{ij}$ is a symmetric $3\times 3$ tensor, so there are $N=6$ configuration variables per point (in this count $N$ denotes the number of configuration variables, not the lapse).

Constraints:

  • primary: $\pi_N\approx 0$ (1), $\pi_i\approx 0$ (3),
  • secondary: $\mathcal{H}_\perp\approx 0$ (1), $\mathcal{H}_i\approx 0$ (3).

Altogether there are 8 constraints; however, the standard DOF counting for GR focuses on the true canonical pair $(h_{ij},\pi^{ij})$ and treats $N,N^i$ as multipliers.

On the $(h_{ij},\pi^{ij})$ phase space:

  • $2N = 12$ phase-space dimensions per point,
  • there are $4$ independent first-class constraints $(\mathcal{H}_\perp,\mathcal{H}_i)$.

Thus

$$
\dim \Gamma_{\text{phys}} = 12 - 2\times 4 = 4, \qquad N_{\text{phys}} = \frac{4}{2}=2.
$$

These are the two polarizations of the graviton (gravitational waves) in 4D.

If instead you include $N$ and $N^i$ as canonical variables, then there are $10$ configuration variables $(h_{ij},N,N^i)$ and $8$ first-class constraints $(\pi_N,\pi_i,\mathcal{H}_\perp,\mathcal{H}_i)$. The same formula gives

$$
N_{\text{phys}}=\tfrac12\left(2\times 10-2\times 8\right)=10-8=2.
$$

This agrees with the reduced count above; the lapse and shift sector contributes no physical local degrees of freedom.

5.11 Linearized check (TT gauge intuition)


Linearize around Minkowski: $g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu}$. In harmonic gauge, and after using the residual gauge freedom, one can reduce to transverse-traceless (TT) components $h_{ij}^{\text{TT}}$ satisfying a wave equation. The TT conditions remove both the gauge redundancy and the constrained components, leaving two propagating modes, consistent with the Hamiltonian count above.

Details: harmonic gauge ⇒ TT wave equation and the “2 polarizations” count

Start from

$$
g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu},\qquad |h_{\mu\nu}|\ll 1.
$$

Infinitesimal diffeomorphisms $x^\mu\to x^\mu-\xi^\mu(x)$ act as a gauge symmetry:

$$
\delta h_{\mu\nu}=\partial_\mu\xi_\nu+\partial_\nu\xi_\mu.
$$

It is convenient to use the trace-reversed field

$$
h\equiv \eta^{\mu\nu}h_{\mu\nu},\qquad \bar h_{\mu\nu}\equiv h_{\mu\nu}-\frac12\,\eta_{\mu\nu}h.
$$

The harmonic (Lorenz) gauge condition is

$$
\partial^\mu \bar h_{\mu\nu}=0.
$$

In this gauge, the vacuum linearized Einstein equations simplify to

$$
\Box\,\bar h_{\mu\nu}=0,\qquad \Box\equiv \eta^{\rho\sigma}\partial_\rho\partial_\sigma.
$$

So the radiative degrees propagate as massless waves.

Harmonic gauge does not fully fix the gauge: it is preserved by residual transformations with

$$
\Box\,\xi^\mu=0.
$$

For a plane wave $\bar h_{\mu\nu}=\varepsilon_{\mu\nu}e^{ik\cdot x}$, the field equation gives $k^2=0$ and the gauge condition gives transversality $k^\mu\varepsilon_{\mu\nu}=0$. Using the residual gauge freedom one can impose the stronger TT conditions:

$$
h_{0\mu}=0,\qquad \partial^i h_{ij}=0,\qquad h^i{}_i=0.
$$

The only nonzero components then live in the $2\times2$ block transverse to the propagation direction and satisfy

$$
\Box\,h^{TT}_{ij}=0.
$$

For a wave moving along $z$, a standard basis is

$$
h^{TT}_{xx}=-h^{TT}_{yy}\equiv h_+,\qquad h^{TT}_{xy}=h^{TT}_{yx}\equiv h_\times,
$$

i.e. the two gravitational-wave polarizations.

This is the linearized counterpart of the Hamiltonian counting: the constraints remove non-propagating components and the gauge symmetry quotients out redundancies, leaving two physical modes (in 4D), exactly as in §5.10.
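The plane-wave count $10-4-4=2$ can be reproduced with exact linear algebra. The sketch below is illustrative only: it assumes a mostly-plus Minkowski metric, a null wave vector along $z$, and hypothetical helper names (`rank`, `flatten`). Residual gauge directions are taken as $k_\mu\xi_\nu+k_\nu\xi_\mu-\eta_{\mu\nu}\,k\cdot\xi$, which is the variation of $\bar h_{\mu\nu}$ implied by $\delta h_{\mu\nu}=\partial_\mu\xi_\nu+\partial_\nu\xi_\mu$ with an overall factor of $i$ dropped.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors via exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

ETA = [-1, 1, 1, 1]                     # diagonal of eta (assumed mostly-plus)
k_up = [1, 0, 0, 1]                     # null wave vector k^mu along z
k_low = [ETA[m] * k_up[m] for m in range(4)]
PAIRS = [(m, n) for m in range(4) for n in range(m, 4)]  # 10 independent eps_{mn}

def flatten(t):
    """Independent components of a symmetric 4x4 tensor, ordered as PAIRS."""
    return [t[m][n] for (m, n) in PAIRS]

# Transversality k^mu eps_{mu nu} = 0: one linear equation per value of nu.
constraints = []
for nu in range(4):
    row = []
    for (a, b) in PAIRS:
        coeff = 0
        if b == nu:
            coeff += k_up[a]
        if a == nu and a != b:
            coeff += k_up[b]
        row.append(coeff)
    constraints.append(row)

# Residual gauge directions, one per basis vector xi_nu.
gauge = []
for s in range(4):
    xi = [1 if n == s else 0 for n in range(4)]
    k_dot_xi = sum(k_up[n] * xi[n] for n in range(4))
    t = [[k_low[m] * xi[n] + k_low[n] * xi[m]
          - (ETA[m] if m == n else 0) * k_dot_xi
          for n in range(4)] for m in range(4)]
    gauge.append(flatten(t))

transverse_dim = len(PAIRS) - rank(constraints)
gauge_dim = rank(gauge)
gauge_is_transverse = all(
    sum(c * g for c, g in zip(row, vec)) == 0
    for row in constraints for vec in gauge)
physical = transverse_dim - gauge_dim
print(transverse_dim, gauge_dim, physical)   # 6 4 2
```

The check `gauge_is_transverse` confirms that every residual direction stays inside harmonic gauge, so subtracting `gauge_dim` from `transverse_dim` is legitimate.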


6. Conceptual comparison: Maxwell vs Proca vs GR

| Theory | Nondynamical variables | Constraint class | Gauge functions/chains | Multiplier behavior | Physical DOF in 3+1D |
| --- | --- | --- | --- | --- | --- |
| Maxwell | $A_0$ | $\pi^0$, $\partial_i\pi^i$ first-class | one $U(1)$ function: $\pi^0\to\partial_i\pi^i$ | $u=\dot A_0$ arbitrary until gauge fixing | $2$ |
| Proca | $A_0$ | $\pi^0$, $\partial_i\pi^i+m^2A_0$ second-class | none | $u$ fixed by consistency: $u\approx\partial_iA_i$ | $3$ |
| GR/ADM | $N,N^i$ | $\pi_N$, $\pi_i$, $\mathcal H_\perp$, $\mathcal H_i$ first-class in full phase space | four diffeomorphism descriptors: $\pi_N\to\mathcal H_\perp$, $\pi_i\to\mathcal H_i$ | lapse/shift multipliers arbitrary until coordinate gauge fixing | $2$ |

The moral is that “a variable has no velocity” is only the beginning. The decisive question is whether preservation of its primary constraint leaves a multiplier arbitrary, as in gauge theories, or fixes it, as in Proca.

6.1 Why first-class “removes more” than second-class


A single first-class constraint does two things:

  1. it restricts to the constraint surface, and
  2. it generates a gauge flow—points along that flow are physically equivalent.

Therefore each first-class constraint removes two phase-space dimensions (one for the surface, one for the orbit), while each second-class constraint removes only one.

  • Maxwell: constraints are first-class $\Rightarrow$ gauge symmetry $\Rightarrow$ 2 physical DOF.
  • Proca: constraints are second-class $\Rightarrow$ no gauge symmetry $\Rightarrow$ 3 physical DOF.
  • GR: constraints are first-class $\Rightarrow$ diffeomorphism redundancy $\Rightarrow$ 2 physical DOF in 4D.
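The counting rule behind these three results fits in a few lines. The function below is a sketch, and its name `physical_dof` and its arguments are hypothetical. It takes the number of configuration variables per point and the numbers of first- and second-class constraints, removes two phase-space dimensions per first-class constraint and one per second-class constraint, and halves the result.

```python
def physical_dof(n_config, n_first_class=0, n_second_class=0):
    """Local physical DOF per point: (2*n_config - 2*FC - SC) / 2."""
    reduced_dim = 2 * n_config - 2 * n_first_class - n_second_class
    if reduced_dim < 0 or reduced_dim % 2:
        raise ValueError("inconsistent constraint count")
    return reduced_dim // 2

counts = {
    "Maxwell (A_mu)": physical_dof(4, n_first_class=2),
    "Proca (A_mu)": physical_dof(4, n_second_class=2),
    "GR, reduced (h_ij)": physical_dof(6, n_first_class=4),
    "GR, full (h_ij, N, N^i)": physical_dof(10, n_first_class=8),
}
for name, dof in counts.items():
    print(f"{name}: {dof}")
```

Maxwell and Proca start from the same four configuration variables and the same number of constraints; only the classification of those constraints differs, which is what changes the result from 2 to 3.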

Both Maxwell and Proca have $\pi^0=0$ because $A_0$ has no time derivative.
The difference is what happens next:

  • Maxwell: preservation produces Gauss law $\partial_i\pi^i\approx 0$ (first-class).
  • Proca: preservation produces $\partial_i\pi^i+m^2A_0\approx 0$ (second-class with $\pi^0$).

In GR, lapse and shift are nondynamical: $\pi_N=\pi_i=0$. Their preservation produces Hamiltonian/momentum constraints, all first-class.

6.3 Canonical gauge fixing vs covariant gauge fixing


The Maxwell example is the cleanest place to see a general lesson.

In canonical language, gauge fixing is imposed on a time slice. If the full phase space contains $(A_0,A_i;\pi^0,\pi^i)$, then a complete canonical gauge choice naturally supplies two equal-time conditions, because the gauge generator involves both $\epsilon$ and $\dot\epsilon$.

In covariant path-integral language, one fixes the spacetime gauge orbit of the single function $\epsilon(t,\mathbf x)$ by one spacetime differential condition such as $\partial^\mu A_\mu=0$. The price is that the Faddeev–Popov operator is $\Box$, and residual zero modes or boundary conditions must be handled. The same pattern appears in linearized gravity: harmonic gauge $\partial^\mu\bar h_{\mu\nu}=0$ is one four-component spacetime condition, but it leaves residual diffeomorphisms $\Box\xi^\mu=0$ until further conditions or boundary data are specified.


  1. Maxwell with sources: Add $-A_\mu J^\mu$ to the Maxwell Lagrangian and show that Gauss’s law becomes $\partial_i\pi^i=\rho$. Discuss which parts of the constraint structure change.

  2. Coulomb gauge Dirac bracket: In Maxwell theory, impose Coulomb gauge $\partial_iA_i=0$ and compute the Dirac bracket for the transverse fields.

  3. Proca Dirac bracket: Treat $\pi^0$ and $\partial_i\pi^i+m^2A_0$ as second-class and compute the Dirac bracket for $A_0$ and $A_i$.

  4. ADM constraint algebra: Verify at least one of the ADM bracket relations using smeared constraints and integration by parts.

  5. GR DOF in $D$ dimensions: Generalize the ADM DOF count to $D$ spacetime dimensions and show that the number of propagating graviton DOF is $D(D-3)/2$.

  6. Lorenz gauge and residual gauge freedom: Show explicitly that $\partial^\mu A_\mu=0$ is preserved by gauge transformations satisfying $\Box\alpha=0$. For a plane wave, use this residual freedom to reduce three Lorenz-gauge components to two physical polarizations.

  7. Hamiltonian vs Lagrangian gauge fixing: Starting from the phase-space Maxwell path integral, impose $A_0=0$ and $\partial_iA_i=0$ and compute the determinant $\det\{\chi_a,\phi_b\}$. Compare it with the covariant Faddeev–Popov determinant $\det\Box$ in Lorenz gauge.

Hints and checkpoints
  1. Maxwell with sources. With the interaction $-A_\mu J^\mu=-A_0\rho-A_iJ^i$ in these conventions, check the sign of the $A_0$ term in $H_c$. Gauge consistency requires current conservation $\partial_t\rho+\partial_iJ^i=0$ if $J^\mu$ is prescribed externally.

  2. Coulomb gauge Dirac bracket. Use the second-class pair

    $$
    \chi_1=\partial_iA_i, \qquad \chi_2=\partial_i\pi^i.
    $$

    The constraint matrix contains $-\nabla^2\delta^{(3)}(\mathbf{x}-\mathbf{y})$ up to signs. Its inverse is the Green’s function of the Laplacian, and the final bracket should be the transverse projector shown in §3.8.

  3. Proca Dirac bracket. Use

    $$
    \chi_1=\pi^0, \qquad \chi_2=\partial_i\pi^i+m^2A_0.
    $$

    The inverse of the constraint matrix is algebraic because the nonzero entry is $m^2\delta^{(3)}(\mathbf{x}-\mathbf{y})$. Check that $A_i$ and $\pi^j$ keep their canonical bracket, while $A_0$ becomes a dependent variable.

  4. ADM constraint algebra. The easiest bracket to verify first is $\{\mathcal{D}[\vec N],\mathcal{D}[\vec M]\}$. Show that $\mathcal{D}[\vec N]$ acts by Lie derivative on $h_{ij}$ and $\pi^{ij}$, then use the commutator of Lie derivatives:

    $$
    [\mathcal{L}_{\vec N},\mathcal{L}_{\vec M}]=\mathcal{L}_{[\vec N,\vec M]}.
    $$

  5. GR DOF in $D$ dimensions. Let the spatial dimension be $d=D-1$. The spatial metric has $d(d+1)/2$ independent components, and there are $D$ first-class secondary constraints. Therefore

    $$
    N_{\text{phys}} = \frac{d(d+1)}2-D = \frac{D(D-3)}2.
    $$

  6. Lorenz residual freedom. Under $A_\mu\to A_\mu+\partial_\mu\alpha$,

    $$
    \partial^\mu A_\mu\to \partial^\mu A_\mu+\Box\alpha.
    $$

    For a null plane wave, take $\alpha=\alpha_0e^{ik\cdot x}$, so $\Box\alpha=-k^2\alpha=0$. The residual shift $a_\mu\to a_\mu+i\alpha_0k_\mu$ removes the pure-gauge polarization.

  7. Path-integral determinants. With

    $$
    \phi_1=\pi^0, \qquad \phi_2=\partial_i\pi^i, \qquad \chi_1=A_0, \qquad \chi_2=\partial_iA_i,
    $$

    the gauge-fixing matrix is block diagonal, with entries $1$ and $-\nabla^2$ up to signs. The Lorenz determinant $\det\Box$ is different because it fixes a spacetime history rather than an equal-time phase-space representative.
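The closed form in hint 5 can be cross-checked against the direct canonical count for a range of dimensions. This is a small sketch; the function name `graviton_dof` is hypothetical, and it assumes one Hamiltonian constraint plus $d$ momentum constraints in every dimension, as stated in the hint.

```python
def graviton_dof(D):
    """Propagating graviton DOF in D spacetime dimensions via the canonical count."""
    d = D - 1                              # spatial dimension
    metric_components = d * (d + 1) // 2   # independent components of h_ij
    first_class_secondary = D              # 1 Hamiltonian + d momentum constraints
    return metric_components - first_class_secondary

# Compare with the closed form D(D-3)/2 for D = 3, ..., 11.
for D in range(3, 12):
    assert graviton_dof(D) == D * (D - 3) // 2

print({D: graviton_dof(D) for D in (3, 4, 5, 11)})   # {3: 0, 4: 2, 5: 5, 11: 44}
```

The $D=3$ entry is zero: three-dimensional gravity has no local propagating degrees of freedom in this count.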


  • P. A. M. Dirac, Lectures on Quantum Mechanics (constrained Hamiltonian systems).
  • M. Henneaux & C. Teitelboim, Quantization of Gauge Systems (modern, systematic).
  • K. Sundermeyer, Constrained Dynamics (classic).
  • R. M. Wald, General Relativity (ADM, constraints, canonical structure).
  • E. Poisson, A Relativist’s Toolkit (3+1 tools, extrinsic curvature).
  • Arnowitt–Deser–Misner (ADM) original papers/lectures (historical source).
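  • 梁灿彬 (Liang Canbin) & 周彬 (Zhou Bin), 微分几何入门与广义相对论 (Introduction to Differential Geometry and General Relativity), Vol. 2, 2nd ed., 科学出版社 (Science Press).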