Hamiltonian systems with constraints

Audience: graduate students with prior exposure to Lagrangian/Hamiltonian mechanics and basic field theory.
Goal: learn the Dirac–Bergmann treatment of constrained Hamiltonian systems and see it in action for Maxwell theory, the Proca (massive vector) field, and general relativity (ADM).

1. Why constraints appear

In ordinary (regular) Lagrangian mechanics, the Legendre map

(q^a,\dot q^a)\mapsto (q^a,p_a),\qquad p_a=\frac{\partial L}{\partial\dot q^a}

is invertible because the Hessian

W_{ab}(q,\dot q)=\frac{\partial^2L}{\partial\dot q^a\,\partial\dot q^b}

has $\det W\neq 0$. One can then solve $\dot q=\dot q(q,p)$ and define the Hamiltonian $H=p_a\dot q^a-L$.

A constrained system arises when the Hessian is singular: $\det W=0$ . Then the momenta are not independent; they satisfy relations

\phi_\alpha(q,p)\approx 0,

called primary constraints. The symbol $\approx$ (“weak equality”) means that the relation holds only on the constraint surface in phase space. A weak equality must not be imposed inside a Poisson bracket: evaluate the bracket first, then restrict to the constraint surface.

A helpful geometric picture is this: the equations $\phi_\alpha=0$ define a submanifold of phase space, the constraint surface. A weak equality is an equality only after restricting to this submanifold. For example, even if $\phi\approx0$ , the bracket $\{F,\phi\}$ may be nonzero and can generate a physically meaningful motion along or away from the constraint surface. This is why one computes Poisson brackets first and imposes constraints afterward.

In gauge theories the singularity is not an accident: it reflects redundancy in the description (gauge symmetry). In massive theories (e.g. Proca) the Hessian can also be singular without any gauge symmetry, because some variables (such as $A_0$) carry no velocity and are nondynamical.

2. Dirac–Bergmann algorithm in a nutshell

2.1 Canonical Poisson brackets for fields

For canonical fields $(q^a(\mathbf{x}),p_a(\mathbf{x}))$ ,

\{q^a(\mathbf{x}),p_b(\mathbf{y})\}=\delta^a{}_b\,\delta^{(3)}(\mathbf{x}-\mathbf{y}), \qquad \{q^a,q^b\}=\{p_a,p_b\}=0.

For general functionals $F[q,p]$ and $G[q,p]$ , this means

\{F,G\} = \int d^3x\left( \frac{\delta F}{\delta q^a(\mathbf{x})}\frac{\delta G}{\delta p_a(\mathbf{x})} - \frac{\delta F}{\delta p_a(\mathbf{x})}\frac{\delta G}{\delta q^a(\mathbf{x})} \right).

This formula is often safer than manipulating unsmeared delta functions directly.

2.2 Total Hamiltonian and consistency

Given a Lagrangian density $\mathcal{L}(q,\dot q,\nabla q)$ :

Define canonical momenta
$p_a=\frac{\partial\mathcal{L}}{\partial \dot q^a}.$
Relations among $(q,p)$ that do not determine velocities are primary constraints $\phi_\alpha\approx 0$ .
Canonical Hamiltonian (Legendre transform where possible)
$\mathcal{H}_c = p_a\dot q^a - \mathcal{L}.$
Total Hamiltonian
$H_T = \int d^3x\left(\mathcal{H}_c + u^\alpha(\mathbf{x})\,\phi_\alpha(\mathbf{x})\right),$
where $u^\alpha$ are Lagrange multipliers enforcing the primary constraints.
Consistency conditions. Require that every constraint be preserved under time evolution:
$\dot\phi_\alpha(\mathbf{x})=\{\phi_\alpha(\mathbf{x}),H_T\}\approx 0.$
This may:
- produce secondary constraints, tertiary, etc.; and/or
- fix some multipliers $u^\alpha$ .

Continue until closure.

A practical checklist is:

- write all primary constraints before solving for any remaining velocities;
- build $H_T$, not only $H_c$;
- impose $\dot\phi\approx0$ for every new constraint;
- separate equations that produce new constraints from equations that determine multipliers;
- stop only when every constraint is preserved and every relevant multiplier is either fixed or remains arbitrary because of gauge freedom.

2.3 First-class vs second-class

Let $\{\Phi_A\}$ be the complete set of constraints.

First-class constraint: $\Phi_A$ is first-class if $\{\Phi_A,\Phi_B\}\approx 0$ for all $B$.
These generate gauge transformations (redundancies).
Second-class constraints: the constraints that remain after all first-class combinations have been separated off. Restricted to this subset, the matrix (kernel)
$C_{AB}(\mathbf{x},\mathbf{y})=\{\Phi_A(\mathbf{x}),\Phi_B(\mathbf{y})\}$
is (functionally) invertible on the constraint surface.
These do not generate gauge; they simply remove phase-space directions.

For many gauge systems, an individual first-class constraint is best understood as one member of a gauge-generator chain. The full gauge transformation is generated by a tuned combination of primary, secondary, and sometimes higher-stage constraints, with time derivatives of the gauge parameter. Maxwell theory in §3.6 is the simplest example.

2.4 What happens to Lagrange multipliers?

A common but slightly dangerous slogan is:

First-class constraints leave Lagrange multipliers arbitrary, whereas second-class constraints determine them.

The correct version is a little more precise.

The total Hamiltonian contains arbitrary multipliers only for primary constraints,

H_T=H_c+\int d^3x\,u^\alpha\phi_\alpha,

where the $\phi_\alpha$ are primary. Once the full set of constraints $\Phi_I$ has been found, consistency requires

0\approx \dot\Phi_I = \{\Phi_I,H_c\} + \int d^3y\,u^\alpha(\mathbf y) \{\Phi_I,\phi_\alpha(\mathbf y)\}.

These equations can do three things: generate new constraints, determine some combinations of the multipliers $u^\alpha$ , or impose no new condition.

The practical rule is:

A multiplier along a primary first-class direction remains arbitrary before gauge fixing. This arbitrariness is the Hamiltonian form of gauge freedom.
A multiplier along a primary second-class direction is fixed by consistency, provided the constraint-bracket matrix has the appropriate constant rank and inverse on the second-class sector.
After imposing gauge-fixing conditions, first-class constraints are converted into second-class pairs, and the formerly arbitrary multipliers are fixed by preserving the gauge conditions.

Two caveats are worth keeping in mind. First, secondary constraints do not come with independent multipliers in the ordinary total Hamiltonian, although one sometimes introduces an extended Hamiltonian for formal purposes. Second, the rule above assumes a regular system: no changing-rank constraint matrix, no unresolved boundary zero modes, and no hidden reducibility relations.

Maxwell and Proca illustrate the distinction sharply. In Maxwell theory, the multiplier $u$ in front of $\pi^0$ remains arbitrary until a gauge is chosen. In Proca theory, the same-looking primary constraint $\pi^0\approx0$ belongs to a second-class pair, and consistency fixes $u$ .

2.5 Dirac bracket (for second-class constraints)

If $\chi_A\approx 0$ are second-class and $C_{AB}$ is invertible with inverse $C^{AB}$ ,

\{F,G\}_D = \{F,G\} - \int d^3x\,d^3y\; \{F,\chi_A(\mathbf{x})\}\,C^{AB}(\mathbf{x},\mathbf{y})\,\{\chi_B(\mathbf{y}),G\}.

Then $\{F,\chi_A\}_D=0$ for all $F$ , so you may set $\chi_A=0$ strongly after switching to Dirac brackets.

Here $C^{AB}$ is an inverse kernel, meaning

\int d^3z\,C_{AC}(\mathbf{x},\mathbf{z})C^{CB}(\mathbf{z},\mathbf{y}) = \delta_A{}^B\delta^{(3)}(\mathbf{x}-\mathbf{y}).

If the inverse requires solving an elliptic equation, as in Coulomb gauge, one must specify boundary conditions. Zero modes of the operator are not harmless bookkeeping details: they may represent residual gauge transformations or global charges.
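The Dirac bracket is easiest to check in a finite-dimensional setting, where the kernel $C^{AB}$ is an ordinary matrix inverse. The following is a minimal sketch, not a general implementation: it assumes a four-dimensional phase space $(q_1,q_2,p_1,p_2)$, functions that are linear in the coordinates (represented by their coefficient vectors), and exactly two second-class constraints. The names `pb` and `dirac_bracket` and the sample constraints $p_1\approx0$, $q_1+q_2\approx0$ are illustrative choices, not taken from the text.

```python
from fractions import Fraction

N_PAIRS = 2  # phase-space point z = (q1, q2, p1, p2)


def pb(a, b):
    """Poisson bracket of the linear functions F = a.z and G = b.z."""
    return sum(a[k] * b[N_PAIRS + k] - a[N_PAIRS + k] * b[k] for k in range(N_PAIRS))


def dirac_bracket(a, b, chi):
    """Dirac bracket of F = a.z and G = b.z for two linear second-class constraints."""
    c01 = Fraction(pb(chi[0], chi[1]))      # C = [[0, c01], [-c01, 0]]
    if c01 == 0:
        raise ValueError("the two constraints are not a second-class pair")
    c_inv = [[0, -1 / c01], [1 / c01, 0]]   # inverse of the 2x2 matrix C
    correction = sum(
        pb(a, chi[i]) * c_inv[i][j] * pb(chi[j], b)
        for i in range(2)
        for j in range(2)
    )
    return pb(a, b) - correction


q1, q2, p1, p2 = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
chi = [p1, (1, 1, 0, 0)]                    # illustrative pair: p1 ~ 0, q1 + q2 ~ 0

print(dirac_bracket(q1, p2, chi))           # compare with {q2, p2}_D and q1 = -q2
print(dirac_bracket(q2, p2, chi))
print(dirac_bracket(chi[1], p2, chi))       # a constraint against an arbitrary function
```

The last line illustrates the statement $\{F,\chi_A\}_D=0$: once Dirac brackets are used, the constraint $q_1=-q_2$ can be imposed strongly, and the first two brackets are then consistent with each other.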

2.6 Counting physical degrees of freedom

Let $N$ be the number of configuration fields per space point (so phase space has dimension $2N$ ). Let $N_1$ be the number of first-class constraints and $N_2$ the number of second-class constraints (per point, in an appropriate local sense). Then

N_{\text{phys}} = N - N_1 - \frac12 N_2.

Equivalently, in phase space:

\dim \Gamma_{\text{phys}} = 2N - 2N_1 - N_2.
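As a quick bookkeeping aid, the counting formula can be wrapped in a small helper. This is only a sketch: the function name `physical_dof` is hypothetical, and the constraint counts in the table are the ones obtained in the worked examples of these notes (for ADM gravity, $N=10$ counts $h_{ij}$, $N$, and $N^i$).

```python
from fractions import Fraction


def physical_dof(n_fields, n_first_class, n_second_class):
    """Physical degrees of freedom per space point: N - N_1 - N_2 / 2."""
    phase_dim = 2 * n_fields - 2 * n_first_class - n_second_class
    if phase_dim < 0:
        raise ValueError("more constraints than phase-space directions")
    return Fraction(phase_dim, 2)


# (N, N_1, N_2) for the examples treated in these notes
examples = {
    "Maxwell": (4, 2, 0),    # A_mu; pi^0 and Gauss law, both first-class
    "Proca": (4, 0, 2),      # A_mu; one second-class pair
    "GR (ADM)": (10, 8, 0),  # h_ij, N, N^i; eight first-class constraints
}

for name, counts in examples.items():
    print(name, physical_dof(*counts))
```

Returning a `Fraction` keeps the result exact and makes an odd number of second-class constraints, which would signal a counting mistake, visible as a half-integer.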

2.7 A finite-dimensional warm-up example

Before the field-theory examples, it is useful to see the algorithm without delta functions. Consider

L=\frac12(\dot x-y)^2,

with configuration variables $(x,y)$ . The momenta are

p_x=\dot x-y, \qquad p_y=0.

Thus

\phi_1\equiv p_y\approx0

is a primary constraint. Solving $\dot x=p_x+y$ , the canonical Hamiltonian is

H_c=p_x\dot x-L=\frac12p_x^2+yp_x.

The total Hamiltonian is $H_T=H_c+u p_y$ . Consistency of the primary constraint gives

\dot p_y=\{p_y,H_T\}=-\frac{\partial H_T}{\partial y}=-p_x\approx0,

so there is a secondary constraint

\phi_2\equiv p_x\approx0.

The two constraints have vanishing Poisson bracket,

\{p_y,p_x\}=0,

so they are first-class. The arbitrary multiplier $u$ reflects the gauge freedom

\delta x=\epsilon(t), \qquad \delta y=\dot\epsilon(t),

under which $\dot x-y$ is invariant. This toy model mirrors Maxwell theory: one primary constraint plus one secondary constraint together represent one gauge function.
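The brackets of this toy model can be checked numerically. The sketch below evaluates Poisson brackets by central finite differences, which are exact up to rounding for the quadratic $H_T$ used here; the helper names, the sample phase-space point, and the value chosen for the multiplier $u$ are all arbitrary illustrative assumptions.

```python
def poisson_bracket(f, g, z, step=1e-4):
    """Poisson bracket {f, g} at z = (x, y, p_x, p_y) via central differences."""
    n = len(z) // 2

    def partial(func, index):
        up, down = list(z), list(z)
        up[index] += step
        down[index] -= step
        return (func(*up) - func(*down)) / (2 * step)

    return sum(
        partial(f, k) * partial(g, n + k) - partial(f, n + k) * partial(g, k)
        for k in range(n)
    )


U = 0.7  # arbitrary value of the undetermined multiplier u


def h_total(x, y, p_x, p_y):
    """H_T = p_x^2 / 2 + y p_x + u p_y for the toy model."""
    return 0.5 * p_x**2 + y * p_x + U * p_y


def coord_y(x, y, p_x, p_y):
    return y


def mom_x(x, y, p_x, p_y):
    return p_x


def mom_y(x, y, p_x, p_y):
    return p_y


point = (0.3, -1.2, 0.8, 0.0)  # sample point on the primary surface p_y = 0

p_y_dot = poisson_bracket(mom_y, h_total, point)   # should equal -p_x
y_dot = poisson_bracket(coord_y, h_total, point)   # should equal u
algebra = poisson_bracket(mom_y, mom_x, point)     # bracket of the two constraints
print(p_y_dot, y_dot, algebra)
```

The second result shows directly that $\dot y=u$: the evolution of $y$ is as arbitrary as the multiplier, which is the toy-model counterpart of $\dot A_0=u$ in Maxwell theory.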

3. Worked example I: Maxwell theory (massless spin-1)

3.1 Lagrangian and 3+1 split

Take vacuum Maxwell in flat space:

\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu}, \qquad F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu.

With the mostly-plus signature $(-+++)$ used throughout these notes,

F_{\mu\nu}F^{\mu\nu}=-2F_{0i}F_{0i}+F_{ij}F_{ij}.

Define

F_{0i}=\dot A_i-\partial_iA_0, \qquad B^k=\frac12\epsilon^{kij}F_{ij}, \qquad F_{ij}F_{ij}=2\mathbf{B}^2.

Then

\mathcal{L} = \frac12(\dot A_i-\partial_iA_0)^2-\frac12\mathbf{B}^2 = \frac12\mathbf{E}^2-\frac12\mathbf{B}^2, \quad E_i\equiv F_{0i}.

3.2 Canonical momenta and primary constraint

Define

\pi^\mu=\frac{\partial\mathcal{L}}{\partial \dot A_\mu}.

There is no $\dot A_0$ in $\mathcal{L}$ , hence

\pi^0 = 0 \quad\Rightarrow\quad \phi_1(\mathbf{x})\equiv \pi^0(\mathbf{x})\approx 0 \qquad\text{(primary constraint).}

For $i=1,2,3$ ,

\pi^i=\frac{\partial\mathcal{L}}{\partial \dot A_i} =\dot A_i-\partial_iA_0 =F_{0i} =E_i.

Thus $\pi^i$ is the electric field.

3.3 Canonical Hamiltonian

Compute the Hamiltonian density

\mathcal{H}_c=\pi^i\dot A_i-\mathcal{L}.

Solve $\dot A_i=\pi^i+\partial_iA_0$ :

\pi^i\dot A_i = \pi^2+\pi^i\partial_iA_0.

Since $\mathcal{L}=\tfrac12\pi^2-\tfrac12\mathbf{B}^2$ ,

\mathcal{H}_c = \frac12\pi^2+\frac12\mathbf{B}^2+\pi^i\partial_iA_0.

Integrating by parts (dropping boundary terms),

\int d^3x\,\pi^i\partial_iA_0 = -\int d^3x\,A_0\,\partial_i\pi^i,

so the canonical Hamiltonian is

H_c = \int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) - A_0\,\partial_i\pi^i \right].

3.4 Total Hamiltonian and Gauss constraint

Add the primary constraint with multiplier $u(\mathbf{x})$ :

H_T=H_c+\int d^3x\,u(\mathbf{x})\,\pi^0(\mathbf{x}).

Preserve $\pi^0\approx 0$ :

\dot\pi^0(\mathbf{x}) = \{\pi^0(\mathbf{x}),H_T\} = -\frac{\delta H_T}{\delta A_0(\mathbf{x})} = \partial_i\pi^i(\mathbf{x}) \approx 0.

This yields the secondary constraint

\phi_2(\mathbf{x})\equiv \partial_i\pi^i(\mathbf{x})\approx 0,

i.e. Gauss’s law in vacuum.

No further constraints arise; the multiplier $u(\mathbf{x})$ remains undetermined.

To see explicitly why the algorithm closes, use the Hamilton equation

\dot\pi^i=-\frac{\delta H_T}{\delta A_i}=\partial_jF_{ji}.

Then

\dot\phi_2=\partial_i\dot\pi^i=\partial_i\partial_jF_{ji}=0,

because $F_{ji}$ is antisymmetric while $\partial_i\partial_j$ is symmetric. Thus Gauss’s law is automatically preserved.

3.5 Constraint algebra and first-class nature

Using canonical brackets,

\{\pi^0(\mathbf{x}),\phi_2(\mathbf{y})\}=0, \qquad \{\phi_2(\mathbf{x}),\phi_2(\mathbf{y})\}=0.

Thus $\phi_1$ and $\phi_2$ are first-class.

The basic Hamilton equations are also instructive:

\dot A_i=\{A_i,H_T\}=\pi_i+\partial_iA_0, \qquad \dot A_0=\{A_0,H_T\}=u.

The first equation is just the definition $\pi_i=\dot A_i-\partial_iA_0$ . The second says that the time evolution of $A_0$ is arbitrary; this is the Hamiltonian trace of gauge freedom.

3.6 Gauge generator: one gauge function, two constraints

A Maxwell gauge transformation is controlled by one spacetime function $\epsilon(t,\mathbf{x})$ :

A_\mu\mapsto A_\mu-\partial_\mu\epsilon.

Thus

\delta A_0=-\dot\epsilon, \qquad \delta A_i=-\partial_i\epsilon.

The appropriate Hamiltonian generator can be written, using Castellani’s algorithm, as

G[\epsilon] = \int d^3x\Big(-\dot\epsilon(\mathbf{x})\,\pi^0(\mathbf{x})+\epsilon(\mathbf{x})\,\phi_2(\mathbf{x})\Big), \qquad \phi_2=\partial_i\pi^i.

Then

\delta A_0=\{A_0,G\}=-\dot\epsilon, \qquad \delta A_i=\{A_i,G\}=-\partial_i\epsilon,

so the canonical generator reproduces the spacetime $U(1)$ gauge symmetry.
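For readers who want the intermediate step, the brackets above follow from a single integration by parts. The following worked lines assume that $\epsilon$ falls off fast enough for the boundary term to vanish:

```latex
\begin{aligned}
G[\epsilon] &= \int d^3y\,\Big(-\dot\epsilon\,\pi^0-\pi^i\,\partial_i\epsilon\Big)
  \qquad\text{(after integrating $\epsilon\,\partial_i\pi^i$ by parts)},\\
\{A_0(\mathbf{x}),G\} &= \frac{\delta G}{\delta\pi^0(\mathbf{x})} = -\dot\epsilon(\mathbf{x}),\\
\{A_i(\mathbf{x}),G\} &= \frac{\delta G}{\delta\pi^i(\mathbf{x})} = -\partial_i\epsilon(\mathbf{x}),\\
\{\pi^\mu(\mathbf{x}),G\} &= -\frac{\delta G}{\delta A_\mu(\mathbf{x})} = 0 .
\end{aligned}
```

The last line expresses the gauge invariance of the momenta: $G$ does not depend on $A_\mu$, so $\pi^i$ (the electric field) and $\pi^0$ are unchanged.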

The important point is that $\pi^0$ and $\partial_i\pi^i$ are not associated with two independent gauge functions. They are the two members of one primary-secondary chain:

\pi^0 \quad\longrightarrow\quad \partial_i\pi^i.

The arrow means that preserving the primary constraint produces the secondary constraint:

\dot\pi^0\approx0 \quad\Longrightarrow\quad \partial_i\pi^i\approx0.

If one tried to use only Gauss’s law,

G_{\rm Gauss}[\epsilon]=\int d^3x\,\epsilon\,\partial_i\pi^i,

then

\delta A_i=-\partial_i\epsilon, \qquad \delta A_0=0.

This is not the full gauge transformation for time-dependent $\epsilon$ . The primary term $-\dot\epsilon\,\pi^0$ is needed to transform $A_0$ correctly.

3.7 Degrees of freedom and the need for gauge fixing

We have $N=4$ configuration fields $A_\mu$ , so the full phase space

(A_0,A_i;\pi^0,\pi^i)

has dimension $8$ per spatial point. There are two first-class constraints and no second-class constraints:

\pi^0\approx0, \qquad \partial_i\pi^i\approx0.

Therefore

\dim\Gamma_{\rm phys}=8-2\times2=4, \qquad N_{\text{phys}}=2,

the two transverse photon polarizations.

A complete canonical gauge fixing of the full phase space usually adds two equal-time gauge conditions, for example

\chi_0=A_0\approx0, \qquad \chi_C=\partial_iA_i\approx0.

Together,

\pi^0, \quad \partial_i\pi^i, \quad A_0, \quad \partial_iA_i

form a second-class set, assuming the relevant Laplacian has no problematic zero modes after boundary conditions are specified. The pair $(A_0,\pi^0)$ removes the nondynamical scalar-potential sector, while $(\partial_iA_i,\partial_i\pi^i)$ removes the longitudinal spatial sector.

If one treats $A_0$ as a Lagrange multiplier from the beginning and works only with $(A_i,\pi^i)$ , then there is only one first-class constraint, $\partial_i\pi^i\approx0$ , and one spatial gauge condition such as $\partial_iA_i\approx0$ is enough. This is why different books sometimes appear to count gauge conditions differently.

3.8 Optional: explicit reduction in Coulomb gauge

The nontrivial part of the canonical reduction is the spatial longitudinal sector. To see the “constraint + gauge” removal explicitly, decompose

A_i = A_i^T + \partial_i\alpha, \qquad \partial_iA_i^T=0,

\pi^i = \pi_T^i + \partial^i\beta, \qquad \partial_i\pi_T^i=0.

Then Gauss’s constraint is $\partial_i\pi^i=\nabla^2\beta\approx 0$ , which removes the longitudinal momentum $\beta$ (up to boundary conditions). The longitudinal coordinate $\alpha$ is removed by gauge.

If you impose Coulomb gauge $\chi\equiv \partial_iA_i\approx 0$ , then $(\chi,\phi_2)$ form a second-class pair:

\{\chi(\mathbf{x}),\phi_2(\mathbf{y})\} = \{\partial_iA_i(\mathbf{x}),\partial_j\pi^j(\mathbf{y})\} = -\nabla^2\delta^{(3)}(\mathbf{x}-\mathbf{y}),

which is invertible (as an operator) after specifying boundary conditions. The reduced Hamiltonian becomes a Hamiltonian for the transverse fields only:

H_{\text{red}}=\int d^3x\,\frac12\big(\pi_T^2+\mathbf{B}^2\big).

With this gauge choice, the Dirac bracket of the unreduced variables projects onto the transverse part:

\{A_i(\mathbf{x}),\pi^j(\mathbf{y})\}_D = \left(\delta_i{}^j-\frac{\partial_i\partial^j}{\nabla^2}\right) \delta^{(3)}(\mathbf{x}-\mathbf{y}) \equiv (P_T)_i{}^j\,\delta^{(3)}(\mathbf{x}-\mathbf{y}),

where $P_T$ is the transverse projector. This formula makes the two physical photon polarizations visible directly in the bracket.
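In Fourier space, with $\partial_i\to ik_i$, the transverse projector becomes the matrix $\delta_{ij}-k_ik_j/k^2$, and its defining properties can be checked with exact arithmetic. The sketch below is illustrative only: the function names and the sample wave vector are assumptions, and the $k=0$ mode is excluded, in line with the remarks on zero modes.

```python
from fractions import Fraction


def transverse_projector(k):
    """Momentum-space P_ij = delta_ij - k_i k_j / k^2 for a nonzero 3-vector k."""
    k2 = sum(Fraction(c) ** 2 for c in k)
    if k2 == 0:
        raise ValueError("projector is undefined for the zero mode k = 0")
    return [
        [Fraction(int(i == j)) - Fraction(k[i]) * k[j] / k2 for j in range(3)]
        for i in range(3)
    ]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
        for i in range(3)
    ]


k = (1, 2, 2)                                   # arbitrary sample wave vector
P = transverse_projector(k)

idempotent = matmul(P, P) == P                  # P is a projector
longitudinal = [sum(k[i] * P[i][j] for i in range(3)) for j in range(3)]  # k_i P_ij
trace = sum(P[i][i] for i in range(3))          # dimension of the projected subspace
print(idempotent, longitudinal, trace)
```

The trace counts the directions that survive the projection, which is how the two transverse polarizations show up at the level of the bracket.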

3.9 Lorenz gauge vs canonical gauge conditions

In covariant field theory one often says, “impose the Lorenz gauge to eliminate gauge freedom.” The spelling is Lorenz, after Ludvig Lorenz; “Lorentz gauge” is a common misnomer.

The Lorenz condition is one spacetime differential condition,

\partial^\mu A_\mu=0.

With the conventions of this page,

\partial^\mu A_\mu=-\dot A_0+\partial_iA_i,

so Lorenz gauge says

\dot A_0=\partial_iA_i.

But in the Hamiltonian formulation,

\dot A_0=\{A_0,H_T\}=u,

where $u$ is the arbitrary multiplier of the primary constraint $\pi^0\approx0$ . Therefore Lorenz gauge is, canonically,

-u+\partial_iA_i=0.

It is not simply one equal-time condition on the phase-space variables $(A_\mu,\pi^\mu)$ ; it is a velocity-dependent condition, or equivalently a condition that fixes the multiplier $u$ .

This reconciles the two common statements:

In the full canonical phase space, complete gauge fixing is often represented by two equal-time conditions, e.g. $A_0\approx0, \qquad \partial_iA_i\approx0.$
In the covariant Lagrangian formulation, Lorenz gauge is one spacetime condition imposed on entire histories $A_\mu(t,\mathbf x)$ .

These are different gauge choices. The temporal-plus-Coulomb choice is noncovariant and stronger than Lorenz gauge. Lorenz gauge is covariant, but it leaves residual transformations

A_\mu\mapsto A_\mu+\partial_\mu\alpha

with

\Box\alpha=0,

because

\partial^\mu A_\mu\mapsto \partial^\mu A_\mu+\Box\alpha.

Thus Lorenz gauge fixes the gauge only after one also specifies boundary conditions or removes zero modes of $\Box$ .

For a plane wave $A_\mu=a_\mu e^{ik\cdot x}$ with $k^2=0$ , Lorenz gauge gives

k^\mu a_\mu=0,

which reduces four components to three. A residual transformation $\alpha=\alpha_0e^{ik\cdot x}$ shifts

a_\mu\mapsto a_\mu+i\alpha_0 k_\mu,

and removes one more unphysical component. Thus

4-1\ \text{Lorenz condition}-1\ \text{residual gauge freedom}=2

physical polarizations, in agreement with the Hamiltonian count.
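The plane-wave count can be made concrete for a wave travelling along $+z$. The sketch below assumes the signature $(-+++)$ of these notes, an arbitrary frequency, and a solution basis chosen by hand so that one basis vector is the pure-gauge direction; the helper names are hypothetical.

```python
from fractions import Fraction

ETA = (-1, 1, 1, 1)  # diagonal Minkowski metric, signature (-+++)


def contract(v_lower, w_lower):
    """v^mu w_mu, raising the index of v with the diagonal metric."""
    return sum(ETA[m] * v_lower[m] * w_lower[m] for m in range(4))


def is_parallel(v, w):
    """True if v and w are linearly dependent (all 2x2 minors vanish)."""
    return all(v[m] * w[n] - v[n] * w[m] == 0 for m in range(4) for n in range(4))


omega = Fraction(3)                       # arbitrary sample frequency
k = (-omega, 0, 0, omega)                 # k_mu for a wave along +z
null = contract(k, k) == 0                # k^2 = 0

# Hand-picked basis of the solutions of the Lorenz condition k^mu a_mu = 0
lorenz_basis = [(0, 1, 0, 0), (0, 0, 1, 0), (1, 0, 0, -1)]
all_lorenz = all(contract(k, a) == 0 for a in lorenz_basis)

# Residual shifts a_mu -> a_mu + i alpha_0 k_mu remove the direction parallel to k_mu
physical = [a for a in lorenz_basis if not is_parallel(a, k)]
print(null, all_lorenz, len(physical))
```

Because $k^2=0$, the pure-gauge direction $k_\mu$ itself satisfies the Lorenz condition, which is why the residual transformation stays inside the Lorenz-gauge solution space.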

3.10 Maxwell path integral: where did the two canonical gauge conditions go?

The naive Lagrangian path integral

Z=\int \mathcal DA\,e^{iS[A]}, \qquad S[A]=- \frac14\int d^4x\,F_{\mu\nu}F^{\mu\nu},

overcounts gauge-equivalent configurations and has a noninvertible kinetic operator. The Faddeev–Popov procedure chooses one spacetime gauge condition, for instance

F[A]=\partial^\mu A_\mu-\omega(x)=0,

and inserts

1=\Delta_{\rm FP}[A]\int\mathcal D\alpha\, \delta\!\left(F[A^\alpha]\right), \qquad A_\mu^\alpha=A_\mu+\partial_\mu\alpha.

For Lorenz gauge,

\frac{\delta F[A^\alpha]}{\delta\alpha}=\Box,

\Delta_{\rm FP}=\det\Box

(up to an irrelevant sign convention, often written as $\det(-\partial^2)$ ). In abelian Maxwell theory this determinant is independent of $A_\mu$ , so ghosts decouple.

Averaging over $\omega$ with a Gaussian gives the familiar covariant $\xi$ -gauge action,

\mathcal L_{\rm eff} = -\frac14F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}\left(\partial^\mu A_\mu\right)^2.

For finite $\xi$ , this does not impose $\partial^\mu A_\mu=0$ sharply; it weights violations of the gauge condition. The sharp Landau-gauge limit is $\xi\to0$ .

The apparent mismatch with the Hamiltonian statement is resolved as follows. The Hamiltonian, phase-space version of a completely gauge-fixed path integral has the schematic Faddeev–Senjanovic form

Z_\chi = \int\mathcal DA_\mu\,\mathcal D\pi^\mu\; \delta[\pi^0]\, \delta[\partial_i\pi^i]\, \delta[\chi_0]\, \delta[\chi_C]\, \det\{\chi_a,\phi_b\} \exp\left[i\int dt\,d^3x\left(\pi^\mu\dot A_\mu-\mathcal H_c\right)\right].

For temporal plus Coulomb gauge,

\chi_0=A_0, \qquad \chi_C=\partial_iA_i,

one gets

\{\chi_a,\phi_b\} \sim \begin{pmatrix} 1&0\\ 0&-\nabla^2 \end{pmatrix}, \qquad \det\{\chi_a,\phi_b\}\sim\det(-\nabla^2).

Here the two equal-time gauge conditions are visible explicitly.

By contrast, the covariant Lorenz path integral fixes one spacetime gauge function $\alpha(t,\mathbf x)$ on whole field histories. Its Faddeev–Popov operator $\Box$ is second order in time, so with appropriate boundary conditions it controls both pieces of Cauchy data

\alpha(t_0,\mathbf x), \qquad \dot\alpha(t_0,\mathbf x).

These are the covariant-history counterpart of the two equal-time canonical gauge directions associated with

\pi^0 \quad\text{and}\quad \partial_i\pi^i.

Another way to say the same thing: once the covariant gauge-fixing term

-\frac{1}{2\xi}(\partial^\mu A_\mu)^2

is added, the gauge-fixed Lagrangian contains a $\dot A_0^2$ contribution. The original primary constraint $\pi^0=0$ is no longer a constraint of the gauge-fixed quadratic Lagrangian; the gauge quotient has already been implemented by the Faddeev–Popov construction.
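The disappearance of the primary constraint can be seen in one line. Using $\partial^\mu A_\mu=-\dot A_0+\partial_iA_i$ and the fact that $F_{\mu\nu}F^{\mu\nu}$ contains no $\dot A_0$, a short worked computation for finite, nonzero $\xi$ reads:

```latex
\begin{aligned}
\mathcal L_{\rm eff} &\supset -\frac{1}{2\xi}\left(-\dot A_0+\partial_iA_i\right)^2,\\
\pi^0 &= \frac{\partial\mathcal L_{\rm eff}}{\partial\dot A_0}
       = \frac{1}{\xi}\left(-\dot A_0+\partial_iA_i\right)
       = \frac{1}{\xi}\,\partial^\mu A_\mu,\\
\dot A_0 &= \partial_iA_i-\xi\,\pi^0 .
\end{aligned}
```

The velocity $\dot A_0$ is now solvable in terms of the momentum, so the Legendre map of the gauge-fixed quadratic Lagrangian is invertible and $\pi^0$ is an ordinary canonical variable rather than a constraint.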

4. Worked example II: Proca (massive spin-1)

4.1 Lagrangian and sign conventions

With our signature $(-+++)$ , a convenient Proca Lagrangian that leads to a positive-energy Hamiltonian is

\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu} -\frac12 m^2 A_\mu A^\mu.

The field equation is

\partial_\mu F^{\mu\nu} - m^2 A^\nu = 0,

and taking $\partial_\nu$ gives the on-shell constraint

\partial_\nu A^\nu = 0 \quad (m\neq 0).

4.2 3+1 split

Using $A_\mu A^\mu=-A_0^2+A_i^2$ , we get

-\frac12 m^2 A_\mu A^\mu = +\frac12 m^2 A_0^2 - \frac12 m^2 A_i^2.

Therefore

\mathcal{L} = \frac12(\dot A_i-\partial_iA_0)^2 -\frac12\mathbf{B}^2 +\frac12 m^2 A_0^2 -\frac12 m^2 A_i^2.

The crucial difference from Maxwell is the $+\frac12m^2A_0^2$ term. The field $A_0$ still has no velocity, but it is no longer a pure Lagrange multiplier: the secondary constraint will solve for $A_0$ rather than impose a gauge-generating Gauss law.

4.3 Canonical momenta and primary constraint

Exactly as in Maxwell,

\pi^0=\frac{\partial\mathcal{L}}{\partial\dot A_0}=0 \quad\Rightarrow\quad \phi_1(\mathbf{x})\equiv \pi^0(\mathbf{x})\approx 0,

and

\pi^i=\frac{\partial\mathcal{L}}{\partial\dot A_i}=\dot A_i-\partial_iA_0.

4.4 Canonical Hamiltonian

Compute

\mathcal{H}_c=\pi^i\dot A_i-\mathcal{L}, \qquad \dot A_i=\pi^i+\partial_iA_0.

One finds

\mathcal{H}_c = \frac12\pi^2+\frac12\mathbf{B}^2 +\frac12 m^2 A_i^2 +\pi^i\partial_iA_0 -\frac12 m^2 A_0^2.

Integrating by parts,

H_c=\int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) +\frac12 m^2 A_i^2 - A_0\,\partial_i\pi^i -\frac12 m^2 A_0^2 \right].

4.5 Total Hamiltonian and secondary constraint

Total Hamiltonian:

H_T = H_c+\int d^3x\,u(\mathbf{x})\,\pi^0(\mathbf{x}).

Preserve $\pi^0\approx 0$ :

\dot\pi^0(\mathbf{x}) = -\frac{\delta H_T}{\delta A_0(\mathbf{x})} = \partial_i\pi^i(\mathbf{x}) + m^2 A_0(\mathbf{x}) \approx 0.

So the secondary constraint is

\phi_2(\mathbf{x})\equiv \partial_i\pi^i(\mathbf{x})+m^2A_0(\mathbf{x})\approx 0.

4.6 Second-class structure (no gauge symmetry)

Compute the constraint bracket:

\{\phi_1(\mathbf{x}),\phi_2(\mathbf{y})\} = \{\pi^0(\mathbf{x}),\partial_i\pi^i(\mathbf{y})+m^2A_0(\mathbf{y})\} = -m^2\delta^{(3)}(\mathbf{x}-\mathbf{y})\neq 0.

Thus $\phi_1,\phi_2$ are second-class: there is no gauge symmetry. Consistency now fixes the multiplier $u$ rather than leaving it arbitrary.

It is useful to see what “fixes the multiplier” means. Using

\dot A_0=u, \qquad \dot\pi^i=\partial_jF_{ji}-m^2A_i,

we get

\dot\phi_2 = \partial_i\dot\pi^i+m^2\dot A_0 = -m^2\partial_iA_i+m^2u \approx0.

Therefore

u\approx\partial_iA_i.

Combining this with $\dot A_0=u$ gives the Hamiltonian version of the Proca transversality condition:

\partial_\mu A^\mu=-\dot A_0+\partial_iA_i\approx0.

Unlike Maxwell theory, no arbitrary gauge function remains.

4.7 Eliminating $A_0$ and the reduced Hamiltonian

The constraint $\phi_2=0$ is algebraic in $A_0$ :

A_0 = -\frac{1}{m^2}\partial_i\pi^i.

Substitute into the Hamiltonian. The $A_0$ -dependent terms combine as

- A_0\,\partial_i\pi^i - \frac12 m^2 A_0^2 = \frac{1}{2m^2}(\partial_i\pi^i)^2.

Hence the reduced Hamiltonian is

H_{\text{red}} = \int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) +\frac12 m^2 A_i^2 +\frac{1}{2m^2}(\partial_i\pi^i)^2 \right],

which is manifestly bounded below.
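The substitution of $A_0$ is pure algebra at each point and can be verified with exact rational arithmetic. In this sketch the function names and the sample values of $\partial_i\pi^i$ and $m^2$ are arbitrary illustrative assumptions.

```python
from fractions import Fraction


def a0_terms(a0, div_pi, m2):
    """A_0-dependent part of the Proca Hamiltonian density: -A_0 div(pi) - m^2 A_0^2 / 2."""
    return -a0 * div_pi - m2 * a0**2 / 2


def a0_solution(div_pi, m2):
    """Solve the secondary constraint div(pi) + m^2 A_0 = 0 for A_0."""
    if m2 == 0:
        raise ValueError("the constraint cannot be solved for A_0 when m = 0")
    return -div_pi / m2


div_pi, m2 = Fraction(3, 5), Fraction(7, 2)    # sample values at one space point
a0 = a0_solution(div_pi, m2)
reduced = a0_terms(a0, div_pi, m2)
expected = div_pi**2 / (2 * m2)                # (div pi)^2 / (2 m^2)
print(reduced, expected)
```

The guard for $m^2=0$ mirrors the structural point of this section: in the massless case the same equation no longer determines $A_0$ and instead becomes Gauss's law.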

For reference, the second-class matrix is ultralocal:

C_{AB}(\mathbf{x},\mathbf{y}) = \begin{pmatrix} 0 & -m^2\\ m^2 & 0 \end{pmatrix} \delta^{(3)}(\mathbf{x}-\mathbf{y}), \qquad (\chi_1,\chi_2)=(\pi^0,\partial_i\pi^i+m^2A_0).

Its inverse exists only for $m\neq0$ . This is why the massless limit is structurally singular in the constrained-Hamiltonian sense: as $m\to0$ , the second-class pair turns into the first-class Maxwell chain.

The spatial canonical bracket is unchanged by the Proca Dirac bracket,

\{A_i(\mathbf{x}),\pi^j(\mathbf{y})\}_D = \delta_i{}^j\delta^{(3)}(\mathbf{x}-\mathbf{y}),

while $A_0$ is no longer independent; equivalently,

\{A_0(\mathbf{x}),A_i(\mathbf{y})\}_D = \frac{1}{m^2}\partial_i^{\mathbf{x}}\delta^{(3)}(\mathbf{x}-\mathbf{y}),

consistent with $A_0=-(\partial_i\pi^i)/m^2$ .
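As a check, the mixed bracket can be worked out from the Dirac-bracket formula of §2.5 together with the matrix $C_{AB}$ above. Only the brackets $\{A_0,\chi_1\}$ and $\{\chi_2,A_i\}$ are nonzero, so a single term survives:

```latex
\begin{aligned}
C^{AB}(\mathbf{x},\mathbf{y}) &= \frac{1}{m^2}
  \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}\delta^{(3)}(\mathbf{x}-\mathbf{y}),\\
\{A_0(\mathbf{x}),\chi_1(\mathbf{z})\} &= \delta^{(3)}(\mathbf{x}-\mathbf{z}),
\qquad
\{\chi_2(\mathbf{w}),A_i(\mathbf{y})\} = -\partial_i^{\mathbf{w}}\delta^{(3)}(\mathbf{w}-\mathbf{y}),\\
\{A_0(\mathbf{x}),A_i(\mathbf{y})\}_D
 &= -\int d^3z\,d^3w\;\{A_0(\mathbf{x}),\chi_1(\mathbf{z})\}\,C^{12}(\mathbf{z},\mathbf{w})\,\{\chi_2(\mathbf{w}),A_i(\mathbf{y})\}
  = \frac{1}{m^2}\,\partial_i^{\mathbf{x}}\delta^{(3)}(\mathbf{x}-\mathbf{y}).
\end{aligned}
```

The same bookkeeping shows why $\{A_i,\pi^j\}_D$ is unchanged: $\{A_i,\chi_1\}=0$, $\{\chi_1,\pi^j\}=0$, and $C^{22}=0$, so every correction term vanishes.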

4.8 Degrees of freedom

Here $N=4$ , $N_1=0$ , $N_2=2$ . Thus

N_{\text{phys}}=4-\frac12\cdot 2 = 3,

corresponding to helicities $-1,0,+1$ of a massive spin-1 particle.

4.9 Optional: Stückelberg trick and “gauge vs second-class”

Introduce a scalar $\varphi$ and replace

A_\mu \to A_\mu + \frac{1}{m}\partial_\mu\varphi.

Then the Proca mass term becomes gauge invariant under

\delta A_\mu=\partial_\mu\lambda, \qquad \delta\varphi = -m\lambda.

The theory is now gauge invariant (first-class constraints reappear), but it contains an extra field $\varphi$ . After gauge fixing (e.g. $\varphi=0$ ) you recover Proca with 3 physical DOF. This is a useful conceptual bridge: second-class constraints can be viewed as gauge-fixed first-class systems (under appropriate extensions).
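The invariance is a one-line check. In the worked lines below, the label $\mathcal L_{\rm St}$ for the Lagrangian after the replacement is introduced here only for convenience:

```latex
\begin{aligned}
\mathcal L_{\rm St} &= -\frac14F_{\mu\nu}F^{\mu\nu}
  -\frac12 m^2\Big(A_\mu+\frac1m\partial_\mu\varphi\Big)\Big(A^\mu+\frac1m\partial^\mu\varphi\Big),\\
\delta\Big(A_\mu+\frac1m\partial_\mu\varphi\Big)
  &= \partial_\mu\lambda+\frac1m\,\partial_\mu(-m\lambda)=0,\\
\delta F_{\mu\nu} &= \partial_\mu\partial_\nu\lambda-\partial_\nu\partial_\mu\lambda=0 .
\end{aligned}
```

Note that $F_{\mu\nu}$ is unaffected by the replacement itself, since $\partial_\mu\partial_\nu\varphi$ is symmetric. With five fields $(A_\mu,\varphi)$ and one first-class primary-secondary chain, the counting of §2.6 gives $5-2=3$, matching the Proca result.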

5. General relativity as a constrained Hamiltonian system (ADM)

The Hamiltonian formulation of GR is the prototype of a field theory with:

singular Lagrangian (lapse and shift are nondynamical),
first-class constraints (encoding diffeomorphism invariance),
a nontrivial constraint algebra with structure functions (hypersurface deformation algebra).

We sketch the derivation carefully enough to see where each ingredient comes from.

5.1 Einstein–Hilbert action and 3+1 split

Start from the Einstein–Hilbert action (with cosmological constant $\Lambda$ )

S = \frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,(R-2\Lambda) + S_{\text{boundary}}.

The boundary term (e.g. Gibbons–Hawking–York) ensures a well-posed variational principle when fixing the induced metric on the boundary.

Assume spacetime is foliated by spacelike hypersurfaces $\Sigma_t$ , with coordinates $x^i$ on each $\Sigma_t$ . The spacetime metric can be written in ADM form:

ds^2 = -N^2dt^2 + h_{ij}(dx^i+N^i dt)(dx^j+N^j dt),

where:

$h_{ij}(t,\mathbf{x})$ is the induced spatial metric on $\Sigma_t$ ,
$N(t,\mathbf{x})$ is the lapse,
$N^i(t,\mathbf{x})$ is the shift (with $N_i=h_{ij}N^j$ ).

A useful identity:

\sqrt{-g}=N\sqrt{h},

where $h=\det(h_{ij})$ .

Geometrically, $Ndt$ is the proper time separation between neighboring slices along the unit normal, while $N^idt$ tells how the spatial coordinates slide within the next slice. Thus lapse and shift describe how the foliation is threaded through spacetime; they are not local propagating gravitational degrees of freedom.

5.2 Extrinsic curvature and kinematics

Define the covariant derivative $D_i$ compatible with $h_{ij}$ : $D_k h_{ij}=0$ .

The extrinsic curvature of $\Sigma_t$ embedded in spacetime is

K_{ij} = \frac{1}{2N}\left(\dot h_{ij}-D_iN_j-D_jN_i\right), \qquad K=h^{ij}K_{ij}.

This shows explicitly that $\dot h_{ij}$ appears linearly in $K_{ij}$ , while $\dot N$ and $\dot N^i$ do not appear at all.

5.3 ADM Lagrangian density

A standard result of the Gauss–Codazzi decomposition (up to total derivatives absorbed by $S_{\text{boundary}}$ ) is:

\sqrt{-g}\,R = N\sqrt{h}\left({}^{(3)}R + K_{ij}K^{ij} - K^2\right) +\text{(total derivative)}.

Therefore, dropping total derivatives already accounted for by boundary terms, the ADM Lagrangian density is

\mathcal{L}_{\text{ADM}} = \frac{\sqrt{h}}{16\pi G}\,N\left({}^{(3)}R + K_{ij}K^{ij} - K^2 - 2\Lambda\right).

5.4 Canonical momenta

The canonical momentum conjugate to $h_{ij}$ is

\pi^{ij}(\mathbf{x}) = \frac{\partial \mathcal{L}_{\text{ADM}}}{\partial \dot h_{ij}(\mathbf{x})}.

Since $\dot h_{ij}$ enters only through $K_{ij}$ , and

\frac{\partial K_{kl}}{\partial \dot h_{ij}}=\frac{1}{2N}\delta^i{}_{(k}\delta^j{}_{l)},

one finds

\pi^{ij} = \frac{\sqrt{h}}{16\pi G}\left(K^{ij}-h^{ij}K\right).

Taking the trace $\pi\equiv h_{ij}\pi^{ij}$ gives

\pi = -\frac{\sqrt{h}}{16\pi G}\,2K \quad\Rightarrow\quad K = -\frac{8\pi G}{\sqrt{h}}\,\pi.

You can invert to express $K_{ij}$ in terms of $\pi^{ij}$ :

K_{ij} = \frac{16\pi G}{\sqrt{h}}\left(\pi_{ij}-\frac12 h_{ij}\pi\right).

The combination

\pi_{ij}\pi^{ij}-\frac12\pi^2

is the inverse-DeWitt-supermetric contraction of the momentum. Its minus sign in the trace direction is the origin of the familiar “conformal factor” indefiniteness of the gravitational kinetic term. This subtlety does not spoil the constraint analysis, but it is one reason canonical GR looks very different from a collection of ordinary positive-energy scalar fields.
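The inversion between $K_{ij}$ and $\pi^{ij}$ is easy to get wrong by a factor in the trace term, so a numerical round trip is a useful sanity check. The sketch below makes several simplifying assumptions that are not part of the text: a diagonal sample metric $h_{ij}=\mathrm{diag}(1,4,9)$ at a single point, units with $16\pi G=1$, and an arbitrary symmetric sample $K_{ij}$. The function names are hypothetical.

```python
from fractions import Fraction

H_DIAG = (Fraction(1), Fraction(4), Fraction(9))  # sample metric h_ij = diag(1, 4, 9)
SQRT_H = Fraction(6)                               # sqrt(det h) = sqrt(36)
KAPPA = Fraction(1)                                # 16 pi G, set to 1 (assumption)


def momentum_from_k(k_lower):
    """pi^{ij} = sqrt(h) / (16 pi G) * (K^{ij} - h^{ij} K) for the diagonal metric."""
    trace_k = sum(k_lower[i][i] / H_DIAG[i] for i in range(3))
    return [
        [
            SQRT_H / KAPPA * (
                k_lower[i][j] / (H_DIAG[i] * H_DIAG[j])
                - (trace_k / H_DIAG[i] if i == j else 0)
            )
            for j in range(3)
        ]
        for i in range(3)
    ]


def k_from_momentum(pi_upper):
    """K_{ij} = 16 pi G / sqrt(h) * (pi_{ij} - h_{ij} pi / 2) for the diagonal metric."""
    trace_pi = sum(H_DIAG[i] * pi_upper[i][i] for i in range(3))
    return [
        [
            KAPPA / SQRT_H * (
                H_DIAG[i] * H_DIAG[j] * pi_upper[i][j]
                - (H_DIAG[i] * trace_pi / 2 if i == j else 0)
            )
            for j in range(3)
        ]
        for i in range(3)
    ]


K_SAMPLE = [[1, 2, 0], [2, -1, 3], [0, 3, 5]]     # arbitrary symmetric K_ij
PI_SAMPLE = momentum_from_k(K_SAMPLE)
K_BACK = k_from_momentum(PI_SAMPLE)

trace_k = sum(K_SAMPLE[i][i] / H_DIAG[i] for i in range(3))
trace_pi = sum(H_DIAG[i] * PI_SAMPLE[i][i] for i in range(3))
print(K_BACK == K_SAMPLE, trace_pi == -2 * SQRT_H / KAPPA * trace_k)
```

The second comparison is the trace relation $\pi=-2K\sqrt{h}/(16\pi G)$; the factor $\tfrac12$ in the inversion formula is exactly what is needed to undo it in three spatial dimensions.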

Primary constraints: because $\dot N$ and $\dot N^i$ do not appear in $\mathcal{L}_{\text{ADM}}$ ,

\pi_N(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial \dot N}=0,\qquad \pi_i(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial \dot N^i}=0.

Thus,

\pi_N\approx 0,\qquad \pi_i\approx 0

are primary constraints.

5.5 Canonical Hamiltonian and the ADM constraints

The canonical Hamiltonian is

H_c=\int d^3x\,\left(\pi^{ij}\dot h_{ij}-\mathcal{L}_{\text{ADM}}\right),

where $\dot h_{ij}$ should be expressed using

\dot h_{ij}=2NK_{ij}+D_iN_j+D_jN_i.

A key computation uses integration by parts:

\int d^3x\,\pi^{ij}(D_iN_j+D_jN_i) = -2\int d^3x\,N_j D_i\pi^{ij}

(up to boundary terms). After rewriting $K_{ij}$ in terms of $\pi^{ij}$ , one arrives at the standard ADM form

H_c = \int d^3x\left( N\,\mathcal{H}_\perp + N^i\,\mathcal{H}_i\right) + H_{\partial\Sigma}.

Here $H_{\partial\Sigma}$ is a boundary term (e.g. ADM energy for asymptotically flat spacetimes). The bulk constraint densities are:

Momentum (diffeomorphism) constraint

\boxed{\;\mathcal{H}_i = -2 D_j \pi_i{}^{j}\;}

Hamiltonian (scalar) constraint

\boxed{\; \mathcal{H}_\perp = \frac{16\pi G}{\sqrt{h}}\left(\pi_{ij}\pi^{ij}-\frac12\pi^2\right) -\frac{\sqrt{h}}{16\pi G}\left({}^{(3)}R-2\Lambda\right) \;}

(up to convention-dependent signs and factors).

5.6 Total Hamiltonian and secondary constraints

The total Hamiltonian adds the primary constraints:

H_T = H_c + \int d^3x\left(u\,\pi_N + u^i \pi_i\right).

Preserving $\pi_N\approx 0$ gives

\dot\pi_N(\mathbf{x}) = \{\pi_N(\mathbf{x}),H_T\} = -\frac{\delta H_T}{\delta N(\mathbf{x})} = -\mathcal{H}_\perp(\mathbf{x}) \approx 0,

hence the Hamiltonian constraint

\mathcal{H}_\perp(\mathbf{x})\approx 0.

Preserving $\pi_i\approx 0$ gives

\dot\pi_i(\mathbf{x}) = -\frac{\delta H_T}{\delta N^i(\mathbf{x})} = -\mathcal{H}_i(\mathbf{x}) \approx 0,

hence the momentum constraints

\mathcal{H}_i(\mathbf{x})\approx 0.

No new independent constraints appear beyond these (for pure GR). Consistency also fixes none of the multipliers $u$, $u^i$: they remain arbitrary, reflecting the gauge (diffeomorphism) invariance of the theory.

5.7 Smeared constraints and the constraint algebra

It is cleaner to use smeared functionals:

\mathcal{H}[N] \equiv \int d^3x\,N(\mathbf{x})\,\mathcal{H}_\perp(\mathbf{x}), \qquad \mathcal{D}[\vec{N}] \equiv \int d^3x\,N^i(\mathbf{x})\,\mathcal{H}_i(\mathbf{x}).

Then (schematically) the Poisson brackets close as:

\{\mathcal{D}[\vec{N}],\mathcal{D}[\vec{M}]\} = \mathcal{D}[\mathcal{L}_{\vec{N}}\vec{M}],

\{\mathcal{D}[\vec{N}],\mathcal{H}[M]\} = \mathcal{H}[\mathcal{L}_{\vec{N}}M],

\{\mathcal{H}[N],\mathcal{H}[M]\} = \mathcal{D}\!\left[h^{ij}(N\partial_j M - M\partial_j N)\right].

This is the hypersurface deformation algebra (often called “Dirac algebra”). It is not a Lie algebra with constant structure constants; it has structure functions involving $h^{ij}$ .

The last bracket is the most characteristic one. Its structure function $h^{ij}(\mathbf{x})$ is itself a phase-space variable, so the right-hand side depends on where one sits in phase space; geometrically, this is the algebra of deformations of hypersurfaces embedded in spacetime.

The closure implies that $\mathcal{H}_\perp$ and $\mathcal{H}_i$ are first-class (together with the primary constraints $\pi_N,\pi_i$ ).
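To spell out why closure means the algorithm stops here, the following sketch evolves the smeared secondary constraints with $H_T$ . It assumes the canonical Hamiltonian has the form $H_c=\mathcal{H}[N]+\mathcal{D}[\vec N]$ up to boundary terms, uses the schematic brackets above with their signs, and takes $M$ and $\vec M$ to be arbitrary test smearing functions (an assumption of this illustration, not new dynamical variables). The primary-constraint terms in $H_T$ drop out because $\mathcal{H}_\perp$ and $\mathcal{H}_i$ do not depend on $N$ , $N^i$ or their momenta.

```latex
\{\mathcal{H}[M],H_T\}
  = \{\mathcal{H}[M],\mathcal{H}[N]\}+\{\mathcal{H}[M],\mathcal{D}[\vec N]\}
  = \mathcal{D}\!\left[h^{ij}(M\partial_j N-N\partial_j M)\right]
    -\mathcal{H}[\mathcal{L}_{\vec N}M]
  \approx 0,

\{\mathcal{D}[\vec M],H_T\}
  = \{\mathcal{D}[\vec M],\mathcal{H}[N]\}+\{\mathcal{D}[\vec M],\mathcal{D}[\vec N]\}
  = \mathcal{H}[\mathcal{L}_{\vec M}N]+\mathcal{D}[\mathcal{L}_{\vec M}\vec N]
  \approx 0.
```

Both right-hand sides are combinations of the constraints themselves, so they vanish weakly without producing tertiary constraints and without restricting $u$ , $u^i$ , $N$ or $N^i$ .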

5.8 Gauge interpretation: diffeomorphisms

$\mathcal{D}[\vec{N}]$ generates spatial diffeomorphisms on $\Sigma_t$ : $\{h_{ij},\mathcal{D}[\vec{N}]\}=\mathcal{L}_{\vec{N}}h_{ij}, \qquad \{\pi^{ij},\mathcal{D}[\vec{N}]\}=\mathcal{L}_{\vec{N}}\pi^{ij}.$
$\mathcal{H}[N]$ generates normal deformations of the hypersurface (time reparametrizations / refoliations).

A precise mapping between $(\mathcal{H}_\perp,\mathcal{H}_i)$ and spacetime diffeomorphisms requires care, because the algebra closes with structure functions; nonetheless, the standard viewpoint is that these first-class constraints encode the redundancy under diffeomorphisms.

As in Maxwell theory, the complete spacetime-diffeomorphism generator also includes the primary constraints conjugate to lapse and shift. The secondary constraints $\mathcal{H}_\perp$ and $\mathcal{H}_i$ act on the canonical geometry $(h_{ij},\pi^{ij})$ , while the primary constraints control how $N$ and $N^i$ transform.

5.9 GR counterpart of “one gauge function, two constraints”

Maxwell theory has one gauge function and one primary-secondary first-class chain:

\epsilon: \qquad \pi^0\longrightarrow \partial_i\pi^i.

The ADM analogue is that spacetime diffeomorphisms have four descriptors, which can be decomposed relative to the foliation into one normal deformation and three tangential deformations,

\xi^\perp(t,\mathbf x), \qquad \xi^i(t,\mathbf x).

In the full ADM phase space, these are associated with four primary-secondary chains:

\xi^\perp: \qquad \pi_N\longrightarrow \mathcal H_\perp,

\xi^i: \qquad \pi_i\longrightarrow \mathcal H_i.

Thus the full set

\pi_N, \quad \pi_i, \quad \mathcal H_\perp, \quad \mathcal H_i

contains eight first-class constraints, but they correspond to four spacetime gauge functions, not eight independent gauge functions.

Schematically, the diffeomorphism generator has the same architecture as Maxwell:

G_{\rm GR}[\xi^\perp,\xi^i] \sim \int d^3x\left( \dot\xi^\perp\pi_N+ \dot\xi^i\pi_i+ \xi^\perp\mathcal H_\perp+ \xi^i\mathcal H_i+ \cdots \right).

The dots are not cosmetic. In full GR the correct generator contains additional lapse-, shift-, and structure-function-dependent terms. This complication reflects the hypersurface deformation algebra rather than an ordinary Lie algebra with constant structure constants. Conceptually, however, the parallel with Maxwell is clear: the primary constraints transform the nondynamical variables $(N,N^i)$ , while the secondary constraints transform the canonical geometry $(h_{ij},\pi^{ij})$ .

If $N$ and $N^i$ are treated from the beginning as Lagrange multipliers rather than canonical variables, one usually discusses only the four secondary constraints $\mathcal H_\perp$ and $\mathcal H_i$ . This is analogous to treating $A_0$ as a multiplier in Maxwell theory and focusing only on Gauss’s law in the spatial phase space.

5.10 Degrees of freedom of GR

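In 3+1D, the configuration variable $h_{ij}$ is a symmetric $3\times 3$ tensor, so there are $N=6$ configuration variables per point (in the counting below, $N$ denotes the number of configuration variables, not the lapse).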

Constraints:

primary: $\pi_N\approx 0$ (1), $\pi_i\approx 0$ (3),
secondary: $\mathcal{H}_\perp\approx 0$ (1), $\mathcal{H}_i\approx 0$ (3).

Altogether there are 8 constraints; however, the standard DOF counting for GR focuses on the true canonical pair $(h_{ij},\pi^{ij})$ and treats $N,N^i$ as multipliers.

On the $(h_{ij},\pi^{ij})$ phase space:

$2N = 12$ phase-space dimensions per point,
there are $4$ independent first-class constraints $(\mathcal{H}_\perp,\mathcal{H}_i)$ .

Thus

\dim \Gamma_{\text{phys}} = 12 - 2\times 4 = 4, \qquad N_{\text{phys}} = \frac{4}{2}=2.

These are the two polarizations of the graviton (gravitational waves) in 4D.

If instead you include $N$ and $N^i$ as canonical variables, then there are $10$ configuration variables $(h_{ij},N,N^i)$ and $8$ first-class constraints $(\pi_N,\pi_i,\mathcal{H}_\perp,\mathcal{H}_i)$ . The same formula gives

N_{\text{phys}}=10-8=2.

This agrees with the reduced count above; the lapse and shift sector contributes no physical local degrees of freedom.
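The two counts above, and their extension to other spacetime dimensions, can be checked mechanically. The sketch below is only an illustration: the helper names `physical_dof` and `graviton_dof` are hypothetical, and the $D$-dimensional version assumes the reduced description in which lapse and shift are multipliers and there are $D$ first-class secondary constraints.

```python
from fractions import Fraction


def physical_dof(n_config, n_first_class, n_second_class=0):
    """Dirac count per space point: (2N - 2*FC - SC) / 2."""
    return Fraction(2 * n_config - 2 * n_first_class - n_second_class, 2)


def graviton_dof(D):
    """Reduced ADM count in D spacetime dimensions (lapse/shift as multipliers)."""
    d = D - 1                      # spatial dimension
    n_metric = d * (d + 1) // 2    # independent components of h_ij
    return physical_dof(n_metric, n_first_class=D)


reduced = physical_dof(6, 4)       # (h_ij, pi^ij) with H_perp and H_i
full = physical_dof(10, 8)         # (h_ij, N, N^i) with all eight constraints
print(reduced, full)

# Compare with the closed form D(D-3)/2 for a range of dimensions.
for D in range(3, 12):
    assert graviton_dof(D) == Fraction(D * (D - 3), 2)
```

Using exact fractions keeps the half-integer contribution of second-class constraints visible if the same helper is reused for Proca-like systems.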

5.11 Linearized check (TT gauge intuition)

Linearize around Minkowski: $g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu}$ . In harmonic gauge one can reduce to transverse-traceless (TT) components $h_{ij}^{\text{TT}}$ satisfying a wave equation. The TT condition removes gauge redundancy and constraints, leaving two propagating modes—consistent with the Hamiltonian count above.

Details: harmonic gauge ⇒ TT wave equation and the “2 polarizations” count

Start from

g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu},\qquad |h_{\mu\nu}|\ll 1.

Infinitesimal diffeomorphisms $x^\mu\to x^\mu-\xi^\mu(x)$ act as a gauge symmetry:

\delta h_{\mu\nu}=\partial_\mu\xi_\nu+\partial_\nu\xi_\mu.

It is convenient to use the trace-reversed field

h\equiv \eta^{\mu\nu}h_{\mu\nu},\qquad \bar h_{\mu\nu}\equiv h_{\mu\nu}-\frac12\,\eta_{\mu\nu}h.

The harmonic (Lorenz) gauge condition is

\partial^\mu \bar h_{\mu\nu}=0.

In this gauge, the vacuum linearized Einstein equations simplify to

\Box\,\bar h_{\mu\nu}=0,\qquad \Box\equiv \eta^{\rho\sigma}\partial_\rho\partial_\sigma.

So the radiative degrees propagate as massless waves.

Harmonic gauge does not fully fix the gauge: it is preserved by residual transformations with

\Box\,\xi^\mu=0.

For a plane wave $\bar h_{\mu\nu}=\varepsilon_{\mu\nu}e^{ik\cdot x}$ , the field equation gives $k^2=0$ and the gauge condition gives transversality $k^\mu\varepsilon_{\mu\nu}=0$ . Using the residual gauge freedom one can impose the stronger TT conditions (for a wave moving along $z$ ):

h_{0\mu}=0,\qquad \partial^i h_{ij}=0,\qquad h^i{}_i=0.

The only nonzero components then live in the $2\times2$ block transverse to the propagation direction and satisfy

\Box\,h^{TT}_{ij}=0.

A standard basis is

h^{TT}_{xx}=-h^{TT}_{yy}\equiv h_+,\qquad h^{TT}_{xy}=h^{TT}_{yx}\equiv h_\times,

i.e. the two gravitational-wave polarizations.

This is the linearized counterpart of the Hamiltonian counting: the constraints remove non-propagating components and the gauge symmetry quotients out redundancies, leaving two physical modes (in 4D), exactly as in §5.10.
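The count "10 components, minus 4 gauge conditions, minus 4 residual gauge directions" can be verified with exact linear algebra on the plane-wave amplitudes $\varepsilon_{\mu\nu}$ . The sketch below assumes signature $(-,+,+,+)$ and a null wave vector along $z$ ; all function names are hypothetical, and factors of $i$ in the pure-gauge amplitudes are absorbed into the gauge parameter.

```python
from fractions import Fraction

# Assumptions for this sketch: signature (-,+,+,+), null wave vector along z.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
K_UP = [1, 0, 0, 1]                                      # k^mu with k^2 = 0
K_DOWN = [sum(ETA[m][n] * K_UP[n] for n in range(4)) for m in range(4)]
PAIRS = [(m, n) for m in range(4) for n in range(m, 4)]  # 10 symmetric components


def rank(matrix):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def harmonic_condition(eps):
    """Components of k^mu (eps_{mu nu} - eta_{mu nu} trace / 2), nu = 0..3."""
    trace = sum(ETA[m][m] * eps[m][m] for m in range(4))  # eta is its own inverse
    half = Fraction(1, 2)
    return [sum(K_UP[m] * (eps[m][n] - half * ETA[m][n] * trace) for m in range(4))
            for n in range(4)]


def basis_tensor(m, n):
    """Symmetric amplitude with a single independent component switched on."""
    eps = [[0] * 4 for _ in range(4)]
    eps[m][n] = eps[n][m] = 1
    return eps


def pure_gauge(zeta):
    """Residual gauge amplitude k_mu zeta_nu + k_nu zeta_mu."""
    return [[K_DOWN[m] * zeta[n] + K_DOWN[n] * zeta[m] for n in range(4)]
            for m in range(4)]


# One row per basis tensor: the four gauge-condition components it produces.
conditions = [harmonic_condition(basis_tensor(m, n)) for m, n in PAIRS]
n_solutions = len(PAIRS) - rank(conditions)

gauge_tensors = [pure_gauge([int(a == b) for b in range(4)]) for a in range(4)]
# Pure-gauge amplitudes with null k satisfy the harmonic condition automatically.
assert all(c == 0 for g in gauge_tensors for c in harmonic_condition(g))
n_gauge = rank([[g[m][n] for m, n in PAIRS] for g in gauge_tensors])

n_physical = n_solutions - n_gauge
print(n_solutions, n_gauge, n_physical)
```

The difference `n_solutions - n_gauge` is the dimension of the quotient of harmonic-gauge amplitudes by residual gauge amplitudes, which is what the TT conditions pick representatives for.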

6. Conceptual comparison: Maxwell vs Proca vs GR

Theory	Nondynamical variables	Constraint class	Gauge functions/chains	Multiplier behavior	Physical DOF in 3+1D
Maxwell	$A_0$	$\pi^0$ , $\partial_i\pi^i$ first-class	one $U(1)$ function: $\pi^0\to\partial_i\pi^i$	$u=\dot A_0$ arbitrary until gauge fixing	$2$
Proca	$A_0$	$\pi^0$ , $\partial_i\pi^i+m^2A_0$ second-class	none	$u$ fixed by consistency: $u\approx\partial_iA_i$	$3$
GR/ADM	$N,N^i$	$\pi_N$ , $\pi_i$ , $\mathcal H_\perp$ , $\mathcal H_i$ first-class in full phase space	four diffeomorphism descriptors: $\pi_N\to\mathcal H_\perp$ , $\pi_i\to\mathcal H_i$	lapse/shift multipliers arbitrary until coordinate gauge fixing	$2$

The moral is that “a variable has no velocity” is only the beginning. The decisive question is whether preservation of its primary constraint leaves a multiplier arbitrary, as in gauge theories, or fixes it, as in Proca.

6.1 Why first-class “removes more” than second-class

A single first-class constraint does two things:

it restricts to the constraint surface, and
it generates a gauge flow—points along that flow are physically equivalent.

Therefore each first-class constraint removes two phase-space dimensions (one for the surface, one for the orbit), while each second-class constraint removes only one.

Maxwell: constraints are first-class $\Rightarrow$ gauge symmetry $\Rightarrow$ 2 physical DOF.
Proca: constraints are second-class $\Rightarrow$ no gauge symmetry $\Rightarrow$ 3 physical DOF.
GR: constraints are first-class $\Rightarrow$ diffeomorphism redundancy $\Rightarrow$ 2 physical DOF in 4D.
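The three outcomes listed above follow from one rule: the number of second-class constraints is the rank of the matrix $C_{ab}=\{\phi_a,\phi_b\}$ evaluated on the constraint surface, and the remaining constraints are first-class. The sketch below applies that rule per space point with the delta functions stripped off, which is a simplifying assumption; the value of `m2` and the function names are illustrative only, and the GR matrix is set to zero because its brackets vanish weakly.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def classify(brackets):
    """Return (first_class, second_class) from the on-shell bracket matrix."""
    n_second = rank(brackets)
    return len(brackets) - n_second, n_second


def physical_dof(n_config, brackets):
    """Each first-class constraint removes 2 dimensions, each second-class 1."""
    n_first, n_second = classify(brackets)
    return Fraction(2 * n_config - 2 * n_first - n_second, 2)


m2 = 1  # illustrative nonzero m^2; any nonzero value gives the same rank

maxwell = [[0, 0], [0, 0]]            # {pi^0, d_i pi^i} = 0
proca = [[0, -m2], [m2, 0]]           # {pi^0, d_i pi^i + m^2 A_0} = -m^2
gr = [[0] * 8 for _ in range(8)]      # all brackets vanish weakly

for name, n_config, c in [("Maxwell", 4, maxwell), ("Proca", 4, proca), ("GR", 10, gr)]:
    print(name, classify(c), physical_dof(n_config, c))
```

Setting `m2 = 0` in the Proca matrix reproduces the Maxwell classification, which is the sense in which the mass term converts the first-class pair into a second-class pair.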

6.2 Nondynamical fields and constraints

Both Maxwell and Proca have the primary constraint $\pi^0\approx 0$ because the Lagrangian contains no time derivative of $A_0$ .
The difference is what happens next:

Maxwell: preservation produces Gauss law $\partial_i\pi^i\approx 0$ (first-class).
Proca: preservation produces $\partial_i\pi^i+m^2A_0\approx 0$ (second-class with $\pi^0$ ).

In GR, lapse and shift are nondynamical: $\pi_N\approx 0$ and $\pi_i\approx 0$ . Preserving these produces the Hamiltonian and momentum constraints, all first-class.

6.3 Canonical gauge fixing vs covariant gauge fixing

The Maxwell example is the cleanest place to see a general lesson.

In canonical language, gauge fixing is imposed on a time slice. If the full phase space contains $(A_0,A_i;\pi^0,\pi^i)$ , then a complete canonical gauge choice naturally supplies two equal-time conditions, because the gauge generator involves both $\epsilon$ and $\dot\epsilon$ .

In covariant path-integral language, one fixes the spacetime gauge orbit of the single function $\epsilon(t,\mathbf x)$ by one spacetime differential condition such as $\partial^\mu A_\mu=0$ . The price is that the Faddeev–Popov operator is $\Box$ , and residual zero modes or boundary conditions must be handled. The same pattern appears in linearized gravity: harmonic gauge $\partial^\mu\bar h_{\mu\nu}=0$ is one four-component spacetime condition, but it leaves residual diffeomorphisms $\Box\xi^\mu=0$ until further conditions or boundary data are specified.

7. Exercises

Maxwell with sources: Add $-A_\mu J^\mu$ to the Maxwell Lagrangian and show that Gauss’s law becomes $\partial_i\pi^i=\rho$ . Discuss which parts of the constraint structure change.
Coulomb gauge Dirac bracket: In Maxwell theory, impose Coulomb gauge $\partial_iA_i=0$ and compute the Dirac bracket for the transverse fields.
Proca Dirac bracket: Treat $\pi^0$ and $\partial_i\pi^i+m^2A_0$ as second-class and compute the Dirac bracket for $A_0$ and $A_i$ .
ADM constraint algebra: Verify at least one of the ADM bracket relations using smeared constraints and integration by parts.
GR DOF in $D$ dimensions: Generalize the ADM DOF count to $D$ spacetime dimensions and show that the number of propagating graviton DOF is $D(D-3)/2$ .
Lorenz gauge and residual gauge freedom: Show explicitly that $\partial^\mu A_\mu=0$ is preserved by gauge transformations satisfying $\Box\alpha=0$ . For a plane wave, use this residual freedom to reduce three Lorenz-gauge components to two physical polarizations.
Hamiltonian vs Lagrangian gauge fixing: Starting from the phase-space Maxwell path integral, impose $A_0=0$ and $\partial_iA_i=0$ and compute the determinant $\det\{\chi_a,\phi_b\}$ . Compare it with the covariant Faddeev–Popov determinant $\det\Box$ in Lorenz gauge.

Hints and checkpoints

Maxwell with sources. With the interaction $-A_\mu J^\mu=-A_0\rho-A_iJ^i$ in these conventions, check the sign of the $A_0$ term in $H_c$ . Gauge consistency requires current conservation $\partial_t\rho+\partial_iJ^i=0$ if $J^\mu$ is prescribed externally.
Coulomb gauge Dirac bracket. Use the second-class pair
$\chi_1=\partial_iA_i, \qquad \chi_2=\partial_i\pi^i.$
The constraint matrix contains $-\nabla^2\delta^{(3)}(\mathbf{x}-\mathbf{y})$ up to signs. Its inverse is the Green’s function of the Laplacian, and the final bracket should be the transverse projector shown in §3.8.
Proca Dirac bracket. Use
$\chi_1=\pi^0, \qquad \chi_2=\partial_i\pi^i+m^2A_0.$
The inverse of the constraint matrix is algebraic because the nonzero entry is $m^2\delta^{(3)}(\mathbf{x}-\mathbf{y})$ . Check that $A_i$ and $\pi^j$ keep their canonical bracket, while $A_0$ becomes a dependent variable.
ADM constraint algebra. The easiest bracket to verify first is $\{\mathcal{D}[\vec N],\mathcal{D}[\vec M]\}$ . Show that $\mathcal{D}[\vec N]$ acts by Lie derivative on $h_{ij}$ and $\pi^{ij}$ , then use the commutator of Lie derivatives:
$[\mathcal{L}_{\vec N},\mathcal{L}_{\vec M}]=\mathcal{L}_{[\vec N,\vec M]}.$
GR DOF in $D$ dimensions. Let the spatial dimension be $d=D-1$ . The spatial metric has $d(d+1)/2$ independent components, and there are $D$ first-class secondary constraints. Therefore
$N_{\text{phys}} = \frac{d(d+1)}2-D = \frac{D(D-3)}2.$
Lorenz residual freedom. Under $A_\mu\to A_\mu+\partial_\mu\alpha$ ,
$\partial^\mu A_\mu\to \partial^\mu A_\mu+\Box\alpha.$
For a null plane wave, take $\alpha=\alpha_0e^{ik\cdot x}$ , so $\Box\alpha=-k^2\alpha=0$ . The residual shift $a_\mu\to a_\mu+i\alpha_0k_\mu$ removes the pure-gauge polarization.
Path-integral determinants. With
$\phi_1=\pi^0, \qquad \phi_2=\partial_i\pi^i, \qquad \chi_1=A_0, \qquad \chi_2=\partial_iA_i,$
the gauge-fixing matrix is block diagonal, with entries $1$ and $-\nabla^2$ up to signs. The Lorenz determinant $\det\Box$ is different because it fixes a spacetime history rather than an equal-time phase-space representative.
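As a checkpoint for the Coulomb-gauge exercise, the expected Dirac bracket is easiest to test in momentum space, where the transverse projector takes the form $P_{ij}=\delta_{ij}-k_ik_j/\mathbf{k}^2$ . The sketch below verifies its defining properties for one sample wave vector; the choice `k = (1, 2, 2)` and the helper names are assumptions made purely for illustration.

```python
from fractions import Fraction


def transverse_projector(k):
    """P_ij = delta_ij - k_i k_j / k^2 for a nonzero 3-vector k."""
    k2 = sum(ki * ki for ki in k)
    return [[Fraction(int(i == j)) - Fraction(k[i] * k[j], k2) for j in range(3)]
            for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(3)) for j in range(3)]
            for i in range(3)]


k = (1, 2, 2)                      # illustrative wave vector with k^2 = 9
P = transverse_projector(k)

is_idempotent = matmul(P, P) == P                                       # P^2 = P
is_transverse = all(sum(k[i] * P[i][j] for i in range(3)) == 0 for j in range(3))
trace = sum(P[i][i] for i in range(3))                                  # surviving polarizations

print(is_idempotent, is_transverse, trace)
```

The trace counts the directions the projector keeps, which matches the two transverse polarizations of Maxwell theory.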

8. References for further study

P. A. M. Dirac, Lectures on Quantum Mechanics (constrained Hamiltonian systems).
M. Henneaux & C. Teitelboim, Quantization of Gauge Systems (modern, systematic).
K. Sundermeyer, Constrained Dynamics (classic).
R. M. Wald, General Relativity (ADM, constraints, canonical structure).
E. Poisson, A Relativist’s Toolkit (3+1 tools, extrinsic curvature).
Arnowitt–Deser–Misner (ADM) original papers/lectures (historical source).
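C. Liang (梁灿彬) & B. Zhou (周彬), Introduction to Differential Geometry and General Relativity (微分几何入门与广义相对论), final volume (下册), 2nd ed., Science Press (科学出版社), in Chinese.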
