Show code cell content
import mmf_setup; mmf_setup.nbinit()
import os
from pathlib import Path
FIG_DIR = Path(mmf_setup.ROOT) / '../Docs/_build/figures/'
os.makedirs(FIG_DIR, exist_ok=True)
import logging; logging.getLogger("matplotlib").setLevel(logging.CRITICAL)
%matplotlib inline
import numpy as np, matplotlib.pyplot as plt
try: from myst_nb import glue
except ImportError: glue = None
import manim.utils.ipython_magic
!manim --version
Manim Community v0.18.0
Hamiltonian Mechanics#
Recall that with Lagrangian mechanics, one recovers Newton’s laws as a principle of extremal action:
The quantities \(p_i = \partial L/\partial\dot{q}_i\) are called the conjugate momenta. Expressed in this form, we have:
The idea of Hamiltonian mechanics is to effect a Legendre transformation, replacing the coordinates \(\dot{\vect{q}}\) with the conjugate momenta \(\vect{p}\):
Now we have Hamilton’s equations of motion:
Do it! Prove that these are correct.
We first solve the first equation for \(\dot{\vect{q}} = \dot{\vect{q}}(\vect{q}, \vect{p}, t)\), then substituting:
Now we simply differentiate, using the chain rule, and cancel \(\vect{p} = \partial L/\partial\dot{\vect{q}}\):
Do it! Invert the process: find \(L(\vect{q}, \dot{\vect{q}}, t)\) from \(H(\vect{q}, \vect{p}, t)\).
From \(H(\vect{q}, \vect{p}, t)\), use Hamilton’s equations of motion to find the velocity, then reverse the transformation:
Phase Flow#
Although one can study dynamics by plotting the flow of trajectories in configuration space, i.e., the velocity versus position \((\vect{q}, \dot{\vect{q}})\), there are advantages to considering dynamics in phase space \((\vect{q}, \vect{p})\), replacing the velocity with the generalized momentum. Specifically, Hamiltonian dynamics preserves area in phase space, which is not necessarily true if one uses the velocity.
Hamilton’s equations define phase flow. Let \(\vect{y}(0) = (\vect{q}_0, \vect{p}_0)\) be a point in phase space. The phase flow is defined by the trajectory \(\vect{y}(t)\) through phase space which satisfies Hamilton’s equations:
Example#
Here we implement a simple example of a pendulum of mass \(m\) hanging down distance \(r\) from a pivot point in a gravitational field \(g>0\) with coordinate \(\theta\) so that the mass is at \((x, z) = (r\sin\theta, -r\cos\theta)\) and \(\theta=0\) is the downward equilibrium position:
Show code cell source
plt.rcParams['figure.dpi'] = 300
from scipy.integrate import solve_ivp
alpha = 0.0
m = r = g = 1.0
w = np.sqrt(g/r)  # Angular frequency of small oscillations.
T = 2*np.pi / w   # Period of small oscillations.

def f(t, y):
    # We will simultaneously evolve N points.
    N = len(y)//2
    theta, p_theta = np.reshape(y, (2, N))
    dy = np.array([p_theta/m/r**2*np.exp(-alpha*t),
                   -m*g*r*np.sin(theta)*np.exp(alpha*t)])
    return dy.ravel()
# Start with a circle of points in phase space centered here
def plot_set(y, dy=0.1, T=T, phase_space=True, N=10, Nt=5, c='C0',
             Ncirc=1000, max_step=0.01, _alpha=0.7,
             fig=None, ax=None):
    r"""Plot the phase flow of a circle centered about y0.

    Arguments
    ---------
    y : (theta0, ptheta_0)
        Center of initial circle.
    dy : float
        Radius of initial circle.
    T : float
        Time to evolve to.
    phase_space : bool
        If `True`, plot the scaled momentum $P = pe^{-\alpha t}$,
        otherwise plot the canonical phase space $(q, p)$.
    N : int
        Number of points to show on circle and along path.
    Nt : int
        Number of images along trajectory to show.
    c : color
        Color.
    _alpha : float
        Transparency of regions.
    max_step : float
        Maximum spacing for times dt.
    Ncirc : int
        Minimum number of points in circle.
    """
    global alpha
    skip = int(np.ceil(Ncirc / N))
    N_ = N * skip
    th = np.linspace(0, 2*np.pi, N_ + 1)[:-1]
    z = dy * np.exp(1j*th) + np.asarray(y).view(dtype=complex)
    y0 = np.ravel([z.real, z.imag])
    skip_t = int(np.ceil(T / max_step / Nt))
    Nt_ = Nt * skip_t + 1
    t_eval = np.linspace(0, T, Nt_)
    res = solve_ivp(f, t_span=(0, T), y0=y0, t_eval=t_eval)
    assert Nt_ == len(res.t)
    thetas, p_thetas = res.y.reshape(2, N_, Nt_)
    ylabel = r"$p_{\theta}$"
    if phase_space:
        ylabel = r"$P_{\theta}=p_{\theta}e^{-\alpha t}$"
        p_thetas *= np.exp(-alpha * res.t)
    if ax is None:
        fig, ax = plt.subplots()
    ax.plot(thetas[::skip].T, p_thetas[::skip].T, "-k", lw=0.1)
    for n in range(Nt+1):
        tind = n*skip_t
        ax.plot(thetas[::skip, tind], p_thetas[::skip, tind], '.', ms=0.5, c=c)
        ax.fill(thetas[:, tind], p_thetas[:, tind], c=c, alpha=_alpha)
    ax.set(xlabel=r"$\theta$", ylabel=ylabel, aspect=1)
    return fig, ax
# Version without damping for now
alpha = 0.0
xlim = (-3, 14)
ylim = (-2, 3)
fig, ax = plt.subplots(figsize=(10, 5))
ys = np.linspace(0.25, 3.5, 13)
for n, y in enumerate(ys):
plot_set(y=(y, y), c=f"C{n}", phase_space=False, ax=ax)
ax.set(xlim=xlim, ylim=ylim)
plt.tight_layout()
fig.savefig(FIG_DIR / "phase_space_pendulum_no_damping.svg")
display(fig)
plt.close(fig)
# Version with damping for use elsewhere.
xlim = (-1.5, 14)
fig, ax = plt.subplots(figsize=(10, 10 * abs(np.diff(ylim)/np.diff(xlim))[0]))
alpha = 0.3
for n, y in enumerate(ys):
plot_set(y=(y, y), c=f"C{n}", ax=ax, T=1.3*T, phase_space=True)
ax.set(ylabel=r"$p_{\theta}$", xlim=xlim, ylim=ylim, aspect='auto')
plt.tight_layout()
fig.savefig(FIG_DIR / "phase_space_pendulum_damping.svg")
plt.close(fig)
Liouville’s Theorem#
An interesting property of phase flow is that it conserves areas in phase-space. To see this, recall that a mapping \(\vect{y} \mapsto \vect{f}(\vect{y})\) changes areas by a factor of the determinant of the Jacobian of the transformation:
Liouville’s theorem states that if the divergence of \(f(\vect{y})\) vanishes \(\vect{\nabla}_{\vect{y}}\cdot \vect{f}(\vect{y}) = 0\), then the mapping preserves volumes.
Sketch of Proof
See [Arnol'd, 1989] Chapter 3 section 16 for a complete proof.
Let \(v(t)\) be the volume of a region \(D(t)\) at time \(t\):
Show that \(\dot{v}|_{t=0} = \int_{D(0)} \vect{\nabla}_{\vect{y}}\cdot \vect{f} \d{y} = 0\) by expanding the Jacobian at time \(t=0\):
This is valid at any time, so \(\dot{v}(t) = 0\) and the area is unchanging. That the divergence of \(\vect{f}\) is zero follows from the equality of mixed partial derivatives:
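As a numerical check of this (a sketch, assuming the pendulum units \(m = r = g = 1\) used in the example above), we can compute the Jacobian of the time-\(T\) flow map by finite differences and verify that its determinant is one:

```python
import numpy as np
from scipy.integrate import solve_ivp

def flow(y0, T=1.0):
    """Evolve a phase-space point y0 = (theta, p) for time T."""
    rhs = lambda t, y: np.array([y[1], -np.sin(y[0])])  # Hamilton's equations
    return solve_ivp(rhs, (0, T), y0, rtol=1e-10, atol=1e-12).y[:, -1]

def jacobian_det(y0, T=1.0, h=1e-6):
    """Finite-difference Jacobian determinant of the time-T flow map."""
    J = np.empty((2, 2))
    for j in range(2):
        dy = np.zeros(2); dy[j] = h
        J[:, j] = (flow(y0 + dy, T) - flow(y0 - dy, T)) / (2*h)
    return np.linalg.det(J)

det = jacobian_det(np.array([0.5, 0.3]), T=2.0)
assert abs(det - 1) < 1e-6   # phase-space area is preserved
```

The determinant stays at one for any starting point and evolution time, as Liouville's theorem requires.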
A Hamiltonian with Damping#
A common misconception is that damping or dissipation cannot be included in a Hamiltonian or Lagrangian framework. Consider the following:
This is just Newtonian dynamics for a particle of mass \(m\) in a potential \(V(q)\) with a drag term \(-m\alpha v\) such as might be seen for a particle falling through the atmosphere.
It is instructive to consider this in the Lagrangian framework:
Here we see that the factor \(\lambda(t) = e^{\alpha t}\) simply multiplies the Lagrangian. In the action, this can be interpreted as a rescaling of time \(\d{\tau} = e^{\alpha t}\d{t}\):
This presents an analogy of cosmology where the universe is decelerating as it expands, leading to an effective cooling from the dissipative term introduced by the scaling of time.
We now demonstrate Liouville’s theorem. Note that this applies only to the canonically conjugate variables \((q, p)\) that enter the Hamiltonian (top plot below), but not to the conventional coordinates \((q, m\dot{q})\) for which the damping results in contraction.
The key is that the canonical momentum is not \(m\dot{q}\), but instead \(p = m\dot{q}e^{\alpha t}\).
The extra time-dependent factor \(e^{\alpha t}\) exactly counters this contraction to ensure Liouville’s theorem. However, now the phase space dynamics are not independent of time.
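Here is a minimal numerical sketch of this point, assuming a damped harmonic oscillator with \(m = \omega = 1\) and \(\alpha = 0.3\) (an illustrative case, independent of the pendulum code below): the flow preserves area in the canonical variables \((q, p)\), but contracts it by \(e^{-\alpha t}\) in the physical variables \((q, m\dot{q})\).

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha, Tf = 0.3, 2.0   # damping rate and evolution time (illustrative values)

def flow(y0):
    # Hamilton's equations for H = p^2 e^{-alpha t}/2 + q^2 e^{alpha t}/2.
    rhs = lambda t, y: np.array([y[1]*np.exp(-alpha*t), -y[0]*np.exp(alpha*t)])
    return solve_ivp(rhs, (0, Tf), y0, rtol=1e-10, atol=1e-12).y[:, -1]

def det_flow(y0, h=1e-6, physical=False):
    # Finite-difference Jacobian of the flow map; optionally convert the
    # final canonical p to the physical momentum m*qdot = p e^{-alpha t}.
    J = np.empty((2, 2))
    for j in range(2):
        dy = np.zeros(2); dy[j] = h
        J[:, j] = (flow(y0 + dy) - flow(y0 - dy)) / (2*h)
    if physical:
        J[1] *= np.exp(-alpha*Tf)
    return np.linalg.det(J)

y0 = np.array([1.0, 0.0])
assert abs(det_flow(y0) - 1) < 1e-6                                # canonical: preserved
assert abs(det_flow(y0, physical=True) - np.exp(-alpha*Tf)) < 1e-6 # physical: contracts
```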
Show code cell source
fig, axs = plt.subplots(
2, 1, figsize=(20, 12), sharex=True,
gridspec_kw=dict(height_ratios=(3.5, 1)))
# Saved version with damping.
fig0, ax0 = plt.subplots(figsize=(10,5))
alpha = 0.3
for n, y in enumerate(np.linspace(0.25, 1.75, 6)):
plot_set(y=(y, y), c=f"C{n}", ax=axs[0], T=1.3*T, phase_space=False)
[plot_set(y=(y, y), c=f"C{n}", ax=ax, T=1.3*T)
for ax in [ax0, axs[1]]]
axs[0].set(ylim=(-6, 7))
axs[0].set(title=fr"$\alpha = {alpha}$, $T=1.3\times 2\pi \sqrt{{r/g}}$");
#fig0.savefig(FIG_DIR / "phase_space_pendulum_damping.svg")
plt.close(fig0)
Dispersion and Relativity#
A useful application of the Hamiltonian framework is if you know the dispersion relation \(E(p)\) – the kinetic energy as a function of the momentum. Simply apply Hamilton’s equations of motion:
Group Velocity#
If you have studied waves, you may recognize the second equation \(\dot{q} = E'(p)\) as the group velocity of a wave, which is the derivative of the dispersion relationship. The last equation \(\dot{q} = F = -V'(q)\) is just Newton’s law.
Effective Mass#
We can get the more conventional form of Newton’s law by taking the time derivative of the second equation and using the chain rule:
This shows us that the effective mass is the inverse of the second derivative of the dispersion, \(m = 1/E''(p)\); i.e., the curvature of the dispersion defines the mass.
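As a check of this relation (a sketch anticipating the relativistic dispersion \(E(p) = \sqrt{p^2c^2 + m^2c^4}\) discussed below), the effective mass \(1/E''(p)\) works out to the familiar \(\gamma^3 m\):

```python
import sympy as sp

# Relativistic dispersion; m, c positive symbols.
p, m, c = sp.symbols('p m c', positive=True)
E = sp.sqrt(p**2*c**2 + m**2*c**4)
m_eff = 1/sp.diff(E, p, 2)   # effective mass = 1/E''(p)
gamma = E/(m*c**2)           # since E = gamma m c^2
assert sp.simplify(m_eff - gamma**3*m) == 0
```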
Constant Force#
There is a peculiar special case which we can solve easily: that of a particle under a constant force \(F\):
The solution has two arbitrary constants \(a\) (position) and \(b\) (speed):
Do It! Derive this solution.
In this case, we have the following equations:
Hence, the momentum is linear in time:
Using the usual chain-rule trick twice:
Finally, substituting \(p(t)\) and redefining the constants, we have
where \(a\) is a constant position, and \(b\) is a constant velocity. Evaluating this at time \(t=0\) allows us to replace \(a\) and \(b\) with the initial position \(q_0\) and velocity \(v_0\):
Here the trajectory of the particle has the same shape as the dispersion.
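This can be checked numerically (a sketch assuming the relativistic dispersion with \(m = c = 1\), which appears in the next section): with \(p(t) = p_0 + Ft\), the trajectory is \(q(t) = q_0 + \bigl(E(p_0 + Ft) - E(p_0)\bigr)/F\), i.e., it traces the shape of \(E(p)\).

```python
import numpy as np
from scipy.integrate import solve_ivp

F, p0, q0 = 0.5, 0.0, 0.0                 # illustrative values
E = lambda p: np.sqrt(p**2 + 1)           # dispersion E(p) with m = c = 1
dE = lambda p: p/np.sqrt(p**2 + 1)        # group velocity E'(p)

rhs = lambda t, y: np.array([dE(y[1]), F])   # qdot = E'(p), pdot = F
ts = np.linspace(0, 5, 50)
res = solve_ivp(rhs, (0, 5), [q0, p0], t_eval=ts, rtol=1e-10, atol=1e-12)
q_exact = q0 + (E(p0 + F*ts) - E(p0))/F   # trajectory has the shape of E(p)
assert np.allclose(res.y[0], q_exact, atol=1e-7)
```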
Special Relativity#
For a specific example, recall that in special relativity, the kinetic energy of a particle of rest-mass \(m\) and momentum \(p\) is
where \(c\) is the speed of light, which should be constant for all observers.
Rindler Coordinates#
Combining the relativistic dispersion and constant force result gives us the relativistic motion of a constantly accelerating object. Choosing the constants so that we start at time \(t=0\) at rest \(v_0 = p_0 = 0\) and at position \(q_0\), we have
where \(h\) is the natural length scale. Choosing units so that \(h = c = 1\):
Do It! Find this solution.
Using \(E(p) = \sqrt{p^2c^2 + m^2c^4}\) with \(p_0 = v_0 = 0\) and taking \(a=0\), we have
where \(h = mc^2/F\).
With some work, this allows one to consider physics from the perspective of a constantly accelerating observer using Rindler coordinates: the relativistic equivalent of a constant gravitational field. Even though this is purely special relativity, the Rindler frame has some very interesting properties, including the existence of an event horizon a distance \(h\) below the observer. If you drop a watch, it will approach this horizon, slowing both the rate at which it falls and the rate at which the clock runs, never actually falling across the horizon: just like a black hole.
Relativistic Lagrangian#
One might like to try to figure out what Lagrangian gives rise to an equation with dispersion \(E(p)\) by effecting the Legendre transformation:
We must now solve for \(p(\dot{q})\) by inverting \(\dot{q} = E'(p)\):
Note that, in general, this does not have the form \(L=T-V\), kinetic minus potential, unless \(E(p) \propto p^2\), i.e., as for Newtonian mechanics.
Do It! Find the conditions for \(L = T-V\) to work.
Noting that \(\dot{q} = E'(p)\) we are looking for the forms of \(E(p)\) such that
Thus, the form \(L=T(\dot{q})-V(q)\) only works for Newtonian mechanics where \(E(p)\) is purely quadratic in \(p\).
Simplifying the relationship for the group velocity \(\dot{q} = E'(p)\) we have
Thus, we recover the well-known relativistic relationship:
in terms of the Lorentz factor \(\gamma(\beta)\) that relates the rate of change of time \(t\) in the inertial frame to proper time \(\tau\) in the co-moving frame:
Collecting and simplifying we find the correct Lagrangian:
Do It! Derive \(L(q, \dot{q}, t)\).
We start by noting that \(\dot{q} = pc^2/E(p)\) so that:
Then, we use the Legendre transformation
This makes more sense when we consider the action and introduce proper time \(\tau\) such that \(1/\gamma = \d{\tau}/\d{t}\):
Note that the relativistic Lagrangian does not have the form \(T - V\). Instead, the action corresponding to the kinetic piece becomes manifestly Lorentz invariant: it is simply the proper time of a clock moving with the observer, multiplied by appropriate dimensional factors.
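The Legendre transformation above can be verified symbolically (a sketch using sympy): with \(\dot{q} = E'(p)\), the Lagrangian \(L = p\dot{q} - E\) reduces to \(-mc^2\sqrt{1-\beta^2} = -mc^2/\gamma\).

```python
import sympy as sp

p, m, c = sp.symbols('p m c', positive=True)
E = sp.sqrt(p**2*c**2 + m**2*c**4)
qdot = sp.diff(E, p)            # group velocity qdot = E'(p) = p c^2 / E
L = sp.simplify(p*qdot - E)     # Legendre transform back to the Lagrangian
beta = sp.simplify(qdot/c)
assert sp.simplify(L + m*c**2*sp.sqrt(1 - beta**2)) == 0   # L = -m c^2 / gamma
```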
Canonical Transformations#
The power of the Hamiltonian framework is that it allows one to express the most general possible coordinate transformations that preserve the structure of Hamilton’s equations. These are called canonical transformations, and can be checked using the Poisson bracket which we now define.
Poisson Bracket#
Given a set of canonical variables: coordinates \(\vect{q}\) and associated conjugate momenta \(\vect{p}\), one can define the Poisson bracket \(\{\cdot, \cdot\}\):
Do It! Prove the following properties of the Poisson bracket:
Note that these are the same properties shared by the matrix commutator \([\mat{A}, \mat{B}] = \mat{A}\mat{B} - \mat{B}\mat{A}\).
Note that
Canonical Quantization
Once canonically conjugate variables \(q\) and \(p\) have been identified, one can seek linear operators \(\op{q}\) and \(\op{p}\) whose commutator satisfy an analogous commutation relation:
Once such operators are found, the theory can be canonically quantized by replacing classical observables, such as the Hamiltonian \(H(q, p)\), with the quantum operators:
This works for operators that are quadratic in the coordinates and momenta, but can fail in general, especially if there are higher-order terms in both \(p\) and \(q\) where the order of operators becomes ambiguous. For more information, see the associated discussion of the Moyal bracket, which generalizes the Poisson bracket.
Canonically Conjugate Variables#
Starting from a Lagrangian framework, we can identify canonically conjugate variables \(q_i\) and \(p_i = \partial L/\partial \dot{q}_i\). The Poisson bracket allows us to check whether or not new coordinates \(\vect{Q}(\vect{q}, \vect{p}, t)\) and \(\vect{P}(\vect{q}, \vect{p}, t)\) are canonically conjugate: They are if they satisfy
Note that these relations hold for any set of conjugate pairs \((\vect{q}, \vect{p})\) and \((\vect{Q}, \vect{P})\), so we can suppress the subscript \(\{\cdot, \cdot\}_{qp} \equiv \{\cdot, \cdot\}_{QP} \equiv \{\cdot, \cdot\}\): the Poisson bracket expresses an operation that does not fundamentally depend on the coordinates. (See the Poisson bracket in coordinate-free language)
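This check is easy to automate. As a sketch, consider the harmonic-oscillator transformation \(Q = \tan^{-1}(m\omega q/p)\), \(P = (p^2 + m^2\omega^2 q^2)/(2m\omega)\) (an assumed illustrative example, foreshadowing the action-angle variables below), and verify \(\{Q, P\}_{qp} = 1\) with sympy:

```python
import sympy as sp

q, p = sp.symbols('q p', real=True)
m, w = sp.symbols('m omega', positive=True)
Q = sp.atan(m*w*q/p)
P = (p**2 + m**2*w**2*q**2)/(2*m*w)

def pb(f, g):
    """Poisson bracket {f, g} with respect to (q, p)."""
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

assert sp.simplify(pb(Q, P)) == 1   # (Q, P) are canonically conjugate
```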
Do It! Prove that \(\{f, g\}_{qp} = \{f, g\}_{QP}\).
We limit our proof to a single variable and suppress the dependence on \(t\) – the generalization is straightforward. The key is to note that
Introducing the comma notation for partial derivatives, \(f_{,Q} = \partial f/\partial Q\) etc., we have
Evolution as a Canonical Transformation#
One very important set of canonically conjugate variables is defined by evolution. Note that evolution defines a map from phase space into itself:
I.e., if we take our initial conditions \((q_0, p_0)\) at time \(t=t_0\) as a point in phase space, then evolving these for time \(t\) gives a new point in phase space \((Q, P)\) where the initial point ends up at time \(t_0 + t\). This evolution defines a canonical transformation from the old coordinates \((q, p) \equiv (q_0, p_0)\):
Do It! Show that evolution is indeed a canonical transformation.
Hint: show that the Poisson bracket \(\{Q, P\}_{q_0p_0} = 1\). This requires relating changes at the end of a trajectory to changes in the initial condition, which can be done by using the fact that physical trajectories extremize the action:
These trajectories satisfy the Euler-Lagrange equation:
Now consider physical trajectories (i.e. satisfying the Euler-Lagrange equations) that start and end at slightly different positions \(q_0 + \d{q_0}\) and \(q_1 + \d{q_1}\). The variation in the action is now
The first two terms are related to varying the endpoints of the integral, but note that they are not simply the Lagrangian because we must change \(t_1\) while readjusting the trajectory so that \(q(t_1+\delta t_1) = q_1\) remains fixed. We will return to this in a later problem, but for now, we fix \(t_0\) and \(t_1\) so \(\d{t_0} = \d{t_1} = 0\). To compute the second contribution, we use the equations of motion
Hence, \(p_1 = S_{,q_1}\) and \(p_0 = -S_{,q_0}\).
Now let \((Q, P) = (q_1, p_1)\) be the new coordinates, and \((q, p) = (q_0, p_0)\) be the old coordinates, i.e., viewing evolution as a map on phase space:
Therefore
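For the harmonic oscillator this can be verified explicitly (a sketch assuming \(m = \omega = 1\), where the flow is known in closed form): the time-\(t\) evolution is the rotation \(Q = q_0\cos t + p_0 \sin t\), \(P = -q_0\sin t + p_0\cos t\), and its Poisson bracket with respect to the initial conditions is identically one.

```python
import sympy as sp

q0, p0, t = sp.symbols('q_0 p_0 t', real=True)
# Exact harmonic-oscillator flow (m = omega = 1): a rotation in phase space.
Q = q0*sp.cos(t) + p0*sp.sin(t)
P = -q0*sp.sin(t) + p0*sp.cos(t)
bracket = sp.diff(Q, q0)*sp.diff(P, p0) - sp.diff(Q, p0)*sp.diff(P, q0)
assert sp.simplify(bracket) == 1   # evolution is a canonical transformation
```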
Generating Functions#
Canonical transformations can be generated by specifying an arbitrary generating function of one old and one new coordinate. There are thus four types of such functions:
From these, one can generate the omitted coordinates and the new Hamiltonian as follows:
Or, more compactly:
Note that these equations give implicit relationships between the old and new coordinates which must be solved to complete the transformation. Once this is done, the new Hamiltonian in all cases is:
which will give Hamilton’s equations
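As a simple illustration (a standard textbook example, not tied to a particular system), the type-1 function \(F_1(q, Q) = qQ\) generates the "exchange" transformation \(p = \partial F_1/\partial q = Q\), \(P = -\partial F_1/\partial Q = -q\), which swaps the roles of coordinates and momenta up to a sign:

```python
import sympy as sp

q, Q = sp.symbols('q Q', real=True)
F1 = q*Q              # type-1 generating function
p = sp.diff(F1, q)    # p = dF1/dq = Q
P = -sp.diff(F1, Q)   # P = -dF1/dQ = -q
assert p == Q and P == -q   # coordinates and momenta exchange (up to sign)
```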
The general approach for deriving these is to note that the most general transformation from \((q, p)\) to \((Q, P)\) that preserves the equations of motion after minimizing the action consists of three things: 1) scaling the Lagrangian (this just changes the time unit), 2) effecting a coordinate change, and 3) adding a total derivative.
The scaling just changes the units of time, so we neglect this (\(\alpha = 1\)). It is the addition of a total derivative that provides additional freedom over the standard Lagrangian approach of writing the new coordinates as a simple function of the old \(Q=Q(q, t)\).
Expressing this relationship in terms of differentials, and introducing the Hamiltonians, we have
Equating these, and collecting the differentials gives our formula above for the canonical transformation generated by \(F_1(q, Q, t)\). To obtain the relationship for another function, say \(F_2(q, P, t)\), we need to switch \(-P\d{Q}\) to \(Q\d{P}\), which can be done by adding \(\d{(QP)} = Q\d{P} + P\d{Q}\). I.e., \(F_2 = F_1 + QP\):
This accounts for the sign change.
Do It! Show explicitly that this procedure works.
Our task is to show that if
then
One can proceed by brute force. The idea here is to choose three independent variables, and then equate partials. We choose \((q, P, t)\) since these are the natural variables for \(F\):
Equating \(H' = H + F_{,t}\) gives the following conditions (by varying only one of \(\d{q}\), \(\d{P}\), and \(\d{t}\) at a time respectively):
Solving, we have
We can then solve for \(\dot{Q}\) and \(\dot{P}\):
Q.E.D.
Hamilton-Jacobi Equation#
One important transformation is to find a generating function \(F_{1}(q, Q, t) = S(q, Q, t)\) such that \(H' = 0\). This transformation defines coordinates which are completely independent of time \(\dot{Q} = \dot{P} = 0\). The conditions \(H'(Q, P, t) = H(q, p, t) + S_{,t}\) and \(p = S_{,q}\) define the Hamilton-Jacobi equation:
Do It! Show that the classical action \(S(q, Q, t)\) is a solution.
Show that classical action for a physical trajectory \(q(t)\) with \(q(t_0) = q_0\) and \(q(t_1) = q_1\),
gives a solution \(S(q, Q, t)\) to the Hamilton-Jacobi equation with \(Q=q_0\), and \(t=t_1\).
Solution
Computing the total differential, we have
as we did above when showing that evolution is a canonical transformation. We now focus on the first partials.
Warning
There is a subtlety here: \(S_{,t_1} \neq L_1\) as we might expect from varying the upper endpoint. This is because the action is defined over physical trajectories. Computing the partial \(S_{,t_1}\) requires readjusting \(q(t)\) so that \(q(t_1+\delta t_1) = q_1\) remains fixed!
Suppose that \(q(t)\) is a physical trajectory satisfying some initial conditions \(q(t_0) = q_0\) and \(\dot{q}(t_0) = \dot{q}_0\). In this case, the functions \(q(t)\), \(\dot{q}(t)\), and thus \(L(q, \dot{q}, t)\) are simply functions of time. For this trajectory only, we can say that \(\delta S = L_1\delta t_1\), or similar by extending backwards in time. What this really means is:
Hence, the relationships we need follow from \(\delta{q}_i / \delta t_i = \dot{q}_i\):
The total differential is thus:
Thus, setting \(Q=q_0\), \(q=q_1\), \(p=p_1\), and \(t=t_1\) we have
Q.E.D.
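As a concrete instance (a sketch assuming a free particle with \(H = p^2/2m\), chosen for its simple closed-form action), one can check with sympy that the classical action \(S(q, Q, t) = m(q-Q)^2/2t\) solves the Hamilton-Jacobi equation \(S_{,t} + S_{,q}^2/2m = 0\):

```python
import sympy as sp

q, Q, t, m = sp.symbols('q Q t m', positive=True)
S = m*(q - Q)**2/(2*t)                     # free-particle classical action
HJ = sp.diff(S, t) + sp.diff(S, q)**2/(2*m)   # Hamilton-Jacobi residual
assert sp.simplify(HJ) == 0
```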
Hamilton’s Principal and Characteristic Functions#
The typical approach for solving the Hamilton-Jacobi equation is to look for separability. For example, if the Hamiltonian is time-independent, then we can write
The left-hand-side contains no time, while the right-hand-side contains no \(\vect{q}\) and \(\vect{Q}\). Thus, the two sides must both be constant. That constant is the conserved Hamiltonian implied by Noether’s theorem when one has time-translation invariance. In most cases, that constant is the energy \(E\) of the system, so we use this notation and solve for \(f(t) = -Et\):
This differentiates between the time-dependent action \(S(\vect{q}, \vect{Q}, t)\), which is called Hamilton’s principal function, and the time-independent action \(W(\vect{q}, \vect{Q}) = S + Et\), which is called Hamilton’s characteristic function.
Action-Angle Coordinates#
Another related set of coordinates is called action-angle coordinates. We will give the general formulation below, but the basic idea applies to time-independent Hamiltonians where the Hamilton-Jacobi equation is separable so that the motion can be considered quasi-periodic. I.e., each of the coordinates executes either a libration \(q_k(t+T_k) = q_k(t)\) or a rotation (where the coordinate is periodic like an angle) \(q_k + Q_k \equiv q_k\). In both cases, we can define the action over a complete period/cycle:
where the integral is over the path defined by the condition of constant energy \(E=E(q_k, p_k)\). The energy of such a system is a function only of these actions
hence the corresponding canonical coordinates (the angle coordinates) have constant velocity:
The geometric structure is one of an \(n\)-torus in phase space defined by the constant energy manifold \(E = E(\vect{q}, \vect{p}) = E(\vect{I})\). The periodic trajectories are orbits around these tori, and the angle variables specify where on the orbits the system is at a given time, parameterized in such a way as to advance at a constant rate in time.
The process of finding these is straightforward:
Calculate the new generalized momenta \(\vect{I}\) – the action variables:
\[\begin{gather*} I_k = \oint p_k\,\frac{\d{q_k}}{2\pi} = \tfrac{1}{2\pi}\oint p_{k}\dot{q}_k \, \d{t}. \end{gather*}\]
Express the original Hamiltonian in terms of these \(H'(\vect{I}) = E(\vect{I})\).
Differentiate to get the frequencies:
\[\begin{gather*} \omega_k = \pdiff{H'}{I_k}, \qquad \theta_k = \theta_k(0) + \omega_k t. \end{gather*}\]
Action-Angle Generating Function#
These coordinates are generated by the type-2 generating function \(W(\vect{q}, \vect{I})\) called Hamilton’s characteristic function
For a single coordinate, the action-angle coordinates can be generated from the following type-2 function with \(P = I\) being the area in phase space (divided by \(2\pi\) for our units):
For applications to Canonical Perturbation Theory, we would like to have a form of action-angle variables where the transformed Hamiltonian \(H' = 0\). To do this, we simply note that the Hamiltonian is a constant function of the action variables \(H'(\vect{I}) = E\). Thus, we simply use the type-2 generating function
which is Hamilton’s principal function as before. This defines the following coordinates
Recall that the angle variables march forward with constant angular velocity \(\vect{\omega}\):
The new coordinates are just the (constant) starting angles \(\vect{\theta}_0\).
In summary, we have two transformations:
where
and \(\vect{I}\), \(\vect{\theta}_0\), and \(\vect{\omega}\) are constants.
Formalism#
The formalism for a system of \(n\) particles starts with identifying \(n\) integrals of motion \(F_{k}(\vect{q}, \vect{p})\) with \(F_1(\vect{q}, \vect{p}) = H(\vect{q}, \vect{p})\) which are in involution with each other, meaning that their Poisson brackets are zero:
Liouville proved the following about the level set
\(M_{\vect{f}}\) is a smooth manifold, invariant under phase flow with the Hamiltonian \(H = F_1\).
If the manifold \(M_{\vect{f}}\) is compact and connected, then it is diffeomorphic to the \(n\)-dimensional torus
\[\begin{gather*} T^{n} = \{\vect{\phi} \mod 2\pi\} \end{gather*}\]
The phase flow with Hamiltonian \(H = F_1\) determines conditionally periodic motion on \(M_{\vect{f}}\). I.e.:
\[\begin{gather*} \dot{\vect{\phi}} = \vect{\omega}(\vect{f}). \end{gather*}\]
The canonical equations can be integrated by quadratures. (I.e., although we only have \(n\) integrals, we can get all \(2n\) integrals of motion.)
The angle coordinates will be
but we must still identify the conjugate momenta \(\vect{I}\), which will not in general be the functions \(\vect{F}\). They are given, as discussed above, by computing the action about the periodic orbit (though it is not obvious from this discussion that these do indeed do the job).
Example: Harmonic Oscillator
The harmonic oscillator has the following solution:
From this, we can compute the action variable
\[\begin{gather*} I = \oint p\, \frac{\d{q}}{2\pi} = \frac{1}{2\pi}\int p\dot{q}\,\d{t} = \frac{1}{2\pi}\int_0^{2\pi/\omega} m \omega^2 A^2\sin^2\omega \t \,\d{\t} = \frac{m \omega A^2}{2} = \frac{E}{\omega}. \end{gather*}\]
We then invert this to express \(E(I)\):
\[\begin{gather*} E(I) = I\omega. \end{gather*}\]
Now we differentiate to obtain the angular velocity and angle variable:
\[\begin{gather*} \omega = \pdiff{E}{I}, \qquad \phi(\t) = \omega \t. \end{gather*}\]
Thus, we see that we recover the angular frequency.
Note
In this case it was trivial to invert the first expression to obtain \(E(I)\), but this might be complicated in general. Since one really only needs the partials to identify the angle variables, it is easiest to express the differentials from which the angular velocities can be obtained implicitly:
We now have the coordinate transformation:
The generating function is
This is most easily evaluated by considering \(\phi(q, I)\):
As a check:
Note that this is Hamilton’s characteristic function \(W = S + Et\) expressed in the appropriate coordinates as a type-2 generating function. (Needs checking.)
Note: When trying to derive these types of relationships, it is much easier to work in natural units where \(m\omega = 2I = 1\). The latter throws some of the baby out with the bathwater (you lose the dependence on \(I\) which is needed for derivatives) but simplifies algebra.
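The relation \(I = E/\omega\) is also easy to confirm numerically (a sketch with assumed values \(m = 1\), \(\omega = 2\), \(A = 0.7\)), by evaluating the loop integral \(\oint p\,\d{q}/2\pi\) over one period with the trapezoidal rule:

```python
import numpy as np

m, omega, A = 1.0, 2.0, 0.7        # illustrative values
t = np.linspace(0, 2*np.pi/omega, 200001)
q = A*np.cos(omega*t)              # harmonic-oscillator orbit
p = -m*omega*A*np.sin(omega*t)
# Trapezoidal approximation of (1/2pi) * the closed loop integral of p dq:
I = np.sum(0.5*(p[:-1] + p[1:])*np.diff(q))/(2*np.pi)
E = m*omega**2*A**2/2
assert abs(I - E/omega) < 1e-6     # I = E/omega
```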
Canonical Perturbation Theory#
To organize the perturbative calculation, one can turn to canonical perturbation theory which maintains the canonical relationship between variables as we add perturbative corrections.
We consider a Hamiltonian system with conjugate variables \((\vect{q}, \vect{p})\):
We now assume that a solution to the Hamiltonian problem \(H_0(\vect{q}, \vect{p}, t)\) is known such that we can effect a canonical transform via the generating function \(F_2(\vect{q}, \vect{P}, t)\) to a new set of conjugate variables \((\vect{Q}, \vect{P})\) with Hamiltonian \(K_0(\vect{Q}, \vect{P}, t)\):
Effecting the same transform on the full Hamiltonian \(H\) gives:
This is still exact, but note that the motion is now of order \(\epsilon\).
The first approach, explained in [Goldstein et al., 2000], is to use these equations to recursively build successive approximations. Thus, substituting the order \(\epsilon^0\) solution on the right-hand-side of these equations gives a linear system that can be integrated to obtain the order \(\epsilon^1\) approximation. The order \(\epsilon^1\) solution can then be inserted to compute the order \(\epsilon^2\) approximation, etc.
Explicitly, let \(\vect{y} = (\vect{Q}, \vect{P})\). The full solution will have the form:
In the case of a single degree of freedom \(\vect{y} = (Q, P)\) we have
Note that if \(H_1(Q,P)\) is independent of time, then, since \(Q_0\) and \(P_0\) are constant, \((Q_1, P_1)\) is linear in time, \((Q_2, P_2)\) is quadratic, etc., and all integrals are trivial.
Important
Note: By itself, applying this formalism for canonical perturbation theory does not ensure that you won’t have secular terms. I.e. if you try to apply these equations to the transformation to constant coordinates \((\vect{q}_0, \vect{p}_0)\) generated by \(F_2(\vect{q}, \vect{q}_0, t) = S(\vect{q}_0, t_0; \vect{q}, t)\), you will generally end up with linearly increasing perturbations.
You must also use action-angle coordinates as generated by the \(F_2(\vect{q}, \vect{I}, t) = S(\vect{q}, \vect{I}, t)\) described above. In this case, the secular terms give rise to changes in the angular frequency \(\vect{\omega}\), preserving the periodic nature of the solutions.
For details, see Worked Example: The Pendulum.
Adiabatic Invariants#
For 1D systems, the action variable \(I\) is an adiabatic invariant, which we explain here by way of an example. Consider a harmonic oscillator with a time-dependent frequency that varies slowly
Since the Hamiltonian now depends on time, the energy will not be conserved as we change the frequency, i.e., work may be done on the system. However, if we vary \(\omega(t)\) slowly enough, then we might expect that the energy will return if we take the system slowly (adiabatically) away from \(\omega_0\) but then slowly bring it back. The question is then, how does the energy depend on \(\omega(t)\) in the adiabatic limit?
For constant \(\omega\) we can find the action-angle variables (see the example above):
and we shall now show that \(I\) is an adiabatic invariant, meaning that it is constant up to quadratic corrections, \(I = I_0 + \order(\dot{\omega}^2)\), for small \(\dot{\omega}\). Thus, the answer to the question posed above is:
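A quick numerical sketch of this behavior (with assumed values: \(m = 1\) and a slow linear ramp of \(\omega\) from 1 to 2 over 200 time units): the energy roughly doubles, while \(I = E/\omega\) barely changes.

```python
import numpy as np
from scipy.integrate import solve_ivp

T_ramp = 200.0                                 # slow ramp => adiabatic
omega = lambda t: 1.0 + np.clip(t/T_ramp, 0, 1)
rhs = lambda t, y: np.array([y[1], -omega(t)**2*y[0]])   # qddot = -omega(t)^2 q, m = 1
res = solve_ivp(rhs, (0, T_ramp), [1.0, 0.0], rtol=1e-10, atol=1e-12)
q, v = res.y[:, -1]
E0 = 0.5*(0.0**2 + omega(0.0)**2*1.0**2)       # initial energy
Ef = 0.5*(v**2 + omega(T_ramp)**2*q**2)        # final energy
I0, If = E0/omega(0.0), Ef/omega(T_ramp)       # adiabatic invariant I = E/omega
assert abs(If - I0)/I0 < 0.02                  # I is (nearly) conserved...
assert Ef/E0 > 1.5                             # ...while E roughly doubles
```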
Do It! Find a \(g(t)\) so that \(2J = q^2/g^2 + (g\dot{q} - q\dot{g})^2\) is constant.
Incomplete: See Tong
Constants of motion should have zero Poisson bracket with the Hamiltonian,
so we start by expressing this constant in terms of \(p = m \dot{q}\):
Thus, we are looking for a function \(g(t)\) which satisfies:
Warning
Exactly what slowly or adiabatic means is quite subtle. See [Wells and Siklos, 2006] for a discussion. Averaging over cycles plays a key role that we have not discussed here.
WKB Approximation#
The traditional approach for the WKB approximation of quantum mechanics is to express the wavefunction as follows, then insert it into the Schrödinger equation:
Expanding \(W\) in powers of \(\hbar\), we have the following lowest two orders:
\(\order(\hbar^0)\): Hamilton-Jacobi Equation#
The order \(\hbar^0\) equation is the well-known Hamilton-Jacobi equation, which is satisfied by the classical action as a function of initial and final states.
\(\order(\hbar^1)\): Continuity Equation#
The order \(\hbar^1\) equation is the well-known continuity equation, expressing the conservation of particle number or probability. To see this, multiply through by \(2A^2\):
This is the 1-dimensional form of the familiar continuity equation, once we identify \(S'\) as the momentum, and \(A^2\) as the probability density:
Exercise
Show that the order \(\hbar^1\) equation is satisfied by
as given by the path integral formulation. Note: this is only valid up to an overall constant giving the dependence of \(A(x, t)\) on \(x\) and \(t\). One must adjust this overall constant to normalize the wavefunction.
Solution
Following the hint, we express \(A^2 = \partial S'/\partial x_0 \equiv S'_{,x_0}\) (using the comma notation for partial derivatives). We are trying to show that:
We proceed using the linearity of the derivatives to rearrange the left-hand side as:
because the potential does not depend on the initial position \(x_0\).
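Both orders can be verified symbolically for the free particle. A minimal SymPy sketch (my own check, not from the text, assuming \(t_0 = 0\) and the free-particle action \(S = m(x-x_0)^2/2t\)):

```python
import sympy as sp

m, x, x0, t = sp.symbols("m x x_0 t", positive=True)
S = m*(x - x0)**2/(2*t)      # free-particle classical action (t0 = 0)

# O(hbar^0): Hamilton-Jacobi equation, S_t + (S')^2/(2m) = 0
HJ = sp.diff(S, t) + sp.diff(S, x)**2/(2*m)
print(sp.simplify(HJ))       # -> 0

# O(hbar^1): continuity equation for n = A^2 = S_{,x x_0} with v = S'/m
n = sp.diff(S, x, x0)
cont = sp.diff(n, t) + sp.diff(n*sp.diff(S, x)/m, x)
print(sp.simplify(cont))     # -> 0
```

Here \(n \propto A^2\) only up to the overall constant mentioned in the exercise; the continuity equation is insensitive to that constant.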
Exercise: Geometry
Note that since \(S' = p\), the factor \(A\) can be expressed as
Explain this geometrically using Liouville’s theorem and the requirement that classical evolution conserves particle number
for fixed initial condition \(p_0(x_0)\).
Solution (Incomplete)
Liouville’s theorem tells us that evolution conserves phase-space volumes when considered as functions of \(t\) and initial conditions \(x_0\), and \(p_0\) at time \(t_0\):
To get \(n(x, t)\) from the previous relationship, we hold \(\d{x} = \d{t} = \d{t_0} = 0\).
WKB: Path Integral Formulation#
In quantum mechanics, one can use the Feynman path-integral approach to construct the propagator (here expressed in terms of position-to-position transitions):
where the integral is over all paths that start at \(q(t_0) = q_0\) and end at \(q(t) = q\), and \(S[q]\) is the classical action
Given an initial wavefunction \(\psi(q_0, t_0)\), the wavefunction at time \(t\) is:
The WKB approximation relies on the idea that classical trajectories, where \(S'[q_{\mathrm{cl}}] = 0\) (the famous principle of extremal action), dominate the propagator, and uses the expansion of the action
The WKB approximation amounts to considering all classical trajectories with appropriate boundary conditions, performing the path integral over \(\xi\), and dropping terms of order \(\order(\xi^3)\) and higher to obtain:
where \(\S = \S(q,t;q_0,t_0)\) is the classical action, with \(q=q_{\mathrm{cl}}(t)\) and \(q_0 = q_{\mathrm{cl}}(t_0)\) the final and initial points of the classical trajectory. The key point is that, in this approximation, all of the information about the propagator is contained in the classical action \(\S(q,t;q_0,t_0)\), sometimes called Hamilton's principal function.
Once the path integrals over \(\xi\) have been done, everything is expressed in terms of the classical trajectory \(q_{\mathrm{cl}}(t)\) and we shall drop the “cl” subscript in what follows.
(Note: if there are multiple trajectories that satisfy the boundary conditions, then they should be added, giving rise to quantum interference patterns.)
Example: Free Particle
As an example, consider a free particle with Hamiltonian \(H = p^2/2m\). This has the following solution \(x(t)\) and Hamilton’s principal function \(\S\):
Hence, the WKB propagator is:
A quick check shows that this is exact:
Extending this to higher dimensions, we have:
Similar results can be obtained from the momentum-to-position transitions if the initial state is expressed in terms of momentum; however, since the boundary conditions are no longer the same, we must use a different form \(\S(x,t;p_0,t_0)\):
Example: Free Particle continued
Changing variables, we now have:
Hence, the WKB propagator is:
A quick check shows that this is also exact, and confirms the need for the extra piece \(q_0 p_0\) in \(\S\):
Examples#
Example: Falling Particle 1D
As a second example, consider a particle in free-fall with Hamiltonian \(H = p^2/2m + mgz\). The classical problem is most easily solved with conservation of energy \(E\):
Since our Hamiltonian is time-independent, the results will depend only on \(t-t_0\) and we can choose \(t_0=0\) without loss of generality. This has the following solution \(z(t)\) and Hamilton’s principal function \(\S\):
The appropriate functional dependence can be deduced by solving for \(p_0(z,\t;z_0)\) and \(E(z,\t;z_0)\):
From these, we can compute the action and various partials:
Hence, the WKB propagator is:
To obtain the approximate Airy function, note that \(U_{WKB}\bigl(z,z_0,-(t-t_0)\bigr) = U_{WKB}^*\bigl(z,z_0,(t-t_0)\bigr)\): i.e., for a particle traveling up and then falling back down, there is interference between the up-going and down-going wavefunctions that results in twice the real part of \(U_{WKB}\). Suitably normalized, this provides an extremely good approximation to the Airy function, which exactly solves the corresponding quantum mechanics problem:
The quantum problem for a particle in a gravitational field is solved by this if we shift \(z-z_0\) by the maximum height \(z_0 = E/mg\), and scale by the natural length \(\xi\) so that \(x = (z-z_0)/\xi\) is dimensionless:
Thus, the WKB approximation gives
where the normalization needs to be fixed for negative \(x\), but is given here as \(1/\sqrt{\pi}\) for the standard Airy function normalization.
Show code cell content
from scipy.special import airy
m = hbar = g = 1
t0 = 0
z0 = 0
t = t0 + np.linspace(0, 6, 1000)[1:]
z = z0 - g*t**2/2
xi = (hbar**2/m**2/2/g)**(1/3)
x = (z-z0) / xi
phi = m/hbar*(
(z-z0)**2/2/(t-t0) - g/2*(z+z0)*(t-t0) - g**2*(t-t0)**3/24
)
psi = np.sqrt(m/2/np.pi/hbar/1j/(t-t0))*np.exp(1j*phi)
Ai, Aip, Bi, Bip = airy(z/xi)
fig, ax = plt.subplots(figsize=(5, 2.5))
ax.plot(x, Ai, label=r"$\mathrm{Ai}(z/\xi)$")
ax.plot(x, psi.real * Ai[-1] /np.real(psi)[-1], label=r"$\Re \psi_{WKB}$")
# Asymptotic form of Ai(x) as x -> -inf: cos(2(-x)^{3/2}/3 - pi/4)/(sqrt(pi)|x|^{1/4})
y = np.cos((-x)**(3/2)*2/3 - np.pi/4)/abs(x)**(1/4)/np.sqrt(np.pi)
ax.plot(x, y, ":",
        label=r"$\cos(\frac{2}{3}(-x)^{3/2} - \frac{\pi}{4})/\sqrt{\pi}|x|^{1/4}$")
ax.set(ylim=(-0.5, 1), xlabel=r"$z/\xi$", ylabel=r"$\psi$",
title=r"$\xi=\sqrt[3]{\hbar^2/(2m^2g)}$")
ax.legend();
if glue: glue("fig:airy", fig, display=False);
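The phase `phi` in the code cell above is \(\S/\hbar\) with \(\S = m\bigl((z-z_0)^2/2t - g(z+z_0)t/2 - g^2t^3/24\bigr)\) (taking \(t_0 = 0\)). As a consistency check (a SymPy sketch of my own), this \(\S\) satisfies the Hamilton-Jacobi equation for \(H = p^2/2m + mgz\), and the momenta it generates differ by exactly the impulse delivered by gravity:

```python
import sympy as sp

# Hamilton's principal function for free fall (t0 = 0), as used in the
# phase of the code cell above:
m, g, z, z0, t = sp.symbols("m g z z_0 t", positive=True)
S = m*((z - z0)**2/(2*t) - g*(z + z0)*t/2 - g**2*t**3/24)

# Hamilton-Jacobi equation for H = p^2/(2m) + m*g*z with p = dS/dz:
HJ = sp.diff(S, t) + sp.diff(S, z)**2/(2*m) + m*g*z
print(sp.simplify(HJ))  # -> 0

# Momenta from the standard relations p = S_z, p0 = -S_{z0}:
p = sp.diff(S, z)        # = m*(z - z0)/t - m*g*t/2
p0 = -sp.diff(S, z0)     # = m*(z - z0)/t + m*g*t/2
print(sp.simplify(p0 - p - m*g*t))  # -> 0: momentum lost to gravity
```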
Example: General Particle 1D
Slightly more generally, we now consider a particle falling in an arbitrary time-independent potential \(V(z) = mgz + \delta(z)\), where we will ultimately take \(\delta(z)\) to be small. Here, since energy is conserved, we immediately have:
The trajectory \(z(t)\) no longer has a closed form, but we can still express the action by changing variables \(\d{z} = \dot{z}\d{t} = p\d{t}/m\):
This form has two complications. First, the sign of the denominator must be chosen appropriately to match the direction of motion. This is often clear from the physics, and so does not pose a fundamental problem. Second, this form of the action can be expressed as an explicit function of either \(S(z, p; z_0)\) or \(S(z;z_0, p_0)\), since \(E = E(z_0, p_0) = E(z, p)\) is conserved and is a function of either the initial or the final coordinates.
A comment is in order about the role of the conserved energy \(E\) here. Note that if \(E=0\), then the numerator and denominator both contain factors of \(\sqrt{-V(z)}\) and can be combined. The presence of \(E\) seems to spoil this but, as is well known, in classical mechanics only the relative value of the energy is physically significant. To make this explicit, we note that
The first term clearly does not affect the physics, and in quantum mechanics, corresponds to an overall global phase. This is exactly the effect of shifting the zero-energy level. The second term is a common form of the action, as an integral of a generalized momentum with respect to the corresponding coordinate. This form appears in the action-angle variable formulation for example.
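The abbreviated action \(W = \int p\,\d{z}\) also encodes the timing: since \(S = W - E(t - t_0)\), the standard relations give \(\partial W/\partial E = t - t_0\), the time of flight. A small numerical sketch (my own check, with illustrative values \(m = g = 1\) on the rising branch of the linear potential \(V = mgz\)):

```python
import numpy as np
from scipy.integrate import quad

m = g = 1.0
E, z0, z1 = 2.0, 0.0, 1.0          # rising branch: E > m*g*z along the path

def p(z, E):
    # local momentum p(z, E) = sqrt(2m(E - V(z))) with V = m*g*z
    return np.sqrt(2*m*(E - m*g*z))

def W(E):
    # abbreviated action W = \int_{z0}^{z1} p dz
    return quad(lambda z: p(z, E), z0, z1)[0]

h = 1e-4
dW_dE = (W(E + h) - W(E - h)) / (2*h)     # numerical dW/dE
t_flight = (p(z0, E) - p(z1, E)) / (m*g)  # exact time of flight
print(dW_dE, t_flight)                    # these agree
```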
To compute the normalization factor, we must perform the appropriate change of variables, analogous to the Maxwell relations in thermodynamics, using:
We must compute the partials holding this constant, so we have:
We can now compute the normalization factor. We take the first derivative using the well-known properties of Hamilton's principal function
and then use the expression above for \(p(z, E) = \pm \sqrt{2m\bigl(E-V(z)\bigr)}\) to compute:
Example: Harmonic Oscillator
The harmonic oscillator has the following solution: