- The initial \(t=0\) Bose gas momentum space distribution \(n_{|k\rangle}(t=0)\) is sharply peaked at some \(k=k_0\), with the low-energy bosons of \(k<k_0\) in the so-called IR regime and the high-energy bosons of \(k>k_0\) in the UV regime.
- The strongly interacting regime corresponds to bosons with \(k\xi\ll 1\) with \(\xi\propto a^{-1/2}\) the healing length, whereas the weakly interacting regime is \(k\xi\gg 1\).
- In the UV regime, if the gas starts weakly interacting then it remains weakly interacting, because \(k\) only grows, pushing it further into the weakly interacting regime.
- In the IR regime, there is a transition from strongly to weakly interacting, making it the more interesting regime to study.
- Gevorg’s speed limit paper showed that the coherent coarsening dynamics as quantified by the speed limit \(3\hbar/m\) is, after an initial transient, independent of the (dimensionless) interaction strength \(1/(k\xi)\).
- WWT applies to the weakly interacting \(k\xi\gg 1\) regime.
- # the wiggles at the end are due to diffraction off the BEC, not the sinc momentum space contribution from the BEC, which is approximately
- # a homogeneous top hat in real space (need to crop them from the data).
- # Each experimental cycle lasts about 30 seconds: the first ~25 seconds are just the standard steps (MOT, evaporation, Sisyphus cooling, molasses, etc.),
- # and only the last few seconds are the actual physics.
- # Right now, Gevorg & Simon decided to have 15 increasing TOFs, in order to probe the momentum space distribution n_k of the BEC.
- # The idea is that you measure a 2D momentum space distribution of the BEC along your line of sight, then inverse Abel transform (involves a derivative)
- # it to get the 3D momentum space distribution. But also, the only part you can reliably inverse Abel transform is the part that is not diffracted
- # off the BEC and doesn’t saturate the optical density OD at 3, so can only reliably measure in sort of the “outskirt” regions of the cylinder so to speak
- # where the BEC is not too dense (i.e. OD < 3). Also this is why it’s hard to measure the low-k part of the momentum space distribution, b/c the BEC is so dense
- # there (OD > 3) that it saturates the imaging system.
- # Numerical differentiation is much less robust than numerical integration (Gevorg gave the example of a monotonic function like exp(x)),
- # so it’s much harder to get the 3D momentum space distribution. In the case of the inverse Abel transform, you have to differentiate by subtracting the
- # values of neighbouring pixels.
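The inverse Abel step described in the notes above can be sketched in a few lines; the Gaussian test profile is an illustrative assumption, and the `np.gradient` call is exactly the noise-sensitive numerical differentiation being discussed:

```python
import numpy as np

def inverse_abel(F, y):
    """f(r) = -(1/pi) * integral_r^inf F'(y) / sqrt(y^2 - r^2) dy.
    The np.gradient step is the noise-sensitive numerical derivative."""
    dF = np.gradient(F, y)
    f = np.zeros_like(y)
    for i, r in enumerate(y):
        mask = y > r  # integrate strictly above r to skip the singular endpoint
        f[i] = -np.trapz(dF[mask] / np.sqrt(y[mask]**2 - r**2), y[mask]) / np.pi
    return f

# sanity check on a known Abel pair: f(r) = exp(-r^2) projects to F(y) = sqrt(pi)*exp(-y^2)
y = np.linspace(0.0, 6.0, 2000)
F = np.sqrt(np.pi) * np.exp(-y**2)
f = inverse_abel(F, y)
```

On clean data this recovers \(f(r)\) well, but perturbing \(F\) with even small pixel noise visibly corrupts \(f\) through the derivative, which is the point of the note.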
- # Need to be on the Cambridge VPN to “SSH” into the VNC viewer to see the Analysis GUI (imaging computer), Cicero, Origin, or any of the office computers.
- # Each of the lab computers has an IP address, and you can only access them from the Cambridge network. There is also a remote Toptica software for
- # relocking the laser if it drifts/unlocks overnight (somehow not so easy to just automate this b/c need to manually play with the current and piezo voltage in the AOMs).
- # Ground state wavefunction of BEC in k-space is not exactly a sinc, rather a Bessel b/c it’s a cylinder.
- # one-body loss, evaporative loss; usually the temperatures are too high cf. the box trap depth, in which case you lose a lot of energy but not many particles,
- # b/c the only particles you lose are the high-energy ones.
- # another worry is a counting one: the n_rad_k distributions might overestimate or underestimate the number of particles,
- # e.g. the 350000 atoms seems too high… also truncate the integrals to avoid the noisy region at high-k.
- # for n_k use log-log; for k**2*n_k or k**4*n_k use lin-lin maybe? or log-lin… the advantage of lin-lin is that the area you visually see is proportional to the
- # integral of the function, which is nice to see for the k**2 and k**4 moments.
- # try playing around with some different definitions of “speed of thermalization”, and see if they give something monotonic or not
- # also compare with the GPE & WWT theory of the paper (which plots k**2*n_k) see if it matches…
- # change energy to temperature scale (nK), also get E/N for each set
Weak Wave Turbulence (WWT)
Classical Field Theory
The purpose of this post is to solve some problems touching on various aspects of classical field theory. Most are just taken from this problem sheet, with solutions here, but there are also a few extra problems meant to clarify aspects of classical field theory.
Problem #\(1\): A string of length \(L\), mass per unit length \(\mu\), under uniform tension \(T\) is fixed at each end. The Lagrangian \(\mathcal L\) governing the time evolution of the transverse displacement \(\psi(x,t)\) is:
\[\mathcal L=\int_{0}^L\left(\frac{\mu}{2}\dot{\psi}^2-\frac{T}{2}\psi’^2\right)dx\]
where \(x\in[0,L]\) identifies position along the string from one endpoint. By expressing the transverse displacement as a Fourier sine series:
\[\psi(x,t)=\sqrt{\frac{2}{L}}\sum_{n=1}^{\infty}q_n(t)\sin\frac{n\pi x}{L}\]
Show that the Lagrangian \(\mathcal L\) becomes:
\[\mathcal L=\sum_{n=1}^{\infty}\left(\frac{\mu}{2}\dot{q}^2_n-\frac{T}{2}\left(\frac{n\pi}{L}\right)^2q_n^2\right)\]
Derive the equations of motion. Hence, show that the string is equivalent to an infinite set of decoupled harmonic oscillators with frequencies:
\[\omega_n=\sqrt{\frac{T}{\mu}}\frac{n\pi}{L}\]
Solution #\(1\):
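The derivation itself is left to the solution, but as a quick numerical sanity check (assuming unit \(T=\mu=L=1\) purely for illustration), diagonalizing a finite-difference discretization of the string reproduces the advertised spectrum \(\omega_n=\sqrt{T/\mu}\,n\pi/L\):

```python
import numpy as np

# Discretize the string as N interior points with fixed ends: the wave
# equation mu * psi_tt = T * psi_xx becomes a matrix eigenvalue problem
# whose lowest eigenfrequencies should approach omega_n = sqrt(T/mu)*n*pi/L.
T, mu, L, N = 1.0, 1.0, 1.0, 500
dx = L / (N + 1)
lap = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
       + np.diag(np.ones(N - 1), -1)) / dx**2
omega = np.sort(np.sqrt(-(T / mu) * np.linalg.eigvalsh(lap)))
exact = np.sqrt(T / mu) * np.pi * np.arange(1, 4) / L  # first three modes
```

The discrete spectrum only matches for \(n\ll N\), as expected of a lattice regularization of the continuum string.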

Problem #\(2\): Show that the solution space of the Klein-Gordon equation is closed under Lorentz transformations.
Solution #\(2\):

Problem #\(3\): The motion of a complex scalar field \(\psi(X)\) is governed by the Lagrangian density:
\[\mathcal L=\partial_{\mu}\psi^*\partial^{\mu}\psi-m^2\psi^*\psi-\frac{\lambda}{2}(\psi^*\psi)^2\]
Write down the Euler-Lagrange field equations for this system. Verify that the Lagrangian density \(\mathcal L\) is invariant under the infinitesimal transformation:
\[\delta\psi=i\alpha\psi\]
\[\delta\psi^*=-i\alpha\psi^*\]
Derive the Noether current \(j^{\mu}\) associated with this transformation and verify explicitly that it is conserved using the field equations satisfied by \(\psi\).
Solution #\(3\):


Problem #\(4\): Verify that the Lagrangian density:
\[\mathcal L=\frac{1}{2}\partial_{\mu}\phi_{\alpha}\partial^{\mu}\phi_{\alpha}-\frac{1}{2}m^2\phi_{\alpha}\phi_{\alpha}\]
for a triplet of real fields \(\phi_{\alpha}(X)\), \(\alpha=1,2,3\), is invariant under the infinitesimal \(SO(3)\) rotation by \(\theta\):
\[\phi_{\alpha}\mapsto\phi_{\alpha}+\theta\varepsilon_{\alpha\beta\gamma}n_{\beta}\phi_{\gamma}\]
where \(n_{\beta}\) is a unit vector. Compute the Noether current \(j^{\mu}\) associated to this transformation. Deduce that the three quantities:
\[Q_{\alpha}=\int d^3\textbf x\varepsilon_{\alpha\beta\gamma}\dot{\phi}_{\beta}\phi_{\gamma}\]
are all conserved.
Solution #\(4\):


Problem #\(5\): By requiring that Lorentz transformations \(\Lambda^{\mu}_{\space\space\nu}\) should preserve the Minkowski norm of \(4\)-vectors \(\eta_{\mu\nu}X’^{\mu}X’^{\nu}=\eta_{\mu\nu}X^{\mu}X^{\nu}\), show that this implies:
\[\eta_{\mu\nu}=\eta_{\sigma\tau}\Lambda^{\sigma}_{\space\space\mu}\Lambda^{\tau}_{\space\space\nu}\]
Show that an infinitesimal transformation of the form \(\Lambda^{\mu}_{\space\space\nu}=\delta^{\mu}_{\space\space\nu}+\omega^{\mu}_{\space\space\nu}\) is specifically a Lorentz transformation iff \(\omega_{\mu\nu}=-\omega_{\nu\mu}\) is antisymmetric.
Write down the matrix for \(\omega^{\mu}_{\space\space\nu}\) corresponding to an infinitesimal rotation by angle \(\theta\) around the \(x^3\)-axis. Do the same for an infinitesimal Lorentz boost along the \(x^1\)-axis by velocity \(v\).
Solution #\(5\):


Problem #\(6\): Consider a general infinitesimal Lorentz transformation \(X^{\mu}\mapsto X’^{\mu}=X^{\mu}+\omega^{\mu}_{\space\space\nu}X^{\nu}\) acting at the level of the \(4\)-vector \(X\). How does this perturbation manifest at the level of a scalar field \(\phi=\phi(X)\)? What about at the level of the Lagrangian density \(\mathcal L=\mathcal L(\phi)\)? What about at the level of the action \(S=S(\mathcal L)\)? In particular, show that the perturbation \(\delta\mathcal L\) at the level of \(\mathcal L\) is a total spacetime derivative, and hence describe the Noetherian implication of this.
Solution #\(6\):



Problem #\(7\): Maxwell’s Lagrangian for the electromagnetic field is:
\[\mathcal L=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}\]
where \(F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}\) and \(A_{\mu}\) is the \(4\)-vector potential. Show that \(\mathcal L\) is invariant under gauge transformations \(A_{\mu}\mapsto A_{\mu}+\partial_{\mu}\Gamma\) where \(\Gamma=\Gamma(X)\) is a scalar field with arbitrary (differentiable) dependence on \(X\). Use Noether’s theorem, and the spacetime translational invariance of the action \(S\) to construct the energy-momentum tensor \(T^{\mu\nu}\) for the electromagnetic field. Show that the resulting object is neither symmetric nor gauge invariant. Consider a new tensor given by:
\[\Theta^{\mu\nu}=T^{\mu\nu}-F^{\rho\mu}\partial_{\rho}A^{\nu}\]
Show that this object also defines \(4\) conserved currents. Moreover, show that it is symmetric, gauge invariant and traceless.
Solution #\(7\):







Problem #\(8\): In Problem #\(3\), a classical field theory involving two complex scalar fields \(\psi(X),\psi^*(X)\) with Lagrangian density:
\[\mathcal L(\psi,\psi^*,\partial_{\mu}\psi,\partial_{\mu}\psi^*)=\partial_{\mu}\psi^*\partial^{\mu}\psi-V(\psi^*\psi)\]
was analyzed (there the potential \(V\) was taken to be analytic in \(\psi^*\psi\) and so Taylor expanded to quadratic order). In particular, the Noether current \(j^{\mu}\) was obtained explicitly for the continuous global, internal \(U(1)\) \(\mathcal L\)-symmetry \(\psi\mapsto e^{i\alpha}\psi\) for a constant \(\alpha\in\textbf R\). By promoting \(\alpha=\alpha(X)\) but still acting infinitesimally across \(X\) in spacetime, recompute the Noether current \(j^{\mu}\), and notice in particular that the Noether current \(j^{\mu}\) doesn’t care about the potential terms \(V(\psi^*\psi)\), only the kinetic terms.
Solution #\(8\):


Problem #\(9\): AC Stark Effect & Optical Dipole Traps
Consider an atomic two-level system with ground state \(|0\rangle\) and excited state \(|1\rangle\). Recall that in the interaction picture, after making the rotating wave approximation and boosting into a steady-state rotating frame, one had the resultant time-independent steady-state Hamiltonian:
\[H_{\infty}=\frac{\hbar}{2}\tilde{\boldsymbol{\Omega}}\cdot\boldsymbol{\sigma}\]
Invoking the Pauli matrix identity \((\tilde{\boldsymbol{\Omega}}\cdot\boldsymbol{\sigma})^2=|\tilde{\boldsymbol{\Omega}}|^2 1\), it is clear that the eigenvalues of this Hamiltonian are \(E_{\pm}=\pm\frac{\hbar|\tilde{\boldsymbol{\Omega}}|}{2}=\pm\frac{\hbar\sqrt{\Omega^2+\delta^2}}{2}\); the resulting shift of the energy levels is known as the light shift from the AC Stark effect (also called the Autler-Townes effect). In particular, if \(\Omega\ll|\delta|\), then a first-order binomial expansion gives:
\[E_{\pm}=\pm\left(\frac{\hbar\delta}{2}+\frac{\hbar\Omega^2}{4\delta}\right)\]
It is not a coincidence that this light shift, obtained here from exact diagonalization followed by a first-order binomial expansion, agrees with the result of nondegenerate time-independent perturbation theory applied to …; indeed, it turns out in the framework of QED that the corresponding eigenstates are the so-called dressed states of the atom-photon system.
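As a concrete numerical check, one can diagonalize \(H_{\infty}\) directly; the component convention \(\tilde{\boldsymbol{\Omega}}=(\Omega,0,-\delta)\) used below is an assumption, but \(|\tilde{\boldsymbol{\Omega}}|=\sqrt{\Omega^2+\delta^2}\) either way:

```python
import numpy as np

# Diagonalize H = (hbar/2) * Omega_tilde . sigma, with the (assumed)
# components Omega_tilde = (Omega, 0, -delta); the spectrum only depends
# on |Omega_tilde| = sqrt(Omega^2 + delta^2).
hbar, Omega, delta = 1.0, 0.1, 1.0
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
H = 0.5 * hbar * (Omega * sx - delta * sz)
E = np.sort(np.linalg.eigvalsh(H))
exact = 0.5 * hbar * np.sqrt(Omega**2 + delta**2)
shift = hbar * delta / 2 + hbar * Omega**2 / (4 * delta)  # binomial expansion
```

For \(\Omega\ll|\delta|\) the expanded light shift agrees with the exact eigenvalue to \(O(\Omega^4/\delta^3)\).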
Classical Optics
The purpose of this post is to explain the \(2\) key models of classical optics, namely geometrical optics (also known as ray optics) and physical optics (also known as wave optics). Although historically geometrical optics came before physical optics, and indeed this is also usually the order in which they are conventionally taught, this post will take the more unconventional approach of presenting physical optics first, and then showing how it reduces to geometrical optics in the \(\lambda\to 0\) limit.
Physical Optics
Discuss:
- Fourier optics in the Fraunhofer regime.
- Gaussian (pilot) beams
- TE/TM/TEM modes in EM waveguides
- How Fresnel diffraction is an exact solution to the paraxial Helmholtz equation and what this has to do with the eikonal approximation/Hamilton-Jacobi equation from classical dynamics.
Maxwell’s equations assert that the electric and magnetic field \(\textbf E,\textbf B\) satisfy vector wave equations in vacuum:
\[\biggr|\frac{\partial}{\partial\textbf x}\biggr|^2\textbf E-\frac{1}{c^2}\ddot{\textbf E}=\textbf 0\]
\[\biggr|\frac{\partial}{\partial\textbf x}\biggr|^2\textbf B-\frac{1}{c^2}\ddot{\textbf B}=\textbf 0\]
Any of their \(6\) components (denoted \(\psi\)) thus satisfies the scalar wave equation \(|\frac{\partial}{\partial\textbf x}|^2\psi-\frac{1}{c^2}\ddot{\psi}=0\). The spacetime Fourier transform yields the trivial dispersion relation \(\omega=ck\), from which it is evident that performing just a temporal Fourier transform (to avoid the minutiae of \(t\)-dependence) leads to the scalar Helmholtz equation for \(\psi(\textbf x)\):
\[\left(\biggr|\frac{\partial}{\partial\textbf x}\biggr|^2+k^2\right)\psi=0\]
In other words, one is looking for eigenfunctions of the Laplacian \(\biggr|\frac{\partial}{\partial\textbf x}\biggr|^2\) with eigenvalue \(-k^2\). To begin, consider one of Green’s identities, valid for arbitrary scalar fields \(\psi(\textbf x’),\tilde{\psi}(\textbf x’)\) which are \(C^2\) everywhere in the volume \(V\):
\[\iint_{\textbf x’\in\partial V}\left(\psi\frac{\partial\tilde{\psi}}{\partial\textbf x’}-\tilde{\psi}\frac{\partial\psi}{\partial\textbf x’}\right)\cdot d^2\textbf x’=\iiint_{\textbf x’\in V}\left(\psi\biggr|\frac{\partial}{\partial\textbf x’}\biggr|^2\tilde{\psi}-\tilde{\psi}\biggr|\frac{\partial}{\partial\textbf x’}\biggr|^2\psi\right)d^3\textbf x’\]
(it’s just the divergence theorem applied to the vector field \(\psi\frac{\partial\tilde{\psi}}{\partial\textbf x’}-\tilde{\psi}\frac{\partial\psi}{\partial\textbf x’}\)). It is now obvious that the volume integral will vanish if one then imposes that both \(\psi(\textbf x’)\) and \(\tilde{\psi}(\textbf x’)\) also satisfy the scalar Helmholtz equation. Given any point \(\textbf x\in\textbf R^3\), it is physically clear that the spherical wave Green’s function \(\tilde{\psi}(\textbf x’|\textbf x)=e^{ik|\textbf x-\textbf x’|}/|\textbf x-\textbf x’|\) is one possible (though certainly not a unique) solution to the scalar Helmholtz equation, provided one stays away from the singularity at \(\textbf x’=\textbf x\). This motivates the choice of volume \(V\) to be some arbitrary region but with an \(\varepsilon\)-ball cut around \(\textbf x\), in which case the volume integral can legitimately be taken to vanish over this choice of \(V\). In that case, the surface \(\partial V=S^2_{\varepsilon}\cup S\) can be partitioned into an inner surface \(S^2_{\varepsilon}\) and an outer surface \(S\):

The flux through these two surfaces \(S^2_{\varepsilon},S\) must thus be equal:
\[\iint_{\textbf x’\in S^2_{\varepsilon}}\left(\psi\frac{\partial\tilde{\psi}}{\partial\textbf x’}-\tilde{\psi}\frac{\partial\psi}{\partial\textbf x’}\right)\cdot d^2\textbf x’=\iint_{\textbf x’\in S}\left(\tilde{\psi}\frac{\partial\psi}{\partial\textbf x’}-\psi\frac{\partial\tilde{\psi}}{\partial\textbf x’}\right)\cdot d^2\textbf x’\]
The integral over \(S^2_{\varepsilon}=\{\textbf x’\in\textbf R^3:|\textbf x’-\textbf x|=\varepsilon\}\) is straightforward in the limit \(\varepsilon\to 0\):
\[\iint_{\textbf x’\in S^2_{\varepsilon}}\left(\psi\frac{\partial\tilde{\psi}}{\partial\textbf x’}-\tilde{\psi}\frac{\partial\psi}{\partial\textbf x’}\right)\cdot d^2\textbf x’=\psi(\textbf x)\lim_{\varepsilon\to 0}4\pi\varepsilon^2\frac{1-ik\varepsilon}{\varepsilon^2}e^{ik\varepsilon}+\hat{\textbf n}\cdot\frac{\partial\psi}{\partial\textbf x’}(\textbf x)\lim_{\varepsilon\to 0}4\pi\varepsilon^2\frac{e^{ik\varepsilon}}{\varepsilon}=4\pi\psi(\textbf x)\]
One thus obtains Kirchhoff’s integral formula for the scalar Helmholtz equation:
\[\psi(\textbf x)=\frac{1}{4\pi}\iint_{\textbf x’\in S:\textbf x\in V}\left(\frac{e^{ik|\textbf x-\textbf x’|}}{|\textbf x-\textbf x’|}\frac{\partial\psi}{\partial\textbf x’}-\psi(\textbf x’)\frac{\partial}{\partial\textbf x’}\frac{e^{ik|\textbf x-\textbf x’|}}{|\textbf x-\textbf x’|}\right)\cdot d^2\textbf x’\]
where as above:
\[\frac{\partial}{\partial\textbf x’}\frac{e^{ik|\textbf x-\textbf x’|}}{|\textbf x-\textbf x’|}=\frac{ik|\textbf x-\textbf x’|-1}{|\textbf x-\textbf x’|^3}e^{ik|\textbf x-\textbf x’|}(\textbf x’-\textbf x)\]
As an aside, Kirchhoff’s integral formula is very similar in spirit to another more well-known integral formula, namely the Cauchy integral formula \(f(z_0)=\frac{1}{2\pi i}\oint_{z\in\gamma:z_0\in\text{int}(\gamma)}\frac{f(z)}{z-z_0}dz\) from complex analysis; the constraint of complex analyticity is analogous to constraining \(\psi\) to obey the Helmholtz equation; if one specifies both Dirichlet and Neumann boundary conditions for \(\psi(\textbf x’)\) everywhere on \(\textbf x’\in S\), then in principle this is enough to uniquely determine \(\psi(\textbf x)\) everywhere in the interior \(V\) of the enclosing surface \(S\).
Now consider the following standard diffraction setup:

Here the surface \(S\) is chosen to be a sphere of radius \(R\) centered at \(\textbf x\), except where it flattens along the aperture with some distribution of slits. As one takes \(R\to\infty\), then in analogy to Jordan’s lemma from complex analysis, one can argue that the flux through this spherical cap portion of \(S\) in Kirchhoff’s integral formula vanishes like \(\sim 1/R\to 0\) (this is admittedly still a bit handwavy; for a rigorous argument see the Sommerfeld radiation condition). Thus, the behavior of \(\psi\) on the aperture alone is sufficient to determine its value \(\psi(\textbf x)\) at an arbitrary “screen location” \(\textbf x\) beyond the aperture. Supposing a monochromatic plane wave \(\psi(\textbf x’)=\psi(x’,y’,0)e^{ikz’}\) of momentum \(k\) (hence solving the scalar Helmholtz equation) is normally incident on the aperture \(z’=0\) and that \(k|\textbf x-\textbf x’|\gg 1\) (easily true in most cases), this imposes the boundary condition \(-\frac{\partial\psi}{\partial z’}(x’,y’,0)=-ik\psi(x’,y’,0)\), so one can check that Kirchhoff’s integral formula simplifies to:
\[\psi(\textbf x)\approx -\frac{ik}{4\pi}\iint_{\textbf x’\in\text{aperture}}\psi(\textbf x’)\frac{e^{ik|\textbf x-\textbf x’|}}{|\textbf x-\textbf x’|}(1+\cos\angle(\textbf x-\textbf x’,\hat{\textbf k}))d^2\textbf x’\]
Or equivalently:
\[\psi(\textbf x)=\frac{1}{i\lambda}\iint_{\textbf x’\in\text{aperture}}\psi(\textbf x’)\frac{e^{ik|\textbf x-\textbf x’|}}{|\textbf x-\textbf x’|}K(\textbf x-\textbf x’)d^2\textbf x’\]
where the obliquity kernel is \(K(\textbf x-\textbf x’):=\frac{1+\cos\angle(\textbf x-\textbf x’,\hat{\textbf k})}{2}=\cos^2\frac{\angle(\textbf x-\textbf x’,\hat{\textbf k})}{2}\). This is nothing more than a mathematical expression of the Huygens-Fresnel principle.
Fresnel vs. Fraunhofer Diffraction
In general the Huygens-Fresnel integral is difficult to evaluate analytically for an arbitrary point \(\textbf x\) on a screen. Thus, one often begins by making the paraxial approximation \(K(\textbf x-\textbf x’)\approx 1\iff |\textbf x-\textbf x’|\approx z\), except in the complex exponential (otherwise all Huygens wavelets would interfere constructively which is silly). Here instead, one implements a less strict version of the paraxial approximation in the form of a \(z^2\gg |\textbf x-\textbf x’|^2-z^2\) binomial expansion:
\[|\textbf x-\textbf x’|=\sqrt{z^2+|\textbf x-\textbf x’|^2-z^2}=z+\frac{|\textbf x-\textbf x’|^2-z^2}{2z}-\frac{(|\textbf x-\textbf x’|^2-z^2)^2}{8z^3}+…\]
In practice the second-order term \((|\textbf x-\textbf x’|^2-z^2)^2/8z^3\) is negligible in the paraxial limit, so neglecting it and all higher-order terms yields the Fresnel diffraction integral:
\[\psi(\textbf x)=\frac{e^{ikz}}{i\lambda z}\iint_{\textbf x’\in\text{aperture}}\psi(\textbf x’)e^{ik(|\textbf x-\textbf x’|^2-z^2)/2z}d^2\textbf x’\]
Or equivalently, expanding \(|\textbf x-\textbf x’|^2=|\textbf x|^2+|\textbf x’|^2-2\textbf x\cdot\textbf x’\):
\[\psi(\textbf x)=\frac{e^{ikz}e^{ik(|\textbf x|^2-z^2)/2z}}{i\lambda z}\iint_{\textbf x’\in\text{aperture}}\psi(\textbf x’)e^{ik|\textbf x’|^2/2z}e^{-i\textbf k\cdot\textbf x’}d^2\textbf x’\]
where \(\textbf k=k\textbf x/z\). If in addition one also assumes that \(k|\textbf x’|^2/2z\ll 1\), then one obtains the Fourier optics case of far-field/Fraunhofer diffraction:
\[\psi(\textbf x)=\frac{e^{ikz}e^{ik(|\textbf x|^2-z^2)/2z}}{i\lambda z}\iint_{\textbf x’\in\text{aperture}}\psi(\textbf x’)e^{-i\textbf k\cdot\textbf x’}d^2\textbf x’\]
The reason that Fraunhofer diffraction is only considered to apply in the far-field is that the above condition for its validity can be rewritten as \(z\gg z_R\) where \(z_R:=\rho’^2/\lambda\) is the Rayleigh distance of an aperture of typical length scale \(\rho’\sim\sqrt{x’^2+y’^2}\) when illuminated by monochromatic light of wavelength \(\lambda\). In other words, the precise meaning of “far-field” is “farther than the Rayleigh distance \(z_R\)”. Otherwise, when \(z≲z_R\) then such a term cannot be neglected and one simply refers to it as Fresnel diffraction. Notice that \(z≲z_R\) is not saying the same thing as \(z\ll z_R\); it is often said that Fresnel diffraction is the regime of near-field diffraction but that phrase can be misleading because it suggests that \(z\) can be arbitrarily small, yet clearly at some point if one kept decreasing \(z\) then eventually the higher-order terms in the binomial expansion would also start to matter (moreover, the paraxial approximation would also start to break down). Instead of calling it “near-field diffraction”, a more accurate name for Fresnel diffraction would be “not-far-enough diffraction” \(z≲z_R\). By contrast, Fraunhofer diffraction truly is arbitrarily far-field \(z\gg z_R\). Of course nothing stops one from also considering the case \(z\ll z_R\), there just doesn’t seem to be any special name given to this regime and in practice it’s not as relevant. Finally, sometimes one also encounters the terminology of the Fresnel number \(F(z):=z_R/z\); in this jargon, Fraunhofer diffraction occurs when \(F(z)\ll 1\) whereas Fresnel diffraction occurs when \(F(z)≳1\).
In practice, one typically ignores the pre-factor in front of the aperture integrals since it is the general profile of the irradiance \(|\psi(\textbf x)|^2\) that is mainly of interest. In particular, for Fraunhofer diffraction, one can then write \(\hat{\psi}(\textbf k)\equiv\psi(\textbf x)\) as just the \(2\)D spatial Fourier transform of the aperture.
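This Fourier-transform viewpoint is easy to check numerically; a minimal sketch in arbitrary units, with the grid size and a circular aperture chosen purely for illustration:

```python
import numpy as np

# Fraunhofer pattern as the 2D spatial Fourier transform of the aperture;
# the 512-pixel grid and 20-pixel circular aperture are illustrative choices
N, R = 512, 20
x = np.arange(N) - N // 2
X, Y = np.meshgrid(x, x)
aperture = ((X**2 + Y**2) <= R**2).astype(float)
# ifftshift puts the aperture's center at the origin before transforming,
# fftshift moves zero spatial frequency back to the middle of the grid
far = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(aperture)))
irradiance = np.abs(far)**2
```

The central (zero-frequency) value is just the square of the open aperture area, and the surrounding rings are the Airy pattern discussed below for the circular case.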
Diffraction Through a Single Slit
Consider a single \(y\)-invariant slit of width \(\Delta x\) centered at \(x’=0\). Then the Fraunhofer interference pattern has the form:
\[\hat{\psi}(k_x)\sim\int_{-\Delta x/2}^{\Delta x/2}e^{-ik_x x}dx=\frac{1}{ik_x}2i\sin\frac{k_x\Delta x}{2}\sim\Delta x\text{sinc}\frac{k_x\Delta x}{2}\]
where \(k_x=k\sin\theta\):
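This closed form is easy to check against direct numerical quadrature of the aperture integral; the slit width below is an illustrative assumption, in units where \(\lambda=1\):

```python
import numpy as np

# Direct numerical evaluation of the single-slit Fraunhofer integral,
# compared against the closed-form sinc; Dx = 10 wavelengths is illustrative
k = 2 * np.pi
Dx = 10.0
xp = np.linspace(-Dx / 2, Dx / 2, 4001)    # aperture coordinate x'
theta = np.linspace(-0.5, 0.5, 201)
kx = k * np.sin(theta)
psi = np.array([np.trapz(np.exp(-1j * q * xp), xp) for q in kx])
# np.sinc(u) = sin(pi*u)/(pi*u), so sinc(kx*Dx/2) here is np.sinc(kx*Dx/(2*pi))
sinc = Dx * np.sinc(kx * Dx / (2 * np.pi))
```

The imaginary part vanishes by the symmetry of the slit about \(x'=0\), so the quadrature reproduces the real sinc profile directly.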

By contrast, the Fresnel interference pattern is given by:
\[\psi(x)\sim\int_{-\Delta x/2}^{\Delta x/2}e^{ik(x’-x)^2/2z}dx’\]
Although in general such an integral needs to be evaluated numerically, there is a simple geometric way to gain some intuition for how \(\psi(x)\in\textbf C\) behaves at fixed \(z≲z_R\sim\Delta x^2/\lambda\) via the Cornu spiral (also called the Euler spiral in contexts outside of physical optics). The idea is to shift the \(x\)-dependence from the integrand into the limits via the substitution \(\pi t’^2/2:=k(x-x’)^2/2z\iff t’=\sqrt{\frac{2}{\lambda z}}(x-x’)\). Then, ignoring chain rule factors, one has:
\[\psi(x)\sim\int_{t’_1(x)}^{t’_2(x)}e^{i\pi t’^2/2}dt’\]
where the limits are \(t’_1(x)=\sqrt{\frac{2}{\lambda z}}(x+\Delta x/2)\) and \(t’_2(x)=\sqrt{\frac{2}{\lambda z}}(x-\Delta x/2)\). Written in terms of the normalized Fresnel integral \(\text{Fr}(t):=\int_0^{t}e^{i\pi t’^2/2}dt’\):
\[\psi(x)\sim\text{Fr}(t’_2(x))-\text{Fr}(t’_1(x))\]
The object \(\text{Fr}(t)\) is a trajectory in \(\textbf C\) which is the aforementioned Cornu spiral:

where one can check the limits \(\lim_{t\to\pm\infty}\text{Fr}(t)=\pm(1+i)/2\). Noting that \(\dot{\text{Fr}}(t)=e^{i\pi t^2/2}\), it follows that the speed \(|\dot{\text{Fr}}(t)|=1\) is uniform and thus the distance/arc length traversed in time \(\Delta t\) is always just \(\Delta t\). Moreover, the curvature \(\kappa(t)=\Im(\dot{\text{Fr}}^{\dagger}\ddot{\text{Fr}})/|\dot{\text{Fr}}|^3=\pi t\) increases linearly in the arc length \(t\) (essentially what defines a spiral!). The point is that the irradiance \(|\psi(x)|^2\sim|\text{Fr}(t’_2(x))-\text{Fr}(t’_1(x))|^2\) is now visually just the length (squared) of a vector between the points \(\text{Fr}(t’_1(x))\) and \(\text{Fr}(t’_2(x))\) on the Cornu spiral. The idea is to first trek a distance \(\frac{t’_1(x)+t’_2(x)}{2}=\sqrt{\frac{2}{\lambda z}}x\) from the origin to some central \(x\)-dependent point on the spiral, and then extend around it by the \(x\)-independent amount \(t’_1-t’_2=\sqrt{\frac{2}{\lambda z}}\Delta x\sim\sqrt{F(z)}\) to get a corresponding line segment whose length will be \(|\psi(x)|\).
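The Cornu-spiral construction is easy to play with numerically; the wavelength, screen distance, and slit width below are illustrative assumptions chosen so that \(F(z)=\Delta x^2/\lambda z=2\):

```python
import numpy as np

def Fr(t, n=20001):
    """Normalized Fresnel integral Fr(t) = int_0^t exp(i*pi*u^2/2) du."""
    u = np.linspace(0.0, t, n)
    return np.trapz(np.exp(1j * np.pi * u**2 / 2), u)

# single-slit Fresnel pattern via the Cornu spiral; lam, z, Dx are
# illustrative numbers giving Fresnel number F = Dx^2/(lam*z) = 2
lam, z, Dx = 0.5e-6, 1e-2, 100e-6
scale = np.sqrt(2 / (lam * z))
xs = np.linspace(-200e-6, 200e-6, 81)
I = np.array([abs(Fr(scale * (x - Dx / 2)) - Fr(scale * (x + Dx / 2)))**2
              for x in xs])
```

The resulting \(|\psi(x)|^2\) is symmetric about the slit center, and \(\text{Fr}(t)\) visibly winds toward its limiting points \(\pm(1+i)/2\) for large \(|t|\).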
Diffraction Through a Circular Aperture
Given a circular aperture of radius \(R\), its \(2\)D isotropic nature means that the Fraunhofer interference pattern is just proportional to the Hankel transform of the aperture:
\[\hat{\psi}(k_{\rho})\sim\int_{0}^R\rho’J_0(k_{\rho}\rho’)d\rho’\sim\frac{J_1(k_{\rho}R)}{k_{\rho}}\]
with \(k_{\rho}=k\sin\theta\). This is sometimes called a sombrero or \(\text{jinc}\) function, being the polar analog of the \(\text{sinc}\) function. It has its first zero at \(k_{\rho}R\approx 3.8317\) which defines the boundary of the Airy disk (cf. \(\text{sinc}(k_x\Delta x/2)\) having its first zero at \(k_x\Delta x/2=\pi\approx 3.1415\)). This is often expressed paraxially via the angular radius of the Airy disk \(\theta_{\text{Airy}}\approx 1.22\frac{\lambda}{D}\) with \(D=2R\) the diameter of the aperture (cf. \(\theta_{\text{central max}}\approx\frac{\lambda}{\Delta x}\) for the single slit).
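As a numpy-only sanity check (the integral representation of \(J_n\) is used here just to avoid external dependencies; all numbers are illustrative), direct quadrature of the aperture’s Hankel transform matches the closed form \(RJ_1(k_{\rho}R)/k_{\rho}\), and \(J_1\) indeed vanishes near \(3.8317\):

```python
import numpy as np

def bessel_j(n, x, m=801):
    """J_n(x) via its integral representation (kept numpy-only on purpose)."""
    tau = np.linspace(0.0, np.pi, m)
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    return np.trapz(np.cos(n * tau - x * np.sin(tau)), tau, axis=-1) / np.pi

R = 1.0
krho = np.linspace(0.5, 10.0, 30)
rho = np.linspace(0.0, R, 1000)
# direct Hankel transform of the circular aperture vs the closed form
hat = np.array([np.trapz(rho * bessel_j(0, q * rho), rho) for q in krho])
closed = R * bessel_j(1, krho * R) / krho
```

Scanning `closed` for its first sign change reproduces the Airy-disk boundary \(k_{\rho}R\approx 3.8317\) quoted above.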
For the same circular aperture setup, one can also ask what happens in the Fresnel regime \(z≲z_R\). In general, the integral is complicated:
\[\psi(\rho,z)\sim\int_0^R e^{ik\rho’^2/2z}\rho’ J_0(k_{\rho}\rho’)d\rho’\]
But if one restricts attention to the symmetric case of being on-axis \(\rho=k_{\rho}=0\), then:
\[\psi(0,z)\sim e^{ikR^2/2z}-1\Rightarrow|\psi(0,z)|^2\sim\sin^2\frac{kR^2}{4z}\]
This is essentially the topologist’s favorite pathological sine function, but of course it was already mentioned that this solution is only reliable when the argument \(kR^2/4z\sim F(z)≳1\). For this specific on-axis case, it turns out one can significantly relax the paraxial assumption, namely, although one still assumes the obliquity kernel \(K(\textbf x-\textbf x’)\approx 1\), otherwise one acknowledges that \(r^2=\rho’^2+z^2\):
\[\psi(0,z)\sim\int_{z}^{\sqrt{R^2+z^2}}e^{ikr}dr\Rightarrow |\psi(0,z)|^2\sim\sin^2\frac{k(\sqrt{R^2+z^2}-z)}{2}\]
where if one were to binomial expand \(\sqrt{R^2+z^2}\approx z+R^2/2z\) one would just recover the Fresnel solution. If one fixes a given on-axis observation distance \(z\) and instead views \(|\psi(0,z)|^2\) as a function of the aperture radius \(R\) (and not \(z\)), then clearly it alternates between bright maxima and dark minima at aperture radii \(R\equiv\rho’_n\) given by:
\[\frac{k(\sqrt{{\rho’_n}^2+z^2}-z)}{2}=\frac{n\pi}{2}\iff \rho’_n=\sqrt{n\lambda z+\left(\frac{n\lambda}{2}\right)^2}\approx\sqrt{n\lambda z}\]
Thus, in general for a fixed aperture radius \(R\), there will be \(\sim F(z)\) concentric annuli of the form \(\rho’\in[\rho’_{n-1},\rho’_n]\) that can be made to partition the aperture disk; the annulus \([\rho’_{n-1},\rho’_n]\) is called the \(n\)-th Fresnel half-period zone. Note that the area of each Fresnel half-period zone is a constant \(\pi\lambda z\) in the Fresnel regime thus providing equal but alternating contributions to \(\psi(0,z)\) that lead to the observed oscillatory behavior in \(|\psi(0,z)|^2\). The existence of this Fresnel half-period zone structure motivates the construction of Fresnel zone plates which are vaguely like polar analogs of diffraction gratings, except rather than being regularly spaced, they block alternate Fresnel half-period zones with some opaque material to reinforce constructive interference for a given \(z\) and \(\lambda\) (thus, in order to design such a zone plate, one has to already have in mind a \(z\) and a \(\lambda\) ahead of time in order to compute the radii \(\rho’_n\approx\sqrt{n\lambda z}\) to be etched out). If a given \((z,\lambda)\) zone plate has already been constructed, but one then proceeds to move \(z\mapsto z/m\) for some \(m\in\textbf Z^+\), then in each of the Fresnel half-period zones associated to \(z\), there would now be \(m\) Fresnel half-period zones associated to \(z/m\), and so each transparent region of the \((z,\lambda)\) zone plate would allow through \(m\) of the \(z/m\) Fresnel half-period zones. Thus, if \(m\in 2\textbf Z^+-1\) is odd, then one would still expect a net constructive interference on-axis at \(z/m\), whereas if \(m\in 2\textbf Z^+\) then destructive interference of pairs of adjacent Fresnel half-period zones wins out (this parity argument is easy to remember because if \(m=1\) then nothing happens and the whole point of constructing the zone plate was to amplify the constructive interference at \(z\)). 
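The zone radii and the on-axis intensity above are easy to tabulate; the values \(\lambda=500\text{ nm}\) and \(z=1\text{ m}\) below are illustrative assumptions:

```python
import numpy as np

# Fresnel half-period zone radii rho'_n and the (relaxed-paraxial) on-axis
# irradiance; lambda = 500 nm and z = 1 m are illustrative numbers
lam, z = 500e-9, 1.0
k = 2 * np.pi / lam
n = np.arange(1, 6)
rho_n = np.sqrt(n * lam * z + (n * lam / 2)**2)  # exact zone radii
approx = np.sqrt(n * lam * z)                    # Fresnel-regime approximation

def I_axis(R):
    """On-axis irradiance |psi(0,z)|^2 ~ sin^2(k*(sqrt(R^2+z^2)-z)/2)."""
    return np.sin(k * (np.sqrt(R**2 + z**2) - z) / 2)**2
```

An aperture truncated at an odd zone radius gives a bright on-axis point, while an even zone radius gives a dark one, exactly the parity argument in the text.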
Finally, it is sometimes said that a given \((z,\lambda)\) Fresnel zone plate acts like a lens of focal length \(z\); however, due to the dependence on \(\lambda\), such a lens suffers from chromatic aberration.
Now consider the complementary problem of calculating \(\tilde{\psi}(0,z)\) for a circular obstruction of radius \(R\), rather than a circular aperture of radius \(R\) carved into an infinitely-extending obstruction. Clearly, the two are related by subtracting the solution \(\psi(0,z)\) of the circular aperture of radius \(R\) from the free, unobstructed plane wave \(e^{ikz}\); this obvious corollary of the linearity of the scalar Helmholtz equation is an instance of Babinet’s principle. This leads to the counterintuitive prediction of Poisson’s spot (also called Arago’s spot):
\[\tilde{\psi}(0,z)=e^{ikz}-\psi(0,z)\sim e^{ik\sqrt{R^2+z^2}}\Rightarrow |\tilde{\psi}(0,z)|^2\sim 1\]
where, taking into account the obliquity kernel \(K(\textbf x-\textbf x’)\), this holds as long as one doesn’t wander too close to the aperture. Note from the earlier case of the circular aperture that there was the complementary (and equally counterintuitive) prediction that one could get a dark on-axis spot at certain \(z\) (i.e. those for which the aperture \(R=\rho’_{2m}\) partitions into an even number \(2m\) of Fresnel half-period zones as evident from the formula \(|\psi(0,z)|^2\sim\sin^2k(\sqrt{R^2+z^2}-z)/2\)).
Talk about how, by working with a scalar \(\psi\), one has basically neglected polarization, which only comes about from the vectorial nature of the electromagnetic field; this forms the basis of so-called scalar wave theory or scalar diffraction theory. Connect all this to the Lippmann-Schwinger equation in quantum mechanical scattering theory (specifically, this is basically the first-order Born approximation solution to the LS equation).
Fraunhofer is like taxicab geometry in that it treats the wavefronts as planar, whereas Fresnel is like \(\ell^2\) in that it actually accounts for their curvature. In particular, anywhere that Fraunhofer works, Fresnel also works.
Geometrical Optics
Consider a spherical glass of index \(n’\) and radius \(R>0\) placed in a background of index \(n\), and a paraxial light ray incident at angle \(\theta\) and distance \(\rho\) (where both \(\theta\) and \(\rho\) are measured with respect to some suitable choice of principal \(z\)-axis):

The incident angle of the yellow ray is \(\theta+\rho/R\) while its refracted angle is \(\theta’+\rho/R\) so Snell’s law asserts (paraxially) that:
\[n’\left(\theta’+\frac{\rho}{R}\right)=n\left(\theta+\frac{\rho}{R}\right)\]
This directly yields the paraxial ray transfer matrix for this spherical glass:
\[\begin{pmatrix}\rho’\\\theta’\end{pmatrix}=\begin{pmatrix}1&0\\ (n-n’)/n’R&n/n’\end{pmatrix}\begin{pmatrix}\rho\\\theta\end{pmatrix}\]
There is no need to memorize such a matrix; instead, because it is \(2\times 2\), it can always be quickly rederived by finding two linearly independent vectors on which the action of such a matrix is physically obvious. The natural choice are its eigenvectors, which correspond physically to the following two “eigenrays”:


In the limit \(R\to\infty\) of a flat interface (e.g. in a plano-convex lens), the paraxial ray transfer matrix reduces to the diagonal matrix \(\begin{pmatrix}1&0\\ 0&n/n’\end{pmatrix}\).
Welding two such spherical glasses (of the same index \(n’\) and radii \(R>0,R'<0\) in the usual Cartesian sign convention) together back-to-back and assuming the usual thin-lens approximation (otherwise one would also need to include a propagation ray transfer matrix \(\begin{pmatrix}1&\Delta z\\ 0&1\end{pmatrix}\) if the thickness \(\Delta z>0\) were non-negligible), one obtains the paraxial ray transfer matrix of a thin convex lens (indeed any thin lens):
\[\begin{pmatrix}1&0\\ (n’-n)/nR’&n’/n\end{pmatrix}\begin{pmatrix}1&0\\ (n-n’)/n’R&n/n’\end{pmatrix}=\begin{pmatrix}1&0\\-P&1\end{pmatrix}\]
where \(P:=1/f\) is the optical power of the thin convex lens and the focal length \(f>0\) is given by the lensmaker’s equation:
\[\frac{1}{f}=\frac{n’-n}{n}\left(\frac{1}{R}-\frac{1}{R’}\right)\]
As a check of this formula’s self-consistency, consider the special case of a thin plano-convex lens (where the convex side has radius \(R>0\)). According to the lensmaker’s equation, this should have focal length \(1/f=(n’-n)/nR\). On the other hand, if one were to flip this plano-convex lens around (and call the radius of the convex side \(R'<0\)), then the lensmaker’s formula says it should now have focal length \(1/f’=-(n’-n)/nR’\). But, since both plano-convex lenses are thin, if one simply puts them next to each other then one would reform the thin convex lens as before, with effective focal length:
\[\frac{1}{f_{\text{eff}}}=\frac{n’-n}{n}\left(\frac{1}{R}-\frac{1}{R’}\right)=\frac{1}{f}+\frac{1}{f’}\]
In other words, the optical powers are additive \(P_{\text{eff}}=P+P’\). Note that this holds for any \(2\) thin lenses placed next to each other to form an “effective lens”, not just for the example of \(2\) plano-convex lenses given above. More generally, because the group of shears on \(\textbf R^2\) along a given direction is isomorphic to the additive abelian group \(\textbf R\), it holds for any \(N\in\textbf Z^+\) thin lenses arranged in an arbitrary order:
\[\begin{pmatrix}1&0\\-P_N&1\end{pmatrix}…\begin{pmatrix}1&0\\-P_2&1\end{pmatrix}\begin{pmatrix}1&0\\-P_1&1\end{pmatrix}=\begin{pmatrix}1&0\\-(P_1+P_2+…+P_N)&1\end{pmatrix}\]
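This shear-composition fact takes a few lines to verify numerically (a sketch with arbitrary illustrative powers):

```python
import numpy as np

def thin_lens(P):
    """Paraxial ray-transfer matrix of a thin lens of optical power P = 1/f."""
    return np.array([[1.0, 0.0], [-P, 1.0]])

powers = [2.0, -0.5, 1.25]   # arbitrary illustrative powers, in 1/m

# Compose the lenses in order (the last lens encountered acts last, i.e. leftmost):
M = np.eye(2)
for P in powers:
    M = thin_lens(P) @ M

# Shears along a fixed direction form an abelian group: the powers simply add.
assert np.allclose(M, thin_lens(sum(powers)))
```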
It’s worth quickly clarifying why these are even thin lenses in the first place and why the claimed \(f\) really is a suitable notion of focal length. One can proceed axiomatically, demanding that a thin lens be any optical element which:
- Focuses all incident light rays parallel to the principal \(z\)-axis to a focal point \(f\) (i.e. \((\rho,0)\mapsto(\rho,-\rho/f)\)).
- Doesn’t affect any light rays that pass through the principal \(z\)-axis (i.e. \((0,\theta)\mapsto(0,\theta)\)).
These are two linearly independent vectors (though only the latter is an eigenvector, as shear transformations are famous for being non-diagonalizable), so these \(2\) axioms are sufficient to fix the form \(\begin{pmatrix}1&0\\-1/f&1\end{pmatrix}\) of the paraxial ray transfer matrix of a thin lens.
Often, one would like to use thin lenses to image various objects. Consider an arbitrary point in space sitting a distance \(\rho\) above the principal \(z\)-axis and a distance \(z>0\) away from a thin lens. If a light ray is emitted from this point at some angle \(\theta\), refracts through the thin lens (of focal length \(f\)), and ends up at some point \((\rho’,z’)\) after the lens during its trajectory, then one has:
\[\begin{pmatrix}\rho’\\\theta’\end{pmatrix}=\begin{pmatrix}1&z’\\0&1\end{pmatrix}\begin{pmatrix}1&0\\-1/f&1\end{pmatrix}\begin{pmatrix}1&z\\0&1\end{pmatrix}\begin{pmatrix}\rho\\\theta\end{pmatrix}\]
where the composition of those \(3\) matrices evaluates to \(\begin{pmatrix}1-z’/f&z+z’-zz’/f\\-1/f&1-z/f\end{pmatrix}\). But this has an important corollary; if one were to specifically choose the distance \(z’>0\) such as to make the top-right entry vanish \(z+z’-zz’/f=0\iff f^2=(z-f)(z’-f)\iff 1/f=1/z+1/z’\), then \(\rho’=(1-z’/f)\rho=\rho/(1-z/f)\) would be independent of \(\theta\)! The condition \(1/f=1/z+1/z’\) is sometimes called the (Gaussian) thin lens equation, though a better name would simply be the imaging condition. The corresponding linear transverse magnification is \(M_{\rho}:=\rho’/\rho=-z’/z=1/(1-z/f)\). One sometimes also sees the linear longitudinal magnification \(M_{z}:=\partial z’/\partial z=-1/(1-z/f)^2=-M_{\rho}^2<0\) which is always negative.
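The same matrix bookkeeping can be checked numerically (a minimal sketch with illustrative \(f\) and \(z\)):

```python
import numpy as np

f, z = 0.1, 0.3                      # focal length and object distance (illustrative, z > f)
zp = 1.0 / (1.0 / f - 1.0 / z)       # image distance z' from the imaging condition 1/f = 1/z + 1/z'

def propagate(d):
    return np.array([[1.0, d], [0.0, 1.0]])

def lens(f):
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

M = propagate(zp) @ lens(f) @ propagate(z)

# At the imaging condition the top-right entry vanishes: rho' is independent of theta.
assert abs(M[0, 1]) < 1e-12
# The top-left entry is the transverse magnification M_rho = -z'/z.
assert np.isclose(M[0, 0], -zp / z)
```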
A magnifying glass works by placing an object at \(z\approx f\) so as to form a virtual image at a distance \(z’\to -\infty\). In that case, both \(M_{\rho},M_z\to\infty\) exhibit poles at \(z=f\), so what does it mean when a company advertises a magnifying glass as offering e.g. \(\times 40\) magnification? It turns out this is actually a specification of the angular magnification \(M_{\theta}:=\theta’/\theta=(\rho/f)/(\rho/d)=d/f\) of the convex lens when viewed at a distance \(d=25\text{ cm}\) from the object (not from the lens). So the statement that \(M_{\theta}=40\) is really a statement about the focal length \(f=d/M_{\theta}=0.625\text{ cm}\) of the lens. In turn, if the glass of the magnifier has a typical index such as \(n’=1.5\) and is intended to be symmetric, then the lensmaker’s equation requires one to use \(R=-R’=0.625\text{ cm}\) in air (coincidentally the same as \(f\)).
The set of paraxial rays \((\rho,\theta)\) constitute a real, \(2\)-dimensional vector space on which optical elements such as lenses act by linear transformations. For instance, a collection of parallel rays incident on the lens (represented by the horizontal line below) is first sheared vertically by the lens, and subsequently free space propagation by a distance \(f\) shears the resultant line horizontally to the point that it becomes vertical, indicating that all the parallel rays have been focused to the same point (thus, this is an instance of the general identity \(\arctan f+\arctan 1/f=\text{sgn}(f)\pi/2\) for arbitrary \(f\in\textbf R-\{0\}\)).

An important corollary of this is that, if one wishes to observe the Fraunhofer interference pattern of some aperture at any distance \(f\) of interest, not necessarily just in the far-field \(f\gg z_R\), a simple way to achieve this is to just place a thin convex lens of focal length \(f\) into the aperture. Recalling that the Fraunhofer interference pattern arises by the superposition of (essentially) parallel Huygens wavelet contributions from each point on the aperture (parallel because one is working in the far-field), and recalling that a lens focuses all incident parallel rays onto a given point in its back focal plane \(f\), this provides a geometrical optics way of seeing why one can form the Fraunhofer interference pattern at an arbitrary distance \(f\) simply by choosing a suitable convex lens.
There is also a more physical optics way of seeing the same result. Recall that, at the end of the day, a lens is just two spherical caps of radii \(R,R’\) that have been welded together. In Cartesian coordinates, the equations of such caps are \(z=-\sqrt{R^2-x’^2-y’^2}\) and \(z=\sqrt{R’^2-x’^2-y’^2}\), but in the paraxial approximation, these look like the paraboloids \(z\approx -R+(x’^2+y’^2)/2R\) and \(z\approx -R’+(x’^2+y’^2)/2R’\) (where \(R'<0\) for a convex lens, etc.). Here, despite using a “thin” lens approximation, one cannot completely ignore the thickness profile across the lens (also a paraboloid):
\[\Delta z(x’,y’)=R-R’-\frac{x’^2+y’^2}{2}\left(\frac{1}{R}-\frac{1}{R’}\right)\]
So a light ray incident at \((x’,y’)\) on the lens will, upon exiting the lens, have acquired an additional phase shift of:
\[\Delta\phi(x’,y’)=n’k\Delta z(x’,y’)-nk\Delta z(x’,y’)=(n’-n)k\Delta z(x’,y’)\]
It is important to understand that \(k\) here is the free space wavenumber, but that in a medium \(n\) it becomes \(k\mapsto nk\) because \(\omega=ck=vnk\) is fixed. This corresponds to a spatially-varying \(U(1)\) modulation of the aperture field:
\[\psi(x’,y’,0)\mapsto\psi(x’,y’,0)e^{i\Delta\phi(x’,y’)}=\psi(x’,y’,0)e^{i(n’-n)k(R-R’)}e^{-ink(x’^2+y’^2)/2f}\]
where the lensmaker’s equation has been used. But notice that, when inserted into the Fresnel diffraction integral (with \(k\mapsto nk\) and hence \(\lambda\mapsto\lambda/n\)), if one places the screen exactly at \(z=f\), then the quadratic phase terms cancel out and one is left with precisely the Fraunhofer interference pattern:
\[\psi(\textbf x)=\frac{ne^{i(n’-n)k(R-R’)}e^{inkf}e^{ink(|\textbf x|^2-f^2)/2f}}{i\lambda f}\iint_{\textbf x’\in\text{aperture}}\psi(\textbf x’)e^{-in\textbf k\cdot\textbf x’}d^2\textbf x’\]
More generally, for an arbitrary optical component with ray transfer matrix \(\begin{pmatrix}A&B\\C&D\end{pmatrix}\) in the geometrical optics picture, there is a corresponding integral operator in the physical optics picture (the Collins diffraction integral; to be added).
- Collimator setup
- Lenses, sign conventions (basically, one key point is that an optical element which tends to converge light rays has positive focal length).
- Real objects are a source of rays, real images are a sink of rays, virtual images are a source of rays (probably all of this can be made precise in the eikonal approximation).
- Gaussian optics as paraxial geometrical optics
- No notion of \(\lambda\) (can view geometrical optics as the \(\lambda\to 0\) limit of physical optics)
- Ray tracing algorithms.
- Spherical aberration
- Chromatic aberration due to optical dispersion \(n(\lambda)\) as the only time where \(\lambda\) shows up, sort of ad hoc.
Phases of the Classical Ising Model
Problem #\(1\): When someone comes up to you on the street and just says “Ising model”, what should be the first thing you think of?
Solution #\(1\): The classical Hamiltonian:
\[H=-E_{\text{int}}\sum_{\langle i,j\rangle}\sigma_i\sigma_j-E_{\text{ext}}\sum_i\sigma_i\]
(keeping in mind though that there are many variants on this simple Ising model).
Problem #\(2\): Is the Ising model classical or quantum mechanical?
Solution #\(2\): It is purely classical. Indeed, this is a very common misconception, because many of the words that get tossed around when discussing the Ising model (e.g. “spins” on a lattice, “Hamiltonian”, “(anti)ferromagnetism”, etc.) sound like quantum mechanical concepts, and indeed they are; but the Ising model by itself is a purely classical mathematical model that a priori need not have any connection to physics, and certainly not to quantum mechanical systems (that being said, it’s still useful for intuition to speak about it as if it were a toy model of a ferromagnet).
To drive this point home, remember that the Hamiltonian \(H\) is just a function on phase space in classical mechanics, whereas it is an operator in quantum mechanics…but in the formula for \(H\) in Solution #\(1\), there are no operators on the RHS; the \(\sigma_i\in\{-1,1\}\) are just numbers which specify the classical microstate \((\sigma_1,\sigma_2,…)\) of the system, so \(H\) is just a classical (as opposed to quantum) Hamiltonian. And there are no superpositions of states, or non-commuting operators, or any other quantum voodoo going on. So, despite the discreteness/quantization which is built into the Ising model, it is purely classical.
Problem #\(3\): What does it mean to “solve” the Ising model? (i.e. what properties of the Ising lattice is one interested in understanding?)
Solution #\(3\): The mental picture one should have in mind is that of coupling the Ising lattice with a heat bath at some temperature \(T\), and then ask how the order parameter \(m\) of the lattice (in this case the Boltzmann-averaged mean magnetization) varies with the choice of heat bath temperature \(T\). Intuitively, one should already have a qualitative sense of the answer:

So to “solve” the Ising model just means to quantitatively get the equation of those curves \(m=m(T)\) for all possible combinations of parameters \(E_{\text{int}},E_{\text{ext}}\in\textbf R\) in the Ising Hamiltonian \(H\).
Problem #\(4\):

Solution #\(4\):





Problem #\(5\):


Solution #\(5\):





Comparing with the earlier intuitive sketch (note all the inner loop branches at low temperature are unstable):

In particular, the phase transition at \(E_{\text{ext}}=0\) is manifested by the trifurcation at the critical point \(\beta=\beta_c\).
Problem #\(6\):

Solution #\(6\):






Problem #\(7\): Show that in the mean field approximation, the short-range Ising model at \(E_{\text{ext}}=0\) experiences a \(2\)-nd order phase transition, with the equilibrium magnetization for \(T<T_c\) (but \(T\) close to \(T_c\)) going like \(m_*(T)\approx\pm\sqrt{3(T_c/T-1)}\).
Solution #\(7\): Within (stupid!) mean-field theory, the effective free energy is:
\[f(m)=-k_BT\ln 2-E_{\text{ext}}m-\frac{qE_{\text{int}}}{2}m^2+\frac{1}{2}k_BT\biggr[(1+m)\ln(1+m)+(1-m)\ln(1-m)\biggr]\]
Proof:







So anyways, Maclaurin-expanding the mean-field effective free energy \(f(m)\) per unit spin:





The spontaneous \(\textbf Z_2\) symmetry breaking (i.e. ground state not preserved!) associated to the \(T<T_c\) ordered phase at \(E_{\text{ext}}=0\) is apparent:

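A numerical cross-check of this Landau picture (a sketch in illustrative units with \(k_BT_c=qE_{\text{int}}=1\), \(t:=T/T_c\) and \(E_{\text{ext}}=0\); the constant \(-k_BT\ln 2\) is dropped since it doesn’t affect the minimizer):

```python
import numpy as np

def f(m, t):
    """Mean-field free energy per spin at E_ext = 0 (constant -k_B*T*ln2 dropped),
    in units where k_B*T_c = q*E_int = 1 and t := T/T_c."""
    return -0.5 * m**2 + 0.5 * t * ((1 + m) * np.log1p(m) + (1 - m) * np.log1p(-m))

t = 0.99                                          # just below the critical temperature
m_grid = np.linspace(-0.999, 0.999, 2_000_001)
m_star = abs(m_grid[np.argmin(f(m_grid, t))])     # equilibrium magnetization (either sign)

m_landau = np.sqrt(3 * (1 / t - 1))               # predicted m_* ~ sqrt(3*(T_c/T - 1))
```

Minimizing the full \(f(m)\) on a grid reproduces the Maclaurin-expansion prediction to a couple of percent this close to \(T_c\).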
Problem #\(8\): In the Ehrenfest classification of phase transitions, an \(N\)-th order phase transition occurs when the \(N\)-th derivative of the free energy \(F\) (with respect to a thermodynamic variable such as \(T\) or \(E_{\text{ext}}\)) is discontinuous at some critical point. But considering that \(F=-k_BT\ln Z\) and the partition function \(Z=\sum_{\{\sigma_i\}}e^{-\beta E_{\{\sigma_i\}}}\) is a sum of \(2^N\) analytic exponentials (for \(N\) spins), how can phase transitions be possible?
Solution #\(8\): By analogy, consider the Fourier series for a certain square wave:
\[f(t)=\frac{4}{\pi}\sum_{n=1,3,5,…}^{\infty}\frac{\sin(2\pi n t/T)}{n}\]
Although each sinusoid in the Fourier series is everywhere analytic, the series converges in \(L^2\) norm to a limiting square wave which has discontinuities at \(t_m=mT/2\), hence failing to be analytic at those points! So the catch here is that while any finite series of analytic functions (e.g. a partial sum truncation) has its analyticity preserved, an infinite series need not! This simple result of analysis underpins the existence of phase transitions! Of course, for any finite number of spins \(N<\infty\), \(2^N\) is still finite and strictly speaking there are no phase transitions in any finite system. But in practice \(N\sim 10^{23}\) is so large that it is effectively infinite, and in this \(N=\infty\) limit the system to all intents and purposes exhibits a phase transition.
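A quick numerical illustration of this point, with the truncation level \(n_{\max}\) playing the role of the system size (illustrative period and evaluation point):

```python
import numpy as np

T = 1.0   # period of the square wave (illustrative)

def partial_sum(t, n_max):
    """Truncated Fourier series of the square wave (odd harmonics up to n_max)."""
    n = np.arange(1, n_max + 1, 2)
    return (4 / np.pi) * np.sum(np.sin(2 * np.pi * n * t / T) / n)

# Every finite truncation is analytic and vanishes exactly at the discontinuity t = 0,
# while just to the right of it the sum creeps toward the jump value 1 as n_max grows:
vals = [partial_sum(1e-4, n_max) for n_max in (11, 1001, 100001)]
```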
Similar to the phase transitions, spontaneous symmetry breaking is also an \(N=\infty\) phenomenon only, strictly speaking:
\[m_{SSB}=\lim_{E_{\text{ext}}\to 0}\lim_{N\to\infty}\biggr\langle\frac{1}{N}\sum_{i=1}^N\sigma_i\biggr\rangle\]
where the limits do not commute \(\lim_{E_{\text{ext}}\to 0}\lim_{N\to\infty}\neq\lim_{N\to\infty}\lim_{E_{\text{ext}}\to 0}\) because for any finite \(N<\infty\), \(\langle m\rangle_N=-\frac{1}{N}\frac{\partial F_H}{\partial E_{\text{ext}}}|_{E_{\text{ext}}=0}=0\) since \(\textbf Z_2\) symmetry enforces \(F_H(E_{\text{ext}})=F_H(-E_{\text{ext}})\) so that its derivative must be odd and therefore vanishing at the origin.
Problem #\(9\): Show that, in the mean-field short-range Ising model at \(E_{\text{ext}}=0\), the specific/intensive heat capacity \(c\) is discontinuous at \(T=T_c\).
Solution #\(9\):




Appendix: Physical Systems Described by Classical Ising Statistics
The purpose of this post is to dive into the intricacies of the classical Ising model. For this, it is useful to imagine the Bravais lattice \(\textbf Z^d\) in \(d\)-dimensions of lattice parameter \(a\), together with a large number \(N\) of neutral spin \(s=1/2\) fermions (e.g. neutrons, ignoring the fact that isolated neutrons are unstable) tightly bound to the lattice sites \(\textbf x\in\textbf Z^d\), each site accommodating at most one fermion by the Pauli exclusion principle. On top of all this, apply a uniform external magnetic field \(\textbf B_{\text{ext}}\) across the entire sample of \(N\) fermions. Physically then, ignoring any kinetic energy or hopping/tunneling between lattice sites (cf. Fermi-Hubbard model), there are two forms of potential energy that contribute to the total Hamiltonian \(H\) of this lattice of spins:
- Each of these \(N\) neutral fermions has a magnetic dipole moment \(\boldsymbol{\mu}_{\textbf S}=\gamma_{\textbf S}\textbf S\) arising from its spin angular momentum \(\textbf S\) (in the case of charged fermions such as electrons \(e^-\), this is just the usual \(\gamma_{\textbf S}=-g_{\textbf S}\mu_B/\hbar\) but for neutrons the origin of such a magnetic dipole moment is more subtle, ultimately arising from its quark structure). This magnetic dipole moment \(\boldsymbol{\mu}_{\textbf S}\) couples with the external magnetic field \(\textbf B_{\text{ext}}\), leading to an interaction energy of the form:
\[V_{\text{ext}}=-\sum_{i=1}^N\boldsymbol{\mu}_{\textbf S,i}\cdot\textbf B_{\text{ext}}=-\gamma_{\textbf S}\sum_{i=1}^N\textbf S_i\cdot\textbf B_{\text{ext}}\]
- Nearby magnetic dipole moments couple with each other across space, mediating a local internal interaction of the form:
\[V_{\text{int}}=\frac{\mu_0\gamma_{\textbf S}^2}{4\pi}\sum_{1\leq i\neq j\leq N}\frac{3(\textbf S_i\cdot\Delta\hat{\textbf x}_{ij})(\textbf S_j\cdot\Delta\hat{\textbf x}_{ij})-\textbf S_i\cdot\textbf S_j}{|\Delta\textbf x_{ij}|^3}\]
Thus, the total Hamiltonian \(H\) on the spin state space \(\mathcal H\cong(\textbf C^2)^{\otimes N}\) (the fermions being pinned to their lattice sites) is:
\[H=V_{\text{int}}+V_{\text{ext}}\]
Right now, it is hopelessly complicated. From this point onward, a sequence of dubious approximations will be applied to transform this current Hamiltonian \(H\mapsto H_{\text{Ising}}\) to the Ising Hamiltonian \(H_{\text{Ising}}\) (in fact, as mentioned, even the apparently complicated form of the Hamiltonian \(H\) is already approximate; the reason for using neutral fermions is to avoid dealing with an additional Coulomb repulsion contribution to \(H\)).
- Approximation #1: Recall that the direction of the applied magnetic field, say along the \(z\)-axis \(\textbf B_{\text{ext}}=B_{\text{ext}}\hat{\textbf k}\), defines the quantization axis of all the relevant angular momenta. For a sufficiently strong magnetic field \(\textbf B_{\text{ext}}\) (cf. the Paschen-Back effect in atoms), the external coupling \(V_{\text{ext}}\) should dominate the internal coupling \(V_{\text{int}}\) and so all \(N\) spin angular momenta \(\textbf S_i\) will Larmor-precess around \(\textbf B_{\text{ext}}\) with \(m_{s,i}\in\{-1/2,1/2\}\) becoming a good quantum number.
- Approximation #2: Assume that only nearest-neighbour dipolar couplings are important (in \(\textbf Z^d\) there would be \(2d\) nearest neighbours) and that moreover, because all the spins are roughly aligned in the direction of \(\textbf B_{\text{ext}}\), the term \((\textbf S_i\cdot\Delta\hat{\textbf x}_{ij})(\textbf S_j\cdot\Delta\hat{\textbf x}_{ij})\) is not as important as the spin-spin coupling term \(\textbf S_i\cdot\textbf S_j\).
Combining these two approximations, one obtains the Ising Hamiltonian \(H_{\text{Ising}}\) acting on the Ising state space \(\mathcal H_{\text{Ising}}\cong\{-1,1\}^N\):
\[H\approx H_{\text{Ising}}=-E_{\text{int}}\sum_{\langle i,j\rangle}\sigma_i\sigma_j-E_{\text{ext}}\sum_{i=1}^N\sigma_i\]
where \(\sigma_i:=2m_{s,i}\in\{-1,1\}\), \(E_{\text{int}}:=\mu_0\hbar^2\gamma_{\textbf S}^2/16\pi a^3\) is a proxy for the interaction strength between adjacent fermions via the energy gain of being mutually spin-aligned and \(E_{\text{ext}}:=\hbar\gamma_{\textbf S}B_{\text{ext}}/2\) is a proxy for the external field strength via the energy gain of being spin-aligned with it. In the context of magnetism, a material with \(E_{\text{int}}>0\) would be thought of as a ferromagnet while \(E_{\text{int}}<0\) is called an antiferromagnet (this possibility does not arise however after the various approximations that were made). Similarly, \(E_{\text{ext}}\) can be either positive or negative (e.g. for neutrons it is actually negative \(E_{\text{ext}}<0\) because \(\gamma_{\textbf S}<0\)) but for intuition purposes one can just think \(E_{\text{ext}}>0\) so that being spin-aligned with \(\textbf B_{\text{ext}}\) is the desirable state of affairs.
From \(H_{\text{Ising}}\) to \(Z_{\text{Ising}}\)
As usual, once the Hamiltonian \(H_{\text{Ising}}\) has been found (i.e. once the physics has been specified), the rest is just math. In particular, the usual next task is to calculate its canonical partition function \(Z_{\text{Ising}}=\text{Tr}(e^{-\beta H_{\text{Ising}}})\). The calculation of \(Z_{\text{Ising}}\) can be done exactly in dimension \(d=1\) for arbitrary \(E_{\text{ext}}\) (this is what Ising did in his PhD thesis) and also for \(d=2\) provided the absence of an external magnetic field \(E_{\text{ext}}=0\) (this is due to Onsager). In higher dimensions \(d\gg 1\), as the number \(2d\) of nearest neighbours increases, the accuracy of an approximate method for evaluating \(Z_{\text{Ising}}\) known as mean field theory increases accordingly, becoming an exact solution only for the unphysical \(d=\infty\). It is simplest to first work through the mathematics of the mean field theory approach before looking at the special low-dimensional cases \(d=1,(d=2,E_{\text{ext}}=0)\) (it is worth emphasizing that the Ising model can also be trivially solved in any dimension \(d\) if interactions are simply turned off \(E_{\text{int}}=0\) but this would be utterly missing the whole point of the Ising model! edit: in hindsight, maybe not really after all, see the section below on mean field theory).
First, just from inspecting the Hamiltonian \(H_{\text{Ising}}\) it is clear that the net “magnetization” \(\Sigma:=\sum_{i=1}^N\sigma_i\) is conjugate to \(E_{\text{ext}}\), so in the canonical ensemble it fluctuates around the expectation:
\[\langle\Sigma\rangle=-\frac{\partial F_{\text{Ising}}}{\partial E_{\text{ext}}}\]
The ensemble-averaged spin is therefore \(\langle\sigma\rangle=\langle\Sigma\rangle/N\). The usual “proper” way to calculate \(\langle\sigma\rangle\) would be to directly and analytically evaluate the sums in \(Z_{\text{Ising}}=e^{-\beta F_{\text{Ising}}}\), so in particular \(\langle\sigma\rangle\) shouldn’t appear anywhere until one explicitly calculates it. However, using mean field theory, it turns out one will end up with an implicit equation for \(\langle\sigma\rangle\) that can nevertheless still be solved in a self-consistent manner.
To begin, write \(\sigma_i=\langle\sigma\rangle+\delta\sigma_i\) (cf. the Reynolds decomposition used to derive the RANS equations in turbulent fluid mechanics). Then the interaction term in \(H_{\text{Ising}}\) (which is both the all-important term but also the one that makes the problem hard) can be written:
\[V_{\text{int}}=-E_{\text{int}}\sum_{\langle i,j\rangle}(\langle\sigma\rangle+\delta\sigma_i)(\langle\sigma\rangle+\delta\sigma_j)=-E_{\text{int}}\langle\sigma\rangle^2\sum_{\langle i,j\rangle}1-E_{\text{int}}\langle\sigma\rangle\sum_{\langle i,j\rangle}(\delta\sigma_i+\delta\sigma_j)-E_{\text{int}}\sum_{\langle i,j\rangle}\delta\sigma_i\delta\sigma_j\]
Although the variance \(\langle\delta\sigma_i^2\rangle=\langle\sigma_i^2\rangle-\langle\sigma\rangle^2=1-\langle\sigma\rangle^2\) of each individual spin \(\sigma_i\) from the mean background spin \(\langle\sigma\rangle\) is not in general going to be zero (unless of course the entire system is magnetized along or against \(\textbf B_{\text{ext}}\), i.e. \(\langle\sigma\rangle=\pm 1\)), the mean field approximation says that the covariance between distinct neighbouring spins \(\langle i,j\rangle\) should average to \(\sum_{\langle i,j\rangle}\delta\sigma_i\delta\sigma_j\approx 0\), so that, roughly speaking, the overall \(N\times N\) covariance matrix of the spins is not only diagonal but just proportional to the identity \((1-\langle\sigma\rangle^2)1_{N\times N}\).
Thus, reverting back to \(\delta\sigma_i=\sigma_i-\langle\sigma\rangle\) and using for the lattice \(\textbf Z^d\) the identity \(\sum_{\langle i,j\rangle}1\approx Nd\) (because each of \(N\) spins has \(2d\) nearest neighbours but a factor of \(1/2\) is needed to compensate double-counting each bond) and the identity \(\sum_{\langle i,j\rangle}(\sigma_i+\sigma_j)\approx 2d\sum_{i=1}^N\sigma_i\) (just draw a picture), the mean field Ising Hamiltonian \(H’_{\text{Ising}}\) simplifies to:
\[H’_{\text{Ising}}=NdE_{\text{int}}\langle\sigma\rangle^2-E_{\text{ext}}’\sum_{i=1}^N\sigma_i\]
where the constant \(NdE_{\text{int}}\langle\sigma\rangle^2\) doesn’t affect any of the physics (although it will be kept in the calculations below for clarity) and \(E_{\text{ext}}’=E_{\text{ext}}+2dE_{\text{int}}\langle\sigma\rangle\) is the original energy \(E_{\text{ext}}\) together now with a mean field contribution \(2dE_{\text{int}}\langle\sigma\rangle\). This has a straightforward interpretation; one is still acknowledging that only the \(2d\) nearest neighbouring spins can influence a given spin, but now, rather than each one having its own spin \(\sigma_j\), one is assuming that they all exert the same mean field \(\langle\sigma\rangle\) that permeates the entire Ising lattice \(\textbf Z^d\). Basically, the mean field approximation has removed interactions \(E_{\text{int}}’=0\), reducing the problem to a trivial non-interacting one for which it is straightforward to calculate (this is just repeating the usual steps of calculating, e.g. the Schottky anomaly):
\[Z’_{\text{Ising}}=(2e^{-\beta dE_{\text{int}}\langle\sigma\rangle^2}\cosh\beta E’_{\text{ext}})^N\]
\[F’_{\text{Ising}}=-\frac{N}{\beta}(\ln\cosh\beta E’_{\text{ext}}-\beta dE_{\text{int}}\langle\sigma\rangle^2+\ln 2)\]
\[\langle\sigma\rangle=\tanh\beta E’_{\text{ext}}=\tanh\beta(E_{\text{ext}}+2dE_{\text{int}}\langle\sigma\rangle)\]
As promised earlier, this is an implicit equation for the average spin \(\langle\sigma\rangle\) that can, for a fixed dimension \(d\), be solved for various values of the temperature \(1/\beta\) (which intuitively wants to randomize the spins) and energies \(E_{\text{int}},E_{\text{ext}}\) (both of which intuitively want to align the spins). The outcome of this competition is the following:

If one applies any kind of external magnetic field \(E_{\text{ext}}\neq 0\), then as one increases the dimensionless temperature \(1/(2d\beta E_{\text{int}})\to\infty\), the mean spin \(\langle\sigma\rangle\to 0\) randomizes gradually (more precisely, as \(\langle\sigma\rangle=\beta E_{\text{ext}}+2dE_{\text{int}}E_{\text{ext}}\beta^2+O_{\beta\to 0}(\beta^3)\)). The surprise occurs in the absence of any external magnetic field \(E_{\text{ext}}=0\); here, driven solely by mean field interactions, the mean spin \(\langle\sigma\rangle=0\) abruptly vanishes at all temperatures \(T\geq T_c\) exceeding a critical temperature \(k_BT_c=2dE_{\text{int}}\). This is a second-order ferromagnetic-to-paramagnetic phase transition (second order because the discontinuity occurs in the derivative of \(\langle\sigma\rangle\), which itself is already a derivative of the free energy \(F’_{\text{Ising}}\)). Meanwhile, there is also a first-order phase transition obtained by fixing a subcritical temperature \(T<T_c\) and varying \(E_{\text{ext}}\), as in this case it is the mean magnetization \(\langle\sigma\rangle\) itself that jumps discontinuously.
Note also that, similar to the situation for the Van der Waals equation when one had \(T<T_c\), here it is apparent that at sufficiently low temperatures for arbitrary \(E_{\text{ext}}\), the mean field Ising model predicts \(3\) possible mean magnetizations \(\langle\sigma\rangle\). For \(E_{\text{ext}}=0\), the unmagnetized solution \(\langle\sigma\rangle=0\) solution turns out to be an unstable equilibrium. For \(E_{\text{ext}}>0\), the solution on the top branch with \(\langle\sigma\rangle>0\) aligned with the external magnetic field is stable while among the two solutions with \(\langle\sigma\rangle<0\), one is likewise unstable while one is metastable, and similarly for \(E_{\text{ext}}<0\).
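The self-consistency condition is easy to iterate numerically (a sketch with illustrative parameters; fixed-point iteration converges to a stable branch selected by the starting guess):

```python
import numpy as np

def mean_field_m(beta, d=2, E_int=1.0, E_ext=0.0, m0=0.9, iters=2000):
    """Solve <sigma> = tanh(beta*(E_ext + 2*d*E_int*<sigma>)) by fixed-point iteration,
    starting from the guess m0 (illustrative defaults)."""
    m = m0
    for _ in range(iters):
        m = np.tanh(beta * (E_ext + 2 * d * E_int * m))
    return m

kTc = 2 * 2 * 1.0   # k_B*T_c = 2*d*E_int for d = 2, E_int = 1

# Above T_c the only solution at E_ext = 0 is <sigma> = 0; below T_c a nonzero
# spontaneous magnetization appears:
m_hot = mean_field_m(beta=1 / (2 * kTc))   # T = 2*T_c
m_cold = mean_field_m(beta=2 / kTc)        # T = T_c/2
```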
Critical Exponents
Solving The Ising Chain (\(d=1\)) Via Transfer Matrices
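A minimal numerical sketch of the method (illustrative parameters, periodic boundary conditions): the \(2\times 2\) transfer matrix \(T_{\sigma\sigma’}=e^{\beta E_{\text{int}}\sigma\sigma’+\beta E_{\text{ext}}(\sigma+\sigma’)/2}\) gives \(Z=\text{Tr}(T^N)\), which can be cross-checked against brute-force enumeration for small \(N\):

```python
import numpy as np
from itertools import product

beta, E_int, E_ext, N = 0.7, 1.0, 0.3, 8   # illustrative values, periodic chain

# Transfer matrix: T[s, s'] = exp(beta*(E_int*s*s' + E_ext*(s + s')/2)) for s, s' = +-1
s = np.array([1.0, -1.0])
T = np.exp(beta * (E_int * np.outer(s, s) + E_ext * (s[:, None] + s[None, :]) / 2))
Z_transfer = np.trace(np.linalg.matrix_power(T, N))

# Brute-force check: sum e^{-beta*H} over all 2^N spin configurations
Z_brute = 0.0
for config in product([1, -1], repeat=N):
    H = -E_int * sum(config[i] * config[(i + 1) % N] for i in range(N)) \
        - E_ext * sum(config)
    Z_brute += np.exp(-beta * H)
```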
Low & High-\(T\) Limits of Ising Model in \(d=2\) Dimensions
Talk about Peierls droplet, prove Kramers-Wannier duality between the low and high-\(T\) regimes.
Beyond Ferromagnetism
The point of the Ising model isn’t really to be some kind of accurate model for any real-life physical system, but just a “proof of concept” demonstration that phase transitions can arise from statistical mechanics; although the sum of finitely many analytic functions is analytic, in the thermodynamic limit a phase transition can appear. A similar vein of mathematical modelling can be used to model lattice gases.
Talk about the Metropolis-Hastings algorithm.
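In the meantime, a minimal sketch of single-spin-flip Metropolis for the \(d=2\) Ising model at \(E_{\text{ext}}=0\) (illustrative lattice size, temperature and seed):

```python
import numpy as np

rng = np.random.default_rng(0)
L, beta, E_int = 16, 0.6, 1.0                 # illustrative: 16x16 lattice, beta > beta_c
spins = rng.choice([-1, 1], size=(L, L))

def energy(s):
    """Ising energy at E_ext = 0 with periodic boundaries (each bond counted once)."""
    return -E_int * np.sum(s * (np.roll(s, 1, axis=0) + np.roll(s, 1, axis=1)))

E_initial = energy(spins)

for _ in range(200_000):                      # ~800 attempted flips per spin
    i, j = rng.integers(L, size=2)
    # cost of flipping spin (i,j): dE = 2*E_int*s_ij*(sum of its 2d = 4 neighbours)
    nn = spins[(i+1) % L, j] + spins[(i-1) % L, j] + spins[i, (j+1) % L] + spins[i, (j-1) % L]
    dE = 2 * E_int * spins[i, j] * nn
    # Metropolis rule: always accept downhill moves, uphill with probability e^{-beta*dE}
    if dE <= 0 or rng.random() < np.exp(-beta * dE):
        spins[i, j] *= -1

E_final = energy(spins)
m = spins.mean()    # sample magnetization per spin; typically far from 0 below T_c
```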
Turbulence
The purpose of this post is to study the universal properties of fully developed turbulence \(\text{Re}\gg\text{Re}^*\sim 10^3\). Thanks to direct numerical simulation (DNS), there is strong evidence to suggest that the nonlinear advective term \(\left(\textbf v\cdot\frac{\partial}{\partial\textbf x}\right)\textbf v\) in the Navier-Stokes equations correctly captures turbulent flow in fluids. However, rather than trying to find analytical solutions \(\textbf v(\textbf x,t)\) that exhibit turbulence (which is clearly pretty hopeless), it makes sense to decompose \(\textbf v=\bar{\textbf v}+\delta\textbf v\) into a mean velocity field \(\bar{\textbf v}(\textbf x,t)\) superimposed with some fluctuations \(\delta\textbf v(\textbf x,t)\). Here the “mean” velocity field \(\bar{\textbf v}\) is a time average over some “suitable” period \(T\), also known as Reynolds averaging:
\[\bar{\textbf v}(\textbf x,t):=\frac{1}{T}\int_{t}^{t+T}\textbf v(\textbf x,t’)dt’\]
By construction, this implies that the Reynolds time average of the fluctuations vanishes \(\overline{\delta\textbf v}=\overline{\textbf v-\bar{\textbf v}}=\bar{\textbf v}-\bar{\textbf v}=\textbf 0\).
One can also check that \(\textbf v\) is incompressible if and only if both the Reynolds averaged flow \(\bar{\textbf v}\) and the fluctuations \(\delta\textbf v\) are also incompressible. One similarly works with the Reynolds averaged pressure \(p=\bar p+\delta p\) so that by design \(\overline{\delta p}=0\).
Substituting \(\textbf v=\bar{\textbf v}+\delta\textbf v\) and \(p=\bar p+\delta p\) into the Navier-Stokes equations and Reynolds averaging both sides of the equation yields the well-named Reynolds-averaged Navier-Stokes (RANS) equations:
\[\rho\left(\frac{\partial\bar{\textbf v}}{\partial t}+\left(\bar{\textbf v}\cdot\frac{\partial}{\partial\textbf x}\right)\bar{\textbf v}\right)=-\frac{\partial\bar p}{\partial\textbf x}+\eta\left|\frac{\partial}{\partial\textbf x}\right|^2\bar{\textbf v}-\rho\overline{\left(\delta\textbf v\cdot\frac{\partial}{\partial\textbf x}\right)\delta\textbf v}+\bar{\textbf f}\]
Or in a Cartesian basis:
\[\rho(\dot{\bar{v}}_i+\bar v_j\partial_j\bar v_i)=\partial_j\bar{\sigma}_{ij}+\bar{f}_i\]
where the Reynolds averaged stress tensor \(\bar{\sigma}\) now includes an additional turbulent contribution \(\bar{\sigma}_{\text{Reynolds}}=-\rho\overline{\delta\textbf v\otimes\delta\textbf v}\) known as the Reynolds stress:
\[\bar{\sigma}_{ij}=-\bar p\delta_{ij}+\eta(\partial_j\bar v_i+\partial_i\bar v_j)-\rho\overline{\delta v_i\delta v_j}\]
(this can be quickly checked using the incompressibility conditions \(\partial_j\bar v_j=\partial_j\delta v_j=0\)).
At this point, assuming that the external body forces have no fluctuations \(\delta\textbf f=\textbf f-\bar{\textbf f}=\textbf 0\), one can subtract the RANS equations from the original Navier-Stokes equations to obtain:
\[\rho\left(\frac{\partial\delta\textbf v}{\partial t}+\left(\bar{\textbf v}\cdot\frac{\partial}{\partial\textbf x}\right)\delta\textbf v+\left(\delta\textbf v\cdot\frac{\partial}{\partial\textbf x}\right)\bar{\textbf v}+\delta\left(\left(\delta\textbf v\cdot\frac{\partial}{\partial\textbf x}\right)\delta\textbf v\right)\right)=-\frac{\partial\delta p}{\partial\textbf x}+\eta\left|\frac{\partial}{\partial\textbf x}\right|^2\delta\textbf v\]
Taking the outer product of both sides with \(\delta\textbf v\) and then Reynolds averaging yields (to be added: closure problem, Boussinesq approximation as a closure model).
Dimensional Analysis
Working with Optical Tables
The purpose of this post is to document the uses of several standard components used in optics experiments.
Optical Fibers & APC Connectors
An optical fiber is a waveguide for light waves. The idea is to use it to transmit light over long distances with minimal loss. It consists of an inner core, made of glass or plastic, where total internal reflection can take place within the waveguide (ignoring evanescent transmitted waves) because the surrounding cladding has a lower refractive index than the core, and a jacket (blue layer in the picture).
At the ends of optical fibers, one typically also has angled physical contact (APC) connectors to minimize back-reflection of light (by using an angled design usually around \(8^{\circ}\)). These ensure alignment of optical fiber cores when connecting two optical fibers to each other.
Often, optical fibers can be polarization-maintaining (PM), meaning that if one launches light along one of the fiber’s polarization eigenaxes, the polarization is preserved at the output. This is because apparently the core of a PM fiber is typically already pre-stressed to give it some kind of birefringence \(\Delta n=n_{\text{slow}}-n_{\text{fast}}\), which keeps the two polarization eigenmodes from mixing (general rule of thumb: any symmetry which is easily broken will be broken; for example the magnetic field is never actually \(\textbf B=\textbf 0\) due to Earth, someone’s phone, etc., and since you don’t want other things defining your quantization axis, you should just apply a magnetic field yourself anyway).
Coupling Laser Light into an Optical Fiber
The goal is to get the laser beam to be normally incident \(\theta_x=\theta_y=0\) at the center \(x=y=0\) of an optical fiber. Although initially this sounds quite trivial, as with any waveguide, the optical fiber is extraordinarily sensitive to any small deviations in these \(4\) degrees of freedom \(x,y,\theta_x,\theta_y\) and will only work if these \(4\) conditions are almost perfectly met (hence rendering the task highly non-trivial). Thus, the naive solution of just trying to align the laser beam into the optical fiber “by hand” is hopeless, since one’s hands afford merely coarse control over \(x,y,\theta_x,\theta_y\) but clearly here one requires much finer control in order to successfully couple the laser light into the optical fiber.
The way to obtain such fine control is to use mirrors; each mirror comes with fine control in both spherical coordinates \(\phi,\theta\) (and also there is leeway in exactly where the laser is incident on the mirror and the fact that it need not be exactly \(45^{\circ}\) or anything like that). Of course changing the azimuth \(\phi\) of a given mirror will simultaneously change both \(x,\theta_x\) and similarly changing the zenith angle \(\theta\) of a mirror simultaneously affects both \(y,\theta_y\), so in this sense these degrees of freedom are “coupled”. Specifically, each mirror provides for \(2\) degrees of freedom \(\phi,\theta\) which is why in total \(2\) mirrors are actually needed to properly couple the laser into the optical fiber.
One can connect the output end of the optical fiber to a fiber pen and use a translucent polymer sheet to see where the laser beam from the laser intersects the laser beam from the fiber pen at various regions in the setup. From having played around with the setup, it is more sensible to focus on aligning them at the extremes of the path, which tends to automatically ensure that they will be aligned everywhere else in the middle. Moreover, a general rule of thumb turns out to be that in order to align a section, the mirror on which one should make fine adjustments is, perhaps counterintuitively, the one further away (this iterative procedure is often called “walking the beam”). Doing it iteratively like this will converge onto an aligned optical system; doing it the other way will diverge into a hopelessly misaligned system.
After having completed the “fine structure” alignment of the mirrors properly so that there is for sure some non-zero signal coming out the output of the optical fiber, one can then proceed to a “hyperfine” level of adjustments, putting the output of the optical fiber into a photodiode and measuring the photocurrent developed across a potentiometer \(R\) via a multimeter, or just directly using a power meter. Here again, one essentially seeks to maximize the photodiode signal by an algorithm which vaguely feels like a manual implementation of gradient descent. More precisely, it turns out to be more advisable to make some small random perturbation to the \(\phi\) (resp. \(\theta\)) of the mirror farther away (not necessarily physically, but in the sense of the optical path length) from the input of the optical fiber, then adjusting \(\phi\) (resp. \(\theta\)) of the mirror closer to the optical fiber input until the signal is locally maximized, and repeating this until one eventually converges onto not merely a local, but global maximum (2D search). Finally, also consider the focal length of the lens relative to the fiber (this is a 1D search at the end). At this point, one can feel pretty confident that the laser light is properly coupled into the optical fiber, i.e. that \(x\approx y\approx\theta_x\approx\theta_y\approx 0\). Each time one takes a fiber out and puts it back in again, one has to recouple because of how sensitive the whole alignment is.
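The kick-then-reoptimize loop above can be caricatured in a toy model. Here the mirror angles \((a_1,a_2)\) map linearly onto the spot position \(x\) and incidence angle \(\theta\) at the fiber tip, and the coupling efficiency is a sharp Gaussian in both; the lever arms, acceptance widths, and starting misalignment are all hypothetical numbers, not a real setup:

```python
import numpy as np

# Toy model of the two-mirror "walk": mirror angles (a1, a2) map linearly onto
# the spot position x and incidence angle th at the fiber tip. Lever arms,
# Gaussian acceptance widths, and the starting misalignment are hypothetical.
L1, L2 = 0.60, 0.20   # assumed distances [m] from far/near mirror to the fiber

def fiber_signal(a1, a2):
    x = L1 * a1 + L2 * a2            # both mirrors move the spot position...
    th = a1 + a2                     # ...and the incidence angle (coupled DOFs)
    return np.exp(-(x / 1e-4) ** 2 - (th / 1e-3) ** 2)

def walk(a1, a2, steps=200, kick=1e-3):
    rng = np.random.default_rng(1)
    best = fiber_signal(a1, a2)
    for _ in range(steps):
        trial1 = a1 + kick * rng.standard_normal()   # random kick on far mirror
        scan = a2 + np.linspace(-5e-3, 5e-3, 2001)   # 1D scan of near mirror
        trial2 = scan[np.argmax(fiber_signal(trial1, scan))]
        s = fiber_signal(trial1, trial2)
        if s >= best:                                # keep the kick only if improved
            a1, a2, best = trial1, trial2, s
    return a1, a2, best

a1, a2, best = walk(0.002, -0.002)
print(best)   # grows toward 1 as both mirrors are walked in
```

The key feature the toy reproduces is that a 1D scan of either mirror alone cannot reach the global optimum (the degrees of freedom are coupled), whereas alternating kicks and re-optimization does.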
Optical Tables & Breadboards
Small vibrations (e.g. footsteps, motors, etc.) can perturb the delicate alignment of optical systems, hence all optical components need to be firmly bolted down to an optical table (possibly with the aid of ferromagnetic bases). The top and bottom layers of an optical table are usually manufactured from some grade of stainless steel perforated by a square lattice of \(\text{M}6\) threaded holes with lattice parameter \(\Delta x=25\text{ mm}\) (recall that \(\text{M}D\times L\) is the standard notation for a metric thread of outer diameter \(D\text{ mm}\) and length \(L\text{ mm}\) and typically one assumes the thread pitch \(\delta\text{ mm}\) is the coarsest/largest one that is standardized for that particular thread diameter \(D\) so that the helix winds \(N=L/\delta\) times around, although \(\delta\) could be finer/smaller too, see this reference). The exact engineering details of how an optical table seeks to critically damp external vibrations is interesting, involving the use of pneumatic legs and several layers of viscoelastic materials sandwiched between the steel layers in a rigid honeycomb structure.
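The \(N=L/\delta\) convention for thread turns can be spelled out with the standard ISO coarse pitches (e.g. an M6 coarse thread has \(\delta=1.0\text{ mm}\)); a minimal sketch:

```python
# Number of helix turns N = L / delta for an MD x L metric thread, per the
# convention above. The pitches are the ISO standard coarse values.
coarse_pitch_mm = {"M3": 0.5, "M4": 0.7, "M5": 0.8, "M6": 1.0, "M8": 1.25}

def turns(designation: str, length_mm: float) -> float:
    """Turns of an MD x L screw, assuming the coarse (largest standard) pitch."""
    return length_mm / coarse_pitch_mm[designation]

print(turns("M6", 12))   # an M6 x 12 screw winds 12 times
```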

Optical breadboards are basically just smaller, less fancy versions of an optical table, mainly used for prototyping and for easier portability of a particular modular setup onto some main optical table.
Acousto-Optic Modulators (AOMs)
An acousto-optic modulator (AOM), also known as an acousto-optic deflector (AOD), is at first glance similar to a diffraction grating for light in the sense that if one shines some incident plane wave from a laser through the hole in the AOM, then out comes an \(m=0\) order mode in addition to \(m=\pm 1\) and occasionally higher-order modes too (the exact distribution of intensities among these harmonics will depend very sensitively on the incident angle that one shines the laser light at into the AOM).




However, despite being superficially similar to a diffraction grating, there are some notable differences; the first is that the Fraunhofer interference pattern of a diffraction grating typically occurs via a (\(2\)-dimensional) screen with a bunch of slits on it; here a (\(3\)-dimensional!) volume Bragg grating (VBG) is used instead, which in practice means some kind of glass attached to a piezoelectric transducer that drives the glass (i.e. applies periodic stress to it) at some radio frequency \(f_{\text{ext}}\sim 100\text{ MHz}\) via an external RF driver. This induces a periodic modulation in the glass’s refractive index \(n=n(x)\) where the “period” \(\lambda_{\text{ext}}=c_{\text{glass}}/f_{\text{ext}}\) over which \(n(x+\lambda_{\text{ext}})=n(x)\) corresponds to the wavelength of the sound waves, where \(c_{\text{glass}}\) is the phase velocity of sound waves in the glass.
Provided the light is incident at the Bragg angle \(\theta_B\approx\sin\theta_B\approx \lambda/2\lambda_{\text{ext}}\), then one has an effective crystal with interplanar spacing \(\lambda_{\text{ext}}\) and so the Bragg condition yields the angular positions of the constructive maxima of the Brillouin scattering:
\[2\lambda_{\text{ext}}\sin\theta_m=m\lambda\]
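Plugging illustrative numbers into this Bragg condition gives a feel for the scales involved. The drive frequency, sound speed (a TeO\(_2\)-like value), and wavelength below are assumptions, not specs of any particular AOM:

```python
import numpy as np

# Illustrative numbers for the Bragg condition above; all three inputs are
# assumptions (a 110 MHz drive, TeO2-like sound speed, 767 nm light).
f_ext = 110e6        # RF drive frequency [Hz]
c_glass = 4200.0     # assumed sound phase velocity in the crystal [m/s]
lam = 767e-9         # optical wavelength [m]

lam_ext = c_glass / f_ext                  # acoustic wavelength, ~38 um
theta_B = np.arcsin(lam / (2 * lam_ext))   # from 2*lam_ext*sin(theta_B) = lam

print(lam_ext * 1e6)        # grating period in microns
print(np.degrees(theta_B))  # Bragg angle, a fraction of a degree
```

The tiny Bragg angle is why AOM alignment is so sensitive: the deflection between orders is only of order \(10\text{ mrad}\).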
In addition, whereas for ordinary light incident on a diffraction grating the wavelength and frequency don’t change after diffraction, here because the photons either absorb or emit a phonon quasiparticle (respectively \(m=\pm 1\) orders), they do also accrue a slight Doppler shift in the frequency. When an AOM is labelled as being \(110\text{ MHz}\) for instance, it does not mean that the only Doppler shifts it is able to provide are exactly \(\pm 110\text{ MHz}\) but rather the diffraction efficiency \(\eta\) is greatest at this frequency, with some FWHM bandwidth \(\delta f_{\text{ext}}\) around this. For instance, for \(2\) AOMs in the lab, the following frequency response efficiency curves were measured (for both single pass and double pass, the latter of which should roughly be the square of the former).

AOMs are commonly used in a double-pass configuration, which means that light is passed through, then passed back again exactly along the trajectory it came in on. If the diffraction efficiency of the first order is \(\eta(\omega)<1\) at some frequency \(\omega=2\pi f\) ideally around the central \(\omega\) of the AOM (e.g. \(\omega=2\pi\times 110\text{ MHz}\)), then double-passing will lead to a reduced efficiency \(\eta^2(\omega)<\eta(\omega)\). Provided one picks out the right order (not always trivial to do; need to change the driving amplitude to see which order drops faster, and use geometrical ray optics arguments), this allows accruing a Doppler shift of \(2f_{\text{ext}}\) without sacrificing too much efficiency (if one tried to get this from the \(m=2\) mode on a single pass, one would lose a lot of efficiency). AOMs are also commonly used for Q-switching in lasers (i.e. as glorified switches b/c they can switch on nanosecond time scales).
Laser (Toptica) with massive DLC Pro driver? Talk about how lasers work + lasing requirements
Notes on how Zoran’s lab works:
- The UHV in the MOT and science cells are like \(10^{-11},10^{-13}\text{ mbar}\) respectively, measured by a current which is on the order of \(\text{nA}\) (but at such low pressures, with such few particles, one can argue that pressure fails to even be a well-defined quantity).
- There are \(4\) AOM drivers for D1 cooling/repump and D2 cooling/repump light. Each has frequency, TTL, and amplitude control which need to be connected to analog channels like AO1, AO2, etc. which in turn are controlled in Cicero.
- Laser goggles have certain wavelength ranges over which they block best. The ODT uses 767 nm red light, but the box trap uses 532 nm green light.
- The Toptica laser controller is one component of a PID control loop.
- First, saturated absorption spectroscopy (the double-pass setup in the absorption cell) is used to get Doppler-free \(\lambda_{D1},\lambda_{D2}\) signals. This requires heat because the K-39 must be in gaseous form (otherwise it's just K-39 liquid/solid sitting at the bottom of the tube); the heating is achieved by winding coils around the cell and passing a large current through them, relying on the resultant Joule heating (for K-39 one needs roughly human body temperature, \(35-40^{\circ}\text{ C}\)). The signals are fed to photodiodes, which send them to the Toptica laser controller, which in turn sends them to the Toptica software that's used for laser locking.
- Need to lock the laser b/c a piezoelectric crystal has some voltage applied to it that causes mechanical deformation, moving the distance between the 2 mirrors, but over time it can drift due to temperature fluctuations, etc.
- The photodiodes need to be powered (by old car battery in this case) and also a separate cable which feeds into Toptica laser controller (it is also this cable which has the extra resistor at its end…I think idea is that the photodiode converts absorption signal into a photocurrent that flows across the resistor, and gets converted into a voltage…note that it’s a BNC cable, and most BNC cables already have some internal resistance, so this resistor really is just an extra resistor which I guess is to decrease the “gain” in some sense?).
- Kibble-Zurek mechanism?
- Anything in the lab (e.g. PCs, soldering irons, vacuum pumps, all kettle plugs, etc.) connected to AC mains needs to be PAT tested.
- There are \(4\) sets of coils in the experiment. In chronological order of use, they are:
- Quadrupole field coils (both \(x\),\(y\) and \(z\)) for the MOT and magnetic trapping.
- Guide field coils (to impose a quantization axis?) on MOT side for pumping and on the imaging side.
- Feshbach (“Fesh”) field coils for the science cell (to exploit Feshbach resonance of hyperfine states in order to tune s-wave scattering length).
- Compensation coils in \(x,y,z\) (the \(z\) compensation coil is also called “anti-\(g\)” coil for obvious reasons).
- Speedy coils? For quantum quench experiments?
- One of the coils cancels the curvature in the Feshbach coils.
- Each of these coils obviously requires a very bulky power supply.
- Igor’s thesis should contain more information about the coils.
- The track (arm which moves the magnetically trapped atoms) has \(3\) states, START, MOVE, MOVE2, and ENERGIZE? There is a track control box connected to the analog channels which one can use to control how the track moves in Cicero during an experimental sequence.
- Regarding water cooling of the experiment, the water is already pressurized, so adding a pump would only slow it down?
- The pipes also contain flow meters which monitor the flow rate \(|\textbf v|\) of the water (not sure how?), and send this information to a logic circuit which also uses temperature control. Will suddenly stop all current flowing through Feshbach coils if it detects that some thresholds are breached on both; thus, behaves as a current-controlled switch, aka a transistor, and more precisely they are IGBTs (insulated-gate bipolar transistor) because it turns out only these transistors are rated for the kinds of currents being used here.
- For all the coils, one frequently would like to switch them off suddenly. If you just do this directly, the significant inductance \(L\) of the coils will lead to a substantial back emf that would destroy the PSU. Hence the need for an alternative path for current to flow, which is why we also have a capacitor in parallel?
- Apparently, the light inside an optical fiber can also heat the fiber enough to melt it…
- There can be up to \(I\sim 200\text{ A}\) of current flowing through the Feshbach coils, with \(V=400\text{ V}\)…the whole circuit is low-resistance so if you touch it, it's probably not lethal, but still better to be safe.
The D1, D2 cooling and repump light must first get the required frequency shifts; then it all gets coupled simultaneously into a TA (tapered amplifier), which should be seeded at all times and is externally controlled by a current knob \(I\) that dictates how much amplification \(A=A(I)\) it gives to the laser power. This is all then coupled into a polarization-maintaining optical fiber that goes into an optical fiber port cluster (FPC) (see the ChatGPT blurb about it), which is basically a compact setup of mirrors/lenses/polarizing beamsplitters (Chris says conceptually it's not hard to build one yourself, just that you save time with a company at the cost of double the price cf. self-building; similar remarks even apply to e.g. a laser, which can be self-built and indeed many labs do that, it just takes time). This then takes the incident light from the fiber and redistributes it into \(6\) beams of roughly equal power for the MOT (i.e. the “O” in “MOT”).
The MOT loading time \(\Delta t_{\text{load}}\) is the time to load the MOT from the vapor of K-39 atoms that sits at some background pressure \(p_0\) and temperature \(T_0\). Some exponential “charging curve” \(1-e^{-t/\Delta t_{\text{load}}}\)? Also, normally you gauge how well the MOT is working (and decide when it needs to fire again) by measuring the atom number in the BEC in the science cell. If the science cell isn't working, what you can instead do is measure an initial \(I_0\) from absorption spectroscopy, then do magnetic transport of the atoms to the science cell and back to the MOT, and measure \(I\); the recapture efficiency of the MOT is then \(I/I_0\).
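The exponential charging-curve idea can be sketched as a fit; the data below are synthetic stand-ins (assumed steady-state atom number and loading time), not measurements:

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch of fitting the MOT "charging curve" N(t) = N_ss * (1 - exp(-t/tau)).
# The data are synthetic stand-ins with 2% multiplicative noise.
def loading(t, n_ss, tau):
    return n_ss * (1.0 - np.exp(-t / tau))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)                                   # seconds
n = loading(t, 1e9, 2.0) * (1 + 0.02 * rng.standard_normal(t.size))

popt, _ = curve_fit(loading, t, n, p0=(5e8, 1.0))
n_ss_fit, tau_fit = popt
print(n_ss_fit, tau_fit)   # recovers N_ss ~ 1e9 and tau ~ 2 s
```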
Also, in the science cell, one-body losses are very significant. Relative to the BEC, the surrounding thermal cloud acts as an effectively infinite-energy heat bath, so if any such atom collides with an atom in the BEC, it will remove it… (I guess thermalization is always happening, and at the microscopic/kinetic level what this looks like is precisely one-body losses).
One very effective practice/way to learn more about how any lab with a bunch of cables/wires works is to just trace/route wires, one at a time, to gain some sense for how different components are connected to each other.
General EQ Stuff
If you’re building a new machine/experiment, you need to make the shop ppl’s life “living hell”: ask about stock available, be persistent, ask “can you get it to me by tomorrow”, etc., and don’t leave it to the point that they have to reach out to some more senior ppl, or stuff will never get done. The example in this case was boards to enclose the perimeter of the optical table with; some were not the right size, so we were looking for companies to get new ones from. Simon found a company, even more quickly found that they had a contact, and just called them right away and got the order sorted out very efficiently.
Beer-Lambert Law & Radiative Broadening
In cold atom experiments, one very basic question one can ask is, given some atom cloud, what is the number of atoms \(N\) in the cloud? One way is to basically shine some light on the atom cloud and see how much is absorbed. This absorption effect is quantified by the Beer-Lambert law:
\[I(z)=I(0)e^{-n\sigma z}\]
where \(n=N/V\) is the number density of atoms in the cloud of volume \(V\) and \(\sigma=\sigma(\omega_{\text{ext}})\) is the optical absorption cross-section presented by each atom in the cloud to incident monochromatic light of frequency \(\omega_{\text{ext}}\).
It is instructive to derive the Beer-Lambert law from first principles. In particular, the derivation is meant to emphasize that, for the most part, one can basically just think of the Beer-Lambert law as a mathematical theorem about probabilities, with some quantum mechanical asterisks to that statement. To get a sense of this, consider first a \(2\)D version of the Beer-Lambert law, in which one has an atom cloud confined to a plane, along with an incident beam of photons of frequency \(\omega_{\text{ext}}\) travelling along the (arbitrarily defined) \(z\)-direction.

The (average) number density of atoms is \(n\) (units: \(\text{atoms}/\text m^2\)) and each atom can be thought of as a “hard circle” with diameter \(\sigma\) (units: \(\text m/\text{atom}\)). In that case, in a small strip of width \(dz\), there will be \(ndz\) atoms per unit length along the strip, or equivalently the average interatomic spacing is \(1/ndz\) along the strip (see the picture). The probability that a given photon “collides” with such an atom is therefore \(\sigma/(1/ndz)=n\sigma dz\); such photons are depicted red on the diagram, while those that make it through the first layer \(dz\) are depicted green. Over many photons, this manifests as a loss \(dI<0\) in their collective intensity \(I\) across the layer \(dz\), so one may equate the fractional loss of intensity with the absorption probability:
\[\frac{dI}{I}=-n\sigma dz\]
for which the solution of this ODE yields the Beer-Lambert law:
\[I(z)=I(0)e^{-n\sigma z}\]
where \(1/n\sigma\) is the length scale of this exponential attenuation in the beam intensity. Of course, this argument generalizes readily to the \(3\)D case where now \(n\) (units: \(\text{atoms}/\text m^3\)) is the number density of atoms in \(\textbf R^3\) and \(\sigma\) (units: \(\text m^2/\text{atom}\)) is now the optical cross-section presented by each atom. As stressed earlier, there isn’t really much physics going on here, it’s just a statement about the statistics of a \(3\)D Galton board.
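The "Galton board" reading above can be verified directly by Monte Carlo: give each photon an independent survival probability \(1-n\sigma\,dz\) per slab and check that the survivors follow the exponential. All numbers below are arbitrary illustrative choices:

```python
import numpy as np

# Monte Carlo check of the "3D Galton board" reading of the Beer-Lambert law:
# each photon independently survives a slab of thickness dz with probability
# 1 - n*sigma*dz. All numbers are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n_sigma = 2.0            # attenuation coefficient n*sigma [1/m]
dz = 0.01                # slab thickness [m]
n_photons = 200_000
steps = 100              # total depth z = 1 m

alive = np.ones(n_photons, dtype=bool)
for _ in range(steps):
    absorbed = rng.random(n_photons) < n_sigma * dz
    alive &= ~absorbed

frac_end = alive.mean()
print(frac_end, np.exp(-n_sigma * steps * dz))   # both ~ e^-2 = 0.135
```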
At this point however, one would like to introduce some quantum mechanical modifications to this simple Beer-Lambert law. As usual, suppose the laser light \(\omega_{\text{ext}}\) is not too detuned from a particular atomic transition \(\omega_{01}\) between some ground state \(|0\rangle\) and some excited state \(|1\rangle\) in each of the atoms in the cloud (also assume for simplicity that both \(|0\rangle\) and \(|1\rangle\) are non-degenerate). In that case, it makes sense to distinguish \(n=n_0+n_1\) between the number density \(n_0\) of atoms in the ground state \(|0\rangle\) vs. the number density \(n_1\) of atoms in the excited state \(|1\rangle\) since only the atoms in the ground state \(|0\rangle\) can absorb the incident photons, after which they go into the excited state \(|1\rangle\) and so are no longer able to absorb any more photons. Thus, one might think that the correct form of the Beer-Lambert law should be:
\[\frac{dI}{I}=-n_0\sigma dz\]
But this is forgetting that atoms in the excited state \(|1\rangle\) can undergo stimulated emission too back down to the ground state \(|0\rangle\) (and in the steady state, recall from Einstein’s statistical argument that the rates of stimulated absorption and emission are equal). In contrast to absorption, this would have the effect of actually increasing the intensity \(I\) because the atom emits a photon back into the beam. Thus, the correct form of the Beer-Lambert law is actually:
\[\frac{dI}{I}=-n_0\sigma dz+n_1\sigma dz=(n_1-n_0)\sigma dz\]
where by time-reversal symmetry the optical cross-section \(\sigma\) is the same for both stimulated absorption and emission. In the steady state (i.e. when \(\dot n_0=\dot n_1=0\) reach an equilibrium), it is clear that one must also have \((n_0-n_1)\sigma I=n_1\Gamma\hbar\omega_{\text{ext}}\) where \(\Gamma=A_{10}\) is the rate of spontaneous emission/decay from the excited state \(|1\rangle\) back down to the ground state \(|0\rangle\) (note that it really is \(\hbar\omega_{\text{ext}}\) and not \(\hbar\omega_{01}\) in the formula; whatever frequency an atom absorbs must also be what it emits by energy conservation). On the other hand, also in the steady state, the optical Bloch equations assert that:
\[\rho_{11}=\frac{n_1}{n}=\frac{1}{2}\frac{s}{1+s+(2\delta/\Gamma)^2}\]
where \(s=I/I_{\text{sat}}=2(\Omega/\Gamma)^2\) is the saturation. Combining these two expressions allows one to obtain an explicit formula for how the optical cross-section \(\sigma\) depends on the “driving frequency” \(\omega_{\text{ext}}\) of the incident photons in e.g. a laser:
\[\sigma(\omega_{\text{ext}})=\frac{1}{1+(2\delta/\Gamma)^2}\frac{\hbar\omega_{\text{ext}}\Omega^2}{\Gamma I}\]
where there is also an \(\omega_{\text{ext}}\)-dependence hiding in the detuning \(\delta=\omega_{\text{ext}}-\omega_{01}\). At first glance, this seems to suggest that the optical cross-section \(\sigma\), in addition to depending on \(\omega_{\text{ext}}\) also depends on the intensity \(I\) of the incident photons, but actually this is an illusion, because the Rabi frequency \(\Omega\) also depends on \(I\) in such a way that the two effects cancel out so as to actually make \(\sigma\) independent of \(I\). To see this, recall that the time-average of the Poynting vector over a period \(2\pi/\omega_{\text{ext}}\) is \(I=\varepsilon_0 c|\textbf E_0|^2/2\) and that the Rabi frequency is \(\hbar\Omega=e\textbf E_0\cdot \langle 1|\textbf X|0\rangle\). The unsightly presence of the matrix element can be further removed by recalling that (in the dipole approximation) one has \(\Gamma=4\alpha\omega_{01}^3|\langle 1|\textbf X|0\rangle|^2/3c^2\). Therefore, in the best case where the incident light is polarized along the dipole moments of the atoms, then \(\Omega^2=e^2|\textbf E_0|^2|\langle 1|\textbf X|0\rangle|^2/\hbar^2\). If on the other hand the incident light were unpolarized or the atoms in the cloud were randomly oriented, then isotropic averaging would contribute an additional factor of \(1/3\):
\[\langle\cos^2\theta\rangle_{S^2}=\frac{1}{4\pi}\int_0^{2\pi}d\phi\int_0^{\pi}d\theta\cos^2\theta\sin\theta=\frac{1}{3}\]
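The isotropic average just computed can be sanity-checked numerically by sampling uniform directions on the sphere instead of doing the integral:

```python
import numpy as np

# Numerical check of <cos^2 theta> = 1/3 over the sphere: normalized Gaussian
# vectors are uniformly distributed on S^2, so just average the z-component^2.
rng = np.random.default_rng(0)
v = rng.standard_normal((1_000_000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)   # uniform unit vectors on S^2
cos2 = v[:, 2] ** 2                             # cos(theta) = z-component
print(cos2.mean())                              # ~ 1/3
```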
Sticking to the best case scenario (which can be thought of as an upper bound if one likes though it is experimentally the typical situation since one often tries to maximize \(\sigma\) anyways), this leads to the explicitly \(I\)-independent form of the optical cross-section:
\[\sigma(\omega_{\text{ext}})=\frac{1}{1+(2\delta/\Gamma)^2}\frac{6\pi\omega_{\text{ext}}c^2}{\omega_{01}^3}\]
so the optical cross-section takes its maximum value at \(\omega_{\text{ext}}=\sqrt{\omega_{01}^2+(\Gamma/2)^2}\) but because the line width \(\Gamma\ll\omega_{01}\) is typically much less than the transition frequency itself, this is basically just \(\omega_{\text{ext}}\approx \omega_{01}\) so the maximum cross-section \(\sigma_{01}\) occurs on resonance and is given by:
\[\sigma_{01}=\sigma(\omega_{01})=\frac{6\pi c^2}{\omega_{01}^2}=\frac{3\lambda_{01}^2}{2\pi}\]
This also allows one to approximate the spectrum of the optical cross-section \(\sigma\) as just a Lorentzian profile centered at \(\omega_{\text{ext}}\approx \omega_{01}\) with \(\Gamma\) being its FWHM:
\[\sigma(\omega_{\text{ext}})\approx\frac{\sigma_{01}}{1+(2\delta/\Gamma)^2}\]
Typical transition wavelengths (e.g. visible light) might be around \(\lambda_{01}\sim 10^{-7}\text{ m}\) which far exceeds the length scale \(\sim a_0\sim 10^{-11}\text{ m}\) of the individual atoms themselves. The corresponding optical cross-section \(\sigma_{01}\sim\lambda_{01}^2\) is thus much larger than the actual “size” of the atoms themselves, so this emphasizes another quantum mechanical discrepancy to the classically-minded picture where \(\sigma\) would have just been interpreted as the size of individual “hard sphere” atoms (and in that case it wouldn’t have any \(\omega_{\text{ext}}\)-dependence in the first place). Moreover, the fact that near resonance \(\sigma\) is much larger than the atoms themselves also helps to ensure laser cooling actually works since it gives each photon more “leeway” in that it doesn’t need to hit an atom “head-on” to be absorbed, but merely has to pass within the cross-section \(\sigma\).
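The disparity between \(\sigma_{01}\) and the geometric atomic size is quick to quantify; the sketch below uses \(767\text{ nm}\) (the K-39 D-line wavelength that appears elsewhere in these notes) purely as a representative value:

```python
import numpy as np

# Evaluating sigma_01 = 3*lambda^2/(2*pi) for a 767 nm transition and comparing
# with the geometric "size" of the atom (~ Bohr radius squared).
lam = 767e-9       # transition wavelength [m]
a0 = 5.29e-11      # Bohr radius [m]

sigma_01 = 3 * lam ** 2 / (2 * np.pi)
print(sigma_01)             # ~2.8e-13 m^2
print(sigma_01 / a0 ** 2)   # ~1e8: far larger than the atom's geometric size
```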
Intensity Saturation & Broadening
At low incident intensities \(s\ll 1\), spontaneous emission dominates stimulated absorption/emission \(\Gamma\gg\Omega\) and so any atom which is excited from the ground state \(|0\rangle\) into the excited state \(|1\rangle\) will quickly decay back down to the ground state \(|0\rangle\) by spontaneous emission. However, as one ramps up the laser intensity to saturation \(s\to 1\) and even \(s>1\), although there is a cap \(\rho_{11}<1/2\) on the excited state population, nevertheless the ground state population \(\rho_{00}\to 1/2\) will have depleted so much that there won’t be that many atoms left to absorb any more incident photons, so one would expect the sample to get worse and worse at absorbing incident photons. Recalling that \(s=I/I_{\text{sat}}=2(\Omega/\Gamma)^2\) (note that in the optimal case \(I_{\text{sat}}=\hbar\omega_{01}^3\Gamma/12\pi c^2\) but importantly is an intrinsic property of the atomic transition that scales with the transition frequency as \(I_{\text{sat}}\propto\omega_{01}^6\) due to the extra factor of \(\omega_{01}^3\) in \(\Gamma\)), it is clear that when \(s\to 1\), the Rabi frequency \(\Omega\) grows to the point of being comparable with the spontaneous decay rate \(\Gamma\), so now stimulated emission starts competing with spontaneous emission. In order to see this mathematically, it is useful to look at the absorption coefficient whose reciprocal directly governs the length scale of attenuation in the Beer-Lambert law:
\[(n_0-n_1)\sigma=\frac{n\sigma_{01}}{1+s}\frac{1}{1+(2\delta/\Gamma\sqrt{1+s})^2}\]
This is just another Lorentzian similar to the cross-section \(\sigma(\omega_{\text{ext}})\) itself. But there’s a crucial difference; whereas the FWHM of the Lorentzian for \(\sigma\) was fixed at \(\Gamma\), here it is \(\Gamma\sqrt{1+s}\); but this is now dependent on the laser intensity \(s\), causing the Lorentzian to broaden as \(s\) increases (this is exactly the same kind of broadening seen in \(\rho_{11}\); one difference though is that while \(\rho_{11}\to 1/2\) saturates, here the resonant absorption coefficient \(n\sigma_{01}/(1+s)\) just decreases monotonically as \(s\) is ramped up).
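The power-broadened width and the suppressed peak can be tabulated together; the \(6\text{ MHz}\) natural linewidth below is a typical alkali value used purely for illustration:

```python
import numpy as np

# Power broadening of the absorption line: FWHM grows as Gamma*sqrt(1+s) while
# the resonant absorption coefficient drops as 1/(1+s). The 6 MHz linewidth is
# a typical alkali value, assumed for illustration.
gamma = 2 * np.pi * 6.0e6     # natural linewidth [rad/s]

for s in (0.0, 1.0, 10.0):
    fwhm = gamma * np.sqrt(1 + s)
    peak = 1.0 / (1 + s)      # resonant (n0 - n1)*sigma relative to n*sigma_01
    print(s, fwhm / (2 * np.pi * 1e6), peak)   # FWHM in MHz
```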
Finally, one can revisit the original Beer-Lambert law \(I(z)=I_0e^{-n\sigma z}\) and ask what becomes of it after all the modifications; from the expression for the absorption coefficient above, one has:
\[\frac{ds}{dz}=-\frac{n\sigma_{01}s}{1+s+(2\delta/\Gamma)^2}\]
In terms of the line-of-sight atomic column density \(n_c(z):=\int_0^zn(z')dz'\), this ODE is trivial to integrate on resonance (\(\delta=0\)):
\[n_c\sigma_{01}=\ln\frac{I_0}{I}+\frac{I_0-I}{I_{\text{sat}}}\]
where \(I_0:=I(z=0)\) is the incident irradiance. The quantity \(\ln I_0/I\) is often called the optical density (OD) in AMO physics, or the absorbance in chemistry. In practice, this formula cannot just be used as is, but rather requires calibrating for the polarization, detuning fluctuations, optical pumping losses, etc. by sweeping over a range of incident intensities \(I_0\) and, using some known atom number \(N=n_cA\) obtained by other methods, choosing \(I_{\text{sat}}\) so that \(n_c\sigma_{01}\) is approximately invariant for all \(I_0\) and corresponding \(I\).
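The per-pixel version of this atom-counting procedure can be sketched as follows; the image arrays, effective saturation intensity, and pixel area are all synthetic placeholders, not calibrated values:

```python
import numpy as np

# Per-pixel sketch of atom counting via n_c*sigma_01 = ln(I0/I) + (I0 - I)/I_sat.
# The images, saturation intensity, and pixel area are synthetic placeholders.
sigma_01 = 2.8e-13          # resonant cross-section [m^2]
i_sat = 17.5                # assumed effective saturation intensity [W/m^2]
pixel_area = (1e-6) ** 2    # atom-plane area imaged by one pixel [m^2]

i0 = np.full((4, 4), 30.0)  # reference image, no atoms [W/m^2]
i = np.full((4, 4), 10.0)   # absorption image, with atoms [W/m^2]

od = np.log(i0 / i)                          # optical density per pixel
n_c = (od + (i0 - i) / i_sat) / sigma_01     # column density [atoms/m^2]
n_atoms = (n_c * pixel_area).sum()           # total atom number
print(n_atoms)
```

In practice \(I_{\text{sat}}\) here would be the calibrated effective value described above, not the bare two-level one.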
Oversaturated Absorption Imaging of Atomic Clouds
The purpose of this post is to describe the relevant theory needed to understand the paper “High signal to noise absorption imaging of alkali atoms at moderate magnetic fields” by Hans et al. In particular, a key paper which they cite that details the calibration of the absorption imaging setup is “Strong saturation absorption imaging of dense clouds of ultracold atoms” by Reinaudi et al. Another useful resource is the PhD dissertation of Hans which goes into more depth on details that are omitted in their paper.
Atomic Structure of \(^{39}\text K\)
The alkali atom isotope \(^{39}\text K\) has fixed, non-negotiable electron spin \(s=1/2\) and nuclear spin \(i=3/2\); hence it is bosonic \(s+i=2\). Within the gross \(n\)-manifold for \(n=4\), consider either the \(4s_{1/2}\) or \(4p_{1/2}\) fine \(j\)-manifolds for \(j=1/2\). In both cases, there are two hyperfine \(f\)-manifolds corresponding to total atomic angular momenta \(f=1,2\). In the strict absence \(\textbf B=\textbf 0\) of an external magnetic field, the \(f=1\) hyperfine manifold has \(3\) degenerate \(m_f\)-sublevels corresponding to projections \(m_f=-1,0,1\) of the total atomic angular momentum along some arbitrary \(z\)-axis, while the \(f=2\) hyperfine manifold has \(5\) degenerate \(m_f\)-sublevels corresponding to \(m_f=-2,-1,0,1,2\). However, upon turning \(\textbf B\neq\textbf 0\) on with \(B:=|\textbf B|\), the Breit-Rabi formula asserts that the \(2f+1\)-fold degeneracy among the Zeeman sublevels within each hyperfine \(f\)-manifold is lifted exactly according to the trajectories:
\[\Delta E_{|f=3/2\pm 1/2,m_f\rangle}(B)=-\frac{A}{4}\pm\frac{1}{2}\sqrt{4A^2+2m_fAg_j\mu_BB+(g_j\mu_BB)^2}\]
where \(g_j=2\) for \(4s_{1/2}\) and \(g_j=2/3\) for \(4p_{1/2}\), and \(A\approx h\times 230.859860\text{ MHz}\) for \(4s_{1/2}\) whereas \(A\approx h\times 27.793\text{ MHz}\) for \(4p_{1/2}\) (see the data for \(^{39}\text K\) here).
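As a quick numerical sketch (function and variable names here are hypothetical), the Breit-Rabi formula above can be evaluated directly; at \(B=0\) it reproduces the well-known \(2A\approx 461.7\text{ MHz}\) zero-field hyperfine splitting of the \(^{39}\text K\) ground manifold:

```python
import numpy as np

h = 6.62607015e-34        # Planck constant (J s)
mu_B = 9.2740100783e-24   # Bohr magneton (J/T)

A_4S = h * 230.859860e6   # 4s_{1/2} hyperfine constant (J)
A_4P = h * 27.793e6       # 4p_{1/2} hyperfine constant (J)

def breit_rabi_shift(B, m_f, branch, A, g_j):
    """Energy shift (in Hz) of the |f = 3/2 +/- 1/2, m_f> sublevel at field
    B (in tesla); branch = +1 selects f = 2 and branch = -1 selects f = 1."""
    x = g_j * mu_B * B
    return (-A / 4 + branch * 0.5 * np.sqrt(4 * A**2 + 2 * m_f * A * x + x**2)) / h

# Zero-field splitting between the f = 2 and f = 1 hyperfine manifolds:
splitting = breit_rabi_shift(0.0, 0, +1, A_4S, 2) - breit_rabi_shift(0.0, 0, -1, A_4S, 2)
# splitting == 2 * A_4S / h, about 461.7 MHz
```

Evaluating the same function at, e.g., \(B=395\text{ G}=0.0395\text{ T}\) gives the sublevel shifts relevant for choosing AOM frequency offsets.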

- Intuitively, the reason the \(m_f\) sublevels appear inverted in the \(4s_{1/2}\), \(f=1\) hyperfine manifold is that \(g_f=-1/2<0\) is negative, and to first order the Zeeman shift is \(g_fm_f\mu_BB\) (stemming from \(-\boldsymbol{\mu}\cdot\textbf B\) with charge \(q=-e\)).
- 2D scan of \(I_{\sigma^+}+I_{\sigma^-}\) vs. \(I_{\sigma^+}/I_{\sigma^-}\); in an ideal world the measured OD should be constant across the entire space (as measured at low field), but…
- The AOM driver is currently controlled by varying a potentiometer \(R_2\) that sets the voltage at the midpoint of a voltage divider; this voltage is fed into a voltage-controlled oscillator circuit that effectively maps \(V\to\omega_{\text{ext}}\) to RF-drive the AOM. Flicking the switch hands control from the voltage-divider circuit to a computer running the Cicero Word Generator GUI for AMO physics experiments.
- Because \(\Gamma\propto\omega_{01}^3\), the natural line width of the RF transition (on the order of \(400\text{ MHz}\)) between potassium-39 hyperfine states is practically zero compared with that of optical/visible (hundreds-of-THz) transitions.
- Need to first lock onto the right \(B\)-field (\(395\text{ G}\)) by doing a frequency sweep. Once that is locked, impose the correct frequency shifts on the AOMs (there is substantial line width/leeway here, something like \(6\text{ MHz}\)?); this will require a second frequency sweep to find the maximum SNR, centered around roughly where we expect it to be anyway (show calculation for this).

- Panos’s thesis did pixel-by-pixel calibration.
- https://www.tobiastiecke.nl/archive/PotassiumProperties.pdf
- Description of the experiment:
- The idea is that one would like to do spin-resolved polaron injection spectroscopy.
- D\(1\) repump light goes from the \(4s_{1/2}\) manifold (typically the \(|1,1\rangle\) state, used for its broad Feshbach resonance) to the \(4p_{1/2}\) manifold state \(|2,2\rangle\). D\(2\) imaging light goes from \(4s_{1/2}\) to the \(4p_{3/2}\) stretched state \(|3,3\rangle\).
- The D\(2\) laser light is first incident on a \(\lambda/2\) waveplate which rotates the polarizations so that some light enters each arm of a double-pass AOM setup, producing \(\pm 220\text{ MHz}\) shifts. These beams are then incident on a D\(2\) flip mirror which redirects the D\(2\) light into the modular optical breadboard setup we built. Specifically, the crossed polarizations hit a \(\lambda/2\) waveplate that rotates a certain amount of polarization into each of two double-pass AOM arms: one branch is shifted by \(+220\text{ MHz}\) in total (after the double pass) while the other is shifted by \(-220\text{ MHz}\). When aligning, it is therefore essential to maximize the correct order \(m=\pm 1\); check this by turning on the TTL switch of the driver to see which order is left just before the iris. The two beams then need to be overlapped onto an output fiber, via another \(\lambda/2\) waveplate and a PBS; the PBS throws away \(P/2\), but with the benefit of a single polarization propagating through the polarization-maintaining fiber and directly into the science cell. This waveplate also allows optimizing \(I_{\sigma^+}/I_{\sigma^-}\).
- The AOM drivers are controlled by a digital channel (for TTL switching from Cicero) and an analog channel (for Cicero to set the driving amplitude of the AOMs: Janet for the \(-220\text{ MHz}\) branch, Billy for the \(+220\text{ MHz}\) branch). In Cicero, the Override option for the D\(2\) flip mirror needs to be checked, but with its value off for the mirror to be down. Also, overriding a digital channel takes effect automatically, whereas overriding an analog channel must be explicitly specified.
- If one wishes to abort a given sequence, it is best to tick the box and, when the sequence finishes (usually around \(30\) seconds), quickly close it and click “restart sequence” to start up a new sequence (to keep the coils heated).
- There are quadrupole coils (seemingly 4 pairs?) in an anti-Helmholtz configuration for the MOT, and Feshbach coils that produce the field for the broad \(|1,1\rangle\) Feshbach resonance to tune \(a\) (Cicero stores empirical calibrations between the applied coil voltage and the resulting \(B\to a\)).
- Optical dipole trap (ODT): the light for it is dangerous IR (around \(1\text{ W}\) of power, enough to burn skin).
- “Walking the beam” (draw schematic): turn, say, \(\phi_1\) and see which direction \(\phi_2\) needs to go to keep the same voltage; do the same for \(\theta_1,\theta_2\); adjust the collimation at the end.
- Fiber pen, fiber cleaning kit (microscope; never look into it if the other end of the fiber is coupled to light, or you risk permanent eye damage).
- To actually make the optical box trap of green light, shine light onto a spatial light modulator (SLM: an array of liquid crystals applying a programmable phase, via a Freedericksz transition?; a bit like a DMD except the crystals rotate more slowly, so the response time is poor). The box is not a perfect cylinder; it is more like the waist of a Gaussian beam (the length of \(40\) microns or so is the Rayleigh distance \(z_R\)), and the sides are given by steep power-law potentials. A series of lenses of various \(f\) act as Fourier transformers, so that the light field at the focal plane is the Fraunhofer pattern of the SLM grating.
- When locking onto, say, the D\(2\) laser, one has an absorption cell of solid \(\text K(s)\) with melting point around \(63^{\circ}\text{ C}\). Doppler-free spectroscopy allows resolving the atomic resonance without Doppler broadening; the derivative of the signal is physically measured, and two PID controllers on different time scales are used to lock.
Ideal Fermi Gases
Consider a non-interacting gas of identical fermions (e.g. electrons \(e^-\)); this is called an ideal Fermi gas. Because the Pauli exclusion principle prohibits identical fermions from occupying the same quantum state, the grand canonical partition function \(\mathcal Z\) for an ideal Fermi gas is just:
\[\mathcal Z=\prod_{|k\rangle\in\mathcal H_0}\sum_{N_k=0,1}e^{-\beta(N_kE_k-\mu N_k)}=\prod_{|k\rangle\in\mathcal H_0}\left(1+e^{-\beta(E_k-\mu)}\right)\]
From which the grand canonical potential is:
\[\Phi=-\frac{1}{\beta}\ln\mathcal Z=-\frac{1}{\beta}\sum_{|k\rangle\in\mathcal H_0}\ln\left(1+e^{-\beta(E_k-\mu)}\right)\]
And the average number of fermions is:
\[\langle N\rangle=-\frac{\partial\Phi}{\partial\mu}=\sum_{|k\rangle\in\mathcal H_0}\frac{1}{e^{\beta(E_k-\mu)}+1}\]
from which one immediately reads off the Fermi-Dirac distribution of the Fermi occupation numbers of each of the single-fermion states \(|k\rangle\):
\[\langle N_k\rangle=\frac{1}{e^{\beta(E_k-\mu)}+1}\]
It is remarkable that a mere sign change in the denominator from the Bose-Einstein distribution is all that is needed to enforce the Pauli exclusion principle. Unlike for the ideal Bose gas where the chemical potential \(\mu<0\) had to be negative, for the Fermi-Dirac distribution \(\mu\in\textbf R\) can be anything.
Just as with the ideal Bose gas, for an ideal Fermi gas one would like to approximate the series with integrals (called the Thomas-Fermi approximation) \(\sum_{|k\rangle\in\mathcal H_0}\mapsto\int_0^{\infty}g(E)dE\). Taking the ideal Fermi gas to be non-relativistic, one has the density of states:
\[g(E)=\frac{g_sm^{3/2}V}{\sqrt{2}\pi^2\hbar^3}\sqrt{E}\]
where \(g_s=2s+1\) is a spin degeneracy factor (which has to be explicitly included for fermions by virtue of the spin-statistics theorem \(s=1/2,3/2,5/2,…\) and the fact that the free Hamiltonian \(H=T\) commutes with \(\textbf S^2\)). In the grand canonical ensemble, one thus has for an ideal Fermi gas:
\[\Phi=\frac{g_sV}{\beta\lambda^3}\text{Li}_{5/2}(-z)\]
\[\langle N\rangle=-\frac{g_sV}{\lambda^3}\text{Li}_{3/2}(-z)\]
\[\langle E\rangle=-\frac{3g_sV}{2\beta\lambda^3}\text{Li}_{5/2}(-z)\]
from which one obtains \(pV=\frac{2}{3}E\) for an ideal Fermi gas as was the case for the ideal Bose gas (and the ideal classical gas). In the high-temperature \(T\to\infty\) limit \(z\to 0\), one finds that, similar to the ideal Bose gas, the ideal Fermi gas looks like an ideal classical gas, at least to first order in the virial expansion (at second order, the quantum correction actually increases the pressure of the ideal Fermi gas whereas it was decreasing for the ideal Bose gas):
\[pV=NkT\left(1+\frac{\lambda^3N}{4\sqrt{2}g_sV}+O\left(\frac{N}{V}\right)^2\right)\]
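As a numerical sanity check (not part of the original derivation), the polylogarithm series \(\text{Li}_s(x)=\sum_{n\ge 1}x^n/n^s\) lets one verify this virial expansion directly, using \(\lambda^3\langle N\rangle/(g_sV)=-\text{Li}_{3/2}(-z)\) and \(pV/\langle N\rangle kT=\text{Li}_{5/2}(-z)/\text{Li}_{3/2}(-z)\) from the grand canonical expressions above:

```python
def polylog(s, x, n_terms=500):
    """Series Li_s(x) = sum_{n>=1} x^n / n^s, convergent for |x| < 1."""
    return sum(x**n / n**s for n in range(1, n_terms + 1))

z = 0.01  # small fugacity, i.e. the near-classical high-temperature limit
ratio = polylog(2.5, -z) / polylog(1.5, -z)          # pV / (N k T)
virial = 1 + (-polylog(1.5, -z)) / (4 * 2**0.5)      # 1 + lambda^3 N / (4 sqrt(2) g_s V)
# ratio and virial agree up to O(z^2) corrections
```

Note that `ratio > 1`: the quantum correction indeed increases the pressure of the ideal Fermi gas relative to the classical value.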
In order to see more interesting, non-classical physics, it will as usual be necessary to look in the low-temperature limit \(T\to 0,z\to 1\). In fact, to start, one may as well look directly at the case of absolute zero \(T=0\). In this case, the ideal Fermi gas is said to be degenerate. At a glance, this is because the Fermi-Dirac distribution for the Fermi occupation numbers reduces to a top-hat filter:
\[N_k=\frac{1}{e^{\beta(E_k-\mu)}+1}=[E_k<\mu]\]
One can define the Fermi energy by \(E_F:=\mu(T=0)\) so that states \(|k\rangle\) with \(\hbar^2k^2/2m<E_F\) lying in the Fermi sea are fully occupied (i.e. have Fermi occupation number \(N_k=1\)) while states \(|k\rangle\) with \(\hbar^2k^2/2m>E_F\) lying beyond the Fermi surface are completely empty. This definition of the Fermi energy \(E_F\) is strictly speaking a bit misleading, since in the grand canonical ensemble \(\mu\) and \(T\) are independent and fixed while \(N\) fluctuates; in practice \(N\) is fixed and \(\mu=\mu(T)\) adjusts with \(T\) so as to keep \(N\) fixed, working in the grand canonical ensemble being just a mathematical convenience. Therefore, it would make more sense to express/define \(E_F\) in terms of the fixed number \(N\) of fermions in the degenerate ideal Fermi gas:
\[N=\sum_{|k\rangle\in\mathcal H_0}N_k=\int_0^{\infty}[E<E_F]g(E)dE=\int_0^{E_F}g(E)dE\Rightarrow E_F=\frac{\hbar^2}{2m}\left(\frac{6\pi^2 N}{g_sV}\right)^{2/3}\]
This is of course related to the Fermi momentum and Fermi temperature by \(E_F=\hbar^2k_F^2/2m=kT_F\). The Fermi temperature \(T_F\) for the ideal Fermi gas determines whether the ideal Fermi gas is in the high-temperature \(T>T_F\) regime or the low-temperature \(T<T_F\) regime. For example, in a copper \(\text{Cu(s)}\) wire the number density of electrons \(e^-\) is \(N/V\approx 8.5\times 10^{28}\text{ m}^{-3}\), so the corresponding Fermi temperature is actually quite hot \(T_F\approx 8.2\times 10^4\text{ K}\) by everyday standards, and so in particular room temperature \(T\approx 300\text{ K}\ll T_F\) means that the electrons \(e^-\) in metals can be thought of to a good approximation as degenerate \(T=0\) Fermi gases.
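The copper estimate can be checked in a few lines (a quick sketch; the constants are CODATA values and the function name is illustrative):

```python
import numpy as np

hbar = 1.054571817e-34   # reduced Planck constant (J s)
m_e = 9.1093837015e-31   # electron mass (kg)
k_B = 1.380649e-23       # Boltzmann constant (J/K)

def fermi_temperature(n, g_s=2, m=m_e):
    """T_F = E_F / k with E_F = hbar^2/(2m) * (6 pi^2 n / g_s)^(2/3)."""
    E_F = hbar**2 / (2 * m) * (6 * np.pi**2 * n / g_s) ** (2 / 3)
    return E_F / k_B

T_F_Cu = fermi_temperature(8.5e28)  # conduction-electron density of copper (m^-3)
# T_F_Cu is about 8.2e4 K, so room temperature is deep in the degenerate regime
```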
Having computed the total number of fermions \(N=\langle N\rangle\), one can also compute the total energy \(E=\langle E\rangle\) in the grand canonical ensemble:
\[E=\sum_{|k\rangle\in\mathcal H_0}N_kE_k=\int_0^{\infty}[E<E_F]Eg(E)dE=\int_0^{E_F}Eg(E)dE=\frac{3}{5}NE_F\]
which is pretty intuitive, the factor of \(3/5\) essentially just coming from the average of \(k^2\) in a ball of radius \(k_F\), i.e. \(\frac{3}{4\pi k_F^3}\int_0^{k_F}k^24\pi k^2dk=\frac{3}{5}k_F^2\).
Finally, the “equation of state” \(pV=\frac{2}{3}E\) earlier yields the corresponding degeneracy pressure:
\[pV=\frac{2}{5}NE_F\]
For comparison, recall that below the critical temperature \(T<T_c\) the pressure \(p\sim T^{5/2}\) of a BEC approached \(p\to 0\) as \(T\to 0\); not so for an ideal Fermi gas. For both the ideal Bose and Fermi gases \(pV=\frac{2}{3}E\), but because bosons can condense into the \(E=0\) ground state, their pressure \(p\) also drops to \(p\to 0\). Fermions cannot do this because of the Pauli exclusion principle (they are forced to fill out a Fermi sea instead), so their total energy \(E=\frac{3}{5}NE_F\) can never reach zero, and therefore neither can their pressure, leaving the residual \(T=0\) degeneracy pressure \(p=\frac{2}{5}\frac{N}{V}E_F>0\).
Finally, it is worth asking more generally just about the physics of an ideal Fermi gas not necessarily when it is degenerate at \(T=0\), but merely at some “low” temperature \(T\ll T_F\). Here, “physics” shall mean “low-temperature heat capacity” \(C_V=C_V(T)\).
In this case, the Fermi-Dirac distribution is distorted from the degenerate \(T=0\) top-hat filter into a smoothed step: the discontinuity at \(E_F\) gets rounded off over an energy window of width \(\sim kT\).
The key observation is that only fermions close to the Fermi surface, specifically whose energy is within \(kT\) of the Fermi energy \(E_F\) can respond to any additional energy added to the ideal Fermi gas, and therefore contribute to the heat capacity \(C_V\) (since only they notice the non-degenerate temperature \(T>0\), the rest of the fermions being locked in the Fermi sea by the Pauli exclusion principle).
\[C_V=\frac{\partial E}{\partial T}=-\frac{3g_sV}{2}\frac{\partial}{\partial T}\left(\frac{1}{\beta\lambda^3}\text{Li}_{5/2}(-z)\right)\]
At this point, invoke the behavior of the polylogarithm as the fugacity \(z\to 1\) in the low-\(T\) limit (called the Sommerfeld expansion, essentially just a lot of binomial expansions):
\[-\text{Li}_{s}(-z)=\frac{(\ln z)^s}{\Gamma(s+1)}\left(1+\frac{\pi^2}{6}\frac{s(s-1)}{(\ln z)^2}+…\right)\]
where \(\ln z=\beta\mu\), so this simplifies to:
\[C_V\approx \frac{\sqrt{2}g_sm^{3/2}V}{5\pi^2\hbar^3}\frac{\partial}{\partial T}\left(\mu^{5/2}\left(1+\frac{5\pi^2}{8\beta^2\mu^2}\right)\right)\]
Here comes the subtle point; although \(\mu\) is supposed to be independent of \(T\) in the grand canonical ensemble with the expected total number of particles \(N\) determined through \(\mu\) and \(T\), in practice the number of fermions \(N\) is fixed so \(\mu=\mu(T)\) is implicitly a function of \(T\) in order to keep \(N\) fixed. This is why one cannot for instance just factor the \(\mu^{5/2}\) outside the \(\partial/\partial T\), etc. The resolution here is to write \(\mu\) in terms of \(N\) (or equivalently, in terms of the Fermi energy \(E_F\)) which would be fixed and therefore easy to deal with:
\[N\approx\frac{\sqrt{2}g_sm^{3/2}V}{3\pi^2\hbar^3}\mu^{3/2}\left(1+\frac{\pi^2}{8\beta^2\mu^2}\right)\]
This immediately implies:
\[E_F\approx \mu\left(1+\frac{\pi^2}{8\beta^2\mu^2}\right)^{2/3}\]
Or, isolating \(\mu\) to suitable order:
\[\mu=E_F\left(1+\frac{\pi^2}{8\beta^2\mu^2}\right)^{-2/3}\approx E_F\left(1-\frac{\pi^2}{12\beta^2\mu^2}\right)\approx E_F\left(1-\frac{\pi^2}{12\beta^2E_F^2}\right)\]
Finally, it is clear that one can re-express the heat capacity in terms of \(N\) and \(E_F\) (the fixed variables) as:
\[C_V=\frac{3N}{5}\frac{\partial}{\partial T}\left(\mu\frac{1+5\pi^2k^2T^2/8\mu^2}{1+\pi^2k^2T^2/8\mu^2}\right)\approx\frac{3NE_F}{5}\frac{\partial}{\partial T}\left(1+\left(\frac{5}{8}-\frac{1}{8}-\frac{1}{12}\right)\frac{\pi^2}{\beta^2E_F^2}\right)\]
leading to the linear heat capacity behavior of the low-\(T\) ideal Fermi gas:
\[C_V=\frac{\pi^2}{2}Nk\frac{T}{T_F}\]
Ignoring the \(\pi^2/2\) prefactor which came from the detailed Sommerfeld expansion of the polylogarithms, there is a simple intuitive way to understand this formula: the number of Fermi surface fermions living within \(kT\) of the Fermi energy \(E_F\) is \(g(E_F)kT\) and the energy of each fermion is of order \(kT\) so the total energy of all Fermi surface fermions is \(E\sim g(E_F)(kT)^2\). If one adds some energy \(dE\) into the ideal Fermi gas, then essentially all this energy has to go into the Fermi surface fermions so that one may legitimately equate \(dE\sim g(E_F)k^2TdT\) reproducing the linear heat capacity:
\[C_V\sim g(E_F)k^2T\sim E_F^{1/2}k^2T\sim N^{1/3}k^2T\sim Nk\frac{T}{T_F}\]
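The linear-\(C_V\) result can also be checked without any Sommerfeld machinery by fixing \(N\) numerically: solve for \(\mu(T)\), compute \(E(T)\), and differentiate. Below is a minimal sketch in units where \(E_F=k=1\); up to the common density-of-states prefactor, the fixed particle number is then \(\int_0^1\sqrt E\,dE=2/3\), and the prediction \(C_V=\frac{\pi^2}{2}Nk\,T/T_F\) becomes \(C_V\to\pi^2T/3\) in the same units.

```python
import numpy as np

# Units: E_F = 1 and k = 1; N and E are measured relative to the common
# density-of-states prefactor g_s m^{3/2} V / (sqrt(2) pi^2 hbar^3).
E = np.linspace(1e-6, 8.0, 400001)
dE = E[1] - E[0]
N_FIXED = 2.0 / 3.0  # integral of sqrt(E) from 0 up to E_F = 1

def fermi_dirac(mu, T):
    return 1.0 / (np.exp(np.clip((E - mu) / T, -60.0, 60.0)) + 1.0)

def chemical_potential(T):
    """Bisect for the mu(T) that keeps the particle number fixed at N_FIXED."""
    lo, hi = 0.0, 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if np.sum(np.sqrt(E) * fermi_dirac(mid, T)) * dE < N_FIXED:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def total_energy(T):
    return np.sum(E**1.5 * fermi_dirac(chemical_potential(T), T)) * dE

# Central-difference heat capacity at T = 0.05 T_F:
T0, dT = 0.05, 0.005
C_V = (total_energy(T0 + dT) - total_energy(T0 - dT)) / (2 * dT)
# C_V should be close to the Sommerfeld prediction pi^2 * T0 / 3
```

The same loop also verifies the earlier expansion \(\mu\approx E_F(1-\pi^2/(12\beta^2E_F^2))\), since `chemical_potential(T)` dips slightly below \(1\) as \(T\) grows.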
The theory of ideal Fermi gases has diverse applications, ranging from electrons \(e^-\) in a conductor (as justified by Landau’s Fermi liquid theory) to astrophysics (e.g. white dwarf stars are supported by electron degeneracy pressure, neutron stars are supported by neutron degeneracy pressure, thanks to the fact that both electrons \(e^-\) and neutrons \(n^0\) are fermions) to Pauli paramagnetism and Landau diamagnetism in condensed matter physics.