Examples from physics#
Here are some examples that demonstrate the usefulness of matrices in physics, and help you develop a bit of an intuition about what they are and how to use them. The material covered in this appendix is not needed for the PHY129 exam. This appendix is an “optional extra”. It is meant to give you a flavour of ways in which you will come across techniques learnt in this module in physics you study later. In later years you may want to come back to these examples when you encounter the relevant physics module. Finally, if you want to find out how the matrix methods introduced here apply to functions more generally, watch this video from 3Blue1Brown.
Polarisation filters#
Light is polarised, and the polarisation can be described by a complex two-dimensional vector. Here we will consider linear polarisation only, so the vectors will have only real elements. The polarisations for horizontal and vertical polarisation can be chosen as
We can also have polarisation at \(\pm 45^\circ\) from these directions:
where \(\d\) and \(\a\) stand for diagonal and anti-diagonal. Let’s send the \(\d\) polarisation through a polarisation filter \(F_h\) that lets only horizontal polarisation through. This filter is described by a projection onto \(\h\):
You can easily verify that \(F_h^2 = F_h\). The filter acting on the \(+45^\circ\) polarisation gives
This is in fact \(\h\)-polarised light, but the length of the vector has shrunk. Since the intensity of the light is given by the length-squared of the vector, we see that the filter removed half the light, as expected.
Next, we place a polarisation filter oriented in the vertical direction in the beam. This filter is represented by the matrix
The output after this second filter is now
Again as expected, the second filter extinguishes all the light, because there is no vertical polarisation component in the light just after it passes a horizontal polarisation filter. We could have seen this without the input light, by looking only at the matrices:
So any light will be extinguished by the combination of these two filters.
Next, what happens if we slide a third polarisation filter, oriented in the diagonal direction, between \(F_h\) and \(F_v\)? The projection is now onto the direction of \(\d\), so the projector \(F_d\) becomes
Now what does \(F_d\) do to the incoming light? You may expect that because there is a horizontal and a vertical polarisation filter in the series of filters, all light will still be extinguished. Let’s do the matrix calculation:
This is not the zero matrix! It means that inserting the filter \(F_d\) rekindles the light that passes the polarisation filters. In particular, it transfers a quarter of the original horizontally polarised light into vertically polarised light (that’s what the location of the 1 and the factor \(\frac12\) indicate). The effect is demonstrated in this video.
Note that the order of the matrices is crucial here: the filter \(F_d\) really must separate \(F_h\) and \(F_v\), otherwise we still have the full light extinction. In technical terms, \(F_d\) commutes with neither \(F_h\), nor \(F_v\). Otherwise we would be able to rearrange the product so that \(F_h\) and \(F_v\) multiply each other and give the zero matrix. Also note that we can discuss the effect of the filters without worrying about the incoming light by considering only the matrices. Even though matrices are defined by the vectors they act upon, often we can draw physics conclusions using matrices without getting vectors involved at all.
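These matrix manipulations are easy to check numerically. Here is a minimal NumPy sketch of the three-filter setup, with the filter matrices built as the projectors defined above:

```python
import numpy as np

# Linear polarisation vectors: horizontal, vertical, diagonal (+45 degrees)
h = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
d = (h + v) / np.sqrt(2)

# Each filter is the projector onto its transmitted polarisation
F_h = np.outer(h, h)
F_v = np.outer(v, v)
F_d = np.outer(d, d)

# A horizontal filter followed by a vertical filter blocks everything
print(F_v @ F_h)          # zero matrix: all light extinguished

# Sliding the diagonal filter in between "rekindles" the light
print(F_v @ F_d @ F_h)    # element 1/2 in the lower-left corner

# Intensity (length squared) of diagonal light after all three filters
out = F_v @ F_d @ F_h @ d
print(out @ out)          # 1/8 of the input intensity survives
```

Note that `F_v @ F_d @ F_h` reads right to left: the rightmost filter is the first one the light passes through, which is exactly why the order of the matrices matters.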
Coupled harmonic oscillators#
Consider the one-dimensional problem of two masses \(m\), each connected to opposite walls by springs with spring constant \(k\). The masses are connected by a spring with spring constant \(k'\):
The displacement from the equilibrium position of the two masses is \(x_1\) and \(x_2\).
We set up a force balance using Hooke’s law, where \(a_1=\ddot{x}_1\) and \(a_2=\ddot{x}_2\) are the accelerations of the first and second mass, respectively:
Make sure you understand how the signs come about in these equations. Next, we make the assumption[1] that the solutions to these equations are given by
for some \(\omega\), \(\phi\), and amplitudes \(\alpha_1\) and \(\alpha_2\). The expressions in equation eq:028y4rehiudfs then become
In matrix form with \(\bm\alpha = (\alpha_1,\alpha_2)^T\), we can write this as
Note that this is independent of the phase \(\phi\). To see whether our assumption was valid, and these equations have nontrivial solutions, we verify that \(\det A = 0\):
This is satisfied for the positive angular frequencies
Substituting these values back into equation eq:204yigrweh, we find that
In other words, either the two masses can oscillate in phase with the same amplitude, where the spring with constant \(k'\) is not compressed or extended during the oscillation, or the two masses oscillate opposite to each other:
where I keep the possibility open that the two frequencies may have different amplitudes \(\alpha\) and \(\beta\). These are called the normal modes of the two oscillating masses. Since this is a linear system, superpositions of these normal modes will also be solutions to the original equations of motion. Therefore
So with some fairly straightforward matrix techniques we can find the solutions to the equations of motion of a pretty complex system!
It is instructive to see how the normal modes of the system relate to the eigenvalues and eigenvectors of \(A\). The two eigenvalues of \(A\) are \(k-m\omega^2\) and \(k+2k' -m\omega^2\). So our choice of \(\omega\) makes one of the eigenvalues zero, which is sufficient to make the determinant of \(A\) zero. The eigenvectors of \(A\) are
The top entry is proportional to the amplitude of the first mass, and the bottom entry to the amplitude of the second mass. You see that they have either the same amplitude or opposite amplitude. Because of this connection to the eigenvalues and eigenvectors, the normal modes are sometimes also called the eigenmodes. Here is a practical demonstration of the effect.
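The normal-mode calculation can be reproduced numerically. The sketch below uses illustrative values for \(k\), \(k'\), and \(m\) (any positive values work), and finds the squared mode frequencies as eigenvalues of the stiffness matrix divided by the mass:

```python
import numpy as np

# Illustrative parameters (not taken from the text)
k, kp, m = 1.0, 0.5, 1.0

# Equations of motion in matrix form: m * x'' = -K x, with K the
# symmetric stiffness matrix of the two coupled masses
K = np.array([[k + kp, -kp],
              [-kp,    k + kp]])

# Eigenvalues of K/m are the squared normal-mode frequencies omega^2
omega2, modes = np.linalg.eigh(K / m)

print(np.sqrt(omega2))   # omega_1 = sqrt(k/m), omega_2 = sqrt((k+2k')/m)
print(modes)             # columns proportional to (1,1) and (1,-1)
```

The eigenvector with both entries equal is the in-phase mode (the coupling spring stays relaxed), and the one with opposite entries is the out-of-phase mode, just as found above.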
Moment of inertia tensor#
The moment of inertia of a body depends on the axis of rotation. A disk that is lying flat in the \(xy\)-plane with its centre of mass located at the origin, and rotating around the \(z\)-axis will have a different moment of inertia than if the same disk rotates around the \(x\)-axis. To find the moment of inertia of a body rotating around an arbitrary axis through its centre of mass therefore requires several numbers. However, it turns out that we can solve this very elegantly using matrices.
From first year mechanics you know that the moment of inertia \(I\) plays the role of mass in rotations, and relates to the angular momentum \(L\) as
However, angular momentum is a three-dimensional vector (\(\L\)), and so is the angular velocity \(\bm\omega\) once we take the rotation axis into account. So equation eq:294iruhef must be fixed. Given that the moment of inertia is axis-dependent, \(I\) must be a \(3\times 3\) matrix:
Note that for the purposes of this example, \(I\) is no longer the identity matrix! The matrix describing the moment of inertia is called a tensor, because it has some special transformation properties that are not important right now. Also, \(I\) is a real symmetric matrix, so \(I_{jk} = I_{kj}\) and \(I_{jk}^* = I_{jk}\). The off-diagonal elements are called the products of inertia. You can calculate the elements of \(I\) using the integrals
The diagonal elements are the normal moments of inertia for rotations around the axis corresponding to the element’s position in the matrix.
A nice application of this is the situation where the moment of inertia tensor has non-zero off-diagonal terms due to an asymmetry in the rotating body. In that case the angular momentum does not line up with the rotation vector. For example,
The angular momentum \(\L\) clearly does not point in the direction of \(\bm\omega\). This produces a torque \(\bm\tau = d\L/dt\), which can be substantial if \(\bm\omega\) and \(\L\) are large. An example of this can be found in this video.
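A short numerical sketch illustrates the misalignment. The inertia tensor below is made up for illustration (not computed from the integrals above); the point is only that a nonzero product of inertia tilts \(\L\) away from \(\bm\omega\):

```python
import numpy as np

# Illustrative inertia tensor with a nonzero product of inertia
I = np.array([[2.0, -0.5, 0.0],
              [-0.5, 2.0, 0.0],
              [0.0,  0.0, 4.0]])

# Rotate around the x-axis
omega = np.array([1.0, 0.0, 0.0])

# Angular momentum L = I omega
L = I @ omega
print(L)                   # (2.0, -0.5, 0.0): not along omega

# L is parallel to omega only if their cross product vanishes
print(np.cross(L, omega))  # nonzero, so L and omega are not aligned
```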
Generators of rotations#
Let’s define a matrix
For reasons that will become clear in a minute we exponentiate the matrix \(L_z\) multiplied by an angle \(\theta\):
How do we exponentiate a matrix? We know we can multiply matrices, so we can take the series expansion of the exponential and calculate the result that way:
Next, we calculate the powers of \(L_z\):
You see that \(L_z^5 = L_z\), after which the cycle repeats. The intermediate matrices \(-L_z^2\) and \(L_z^4\) are not quite the identity matrix, but they are close. Let's call this matrix \(I'\). The series expansion of the exponential then splits into two series of even and odd powers:
We have to be careful about the first term, denoted here by 1. Since the sum in the series expansion is over terms that are \(3\times 3\) matrices, and we can add only matrices of the same size, the first term must be in fact the proper identity matrix \(I\). Next, we see that the terms in brackets are familiar: the last term is \(\sin\theta\), while the first series in brackets is close to \(\cos\theta\). We can add and subtract 1 inside the brackets in the first series to obtain
This is the rotation matrix \(R_z(\theta)\) around the \(z\)-axis.
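The series calculation can be verified numerically by truncating the exponential series. The sign convention for \(L_z\) below is an assumption, chosen so that the result matches the standard counterclockwise rotation matrix:

```python
import numpy as np

# Generator of rotations around the z-axis (assumed sign convention)
L_z = np.array([[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 0.0]])

def expm_series(A, terms=30):
    """Matrix exponential via the truncated power series sum A^n / n!."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

theta = 0.7
R = expm_series(theta * L_z)

# Compare with the explicit rotation matrix around the z-axis
R_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
print(np.allclose(R, R_z))   # True
```

You can also check the cycle of powers directly: `np.linalg.matrix_power(L_z, 5)` returns `L_z` again.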
We can also define matrices \(L_x\) and \(L_y\) that, when exponentiated, create rotations around the \(x\)- and \(y\)-axis:
The three matrices obey the commutation relations
where \(i\), \(j\), and \(k\) take on the values \(x\), \(y\), and \(z\), and the special symbol \(\epsilon_{ijk}\) is the Levi-Civita symbol that takes on the values
The matrices \(L_x\), \(L_y\), and \(L_z\) are called the generators of rotations. Any three matrices of size \(n\times n\) (with \(n\geq 3\)) that obey the commutation relations in equation eq:vjhg49820uwoeij generate rotation matrices in a space of dimension \(n\).
Finally, we can see how the matrices \(L_j\) are related to angular momentum. Recall that angular momentum is defined by \(\L = \r\times \p\) with \(\r\) and \(\p\) the position and momentum vectors, respectively. The components of \(\L\) are
Now we can see the connection: the \(j^{\rm th}\) component of the angular momentum is related to the matrix \({L}_j\) via the relation \(\mathcal{L}_j = \r\cdot (L_j \p)\):
We say that angular momentum is the generator of rotations.
Lorentz boosts#
Matrices play a central role in both special and general relativity. As a simple example, consider a particle in one dimension with a position coordinate \(x\) and time coordinate \(t\), as measured relative to a reference frame. The values for these coordinates specify where the particle is at what time in that reference frame. Not all combinations of \(x\) and \(t\) will occur for a given particle; there will be a relationship between position and time, \(x(t)\), which is the particle’s trajectory. However, any possible trajectory can be captured by these two numbers, and we collect them in a vector \(\s\):
Next, we want to know how to express \(\s\) in a different reference frame (denote the vector by \(\s'\)), moving at velocity \(v\) in the positive \(x\)-direction as seen from the original reference frame. For a non-relativistic (Galilean) transformation, the coordinates in this new reference frame will be
As the primed frame moves in the positive direction with respect to the original frame, the position \(x\) in the original frame moves in the opposite direction in the primed frame (try to visualise this). The time in each frame remains unchanged, i.e., clock readings are not affected by this transformation. We can capture these two equations neatly in a matrix equation:
where we defined the Galilean transformation matrix \(G\) as the matrix that changes between the original reference frame and the primed frame. You should verify yourself the neat property that
where we explicitly included the velocity dependence of the matrix \(G\). So the combination of two Galilean transformations is again a Galilean transformation.
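A two-line numerical check of the composition property, assuming the ordering \(\s = (t, x)\) for the coordinate vector:

```python
import numpy as np

def G(v):
    """Galilean transformation acting on s = (t, x) (assumed ordering)."""
    return np.array([[1.0, 0.0],
                     [-v,  1.0]])

# Two successive Galilean transformations equal a single one
u, v = 3.0, 5.0
print(np.allclose(G(u) @ G(v), G(u + v)))   # True
```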
Now consider Lorentz transformations of the same system. Instead of the time coordinate \(t\), for convenience we consider the scaled coordinate \(ct\) with \(c\) the speed of light in vacuum:
with \(\beta = v/c\) and \(\gamma = (1-\beta^2)^{-\frac12}\). In matrix notation, this becomes
where the matrix \(L\) is called the Lorentz boost.
We now want to show that two boosts in the same direction[2] again produce a boost. To achieve this, we note that
This allows us to parametrise \(\gamma\) and \(\beta\gamma\) in terms of \(\cosh r\) and \(\sinh r\) as follows:
where the parameter \(r\) is called the rapidity. Try to derive the relationship between the boost \(v\) and the rapidity \(r\):
The Lorentz boost now takes the simple form
Two boosts with rapidities \(r_1\) and \(r_2\) then produce a new boost given by
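The composition of boosts and the rapidity addition can be verified numerically; the sketch assumes the ordering \(\s = (ct, x)\) for the coordinate vector:

```python
import numpy as np

def boost(r):
    """Lorentz boost on s = (ct, x), parametrised by rapidity r."""
    return np.array([[np.cosh(r), -np.sinh(r)],
                     [-np.sinh(r), np.cosh(r)]])

r1, r2 = 0.4, 0.9

# Two boosts compose into a single boost with added rapidities
print(np.allclose(boost(r1) @ boost(r2), boost(r1 + r2)))   # True

# Velocities are beta = tanh(r), so they combine via the hyperbolic
# addition formula rather than by simple addition
b1, b2 = np.tanh(r1), np.tanh(r2)
print(np.isclose((b1 + b2) / (1 + b1 * b2), np.tanh(r1 + r2)))  # True
```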
The last equality can be found by using the addition formulas
The velocity addition formula in special relativity is
Can you derive this from equations eq:gh304woeijrs and eq:gry4y928woesd?
Energy levels in a two-level atom#
In quantum mechanics, measurable properties are represented by Hermitian matrices. For example, the energy of a system is a matrix called the Hamiltonian. As an example, consider a two-level atom with a ground state \(|g\rangle\) and an excited state \(|e\rangle\). Despite the strange notation, these are just vectors:
For an isolated two-level atom, the ground state energy \(E_g\) and the excited state energy \(E_e\) are eigenvalues of the Hamiltonian, so that
Next, imagine a laser interacting with the atom. If the atom is in the ground state, the laser can excite it to the excited state, and vice versa. In terms of matrices, this becomes
So the off-diagonal elements are responsible for the transitions. The values of these elements represent the strength of the interaction. A general Hamiltonian can then be written as
where \(\hbar\Omega\) is the interaction strength. It is generally a complex number because it includes the phase of the laser (this is not important right now). You can prove that \(H\) is Hermitian, which means that its eigenvalues are real.
The energy levels of the two-level system change when it interacts with a laser. To find the new energy levels, we must find the eigenvalues of \(H\) in equation eq:buhg9834owr. Let \(E_g = 0\), \(E_e = 1.2\) eV, and \(\hbar\Omega = 0.3i\) eV. We find the eigenvalues by calculating
Therefore, the new eigenvalues are
These are the new energy values of the two-level atom interacting with a laser. You can see that the interaction drives the two energy states apart, and the ground state energy becomes negative. This is not a problem, because only energy differences are physically meaningful.
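The eigenvalue calculation can be reproduced with NumPy, using the values quoted above:

```python
import numpy as np

# Laser-driven two-level atom (energies in eV):
# E_g = 0, E_e = 1.2, hbar*Omega = 0.3i
E_g, E_e = 0.0, 1.2
hOmega = 0.3j

H = np.array([[E_g,             hOmega],
              [np.conj(hOmega), E_e]])

# H is Hermitian, so eigvalsh returns real eigenvalues
energies = np.linalg.eigvalsh(H)
print(energies)   # approximately [-0.071, 1.271] eV
```

The lower eigenvalue is indeed negative, showing the interaction pushing the two levels apart.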
Note also that the Hamiltonian must be Hermitian, because the eigenvalues are physical quantities, and should therefore be real numbers.