Measurements and two-level quantum systems

Chapter 6 Measurements and two-level quantum systems

In the previous chapter we saw how the quantum state, represented by the wave function of a quantum system, evolves in space and time as governed by the Schrödinger equation. We discussed the interpretation of the wave function as a probability amplitude, whose modulus squared equals the probability density of finding the particle in a small region about a point in space at a given time. Furthermore, we discussed the idea of expectation values: if we measure, say, the position at time t of a particle prepared in some initial state, the specific measurement outcome is random, and we can only talk about the most probable outcome, or the average outcome that we expect if we repeat the same measurement on multiple copies of identically prepared systems. This inherent uncertainty in the outcome of a measurement is a crucial difference between quantum and classical physics. In the previous discussion, however, we did not mention the implications of this random outcome, nor what happens to a quantum system after a measurement is made. These are the topics of the current chapter. In addition, we will set out the mathematics of two-level quantum systems, which are the simplest quantum systems and the ones upon which modern quantum thinking is based.

6.1 Observables, operators, and expectation values

6.1.1 Classical measurement

As mentioned in the introductory chapter, the concept of measurement in quantum mechanics differs significantly from its classical counterpart. Classically, measurements reveal intrinsic properties, or observables, of the system. For example, the position, momentum, angular momentum, and energy of a particle are classical observables.

Classical observables: x, p, L, and E.

Classical measurement postulates: Two key postulates for classical measurements that fail for quantum measurements involve the following implicit assumptions.

1. Independent reality: Measurements reveal elements of physical reality that exist independent of the observation. In other words, measurements reveal information about properties possessed by a particle, which existed prior to the measurement, and we are simply ignorant of the value of the observable before we make the measurement. For example, classically we would say that measurement of the position of an electron bound to a hydrogen nucleus just tells us where the electron was before the measurement.

2. No disturbance: Measurements can be performed that do not disturb the system. In classical physics, all that is needed to do this is to make the measurement interaction sufficiently weak that there is no disturbance of the system.

In classical physics, and in most experiences we have in the macroscopic world, we do not see any difficulties with these postulates. We have already discussed the concept of measurement disturbance in quantum mechanics in terms of the uncertainty principle and the Heisenberg microscope. The idea that determining the position of a particle with ever-increasing precision (smaller uncertainty Δx) comes at the cost of a larger momentum uncertainty, imparted by a high-energy (and thus high-momentum) photon, seems plausible to our classical ways of thinking. However, the idea that the position of the moon does not exist unless we look seems absurd to our classical intuition. As we will see, this is precisely what quantum mechanics prescribes.

6.1.2 Quantum measurement

In quantum mechanics measurements are represented by operators that act on the wave function Ψ(x, t).
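For example, the observables of position, momentum, angular momentum, and energy correspond to the following operators:

$$\hat{x}\,\Psi(x,t) = x\,\Psi(x,t), \tag{6.1}$$

$$\hat{p}\,\Psi(x,t) = -i\hbar\,\nabla\Psi(x,t), \tag{6.2}$$

$$\hat{L}\,\Psi(x,t) = -i\hbar\; x \times \nabla\Psi(x,t), \tag{6.3}$$

$$\hat{H}\,\Psi(x,t) = i\hbar\,\frac{\partial}{\partial t}\Psi(x,t). \tag{6.4}$$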
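As a rough numerical illustration of Eqs. (6.1) and (6.2), the sketch below applies discretized versions of the position and momentum operators to a plane wave $e^{ikx}$ in one dimension, for which the momentum operator should return $\hbar k$ times the same wave. The function names `position_op` and `momentum_op`, the choice of units with ħ = 1, the grid, and the wave number are all assumptions made for this example and do not come from the text.

```python
import numpy as np

HBAR = 1.0  # assumption: natural units with hbar = 1


def position_op(psi, x):
    """Position operator of Eq. (6.1): multiply the wave function by x."""
    return x * psi


def momentum_op(psi, dx):
    """Momentum operator of Eq. (6.2) in 1D, -i*hbar d/dx.

    Uses central finite differences, so the result is defined only on the
    interior grid points (the first and last points are dropped).
    """
    dpsi_dx = (psi[2:] - psi[:-2]) / (2.0 * dx)
    return -1j * HBAR * dpsi_dx


k = 3.0                                  # assumed wave number of the test wave
x = np.linspace(-5.0, 5.0, 20001)
dx = x[1] - x[0]
psi = np.exp(1j * k * x)                 # plane wave with momentum hbar*k

p_psi = momentum_op(psi, dx)
# For a plane wave, p_psi should equal hbar*k*psi on the interior points.
max_error = np.max(np.abs(p_psi - HBAR * k * psi[1:-1]))
x_psi = position_op(psi, x)
```

The residual `max_error` shrinks as the grid is refined, since the finite-difference derivative only approximates the gradient in Eq. (6.2).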

The position operator acting on a wave function multiplies it by the coordinate x that appears in its argument. The momentum operator acting on a wave function takes a spatial gradient (multiplied by −iħ). The orbital angular momentum operator follows from these and the corresponding classical definition. The Hamiltonian operator, or energy operator, acting on a wave function takes a time derivative (multiplied by iħ).

As mentioned in the previous chapter, the outcome of a particular measurement cannot in general be predicted. One can only talk about the probability to obtain a particular measurement outcome, which is governed by the wave function. This is a fundamental feature of quantum mechanics. Let us take the example of the particle confined to an infinite potential well, as described in Sec. 5.6 of the previous chapter, and consider the situation in which the particle is prepared in the symmetric superposition state

$$\Psi^{(+)}(x,t) = \frac{1}{\sqrt{2}}\left(\phi_1(x)\,e^{-i\omega_1 t} + \phi_2(x)\,e^{-i\omega_2 t}\right), \tag{6.5}$$

whose wave function and probability density are shown in Fig. 6.1 at time t = 0. If we make a measurement of the position, any value in the range −L/2 < x < L/2 can be obtained in a single measurement, with the exception of x = −L/6, where the wave function is zero. The average position of the particle can be calculated as the expectation value

$$\langle x(t) \rangle = \frac{1}{2}\int_{-L/2}^{L/2} x\left[\,|\phi_1(x)|^2 + |\phi_2(x)|^2 + 2\,\mathrm{Re}\left\{\phi_2^{*}(x)\,\phi_1(x)\,e^{i\Delta\omega t}\right\}\right]dx = \frac{16L}{9\pi^2}\cos(\Delta\omega\, t), \tag{6.6}$$

where Δω = ω₂ − ω₁ is the difference between the angular frequencies of the two energy eigenstates in the superposition, that is, their energy difference divided by ħ.

Figure 6.1: Probability amplitude and probability distribution for the symmetric superposition state of the ground and first-excited states of the infinite potential well at time t = 0. The horizontal axis is in units of the box width L. Note that the probability to find the particle at x = −L/6 is zero.

Suppose we choose to measure the particle energy instead. The expectation value of the energy is

$$
\begin{aligned}
\langle \hat{H} \rangle &= \int \Psi^{*}(x,t)\,\hat{H}\,\Psi(x,t)\,dx = \int \Psi^{(+)*}(x,t)\; i\hbar\frac{\partial}{\partial t}\Psi^{(+)}(x,t)\,dx \\
&= \frac{1}{2}\int \left(\phi_1^{*}(x)\,e^{i\omega_1 t} + \phi_2^{*}(x)\,e^{i\omega_2 t}\right) i\hbar\frac{\partial}{\partial t}\left(\phi_1(x)\,e^{-i\omega_1 t} + \phi_2(x)\,e^{-i\omega_2 t}\right)dx \\
&= \frac{1}{2}\int \left(\phi_1^{*}(x)\,e^{i\omega_1 t} + \phi_2^{*}(x)\,e^{i\omega_2 t}\right)\left(\hbar\omega_1\,\phi_1(x)\,e^{-i\omega_1 t} + \hbar\omega_2\,\phi_2(x)\,e^{-i\omega_2 t}\right)dx \\
&= \frac{1}{2}\int \left[\hbar\omega_1\,|\phi_1(x)|^2 + \hbar\omega_2\,|\phi_2(x)|^2 + \left(\hbar\omega_1\,\phi_1(x)\,\phi_2^{*}(x)\,e^{i\Delta\omega t} + \hbar\omega_2\,\phi_1^{*}(x)\,\phi_2(x)\,e^{-i\Delta\omega t}\right)\right]dx \\
&= \frac{1}{2}\left(\hbar\omega_1 + \hbar\omega_2\right),
\end{aligned}
\tag{6.7}
$$

where in going to the last line we used the orthonormality of the energy eigenstates. Note that the expected value is simply the average energy of the two states.

Now, what happens in a single measurement when we choose to measure the energy? Since the energy of the particle confined to the box can only take on the discrete, quantized values $E_n$, a measurement of the energy can only yield one of these allowed values and nothing in between. Note that the energy expectation value does not equal one of the energy eigenvalues, and thus cannot be observed in a single measurement. However, if we repeat the measurement many times on identically prepared systems, we would find that the average value is given by the expectation value in Eq. (6.7). Starting with the symmetric superposition state, we have equal probability of 1/2 to obtain the energy value $E_1$ or $E_2$ on any given energy measurement.

Suppose on a given measurement we find energy $E_1$. Just before the measurement, the particle was in the symmetric superposition state $\Psi^{(+)}(x,t)$ of Eq. (6.5). However, our knowledge of the energy $E_1$ implies that the particle must occupy the corresponding energy eigenstate $\phi_1(x)$ just after the measurement. Energy eigenstates evolve under the time-dependent Schrödinger equation only by acquiring an unobservable global phase $e^{-i\omega_n t}$, so any subsequent measurement of the particle energy will give the same value of energy! Thus, we can predict with certainty the outcome of any subsequent measurement of the energy; in this case we would find $E_1$. This should seem strange to you, in the sense that the measurement causes the wave function of the particle to collapse into the eigenstate of the observable being measured. By observing the particle, we can cause irreversible evolution of the state. The collapse of the wave function is one of the key non-classical features of quantum mechanics.
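The results in Eqs. (6.6) and (6.7), and the equal outcome probabilities of 1/2, can be checked numerically. The sketch below is a minimal check under stated assumptions: units with ħ = m = L = 1, and the standard eigenfunctions of a well centered on x = 0, $\phi_1 = \sqrt{2/L}\cos(\pi x/L)$ and $\phi_2 = \sqrt{2/L}\sin(2\pi x/L)$, which are assumed here to match the convention of Sec. 5.6. The outcome probabilities are computed as squared overlaps with the eigenstates. All function and variable names are illustrative.

```python
import numpy as np

HBAR = 1.0   # assumption: units with hbar = 1
MASS = 1.0   # assumption: unit particle mass
L = 1.0      # box width; the well spans -L/2 < x < L/2

x = np.linspace(-L / 2, L / 2, 4001)
dx = x[1] - x[0]

# Assumed convention for the two lowest eigenfunctions of the centered well.
phi1 = np.sqrt(2 / L) * np.cos(np.pi * x / L)
phi2 = np.sqrt(2 / L) * np.sin(2 * np.pi * x / L)


def omega(n):
    """Angular frequency E_n / hbar of level n of the infinite well."""
    return HBAR * (n * np.pi / L) ** 2 / (2 * MASS)


def integrate(f):
    """Trapezoid rule on the uniform grid x."""
    return np.sum((f[1:] + f[:-1]) / 2) * dx


def psi_plus(t):
    """Symmetric superposition state of Eq. (6.5)."""
    return (phi1 * np.exp(-1j * omega(1) * t)
            + phi2 * np.exp(-1j * omega(2) * t)) / np.sqrt(2)


t = 0.3                                   # arbitrary test time
delta_omega = omega(2) - omega(1)

# Eq. (6.6): numerical <x(t)> versus the closed-form result.
x_numeric = integrate(x * np.abs(psi_plus(t)) ** 2)
x_formula = 16 * L / (9 * np.pi ** 2) * np.cos(delta_omega * t)

# Probabilities of the energy outcomes E_1 and E_2 as squared overlaps.
p1 = abs(integrate(np.conj(phi1) * psi_plus(t))) ** 2
p2 = abs(integrate(np.conj(phi2) * psi_plus(t))) ** 2

# Eq. (6.7): the mean energy is the probability-weighted sum of outcomes.
mean_energy = p1 * HBAR * omega(1) + p2 * HBAR * omega(2)
```

Individual measurements would return only $\hbar\omega_1$ or $\hbar\omega_2$; `mean_energy` is what the average over many identically prepared systems should approach.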
The measurement collapse hypothesis can be generalized to observables other than energy as well. For example, measurement of the particle position causes the wave function to collapse into an eigenstate of the position operator. Determining the subsequent evolution of the quantum state after such a measurement requires some further mathematics: for example, one must define the position eigenstates (these turn out to be what are called Dirac delta functions) and how they evolve in time. This is beyond the scope of the current course, and you will spend a significant amount of time on it in the future. For now, we will summarize the postulates associated with quantum measurements as they are most commonly taught. This approach to measurement was put forth by Niels Bohr, Werner Heisenberg and Wolfgang Pauli, who were all working in Copenhagen at the time it was developed, and is thus known as the Copenhagen

interpretation, and was formalized by John von Neumann, who emphasized the measurement collapse hypothesis. We will then examine the consequences of these postulates for the interpretation of measurement outcomes.

Figure: (a) Niels Bohr, Werner Heisenberg, and Wolfgang Pauli; (b) John von Neumann.

Quantum Measurement Postulates

1. Single-value measurement outcome: When a measurement corresponding to an operator $\hat{A}$ is made, the result is one of the operator eigenvalues, $a_n$.

2. Wave function collapse: As a result of a measurement yielding eigenvalue $a_n$, the wave function collapses into the corresponding eigenstate of the measurement operator,

$$\psi(x,t) \rightarrow \phi_n(x). \tag{6.8}$$

3. Outcome probability: The probability of a particular measurement outcome equals the squared modulus of the overlap between the wave functions before and after the measurement. For example, the probability to obtain measurement outcome $a_n$, corresponding to the eigenstate $\phi_n(x)$, is given by

$$P_n(t) = \left|\langle \phi_n | \psi(t) \rangle\right|^2 = \left|\int \phi_n^{*}(x)\,\psi(x,t)\,dx\right|^2. \tag{6.9}$$

A useful corollary to Postulate 3 gives us a recipe to determine the expansion coefficients of an arbitrary state. For example, let us consider a state expanded in the energy eigenstates $\{\phi_n(x)\}$, given by

$$\psi(x,t) = \sum_{n=1}^{\infty} c_n\,\phi_n(x)\,e^{-i\omega_n t}, \tag{6.10}$$

with expansion coefficients $\{c_n\}$. Here I have included the time dependence of the energy eigenstates, given by the phase factors $e^{-i\omega_n t}$, explicitly. (Recall that $\omega_n = E_n/\hbar$ is the angular frequency associated with the energy eigenvalue $E_n$.) By taking the overlap of the state in Eq. (6.10) with one of the energy eigenstates, say $\phi_m(x)$, we find

$$
\begin{aligned}
\langle \phi_m | \psi(t=0) \rangle &= \int \phi_m^{*}(x)\,\psi(x,t=0)\,dx \\
&= \sum_{n=1}^{\infty} c_n \int \phi_m^{*}(x)\,\phi_n(x)\,dx \\
&= \sum_{n=1}^{\infty} c_n\,\delta_{m,n} \\
&= c_m,
\end{aligned}
\tag{6.11}
$$

where in going from the second to the third line we used the orthonormality of the energy eigenstates. So we see that the expansion coefficients $\{c_n\}$ of an arbitrary state in the energy eigenstates are given by the overlap of the state at t = 0 with the corresponding eigenstates $\{\phi_n(x)\}$.

6.2 Interpretation of quantum measurement

A major challenge associated with the measurement collapse hypothesis is that collapse is a fundamentally stochastic process: the actual measurement outcome is completely random and unpredictable. We can only say with what probability we expect to obtain a particular measurement outcome. Another difficulty lies with the instantaneous collapse of the wave function itself. Once the wave function is known, the Schrödinger equation is completely deterministic in describing its evolution. There is no randomness associated with the state evolution between measurements. Thus, there appear to be two different types of time evolution in quantum mechanics: unitary evolution under the action of the time-dependent Schrödinger equation, and collapse associated with measurement. We now want to explore some of the consequences of the measurement collapse hypothesis and how it affects our interpretation of quantum physics.

6.2.1 Schrödinger's cat

An often cited example of the implications of quantum measurement, and of the challenges it poses to our classical way of thinking, is Schrödinger's cat. This gedanken experiment (thought experiment) was proposed by Schrödinger in 1935 to highlight the apparent conflicts associated with the theory of quantum superpositions and quantum measurement when applied

to the macroscopic level of everyday experience.

Figure 6.2: Schematic of the Schrödinger cat thought experiment.

The basic idea for the experiment, which to my knowledge has thankfully not been carried out, consists of placing a live cat inside a sealed steel chamber, which also contains a vial of poison that can be smashed by a hammer connected to a Geiger counter, as depicted in Fig. 6.2. There is a small amount of radioactive material inside the chamber as well, which can decay and cause the Geiger counter to trigger the hammer. From the amount of radioactive material used and its known half-life, we expect that within one hour there is a 50:50 chance that one atom has decayed. If this occurs, the Geiger counter will trigger the hammer, which smashes the vial containing the poison, subsequently killing the cat. Prior to measuring the decay, the state of the radioactive atoms must be described as a superposition of decayed and not decayed states,

$$\psi_{\text{atoms}} = \frac{1}{\sqrt{2}}\left(\phi_u + \phi_d\right), \tag{6.12}$$

where u and d correspond to undecayed atoms and decayed atoms, respectively. The state of the cat being alive or dead is exactly correlated with the state of the radioactive material, so that their joint state is written

$$\Psi_{\text{atoms+cat}} = \frac{1}{\sqrt{2}}\left(\phi_u\,\chi_{\text{alive}} + \phi_d\,\chi_{\text{dead}}\right). \tag{6.13}$$

Clearly, when we open the box and look inside we will find the cat either dead or alive, depending on whether or not one or more of the atoms have decayed. However, prior to opening the box and looking inside, i.e. making a measurement, the state of the cat and atoms must be given by the superposition state in Eq. (6.13). Prior to measurement the state of the cat is therefore blurred: it is neither alive nor dead, but in some peculiar

combination of both states, as depicted in the cartoon in Fig. ??. We can perform a measurement on the state of the cat by opening the box to see if it has survived. The major dilemma concerns the time at which the measurement, that is the collapse of the state of the cat, occurs. Do we suppose that the measurement occurs at precisely the time that we catch the first glimpse of the cat (either lying there motionless or springing out to greet us), and record the observation dead or alive as appropriate? Or does the cat somehow observe itself? One could easily replace the cat with a friend, as Wigner did when he suggested that consciousness plays an important role in observation.

The example of Schrödinger's cat is meant to illuminate the apparent contradiction between our classical notions of reality and measurement and those of quantum physics. Classically, we imagine that measurement simply reveals properties of physical systems, i.e. a physical system has an innate real existence, independent of and prior to measurement, that is illuminated by the act of measurement. Our classical way of thinking tells us that the cat must surely be either alive or dead before we look in the box. However, this is in contradiction to quantum theory and in some cases does not yield the correct predictions. Furthermore, the notion that there are two different types of time evolution for the quantum state is unsatisfactory. One would like to have a theory in which the state evolution and the measurement process are given by the same description.

There have been various approaches to resolving these dilemmas. For example, some have argued that quantum mechanics is incomplete and does not fully describe the natural world. Such approaches are collectively known as hidden-variable theories, in which certain yet-to-be-observed properties of quantum systems, hidden from current experiments, determine the properties of physical systems and measurement outcomes. Such hidden-variable theories should reproduce the results of quantum mechanics in every case where these have been confirmed. However, it was shown by John Bell in 1964 that under certain circumstances quantum theory and all local hidden-variable theories predict different results. (Note: Local hidden-variable theories are hidden-variable theories in which the hidden properties of the physical systems are localized to each individual system and do not depend on the properties at another point in space.) This result, known as Bell's theorem, allows one to test experimentally whether quantum mechanics or local hidden-variable theories accurately predict experimental outcomes. Thus far, no experiments have shown deviations from quantum predictions, which rules out all local hidden-variable theories. As we will see in the next chapter, Bell's theorem forces us to make a choice between quantum theory and non-local hidden-variable theories.

6.2.2 Many worlds

Figure 6.3: Hugh Everett III

One approach to alleviating the difficulties associated with the measurement collapse hypothesis is to reinterpret what we mean by measurement. In 1957 a young American physicist, Hugh Everett, introduced what he called the relative state formulation of quantum measurement, in which he assumed that no wave-function collapse actually occurs when we perform a measurement. Instead Everett proposed that the universe splits into multiple universes, or branches, one for each possible measurement outcome. The multiverse, which consists of multiple universes (or branches), each labeled by n, can be described by a single quantum state

$$\Psi = \sum_{n} c_n\,\psi_n, \tag{6.14}$$

in which each universe is represented by a state $\psi_n$. This universal wave function Ψ evolves according to the Schrödinger equation, with no discontinuous collapse. In the case of Schrödinger's cat, there are two possible outcomes (dead or alive), and the universe splits into two worlds, one in which the cat is alive and another in which it is dead, as depicted in the cartoon in Fig. 6.4. Once the splitting has occurred, not only can the initial state not be reconstructed, but there is no way in which the different branches can interfere with one another. Each branch evolves independently of the others and, as far as it is concerned, its future behavior is the same as if collapse had

occurred and the other branches disappeared.

Figure 6.4: Many-worlds cartoon depiction of the Schrödinger's cat thought experiment.

In the many-worlds interpretation, both outcomes actually occur. At the point in time at which the measurement occurs, the universe splits into two branches corresponding to the two possible measurement outcomes. These worlds have no knowledge of each other and cannot interact with one another. This reinterpretation of quantum measurement eliminates the problem of wave-function collapse and reinstates realism in quantum physics: a particle in a superposition of two possible states actually does end up in both states! These ideas were not well received at the time Everett proposed his relative state framework, leading to his departure from physics research for a career in the defense industry. It was not until the 1960s and 1970s that physicists began to take a serious look at Everett's proposal. It was also during this time that the term many-worlds interpretation was coined by Bryce DeWitt to describe Everett's theory.

The many-worlds interpretation of quantum physics may resolve the measurement problem, but it introduces its own interpretation problems. If we accept the premises of the many-worlds interpretation, then one major difficulty is the idea that there is an infinite number of universes with which we can never interact or about which we can never gain knowledge, but which nevertheless exist independent of ourselves. This is an extremely uneconomical prediction, which seems to defy Occam's razor: given a choice between theories that predict the same outcomes of events, one should choose the theory with the fewest postulates or assumptions.

There is a second, more difficult problem with the many-worlds interpretation apart from the multiverse extravagance. The problem resides in the question of how to talk about the probabilities of events when all possible events actually occur. If a particle's spin is either up or down, it makes sense to attribute

probability to each outcome (up or down), which can be verified by making measurements on a large number of systems and associating the fraction of up and down outcomes with the corresponding probabilities. However, if the spin is in a superposition of up and down, what does probability mean? The question of probabilities is particularly difficult when we realize that the probabilities postulated in quantum mechanics are not related to the number of branches associated with each measurement outcome. The difficulties of probabilities and the multiverse in the many-worlds interpretation are still open problems. Nevertheless, many well-respected physicists choose to side with the Everett interpretation.

6.2.3 Copenhagen interpretation

The standard interpretation of quantum measurements, developed by Niels Bohr, Werner Heisenberg, and Wolfgang Pauli in Copenhagen during the 1930s, is known as the Copenhagen interpretation. The basic idea is that quantum theory is not a theory of reality, but rather a set of mathematical rules that allow us to predict the probabilities of measurement outcomes. The interpretation assumes that the wave function has no counterpart in reality. In other words, the wave function does not represent a physical entity, but is only a mathematical object that enables us to calculate statistical predictions about experiments.

The concept of complementarity, introduced by Bohr to describe wave-particle duality, plays a central role in the Copenhagen interpretation. Recall that complementarity implies that one cannot observe complementary aspects or properties of a quantum system in the same experiment. For example, in the double-slit experiment, if we try to determine the path of the particle through the double slit, we cannot subsequently observe the interference pattern due to the wave properties of the quantum system. Here the wave and particle natures of quantum systems are complementary to one another. Similarly, position and momentum are complementary observables in that measurement of one destroys any possible knowledge about the other. This led Bohr to the conclusion that not only does it not make sense to discuss measurement of both the wave and particle properties of a system simultaneously, but that these properties do not actually exist independent of the measurement.

Unfortunately the Copenhagen interpretation does not really address the measurement problem at all, but circumvents it by an operational approach. That is to say, quantum mechanics does not tell us about what is (no reality is associated with the wave function), but only about what can be (probabilities for certain measurement outcomes).

6.3 Two-level quantum systems

There are many situations in which a quantum system can only take on one of two possible values when measured. Examples of such two-level systems include photon polarization, electron spin, and the path of a photon through a beam splitter. Such two-level quantum systems allow simple demonstrations of nonclassical predictions of quantum theory and form the foundations of quantum information. In the latter, a two-level quantum system is known as a quantum bit or qubit. Here we look at three examples of two-level quantum systems and how to calculate the state evolution and measurement outcomes for different experimental scenarios.

6.3.1 Dirac notation

Figure 6.5: Paul Adrien Maurice Dirac

Before moving to the examples of two-level quantum systems, we should introduce some notation that will make our calculations easier. To simplify many calculations in quantum mechanics, Dirac introduced a shorthand notation that now bears his name. It puts emphasis on the overlap integral of two wave functions ψ(x) and φ(x),

$$\langle \psi | \phi \rangle = \int \psi^{*}(x)\,\phi(x)\,dx, \tag{6.15}$$

which is the probability amplitude to observe the particle in state ψ(x) when it is initially prepared in state φ(x), as discussed in Eq. (6.9) above. In Dirac notation, the integral on the right is written in the form shown on the left. More generally, the integral operation $\langle \psi | \phi \rangle$ denotes:

1. Take the complex conjugate of the object in the first position ($\psi \rightarrow \psi^{*}$).

2. Integrate the product $\psi^{*}\phi$.

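This operation has the following simple properties. If a is any complex number and the functions ψ, φ satisfy

$$\int \psi^{*}(x)\,\phi(x)\,dx < \infty, \tag{6.16}$$

then the following relations hold:

$$\langle \psi | a\phi \rangle = a\,\langle \psi | \phi \rangle, \tag{6.17}$$

$$\langle a\psi | \phi \rangle = a^{*}\,\langle \psi | \phi \rangle, \tag{6.18}$$

$$\langle \psi | \phi \rangle^{*} = \langle \phi | \psi \rangle, \tag{6.19}$$

$$\langle \psi + \phi | = \langle \psi | + \langle \phi |, \tag{6.20}$$

and

$$
\begin{aligned}
\int (\psi_1 + \psi_2)^{*}(\phi_1 + \phi_2)\,dx &= \langle \psi_1 + \psi_2 | \phi_1 + \phi_2 \rangle \\
&= \left(\langle \psi_1 | + \langle \psi_2 |\right)\left(| \phi_1 \rangle + | \phi_2 \rangle\right) \\
&= \langle \psi_1 | \phi_1 \rangle + \langle \psi_1 | \phi_2 \rangle + \langle \psi_2 | \phi_1 \rangle + \langle \psi_2 | \phi_2 \rangle.
\end{aligned}
\tag{6.21}
$$

The object $\langle \psi |$ (called a "bra") joins with the object $| \phi \rangle$ (called a "ket") to form a "bracket", $\langle \psi | \phi \rangle$, which is a complex number representing the complex overlap of the two wave functions (or states). Note that the action of an operator on a Dirac bra or ket follows from its action on the wave function. For example, the position operator acts by multiplying the wave function by x, as in Eq. (6.1). This implies that the expectation value of an operator $\hat{A}$ when the system is known to be in state $| \psi \rangle$ is

$$\langle \hat{A} \rangle = \langle \psi | \hat{A} | \psi \rangle = \int \psi^{*}(x)\,\hat{A}\,\psi(x)\,dx. \tag{6.22}$$

The operator acts to the right on the ket, or equivalently on the wave function ψ(x) and not on its complex conjugate.

For our purposes here, we will only begin to use Dirac notation to simplify the discussion of two-level systems, in which the two possible states, generically denoted $| a \rangle$ and $| b \rangle$, are orthogonal. This implies that the state ket for an arbitrary state can be written in terms of $| a \rangle$ and $| b \rangle$, for example

$$| \psi \rangle = \alpha\,| a \rangle + \beta\,| b \rangle, \tag{6.23}$$

where the expansion coefficients must satisfy $|\alpha|^2 + |\beta|^2 = 1$ for normalization of the wave function,

$$\langle \psi | \psi \rangle = |\alpha|^2 \langle a | a \rangle + |\beta|^2 \langle b | b \rangle + \alpha^{*}\beta\,\langle a | b \rangle + \beta^{*}\alpha\,\langle b | a \rangle = |\alpha|^2 + |\beta|^2 = 1. \tag{6.24}$$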

Here we use the orthonormality of the wave functions, $\langle i | j \rangle = \delta_{i,j}$, where i and j label the different states a and b, and $\delta_{i,j}$ is the Kronecker delta, which is zero for i ≠ j and one for i = j. The formalism of two-level systems can also be expressed in matrix notation, in which the state is represented by a vector. The states $| a \rangle$ and $| b \rangle$ represent basis vectors

$$| a \rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad | b \rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{6.25}$$

along with their corresponding conjugates

$$\langle a | = \begin{pmatrix} 1 & 0 \end{pmatrix}, \qquad \langle b | = \begin{pmatrix} 0 & 1 \end{pmatrix}. \tag{6.26}$$

6.3.2 Photon beam path

Consider a photon confined to occupy a beam path. Such a system is readily created in the laboratory using nonlinear optics. The photon is incident on a beam splitter, which transmits a photon with probability $T = |t|^2$ and reflects a photon with probability $R = |r|^2$. Note that since the photon must go somewhere, either transmitted or reflected, we have R + T = 1, which is a statement of conservation of probability, photon number, or energy (all equivalent). We label the input modes a and b and the output modes c and d, as shown in Fig. 6.6. Denoting the state of a photon occupying a given mode j = a, b, c, d by $| j \rangle$, the state of the photon before the beam splitter is given by

$$| \psi_{\text{initial}} \rangle = | a \rangle. \tag{6.27}$$

The shorthand notation $| j \rangle$ implies that the photon is localized in mode j = a, b, c, d, which could also be represented with an appropriately defined spatial wave function. The state at the output of the beam splitter is given by a superposition of the photon having been transmitted or reflected, with appropriate weighting,

$$| \psi_{\text{out}} \rangle = t\,| c \rangle + i r\,| d \rangle. \tag{6.28}$$

The factor of i arises from the π/2 phase shift between transmission and reflection from a surface. (For a more in-depth derivation of the beam splitter input-output relations, please see the excerpt (Beam splitter relations) from R. Loudon's book The Quantum Theory of Light on the course website.) We can thus use the matrix representation of the two-level system to write the input-output relations for the beam splitter as

$$\begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} t & i r \\ i r & t \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}. \tag{6.29}$$
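The matrix form of Eq. (6.29) is easy to experiment with numerically. The following sketch represents the kets of Eq. (6.25) as NumPy vectors and builds the beam splitter matrix from a chosen transmission probability T. The helper name `beam_splitter`, the restriction to real, non-negative t and r, and the example value T = 0.3 are assumptions made for illustration only.

```python
import numpy as np

# Basis kets of Eq. (6.25) written as vectors.
ket_a = np.array([1.0, 0.0], dtype=complex)
ket_b = np.array([0.0, 1.0], dtype=complex)


def beam_splitter(T):
    """Matrix of Eq. (6.29) for transmission probability T.

    Assumption for this sketch: t and r are taken real and non-negative,
    with t = sqrt(T) and r = sqrt(1 - T), so that R + T = 1.
    """
    t = np.sqrt(T)
    r = np.sqrt(1.0 - T)
    return np.array([[t, 1j * r],
                     [1j * r, t]])


B = beam_splitter(0.3)          # assumed example value of T

# A photon entering in mode a leaves with amplitudes (t, i r) in (c, d),
# which is the output state of Eq. (6.28).
psi_out = B @ ket_a

# R + T = 1 makes the matrix unitary, so the output state stays normalized.
is_unitary = np.allclose(B.conj().T @ B, np.eye(2))
norm = np.vdot(psi_out, psi_out).real
```

Unitarity of the matrix is the matrix-notation counterpart of the statement R + T = 1: whatever normalized state enters, a normalized state leaves.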

Figure 6.6: Beam splitter input and output mode labels. Two inputs, a and b, are transformed into two output modes, c and d, by the beam splitter.

The transmission (t) and reflection (r) coefficients correspond to amplitudes (not intensities) and generally depend on many factors such as frequency, incidence angle, and polarization. However, beam splitters are often designed so that the transmission and reflection coefficients take on specific values. For example, a 50:50 beam splitter transmits and reflects equal amounts, leading to $t = r = 1/\sqrt{2}$. The probability to find the photon in output mode c is thus given by the modulus squared of the overlap between the state representing c and the output state,

$$
\begin{aligned}
P_c = \left|\langle c | \psi_{\text{out}} \rangle\right|^2 &= \left|\langle c |\left(t\,| c \rangle + i r\,| d \rangle\right)\right|^2 \\
&= \left| t\,\langle c | c \rangle + i r\,\langle c | d \rangle \right|^2 \\
&= |t|^2 = T,
\end{aligned}
\tag{6.30}
$$

where we use the orthonormality of the states $| c \rangle$ and $| d \rangle$ to simplify the second line. Note that the probability to find the particle in mode c is given by the expectation value of the operator

$$\hat{\Pi}_c = | c \rangle \langle c |. \tag{6.31}$$

This operator projects the state of the photon into mode c,

$$\hat{\Pi}_c\,| \psi_{\text{out}} \rangle = | c \rangle \langle c | \psi_{\text{out}} \rangle = | c \rangle \langle c |\left(t\,| c \rangle + i r\,| d \rangle\right) = t\,| c \rangle, \tag{6.32}$$

and is thus known as a projection operator. It corresponds to a measurement of the photon in mode c. A similar projector exists for mode d. The probability to find the photon in a particular mode is thus given by the expectation value of a projector onto that mode, for example

$$P_c = \langle \psi | \hat{\Pi}_c | \psi \rangle = \langle \psi | c \rangle \langle c | \psi \rangle = \langle c | \psi \rangle^{*} \langle c | \psi \rangle = \left|\langle c | \psi \rangle\right|^2. \tag{6.33}$$
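The projector relations in Eqs. (6.31) to (6.33) can be illustrated with small matrices. In the sketch below the projector onto a mode is built as an outer product, and the detection probabilities are evaluated as expectation values. The helper name `projector` and the example amplitudes (with $t^2 = 0.3$) are assumptions chosen for this illustration.

```python
import numpy as np

# Output-mode kets |c> and |d> as vectors, following Eq. (6.25).
ket_c = np.array([1.0, 0.0], dtype=complex)
ket_d = np.array([0.0, 1.0], dtype=complex)


def projector(ket):
    """Projection operator |ket><ket| of Eq. (6.31) as a 2x2 matrix."""
    return np.outer(ket, ket.conj())


# Assumed example amplitudes satisfying t^2 + r^2 = 1.
t, r = np.sqrt(0.3), np.sqrt(0.7)
psi_out = t * ket_c + 1j * r * ket_d        # output state of Eq. (6.28)

Pi_c = projector(ket_c)
Pi_d = projector(ket_d)

# Eq. (6.32): the projector keeps only the mode-c component of the state.
projected = Pi_c @ psi_out

# Eq. (6.33): probabilities as expectation values <psi|Pi|psi>.
# np.vdot conjugates its first argument, which supplies the bra.
P_c = np.vdot(psi_out, Pi_c @ psi_out).real
P_d = np.vdot(psi_out, Pi_d @ psi_out).real

# Projecting twice changes nothing, and the two projectors together
# account for every possible outcome.
idempotent = np.allclose(Pi_c @ Pi_c, Pi_c)
complete = np.allclose(Pi_c + Pi_d, np.eye(2))
```

Here `P_c` and `P_d` reproduce T and R, and their sum is one because the two projectors add up to the identity.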

From Eq. (6.28) we see that the beam splitter creates a superposition state of the photon being in two modes (or states), which is a wave phenomenon (superposition of amplitudes). We can observe this wave behavior by interfering the two output paths of the beam splitter, creating a Mach-Zehnder interferometer as in Fig. 6.7. If we vary the path length of the lower interferometer path by a distance ΔL (note that this is not the amount the mirror moves, but is related to it through the angle of reflection), we introduce a phase difference Φ = kΔL between the upper and lower paths. This gives the following input on the second beam splitter,

$$| \psi_{\text{in}} \rangle = t_1\,| c \rangle + e^{i\Phi}\, i r_1\,| d \rangle, \tag{6.34}$$

where $t_1$ ($r_1$) is the transmission (reflection) coefficient of the input beam splitter. The output state of the interferometer is given by

$$| \psi_{\text{out}} \rangle = \left(t_1 t_2 - r_1 r_2\,e^{i\Phi}\right)| e \rangle + i\left(t_1 r_2 + r_1 t_2\,e^{i\Phi}\right)| f \rangle, \tag{6.35}$$

where we note that the transmission and reflection coefficients of the two beam splitters could differ. For 50:50 beam splitters, in which $t = r = 1/\sqrt{2}$, this simplifies to

$$
\begin{aligned}
| \psi_{\text{out}} \rangle &= \frac{1}{2}\left(1 - e^{i\Phi}\right)| e \rangle + \frac{i}{2}\left(1 + e^{i\Phi}\right)| f \rangle \\
&= -i e^{i\Phi/2}\left(\sin\left(\frac{\Phi}{2}\right)| e \rangle - \cos\left(\frac{\Phi}{2}\right)| f \rangle\right).
\end{aligned}
\tag{6.36}
$$

Figure 6.7: Mach-Zehnder interferometer. Two input modes, a and b, are interfered on the first beam splitter, which has transmission and reflection coefficients $t_1$ and $r_1$, respectively. Two mirrors direct the output modes of the first beam splitter, c and d, to the inputs of the second beam splitter, which has transmission and reflection coefficients $t_2$ and $r_2$, respectively. The path lengths of the two interior arms (or paths) of the interferometer are initially balanced (equal), but the lower path (c) can be adjusted to introduce an additional path length difference, ΔL, and thus a phase difference Φ = kΔL. The outputs of the second beam splitter, e and f, will generally be superpositions of the input modes, dependent upon the phase difference Φ between the two interior interferometer arms.

Now, suppose that instead of letting the amplitudes recombine on the second beam splitter, we decide to find out which path the photon takes through the interferometer by looking just before the input of the second beam splitter. What happens to the output state? Well, if the photon is observed in mode c, the state in Eq. (6.34) collapses into just $| c \rangle$, and similarly for d. The output state will then be just $| c \rangle \rightarrow t_2\,| e \rangle + i r_2\,| f \rangle$ or $| d \rangle \rightarrow i r_2\,| e \rangle + t_2\,| f \rangle$, respectively. The interference fringes that depend upon the phase Φ disappear and the wave aspect is washed out. This demonstrates the concept of wave-particle duality and complementarity in a new situation.

Note that for a 50:50 beam splitter the following two states are invariant under the beam splitter transformation (i.e. they are eigenstates of the beam splitter transformation):

$$| + \rangle = \frac{1}{\sqrt{2}}\left(| a \rangle + | b \rangle\right), \tag{6.37}$$

and

$$| - \rangle = \frac{1}{\sqrt{2}}\left(| a \rangle - | b \rangle\right). \tag{6.38}$$

You should verify that if one of these states is put into a beam splitter, then it emerges from the beam splitter in the same state, unchanged up to an unobservable global phase. These states are also orthonormal, so that $\langle + | - \rangle = \langle - | + \rangle = 0$ and $\langle + | + \rangle = \langle - | - \rangle = 1$.

6.3.3 Spin-1/2 system

The two paths of a beam splitter or interferometer form a perfectly acceptable two-level system, but really only represent a subset of all possible spatial states that a photon can occupy. For example, there are an infinite number of input modes for a beam splitter that differ by their transverse mode shape. A common example of a naturally occurring two-level system is a spin-1/2 system; for example, an electron and certain atoms have total spin 1/2. The concept of particle spin was first introduced by Wolfgang Pauli in 1924, as a two-valued quantum degree of freedom associated with the outer electrons of an atom. This allowed him to formulate the exclusion principle, i.e. that two electrons cannot occupy the same quantum state, and to describe the structure of the periodic table.
(NOTE: When we speak of a quantum state of an electron in an atom, we need only specify a full list of quantum numbers describing the energy level, orbital angular momentum value, orbital angular momentum projection onto the z-axis, spin, and spin projection along the z-axis. These quantum numbers specify the particular wave function of the electron. This is similar to the way the quantum number $n$ specified the wave function $\psi_n(x, t)$ for the particle in a box from Chapter 5.)

The nature of this additional degree of freedom was not initially identified. In 1925, Ralph Kronig, and George Uhlenbeck and Samuel Goudsmit, suggested that Pauli's additional degree of freedom is associated with the self-rotation of the electron, and thus with an intrinsic angular momentum. Although strictly speaking this picture is incorrect, since the particle would have to rotate much faster than relativity allows, it does give the correct line of thinking. Spin, just as electric charge, is an intrinsic property of elementary particles (electrons, quarks, and photons for example) and composite particles (protons and neutrons for example), as well as atoms, and is associated with intrinsic angular momentum. For elementary particles, with no known substructure, spin cannot be explained by postulating that such particles are composed of smaller particles rotating about a common center of mass. The spin of elementary particles is a truly intrinsic physical property.

From experimental observation we find that elementary particles in nature only have integer ($s = 0, 1, 2, 3, \ldots$, known as bosons) or half-integer ($s = 1/2, 3/2, 5/2, \ldots$, known as fermions) spin. The expectation value of the total spin vector squared is given by

$$\langle \hat{S}^2 \rangle = \hbar^2 s(s + 1), \tag{6.39}$$

where $s$ is often just called the particle spin. In quantum mechanics, the projection of spin angular momentum measured along any Cartesian coordinate axis, $x$, $y$, or $z$, can only take on quantized multiples of $\hbar$. For example, we commonly choose to talk about the spin projection along the z-axis as a matter of convention. The possible values for this spin projection are given by

$$\langle \hat{S}_z \rangle = m_s \hbar, \tag{6.40}$$

where $m_s = -s, -(s - 1), \ldots, s - 1, s$, which for a spin-1/2 system gives $m_s = -1/2, 1/2$.
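The counting rule for $m_s$ is easy to tabulate. The sketch below is only an illustration of Eqs. (6.39) and (6.40), working in units where $\hbar = 1$ and using exact fractions so that half-integer values print cleanly. The helper names `spin_projections` and `total_spin_squared` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction


def spin_projections(s):
    """Return the allowed m_s values -s, -s+1, ..., s for spin quantum number s."""
    s = Fraction(s)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("s must be a non-negative integer or half-integer")
    count = int(2 * s) + 1  # there are 2s + 1 projections
    return [-s + k for k in range(count)]


def total_spin_squared(s):
    """Return s(s+1), i.e. the value of <S^2> in units of hbar^2 (Eq. 6.39)."""
    s = Fraction(s)
    return s * (s + 1)


for s in (Fraction(1, 2), 1, Fraction(3, 2)):
    ms = ", ".join(str(m) for m in spin_projections(s))
    print(f"s = {s}: m_s in {{{ms}}}, <S^2>/hbar^2 = {total_spin_squared(s)}")
```

A spin-$s$ system always has $2s + 1$ projections, so spin 1/2 gives two values and spin 1 gives three, which is the distinction that matters for the Stern-Gerlach experiment.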
Quantization of spin angular momentum is a natural extension of the concept of orbital angular momentum quantization as proposed by Bohr in his model of the hydrogen atom. However, notice that the orbital angular momentum may only take on integer multiples of $\hbar$, whereas spin can also have half-integer multiples in the case of fermions. Particles with charge may also possess an intrinsic magnetic dipole moment associated with the spin. The naive idea is that a spinning charge distribution constitutes a circulating current, which leads to a magnetic dipole. Just as the projection of spin angular momentum along a measurement axis is quantized, so too is the associated magnetic dipole moment. The first direct experimental evidence for the quantization of electron spin was the Stern-Gerlach experiment, which set out to demonstrate the quantization of orbital angular momentum associated with the motion of an electron orbiting an atom. However, as we will see below, the anticipated interpretation was incorrect because the atomic species they chose, silver, has zero orbital angular momentum.

Stern-Gerlach Experiment

In 1922, two German physicists in Frankfurt, Otto Stern and Walter Gerlach, were attempting to demonstrate the prediction made by Arnold Sommerfeld and Paul Ehrenfest in 1913 that the projection of orbital angular momentum along a particular measurement axis should be quantized. However, it was not immediately clear that their results actually showed the existence of electron spin and its quantization.

Figure 6.8: Schematic of the Stern-Gerlach experiment. An oven emits silver atoms with a range of velocities that are subsequently filtered using a pair of slits to give a fairly uniform atomic beam with velocity $v_x$. The atomic beam is passed through a non-uniform magnetic field produced by a pair of magnets. The magnetic moment of the outer-shell electron experiences a force due to the magnetic field gradient. The direction and magnitude of the force depend on the projection of the magnetic moment onto the z-axis. Classically, one expects a continuous range of magnetic moment orientations along the z-axis. However, quantum theory predicts quantized values of the magnetic moment projection, which is what is experimentally observed.

The Stern-Gerlach experiment, depicted in Fig. 6.8, consists of a beam of neutral silver atoms traveling in the x-direction with velocity $v_x$ and directed through an inhomogeneous magnetic field $B_z(z)$ oriented in the z-direction. Stern and Gerlach anticipated that the magnetic moment of the outer-shell electron orbiting the silver atoms should experience a force due to the inhomogeneous magnetic field, and that, due to the quantization of orbital angular momentum along the z-axis, one should observe three deflected positions for the beam.

Prior to discussing the results of the Stern-Gerlach experiment, let us first go through the classical and quantum predictions. Assuming that the electron orbiting the nucleus occupies a circular orbit of radius $r$, the magnetic moment associated with this motion is given by

$$\begin{aligned}
\boldsymbol{\mu} &= I \mathbf{A} \\
&= -\left(\frac{e v}{2\pi r}\right)\left(\pi r^2\right)\hat{\mathbf{n}} \\
&= -\frac{e}{2m}\, m v r\, \hat{\mathbf{n}} \\
&= -\frac{e}{2m}\, L\, \hat{\mathbf{n}} \\
&= -\frac{e}{2m}\, n\hbar\, \hat{\mathbf{n}} \\
&= -n \mu_B \hat{\mathbf{n}},
\end{aligned} \tag{6.41}$$

where $I$ is the current associated with the electron orbiting the nucleus, given by the electron charge, $-e$, multiplied by the speed $v$ and divided by the circumference of the orbit, $2\pi r$. $\mathbf{A}$ is the vector associated with the area of the electron orbit, with unit direction vector $\hat{\mathbf{n}}$ normal to the surface area as depicted in Fig. 6.9. In going from line 3 to line 4 we use the relationship between angular momentum and momentum for a circular orbit of radius $r$, i.e. $L = mvr = rp$. In the second-to-last line we have used the quantization of angular momentum, $L = n\hbar$, where $n = 1, 2, 3, \ldots$, and in the last line we introduce the Bohr magneton, defined as

$$\mu_B = \frac{e\hbar}{2m}, \tag{6.42}$$

which represents the fundamental unit of magnetic moment. A magnetic dipole moment $\boldsymbol{\mu}$ placed in a magnetic field $\mathbf{B}$ will experience a torque

$$\boldsymbol{\tau} = \boldsymbol{\mu} \times \mathbf{B}. \tag{6.43}$$

Depending on the orientation of the magnetic moment with respect to the magnetic field, the system will have a different energy associated with this interaction, as depicted in Fig. 6.10. We can define a potential energy associated with this interaction

$$U = -\boldsymbol{\mu} \cdot \mathbf{B}. \tag{6.44}$$

Here we see that a magnetic moment aligned along the magnetic field has the lowest possible energy, while a magnetic moment aligned anti-parallel with the magnetic field has the largest energy.

Figure 6.9: Pictorial representation of the magnetic moment associated with the orbital motion of an electron ($e^-$) in a circular orbit of radius $r$ around the nucleus (+). The electron has velocity $\mathbf{v}$, sweeps out an area $A = \pi r^2$, and the unit vector $\hat{\mathbf{n}}$ corresponds to the direction of orbital angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$. The magnetic moment, $\boldsymbol{\mu}$, points in the opposite direction owing to the negative sign of the electronic charge.

Now, if the magnetic field is not homogeneous, not only will there be a difference in energy associated with different orientations, but also a net force on the magnetic dipole proportional to the gradient of the magnetic field. This can be viewed in terms of a force due to a potential energy, $\mathbf{F} = -\nabla U$, leading to the following force on the magnetic dipole in the Stern-Gerlach experiment

$$\mathbf{F} = -\nabla U = \nabla(\boldsymbol{\mu} \cdot \mathbf{B}) = (\boldsymbol{\mu} \cdot \nabla)\mathbf{B} = \mu_z (\partial_z B_z)\hat{\mathbf{z}}, \tag{6.45}$$

where we have made use of the fact that the magnetic moment is assumed to be constant, and the following vector calculus identity

$$\nabla(\mathbf{A} \cdot \mathbf{B}) = (\mathbf{A} \cdot \nabla)\mathbf{B} + (\mathbf{B} \cdot \nabla)\mathbf{A} + \mathbf{A} \times (\nabla \times \mathbf{B}) + \mathbf{B} \times (\nabla \times \mathbf{A}). \tag{6.46}$$

We have also used the fact that the curl of the magnetic field is zero, since there is no free current or time-varying electric field, and that the magnetic field is oriented and varies only along the z-direction.

The force acts on the beam only during the time that the atoms pass through the magnetic field gradient, which is equal to $t = L/v_x$, where $L$ is the magnet length and $v_x$ is the velocity in the x-direction of the particles in the beam. Due to the magnetic force on the particle in Eq. (6.45), the beam will gain some transverse velocity in the z-direction and will be deflected at an angle $\theta$ on the output. The transverse velocity will be given by the kinematic relations

$$v_z = a_z t = \frac{F_z}{m}\, t = \frac{\mu_z (\partial_z B) L}{v_x m}, \tag{6.47}$$

where $m$ is now the mass of the atom, not the electron mass that appears in $\mu_B$. For small angles, this leads to a deflection angle

$$\theta = \frac{v_z}{v_x} = \frac{\mu_z (\partial_z B) L}{v_x^2 m}, \tag{6.48}$$

as sketched in Fig. 6.11.

Figure 6.10: Energy dependence of magnetic dipole moment orientation in a uniform external magnetic field ($\mathbf{B}$). The lowest energy is associated with a magnetic dipole moment oriented parallel to (with) the magnetic field (on the left), whereas the highest energy is associated with a magnetic dipole moment oriented anti-parallel to (against) the magnetic field (on the right).

Figure 6.11: The deflection angle $\theta$ from the Stern-Gerlach apparatus is related to the initial velocity in the x-direction and the acquired velocity component in the z-direction, $v_z$, due to the interaction of the electron magnetic moment and the inhomogeneous magnetic field.

Classical prediction
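To get a feel for Eq. (6.48), it can be evaluated numerically for a magnetic moment of fixed magnitude tilted at different angles to the z-axis. The sketch below is only illustrative: the field gradient, magnet length, and beam speed are assumed values chosen for convenience and are not taken from the text, the moment magnitude is set to one Bohr magneton, and the helper names `deflection_angle` and `classical_deflections` are hypothetical.

```python
import math

MU_B = 9.274e-24               # Bohr magneton in J/T
ATOMIC_MASS_UNIT = 1.6605e-27  # kg
SILVER_MASS = 107.87 * ATOMIC_MASS_UNIT  # mass of one silver atom in kg

# Illustrative apparatus values (assumptions, not values from the text).
FIELD_GRADIENT = 1.0e3  # dB_z/dz in T/m
MAGNET_LENGTH = 0.035   # L in m
BEAM_SPEED = 550.0      # v_x in m/s


def deflection_angle(mu_z, gradient=FIELD_GRADIENT, length=MAGNET_LENGTH,
                     mass=SILVER_MASS, v_x=BEAM_SPEED):
    """Small-angle deflection of Eq. (6.48): theta = mu_z (dB/dz) L / (m v_x^2)."""
    return mu_z * gradient * length / (mass * v_x**2)


def classical_deflections(num_orientations=9):
    """Deflections for a moment of magnitude MU_B at evenly spaced tilt angles.

    The tilt angle alpha is measured from the z-axis, so mu_z = MU_B * cos(alpha).
    """
    angles = [math.pi * k / (num_orientations - 1) for k in range(num_orientations)]
    return [(alpha, deflection_angle(MU_B * math.cos(alpha))) for alpha in angles]


for alpha, theta in classical_deflections():
    print(f"alpha = {math.degrees(alpha):6.1f} deg -> theta = {1e3 * theta:+.2f} mrad")
```

Because the tilt angle can take any value in this picture, the deflection fills the whole interval between the two extreme values rather than taking a few discrete values.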

Equation (6.48) implies that the beam is deflected through a range of angles proportional to the projection of the atomic magnetic moment onto the z-axis. Silver has only one valence electron that can contribute to the orbital angular momentum (the filled subshells are completely symmetric and therefore do not contribute to the angular momentum). If we assume a classical model of the atom, in which the electron moves in a circular orbit with arbitrary orientation (the projection onto the z-axis is given by a sinusoidal distribution), we see that the deflection angle should vary smoothly over a range of values, as depicted in Fig. 6.12. Classically, the projection of the magnetic moment onto the z-axis is not quantized.

Figure 6.12: Predicted angular deflection probability distributions for classical (top), quantum spin-1 (middle), and quantum spin-1/2 (bottom) models. The classical distribution is continuous across the deflection angles $\theta$. The spin-1 model predicts three discrete peaks, while the spin-1/2 model predicts only two peaks. Stern and Gerlach observed two peaks, indicating that the electron does indeed have quantized spin.

Stern-Gerlach prediction

The results that Stern and Gerlach were hoping to obtain were based on the idea that the orbital angular momentum of the electron in the silver atoms was quantized. With only one electron in the valence shell, they anticipated a total orbital angular momentum of $\hbar$. This would then imply that the magnetic moment is also quantized, as in Eq. (6.41), and would have three projections onto the z-axis, leading to three deflection angles (one for each of the three projections of the orbital angular momentum onto the z-axis, $n = +1, 0, -1$)

$$\theta_n = -\frac{\mu_B\, n\, (\partial_z B) L}{v_x^2 m}. \tag{6.49}$$

However, the theory at the time (due to Sommerfeld) predicted two lines, which corresponds to the experimental results, which showed only two deflected paths ("up" and "down") with no straight-through path. At the time, they were satisfied that they had indeed shown the quantization of angular momentum. However, the interpretation that the magnetic moment causing the deflection was due to the orbital angular momentum of the electron was incorrect. Recall that I mentioned briefly that the Bohr model predicts the wrong value of orbital angular momentum for an electron in the ground state: it predicts $L = \hbar$ for the ground state, but it is actually 0! The same holds for the ground state of silver atoms. It was only years later, when the concept of particle spin was introduced, that a satisfactory explanation of the Stern-Gerlach experiment arose.

Stern-Gerlach spin-1/2 description

The total magnetic moment of the silver atoms in the Stern-Gerlach experiment is given by a vector sum of the contributions due to the nuclear spin, electron orbital angular momentum, and electron spins. This turns out to give a spin-1/2 system. Thus the correct interpretation of the Stern-Gerlach experiment is not in terms of the quantization of the orbital angular momentum of the valence electron, but of the quantization of all contributions to the angular momentum and thus the magnetic moment. For a spin-1/2 system there are only two possible projections onto the z-axis ($m = \pm 1/2$), leading to two deflection angles

$$\theta = \pm\frac{\mu_B (\partial_z B) L}{v_x^2 m}. \tag{6.50}$$

This explanation matches the observed behavior as depicted in Fig. 6.12 and was the first direct experimental observation of electron spin, although this was not known at the time. When a particle is detected in the z+ deflected direction, it is said to have spin up, while a particle in the z− deflected direction is said to have spin down. The Stern-Gerlach apparatus measures the projection of the magnetic moment (and thus spin) along the magnetic field gradient.

Subsequent measurements

We now want to consider sequential Stern-Gerlach measurements, in which the atomic beam goes through two or more SG magnets. We begin by considering a beam of unpolarized spin-1/2 atoms emitted from an oven that pass through an SG apparatus with the field gradient aligned along the z-axis, which we denote SGz, as depicted in Fig. 6.13. At the output of the first SG apparatus, the atoms will have split into two beams with equal numbers of atoms, each with a well-defined spin projection onto the z-axis, denoted by z+ and z−. We denote the spin states associated with these by $|S_z; \pm\rangle$, and the measurement performed by the SGz apparatus by $\hat{S}_z$, which gives

$$\hat{S}_z |S_z; \pm\rangle = \pm\frac{\hbar}{2} |S_z; \pm\rangle. \tag{6.51}$$

Note that these states are orthonormal, that is

$$\langle S_z; i | S_z; j \rangle = \delta_{i,j}, \tag{6.52}$$

where the Kronecker delta symbol is 0 for non-identical indices (+ or −), and 1 for identical indices.

If we block the z− output, and send the z+ beam into a second SG apparatus with the same magnetic-field gradient orientation (i.e. we use another SGz setup), then we will only see one beam emerge from the second SG in the z+ port, with nothing coming out of z−. This is not too surprising if we think of the atomic spins as all being aligned in the up state before entering the second SGz setup. What happens if we rotate the SG apparatus by 180° around the beam axis, so that the z+ and z− outputs are flipped? It turns out that we will again see only one beam emerge, but this time all the atoms will come out from the z− port at the top of the apparatus. (To understand why this is the case requires further mathematics to describe how spin-1/2 systems transform under spatial rotations. This is quite different from the rotation properties of vectors with which most students are familiar. I will not go into detail about this, but the interested student can see, for example, J. J. Sakurai's book, Modern Quantum Mechanics, for a nice description of the rotation properties of spin-1/2 systems.)

Another interesting situation, depicted in Fig. 6.13, consists of an initial SG apparatus again with its magnetic field gradient aligned along z, but with the second SG apparatus aligned along the x-direction, denoted SGx. The z+ polarized beam that enters the second SG setup is split into two beams at the output with well-defined spin projections onto the x-axis, denoted by x+ and x−. It is tempting to assume that the beam output from SGx has atoms with well-defined projections of spin onto both the z-axis and the x-axis simultaneously. However, this is not correct: as we will see, measurement of the spin component along z is incompatible with measurement along x. This is analogous to trying to measure both the position and momentum of a particle simultaneously. We can only measure the spin component along one axis, and subsequent measurement of the spin projection along another axis will give a random value. This means that we cannot obtain simultaneous knowledge of the spin projection along both the x- and z-axes.

Figure 6.13: Sequential Stern-Gerlach (SG) measurements for different SG apparatus orientations. An oven emits unpolarized (randomly oriented spin) atoms into an SG apparatus with its field gradient oriented along the z-axis (denoted SGz). Blocking the z− port, we create an atomic beam with well-defined spin projection along the z-axis in the state $|S_z; +\rangle$. If we pass this through a second SGz setup, we only find $|S_z; +\rangle$ again (top). However, passing the z+ polarized atoms through an SG apparatus with its field gradient along the x-axis, we find both x+ and x− at the output (middle). If we further take only the $|S_x; +\rangle$ output from the SGx setup, and pass this through an SGz, we find both z+ and z− polarized atoms at the output (bottom).

To understand the incompatibility of measuring the spin projection along the z-axis or x-axis, we need to determine the spin state associated with the SGx setup in terms of the eigenstates of the z-axis spin measurement $\hat{S}_z$, i.e. $|S_z; \pm\rangle$. Again, this requires an understanding of how spin-1/2 systems transform under rotations, and I will only state the result. The spin eigenstates for an SG apparatus rotated by an angle $\theta$ with respect to the z-axis, which we will denote $|S_\theta; \pm\rangle$, are given by

$$|S_\theta; +\rangle = \cos\left(\frac{\theta}{2}\right)|S_z; +\rangle + \sin\left(\frac{\theta}{2}\right)|S_z; -\rangle, \tag{6.53}$$

and

$$|S_\theta; -\rangle = \sin\left(\frac{\theta}{2}\right)|S_z; +\rangle - \cos\left(\frac{\theta}{2}\right)|S_z; -\rangle. \tag{6.54}$$

From these we can see that the spin eigenstates associated with projection