| HPS 0410 | Einstein for Everyone |
John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh
Background Reading: J. P. McEvoy, Introducing Quantum Theory. Totem. This book covers very similar ground to this chapter, but in greater detail. Read as much as you like!
Each of the theories we have dealt with so far shows us how classical theories break down when we proceed to realms remote from common experience. Classical Newtonian physics fails when we have systems that travel very fast, that involve very strong gravity, or that span very large distances. Special relativity prevails in domains of very high speeds; general relativity in domains of very strong gravitation; relativistic cosmology over enormous distances.
Classical Newtonian physics also breaks down when we consider very small systems, such as individual atoms and the particles from which they are made. Quantum theory gives us our best account of nature in the very small. The standard quantum theory we shall consider here makes no changes to the ideas of space and time of relativity theory. Most standard quantum theories are formulated within spaces and times that conform to Einstein's special theory of relativity or even just to Newton's account. While some versions of quantum theory are set within the spacetimes of general relativity, a complete adaptation of quantum theory and Einstein's general theory of relativity remains beyond our grasp.
Quantum theory is a theory of matter; or more precisely it is a theory of the small components that comprise familiar matter. The ordinary matter of tables and chairs, omelettes and elephants is made up of particles, like electrons, protons and neutrons. Quantum theory provides us our best account of these particles. It also provides us with an account of matter in the form of radiation, such as light. It is commonly known that light somehow consists both of light waves and also particle-like photons. The notion of these photons comes from quantum theory (and from Einstein directly, who first introduced them in 1905 as "light quanta").
The central novelty of quantum theory lies in the description of the state of these particles. It turns out that this state does not coincide perfectly with any state we are familiar with from classical physics. In some ways, the particles of quantum theory are like little tiny points of matter, as the name "particle" suggests. In others, they are like little bundles of waves. A full account requires us to see that fundamental particles have properties of both at the same time. There is no easy way to visualize this necessary combination; indeed there may be no fully admissible image at all. The problem of arriving at it remains a challenge today. That problem, however, has proved to be no obstacle to the theory itself. Modern quantum theory has enjoyed enormous empirical success, accounting for a huge array of phenomena and making striking predictions.
From a philosophical perspective, the principal difficulty presented by quantum theory is that the picture of matter in the small is quite unlike that of matter in the large, the matter of our ordinary experience. Somehow ordinary states of matter have to come out of the combining of many quantum components. Einstein's long standing hesitations about quantum theory ultimately all derive in one way or another from his dissatisfaction with the way his colleagues accommodated ordinary experiences with quantum theory.
Unlike relativity theory, the birth of quantum theory was slow and required many hands. It emerged in the course of the first quarter of the twentieth century with contributions from many physicists, including Einstein.
At the end of the nineteenth century, matter was understood to come in two forms.
One was particles, localized lumps of stuff that flew about like little bullets. The best investigated of the fundamental particles was the electron. Thomson had found in 1897 that the cathode rays found in cathode tubes--the precursor of old fashioned glass TV tubes--were deflected by electric and magnetic fields just as if they were tiny little lumps of electrically charged matter. Atoms, a bound collection of various particles, were also particulate in character.
The other form was wavelike matter. The one well-investigated form was light or, more generally, electromagnetic waves. Newton, along with many others in the seventeenth century, had given accounts of light as consisting of a shower of tiny corpuscles. Although wave accounts had then also been pursued, Newton's corpuscular view remained dominant. That changed at the beginning of the nineteenth century with the exploration of interference effects by Thomas Young and others.
The most celebrated interference effect arises in the two slit experiment. Waves of light (depicted as parallel wavefronts moving up the screen) strike a barrier with two holes in it. Secondary waves radiate out from the two slits and interfere with each other, forming the characteristic cross hatching pattern of interference. These are the same patterns seen on the surface of a calm pond in the ripples cast off by two pebbles dropped in the water.
The essential thing in these interference experiments is the way the waves combine. The patterns arise because the waves can add up in two ways. In constructive interference, the phases of the waves are such that they add to form a combined wave of greater amplitude. The figure shows the greatest possible effect of constructive interference. All the parts of the two waves line up to interfere constructively everywhere.
In destructive interference, the phases are such that the waves subtract to cancel out. The figure shows the greatest possible effect of destructive interference. All parts of the two waves line up in such a way as to interfere destructively everywhere. In ordinary cases of interference, such as the two slit experiments, both destructive and constructive interference happen in different parts of the region where the waves intersect. That leads to the complicated interference patterns seen.
Interference effects are readily understandable if one thinks of a wave as some sort of displacement in a medium. A water wave in the ocean, for example, consists of peaks and troughs where the sea water is displaced above and below the mean sealevel. If two waves meet and both peaks coincide, the result is a peak with their combined height. That is constructive interference. If a peak and trough coincide, then the two can cancel out. That is destructive interference.
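The two extreme cases can be checked numerically. What follows is only a minimal sketch, assuming two sine waves of equal height and wavelength; the function name `combined_peak` and the sampling scheme are illustrative choices, not part of the account above.

```python
import math

def combined_peak(phase_shift, samples=1000):
    """Largest displacement of the sum of two unit sine waves.

    phase_shift (in radians) says how far the second wave is slid
    along relative to the first.
    """
    peak = 0.0
    for i in range(samples):
        x = 2 * math.pi * i / samples
        total = math.sin(x) + math.sin(x + phase_shift)
        peak = max(peak, abs(total))
    return peak

# Peaks line up with peaks: constructive interference.
constructive = combined_peak(0.0)
# Peaks line up with troughs: destructive interference.
destructive = combined_peak(math.pi)
print(constructive, destructive)
```

The first number comes out at twice the height of a single wave; the second is zero up to rounding error in the arithmetic.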
In the nineteenth century, Maxwell found that explanation of interference so compelling that he thought it provided good evidence for an ether. Light, he urged, must be a displacement in something if it is to have peaks and troughs that can cancel out. That something, the carrier of the light wave, is the ether. If light were made up of corpuscles, it seemed impossible that one could combine two corpuscles and have them annihilate.
With the demise of the ether theory, it became clear that something more interesting was at hand. The matter of light itself somehow came in a form that could locally cancel other light waves. That sort of interaction was an early indication of the sorts of interactions that would become commonplace in quantum theory.
This neat division of matter into particle-like and wave-like would not persist. The story of the coming of quantum theory is the story of the breakdown of this division. In the sections to come, we shall see how various clues in the observed physical properties of matter showed that this simple division must fail.
| | Ordinary matter (gases, liquids, solids) | Radiative matter (light, radio waves, heat radiation) |
| --- | --- | --- |
| View at the end of the nineteenth century | Particles | Waves |
| Clue that this was too simple | Discreteness of atomic spectra (and more) | Thermal properties of heat radiation (and more) |
| View with the completion of quantum theory | Both wave and particle properties | Both wave and particle properties |
The first clue that radiation might also have particle-like properties came in 1900. It came in apparently innocuous work on heat radiation. This sort of radiation is familiar to everyone. It is the radiation that warms our hands in front of a fire, that burns the toast and that provides the intense glare of a furnace. Physicists had been measuring how much energy is found in each of the different frequencies (i.e. colors) that comprise heat radiation. That distribution varies with the temperature of the radiation. As a body that emits radiation passes from red to orange to white heat, the frequencies with the greatest energy change correspondingly.
In 1900, as the newest and latest of the data came in, Max Planck in Berlin was working on understanding the physical processes that led to these distributions of energy. His model of heat radiation was of a jumble of many frequencies of electromagnetic waves that have come to equilibrium in a cavity. The waves are absorbed and emitted by oscillating charges in the walls of the cavity. That way, the temperature of the walls could be conveyed to the radiation itself. The cavity really just is an oven and it is filling the space inside with heat radiation. This radiation inside the cavity was known as "cavity radiation."
If a tiny window was opened in the walls of the cavity, the radiation released would also have the temperature of the cavity. Some clever thermodynamic arguments showed that it had exactly the same composition as radiation re-emitted by a body at that same temperature if that body had the special property that it absorbed perfectly all radiation that fell on it, before re-radiating it. Such bodies are called "black"; so that form of radiation is known as "black body radiation."
Planck found a very simple formula that fitted the latest experimental results very well. His problem was to tell a theoretical story about how that formula came about. After some hesitation, he found such a story. However, the essential computation in his story depended upon a very odd assumption. (Debate continues today over whether Planck actually realized how radical this assumption was and how crucial it was to his account.) Planck modelled the heat radiation as coming from energized electric resonators. The odd assumption was that these resonators could take on energy only in whole units, not in continuously varying amounts.
Deciding what those units were proved to be important. The units of energy were tied to the resonant frequency of the resonator. They were given by Planck's formula:
Energy = h x frequency
That means that the allowed energies are (h x frequency), twice (h x frequency), thrice (h x frequency), and so on.
The letter h stands for a new constant of nature introduced by Planck and now called "Planck's constant." This new constant plays the same role in quantum theory that the speed of light plays in relativity theory; it tells us when quantum effects will be important. The number is very small, suggesting that quantum effects are to be expected in the small; for example, for ordinary frequencies, units of energy given by Planck's formula will be very small, so we will not notice the granularity it requires when we look at the larger energies of systems of ordinary experience. (h = 6.62 x 10^-27 erg seconds.)
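To get a feel for how small these units are, here is a rough sketch that applies Planck's formula with the value of h quoted above. The frequency chosen, roughly that of green visible light, and the function names are illustrative assumptions only.

```python
H_ERG_SECONDS = 6.62e-27  # Planck's constant in erg seconds, as quoted in the text

def energy_unit(frequency_hz):
    """Size of one unit of energy: h x frequency, in ergs."""
    return H_ERG_SECONDS * frequency_hz

def allowed_energies(frequency_hz, count):
    """The first few allowed energies: once, twice, thrice ... the unit."""
    unit = energy_unit(frequency_hz)
    return [n * unit for n in range(1, count + 1)]

green_light_hz = 5.5e14  # illustrative value, roughly green visible light
unit = energy_unit(green_light_hz)
ladder = allowed_energies(green_light_hz, 3)
units_per_erg = 1.0 / unit  # how many units make up a single erg

print(unit)
print(ladder)
print(units_per_erg)
```

An erg is already a tiny energy by everyday standards, yet it contains hundreds of billions of these units. That is why the granularity goes unnoticed at ordinary scales.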
Planck's original formula applied to the energy of the resonators. He tried hard to confine the discontinuity it suggested to these resonators and even just to the interaction between radiation and the resonators. Over the next decade, other physicists began to see that the discontinuity could not be confined. Computations analogous to those of Planck from 1900 could be applied to heat radiation directly. They drove to the conclusion that Planck's formula applied directly to heat radiation as well. In each frequency, the energy of heat radiation must come in whole units of h x frequency. That conclusion is hard to reconcile with the idea that heat radiation is purely a wave phenomenon.
While Planck may not have recognized how radical his work of 1900 was, Einstein realized that something very odd was afoot with high frequency light and he did it apparently independently of Planck. In 1905 he argued that we needed to change our basic picture of the constitution of radiation.
High frequency light behaves in certain circumstances as if it were made up of spatially localized bundles of energy, with (once the notation is adjusted) Planck's formula giving the amount of energy in each bundle. So once again light could be seen, in some ways, as a shower of corpuscles, each corpuscle now with energy equal to h x (frequency of light).
The traditional picture inherited from the great achievements of nineteenth century physics was that light is a propagating wave.
What Einstein now urged was that high frequency light sometimes behaved as if it were made up of spatially localized bundles of energy. Planck's formula gave the amount of energy in each bundle. So once again light was said to consist of a shower of corpuscles, each corpuscle now with energy equal to h x (frequency of light).
While this seems like a return to a Newtonian particle view, the return was not and could not be complete. For the wave notion of frequency was part of Einstein's hypothesis. And whatever else may come, the experiments on the interference of light remained.
Einstein's core argument was ingenious. He looked at the observed properties of high frequency light and noticed they were governed in certain aspects by exactly the same laws that govern ordinary gases. By reverse engineering those gas laws, Einstein could show that they depended essentially on gases consisting of very many spatially localized little lumps of matter, their molecules. He supposed that it was no accident that light and gases obeyed the same laws; they did, he urged, because the light really was made of little localized units--called "quanta"--of energy.
For a more detailed account of Einstein's core argument, see the chapter "Atoms and the Quantum," Section 7, "The Light Quantum Paper: Einstein's
Astonishing Idea."
The word "quantum" (plural "quanta") was then just used as a label for a unit of some quantity. In 1905 talk of a light quantum would be understood to be nothing more than talk of a "light unit."
The best known part of Einstein's 1905 paper on the light quantum was an observation made towards the end of the paper. Einstein had been following experiments on the so-called "photoelectric effect." In it, light is used to kick electrons out of an electrically charged cathode. According to the wave theory of light, the intensity of the light ought to determine if the light can generate these "photoelectrons." For more intense light has more energy and energy is what is needed to liberate the electrons held in the cathode's surface.
It is easy to diminish the intensity of light. We can, for example, just move the light source far away so that the light energy it emits is spread over a great area. The expectation from the wave theory is that this dimmed light will lose its ability to liberate photoelectrons.
Experiment had shown, however, that the intensity did not matter to the ability of light to produce photoelectrons. All that mattered was the frequency of the light. If light was of low frequency, it could not generate photoelectrons, even if the light were very intense. If the light had a high frequency it could produce photoelectrons, even if the light was of very low intensity.
This, Einstein observed triumphantly, is just what one would expect if light energy were localized in quanta with energy given by Planck's formula. All one had to assume was that a single quantum was all that was needed to generate each photoelectron.
If the light was of low frequency, its individual quanta would be of low energy, so no one quantum would be energetic enough to knock electrons out of the cathode. Increasing the intensity of the light did nothing more than increase the number of light quanta showering on the cathode, all of them too weak in energy to liberate a photoelectron. If the light was of high frequency, then each light quantum was individually energetic enough to liberate a single photoelectron. The intensity of the light did not matter. Low intensity meant that there were not many light quanta incident on the cathode. But since only one light quantum is needed to liberate just one photoelectron, the effect would be there for high frequency light, no matter how weak the intensity of the light.
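The logic of Einstein's explanation can be laid out in a few lines. This is a deliberately idealized sketch: the threshold energy of 2.3 electron volts is a hypothetical figure chosen for illustration, the frequencies are only roughly those of red and violet light, and the code assumes that every sufficiently energetic quantum frees exactly one electron.

```python
H_EV_SECONDS = 4.136e-15  # Planck's constant in electron-volt seconds

def quantum_energy_ev(frequency_hz):
    """Energy of one light quantum from Planck's formula, in electron volts."""
    return H_EV_SECONDS * frequency_hz

def photoelectrons_per_second(frequency_hz, quanta_per_second, threshold_ev):
    """Idealized count: each sufficiently energetic quantum frees one electron."""
    if quantum_energy_ev(frequency_hz) < threshold_ev:
        # No single quantum is energetic enough, however many arrive.
        return 0
    return quanta_per_second

THRESHOLD_EV = 2.3  # hypothetical energy needed to free one electron

# Very intense low frequency light versus very faint high frequency light.
intense_red = photoelectrons_per_second(4.3e14, 10**20, THRESHOLD_EV)
faint_violet = photoelectrons_per_second(7.5e14, 10, THRESHOLD_EV)
print(intense_red, faint_violet)
```

The intensity enters only through the number of quanta arriving each second. Whether any photoelectrons appear at all is settled by the frequency alone.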
About fifteen years later in 1921, Einstein won the Nobel prize. His work on the photoelectric effect attracted special mention in the award. The citation read "for his services to Theoretical Physics, and especially for his discovery of the law of the photoelectric effect."
If this corpuscular view of light is so successful, do we need the wave view at all? In 1909, Einstein showed that certain phenomena could only be successfully explained if we used both the wave and the particle views; the full observed effect came from the sum of two terms, one a particle term, the other a wave term. The need for both is sometimes called "wave-particle duality."
Many of you will want to use the word "photon" interchangeably with Einstein's "light quantum." There is probably not much harm in doing that as long as you realize that the word "photon" comes from a later era in quantum theory. It was introduced by G. N. Lewis in 1926, 21 eventful years later.
When we use the word photon, the natural presumption is that we are referring to the entity that derives from the completed quantum theory of the 1920s and 1930s. When Einstein proposed his light quanta, not even an Einstein could anticipate quite how radically the emerging quantum theory would diverge from classical ideas. Einstein's proposal of 1905 was quite restricted; he posited that the energy of high frequency light was spatially localized into the little lumps he called light quanta. He could not then know how things would transpire for low frequency light. And his proposal of 1905 did not say anything about the momentum of the light quanta. That light quanta also carry momentum was inferred later.
The analysis of heat radiation and the power of light to generate photoelectrons provided the first clues that this wavelike form of matter was not merely wavelike, but also had particle-like aspects. What of the particles that make up matter? What of the electrons that Thomson had found in 1897? The clue that they also had wavelike aspects eventually derived from observations in atomic spectra.
If gases are energized by heating or passing an electric discharge through them, they emit light. The orange sodium vapor lamps or bright white mercury vapor lamps used in parking lots employ this mechanism in its simplest form. The reverse process also occurs. Gases will absorb light--that is how they can block transmission of light.
One might expect that such emissions (and absorptions) contain all frequencies (colors)--a perfect rainbow--even if the intensity of light across the spectrum might vary. They do not. Gases are very selective in the frequencies they emit and absorb. They will emit and absorb only a few very particular frequencies. The frequencies emitted form what is called the atomic emission spectrum of the element; and those absorbed form the absorption spectrum. The frequencies in them are so distinctive that they can be used as a characteristic signature for identifying an otherwise unknown gas.
Here is the emission spectrum of hydrogen gas. The light emitted by excited hydrogen has been spread out into its component frequencies by passing it through a prism or diffraction grating. The light then darkens a photographic emulsion in different places according to its frequency. The series of lines shown is the so-called "Balmer series" that appears in the visible and near visible frequencies of light. (Wavelengths are shown in units of Angstroms.) From Gerhard Herzberg, Atomic Spectra and Atomic Structure. Prentice-Hall, 1937.
In 1913, Niels Bohr reported on his efforts to devise a model of the process of light emission from the atoms of elements that would explain the very particular frequencies emitted. The problem proved to be far harder than one would expect. Then, the best model of an atom was Rutherford's nuclear model. According to it, an atom is like a little solar system. It has a massive, but tiny, positively charged nucleus. That nucleus exerts an attractive force on lighter negatively charged electrons that orbit it, rather like the way the planets orbit the very massive sun.
In the Rutherford model, exciting a gas by passing high voltage electricity through it would energize the electrons, which could then move further away from the attractive pull of the nucleus. When they fell back towards the nucleus, the energy they gained would be lost as light energy; that emitted light forms the emission spectrum. The first difficulty was that, as they fell back to the nucleus, they would pass through a continuous range of orbital frequencies and thus emit a continuous range of frequencies of light. There was no way to limit the emitted light to just a few special frequencies. The second difficulty was more serious. Nothing stops the emission of energy by the electrons through this process of light emission. They would continue to do it until they crashed into the nucleus. According to classical electrodynamics, this would happen very quickly. It was not clear that Rutherford's model allowed matter made of atoms to exist at all.
Bohr solved both problems with a proposal of breathtaking audacity. Classical electrodynamics was quite clear: an electron orbiting the nucleus is accelerating and therefore must radiate energy. It would be like a little radio transmitter, broadcasting electromagnetic waves. In the process, it must lose energy, fall deeper into the attractive pull of the nucleus and eventually crash into the nucleus. Bohr simply posited that this was not true. Rather, he asserted that there are stable orbits arrayed around the nucleus in which an electron could orbit indefinitely without losing any energy.
Next Bohr supposed that light energy is absorbed or emitted when the orbiting electrons jump up or down between different orbits. When an energized electron jumps down to a lower energy orbit, its extra energy is emitted as light with a single frequency. That single frequency is given by Planck's energy-frequency formula. That is, the energy of the light emitted when an electron jumps down is equal to h times the frequency of light emitted.
Having made those assumptions, Bohr could read off the oddest result from the observed atomic spectra. Since only very few frequencies of light were present, it followed that only very few orbits were permitted for the electron. It was as though our sun allowed a planet to orbit where Venus is and where the Earth is; but it prohibited any planet in between.
All that remained was to figure out just which of the many possible orbits are found in this favored set of stable orbits. That was relatively easy to do. The observed spectra gave a complete catalog of the energy differences between these allowed, stable orbits. Each line in the observed spectra resulted from electrons jumping between two specific orbits. It is a numerical exercise to determine precisely which those few orbits are. The calculation was not so different from this exercise in geography. If we are given the distances between every pair of cities in a country, we can use those data to figure out where on the map each city is found. Atomic spectra gave Bohr the energetic distances between his allowed orbits. From those data he could determine the energies and thus locations of those allowed orbits.
| When Bohr did that, he found a very simple way to summarize just which of the orbits were allowed. They were those whose orbital angular momentum came in units h/2π. Just as Planck's relation told us that radiant energy comes in whole units of h x frequency, Bohr now found that orbiting electrons always must have whole units of angular momentum: one h/2π, two h/2π, three h/2π and nothing in between. | We have seen that the ordinary (linear) momentum of a body is just its mass times its velocity. Angular momentum is an analogous quantity that plays an important role in the dynamics of rotating or orbiting systems. For a small mass like a classical electron orbiting a nucleus, it is defined as the electron's mass x radius of orbit x angular speed of electron. |
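The link between allowed orbits and spectral lines can be illustrated by running Bohr's reasoning forwards. The sketch below assumes the standard Bohr-model result for hydrogen, that the orbit with n units of angular momentum has energy -13.6 electron volts divided by n squared. That formula is not derived in this chapter, and the constants and function names are likewise supplied for illustration.

```python
HC_EV_ANGSTROM = 12398.42    # h times the speed of light, in eV x Angstroms
GROUND_ENERGY_EV = -13.6057  # assumed energy of the lowest hydrogen orbit

def orbit_energy_ev(n):
    """Energy of the orbit with n units of angular momentum (assumed formula)."""
    return GROUND_ENERGY_EV / n**2

def emitted_wavelength_angstrom(n_upper, n_lower):
    """Wavelength of light emitted in a jump down from n_upper to n_lower.

    The energy lost equals h x frequency, and frequency is the speed of
    light divided by wavelength, so wavelength = h c / (energy lost).
    """
    energy_lost_ev = orbit_energy_ev(n_upper) - orbit_energy_ev(n_lower)
    return HC_EV_ANGSTROM / energy_lost_ev

# Jumps that end on the second orbit give the visible Balmer series.
balmer_lines = [emitted_wavelength_angstrom(n, 2) for n in range(3, 7)]
print([round(w) for w in balmer_lines])
```

The wavelengths come out near 6560, 4860, 4340 and 4100 Angstroms, close to the visible lines of the Balmer series. Bohr's task was the reverse of this computation: starting from the measured lines and working back to the orbit energies.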
Bohr's theory was puzzling, even maddening. Just as with Einstein's hypothesis of the light quantum, it seemed to require that classical physical notions both hold and fail at the same time. That was not a comfortable situation. Those discomforts were eclipsed by a brighter fact. Bohr's theory worked, and it worked very well. Observational spectroscopy was providing theorists with an expansive catalog of spectra of many substances under many different conditions. Starting from Bohr's theory, physicists were able to develop an increasingly rich and successful account of them. While it was clear that something was not right, in the face of these successes, it was tempting to postpone asking too pointedly how this goose could keep laying golden eggs.
| The central posit of Bohr's theory of 1913 was that the angular momentum of orbiting electrons came in full multiples -- quanta -- of h/2π. In the years immediately following, that simple condition was expanded into a broader condition that a quantity known as "action" came only in whole multiples for physical systems that returned periodically to the same initial condition. As a result the term "quantum of action" entered the physicists' vocabulary. | This sidebar should contain a brief sentence that gives you a useful idea of the physical quantity "action." Alas, I've been unable to figure out what that sentence might be. It probably doesn't help too much if I tell you that the trajectories of bodies obeying classical physical laws can be picked out as those that render extremal the action added up along the trajectories. Did that help? I didn't think it would. |
Bohr's theory of 1913 and its later elaboration gave a wonderfully rich repertoire of methods for accounting for atomic spectra. They depended on a contradictory mix of classical and non-classical notions. By the early 1920s, the limits of this system began to show and theorists also turned to the task of making some coherent sense of this body of theory that soon came to be known as "the old quantum theory."
The major breakthroughs to the "new quantum theory" came in the middle of the 1920s. A number of different theorists found ways of developing coherent theories of the quantum domain; and they all eventually proved to be different versions of the same new theory. Heisenberg, Born and Jordan first developed matrix mechanics. Its basic quantities were infinite tables of numbers -- matrices -- drawn as directly as possible from observed quantities like atomic spectra.
Another approach proved equivalent and is easier to picture. It was based on a supposition by de Broglie of 1925 and developed by Schroedinger in 1926. Einstein had shown that a wave phenomenon, light, also had particle-like properties. Might the reverse be true also? Might particle-like electrons also have wave properties?
The hypothesis answered yes. It associated a wave of a particular wavelength with a particle of some definite momentum. Here is de Broglie's formula that tells us which wavelength goes with which momentum:
momentum = h / wavelength
Notice how similar it is to Planck's formula which relates energy and frequency. Here is Planck's formula again:
energy = h x frequency
The two together form the foundation of the matter wave approach. They tell us how to assign a wave
of some definite frequency and wavelength to a particle of some given energy
and momentum.
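As a rough illustration of the two relations, momentum = h/wavelength and energy = h x frequency, here is a sketch that assigns a wavelength to an electron. The electron's speed is an arbitrary illustrative value, and the momentum is computed as mass times velocity, which assumes speeds well below that of light.

```python
H_JOULE_SECONDS = 6.626e-34   # Planck's constant in SI units
ELECTRON_MASS_KG = 9.109e-31

def de_broglie_wavelength(momentum):
    """Wavelength of the matter wave: h / momentum, in metres."""
    return H_JOULE_SECONDS / momentum

def planck_frequency(energy):
    """Frequency of the matter wave: energy / h, in cycles per second."""
    return energy / H_JOULE_SECONDS

speed = 1.0e6  # illustrative electron speed in metres per second
momentum = ELECTRON_MASS_KG * speed           # mass x velocity
kinetic_energy = 0.5 * ELECTRON_MASS_KG * speed**2

wavelength = de_broglie_wavelength(momentum)
frequency = planck_frequency(kinetic_energy)
print(wavelength, frequency)
```

The wavelength comes out at a few ten-billionths of a metre, comparable to the size of an atom. Matter waves of this scale matter for electrons in atoms and pass unnoticed for everyday objects, whose far larger momenta give far shorter wavelengths.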
The beauty of the matter wave hypothesis is that it explained naturally why only very particular energy states are admissible for electrons bound in atoms. The reason that only a few energy states are admissible for these electrons derives directly from the fundamental differences between particles and waves. We can see these differences by considering a very simple case, a particle/wave trapped in a box.
To begin, imagine an ordinary, classical particle confined to a box. It bounces back and forth between the walls. Classical physics allows it to move at any speed. As a result it can have a continuous range of different energies.
Now imagine instead that we are confining a wave to the same box. The stable waves that can persist within the box are so-called "standing waves." Anyone who plays a stringed instrument is familiar with them. When a string is plucked or bowed, the base note results from a standing wave whose half-wavelength is the length of the string. Overtones are also formed that give the richness of the sound. These are shorter standing waves, whose wavelengths equal the length of the string, 2/3 that length, half that length, and so on. The essential condition is that a wave can form as long as it has nodes--the points of no displacement--at either end of the string. The matter waves that can form within the box have the same structure as these tones and overtones. They have wavelengths of once, 1/2, 1/3, ... , times the double width of the box. (We use the double width since standing waves have nodes at each half wavelength.) Each of these waves turns out to have a different energy that depends on the wavelength of the standing wave. Thus only certain definite energies are permitted for the waves trapped in the box; the many intermediate energies between them are not allowed.
What of de Broglie's relation, momentum = h/wavelength? Are we to say that the standing waves in the box have momenta proportional to h/2, h, (3/2)h, 2h, ... etc. corresponding to the above allowed wavelengths? Well--almost. The standing wave with wavelength equal to the width of the box could be associated with a particle moving to the right with momentum = h/(width of box) and one moving to the left with momentum h/(width of box). But a standing wave is propagating neither to the right nor to the left. To get the wave to stand still, we form the superposition of these two waves. Superposition allows us to have a wave that is moving both to the left and right at the same time, and thus goes nowhere. See below for more on superposition.
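The allowed wavelengths, the size of the momentum carried by each of the two oppositely moving component waves, and the resulting energies can be tabulated in a short sketch. It assumes an electron in a box one nanometre wide, an illustrative figure, and it takes the energy to be the ordinary kinetic energy, momentum squared divided by twice the mass, which is an assumption not spelled out above.

```python
H_JOULE_SECONDS = 6.626e-34
ELECTRON_MASS_KG = 9.109e-31
JOULES_PER_EV = 1.602e-19

def allowed_wavelength(n, box_width):
    """n-th standing wave: the double width of the box divided by n."""
    return 2.0 * box_width / n

def momentum_magnitude(n, box_width):
    """de Broglie momentum of each of the two oppositely moving components."""
    return H_JOULE_SECONDS / allowed_wavelength(n, box_width)

def energy_ev(n, box_width):
    """Kinetic energy p^2 / (2 m) of the n-th standing wave, in electron volts."""
    p = momentum_magnitude(n, box_width)
    return p**2 / (2.0 * ELECTRON_MASS_KG) / JOULES_PER_EV

box_width = 1.0e-9  # illustrative box width in metres
levels = [energy_ev(n, box_width) for n in range(1, 5)]
print(levels)
```

The energies grow as 1, 4, 9, 16 times the lowest one, with nothing permitted in between. A classical particle in the same box could take any energy at all.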
The situation for an electron in a hydrogen atom is essentially the same. The electric attraction of the positively charged nucleus forms a prison that traps the electron, just as the box above traps the wave. The wave in the box may persist only in a few energy states. Correspondingly an electron-wave trapped in a hydrogen atom may persist only in a few definite energy states. These turn out to be the energies of the stable orbits of Bohr's theory.
While those energies survive, what does not survive from Bohr's theory is the idea of the electron as a spatially localized particle orbiting the nucleus in a classical circular or elliptical orbit, but nonetheless violating classical electrodynamics by not radiating. The space around the atom's nucleus is filled with a standing wave of the electron. Classical electrodynamic theory no longer directly applies; the earlier contradiction with that theory has evaporated.
A distinctive characteristic of waves is that we can take two waves and add them up to form a new wave. That adding of waves is the essence of the phenomenon of interference discussed above. The theory of matter waves tells us that particles like electrons are also waves. So we should be able to add several of them together, just as we could add several light waves together.
When we do this, we form the "superposition" of the
individual matter waves. These superpositions turn out to have a central role
in the theory of matter waves and in quantum theory as a whole. So let us
look at a simple example of superposition. Here are four
matter waves with wavelengths 1, 1/2, 1/3 and 1/4. We will "add them
up," that is, form their superposition, in the same way that we add light
waves.

Notice what happened when we formed the superposition. Each of the four component waves is uniformly spread out in space and has a definite wavelength. That situation starts to reverse in the superposition. The resulting wave is no longer uniformly spread out. It tends to be more concentrated in one place. It also no longer has a single wavelength. The distances between adjacent peaks and troughs differ in different parts of the wave.
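The addition just described can be tried numerically. The sketch below sums four waves with wavelengths 1, 1/2, 1/3 and 1/4; giving each wave unit amplitude and a cosine shape is an assumption, since the text does not fix the amplitudes or phases, and the function names are hypothetical.

```python
import math

WAVELENGTHS = (1.0, 1 / 2, 1 / 3, 1 / 4)

def component(x, wavelength):
    """One uniformly spread wave of unit amplitude (cosine phase assumed)."""
    return math.cos(2 * math.pi * x / wavelength)

def superposition(x):
    """Sum of the four component waves at position x."""
    return sum(component(x, lam) for lam in WAVELENGTHS)

# All four peaks line up at x = 0; elsewhere the waves partly cancel.
for x in (0.0, 0.05, 0.15, 0.5):
    print(f"x = {x:4.2f}   sum = {superposition(x):+.3f}")
```

With these choices the sum reaches its full height of 4 only where all the peaks coincide. Because the four wavelengths divide 1 evenly, that concentration recurs once per unit length; adding more wavelengths would sharpen it further.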
This example of superposition will help us resolve a little puzzle in matter wave theory. Recall de Broglie's relation. It tells us that a matter wave with a definite wavelength has a definite momentum.

Where is the particle? The answer can be read from the figure. It is spread throughout space. It has no one position in space; it has all positions.
What wave represents a particle that is spatially localized? Take the extreme case of a particle localized at just one point in space. Its matter wave is just a pulse at that point in space.

So now we come to the puzzle: what is the momentum of this spatially localized particle?
The superposition given earlier answers the puzzle. We found that when we took the matter waves of particles with different momenta and added them, we produced a matter wave that was spatially localized. If we had been careful in choosing exactly which matter waves to add, we could find a set that would sum to form a perfectly localized pulse. That set turns out to contain all possible values of momenta.
So the answer to our puzzle is that the pulse is associated with all possible momenta.
These two cases are the extremes. We have a matter wave with a definite momentum but all possible positions; and we have a matter wave with a definite position but all possible momenta. Free, propagating particles in quantum theory are represented by an intermediate case, a wave packet:

We arrive at a wave packet by adding matter waves with a small range of momenta. The resulting packet occupies a range of positions in space and is associated with a range of momenta.
The trade-off we have just seen between definiteness of position and definiteness of momentum is quantified by what is commonly known as Heisenberg's uncertainty principle. For reasons that I will explain shortly, I prefer to call it an "indeterminacy principle." It depends on using a standard statistical measure, the standard deviation, for the uncertainty or indeterminacy or, more colloquially, the spread in a wave packet. The principle asserts:
(indeterminacy in position) × (indeterminacy in momentum) is greater than or equal to h/4π
This principle tells us that the indeterminacy in position and momentum when multiplied together can never get smaller than h/4π, the limit that applies when the indeterminacies are measured as standard deviations. That means that if we somehow reduce the indeterminacy of the momentum of a wave packet, then we must increase the indeterminacy of the wave packet's position.
Conversely, if we reduce the indeterminacy of the wave packet's position, then we must increase the indeterminacy of its momentum. Just this was the process we saw when we started to form a wave packet by superposing waves of different momentum. As we add more waves of different momentum, we can narrow the spatial spread of the wave packet, but only at the cost of increasing the spread in momentum.
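The trade-off can be checked numerically. The following is a rough sketch, not a derivation: it builds a packet from waves whose weights follow a Gaussian bell curve (an assumption), labels each wave by its wavenumber k (momentum = (h/2π) × k), and measures both spreads as standard deviations. The names `packet_spreads` and `std_dev` are hypothetical helpers.

```python
import cmath
import math

def std_dev(values, weights):
    """Weighted standard deviation of a list of values."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    return math.sqrt(sum((v - mean) ** 2 * w for v, w in zip(values, weights)) / total)

def packet_spreads(sigma_k, k0=5.0, n=201):
    """Return (position spread, wavenumber spread) of a Gaussian-weighted packet."""
    # Wavenumbers centred on k0, covering 8 widths either side.
    ks = [k0 + sigma_k * (-8 + 16 * i / (n - 1)) for i in range(n)]
    weights = [math.exp(-((k - k0) ** 2) / (4 * sigma_k ** 2)) for k in ks]

    # Superpose the component waves exp(i k x) on a grid of positions.
    half_width = 4.0 / sigma_k
    xs = [half_width * (-1 + 2 * j / (n - 1)) for j in range(n)]
    density = []
    for x in xs:
        psi = sum(w * cmath.exp(1j * k * x) for w, k in zip(weights, ks))
        density.append(abs(psi) ** 2)

    return std_dev(xs, density), std_dev(ks, [w * w for w in weights])

for sigma_k in (0.5, 1.0, 2.0):
    dx, dk = packet_spreads(sigma_k)
    print(f"spread in k = {dk:.3f}   spread in x = {dx:.3f}   product = {dx * dk:.3f}")
```

Widening the range of wavenumbers narrows the packet in space, while the product of the two spreads stays close to 1/2. In momentum units that is (h/2π)/2 = h/4π, the smallest value the product can take; Gaussian packets are the special case that reaches it.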
Since h is such a small number, the sorts of uncertainties arising are so small as to be unnoticeable for ordinary objects. It is quite different on an atomic scale.
Take the case of an electron trapped in a hydrogen atom. If the electron is to remain bound to the positively charged nucleus of the atom, it must have a quite small momentum. Otherwise it would readily tear itself away from the nucleus. That is, its indeterminacy in momentum must be sufficiently small for the momentum to be close to zero. It is a simple computation to see how small that must be. If we insert that smallest indeterminacy into Heisenberg's formula, we find the least indeterminacy of the electron's position. That indeterminacy turns out to be roughly of the size of the atom; or, more precisely, of the lowest energy orbit of Bohr's 1913 model.
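One way the "simple computation" might go is sketched below. The criterion used for "small enough momentum" is an assumption: the electron's kinetic energy, momentum²/(2 × mass), is capped at the 13.6 electron volt binding energy of hydrogen's lowest state. The indeterminacy bound is passed in as a parameter, set here to (h/2π)/2, the value that applies to standard deviations. All function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34         # h/2pi in joule seconds
ELECTRON_MASS = 9.1093837e-31  # kilograms
EV = 1.602176634e-19           # joules per electron volt
BOHR_RADIUS = 5.29177e-11      # metres, radius of Bohr's lowest orbit

def max_bound_momentum(binding_energy_ev):
    """Momentum whose kinetic energy p^2/2m equals the binding energy (assumed criterion)."""
    return math.sqrt(2 * ELECTRON_MASS * binding_energy_ev * EV)

def min_position_spread(momentum_spread, bound=HBAR / 2):
    """Smallest position spread allowed once the momentum spread is fixed."""
    return bound / momentum_spread

delta_p = max_bound_momentum(13.6)
delta_x = min_position_spread(delta_p)
print(f"momentum spread  ~ {delta_p:.2e} kg m/s")
print(f"position spread >= {delta_x:.2e} m")
print(f"ratio to Bohr radius: {delta_x / BOHR_RADIUS:.2f}")
```

Under these assumptions the least position spread comes out at about half the radius of Bohr's lowest orbit. Using a looser bound of h/2π would double the estimate, which leaves it on the same atomic scale.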
So the electron is spread over the whole atom; it is futile to look at a particular spot within the atom for the electron. This reflects what we already expected from the use of a matter wave to represent an electron in a hydrogen atom. Bohr's troublesome classical orbits are replaced by waves spread over the space surrounding the nucleus.
This reciprocal indeterminacy of position and momentum is just one of many in quantum mechanics. When two quantities form complementary pairs, the two quantities will enter into analogous indeterminacy relations. There is such a relation, for example, between the energy and timing of a process. There is another between the angular momentum of an object and its angular position. (The angular position of a body is just a specification of the direction in which it lies with respect to some arbitrarily chosen center and axis. Is it in the zero degree position? Or do we find it at 90 degrees?)
This last indeterminacy can be applied to the example of the hydrogen atom. If an orbiting electron is definitely in just one of Bohr's stationary orbits, then its angular momentum has a definite value. As a result of the angular momentum-angular position indeterminacy, its angular position must be completely indeterminate. So the angular position of the electron about an axis used to determine the angular momentum is completely indeterminate. That is again just what we would expect when we replace Bohr's point-like electrons with waves.
Why am I avoiding the common talk of "uncertainty" in association with Heisenberg's principle?
Uncertainty over some quantity suggests the quantity has a definite value but that we just do not know what it is. We may be uncertain, for example, about the price of paint at the paint store before we go there to buy paint. There is a definite price all customers are charged; we just do not know what it is.
Now compare that with the price that some very valuable painting may obtain in a coming auction. We do not now know what that price will be; the auction hasn't happened yet. We may say that we are uncertain of the price. But it is a different sort of uncertainty. There is no price now to know. The price will only be determined when the auction actually happens.
In the standard approach to quantum mechanics, the uncertainties of Heisenberg's uncertainty principle are of the second type. When the position of a particle is indeterminate, that means that there is no single position associated with the particle; its wave is spread over many positions. It is not that the particle really has a definite position and we just don't know which it is. It is not that we are uncertain about the position because there are more facts to know about the position. There are no further facts to know.
So talk of "uncertainty" in Heisenberg's formula can be misleading. It suggests that we are just ignorant of something that could be known. It is easy to overlook the second way that we can come to be uncertain: the issue is indefinite and there is nothing more to know.
The standard approach to quantum mechanics derives the uncertainty from indefiniteness. There are other approaches in which this is not so. In one developed by David Bohm, particles always have a definite position and the uncertainties arise from our ignorance. These approaches represent a minority view.
An essential part of quantum mechanics deals with how matter waves change over time. Mostly, matter waves behave just like ordinary waves. If you have ever watched ripples spread on the surface of a smooth pond, you have seen at least qualitatively just what matter waves do.
Take a particle that we localize to just one place, so its matter wave is a spatially localized pulse. Left to itself, that pulse will spread out in all directions as propagating waves. It is just like what happens when a pebble hits the surface of the pond. The localized splash immediately spreads out in broadening ripples.

That type of behavior is called "Schroedinger evolution," because it is governed by Schroedinger's wave equation. That equation just says that matter waves propagate like waves.
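To give a feel for how quickly this spreading happens, here is a small sketch using the standard textbook result for a free Gaussian wave packet that starts at rest with the least spread allowed. The Gaussian starting shape and the starting width of 10⁻¹⁰ metres (roughly an atomic size) are assumptions, and `packet_width` is a hypothetical helper.

```python
import math

HBAR = 1.054571817e-34         # h/2pi in joule seconds
ELECTRON_MASS = 9.1093837e-31  # kilograms

def packet_width(t, initial_width, mass=ELECTRON_MASS):
    """Width (standard deviation, metres) of a free Gaussian packet after t seconds."""
    growth = HBAR * t / (2 * mass * initial_width ** 2)
    return initial_width * math.sqrt(1 + growth ** 2)

start = 1e-10  # assumed starting width in metres
for t in (0.0, 1e-16, 1e-15, 1e-14):
    print(f"t = {t:.0e} s   width = {packet_width(t, start):.2e} m")
```

The tighter the initial localization, the faster the packet spreads, which is the indeterminacy trade-off at work: a narrow pulse contains a wide range of momenta.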
If Schroedinger evolution were the only way that matter waves could change, we would have some difficulty connecting matter waves with our ordinary experience. Matter waves typically are spread over many positions and are superpositions of many momenta. Yet when we measure them, we always find just one value for position or momentum.
For example, the simplest sort of measurement is to intercept a matter wave with a photographic plate or a scintillation screen that glows when struck by a particle. In both cases, we find that the matter waves yield just one definite position. They give us a single spot in the photograph or a localized flash of light on the screen.
The screen of an old fashioned TV tube is a scintillation screen. Electrons are fired at it from an electron gun at the rear of the tube. While the electrons are in flight, they retain wavelike properties. Those wavelike properties are essential to an electron microscope, which focusses them like an optical microscope focusses light.
When the matter wave of the electron strikes the screen, however, the resulting flash of light reveals just a single position.
The standard solution to this problem is to propose that there is a second sort of time evolution for matter waves. The first type, Schroedinger evolution, arises when matter waves are left to themselves or when they interact with just a few other particles.
The second type arises whenever we perform a measurement of a quantity like position or momentum. Then the matter wave collapses to one that has a definite value for the quantity measured. If we are measuring the position of the matter wave, it collapses to a localized pulse. If we are measuring momentum, it collapses to a wave with a definite momentum.
This second sort of time evolution is called "measurement" or "collapse of the wave packet."

It is not easy to specify exactly when a measurement evolution will take place. The simplest condition is that it arises in a circumstance in which we are trying to ascertain the value of a quantity. That condition is of no use in theory formation. For matter waves do not "know" what we are intending; they do not choose to evolve in one way or another according to our wishes or interests. The best we can come up with is a simple rule of thumb. Matter waves left to themselves or interacting with just a few particles undergo Schroedinger evolution. Matter waves interacting with macroscopic bodies (such as particle detectors) undergo collapse.
Schroedinger evolution of a matter wave is fully deterministic. That means that if we specify the present state of the matter wave, its future state is fixed completely by Schroedinger's equation.
This determinism of the theory fails when we consider measurement. For when we measure the position of a particle represented by a wave packet, we do not know for sure which position will be revealed. The best we can do is to say which are the candidate positions and, using a standard rule, compute the probability of each.
Thus measurement introduces indeterminism into quantum theory. A full specification
of the present state of the matter wave and everything that will interact
with it is not enough to fix what its future state will be.
The rule that determines the probability of each
candidate outcome depends essentially on superposition. Consider, for example, a wave
packet. It is the superposition of many spatially localized pulses.
The figure shows just four of them. In general there are infinitely many. What is important is that the amplitudes of the component pulses vary according to the part to which they will contribute in the fully assembled wave packet. A pulse contributing to the large amplitude central section will have a large amplitude. A pulse contributing to the smaller amplitude edges will itself have a smaller amplitude. This last fact is the clue that tells us how to compute the probability of a measurement outcome. We expect the measured position of the particle to appear more probably in the large amplitude center of the wave packet, than in the lower amplitude edges.
Max Born used this fact when he proposed the "Born rule," that tells us that the amplitude of the component fixes the probability that this component will be the outcome of measurement.
Probability that a component is the outcome of measurement = (amplitude of that component)²
The slight complication in Born's rule is that the amplitudes of the components are not real numbers. They are complex numbers that include things like "i," the square root of minus one and other more complicated things like 1+i and 37 - 10i. Probabilities have to be real numbers between 0 and 1. So Born had to convert the complex-valued amplitudes into real numbers. There are many ways of doing this. Few give a real number that also obeys all the rules of the probability calculus. Taking the "square" of the amplitude turns out to be the one that works.
For experts only: of course by "square" of a complex number I really mean its "squared norm." That is the number itself, multiplied by its complex conjugate. For z = 1+i, the squared norm |z|² = (1+i)(1-i) = 1 - i² = 2.
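Born's rule is easy to try out. The sketch below takes a made-up list of four complex amplitudes (hypothetical values, standing in for four component pulses), converts each to a probability through its squared norm, and then mimics a measurement by picking one component at random with those probabilities. Dividing by the total so that the probabilities sum to 1 is an added normalization step, and the function names are hypothetical.

```python
import random

def squared_norm(z):
    """The number multiplied by its complex conjugate; always real."""
    return (z * z.conjugate()).real

def born_probabilities(amplitudes):
    """Squared norms of the amplitudes, rescaled so they sum to 1."""
    norms = [squared_norm(a) for a in amplitudes]
    total = sum(norms)
    return [n / total for n in norms]

def measure(amplitudes, rng=random):
    """Index of the component picked at random according to Born's rule."""
    probs = born_probabilities(amplitudes)
    return rng.choices(range(len(amplitudes)), weights=probs, k=1)[0]

amplitudes = [1 + 1j, 2, 1 - 1j, 1j]  # hypothetical component amplitudes
print(born_probabilities(amplitudes))
print("measured component:", measure(amplitudes))
```

The squared norms here are 2, 4, 2 and 1, so the second component is the most probable outcome, yet any of the four can turn up on a given run.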
When quantum theory first emerged as our best theory of fundamental particles, the central role of probabilities in the theory caused much concern. The probabilities associated with the collapse of the wave packet were not of the type always formerly seen.
Prior to quantum theory, the probabilities that had crept into physics could always be thought of as manifestations of our ignorance of the true state of affairs.
We might not know whether a coin will come up heads or tails when tossed, so we say there is a probability of 1/2 on heads. But that probability merely masks our ignorance. If we knew exactly how hard the coin had been flipped, exactly how the air currents in the room were laid out, and a myriad of other details, we could in principle determine exactly whether the coin would be heads or tails.
In quantum theory, when the wave packet collapses, we find different probabilities for the different outcomes. But there is no definite fact of the matter over which we are ignorant. There is no one true, hidden outcome. No further accumulation of information could lessen our ignorance. There is nothing more to know. The best we can say is that each of the position measurement outcomes is possible and that they will arise with such and such probability.
It is now a little hard to see why this difference in the probabilities led to so much anxiety among physicists in the 1920s and later. All that has happened is that we have found the world to be a little different from what we expected. We may once have thought probabilities to be expressions of ignorance. We now find that they are irreducible parts of the way the world is put together. Their appearance in theory has nothing to do with what we may or may not know. The world just is fundamentally chancy in certain of its aspects.
The reason, I believe, that this irreducibly chancy character of the world created such anxiety is a legacy of nineteenth century philosophy. In the course of the nineteenth century, the notion of causation had been greatly purified by philosophical analysis. The outcome was a lean account of causation as determinism. This causes that simply means that this is invariably followed by that. So for the world to be causal, in this view, simply means that the present state of the world fixes its future state.
The irreducible probabilities of quantum theory showed that the present state of the world does not fix its future state. The best it does is to give probabilities for different possible futures. Therefore, according to the nineteenth century conception, the world is not causal. Thus the physicists of the 1920s frequently lamented the violation of the "principle of causality."
The consensus now is that their notion of causation was far too narrow. There are notions of causation that cohere perfectly well with irreducible probabilities. Quantum theory does not present a challenge to the cogency of causation.
That is the majority view. There is a minority view, which I champion. It regards the 1920s failure of the principle of causality as part of a long history of failure. In this view, the effort to find a principle of causality in nature is actually an effort to conceive an a priori science. Processes in nature are interconnected. But it is not our business to legislate in advance the nature of that connectedness. Perhaps it conforms to something like a principle of causality; or perhaps it does not. The long history of our failure to find any well-functioning principle of causality suggests that there is none to be found. It suggests that our efforts are better spent empirically examining how things connect, broadening our conceptions to match and not trying to force them into a mold first devised thousands of years ago. Or that is what I argue in my "Causation as Folk Science," Philosophers' Imprint, Vol. 3, No. 4.
When will a wave packet undergo Schroedinger evolution or collapse? Earlier, we saw that there is only a rule of thumb to guide us. Schroedinger evolution arises when matter waves are left to themselves or when they interact with just a few others. Measurement arises when a matter wave interacts with a macroscopic measuring device. That means that a matter wave interacting with a photographic plate collapses. Sometimes it is said that the last collapse does not happen until an intelligent human agent actually looks at the plate. That last claim is extremely strange. Are we supposed to believe that human intelligence enters into the time evolution of fundamental particles in the same way as perturbing fields?
The lack of a precise principle to decide which evolution will arise has created a constellation of puzzles known as the "measurement problem." The best known example is "Schroedinger's cat," a thought experiment devised by Erwin Schroedinger in 1935.
To see how it arises, let us first look at how quantum theory treats radioactive decay. The radioactive element neptunium ²³¹₉₃Np is extremely unstable. It will undergo radioactive decay quite quickly.

It has a "half life" of 53 minutes. That means that if we start with a lump of ²³¹₉₃Np and wait 53 minutes we will have only half a lump left and a lot of radioactive decay products.
At the level of an individual atom of ²³¹₉₃Np, that means that there is a probability of 1/2 that each individual atom will decay over this 53 minutes. Now, individual atoms of ²³¹₉₃Np are governed by Schroedinger evolution; the probabilities only enter when we measure to see if the atom has decayed or not.
So over 53 minutes the atom evolves into a half:half superposition of undecayed and decayed atom.
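The half:half split can be followed over time with a short sketch. It uses the usual exponential decay law together with the 53 minute half-life quoted above, and it takes the two amplitudes to be the positive square roots of the probabilities; that choice of real, positive amplitudes is an assumption made for simplicity, and the function names are hypothetical.

```python
import math

HALF_LIFE_MIN = 53.0  # half-life quoted in the text, in minutes

def undecayed_probability(minutes):
    """Chance that a single atom is found undecayed after the given time."""
    return 0.5 ** (minutes / HALF_LIFE_MIN)

def superposition_amplitudes(minutes):
    """(undecayed, decayed) amplitudes; real and positive by assumption."""
    p = undecayed_probability(minutes)
    return math.sqrt(p), math.sqrt(1 - p)

for minutes in (0, 53, 106):
    a_undecayed, a_decayed = superposition_amplitudes(minutes)
    print(f"{minutes:3d} min   undecayed amplitude {a_undecayed:.3f}   "
          f"decayed amplitude {a_decayed:.3f}")
```

At 53 minutes both amplitudes equal the square root of 1/2, so Born's rule assigns probability 1/2 to each outcome of a measurement.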

The collapse into one or other of these components only arises when we take a measurement. That may happen when we use a Geiger counter to check for radioactive decay products. If we find them, then the atom collapses into the decayed component. Otherwise it collapses into the undecayed component.
So far everything seems reasonable. What Schroedinger realized was that there was quite some arbitrariness in our division between Schroedinger evolution and wave collapse. It was quite possible for that one collapse to be magnified. The decay products of the one decaying atom might trigger the collapse of others. So instead of having just one atom entering into a superposition over 53 minutes, we might have very many atoms all coupled together entering the superposed state after 53 minutes.
The cat paradox arises when we push this process of amplification to an extreme. Instead of coupling the one atom of ²³¹₉₃Np to a collection of other radioactive atoms, we couple it to the 10²⁵ atoms of a cat. The coupling is simple, although cruel. A Geiger counter is set up to sense the decay of the atom. If it decays, the Geiger counter will trigger the opening of a can of poison. The atom, Geiger counter, poison and cat are all enclosed in a box.

We then wait 53 minutes. In that time, the atom evolves into a superposition of undecayed and decayed atom. With it, the poison evolves into a superposition of released and unreleased poison; and the cat into a superposition of live cat and dead cat.

[Figure: live cat + dead cat]

At this stage, no measurement has been performed; no human has looked at the Geiger counter or listened for its clicks. So the cat is neither alive nor dead. The evolution, as far as the cat is concerned, is something like this:

What finally decides whether the cat is alive or dead is our observation. After 53 minutes we open the box and observe, that is, "measure," the life state of the cat. Only then does the cat's wave collapse onto one of dead or alive.
There is a widespread sense that there is something wrong with a theory that allows observation to play such an important role. Most people have an instinctive sense that the fact of life or death for the cat is not decided merely by our observation. After 53 minutes, the cat is definitely just one of alive or dead; whether we look in the box does not change that circumstance in any way.
This instinctive reaction is surely correct. However having it really only sharpens the problem. It does not solve it. For the inference that the cat is in a superposition of alive and dead follows directly from quantum theory by merely assuming that the box contains nothing but atoms whose time evolution is governed by Schroedinger's equation.
This paradox of Schroedinger's cat is the most vivid expression of a lingering problem in the foundations of quantum theory. In the last two decades especially, there has been a huge amount of work devoted to finding variations to standard quantum theory or just new ways to think about the same theory that avoid this problem. There is no consensus on which approach is the correct one or even if some sort of repair is needed.
While Einstein was one of the founders of quantum theory, he gradually evolved into one of its most prominent critics. His hesitations and reservations about the theory are detailed and subtle and I cannot possibly hope to do justice to them here.
The most widely read of Einstein's criticisms of quantum theory came in a 1935 paper coauthored with Podolsky and Rosen and is known everywhere as the "EPR" paper. The paper used the example of two coupled particles to urge that there is more reality to individual particles than the standard theory allowed.
The essential idea of the paper was to examine two coupled particles that are allowed to move apart in space until they are very far from each other. The essence of the EPR argument is that one can ascertain any of the properties of the second particle by making the corresponding measurement on the first. Since measurements on the first particle can be made without in any way affecting the now very distant second particle, EPR urged that the properties ascertained for it must be real.
The EPR paper brought to the fore Einstein's concern with what is real and the related issue of separability. Einstein presumed that very distant objects now have separate states, even if they once interacted. Quantum theory asserts otherwise, a claim that greatly troubled Einstein. He would surely have been even more troubled had he lived to see the experiments that lend direct support to the claim.
Einstein also reacted to quantum theory at a more visceral level. And these reactions are the ones that pertain to the measurement problem. Einstein is legendary for having frequently remarked that he could never believe that "God plays dice." His reference is precisely to the idea that the probabilities of the Born rule are irreducible. He clearly favored the idea that they somehow represented ignorance. His long-standing hope was that somehow this approach would be vindicated by his decades' search for his fully classical unified field theory.
Other remarks by Einstein clearly expressed great discomfort at the role the observer plays in quantum
measurement. As we saw in an earlier chapter, his collaborator and biographer
Abraham Pais reports:
"...during one walk, Einstein suddenly stopped, turned to me, and asked
whether I really believed that the moon exists only when I look at it."
The famous physicist (and inventor of the name "black hole") John Wheeler
also reported of Einstein:
"...No one can forget how he expressed his discomfort about the role of the
observer, 'When a mouse observes, does that change the state of the
universe?'"
One of the largest of the recent literatures in philosophy of quantum theory has sought to resolve the measurement problem. Generally speaking, most of those responses fall into four groups.
1. Accept the standard account.
This response essentially urges that the treatment developed in this chapter
is adequate. It is intelligible only in so far as it repeats the rule of
thumb for deciding when measurement collapse occurs. In so far as it tries to
go further and offer more principled grounds, we descend into a darkness
where we are teased by dim lights with names like "complementarity" that are
so distant as to remain obscure.
2. Hidden variable theory.
In this response, we are told that the probabilities of quantum theory are
(as Einstein wanted) merely expressions of our ignorance. The best known and
best elaborated of these approaches is Bohm's pilot wave theory. While the
theory gives an elegant treatment of the simplest case of non-relativistic
quantum mechanics, it is strained to accommodate the later forms of quantum
theory that emerged in the decades following the 1920s.
3. New dynamics.
In this approach, we suppose that the laws governing matter change when we
move from considering just a few particles to the very many that comprise
macroscopic bodies. It turns out that only very slight changes are needed to
eradicate the measurement problem completely and to give macroscopic bodies
properties that are very different from their microscopic constituents. The
principal difficulty with this approach is that no one is able to say just
which of the many possible slight changes is the correct one.
4. No collapse theories.
These theories propose that Schroedinger evolution is perfectly admissible
for both macroscopic and microscopic bodies. They deny that wave packet
collapse is a real process like Schroedinger evolution. The most popular
version of this approach employs the notion that all results of a measurement
are realized in many parallel worlds, all of which we inhabit. This approach
requires some dedication. We must be quite committed to the idea that a
theory devised for tiny particles applies unchanged to macroscopic bodies. For
it requires us to give up the most fundamental aspect of our laboratory
experience, that experiments have single, definite results.
My own feeling is that none of these responses is satisfactory. The least defective is the third. However, if there are new physical laws that would resolve the measurement problem, we can be pretty sure that they are quite exotic and not produced by a small adjustment in our existing theories. For, if these small adjustments are there to be found, eight decades of work by many of the brightest minds in quantum physics has failed to find them.
Copyright John D. Norton. April 2001; March 16, 2008.