HPS 0410 Einstein for Everyone


Origins of Special Relativity

John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh

Background reading: J. Schwartz and M. McGuinness, Einstein for Beginners. New York: Pantheon. pp. 1-82.

We now take Einstein's special theory of relativity for granted. The evidence in its favor is now quite massive, so that there is little license for skepticism. Our real task is to learn the theory, and there are many textbooks that develop it in an easy-to-understand fashion.

In 1905, however, when Einstein first introduced it, it was a strange and even shocking theory. Then Einstein did not have the luxury of a simple textbook on special relativity from which he could learn the theory. Somehow he had to see that such a theory was needed. And then he had to devise the theory and know it was not crazy speculation. How did he do it? That is the present topic--the history of Einstein's discovery of special relativity. We shall see that Einstein had no crystal ball. He worked with resources and methods available to everyone. That is the fascination of the episode. We shall see how he took the same pieces everyone had and assembled a masterpiece where everyone else faltered.

Three Components

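We shall see three components in Einstein's discovery: his engagement with current experiments, his facility in philosophical analysis, and his work on particular problems in electrodynamics.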

While all three had a role in Einstein's discovery, we shall see that the last was the most decisive. Unfortunately, this is often overlooked in accounts of the origins of Einstein's theory. Einstein's engagement with current experiments and his facility in philosophical analysis are important. However, special relativity would not have come about at all were it not for the particular problems in electrodynamics that Einstein addressed and that demanded a radical solution.

Origins of the Principle of Relativity

The principle of relativity tells us that we cannot detect our uniform motion. That idea became important to physics in the seventeenth century. After Copernicus, it gradually became accepted that the earth was not motionless at the center of the universe. Instead it spun on its axis and orbited the sun. Yet, as the ancient Greeks were quick to point out, if the earth moved, why didn't we have some sensation of the movement?

If Copernicus' idea was to survive, physics would have to be renewed so that one's own motion would be undetectable; that is, so that it satisfied a principle of relativity. As far as observable things were concerned, the physics Newton developed in the seventeenth century satisfied this principle. For example, he associated forces with acceleration and not simply motion. So, no matter how fast a body moved, as long as it was not accelerating, no force acted on it.

Light

What altered this happy arrangement in the nineteenth century were advances in the theory of light. Newton had supposed that light consisted of rapidly moving corpuscles; they obeyed the principle of relativity as much as anything else in his universe. Following work of Fresnel and others early in the nineteenth century, this account was replaced by one of light as a propagating wave.

If light was a wave, it was assumed that the wave must be carried by some medium, just as sound waves are carried by air. That medium was known as the luminiferous (=light bearing) ether. So the moving earth was now supposed to be moving through a medium that must stream past the earth, much as water streams past a boat moving through the ocean.

Ether Current Experiments Fail

This ether now made it plausible that our planet's absolute motion might be detectable by experiments on the earth. All we had to do was to look for the current of ether flowing past. It proved quite easy to devise experiments to do this. Recall that the ether carries light waves, much as air carries sound waves or water carries water waves. So if the ether is flowing past us, that flow ought to be revealed in measurements on light.

A series of experiments was devised in the 19th century to detect this ether current. The striking result of all these experiments was that the flow of ether had no effect on optical experiments. In that sense, all the experiments failed. Curiously, it was as though the earth just happened to be at perfect rest in the ether. In retrospect, this is a puzzling outcome. At the time, however, there was nothing like the sense of crisis you might expect. Rather it had become a simple regularity of experiment that the ether drift was invisible to us.

The experiments could be catalogued according to their sensitivity. The least sensitive and easiest to conduct were so-called "first order" experiments. Many were undertaken and all failed to demonstrate an ether current.

That failure rapidly ceased to be mysterious. It could be explained by a single hypothesis, the Fresnel "ether drag" hypothesis. That supposed that the ether was dragged by optically dense media--the lenses and other media used in optical experiments--by an amount tuned directly to the medium's refractive index. It turned out that amount could be selected so that it would exactly cancel out any possible first order effect of an ether current.

The odd thing, however, about Fresnel's formula was that it meant that light of different frequencies would be associated with different amounts of ether drag. In effect that meant that each frequency of light had its own ether, and that was troubling in the 19th century.
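The dependence on frequency can be made concrete with a small sketch. It assumes the standard textbook form of Fresnel's coefficient, f = 1 - 1/n^2 for a medium of refractive index n, which the text above does not state explicitly. The function name and the two refractive indices are illustrative assumptions, not measured values.

```python
def fresnel_drag_coefficient(n: float) -> float:
    """Fraction of a medium's velocity imparted to the ether inside it.

    Assumes the textbook form f = 1 - 1/n**2, where n is the refractive index.
    """
    return 1.0 - 1.0 / n**2


# Illustrative (assumed) refractive indices of one glass for two colors.
# Dispersion means n differs slightly with the frequency of the light.
illustrative_indices = {"red light": 1.51, "blue light": 1.53}

for color, n in illustrative_indices.items():
    coefficient = fresnel_drag_coefficient(n)
    print(f"{color}: n = {n}, drag coefficient = {coefficient:.4f}")
```

Because the two coefficients differ, the ether inside the same piece of glass would have to be dragged by two different amounts at once, one for each color. That is the oddity noted above.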

Michelson Morley Experiment

After first order experiments came second order experiments. These were a great deal more sensitive to any ether current. They were, however, also a great deal harder to carry out. There was only one successfully executed in the 19th century, the celebrated experiment of Albert A. Michelson and Edward W. Morley of 1887 that completed Michelson's earlier efforts at such an experiment. Indeed the experiment was so difficult that Michelson won the Nobel prize principally for his highly sensitive optical interferometer used in the experiment.

The basic idea of the experiment is that light moves differently on a moving earth according to whether it propagates transverse to the direction of the earth's motion or parallel to the direction of the earth's motion. In the first case the ether current flows across the propagating light, slowing it a little. In the second case, it provides a kind of head wind that slows the light more or a tail wind that speeds it up.

Here is a schematic picture of the way the experiment sought to look for these differences.

MM apparatus at rest

A light source sends a beam of light to a half-silvered mirror that splits the beam in two. One half continues in the same direction; the other is sent off at 90 degrees. They both strike mirrors at equal distances which reflect them back to a place where they can be viewed. That the mirrors are placed at equal distances from the half-silvered mirror is represented by the two rods of equal length in the figure that connect them.

You can grasp the way the experiment works most simply if you imagine not a beam of light, but merely a pulse of light, as shown in the figure. Since the distances to the two mirrors are the same, the two pulses will require the same time to traverse the distance out and back and they will be detected at the same time.

In practice, pulses are not used. A steady light beam is used. Any difference in propagation time will be manifested by the peaks and troughs of the waves misaligning when they are combined at the detecting screen. The combining of these two waves produces interference fringes at the detecting screen. So any change in the alignment is revealed as a change in the interference fringes.

In use, the apparatus is turned very slowly so that the ether current passes over it from successively different directions. During this turning, the ether current affects the light traveling in the two directions differently and these changes are expected to be manifested as changes in the observed interference patterns.

MM moving classically
Imagine, for example, that the horizontal direction in the figure aligns with the direction of motion of the earth in the ether. Then, thinking classically, we expect the ether current to slow a light pulse making the round trip in the direction transverse to the ether current. The net effect of the ether current on the pulse that makes the round trip parallel to the ether current is an even greater slowing. So, as the figure shows, by the time the transverse pulse reaches the detector, the longitudinal pulse is still traversing the apparatus.

These differences in arrival times will change as the apparatus rotates and they will be manifested as changes in the observable interference fringes.
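The classical expectation just described can be sketched numerically. This is a minimal illustration, assuming classical addition of velocities in the ether frame; the arm length and the function names are assumptions chosen for illustration, and the earth's speed is only a rough figure for its orbital motion.

```python
import math

C = 299_792_458.0  # speed of light relative to the ether, m/s


def transverse_time(arm_length: float, v: float, c: float = C) -> float:
    """Classical round-trip time across the ether current.

    The light must angle into the current to stay on the arm, so its
    effective speed along the arm is sqrt(c**2 - v**2).
    """
    return 2.0 * arm_length / math.sqrt(c**2 - v**2)


def longitudinal_time(arm_length: float, v: float, c: float = C) -> float:
    """Classical round-trip time along the ether current.

    Against the current the light advances at c - v; with it, at c + v.
    """
    return arm_length / (c - v) + arm_length / (c + v)


arm_length = 11.0  # metres (assumed, for illustration)
v_earth = 3.0e4    # m/s, roughly the earth's orbital speed

t_trans = transverse_time(arm_length, v_earth)
t_long = longitudinal_time(arm_length, v_earth)
print(f"transverse round trip:   {t_trans:.12e} s")
print(f"longitudinal round trip: {t_long:.12e} s")
print(f"fractional difference:   {(t_long - t_trans) / t_trans:.3e}")
```

The fractional difference is of the order of (v/c)^2, which is why this counts as a second order effect and why the measurement is so demanding.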

The result was negative. Michelson and Morley found shifts in the interference fringes, but they were very much smaller than the size of the effect expected from the known orbital motion of the earth.

The Failures are Explained by H. A. Lorentz

The outcome of the 19th century tradition of experiments aimed at detecting the ether current was negative. The wave theory of light of the 19th century depended upon this ether. It was what carried the light wave, just as air carries sound waves. Yet no experiment could show the direction or magnitude of the ether current.

The puzzle was deepened and broadened by the end of the 19th century through the assimilation of optics into Maxwell's theory of electric and magnetic fields. In the 1860s, Maxwell showed that a light wave is really a wave of electric and magnetic fields, an electromagnetic wave. So now the luminiferous ether was also the ether that carried these fields.

How is it possible for Maxwell's electrodynamics to be based fundamentally upon the notion of an ether, yet no experiment can reveal the magnitude and direction of the ether current? This was the problem taken up and solved brilliantly by the great Dutch physicist H. A. Lorentz.

Lorentz first simplified Maxwell's theory into the form that it is routinely taught today. All matter, he proposed, simply consists of electric charges (called "ions" or "electrons") in the empty space of the ether. He then proceeded to show how electrodynamical theory could explain the failure of the experiments to produce a result.

If an optical medium just consists of such charges, Lorentz could show that an electromagnetic wave propagating through it would be affected in exactly the way Fresnel's ether drag hypothesis required. The hypothesis was no longer a supposition but a demonstrated result in electrodynamics. That explained why all first order experiments failed.

The second order Michelson Morley experiment was a little harder. There was a solution suggested by the fact that classically light needs more time to make the longitudinal round trip than the transverse one. So what if the apparatus contracted in length longitudinally? Then the longitudinal pulses would need less time to make the round trip and the negative result could be recovered. The result would look something like this:
MM contracted

Amazingly, what Lorentz was able to show was that Maxwell's theory of electromagnetism predicted precisely this much longitudinal contraction. More precisely, he could show this result: if an object consists just of electric charges positioned in equilibrium under electric and magnetic forces, then when that object is set in motion, it contracts by just the amount needed to ensure a negative result for the Michelson Morley experiment.
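A short sketch shows how much contraction is "just the amount needed." It assumes the classical round-trip times for light moving through an ether current and the standard contraction factor sqrt(1 - v^2/c^2); the arm length, the speed, and the function names are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light relative to the ether, m/s


def contraction_factor(v: float, c: float = C) -> float:
    """Factor by which a moving body shortens along its direction of motion."""
    return math.sqrt(1.0 - (v / c) ** 2)


def transverse_time(arm_length: float, v: float, c: float = C) -> float:
    """Classical round-trip time for the arm lying across the ether current."""
    return 2.0 * arm_length / math.sqrt(c**2 - v**2)


def longitudinal_time(arm_length: float, v: float, c: float = C) -> float:
    """Classical round-trip time for the arm lying along the ether current."""
    return arm_length / (c - v) + arm_length / (c + v)


arm_length = 11.0  # metres at rest (assumed, for illustration)
v_earth = 3.0e4    # m/s, roughly the earth's orbital speed

# Only the arm pointing along the motion is shortened.
contracted_length = arm_length * contraction_factor(v_earth)

t_trans = transverse_time(arm_length, v_earth)
t_long = longitudinal_time(contracted_length, v_earth)
print(f"transverse round trip:              {t_trans:.12e} s")
print(f"longitudinal round trip (contracted): {t_long:.12e} s")
```

With the longitudinal arm shortened by this factor, the two round-trip times agree and no fringe shift is expected as the apparatus turns.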

The catch was that matter probably couldn't consist just of electric charges held by electric and magnetic forces. There had to be other forces as well. They had to be there, for example, to prevent Lorentz's electrons blowing themselves apart under the mutual repulsion of the like charges in different parts of an electron. So Lorentz simply supposed that these other forces would behave just like electric and magnetic forces and yield the same result.

The 20th century opened with the Maxwell-Lorentz theory of electrodynamics as the most successful physical theory of the era. While that theory was based essentially on the existence of an ether, the failure to detect ether currents was no longer a puzzle, but a prediction of the theory. Lorentz showed that the theory entailed effects whose combined import was to make the ether current invisible and the absolute motion of the earth undetectable by us. We might be moving through the ether at some definite speed and in some definite direction. But the physics of electrodynamics conspired to prevent us ever measuring that speed and direction.

A final remark: the schematic drawing of the Michelson Morley experiment above may seem oddly familiar. In fact we have already seen its essential content before. The two arms of the apparatus are light clocks. You will recall that we computed the relativistic contraction effect from the condition that moving light clocks, one transverse to and one parallel to the direction of motion, must tick at the same rate. This is the same contraction that figures in Lorentz's account.

Two light clocks

Highlights of Einstein's pathway to special relativity

It was against the background of these developments that Einstein discovered the special theory of relativity in 1905. The discovery was not momentary. The theory was the outcome of, in Einstein's own reckoning, seven and more years of work. He even places one of his early landmarks in a thought experiment he had at the age of 16, in 1896, nine years before the year of miracles of 1905. Unfortunately we have only fragmentary sources to document the years of this struggle. Below I identify a few of the major ones.

The story of Einstein's discovery of special relativity has exercised an almost irresistible fascination on many, in spite of the dearth of sources. So, if you read more widely, you will see much speculation over how to fill in the blanks between the known landmarks and even over which are the important landmarks. Some of it is responsible; some is not.

Chasing a beam of light

Writing a half century later in 1946 in his Autobiographical Notes, Einstein recounted a thought experiment conducted while he was a 16 year old student in 1896 that marked his first steps towards special relativity.

"...a paradox upon which I had already hit at the age of sixteen:

If I pursue a beam of light with the velocity c (velocity of light in a vacuum), I should observe such a beam of light as an electromagnetic field at rest though spatially oscillating.

There seems to be no such thing, however, neither on the basis of experience nor according to Maxwell's equations.

From the very beginning it appeared to me intuitively clear that, judged from the standpoint of such an observer, everything would have to happen according to the same laws as for an observer who, relative to the earth, was at rest. For how should the first observer know or be able to determine, that he is in a state of fast uniform motion?

One sees that in this paradox the germ of the special relativity theory is already contained."

The basic thought is clear. If Einstein were to chase after a propagating beam of light at c

propagating light

he would see a frozen light wave

frozen light wave

and that Einstein deemed impossible.

At first it seems that it will be simple to figure out just what is worrying Einstein. He states a few simple reasons. I don't want to go into them here since they actually turn out to be rather hard to disentangle. My best effort to disentangle them is given at "Chasing a Beam of Light: Einstein's Most Famous Thought Experiment," http://www.pitt.edu/~jdnorton/Goodies/Chasing_the_light

Magnet and conductor

Einstein's thinking evolved from this early, youthful flight into richer and technically more detailed scrutiny of motion in Maxwell's electrodynamics. Einstein initially took the idea of an ether state of rest seriously and conceived experiments that were designed to reveal the earth's motion through the ether.

These thoughts eventually took a very different turn with Einstein deciding that the ether state of rest had no place in electrodynamics and that the principle of relativity was to be upheld. The decisive moment seems to have come with a thought experiment, the magnet and conductor, that is recounted in the opening paragraph of Einstein's 1905 paper.

The simple idea behind the thought experiment is that Maxwell's electrodynamics treats a magnet at rest in the ether very differently from one that moves in the ether. A magnet at rest is surrounded by a magnetic field only.
magnet at rest

However, if the magnet moves through the ether, things are very different. In addition to the magnetic field, a new entity comes into being around the magnet, an induced electric field.
moving magnet

This difference between the two cases seems to provide an unequivocal marker of motion through the ether--or so it would seem. To determine if a magnet is moving absolutely through the ether or not, one merely needs to look for that induced electric field. That is easy to do. An electric field accelerates electric charges, such as the conducting electrons in a piece of wire, a conductor. So all that has to be done is to place a conductor near the magnet, as the figures show, and to look for an induced electric current. If there is one, then there is an induced electric field and the magnet is moving; if there isn't one, then the magnet is at rest in the ether.

It all seems so simple. But it doesn't work. The simplest situation arises if we attach the conductor to the magnet so that it moves or rests with the magnet. If the magnet is at rest in the ether, then there will be no current in the conductor. So far, it is as expected. But if the magnet and conductor move together, an extra complication enters. Because the conductor is now moving absolutely in a magnetic field, another part of Maxwell's theory tells us that a second electric current will be induced in the conductor. Remarkably, that second current flows in the opposite direction to the one produced by the electric field and it turns out to cancel it out exactly.
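The cancellation can be checked with a small sketch. It assumes two standard results that the text describes only in words: the low-velocity form E = -v x B for the electric field induced by a magnet moving at velocity v, and the Lorentz force law for the force per unit charge, E + u x B, on a charge moving at velocity u. The function names and the numerical values are illustrative assumptions.

```python
def cross(a, b):
    """Vector cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def induced_electric_field(magnet_velocity, B):
    """Electric field induced by a moving magnet (assumed form E = -v x B)."""
    vxb = cross(magnet_velocity, B)
    return (-vxb[0], -vxb[1], -vxb[2])


def force_per_unit_charge(E, charge_velocity, B):
    """Lorentz force per unit charge on a charge moving at charge_velocity."""
    uxb = cross(charge_velocity, B)
    return (E[0] + uxb[0], E[1] + uxb[1], E[2] + uxb[2])


B = (0.0, 0.0, 0.01)      # magnetic field at the conductor, tesla (arbitrary)
v = (3.0e4, 0.0, 0.0)     # velocity of the magnet through the ether, m/s
at_rest = (0.0, 0.0, 0.0)

E = induced_electric_field(v, B)

# Conductor at rest in the ether while the magnet moves past: a net force
# acts on its charges, so a current flows.
print("conductor at rest:    ", force_per_unit_charge(E, at_rest, B))

# Conductor attached to the moving magnet: the two contributions cancel.
print("conductor with magnet:", force_per_unit_charge(E, v, B))
```

The second result is zero in every component, so a conductor carried along with the magnet shows no current, whatever the common velocity through the ether.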

The upshot is that checking for an electric current in the conductor fails as a means of distinguishing the absolute rest of the magnet from its motion. In both cases, the current is the same--no current at all. So an Einstein riding with an absolutely moving magnet would detect no current and find the situation to be indistinguishable from absolute rest as far as the observable currents were concerned.

Einstein rides magnet

More curiously, it is as if the electric field just isn't there for an observer moving with the magnet. But one at rest in the ether would say there is an electric field present.

Einstein watches moving magnet

Einstein later described how this realization had affected him quite profoundly:

"In setting up the special theory of relativity, the following ... idea concerning Faraday’s magnet-electric induction [experiment] played a guiding role for me...

[magnet conductor thought experiment described].

...The idea, however, that these were two, in principle different cases was unbearable for me. The difference between the two, I was convinced, could only be a difference in choice of viewpoint and not a real difference. Judged from the [moving] magnet, there was certainly no electric field present. Judged from the [ether state of rest], there certainly was one present. Thus the existence of the electric field was a relative one, according to the state of motion of the coordinate system used, and only the electric and magnetic field together could be ascribed a kind of objective reality, apart from the state of motion of the observer or the coordinate system. The phenomenon of magneto-electric induction compelled me to postulate the (special) principle of relativity.

[Footnote] The difficulty to be overcome lay in the constancy of the velocity of light in a vacuum, which I first believed had to be given up. Only after years of [jahrelang] groping did I notice that the difficulty lay in the arbitrariness of basic kinematical concepts."

In sum, Einstein's lesson was this. Maxwell's theory employed an ether state of rest; but that state of rest could not be revealed by observation. So somehow the principle of relativity needed to be upheld.

And a second moral was an unexpected relativity. Prior to Einstein, it had been thought that whether an electric field is present at some place is an absolute fact. Einstein now concluded that it is observer dependent: some observers will judge an electric field to be present; others in a different state of motion will not. This was the first of Einstein's reorganizations of our ideas of which quantities are absolute and which relative.

Emission theories of light

The magnet and conductor thought experiment marked the way forward for Einstein. He was to uphold the principle of relativity in electrodynamics. The only obvious way of doing that was to modify electrodynamical theory. As the concluding footnote in Einstein's quote from 1920 above suggests, Einstein could already know one element that must be in the modification. According to Maxwell's theory, light always propagates at c with respect to the ether. That result must change if the theory conforms to the principle of relativity since there will no longer be an ether state of rest against which the motion of the light can be judged.

We know from later recollections what one of Einstein's modified versions of electrodynamics looked like. In that version, the velocity of light is a constant, not with respect to the ether, but with respect to the source that emits the light. Such a theory is called an "emission" theory of light and, if the other parts of the theory are well behaved, will satisfy the principle of relativity.

Einstein later recalled that the theory he developed was essentially that developed later by Walther Ritz in 1908. In Ritz's theory--and thus probably also in Einstein's theory--all electrodynamic action, not just light, propagated in a vacuum at c with respect to the action's source. The essential change is shown in the animation:
In Maxwell's theory, all electrodynamic action, generated by a source charge at some moment, propagates at c from the fixed point in the ether occupied by the source at that moment.
Ritz theory animation
In a Ritz-style emission theory, all electrodynamic action, generated by a moving source, propagates at c from a point that moves at uniform velocity with the source.

Here is a non-animated version:

non-animated Ritz theory
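The contrast between the two theories can be put in a few lines. This is a minimal sketch, assuming classical (Galilean) addition of velocities and motion along a single line; the function names and the source speed are illustrative assumptions.

```python
C = 299_792_458.0  # m/s


def light_speed_ether_theory(source_velocity: float, c: float = C) -> float:
    """Speed of emitted light relative to the ether in Maxwell's theory.

    The light leaves a fixed point of the ether, so the motion of the
    source makes no difference.
    """
    return c


def light_speed_emission_theory(source_velocity: float, c: float = C) -> float:
    """Speed of emitted light for an observer past whom the source moves.

    The light travels at c relative to the source, so classically the
    source's velocity is simply added.
    """
    return c + source_velocity


source_velocity = 1.0e7  # m/s, in the direction of emission (arbitrary)

print("ether theory:   ", light_speed_ether_theory(source_velocity))
print("emission theory:", light_speed_emission_theory(source_velocity))
```

In the emission theory the speed of the light relative to its own source is always c, whatever the source's motion, which is how such a theory can satisfy the principle of relativity.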

My own best effort to reconstruct the details of Einstein's theory can be found in "Einstein's Investigations of Galilean Covariant Electrodynamics prior to 1905," Archive for History of Exact Sciences, 59 (2004), pp. 45-105.

Crisis: the relativity of simultaneity

It was a lovely theory. But it didn't work. We can only guess what the problems were. But we know he found many. Indeed Einstein seems to have expended considerable energy trying to figure out if any emission theory might work. His later recollections are littered with different reasons for why no emission theory at all could do justice to electrodynamics.

An emission theory fails. So Einstein would have found himself in an impossible position. The speed of light cannot vary with the speed of the emitter; presumably it must be a constant, as Maxwell's theory had urged all along. Yet in addition, Einstein was convinced that the principle of relativity must obtain in electrodynamic theory. How can both obtain? Together they require the speed of light to be the same for all inertial observers.

The footnote already quoted above points us to Einstein's next step.

"The difficulty to be overcome lay in the constancy of the velocity of light in a vacuum, which I first believed had to be given up. Only after years of [jahrelang] groping did I notice that the difficulty lay in the arbitrariness of basic kinematical concepts."

The key to the puzzle is the relativity of simultaneity. If Einstein gives up the absoluteness of simultaneity, then the principle of relativity and the constancy of the speed of light are compatible after all. The price paid for the compatibility is that we must allow that space and time behave rather differently than Newton told us.
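A brief sketch illustrates how the two fit together. It assumes the standard Lorentz transformation of special relativity, which this section does not state; the function name, the relative speed, and the chosen events are illustrative assumptions.

```python
import math

C = 299_792_458.0  # m/s


def lorentz_transform(t: float, x: float, v: float, c: float = C):
    """Coordinates (t', x') of the event (t, x) for an observer moving at v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)


v = 0.6 * C  # relative speed of the two inertial observers (arbitrary)

# Two events that are simultaneous (t = 0) for the first observer but
# occur at different places.
t_a, _ = lorentz_transform(0.0, 0.0, v)
t_b, _ = lorentz_transform(0.0, 1000.0, v)
print(f"moving observer's times: event A {t_a:.3e} s, event B {t_b:.3e} s")

# A light signal that leaves the origin at t = 0 and reaches x = c * 1 s.
t_light, x_light = lorentz_transform(1.0, C, v)
print(f"speed of the signal for the moving observer: {x_light / t_light:.6e} m/s")
```

The two events receive different times from the moving observer, so they are no longer simultaneous, while the light signal still covers distance at the rate c. Giving up absolute simultaneity is what lets both observers find the same speed for light.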

More importantly for Einstein's struggles of that time is an extra bonus: it turns out that within the new theory of space and time of special relativity, Maxwell's electrodynamics does not need to be modified at all. It turns out to be compatible with the principle of relativity just as it is. That would have been a very satisfactory outcome for Einstein.

Einstein recounted later the moment of discovery. In a lecture in Kyoto on December 14, 1922, he is reported by Ishiwara, who took notes in Japanese, to have said:

"Why are these two things inconsistent with each other? I felt that I was facing an extremely difficult problem. I suspected that Lorentz’s ideas had to be modified somehow, but spent almost a year on fruitless thoughts. And I felt that was puzzle not to be easily solved.

But a friend of mine living in Bern (Switzerland) [Michele Besso] helped me by chance. One beautiful day, I visited him and said to him: ‘I presently have a problem that I have been totally unable to solve. Today I have brought this “struggle” with me.’ We then had extensive discussions, and suddenly I realized the solution. The very next day, I visited him again and immediately said to him: ‘Thanks to you, I have completely solved my problem.’

My solution actually concerned the concept of time. Namely, time cannot be absolutely defined by itself, and there is an unbreakable connection between time and signal velocity.

Using this idea, I could now resolve the great difficulty that I previously felt. After I had this inspiration, it took only five weeks to complete what is now known as the special theory of relativity."

Translation from Stachel, John (2002) Einstein from ‘B’ to ‘Z’. Einstein Studies, Volume 9. Boston: Birkhäuser, p. 185.

This moment of recognition of the relativity of simultaneity is one of the great moments of discovery in science and, at this moment, philosophical reflections played a key role. Absolute simultaneity seems an uncontroversial part of the world. How could we give it up? Einstein had been reading many philosophers, including Hume and Mach. They had stressed that concepts are our servants, not our masters, and they are warranted only in so far as they might be grounded in experience. So was absolute simultaneity grounded properly in experience? Einstein began to think about the experiences that we use to establish simultaneity of events and he realized that it was not. Reading these philosophers gave him the courage to continue and to abandon absolute simultaneity. In its place came the relativity of simultaneity.

For an account of how reading Hume and Mach helped, see my "How Hume and Mach Helped Einstein Find Special Relativity," Prepared for M. Dickson and M. Domski, eds., Synthesis and the Growth of Knowledge: Essays at the Intersection of History, Philosophy, Science, and Mathematics. Open Court, forthcoming.

The turn to principles

The moment of the recognition of the relativity of simultaneity came, in the above account, five weeks prior to Einstein's completion of the 1905 paper (and, in another account, five to six weeks prior). In these five to six weeks in which he pulled together the pieces of the finished theory, Einstein made one more very significant methodological advance that would forever color how we see relativity theory.

Einstein's pathway to discovery amounted to the recognition that if you take Maxwell's electrodynamics seriously you have to see that built into it is both the principle of relativity and a new kinematics of space and time that supports it. Yet Einstein does not simply argue it that way in the finished paper.

The reason is not hard to see. Just a few months before completing his 1905 special relativity paper, Einstein had published a paper in which he had foreshadowed the demise of Maxwell's electrodynamics! In this earlier light quantum paper, Einstein had advanced the astonishing assertion that sometimes light does not behave like a wave as Maxwell's theory demanded; sometimes it behaved like a spatially localized collection of energy.

So how could Einstein now base a new theory of space and time on Maxwell's theory? He knew something was very right about Maxwell's theory. There was also something very wrong about it. How could one theorize in such an unstable environment? The answer came to Einstein, as he reported in his Autobiographical Notes, in a distinction of what he called constructive theories from theories of principle.

"Reflections of this type made it clear to me as long ago as shortly after 1900, i.e., shortly after Planck's trailblazing work, that neither mechanics nor electrodynamics could (except in limiting cases) claim exact validity. Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results. The example I saw before me was thermodynamics. The general principle was there given in the theorem: The laws of nature are such that it is impossible to construct a perpetuum mobile (of the first and second kind). How, then, could such a universal principle be found?"

In effect, what Einstein saw was that he did not really need all of Maxwell's theory for his new account of space and time. He needed only a few core ideas robust enough to survive the coming quantum revolution. Following the model of thermodynamics, these few core ideas would be advanced as principles from which the entire theory could be deduced.

What could those principles be? The principle of relativity itself was an obvious choice. He also needed something that distilled the relevant essence of Maxwell's electrodynamics. What about the hardest won lesson of his years of work towards the final theory: the recognition that an emission theory of light must fail? That is, that Maxwell's theory was right after all in demanding that light always propagates at c, no matter how fast the emitter may be moving? That became the second principle, the light postulate. Those two principles proved to be sufficient to allow the entire theory to be deduced. Einstein laid out both as his postulates and the theory adopted its now familiar form.

Einstein's 1905 "On the Electrodynamics of Moving Bodies"

Einstein arrived at his "On the electrodynamics of moving bodies," which is my best candidate for the most famous scientific paper ever written.

The paper has several parts. First there is an introduction. It commences with the recounting of the magnet and conductor thought experiment. It then announces the project of solving the resulting problem with a new theory of space and time based on the principle of relativity and the light postulate.

In the first "Kinematical Part" of the paper, Einstein develops the parts of the theory devoted only to space and time. In its first section, "Definition of Simultaneity," Einstein gives his celebrated analysis of the relativity of simultaneity. It is one of the most celebrated conceptual analyses of the century and a model very many others tried to follow.

The second "Electrodynamical Part" proceeds to what must have seemed for Einstein in 1905 to be the real benefit of the paper. He proceeded to show how Maxwell's electrodynamics was already a theory that conformed to the principle of relativity and noted that this fact made solution of some problems in electrodynamics very easy.

For a problem concerning moving systems, such as the reflection of light off a moving mirror, was really the same as another much easier problem with resting bodies, such as the reflection of light off a resting mirror. If you could solve the easy problem, then the principle of relativity let you write down a solution to the harder one almost immediately, just by transforming your viewpoint from one frame of reference to another.

For more on Einstein's discoveries of 1905, see my website.

What you should know

Copyright John D. Norton. January 2001, September 2002; July 2006; January 2, 2007; January 21, February 4, 2008.