HPS 0410 Einstein for Everyone


Special Theory of Relativity: The Basics

John D. Norton
Department of History and Philosophy of Science
University of Pittsburgh

Background reading: J. Schwartz and M. McGuinness, Einstein for Beginners. New York: Pantheon. pp. 66-151.




"On the Electrodynamics of Moving Bodies"

In June 1905, when Albert Einstein was still a patent examiner in Bern, Switzerland, he sent a paper with this title to the journal Annalen der Physik. It contained his special theory of relativity. He argued that altering our understanding of the behavior of space and time could resolve certain problems in electrodynamics. (See page one in German or English.)

To understand what these alterations were, we need some preliminary notions.

Inertial and Accelerated Motion

There is a preferred motion in space known as inertial motion. Any body left to itself in space will default to an inertial motion, which is just motion at uniform speed in a straight line. The easiest example to visualize is a huge spaceship with the engines turned off, gliding through space. At any point in space, many inertial motions are possible. They will be pointed in different directions and will be at different speeds.

Any other motion is accelerated. This includes motion at uniform speed in a circle. While the speed stays the same, the direction does not. So the motion is accelerated.

Sometimes we will talk of an "inertial observer," which is just an observer moving inertially.

Such an observer might set up an elaborate system of measuring rods and other physical devices to fix the positions of events, and an elaborate system of clocks to fix their timing. Such a system is an inertial frame of reference.

Absolute versus Relative Motion

Relative motion arises when one body moves with respect to another. For example, our spaceship might move relative to a nearby planet. Correspondingly the planet moves relative to the spaceship.

Prior to Einstein, it was generally thought that there was another sense of motion, absolute motion. According to this sense, there is a fact of the matter as to whether the spaceship is moving, without regard to whether it moves relative to another object, such as a planet. There is an absolute state of rest in space, according to this earlier view. Either the spaceship is in this state and at rest; or it is not and it is moving.





Einstein found it most convenient to base his theory of relativity on two postulates; once they were assumed it became an exercise in logic to develop the whole theory. The two postulates are
        I. The Principle of Relativity and
        II. The Light Postulate.

I. The Principle of Relativity

All inertial observers find the same laws of physics.

What this says is just this: imagine two spaceships, each moving inertially in space but with different velocities. If we conduct experiments on either ship aimed at determining a law of physics, we will end up with the same law no matter which spaceship we are on.

Or, more simply, the laws of physics tell us which physical processes can happen and which cannot. So if all inertial observers find the same laws, that just means that any process that can happen for one inertial observer can happen for any other.


Here are some important consequences of the principle:

No experiment aimed at detecting a law of nature can reveal the inertial motion of the observer.
Absolute velocity has no place in any law of nature.
No experiment can reveal absolute motion.

Notice that the principle of relativity is limited to inertial motions. In special relativity, this relativity of motion does not extend to accelerated motion. If something accelerates, then it does so absolutely; there is no need to say that it "accelerates with respect to..." A traditional indicator of acceleration is inertial forces. If you are in an airplane that flies uniformly in a straight line, you have no sense of motion. If the airplane hits turbulence and accelerates, you sense the acceleration immediately as inertial forces throw things around in the cabin.

II. The Light Postulate

All inertial observers find the same speed for light.

That speed is 186,000 miles per second or 300,000 kilometers per second. Because this speed crops up so often in relativity theory, it is represented by the letter "c".

That Einstein should believe the principle of relativity should not come as such a surprise. We are moving rapidly on planet earth through space. But our motion is virtually invisible to us, as the principle of relativity requires.

Why Einstein should believe the light postulate is a little harder to see. We would expect that a light signal would slow down relative to us if we chased after it. The light postulate says no. No matter how fast an inertial observer is traveling in pursuit of the light signal, that observer will always see the light signal traveling at the same speed, c.


The principal reason for his acceptance of the light postulate was his lengthy study of electrodynamics, the theory of electric and magnetic fields. The theory was the most advanced physics of the time. Some 50 years before, Maxwell had shown that light was merely a ripple propagating in an electromagnetic field. Maxwell's theory predicted that the speed of the ripple was a quite definite number: c.

The speed of a light signal was quite unlike the speed of a pebble, say. The pebble could move at any speed, depending on how hard it was thrown. It was different with light in Maxwell's theory. No matter how the light signal was made and projected, its speed always came out the same.

The principle of relativity assured Einstein that the laws of nature were the same for all inertial observers. That light always propagated at the same speed was a law within Maxwell's theory. If the principle of relativity was applied to it, the light postulate resulted immediately.


A Light Clock

One cannot have both of Einstein's postulates and leave everything else unchanged. We can only retain both without contradiction if we make systematic changes throughout our physics. Let us begin investigating these changes, which include our basic, classical presumptions about space and time. One of them is that we learn that a moving clock runs slower.

To see how this comes about, we could undertake a detailed analysis of a real clock, like a wristwatch or a pendulum clock. That would be difficult and complicated--and unnecessarily so. All we need is to demonstrate the effect for just one clock and that will be enough, as we shall see shortly, to give it to us for all clocks. So let us pick the simplest design of clock imaginable, one specifically chosen to make our analysis easy.

A light clock is an idealized clock that consists of a rod of length 186,000 miles with a mirror at each end. A light signal is reflected back and forth between the mirrors. Each arrival of the light signal at a mirror is a "tick" of the clock. Since light moves at 186,000 miles per second, it ticks once per second.

Light Clocks are Slowed by Motion

To see the effect of motion on this light clock, imagine that it has been set into rapid motion. To begin, we will assume that the motion is perpendicular to the rod and that it is very fast--99.5% the speed of light. (We'll write this compactly as "0.995c.") An observer traveling with the clock will still see the light signal bounce backwards and forwards between the mirrors as before. Let us view this process from the perspective of an observer who stays behind and does not move with the clock.

light clock 2

That observer sees a light signal leave one end of the rod and arrive at the other end. But that end is now rushing away from the light signal at 99.5% the speed of light. A quick calculation shows that the signal will now take 10 seconds to reach the other end of the rod.


To see this, note that in ten seconds the rod will move 1,850,700 miles, as shown in the figure above. So to get to the end of the rod, the light signal must traverse the diagonal path shown. A little geometry tells us that a right angle triangle with sides 186,000 miles and 1,850,700 miles will have a diagonal of 1,860,000 miles.

Pythagoras' theorem tells us the diagonal is approximately 1,860,000 miles since

$$(1{,}860{,}000\ \text{miles})^2 \approx (1{,}850{,}700\ \text{miles})^2 + (186{,}000\ \text{miles})^2$$

Since light moves at 186,000 miles per second, it will need ten seconds to traverse the diagonal.

Setting the arithmetic aside, the result is simple. Since the light signal must travel so much farther to traverse the rod of a moving clock, it takes much longer to do it. So a moving light clock ticks slower. In this case, for a clock moving at 99.5% the speed of light, it ticks once each ten seconds instead of once each second.
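The ten-second figure can be checked numerically. The sketch below is only an illustration under the text's rounded values (rod length and speed of light both 186,000 in miles and miles per second); the function name `tick_time` and its parameters are hypothetical, introduced here for the example. It solves the right-triangle condition for the travel time rather than assuming the answer.

```python
import math

C = 186_000.0    # speed of light in miles per second (rounded, as in the text)
ROD = 186_000.0  # length of the light clock's rod in miles


def tick_time(speed_fraction, rod=ROD, c=C):
    """Time for light to cross a rod that moves perpendicular to its length.

    In time t the rod moves v*t sideways while the light covers c*t along
    the diagonal, so Pythagoras gives (c*t)**2 = rod**2 + (v*t)**2.
    Solving for t yields t = (rod / c) / sqrt(1 - (v/c)**2).
    `speed_fraction` is v/c and must be less than 1.
    """
    return (rod / c) / math.sqrt(1.0 - speed_fraction ** 2)


t = tick_time(0.995)
print(f"tick time at 0.995c: {t:.4f} seconds")
print(f"diagonal path of the light: {C * t:,.0f} miles")
print(f"sideways travel of the rod: {0.995 * C * t:,.0f} miles")
```

The computed time comes out slightly above ten seconds; the text rounds it to ten to keep the arithmetic simple.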


All Moving Clocks Are Slowed by Motion

An ordinary clock and a light clock side by side.

A simple application of the principle of relativity shows that all clocks must be slowed by motion, not just light clocks. We set a clock of any construction next to a light clock at rest in an inertial laboratory.

We notice that they both tick at the same rate.

That must remain true when we set the laboratory into a different state of inertial motion.

But since the light clock has slowed with the motion, the other clock must also slow if it is to keep ticking at the same rate as the light clock.

You might be tempted to say that the other clock would not keep pace with the light clock. But then you would have devised a device that detects absolute motion, in contradiction with the principle of relativity. That device would pick out absolute rest as the only state in which the two clocks run at the same rate.


Moving Rods Shrink in the Direction of Their Motion

So far, we have considered a light clock whose rod is perpendicular to the direction of its motion. If we now consider a light clock whose rod is oriented parallel to the direction of motion, we will end up concluding that its rod must shrink in the direction of its motion. To get to this result, we need two steps:


First Step: Light clocks oriented perpendicular to one another run at the same speed.

Two perpendicular light clocks.

Take the light clock considered above. Imagine a second, identical light clock with its rod oriented parallel to the direction of the motion. Once again the principle of relativity requires that both clocks run at the same speed. We could just leave it at that--an application of the earlier result. However it is reassuring to go through it from scratch.

To begin, we don't need the principle of relativity to see that the clocks at rest run at the same rate. They will run at the same rate simply because they are the same clocks oriented in different directions. That just follows from the symmetry of space. All its directions are equivalent. So the orientation of the clock cannot affect its speed.

Now imagine that we take the entire system of the two clocks and set it into rapid motion at, say, 99.5% the speed of light, in the direction of one of the light clocks.

Two light clocks moving.

An observer moving with the two light clocks must see them continue to run at the same rate. We now do need the principle of relativity to establish this. Our earlier symmetry argument doesn't work anymore, since the two directions of the clocks are intrinsically different. One is perpendicular to the direction of motion; the other is parallel to it. The principle of relativity requires that they run at the same rate. For, if they ran at different rates, the device would be an experiment that could detect absolute motion.

Second Step: The rod oriented in the direction of motion must shrink.

We know from the earlier analysis that a light clock (indeed any clock) moving at 99.5% the speed of light is slowed so that it ticks only once in ten seconds. So now we know that the light clock oriented parallel to the direction of motion must tick once each ten seconds. But that cannot happen if everything is just as we describe it. Imagine the outward bound journey of the light signal.


Light clock parallel to motion.

The light signal has to go from one end to the other of a 186,000 mile rod. The light moves at 186,000 miles per second. But the rod is also moving in the same direction at 99.5% the speed of light. So the light has to chase after a rapidly fleeing end and will need much more than a second to catch it. With a little arithmetic it turns out that the light will need 200 seconds to make the trip.
But the light clock has to tick once every ten seconds! Something has gone badly wrong. What has gone wrong is our assumption that the rod parallel to the direction of motion retains its length. That is incorrect. That rod actually shrinks to 10% of its original length, so the moving pair of clocks really looks more like:
Two light clocks, one shrunk.

Now the light signal has time to get from one end of the rod to the other and keep the clock ticking at once each ten seconds as expected. The signal just has far less distance to travel so now it can maintain the rate of ticking expected.


There are more details in this last calculation that I don't want to bother you with. But since some of you will ask, here they are--but only for those who really want them.

Overall it will turn out that the light signal now needs 20 seconds to complete the journey from the trailing end of the rod to the front and then back. That is what we expect. The round trip journey is "two ticks" and should take 2x10=20 seconds. The catch is that virtually all of the 20 seconds will be spent in the forward trip and virtually none of it in the rearward trip. This effect actually figures in the relativity of simultaneity which we will discuss at some length later.

If you want to see this for yourself you should redo the calculations. If you do, you'll need to undo my rounding off. The rod is not contracted to exactly 10% of its length--I rounded things off to keep life simple. It is 9.987%. The ticks are not exactly 10 seconds apart, but 10.0125 seconds. The forward trip will take 19.9750 seconds. The rearward trip will take 0.05 seconds. That gives a total round trip of 20.025 seconds = 2x10.0125 as expected.
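For readers who would rather let a computer redo the calculation, here is a minimal sketch. It assumes the text's rounded speed of light of 186,000 miles per second; the function name `trip_times` is hypothetical. The forward trip uses the rate at which light gains on the fleeing front end, and the rearward trip uses the rate at which the trailing end and the light approach each other.

```python
import math

C = 186_000.0    # speed of light in miles per second (rounded, as in the text)
ROD = 186_000.0  # rest length of the rod in miles
V = 0.995 * C    # speed of the moving clock


def trip_times(rod_length, v=V, c=C):
    """Forward and rearward light travel times along a rod moving at speed v.

    Forward: the light gains on the fleeing front end at rate (c - v).
    Rearward: the light and the trailing end close on each other at (c + v).
    """
    forward = rod_length / (c - v)
    rearward = rod_length / (c + v)
    return forward, rearward


# If the rod kept its full length, the forward trip alone would be far too long.
forward_full, _ = trip_times(ROD)
print(f"uncontracted rod, forward trip: {forward_full:.1f} seconds")

# Shrink the rod by the same factor that slows the clocks.
shrink = math.sqrt(1.0 - (V / C) ** 2)
forward, rearward = trip_times(ROD * shrink)
print(f"rod shrinks to {100 * shrink:.3f}% of its rest length")
print(f"forward trip:  {forward:.4f} seconds")
print(f"rearward trip: {rearward:.4f} seconds")
print(f"round trip:    {forward + rearward:.4f} seconds")
```

With the contracted rod, the round trip equals two of the slowed ticks, which is what the principle of relativity requires.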

The analysis is now complete. We have learned that a clock moving at 99.5% the speed of light slows by a factor of ten. It ticks once each ten seconds instead of once each second. A rod, oriented in the direction of motion, shrinks to 10% of its length. Rods perpendicular to the direction of motion are unaffected.

The two effects are not noticeable as long as our speeds are far from that of light. They become marked when we get close to the speed of light. The closer we get to the speed of light, the closer clocks come to stopping completely and rods come to shrinking to no length in the direction of motion. For more details of how the effects depend on speed, see What Happens at High Speeds.
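To get a feel for how sharply the effects switch on near the speed of light, the following sketch tabulates the slowing factor for a few speeds. It uses the factor 1/sqrt(1 - (v/c)^2) that the light clock analysis produces; the function name `slowing_factor` and the particular speeds sampled are choices made for this illustration only.

```python
import math


def slowing_factor(speed_fraction):
    """Factor by which a moving clock is slowed, for a speed given as v/c.

    A rod along the direction of motion shrinks to 1/factor of its length.
    """
    return 1.0 / math.sqrt(1.0 - speed_fraction ** 2)


for frac in (0.001, 0.1, 0.5, 0.9, 0.995, 0.9999):
    factor = slowing_factor(frac)
    print(f"{frac:>7}c  clock slowed by {factor:9.4f}  "
          f"rod shrinks to {100.0 / factor:8.4f}%")
```

At a thousandth of the speed of light the factor differs from one by less than a millionth, which is why the effects escape everyday notice.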

Nothing Can Be Accelerated Through the Speed of Light

The speed of light clearly has a special place in this theory. If something is traveling at the speed of light c, then all observers will find it to be traveling at exactly the same speed.

A similar thing happens to things traveling at less than the speed of light. If one observer finds an object to be traveling at less than light, say, then so must every other. There is no way that observers can change their states of motion so as to find the object traveling at faster than the speed of light. And there is a similar result for objects traveling at faster than the speed of light--if such things exist. If one observer finds them traveling at faster than the speed of light, then so must all.

One of light's most important roles as a limiting velocity follows from this: no matter how hard we try, it is impossible to accelerate something through the speed of light. More generally, the speeds of things are divided into three groups:
--things that travel slower than light,
--things that travel at exactly the speed of light,
--and things that travel faster than the speed of light.
We cannot slow down or speed up anything so that it crosses the barrier of the speed of light.

Yet it looks like it would be pretty easy to violate the limiting character of the speed of light by accelerating something through the speed of light. We might have a gun that can fire particles at, say, 2,000 miles per second. That is well below the speed of light. We put the gun on a spaceship that we accelerate up to 185,000 miles per second--a mere 1,000 miles per second short of the speed of light. If we fire the gun in the direction of motion, would it not accelerate the particle through the speed of light?

The limiting character of the speed of light is sufficiently striking for it to be worth seeing how it follows from the principle of relativity.

Setting up the Challenge

To see it, let us set up the challenge quite solidly. Imagine a machine that can fire particles at 100,000 miles per second, which is more than half the speed of light, 186,000 miles per second.

Machine that fires particles at 100,000 miles per second.

Now we will try to push things past the speed of light. Imagine that the machine is placed on a spaceship that also moves at 100,000 miles per second in the direction that the machine fires the particles; that is, it moves at this speed with respect to a second observer on the earth.

Machine on spaceship traveling at 100,000 miles per second.

So, let us ask the obvious question. What will the earth bound observer find for the speed of the particle?

The calculation seems irresistible.

The spaceship moves at 100,000 miles per second with respect to the earthbound observer; and the particle moves at 100,000 miles per second with respect to the spaceship. So...

100,000 + 100,000 = 200,000 ??

But that would be faster than the speed of light, 186,000 miles per second.

Prohibited by the Principle of Relativity

To see that the principle of relativity prohibits this faster than light outcome, imagine that a light signal passes the particle emitting machine at the moment that the particle is emitted. The observer moving with the machine would (obviously) judge that the light signal overtakes the particle.

Light signal overtakes particle.

Now imagine this same process viewed by the earthbound observer.

Earth bound observer watches light overtake the particle.

That observer must also see the light signal overtake the particle.

It is just the one experiment, so both observers must judge the same outcome.

What else would you expect? Might it be that the light signal would overtake a particle emitted by the machine when the machine is on earth, but that the particle would overtake the light when the machine emits on a rapidly moving spaceship?

That is exactly what the principle of relativity prohibits! For then we have an experiment that can detect absolute motion. The resting machine emits particles that don't overtake light; the rapidly moving machine emits particles that do overtake light. The principle of relativity demands that the experiment must proceed in the same way when carried out on earth or a rapidly moving spaceship.

Adding Velocities Einstein's Way

What this shows is that the principle of relativity prohibits us adding velocities in the usual way. We cannot add velocities by the ordinary rule 100,000 + 100,000 = 200,000. More generally, the classical rule for the composition of velocities fails:

$$\text{Velocity of A with respect to C} = \text{Velocity of A with respect to B} + \text{Velocity of B with respect to C}$$

In its place we need a new rule for the composition of velocities. It ought to look like the ordinary rule as long as velocities are small--we do know that the ordinary rule works for slow moving things like cars on freeways and trains. But it must look very different at high speeds. If we use it to add two velocities close to light, we must get a resultant that is still less than the velocity of light. Einstein found that the principle of relativity forces a particular rule. For the case of velocities oriented in the same direction in space, the relativistic rule for composition of velocities is:

$$\text{Velocity of A with respect to C} = \frac{\text{Velocity of A with respect to B} + \text{Velocity of B with respect to C}}{\text{reduction factor}}$$

All the work is done in this new rule by the reduction factor. When the velocities are small, this factor is close to 1. So it is as if it isn't really there and Einstein's rule just behaves like the classical rule. But when the velocities get to be close to that of light, the factor starts to get larger and larger and in just the right way to prevent any composition of velocities less than light exceeding that of light.

If we use the rule to add 100 mph to 100 mph, the reduction factor is almost exactly one, so the ordinary rule works: 100 + 100 = 200.

If we use the rule for adding 100,000 miles per second to 100,000 miles per second, we are now dealing with velocities that are 100,000/186,000 = 0.54 the speed of light. For that sum, the reduction factor is 1.29, so the composition yields:
          (100,000 + 100,000)/1.29 = 200,000/1.29 = 155,000
which is still less than the speed of light.
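The two sums above can be reproduced with a short sketch. The text does not spell out the reduction factor; the code assumes the standard relativistic expression, 1 + (u x v)/c^2 for velocities u and v in the same direction, which agrees with the value 1.29 quoted for this case. The function names `reduction_factor` and `compose` are hypothetical, and c is set to the text's rounded 186,000 miles per second.

```python
C = 186_000.0  # speed of light in miles per second (rounded, as in the text)


def reduction_factor(u, v, c=C):
    """Assumed standard form of the reduction factor: 1 + u*v/c**2."""
    return 1.0 + (u * v) / (c * c)


def compose(u, v, c=C):
    """Relativistic composition of two velocities in the same direction."""
    return (u + v) / reduction_factor(u, v, c)


# Everyday speeds: 100 mph converted to miles per second.
slow = 100.0 / 3600.0
print(f"reduction factor for 100 mph + 100 mph: {reduction_factor(slow, slow):.15f}")
print(f"composed speed: {compose(slow, slow) * 3600.0:.6f} mph")

# High speeds: 100,000 miles per second composed with itself.
fast = 100_000.0
print(f"reduction factor for the fast case: {reduction_factor(fast, fast):.2f}")
print(f"composed speed: {compose(fast, fast):,.0f} miles per second")

# Composing anything with the speed of light gives back the speed of light.
print(f"light composed with the fast particle: {compose(C, fast):,.0f} miles per second")
```

Using the unrounded factor gives a result slightly above the text's 155,000, which was obtained by dividing by the rounded value 1.29.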

What is most instructive is to see what happens if we start with a velocity of 100,000 miles per second; and add 100,000 miles per second to it; and add it again; and again; and again.

To picture physically what we are doing, imagine that we start with our base machine "I" that happens already to be moving at 100,000 miles per second. From it we shoot out a second smaller version of the same machine--call it "II"--at 100,000 miles per second with respect to "I."

vel add 6

Now let's repeat the operation. From the smaller machine "II," we'll shoot out a yet smaller version of the same machine at 100,000 miles per second with respect to "II." Call it "III."

vel add 7

Then machine "III" will shoot out machine "IV"; and so on; and so on. As we pass through the series of machines "I," "II," "III," "IV," etc., we are boosting each with a speed of 100,000 miles per second with respect to the one before.

The cumulative effect of the repeated boosting by 100,000 miles per second is shown below. The total speed of the last boosted machine increases as we proceed along the sequence "I," "II," etc. But the increases become smaller and smaller.

I. 100,000 --(+100,000)--> II. 155,000 --(+100,000)--> III. 176,000 --(+100,000)--> IV. 183,000 --(+100,000)--> V. 185,100 --(+100,000)--> VI. 185,700

(All speeds in miles per second.)

No matter how often we add 100,000 miles per second, we never get past the speed of light--here set at exactly 186,000 miles per second. We get closer and closer to it. But never past it.

One way to think of it is as an "Einstein tax," that copies the way a very severe progressive taxation might increase the amount of tax paid as we get more income. We keep adding 100,000 miles per second to the speed, but the Einstein tax--implemented through the reduction factor--precludes our total speed ever exceeding that of light.

That the ordinary addition rule fails follows from the principle of relativity. Why should the ordinary rule fail? Here's a way to get comfortable with the failure. In the original example, the spaceship observer uses rods and clocks that move with the spaceship to measure the speed of the emitted particle as 100,000 miles per second. The earthbound observer now wants to find the speed of the emitted particle. That observer, however, cannot directly use measurements made with the spaceship rods and clocks, for the earthbound observer thinks that they have shrunk and slowed. The earthbound observer must correct the spaceship observer's measurements for effects such as these. The result of these corrections is Einstein's formula!

Relativity of Simultaneity

When Einstein first hit upon special relativity, he thought one effect of special importance, so much so that it fills the first section of his "On the Electrodynamics of Moving Bodies." It is the relativity of simultaneity. According to it, inertial observers in relative motion disagree on the timing of events at different places. If one observer thinks that two events are simultaneous, another might not. At first this will seem like just another of the many novel effects relativity brings. However, as we explore more deeply, you will see that this is the central adjustment Einstein made to our understanding of space and time in special relativity. Once you grasp it, everything else makes sense. (And until you do, nothing quite makes sense!)


There is a quick way to see how this comes about. Imagine a long platform with an observer located at its midpoint. At either end, at the places marked A and B, there are two momentary flashes of light. The light propagates from these events to the observer. Let us imagine that they arrive at the same moment, as they do in the animation below. Noticing that they arrive at the same moment and that they come from places equal distances away, the observer will decide that the two events happened simultaneously.

Another outcome is closely related. Imagine also that there are clocks located at A and B. If both clocks show the same reading at the events of the two flashes, then we would judge the two clocks to be properly synchronized. That is what the platform observer judges since, as the animation shows, both clocks read "0" when the flashes occur at each location.

simultaneity 1

Here's a version that isn't animated.

unanimated version



So far, nothing remarkable has happened. That is about to change.

Now consider this process from the point of view of an observer who moves relative to the platform along its length. For that new observer, the platform moves rapidly and, in the animation, in the direction from A towards B. Once again there will be two flashes and light from them will propagate towards the observer at the midpoint of the platform. However the midpoint is in motion. It is rushing away from light coming from A; and rushing toward the light coming from B. Nonetheless, the two signals arrive at the midpoint at the same moment.

animated simultaneity 2



Here's a version that isn't animated.

still simultaneity 2

What is the new observer to make of this? For the new observer, the light from A must cover a greater distance to catch up with the receding midpoint; and the light from B must cover a lesser distance to arrive at the midpoint rushing towards it. So if the two arrive at the same moment, the light from A must have left earlier than the light from B to give it greater time to cover the greater distance to get to the midpoint. That is, the flash at A happened earlier than the flash at B. The two events were not simultaneous, according to the new observer.
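The new observer's reasoning can be put into numbers. The sketch below is an illustration under stated assumptions: units are chosen so that c = 1 (distances in light-seconds), `half_length` is the distance from each end of the platform to its midpoint as measured by the new observer, and the speed of 0.5c is an arbitrary example value. The function name `emission_times` is hypothetical.

```python
def emission_times(half_length, v, c=1.0, arrival_time=0.0):
    """Times at which the A and B flashes occurred, for the new observer.

    The platform moves at speed v in the direction from A toward B, and
    both light signals reach the midpoint together at `arrival_time`.
    """
    # Light from A chases the receding midpoint, gaining on it at (c - v).
    t_a = arrival_time - half_length / (c - v)
    # Light from B meets the approaching midpoint, closing at (c + v).
    t_b = arrival_time - half_length / (c + v)
    return t_a, t_b


t_a, t_b = emission_times(half_length=1.0, v=0.5)
print(f"flash at A: t = {t_a:.4f} seconds")
print(f"flash at B: t = {t_b:.4f} seconds")
print(f"A precedes B by {t_b - t_a:.4f} seconds")

# With no relative motion the two flashes come out simultaneous again.
print(emission_times(half_length=1.0, v=0.0))
```

The flash at A comes out earlier than the flash at B whenever v is greater than zero, and the gap grows as v approaches c.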

The reasoning extends to the clocks. The clocks at A and B show the same time when the flash events happen at each. These two events are not simultaneous for the new observer. Therefore the new observer will judge the clocks at A and B not to be properly synchronized. In fact clock A is set ahead of clock B.


In short, the platform observer will say that the two flashes happened simultaneously and that the two clocks are properly synchronized; the new observer will say the A-flash happened first and that the A-clock is set ahead of the B-clock. It is not a matter that one or other of them is somehow misinformed. They are both using the same information. Rather it is that judgements of the simultaneity of spatially separated events depend on the observer, just as the rates of clocks and lengths of bodies depend on the observer in special relativity.

What the Relativity of Simultaneity is NOT

There is a quite benign way in which observers can disagree on the simultaneity of events. It is not the effect at issue. To see the benign way, imagine that a flash of lightning strikes the tree you are standing under. Let us say the strike comprises two events: the flash of the light and the boom of the thunder. For you standing under the tree, if you survive, the two events are simultaneous. It would not appear so for someone standing on a distant hill top watching the lightning strike. That observer would see the flash and then, several seconds later, hear the boom of the thunder. For you the flash and the boom are simultaneous. For the distant observer they are not simultaneous; or, more precisely, they do not appear simultaneous.

This same effect can arise in more abstruse settings. When we look at a distant galaxy 10 million light years away, we are seeing it as it appeared 10 million years ago. So if we see some event occurring now, such as a star in the galaxy exploding, that event really happened 10 million years ago. It will appear to us that it happened now, at the same time as the events of the present day. In fact it did not.

These two examples both illustrate the oddities of what we can call "appearance simultaneity." Events are simultaneous in this sense merely if our sensations of them happen at the same moment. Or they fail to be simultaneous in this sense if our sensations of them happen at different times.

That sort of simultaneity is not the sort that is at issue in the relativity of simultaneity. The idea is that we correct for differences in appearance simultaneity. For example, when we hear the boom of the thunder coming after we see the flash of the lightning, we routinely allow for the fact that light travels very rapidly, but sound travels slowly--roughly one mile in five seconds. So even though we sense the flash and boom at different times, we judge the two originating events to be simultaneous.
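The routine correction for the thunder can be written out as a small calculation. This sketch uses the text's rough figure of one mile in five seconds for sound and 186,000 miles per second for light; the two-mile distance and the function names `arrival_times` and `corrected_event_times` are assumptions made for the example.

```python
SPEED_OF_LIGHT = 186_000.0  # miles per second (rounded, as in the text)
SPEED_OF_SOUND = 1.0 / 5.0  # miles per second ("one mile in five seconds")


def arrival_times(distance):
    """When the flash is seen and the boom is heard, for a strike at time zero."""
    return distance / SPEED_OF_LIGHT, distance / SPEED_OF_SOUND


def corrected_event_times(distance, flash_seen_at, boom_heard_at):
    """Subtract each signal's travel time to recover when the events happened."""
    flash_event = flash_seen_at - distance / SPEED_OF_LIGHT
    boom_event = boom_heard_at - distance / SPEED_OF_SOUND
    return flash_event, boom_event


distance = 2.0  # miles from the observer on the hill to the tree (assumed)
seen, heard = arrival_times(distance)
print(f"flash seen after {seen:.6f} seconds, boom heard after {heard:.1f} seconds")

flash_event, boom_event = corrected_event_times(distance, seen, heard)
print(f"corrected event times: flash {flash_event:.6f}, boom {boom_event:.6f}")
```

After the correction both events fall at the same time, so the distant observer agrees with the person under the tree; this kind of disagreement is only apparent.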

The relativity of simultaneity of relativity theory arises after we have corrected for the oddities of appearance simultaneity. Even after those corrections have been made, it turns out that observers in relative motion will not agree on the timing of spatially separated events. In the thought experiment above with the A and B clocks, it turns out that no corrections for appearance simultaneity are needed. Since the observer is located at the midpoint of the platform, the flashes of light at A and B are delayed equally. That is why the observer was placed there.

A Final Observation

This special role for the speed of light sometimes arouses special wonder. What is so special about light, we may be drawn to ask, that everything else takes such special note of it? Once one starts along this path, all sorts of confusions may arise. Is it that light is used for communication and finding things out? Does everything somehow respond to how we find things out? Well--you can forget all this mystical mumbo-jumbo, if ever it attracted you. There is nothing special about light. The basic fact is that there is a fundamental invariant (="unchanging") speed that is a property of space and time itself. Light is just something that happens to go as fast as it possibly can and thereby ends up going at that speed.

There's nothing special about light. What is special is the speed at which it goes.


What you need to know:


Copyright John D. Norton. January 2001, August 30, 2002, July 20, 2006; January 8 2007, January 3, 2008.