OLD ARCHIVED VERSION OF JULY 2010.
PLEASE CONSULT LATEST VERSION.


When a Good Theory meets a Bad Idealization: The Failure of the Thermodynamics of Computation

John D. Norton
Center for Philosophy of Science
Department of History and Philosophy of Science
University of Pittsburgh
http://www.pitt.edu/~jdnorton

Experts who want to see the no go result described in 650 words should go to "No Go Result for the Thermodynamics of Computation."


The thermodynamics of computation seeks to identify the principled thermodynamic limits to computation. It imagines computations carried out on systems so small that their components are of molecular sizes. The founding tenet of the analysis is that all processes, excepting one class, can in principle be carried out in a non-dissipative manner, that is, in a manner in which no thermodynamic entropy is created. That sole class of entropy-creating processes is identified by the central dogma as those that carry out a many-to-one mapping. The universal example of such a process is erasure. It takes a memory device that may be in many different states and maps it to one, a single reset state.

Elsewhere I have argued that the thermodynamics of computation is gravely troubled. Its leading principle, Landauer's Principle, connects erasure to entropy creation. Yet there is still no sound justification for it. The proofs that have been given rest on fallacies or misapplications of statistical and thermal physics.

The purpose of this site is to describe a different problem in the theory that I hope will be of more general interest in philosophy of science. It gives a concrete example of what can go wrong if one uses idealizations improperly.

An ineliminable assumption of the thermodynamics of computation is that one can employ a repertoire of non-dissipative processes at molecular scales. These processes are thermodynamically reversible and, if carried out, would involve no net creation of entropy.

All thermal processes at molecular scales are troubled by fluctuations. They arise because thermal processes are the average of many microscopic molecular processes and the averaging is never quite perfect. Most famous of these is the Brownian motion that leads pollen grains to dance about under a microscope. The dancing arises because the collisions of water molecules with the pollen grain from all sides do not quite average out to leave no net effect.

Brownian motion

My contention here is that the reversible processes presumed in the thermodynamics of computation are fatally disrupted by fluctuations that the theory selectively ignores. That is especially awkward since the one occasion on which processes connected to fluctuations are not ignored is when the theory treats the thermodynamics of erasure.

As a result, the founding tenet of the thermodynamics of computation--that only erasure-like processes are necessarily dissipative--arises entirely because fluctuations have been idealized away selectively for other processes. In short, the basic conception of the theory itself derives from inconsistently implemented idealizations.

CONTENTS

The account below is written at a more general level, for readers who, I presume, have had only slight contact with the thermodynamics of computation. Little thermodynamics is assumed. But a reader would be well placed just knowing the difference between work and heat and having some sense of how entropy increase can connect to dissipation.

1. How an Idealization Can be Bad

Idealizations are a routine part of science. They are known falsehoods, introduced sometimes for theoretical convenience and sometimes out of necessity. The latter arises often in physics, since no theory gives the complete truth. Every theory is an idealization to some degree.

When used wisely, idealizations enable good physics. Galileo was able to develop his theory of projectile motion precisely because he idealized away air resistance. It is the classic case of a successful idealization. The imprudent use of idealizations, however, can lead to great mischief. Imagine that one tries to give an account of airplane flight that neglects the effect of air passing over the airplane. It is the same idealization as used by Galileo. However now it leads to the disastrous result that sustained flight is impossible for an airplane.

This is one way that an idealization can be a bad idealization: it leads to incorrect results that we attribute unwittingly to the original system, where those results are merely artifacts of the idealization.

My goal here is to outline a striking case of a bad idealization in present science. It has created a spurious science, known as the "thermodynamics of computation." The analyses in this science depend upon idealizing away thermal fluctuations in an arbitrary and selective manner. If one treats the fluctuations consistently, the results of the science disappear.

One can get an idea in advance of what the problem will be by thinking about scaling. We tend to imagine things remaining pretty much the same as we scale things down. If we were one tenth or one hundredth our size, we imagine that we could carry on our business pretty much as normal, as long as everything else was scaled down correspondingly.


Famously, those intuitions are flawed. Humans scaled up to the size of elephants would need legs as fat as an elephant's merely to move around. If we were scaled down to the size of fleas, however, our muscles would be grossly oversized. The slender leg of a flea is already powerful enough to let it leap many times its height.

These sorts of problems mount the greater the scaling. When we scale static objects down to molecular sizes, we imagine that they will stay put, just as they did at macroscopic scales. At the larger scale, it is no great feat to stack up a few children's blocks.


A block animated

But if they were scaled down to molecular sizes, the accumulated impact of air molecules on the now minuscule block would not average out smoothly. The block would be jumping about. Simple stacking and all the other building processes that we take for granted at larger scales would be fatally disrupted.

The thermodynamics of computation has proceeded by assuming too often that one can scale down to molecular sizes familiar processes that work well on macroscopic scales. The ones that do not scale down are the non-dissipative processes, for they can be non-dissipative only by virtue of being in perfect but delicate equilibrium at every stage. The thermal fluctuations that lead a tiny block to jiggle about will disrupt them as well.

2. How Thermodynamics Applies to Computation

2.1 The Landauer-Bennett Orthodoxy

The notion that there is a thermodynamics of computation begins with the fact that all computers are thermal systems. In operation, they take work energy, usually supplied as electrical energy, and convert it to heat. This conversion of work to heat is thermodynamically dissipative: it is entropy increasing. As a practical matter, compact computing devices of any power require cooling systems to carry away the heat generated.

motherboard with fan

The thermodynamics of computation seeks to answer the question of whether this dissipative heat generation can be eliminated entirely, at least in principle; or whether some heat generation arises essentially as a part of computation itself.

The reigning answer to this question is based on the work of Rolf Landauer and Charles Bennett. According to what I shall call the Landauer-Bennett orthodoxy, all computational processes can be carried out in principle in a non-dissipative manner excepting one, a many-to-one mapping of states. Erasure is the familiar example of such a process. Thermodynamic entropy is necessarily created by erasure or, equivalently, work is necessarily converted to heat according to:

Landauer's Principle: Erasure of one bit of information creates at least k ln 2 of thermodynamic entropy in the surroundings.

2.2. A One Bit Memory Device

To see how this Principle works, consider a one bit memory device. It is a physical system that admits two distinct states corresponding to the "0" or "1" of the bit stored. In principle, it could be a boulder that is rolled to one or the other side of a cavern. This boulder and cavern is a thermal system. As a result, the boulder's energy is fluctuating very slightly as it gains and loses energy in collisions with air molecules and with the other systems around it.

Practical computing devices are much smaller than this boulder. In older electronic computing devices, a single bit was stored in the ferrite rings of a magnetic core. The magnetic field of the core could be manipulated electrically to point in one of two directions that corresponded to the "0" or "1" of the stored bit. Modern DRAM chips record a bit according to whether a tiny charge is stored on a capacitor within the chip.

chip

These devices are also thermal systems, constantly interchanging energy with their surroundings. As a result their internal energies are fluctuating. The voltage of the capacitor in a DRAM chip will be bouncing about, for example, and the size of the bouncing will be larger the closer the capacitor's size approaches atomic scales. This will cause no problem as long as the semiconductor walls confining the charge are high enough electrically to prevent the charge escaping.

A single, simple model is used almost universally to capture these thermal aspects of a memory device. It is the one-molecule memory device. Since the thermal physics of fluctuations is robust, as far as I know this particular idealization is benign. The sorts of results that are generated for the one-molecule device will have analogs for the other devices. The latter will just be harder to see.

The device consists of a chamber that holds a single molecule. The chamber can be divided into two parts by a removable partition. When the partition is inserted into the chamber, the molecule is captured on one side. The side on which it is captured represents whether the device is storing an "L" or an "R"; or, more generically, a "0" or a "1."

removal and insertion of partition

The chamber is in thermal contact with the surroundings at temperature T and, as a result, both the chamber and its molecule are also at temperature T. The molecule will carry thermal energy of kT/2 per degree of freedom, where k is Boltzmann's constant. For the case of a monatomic molecule--a Helium atom, for example--this thermal energy is (3/2)kT.

Since k is extremely small, this thermal energy is minute. However the molecule is also minute, so that the thermal energy is sufficient to fling the molecule through the chamber. Over time, the single molecule acts like a gas that fills the chamber. Its presence on one or other side is merely a momentary thermal fluctuation. As a result, the partition is an essential component of the device. For without it, the molecule would not be confined to one or the other side of the chamber.
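To get a feel for the magnitudes, here is a small Python sketch (my own numerical aside, using standard values for Boltzmann's constant and room temperature) that evaluates (3/2)kT and the speed it corresponds to for a helium atom.

    import math

    k = 1.380649e-23     # Boltzmann's constant, J/K
    T = 300.0            # room temperature, K
    m_helium = 6.64e-27  # mass of a helium atom, kg

    thermal_energy = 1.5 * k * T                       # (3/2)kT
    speed = math.sqrt(2 * thermal_energy / m_helium)

    print(f"(3/2)kT at 300 K = {thermal_energy:.2e} J")  # ~6.2e-21 J: a minute energy
    print(f"corresponding speed ~ {speed:.0f} m/s")      # ~1370 m/s: ample to fling the
                                                         # molecule through the chamber

Minute as this energy is in absolute terms, it still gives the molecule a speed of over a kilometer per second.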

This capacity of a single molecule to fill a chamber like a gas is an extreme manifestation of thermal fluctuations. The more familiar fluctuations consist in the slight jiggling of the volume of an ordinary gas and of the piston constraining it. They are actually imperceptible at ordinary scales:

kinetic gas

As we reduce the number of molecules, the tiny jiggles become relatively larger until they dominate. Here is the case of just four molecules. Fluctuations can lead the four-molecule gas to spontaneously compress to one half its volume. That happens with a probability of (1/2)^4, which is 1/16.

four molecules
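The figure of 1/16 is easy to check by simulation. Here is a minimal Python sketch (my own illustration) that repeatedly places each of the four molecules at random in the left or right half of the chamber and counts how often all of them land on the left.

    import random

    def compression_frequency(n_molecules=4, trials=200_000, seed=0):
        """Estimate the chance that every molecule is momentarily in the left half."""
        rng = random.Random(seed)
        hits = sum(
            1 for _ in range(trials)
            if all(rng.random() < 0.5 for _ in range(n_molecules))
        )
        return hits / trials

    print(compression_frequency())                # ~0.0625 = (1/2)^4 = 1/16
    print(compression_frequency(n_molecules=10))  # ~0.001: already rare for ten molecules

The same sketch shows how quickly such fluctuations become rare as the number of molecules grows.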

The motion of the single molecule of a one-molecule gas through its chamber is just an extreme manifestation of these fluctuations in which density fluctuations are wide enough to evacuate one half of the chamber.

single molecule

When the partition is inserted and the gas has been confined to half the volume, all we have done is to "lock in" a fluctuation.

 

2.3 Thermalizing a Memory Device

To arrive at the entropy cost indicated by Landauer's principle, we need to find the entropy associated with a thermalized memory device. This is a memory device whose partition has been removed, "thermalizing" it, so that the molecule is free to move through both sides of the chamber.

That is, we start with a memory device holding some data, such as L.
Lcell

We remove the partition and give the molecule access to the entire chamber.
Lcell thermalizing

The data is now lost and the cell is thermalized.
Lcell thermalized

When we thermalize a binary memory device like this, we increase its entropy by k ln 2. This is an irreversible process that creates entropy of k ln 2. That it is irreversible is easy to see--the expansion of the gas is uncontrolled; there is no balance of forces. As a result, we have lost the possibility of gaining work that we would have had if we had allowed the gas to expand in a way that let us recover the work.

Since this quantity of entropy of k ln 2 is of central importance to the thermodynamics of computation, it is worth recounting how it is derived. The derivation proceeds by assuming that there is a reversible expansion that connects the initial and final states; and then, by tracking the heat exchanges, the figure of k ln 2 is recovered. The details are below in the notes "Computing the entropy increase in thermalizing a one-molecule memory cell."

2.4 Arriving at Landauer's Principle

Landauer's principle is commonly justified by distinguishing a memory device with known data from one with unknown or random data. Only the latter is treated as thermodynamically the same as a thermalized memory device.

Consider an ensemble of many memory devices. If the devices hold known data--in this case all reset to the left side--the ensemble looks like this.
cells known
"Known" data

If the devices hold random data, then the molecule may be found on the left or the right of the chamber. Then the ensemble might look something like this.
cells randomized
"Random" data

Finally we have an ensemble of thermalized devices all of whose partitions have been removed.
cells thermalized
Thermalized data

These last two ensembles are similar in that, in both cases, we are equally likely to find the molecule on either side of the chamber.

This resemblance is taken (mistakenly--see below) to justify the conclusion that the two ensembles are equivalent thermodynamically. This means that a memory device holding random data will have the same thermodynamic entropy as a thermalized memory device. That entropy is k ln 2 greater for each cell than for a memory device holding known data.

This difference leads us to Landauer's principle. An erasure process brings memory devices holding random data back to a state in which they hold known data. That is, during erasure, each memory device has its thermodynamic entropy reduced by k ln 2. The second law of thermodynamics tells us that entropy overall cannot decrease. Hence this reduction in entropy must be compensated by an increase in entropy of the surroundings of at least k ln 2. This is the result claimed in Landauer's principle (for a one bit erasure).
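For a sense of the magnitudes, here is a small Python sketch (my own numerical aside, taking room temperature as 300 K) that evaluates the entropy k ln 2 and the corresponding minimum heat kT ln 2 that erasure of one bit must pass to the surroundings.

    import math

    k = 1.380649e-23   # Boltzmann's constant, J/K
    T = 300.0          # room temperature, K

    entropy_per_bit = k * math.log(2)    # k ln 2, the Landauer entropy
    heat_per_bit = k * T * math.log(2)   # kT ln 2, the minimum heat to the surroundings

    print(f"entropy created per bit erased: {entropy_per_bit:.2e} J/K")  # ~9.6e-24 J/K
    print(f"minimum heat per bit at 300 K:  {heat_per_bit:.2e} J")       # ~2.9e-21 J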

There is a related way of arriving at the same result: the argument from a many-to-one mapping, or compression, of phase space. In a Boltzmannian approach to statistical mechanics, the thermodynamic entropy of a thermal system is related to the volume of phase space that it occupies according to the relation

S = k ln (phase volume)

In erasing a device with random data, its phase space is reduced by a factor of 2. An entropy reduction of k ln 2 follows. Since entropy overall cannot decrease, this reduction must be compensated by an entropy increase in the surroundings.

2.5 An Aside

My purpose here is not to dispute Landauer's principle. However, I believe that all the argumentation of the last section is flawed:

• An erasure does not compress phase space. It leaves the occupied volume of phase space the same and merely rearranges those parts that are occupied.

• While we may not know which side of a memory device with random data holds the molecule, the resulting probability distribution cannot be associated with a thermodynamic entropy. For probability distributions to issue in thermodynamic entropy, the regions of phase space over which the probability ranges must be accessible to the system point. This condition fails in the above analysis.

It is now a half century since Landauer's first suggestion of a necessary connection between erasure and thermodynamic entropy creation. We still have no cogent justification for his principle, although there has been no lack of attempts to justify it. At best we can say that the principle remains an interesting speculation; at worst, it is a seductive mistake.

2.6 An Erasure Process

What does an erasure process look like? A common example is the following two step process:

1. We remove the partition to thermalize the data.

removal

2. We insert a piston and carry out a reversible, isothermal compression of the space to restore the device to the reference state.

compression

Work kT ln 2 is supplied to the gas in the course of the compression. Since the gas energy remains the same, this work energy is passed to the surroundings as heat. Since the surroundings are at temperature T, the effect is an increase in the entropy of the surroundings of kT ln 2 / T = k ln 2, which is the minimum amount required by Landauer's principle.

3 A Threat to the Second Law of Thermodynamics?

 

The processes described here offer the possibility of a violation of the second law of thermodynamics. Start with a memory cell in the thermalized state. Insert the partition. The net effect is a reduction of entropy of k ln 2 in apparent violation of the second law of thermodynamics.

This threat to the law is an old one. It arose with the recognition in the nineteenth century that thermal processes had a molecular constitution. Maxwell had described his demon. It was a nimble being who could open and close a door in the partition in a gas-filled chamber, so that faster moving molecules are accumulated on one side and slower moving molecules on the other. The net effect is that the gas on one side of the partition becomes hotter and on the other side, colder. This separation of hot from cold without the expenditure of work violates the Second Law of Thermodynamics.

The problem became acute when Einstein established in 1905 that molecular fluctuations were a real part of experimental physics. One needed only to gaze upon pollen grains under a microscope to see them. Their random jiggling, Brownian motion, results from the accumulated impact of many water molecules.

Brownian motion

With each abrupt jiggle, some of the heat energy of the water is converted into motion. It is a conversion of heat to work forbidden by the second law.

The violation was undeniable, as Henri Poincaré reported in 1907: "...we see under our eyes now motion transformed into heat by friction, now heat changed inversely into motion, and that without loss since the movement lasts forever. This is contrary to the principle of Carnot." and, in 1905: “One can almost see Maxwell’s demon at work.”

3.1 Szilard's One-Molecule Gas Engine

A retrenchment was needed. It was decided that these violations of the second law could be confined to microscopic realms. There would be no way to accumulate them so that many microscopic violations could issue in a macroscopic violation of the Second Law. This was the result of detailed reflections, in particular, by Marian Smoluchowski. (We will return to them in a moment.) In those reflections, Smoluchowski considered a range of proposals to accumulate these small violations and urged their failure.

A later proposal in this tradition was made by Leo Szilard in a paper of 1929. Smoluchowski had not considered the one-molecule gas. Its analysis was Szilard's invention. Szilard's engine depended essentially on the decrease in entropy associated with the insertion of the partition above. His engine went through the following cycle:

1. A partition is inserted into the midpoint of a chamber holding a one molecule gas.

2. The side on which the molecule is captured is detected and a piston/weight system is installed accordingly.

3. The one molecule gas is expanded reversibly and isothermally. Heat kT ln 2 is drawn from the surroundings and supplied as work in the raising of the weight.

4. The piston is removed and the cycle repeated.

Szilard engine

The net effect of the cycle is the conversion of kT ln 2 of heat, drawn from the surroundings, to work, stored in the raised weight. Indefinite repetitions of the cycle enable an indefinite conversion of heat to work, accumulating a macroscopic violation of the Second Law.

Here's an animated version of the cycle:
Szilard engine
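Here is a small bookkeeping sketch in Python (my own, under the idealized assumption that every cycle converts exactly kT ln 2 of heat into work) that tallies the work the engine would accumulate over many repetitions of the cycle.

    import math

    k = 1.380649e-23   # Boltzmann's constant, J/K
    T = 300.0          # temperature of the surroundings, K

    def szilard_work(cycles):
        """Total work extracted if each cycle converts kT ln 2 of heat to work."""
        return cycles * k * T * math.log(2)

    for n in (1, 10**9, 10**20):
        print(f"{n:>22} cycles -> {szilard_work(n):.2e} J")
    # One cycle yields ~3e-21 J; only after something like 10^20 cycles does the
    # converted heat approach a macroscopic ~0.3 J.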

3.2 Exorcisms

 

The near-universal reaction to this one-molecule engine was that it must fail. There must be, it was presumed, some hidden dissipation in the operation of the device that compensates for the entropy reduction.

An early consensus was expressed by von Neumann, Brillouin and, apparently, Szilard. To function, the demon had to detect which side of the partition had trapped the molecule. Working backwards, they concluded that this detection would provide one bit of information and that there had to be an ineliminable entropy cost of at least k ln 2 associated with acquiring it.

This proposal has been called "Szilard's principle" by John Earman and me.

A central tenet of the new Landauer-Bennett orthodoxy is that this principle is incorrect. For if it were correct, then there would be a new source of entropy generation in computing. Every time a memory device is read in a computation, a detect operation is carried out. If Szilard's principle were true, each of these operations would necessarily create k ln 2 of entropy, representing a source of dissipation other than erasure.

This new orthodoxy provides an alternative account of the supposed failure of the engine. The demon operating it must make a record of which side is found to hold the molecule. To complete the cycle and restore the system to its initial state, the demon must erase this one bit memory. This erasure process creates the k ln 2 of entropy needed to balance the entropy books and protect the second law.

4 The Standard Inventory of Admissible Processes

The thermodynamics of computation depends on the supposition that one can assemble a composite process by combining processes in some standard inventory.

I know of no (other) place in which this inventory is listed. However in my own work elsewhere, I have reviewed papers in the thermodynamics of computation and assembled the following inventory.

Perform reversible, isothermal expansions and compressions of a one-molecule gas.
expansion contraction

Insert and remove a partition in a chamber holding a one molecule gas without dissipation.
insert remove partition

Detect the location of the molecule of a one-molecule gas without dissipation.

L cell or R cell?

 

Shift between equal entropy states without dissipation.
shift

Trigger new processes according to the location detected without dissipation.

My claim is that this inventory of processes neglects relevant thermal processes, most notably thermal fluctuations. I will show below that several of these processes are disrupted by fluctuations and cannot be brought to completion.

The mode of failure is quite robust and will be captured in a result that applies to all reversible processes at these scales. Reversibility means that, at each moment of the infinitely slow process, its systems are in equilibrium so that all forces are perfectly balanced. Thermal fluctuations will be superimposed onto these equilibria for all thermal systems. For macroscopic systems, the size of the fluctuations is sufficiently small that they can be neglected. However the processes considered in the thermodynamics of computation seek to manipulate systems at the molecular level. Then, it turns out, fluctuations are significant enough to disrupt the equilibrium to such a degree that the processes cannot proceed in any definite direction.

In supposing that the processes of the inventory can be completed non-dissipatively, the orthodoxy neglects the very processes that are essential to the recovery of Landauer's principle. A memory device with known data increases in entropy by k ln 2 when thermalized only because thermal fluctuations fling the molecule throughout the larger volume of the chamber now accessible.

If these processes of the inventory are to be effected, we shall see that further dissipative processes are required. Their need then contradicts the central tenet of the Landauer-Bennett orthodoxy, that erasure is the only necessarily dissipative process in computation.

5 What Fluctuations do to Reversible Processes

5.1 Smoluchowski Trapdoor

That fluctuations are the key to understanding all these processes is not a new idea. When the molecular threat to the second law of thermodynamics was first addressed, they took center stage. The basic idea of Smoluchowski's analysis of the early 1910s was this: the functioning of any machine that sought to accumulate microscopic violations of the second law would itself be fatally disrupted by fluctuations.

The most celebrated example of many was the Smoluchowski trapdoor. It was a design for a mechanical Maxwell's demon. In place of the demon watching molecules approaching a door, the design employs a lightly spring-loaded door. Fast moving molecules approaching from one side can swing the door open, so that they pass through to the other. The trapdoor does not allow molecules to pass in the other direction.

Its intended operation would then lead to accumulation of more, faster moving molecules on one side, as shown here:

Trapdoor ideal

The design fails. The trapdoor itself must be very light and very lightly spring-loaded if a collision with a single molecule is to open it. However the trapdoor is itself a thermal system, carrying energy kT/2 for each degree of freedom. The result is that the trapdoor is flapping about wildly, bouncing off the partition and not remaining closed. It provides no differential obstacle to molecules. They can pass through the trapdoor with equal facility in either direction:

Trapdoor real

 

5.2 Reversible, Isothermal Expansion and Compression of a One-Molecule Gas

Analogously, fluctuations will disrupt these expansion and compression processes. To see how, we need a simple implementation of the process. The cylinder in which the one-molecule gas will expand is oriented vertically so that the work done by the expanding gas is captured directly in the raising of the mass of the piston.

For the expansion to be reversible, the weight of the piston must balance exactly the upward pressure of the one molecule gas. Thus, as the expansion proceeds, the weight must be constantly adjusted.

The process we intend will consist of the piston, in perfect equilibrium with the gas pressure, rising very slowly as work energy is drawn from the gas. Since the internal energy of the gas will remain the same, that withdrawn work energy is in turn replenished as heat conveyed to the one-molecule gas from the surroundings.

 

expansion idea

We can estimate the condition of perfect equilibrium as follows. The gas exerts a mean pressure P = kT/V on the piston. If the piston has area A and is at height h, then we have V = Ah, and the force F exerted by the gas on the piston is

F = P·A = (kT/V)·A = kT/h

The weight of the piston is Mg, if its mass is M. Setting the force and weight equal, we find that the equilibrium height for the mass M is

h_eq = kT/Mg

What this description ignores is that the piston must be very light if impacts with a single molecule are to be sufficient to raise it. And it is also a thermal system that will undergo fluctuations in its own right. Since its mass will be comparable to that of the single molecule, we can anticipate that these fluctuations will cause the piston to bounce throughout the chamber. For it is these same fluctuations that lead the single molecule to bounce through the chamber, simulating a chamber-filling gas.

piston fluctuating

An easy computation affirms that this will happen. Take the case shown above of the piston by itself, that is, without a one-molecule gas trapped beneath it. The piston will have a thermal energy distributed probabilistically according to the Boltzmann distribution. That is, the probability density p(h) that the piston is at height h is given by:

p(h) = (Mg/kT) exp(-Mgh/kT)

The mean of this distribution is kT/Mg, which is equal to h_eq computed above. That means that the piston will on average already be around this equilibrium position of the expansion without any specific interaction with the gas, merely because the piston is a thermal system whose fluctuations fling it through the chamber.

More telling is the standard deviation of the above distribution. It is also kT/Mg. Since this magnitude gives the scale of the piston's fluctuations, we now see that the piston will be bouncing wildly through the entire extent of the cylinder merely because it is a thermal system. The intended expansion becomes a process without a discernible start or finish, as shown here:

gas expansion real
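That the fluctuations are as large as the equilibrium height itself is easy to check by sampling. The following Python sketch (my own check, in units where kT/Mg = 1) draws piston heights from the distribution p(h) = (Mg/kT) exp(-Mgh/kT) and confirms that the mean and the standard deviation are both approximately kT/Mg.

    import random
    import statistics

    def sample_piston_heights(n=200_000, h_scale=1.0, seed=1):
        """Draw heights from p(h) proportional to exp(-h/h_scale), h_scale = kT/Mg."""
        rng = random.Random(seed)
        return [rng.expovariate(1.0 / h_scale) for _ in range(n)]

    heights = sample_piston_heights()
    print("mean height:       ", statistics.mean(heights))    # ~1.0, i.e. kT/Mg
    print("standard deviation:", statistics.pstdev(heights))  # ~1.0, the same scale

Since the spread of the jiggling equals the height of the intended expansion itself, the piston has no well-defined resting position at any stage.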

The outcome is that a reversible isothermal expansion, or its inverse, a reversible isothermal compression of a one-molecule gas, is impossible. They are delicate processes in which all the forces must balance. Yet exactly this fact makes them fall victim to thermal fluctuations.

How are we to perform a compression or expansion? We must apply a force strong enough to overcome the fluctuations in the motion of the piston and use that force to drive the piston from its expanded to compressed position.

But to do that is to use forces that are out of equilibrium. The process is no longer reversible, but irreversible and hence an entropy creating dissipation.

5.3 Detection

It is essential to the Landauer-Bennett orthodoxy that we can detect the location of a single molecule in a memory device without dissipation.

5.3.1 The Tipping Keel

Bennett has provided several schemes that purport to do this. In one, a keel-shaped device is slowly lowered onto the memory device. It is fitted with two pistons, each moving through one side of the memory device. Only one will be resisted by the molecule, and this will tip the keel. The location of the molecule is then read from the direction in which the keel tipped. A key-shaped device is coupled with the keel: as the keel tips, it turns a toothed wheel that raises and lowers the key. The final reading of the detection process is given by the location of the key on the locking pin.

Here are the stages of this process:

detection stages

 

The difficulty with this scheme is essentially the same one that disrupted the expansion above. If the keel is to be tipped by the pressure exerted by a single molecule, then the keel must be very light. The keel is also a thermal system with its own thermal energy of kT/2 per degree of freedom. Just as the piston in the above compression bounced about because of its thermal energy, so the keel will rock wildly. This wild rocking will persist so that the keel never settles down into a configuration in which it correctly indicates the memory device state.

5.3.2 The Ferromagnet

Other detection schemes described by Bennett fail in the same way. An earlier proposal involves the reading of a ferromagnetic core memory device. The data stored by the magnetic core is held in the direction of its magnetization. That is, its own magnetic field is directed "up" or "down" according to the data bit recorded.

Bennett proposes that the state of the device can be read by a second detector memory core that is brought from a bistable state into a "soft mode" by passing it through a magnetic field. In the soft mode, the detector's magnetization can be altered easily. It is then slowly coupled with the memory device, where it aligns its own field contrary to that of the memory device. After this coupling operation, the detector is returned to a bistable state.

Here is an animation of the process:

ferromagnet

Here is the figure from which the animation was developed.

Fluctuations will once again disrupt this scheme. Since the detection must be carried out in a reversible manner, the forces coupling and decoupling the detector with the target memory device must be in perfect equilibrium with those resisting. If the memory device and detector are small and of molecular scales, then thermodynamic fluctuations are superimposed upon the equilibrium state and will drown out any reliable detection.

For example, consider the process through which the detector ferromagnet is coupled with the data ferromagnet. When the detector is in its soft mode, its state can move freely between the "0" and "1" sub-states and thermal fluctuations will lead it to bounce between them. Some other device--let's call it a driver--will bring the detector to couple with the data ferromagnet. The result will be that the detector states will now be restricted to one of "0" or "1" only.

This coupling process is a compression of the detector state space. It is compressed from a state having access to the full range of "0" and "1" sub-states to a state from which only one of them is accessible. Here is how we might expect the process to go:

measurement ideal

However it will not proceed this smoothly. Thermodynamically, the coupling has the same behavior as the compression of a one-molecule gas. The detector is the analog of the gas. The driver is the analog of the compressing piston. This coupling will be disrupted by fluctuations in the same way as the gas compression. The compressive force exerted by the driver will be balanced exactly by the detector's resistance to compression. Once thermal fluctuations are superimposed onto this equilibrium, we have a process that fluctuates about so wildly as to have no definite beginning or end:

measurement real

That fluctuations will have this effect could be shown by a computation dealing with the specifics of the arrangement just described. That turns out to be unnecessary. In so far as the processes of detection just described are to be carried out as thermodynamically reversible processes, they are governed by the "no go" result of Section 6 below, which shows that the processes are fatally disrupted by fluctuations.

5.3.3 What it takes

How can detection be carried out? What we need is a process that can override fluctuations. Such processes will be dissipative.

The older literature offered many proposals. The best known is the Brillouin torch. In the context of exorcising Maxwell's Demon, Brillouin proposed that we locate a molecule in a kinetic gas by shining a light that will reflect off it, revealing its position.

Maxwell demon with torch

He argued, however, that this detection operation is thermodynamically dissipative. One needs sufficiently energetic light so that the resulting signal remains detectable above the background thermal radiation. That condition, Brillouin argued, required a dissipation of at least k ln 2 for each bit of information secured.

5.4 Other Woes

The other processes of the inventory are also troubled. Take, for example, the simple idea that we can insert a partition into the chamber without incurring a thermodynamic cost. A small amount of analysis opens numerous problems.

If the partition is very light, then it will be subject to thermal fluctuations. If we try to drive it into place in a reversible process with perfectly balanced forces, we can expect thermal fluctuations to disrupt our efforts as before.

So let us imagine that it is very massive. Our presumption is that we are dealing with frictionless systems (i.e. conservative Hamiltonians). For otherwise their operation would be accompanied by just the dissipative conversion of work to heat that we are seeking to avoid. That means that a massive partition slowly sliding into place would not cease moving when it strikes the chamber walls. It would simply bounce off.

partition bouncing

Some further machinery is needed to halt its motion. It cannot become wedged by friction into a groove, for example. That would mean that it is held by disallowed frictional forces. Nor can we tie it down with ropes or screw it in place, for knots in ropes and screws both depend essentially on friction to hold. (Frictionless shoelaces would spontaneously untie!)

We might consider a mechanical device that operates without friction. For example, a spring-loaded pin might press against the partition wall. When the partition insertion is completed, the pin would align with a hole in the partition and slide into place, locking the partition in place.

We might anticipate its operation to proceed as follows:

sprung catch

It will not work that way. We have simply replicated all the problems of the sliding partition in the sliding pin. If the pin is light, it will have thermal energy sufficiently great to lead to fluctuations that are in turn sufficient to have the pin bounce out and release the partition. If the pin is massive, since no friction will restrain it, it will simply reverse its motion and bounce back.

real catch

6 A No Go Result

So far, we've seen a few examples of how fluctuations defeat efforts to conduct the routine business of computation non-dissipatively at microscopic scales. The obvious hunch is that this will always be the case. There seems to be nothing special about the examples we've seen.

That hunch turns out to be right. All the cases looked at so far turn out to be instances of a quite general result. Indeed the result is quite startling for its generality.

6.1 The Result

The key property of the processes covered by the no go result is that they are non-dissipative. Recalling the key ideas once again:

To be non-dissipative, that is, to be a process that creates no new entropy in the universe, the process must be in perfect equilibrium at all of its stages. That means that the forces that drive the process are perfectly balanced.

Processes with this property are called "reversible," a notion introduced by Sadi Carnot in the 19th century at the birth of thermodynamics. They have the property of being able to proceed with equal facility in both directions.

A reversible expansion, for example, is one described earlier in which the weight of the piston perfectly balances the expansive pressure of the gas. If the weight were taken off the piston so the pressure force would overpower it, then we would have a sudden expansion of the gas; it would be an "irreversible process."

The no go result applies to all processes of this type, if implemented at the microscopic level. That means that they are implemented in a way designed to manipulate the properties of individual molecules. Such processes are intended to proceed very slowly through a series of stages indexed by some parameter, λ:

λ=1, λ=2, λ=3, ...

The result is that all of these stages are equiprobable:

p(λ) = constant

where p is the probability density of stages over λ. What this means is that, if we inspect the process at any moment, we are as likely to find it at any one stage as at any other.
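Here is a minimal Python sketch (my own illustration, with arbitrary stage labels and kT set to 1) showing how the result follows from the Boltzmann weights: a constant free energy gives every stage the same probability, while a free energy gradient is what it takes to favor later stages.

    import math

    def stage_probabilities(free_energies, kT=1.0):
        """Normalize the Boltzmann weights exp(-F/kT) over the stages."""
        weights = [math.exp(-F / kT) for F in free_energies]
        total = sum(weights)
        return [w / total for w in weights]

    # Reversible process: free energy constant across stages -> uniform probability.
    print(stage_probabilities([2.0, 2.0, 2.0, 2.0]))  # [0.25, 0.25, 0.25, 0.25]

    # Out-of-equilibrium process: falling free energy favors the later stages.
    print(stage_probabilities([2.0, 1.5, 1.0, 0.5]))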

That defeats all efforts to implement the infinitely slow reversible transformation. Imagine for example that we set up the system in some initial state, λ = λ1, intending that over a long time it will slowly evolve into a final state, λ = λ2. That means we are expecting something like this:

λ1 tortoise λ2

That is not what we will get. If, moments later, we inspect the system, we are as likely to find the system already in any of the stages of the process, including the final stage. The process will have no definite start, middle or end. Fluctuations will be hurling the process over all the intermediate stages, indiscriminately. This is what we will have:

λ1 tortoise λ2

The origin of the no go result lies in the fact that reversible processes are, at all times, in perfect equilibrium. That is a delicate balance of forces. Thermal systems fluctuate slightly all the time because of their molecular constitution. If we consider macroscopic processes, these fluctuations are minor and rarely need our attention. If, however, the processes are manipulating components at a molecular level, the situation is reversed. The delicate balance needed for equilibrium is overwhelmed by fluctuations.

6.2 Picturing the No Go result in phase space

It is helpful to attach a geometric picture to this no go result. All the processes unfold in a phase space whose individual points represent all possible microscopic states of the system. The evolution of the processes themselves is represented by a trajectory through this phase space. Finally, the different stages of the processes are represented by subvolumes of the phase space.

Perhaps from an analogy with macroscopic processes, what we expect is for the evolution to proceed peacefully, with the system remaining for a while in the first stage λ=1; then moving on to the next, λ=2, where it lingers; then moving on to λ=3; and so on.

ideal phase space

However this is not what will happen. The expectation presumes that the states of the various stages are dynamically remote from one another, so that the trajectory cannot easily pass between them. That presumption is incorrect. The various stages are dynamically very close and the system can as easily pass to a different stage as remain in the present stage. This figure is closer to what will happen:

real phase space

This figure also displays the important fact that the volumes of phase space corresponding to each stage overlap. That means that there is considerable arbitrariness in which stage of the process we take to be actualized when the system is in a state that belongs to many stages.

We can see how this overlapping comes about if we look at the phase space of a gas-piston system. The full phase space of the system will have many dimensions. There will be coordinates for the positions and momenta of both gas and piston. The overlapping is seen, however, if we look at a slice through the phase space that shows just the heights of the molecule and the piston. For the stage whose gas-piston boundary is at height h = H, the gas molecule occupies heights from 0 to H and the piston occupies heights from H to infinity. These are plotted on the figure below and correspond to the infinitely extending rectangles shown.

gas piston phase space

We can see from the figure that a single microscopic state of the gas-piston can belong to many stages. For example, if the gas height is 0.25H and the piston height is 1.5H, it will belong to all four of the stages shown.
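The overlap can be read straight off the definition of the stages. The short Python sketch below (my own illustration; the four boundary heights are hypothetical values chosen only to mimic the figure) lists the stages containing a given microstate: a stage with boundary h contains any state whose gas height lies below h and whose piston height lies above h.

    def stages_containing(gas_height, piston_height, boundaries):
        """Stages (labelled by boundary height h) that contain this microstate."""
        return [h for h in boundaries if gas_height <= h <= piston_height]

    H = 1.0  # an arbitrary reference height
    stage_boundaries = [0.5 * H, 0.75 * H, 1.0 * H, 1.25 * H]  # hypothetical stages

    # The microstate from the text: gas at 0.25H, piston at 1.5H.
    print(stages_containing(0.25 * H, 1.5 * H, stage_boundaries))
    # -> all four stages contain this single microstate.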

6.3 Where It Comes From

We need two main ingredients to arrive at this no-go result.

6.3.1 How to Compute Fluctuation Probabilities

The first is a means of computing the probabilities of the various stages. Here we recall the result that Einstein made famous under the name of "Boltzmann's Principle." It asserts that the entropy S of some isolated system is related to its probability W by

S = k ln W

For example, imagine an ideal gas consisting of just four molecules. These four molecules will bounce around in their vessel and, by chance, may all happen to collect on one side. We can compute the relative probabilities of this half-filled state and the fully spread state by writing

S_half = k ln W_half
S_full = k ln W_full

Subtracting we find

S_half - S_full = k ln (W_half/W_full)

It is an elementary result of the kinetic theory of gases that this entropy difference is just

ΔS = S_half - S_full = -4k ln 2 = k ln (1/2^4)

Combining the last two formulae, we recover

W_half/W_full = 1/2^4

The "full" state really corresponds to the case of the gas molecules enclosed in any manner in the full vessel. That means that we have Wfull = 1. Hence

W_half = 1/2^4

four molecules

And that is exactly what we expected. For there is a probability 1/2 that each molecule is found on one particular side. Since they move independently, the probability that they are all there is 1/2^4 = 1/16.
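The arithmetic can be run in the other direction as a check. This small Python sketch (my own) inserts the kinetic-theory entropy difference -4k ln 2 into Boltzmann's Principle and recovers the probability ratio 1/2^4.

    import math

    k = 1.380649e-23  # Boltzmann's constant, J/K
    n_molecules = 4

    delta_S = -n_molecules * k * math.log(2)   # S_half - S_full from kinetic theory
    W_ratio = math.exp(delta_S / k)            # Boltzmann: W_half/W_full = exp(ΔS/k)

    print(W_ratio)   # 0.0625 = 1/2^4 = 1/16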

This gives an idea of how thermodynamic quantities and probabilities can be connected. Unfortunately, Einstein's S = k ln W cannot be applied to the case at hand. For Einstein's result applies to isolated systems. The systems that concern us are in thermal equilibrium with their surroundings. They are anything but isolated.

However it turns out that a slight adjustment of Einstein's result applies for this case. We must assume that our system is in thermal equilibrium with the environment, that it can freely exchange heat with it, but that it exchanges no work. Then Einstein's result can be replaced by

S - E/T = k ln W

This can be rewritten in terms of another quantity, the free energy F = E-TS:

F = -kT ln W

Inverting this formula gives us the expression we need for probabilities

W is proportional to exp(-F/kT)

This enables us to write the ratio of probability densities for two stages of λ as

p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)

where for compactness of notation we write F2-F1 = ΔF.

6.3.2 Representing Equilibrium

The second ingredient is the condition of equilibrium at every stage λ of the process. For a system in thermal equilibrium with its surroundings, this condition is simply the condition that the free energy F remains the same as we proceed through the stages.

F(λ) = constant

Here is how to see that this constancy of F expresses equilibrium. A closed system will spontaneously move to states of higher entropy. So if a closed system is at equilibrium, it must have the same entropy as all accessible nearby states. If a process Δ connects the system to these nearby states, then we must have ΔS = 0.

Our situation is different, but only slightly so. The entropy of our system may not change due to spontaneous, irreversible processes. But it may change since a reversible heat transfer with the surroundings may add or subtract entropy. This entropy change will be given by the usual formula of heat/T, where the heat gained or lost is just the change in internal energy E. That is, we have

ΔS = ΔE/T    or    ΔS - ΔE/T = 0    or   TΔS - ΔE = 0    or   Δ(E-TS) = 0    or   ΔF=0

Since the process Δ connects the system with neighboring states indexed by λ, another way to write this is as dF(λ)/dλ = 0, which is the same as

F(λ) = constant.

To see that this condition expresses equilibrium conceived as a balance of forces, go to Notes, Expressing Equilibrium.

6.3.3 Assembling...

We now combine these two results to recover the no go result.

We have for any isothermal reversible process that passes from λ1 to λ2 that

p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)

But we also have from the condition for equilibrium that ΔF = F2-F1 = 0. Therefore p(λ2)/p(λ1) = exp(0) = 1 and

p(λ1) = p(λ2)

which is the no go result.

7 Overcoming the Fluctuations: What it Takes

7.1 On the Micro-scale

The no go result tells us that we cannot employ isothermal, reversible processes in our computing machines if they are to function at the microscopic level. What does it take to overcome these fluctuations so that processes at the microscopic level are possible? The fluctuation formula tells us:

p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)

If we want a process to proceed from λ1 to λ2, we need to create a probability density gradient that favors the process advancing. The formula tells us that we can do it by creating a negative free energy change, ΔF < 0. That means that we will have a system out of equilibrium. If, for example, we want to expand a gas, the mean force exerted by the piston must be less than that exerted by the gas, so that the gas will most likely expand spontaneously.

For example, if we want a probability density ratio of at least 20, we have

p(λ2)/p(λ1) > 20 ≈ exp(3)

and we need to set ΔF/kT < -3. Recalling that ΔF = ΔE - TΔS, that amounts to

ΔS - ΔE/T > 3k

To interpret this relation, note that ΔE is the energy change of the system. Therefore -ΔE is the energy change of the surroundings, which must arise through a gain of heat by the surroundings of -ΔE. If that gain were through a reversible process, then the entropy change in the surroundings would be heat/T = -ΔE/T. However, since we are no longer restricting ourselves to reversible processes, the entropy change in the surroundings will be -ΔE/T or greater. Combining, this means that the last inequality says

(Entropy increase in system and surroundings) > 3k

This entropy creation of 3k greatly exceeds the k ln 2 = 0.69k of entropy tracked by Landauer's Principle. Thus, if a computing device is functioning at all, it must be creating quantities of entropy that exceed this Landauer limit with each of its steps.

A probability density ratio of 20 is not high. Loosely, it means that one time in twenty the process will not go ahead. That is a high failure rate for a system with many steps. It can only be reduced, however, by including dissipative processes that create still more entropy.

Even though it is not high, implementing a process with ΔF < -3kT requires that we do quite some violence to our system. Imagine, for example, that we want to ensure that a one-molecule gas expands to double its volume. Merely upsetting equilibrium by reducing the restraining force on the piston will not suffice. If we remove the piston entirely so the gas expands into a vacuum, the free energy change is still only -kT ln 2 = -0.69 kT. To drive it down to -3kT would require further machinery that would accelerate the expansion, reinforcing the gas's tendency to expand into a vacuum.
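The bookkeeping of this section can be wrapped into a small Python helper (my own sketch): given the probability density ratio we demand for the process to advance, it returns the minimum entropy that must be created, which can then be compared with the Landauer figure of k ln 2.

    import math

    def min_entropy_created(ratio):
        """Minimum entropy created (in units of k) so that p(λ2)/p(λ1) >= ratio.

        From p(λ2)/p(λ1) = exp(-ΔF/kT) we need ΔF <= -kT ln(ratio); the entropy
        created in system plus surroundings is then at least k ln(ratio).
        """
        return math.log(ratio)

    cost = min_entropy_created(20)   # the modest ratio of the example above
    print(f"entropy created >= {cost:.2f} k")               # ~3.00 k
    print(f"which is {cost / math.log(2):.1f} x (k ln 2)")  # ~4.3 times the Landauer amount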

7.2 Macroscopic versus Microscopic

At first, this no go result seems too strong to be correct. It says--correctly-- that isothermal, reversible processes are impossible among systems with molecular constitution. That impossibility, one might think, would be hard to reconcile with their central place in thermodynamic theory. The origin of thermodynamics lay in the analysis of the efficiency of steam engines. The major discovery is that, as a quite general matter, a steam engine is made more efficient by bringing its processes closer to reversible processes. How are we to make sense of that if fluctuations disrupt reversible processes?

The key fact is that the disruptions only interfere at molecular scales. They will cause trouble if we try to build a steam engine with molecular-sized components. But they will not do so as long as our machines are macroscopically sized.

To see this, let us go back to the fluctuation formula

p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)

We can make the process overwhelmingly likely to advance if we have a free energy imbalance of ΔF/kT < -25. For then

p(λ2)/p(λ1) > 7.2 x 10^10 ≈ exp(25)

However the free energy imbalance of 25kT involves a minute quantity of energy. It is merely the mean thermal energy of ten oxygen molecules. Steam engines are systems that employ over 10^25 molecules. That is over 10,000,000,000,000,000,000,000,000. The energy of just ten molecules will be invisible on all scales that matter to steam engines!
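Both figures check out numerically. Here is a small Python sketch (my own; it takes the mean thermal energy of an oxygen molecule to be (5/2)kT, counting three translational and two rotational degrees of freedom at kT/2 each).

    import math

    k = 1.380649e-23   # Boltzmann's constant, J/K
    T = 300.0          # K

    print(f"exp(25) = {math.exp(25):.1e}")                # ~7.2e10
    print(f"25 kT = {25 * k * T:.1e} J")                  # ~1.0e-19 J
    print(f"ten O2 molecules: {10 * 2.5 * k * T:.1e} J")  # the same ~1.0e-19 J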

Notes

Computing the entropy increase in thermalizing a one-molecule memory cell

We compute the entropy difference between two states by finding a reversible process that connects the two states. The two states are:

A cell holding data
Lcell
and a thermalized data cell
Lcell thermalized

A reversible isothermal expansion connects the state of a molecule confined to one half of the chamber and the molecule free to move through the whole chamber.

one molecule expansion

To carry out this expansion, we introduce a piston at the midpoint of the chamber. The molecule repeatedly collides with the piston, exerting a mean pressure P on it that is given by the familiar ideal gas law

P = kT/V

where V is the volume of the chamber holding the molecule. The work done by the gas in the course of the expansion is given by the integral

Work = ∫ P dV = ∫ (kT/V) dV = kT ln(V_fin/V_init) = kT ln 2

where the integration varies from V = V_init to V = V_fin = 2 V_init, since the expansion is a doubling in volume. This work is captured by the raising of a weight.

A characteristic property of an ideal gas of one or more molecules is that its energy is independent of its volume. Hence the work energy drawn from the gas must be replaced by heat drawn from the surroundings. That is, in the process, heat

Q_rev = kT ln 2

is drawn into the gas. The subscript "rev" emphasizes that this is a reversible heating.

Finally, the Clausius definition of entropy change ΔS for a reversible process is

ΔS = ∫ dQ_rev/T = Q_rev/T = k ln 2

That is, thermalizing a memory device increases its entropy by k ln 2.
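The integral is easy to confirm numerically. Here is a short Python sketch (my own check, in units where kT = 1) that integrates the one-molecule pressure kT/V over the doubling of the volume and recovers work kT ln 2, and hence the entropy increase k ln 2.

    import math

    def isothermal_work(V_init, V_fin, kT=1.0, steps=100_000):
        """Integrate P dV with P = kT/V by the midpoint rule (work in units of kT)."""
        dV = (V_fin - V_init) / steps
        return sum(kT / (V_init + (i + 0.5) * dV) * dV for i in range(steps))

    work = isothermal_work(1.0, 2.0)
    print(work, math.log(2))   # both ~0.6931: Work = kT ln 2, so ΔS = Q_rev/T = k ln 2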

Why does free energy measure fluctuation probabilities?

Why this formula?

S - E/T = k ln W

A standard derivation is in "Waiting for Landauer," Section 7.4. However the following plausibility argument helps by showing how the formula deals with two signal cases.

First, what is new is that the system can exchange heat with its surroundings (but not, by assumption, work). There is one special case of this exchange. That is when the system gains or loses a small amount of heat in a reversible process. Since there is no exchange of work with the surroundings, the gain in heat is just equal to the change in internal energy, ΔE. Hence the entropy changes from S to S + ΔE/T. Since the process is reversible, it is at every stage in equilibrium. That means that the process has no tendency to advance in either the forward or reverse direction. Hence, the probability of the two states must be the same. Otherwise the process would tend towards the higher probability state. From this we infer that the probability attached to the states with entropies S and S + ΔE/T must be the same.

The formula S - E/T = k ln W has this property. For if the change Δ takes the system from a state with probability W1 to one with probability W2, it tells us that

k ln (W2/W1) = Δ(S - E/T) = ΔS - ΔE/T = 0

using ΔS = ΔE/T. Since ln 1 = 0, it follows that W1 = W2, as required.

Second, this new formula remains consistent with Einstein's original formula. To see this, consider an isolated system composed of two (sub)systems A and B in thermal equilibrium that can exchange heat with each other but no work. The probability changes for each system through some process Δ are

k ln (W_A,2/W_A,1) = ΔS_A - ΔE_A/T
k ln (W_B,2/W_B,1) = ΔS_B - ΔE_B/T

Adding and noting that ΔE_A = -ΔE_B, we have

k ln (W_A,2 W_B,2 / W_A,1 W_B,1) = ΔS_A + ΔS_B

Since the probabilities for the total system obey W_1 = W_A,1 W_B,1 and W_2 = W_A,2 W_B,2, and since ΔS = ΔS_A + ΔS_B, we have

k ln (W_2/W_1) = ΔS

which is the result of Einstein's formula.

Expressing Equilibrium

There is another way to get the result that equilibrium is expressed by

F(λ) = constant.

It looks rather different from the analysis in the main text, but actually expresses the same relations. Here we see that this equilibrium condition directly expresses the idea that all forces balance perfectly in the equilibrium state.

The quantity

X(λ) = -dF(λ)/dλ

is the generalized force exerted by a system with free energy F(λ) as it passes through the stages of the process. In the familiar case of an ideal gas of n molecules with volume V, the free energy is

F = -nkT ln V + constant(T)

where the constant is different for different T. If the process is an isothermal expansion or contraction with parameter V, then the generalized force is just

X(V) = -(d/dV) (-nkT ln V) = nkT/V

where T is held constant in the differentiation. That is just the ordinary (mean) pressure exerted by an ideal gas. The quantity X(λ) generalizes this for other processes.

If the system consists of parts, such as a gas and a piston, then the condition of equilibrium is just that the generalized forces exerted by each part should balance precisely. If the system has parts A and B, that means that

X_A(λ) + X_B(λ) = 0

Since X_A(λ) = -dF_A(λ)/dλ and X_B(λ) = -dF_B(λ)/dλ, this amounts to requiring

0 = dF_A(λ)/dλ + dF_B(λ)/dλ = (d/dλ) (F_A(λ) + F_B(λ)) = dF(λ)/dλ

which is the same result as before.
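The same condition can be checked numerically by differentiating the free energy. This Python sketch (my own check, in units where nkT = 1) recovers the generalized force X(V) = nkT/V from F(V) = -nkT ln V by a finite difference.

    import math

    def gas_free_energy(V, nkT=1.0):
        """Isothermal free energy of an ideal gas, F = -nkT ln V (constant dropped)."""
        return -nkT * math.log(V)

    def generalized_force(F, lam, d=1e-6):
        """X(λ) = -dF/dλ, estimated by a central finite difference."""
        return -(F(lam + d) - F(lam - d)) / (2 * d)

    V = 2.0
    print(generalized_force(gas_free_energy, V))  # ~0.5 = nkT/V, the mean gas pressure
    # Equilibrium of a composite system is the condition that these generalized
    # forces, computed for each part, sum to zero.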

Copyright John D. Norton. January 22, 2010.