John D. Norton
Center for Philosophy of Science
Department of History and Philosophy of Science
University of Pittsburgh
Revised and improved edition, July 2013.
For historians here is an archived copy of the old version from 2010.
Experts who want to see the no-go result described in 650 words should go to "No Go Result for the Thermodynamics of Computation."
|For a more detailed development of the no-go result see:
"All Shook Up: Fluctuations, Maxwell's Demon and the Thermodynamics of Computation." Part II.
For more papers written in this area, see the collection Maxwell's Demon, Landauer's Principle and the Thermodynamics of Computation.
|The thermodynamics of computation seeks to identify the principled thermodynamic limits to computation. It imagines computations carried out on systems so small that their components are of molecular sizes. The founding tenet of the analysis is that all the processes excepting one can in principle be carried out in a non-dissipative manner, that is, in a manner in which no thermodynamic entropy is created or passed to the surroundings. The sole class of necessarily dissipative processes is identified by the central dogma as those physical processes that implement a logical, many-to-one mapping. The universal example of such a process is erasure. It takes a memory device that may be in many different states and maps it to one, a single reset state.
Elsewhere I have argued that the thermodynamics of computation is gravely troubled. Its leading principle, Landauer's Principle, connects erasure to entropy dissipation. Yet there is still no sound justification for it. The proofs that have been given rest on fallacies or misapplications of statistical and thermal physics.
The purpose of this site is to describe a different problem in the theory that I hope will be of more general interest in philosophy of science. It gives a concrete example of what can go wrong if one uses idealizations improperly.
An ineliminable assumption of the thermodynamics of computation is that one can employ a repertoire of non-dissipative processes at molecular scales. These processes are thermodynamically reversible and, if carried out, would involve no net creation of entropy.
All thermal processes at molecular scales are troubled by fluctuations. They arise because thermal processes are the average of many microscopic molecular processes and the averaging is never quite perfect. Most famous of these is the Brownian motion that leads pollen grains to dance about under a microscope. The dancing arises because the collisions with the pollen grain of water molecules from all sides do not quite average out to leave no effect.
My contention here is that the reversible processes presumed in the thermodynamics of computation are fatally disrupted by fluctuations that the theory selectively ignores. That is especially awkward since the one occasion on which processes connected to fluctuations are not ignored is when the theory treats the thermodynamics of erasure.
As a result, the founding tenet of the thermodynamics of computation--that erasure-like processes only are necessarily dissipative--arises entirely because fluctuations have been idealized away selectively for other processes. In short the basic conception of the theory itself derives from inconsistently implemented idealizations.
The account below is written at a more general level for readers that I presume have only a slight contact with the thermodynamics of computation. Little thermodynamics is assumed. But a reader would be well placed just knowing the difference between work and heat and having some sense of how entropy increase can connect to dissipation.
Idealizations are a routine part of science. They are known falsehoods, sometimes introduced for theoretical convenience and sometimes out of necessity. The latter arises often in physics, since no theory gives the complete truth. Every theory is an idealization to some degree.
When used wisely, idealizations enable good physics. Galileo was able to develop his theory of projectile motion precisely because he idealized away air resistance. It is the classic case of successful idealization. Their imprudent use, however, can lead to great mischief. Imagine that one tries to give an account of airplane flight that neglects the effect of air passing over the airplane. It is the same idealization as used by Galileo. However now it leads to the disastrous result that sustained flight is impossible for an airplane.
This is one way that an idealization can be a bad idealization: it leads to incorrect results that we attribute unwittingly to the original system, where those results are merely artifacts of the idealization.
|1. The term "thermodynamics of computation" is taken from the title of Charles Bennett (1982) “The Thermodynamics of Computation—A Review,” International Journal of Theoretical Physics, 21, pp. 905-40; reprinted in Harvey S. Leff and Andrew Rex (2003), eds., Maxwell’s Demon 2: Entropy, Classical and Quantum Information, Computing. Bristol and Philadelphia: Institute of Physics Publishing. Ch. 7.||My goal here is to outline a striking case of a bad idealization in present science. It has created a spurious science, known as the "thermodynamics of computation." The analyses in this science are dependent upon idealizing away thermal fluctuations in an arbitrary and selective manner. If one treats the fluctuations consistently, the results of the science disappear.|
One can get an idea in advance of what the problem will be by thinking about scaling. We tend to imagine things remaining pretty much the same as we scale things down. If we were one tenth or one hundredth our size, we imagine that we could carry on our business pretty much as normal, as long as everything else was scaled down correspondingly.
Famously, those intuitions are flawed. Humans scaled up to the size of elephants would need legs as fat as an elephant's merely to move around. If we were scaled down to the size of fleas, however, our muscles would be grossly oversized. The slender leg of a flea is already powerful enough to let it leap many times its height.
These sorts of problems mount the greater the scaling. When we scale static objects down to molecular sizes, we imagine that they will stay put, just as they did at macroscopic scales. At the larger scale, it is no great feat to stack up a few children's blocks.
But if they were scaled down to molecular sizes, the accumulated impact of air molecules on the now minuscule block would not average out smoothly. The block would be jumping about. Simple stacking and all the other building processes that we take for granted at larger scales would be fatally disrupted.
The thermodynamics of computation has proceeded by assuming too often that one can scale down to molecular sizes familiar processes that work well on macroscopic scales. The ones that do not scale down are the non-dissipative processes, for they can be non-dissipative only by virtue of being in perfect but delicate equilibrium at every stage. The thermal fluctuations that lead a tiny block to jiggle about will disrupt them as well.
The notion that there is a thermodynamics of computation begins with the fact that all computers are thermal systems. In operation they take work energy, usually supplied as electrical energy, and convert it to heat. This conversion of work to heat is a thermodynamically dissipative one. It is entropy increasing. As a practical matter, compact computing devices of any power require cooling systems to dissipate this heat generation.
The thermodynamics of computation seeks to answer the question of whether this dissipative heat generation can be eliminated entirely, at least in principle; or whether some heat generation arises essentially as a part of computation itself.
The reigning answer to this question is based on the work of Rolf Landauer and Charles Bennett. According to what I shall call the Landauer-Bennett orthodoxy, all computational processes can be carried out in principle in a non-dissipative manner excepting one, a logically irreversible many-to-one mapping of states. Erasure is the familiar example of such a process. When it occurs, thermodynamic entropy is necessarily passed to the surroundings. This results in an increased entropy of the surroundings.
Landauer's Principle: Erasure of one bit of information passes at least k ln 2 of thermodynamic entropy to the surroundings.
|You may have noticed an odd asymmetry between the direct and converse form of Landauer's principle. Logically reversible processes can be implemented in thermodynamically reversible processes. You might expect then that logically irreversible processes must be implemented by thermodynamically irreversible processes. But the principle does not say that. This asymmetry reflects a deeper muddle explained in Just WHAT does Landauer's Principle say?.||Since we assume that the computer exchanges only heat but not work with the surroundings, it follows that the surroundings are warmed by at least kT ln 2 of heat for each bit erased. The source of the heat is ultimately the work energy used to power the computer. That work energy has been degraded to heat.
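To give a sense of the magnitudes involved, the bound can be evaluated numerically. The following is a minimal sketch, not part of the original analysis; the function names and the choice of T = 300 K as "room temperature" are assumptions made for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def landauer_entropy(bits=1):
    """Minimum entropy (J/K) passed to the surroundings: k ln 2 per bit erased."""
    return bits * K_B * math.log(2)


def landauer_heat(temperature, bits=1):
    """Minimum heat (J) passed to surroundings at the given temperature: kT ln 2 per bit."""
    return temperature * landauer_entropy(bits)


if __name__ == "__main__":
    T = 300.0  # assumed room temperature in kelvin (illustrative choice)
    print(f"k ln 2  = {landauer_entropy():.3e} J/K per bit")
    print(f"kT ln 2 = {landauer_heat(T):.3e} J per bit at {T} K")
```

The quantities are minute on ordinary scales, which is why the principle matters only for devices whose components approach molecular sizes.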
There is a converse principle that is usually tacit but is sometimes stated. It affirms that erasure (or, more generally, a logically irreversible, many-to-one mapping) is the only necessarily dissipative computational process.
Converse Landauer's Principle: All logically reversible operations can be implemented by thermodynamically reversible processes, that is, processes that create no entropy.
To see how this principle comes about, consider a one bit memory device. It is a physical system that admits two distinct states corresponding to the "0" or "1" of the bit stored. In principle, it could be a boulder that is rolled to one or other side of a cavern. This boulder and cavern is a thermal system. As a result, the boulder's energy is fluctuating very slightly as it gains and loses energy in collisions with air molecules and the other systems.
Practical computing devices are much smaller than this boulder. In older electronic computing devices, a single bit was stored in the ferrite rings of a magnetic core. The magnetic field of the core could be manipulated electrically to point in one of two directions that corresponded to the "0" or "1" of the stored bit. Modern DRAM chips record a bit according to whether a tiny charge is stored on a capacitor within the chip.
These devices are also thermal systems, constantly interchanging energy with their surroundings. As a result their internal energies are fluctuating. The voltage of the capacitor in a DRAM chip will be bouncing about, for example, and the size of the bouncing relative to its voltage will be larger the closer the capacitor's size approaches atomic scales. This will cause no problem as long as the semiconductor walls confining the charge are high enough electrically to prevent the charge escaping.
A single, simple model is used almost universally to capture these thermal aspects of a memory device. It is the one-molecule memory device. Since the thermal physics of fluctuations is robust, as far as I know this particular idealization is benign. The sorts of results that are generated for the one-molecule device will have analogs for the other devices. The latter will just be harder to see.
The device consists of a chamber that holds a single molecule. The chamber can be divided into two parts by a removable partition. When the partition is inserted into the chamber, the molecule will be captured on one side. That side represents whether the device is storing an "L" or an "R"; or, more generically, a "0" or a "1."
The chamber is in thermal contact with the surroundings at temperature T and, as a result, both the chamber and its molecule are also at temperature T. The molecule will carry thermal energy of kT/2 per degree of freedom, where k is Boltzmann's constant. For the case of a monatomic molecule--a Helium atom, for example--this thermal energy is (3/2)kT.
Since k is extremely small, this thermal energy is minute. However the molecule is also minute, so that the thermal energy is sufficient to fling the molecule through the chamber. Over time, the single molecule acts like a gas that fills the chamber. Its presence on one or other side is merely a momentary thermal fluctuation. As a result, the partition is an essential component of the device. For without it the molecule would not be confined to one or other side of the chamber.
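A rough numerical illustration shows how a minute energy can still fling a minute molecule about. This is a sketch under stated assumptions: the temperature of 300 K and the use of the root-mean-square speed from (1/2)m<v^2> = (3/2)kT are illustrative choices, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23                   # Boltzmann's constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
HELIUM_MASS = 4.002602 * ATOMIC_MASS_UNIT  # mass of one helium atom, kg


def thermal_energy(temperature, degrees_of_freedom=3):
    """Mean thermal energy in joules: kT/2 per degree of freedom."""
    return 0.5 * degrees_of_freedom * K_B * temperature


def rms_speed(temperature, mass):
    """Root-mean-square speed (m/s) from (1/2) m <v^2> = (3/2) kT."""
    return math.sqrt(3.0 * K_B * temperature / mass)


if __name__ == "__main__":
    T = 300.0  # assumed temperature in kelvin
    print(f"(3/2)kT   = {thermal_energy(T):.3e} J")
    print(f"rms speed = {rms_speed(T, HELIUM_MASS):.0f} m/s")
```

On these assumptions the energy is of order 10^-21 joules, yet the corresponding speed of a helium atom is above a kilometer per second.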
This capacity of a single molecule to fill a chamber like a gas is an extreme manifestation of thermal fluctuations. The more familiar fluctuations consist in the slight jiggling of the volume and constraining piston of an ordinary gas. This jiggling is imperceptible at ordinary scales.
As we reduce the number of molecules, the tiny jiggles become relatively larger until they dominate. Take the case of just four molecules. Fluctuations can lead the four-molecule gas to spontaneously compress to one half its volume. That happens with a probability of (1/2)^4, which is 1/16.
The motion of the single molecule of a one-molecule gas through its chamber is just an extreme manifestation of these fluctuations in which density fluctuations are wide enough to evacuate one half of the chamber.
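The scaling of this probability with the number of molecules can be made explicit. The sketch below assumes, as the four-molecule example does, that each molecule is independently equally likely to be in either half of the chamber; the function name is hypothetical.

```python
from fractions import Fraction


def prob_all_in_one_half(n_molecules):
    """Probability that all n molecules are found in one designated half.

    Assumes each molecule is independently on either side with probability 1/2.
    """
    return Fraction(1, 2) ** n_molecules


if __name__ == "__main__":
    for n in (1, 4, 10, 100):
        p = prob_all_in_one_half(n)
        print(f"n = {n:3d}: probability = {float(p):.3e}")
```

For one molecule the probability is 1/2, so the "fluctuation" is present half the time; for macroscopic numbers of molecules it is far too small ever to be observed.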
When the partition is inserted and the gas has been confined to half the volume, all we have done is to "lock in" a fluctuation.
To arrive at the entropy cost indicated by Landauer's principle, we need to find the entropy associated with a thermalized memory device. This is a memory device whose partition has been removed, "thermalizing" it, so that the molecule is free to move through both sides of the chamber.
That is, we start with a memory device holding some data, such as L.
We remove the partition and give the molecule access to the entire chamber.
The data is now lost and the cell is thermalized.
When we thermalize a binary memory device like this, we increase its entropy by k ln 2. This is an irreversible process that creates entropy of k ln 2. That it is irreversible is easy to see--the expansion of the gas is uncontrolled; there is no balance of forces. As a result we have lost a possibility of gaining work if we had allowed the gas to expand in a way that let us recover work.
Since this quantity of entropy of k ln 2 is of central importance to the thermodynamics of computation, it is worth recounting how it is derived. The derivation proceeds by assuming that there is a reversible expansion that connects the initial and final states; and then, by tracking the heat exchanges, the figure of k ln 2 is recovered. The details are below in the notes "Computing the entropy increase in thermalizing a one-molecule memory cell."
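In outline, and assuming the mean pressure of the one-molecule gas is given by the ideal gas law P = kT/V, the computation runs as follows. In a reversible isothermal expansion from V/2 to V the energy of the gas is constant, so the heat taken in equals the work done by the gas. This is a sketch of the standard calculation, not a substitute for the fuller notes.

```latex
\begin{align*}
dQ_{\mathrm{rev}} &= P\,dV = \frac{kT}{V}\,dV \\
\Delta S &= \int \frac{dQ_{\mathrm{rev}}}{T}
          = \int_{V/2}^{V} \frac{k}{V'}\,dV'
          = k \ln\frac{V}{V/2}
          = k \ln 2
\end{align*}
```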
Landauer's principle is commonly justified by distinguishing a memory device with reset data from one with random data. The latter only is treated thermodynamically as the same as a thermalized memory device.
Consider an ensemble of many memory devices. If the devices hold reset data--in this case all reset to the left side--the ensemble looks like this.
If the devices hold random data, then the molecule may be found on the left or the right of the chamber. Then the ensemble might look something like this.
Finally we have an ensemble of thermalized devices all of whose partitions have been removed.
These last two ensembles are similar in that, in both cases, we are equally likely to find the molecule on either side of the chamber.
This resemblance is taken (mistakenly--see below) to justify the conclusion that the two ensembles are equivalent thermodynamically. This means that a memory device holding random data will have the same thermodynamic entropy as a thermalized memory device. That entropy is k ln 2 greater for each cell than for a memory device holding known data.
This difference leads us to Landauer's principle. An erasure process brings memory devices holding random data back to a state in which they hold reset data. That is, during erasure, each memory device has its thermodynamic entropy reduced by k ln 2. The second law of thermodynamics tells us that entropy overall cannot decrease. Hence this reduction in entropy must be compensated by an increase in entropy of the surroundings of at least k ln 2. This is the result claimed in Landauer's principle (for a one bit erasure).
There is a related way of arriving at the same result: the argument from a many-to-one mapping or compression of phase space. In a Boltzmannian approach to statistical mechanics, the thermodynamic entropy of a thermal system is related to the volume of phase space that it occupies according to the relation
S = k ln (phase volume)
In erasing a device with random data, its phase space is reduced by a factor of 2. An entropy reduction of k ln 2 follows. Since entropy overall cannot decrease, this reduction must be compensated by an entropy increase in the surroundings.
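Written out, the step from halved phase volume to k ln 2 is a single line. Here the symbol \Omega for the phase volume before erasure is an assumption introduced for illustration, and the equation records the argument as its proponents present it.

```latex
\Delta S = k \ln\!\left(\tfrac{1}{2}\,\Omega\right) - k \ln \Omega = -k \ln 2
```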
|These complaints have been argued at some length in my papers:
"All Shook Up: Fluctuations, Maxwell's Demon and the Thermodynamics of Computation."
"Waiting for Landauer," Studies in History and Philosophy of Modern Physics, 42 (2011), pp. 184-198.
"Eaters of the Lotus: Landauer's Principle and the Return of Maxwell's Demon," Studies in History and Philosophy of Modern Physics, 36 (2005), pp. 375-411.
|My purpose here is not to dispute Landauer's principle. However, I believe that all the argumentation of the last section is flawed:
• An erasure does not compress phase space. It leaves the occupied volume of phase space the same and merely rearranges those parts that are occupied.
• While we may not know which side of a memory device with random data holds the molecule, the resulting probability distribution cannot be associated with a thermodynamic entropy. For probability distributions to issue in thermodynamic entropy, the regions of phase space over which the probability ranges must be accessible to the system point. This condition fails in the above analysis.
It is now a half century since Landauer's first suggestion of a necessary connection between erasure and thermodynamic entropy creation. We still have no cogent justification for his principle, although there has been no lack of attempts to justify it. At best we can say that the principle remains an interesting speculation; at worst, it is a seductive mistake.
What does an erasure process look like? A common example is the following two step process:
1. We remove the partition to thermalize the data.
2. We insert a piston and carry out a reversible, isothermal compression of the space to restore the device to the reference state.
|The piston works against a mean pressure P = kT/V. In the course of a compression from V = V_init to V = V_fin = V_init/2, the work done is ∫ P dV = ∫ (kT/V) dV = kT ln (V_init/V_fin) = kT ln 2.||Work kT ln 2 is supplied to the gas in the course of the compression. Since the gas energy remains the same, this work energy is passed to the surroundings as heat. Since the surroundings are at temperature T, the effect is an increase in the entropy of the surroundings of kT ln 2/T = k ln 2, which is the minimum amount required by Landauer's principle.|
|One of the curious aspects of the Landauer-Bennett approach is that the insertion of the partition is not taken to reduce thermodynamic entropy. Even though the one-molecule gas has been compressed and its entropy reduced, we do not know which side of the chamber holds the molecule. There is, supposedly, a thermodynamic entropy of k ln 2 associated with this uncertainty. In my view this is the same mistake as the thermodynamic conflation of random and thermalized data above.||The processes described here offer the possibility of a violation of the second law of thermodynamics. Start with a memory cell in the thermalized state. Insert the partition. The net effect is a reduction of entropy of k ln 2 in apparent violation of the second law of thermodynamics.|
This threat to the law is an old one. It arose with the recognition in the nineteenth century that thermal processes had a molecular constitution. Maxwell had described his demon. It was a nimble being who could open and close a door in the partition in a gas-filled chamber, so that faster moving molecules are accumulated on one side and slower moving molecules on the other. The net effect is that the gas on one side of the partition becomes hotter and on the other side, colder. This separation of hot from cold without the expenditure of work violates the Second Law of Thermodynamics.
The problem became acute when Einstein established in 1905 that molecular fluctuations were a real part of experimental physics. One needed only to gaze upon pollen grains under a microscope to see them. Their random jiggling, Brownian motion, results from the accumulated impact of many water molecules.
With each abrupt jiggle, some of the heat energy of the water is converted into motion. It is a conversion of heat to work forbidden by the second law.
The violation was undeniable, as Henri Poincaré reported in 1904:
"[…] we see under our eyes now motion transformed into heat by friction, now heat changed inversely into motion, and that without loss since the movement lasts forever. This is the contrary of the principle of Carnot.
If this be so, to see the world return backward, we no longer have need of the infinitely keen eye of Maxwell's demon; our microscope suffices."
...and, in 1905: “One can almost see Maxwell’s demon at work.”
A retrenchment was needed. It was decided that these violations of the second law could be confined to microscopic realms. There would be no way to accumulate them so that many microscopic violations could issue in a macroscopic violation of the Second Law. This was the result of detailed reflections, in particular, by Marian Smoluchowski. (We will return to them in a moment.) In them, Smoluchowski considered a range of proposals to accumulate these small violations and urged their failure.
A later proposal in this tradition was made by Leo Szilard in a paper of 1929. Smoluchowski had not considered the one-molecule gas. Its analysis was Szilard's invention. Szilard's engine depended essentially on the decrease in entropy associated with the insertion of the partition above. His engine went through the following cycle:
1. A partition is inserted into the midpoint of a chamber holding a one molecule gas.
2. The side on which the molecule is captured is detected and a piston/weight system is installed accordingly.
3. The one molecule gas is expanded reversibly and isothermally. Heat kT ln 2 is drawn from the surroundings and supplied as work in the raising of the weight.
4. The piston is removed and the cycle is repeated.
The net effect of the cycle is the conversion of kT ln 2 of heat, drawn from the surroundings, to work, stored in the raised weight. Indefinite repetition of the cycle enables an indefinite conversion of heat to work, accumulating a macroscopic violation of the Second Law.
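The bookkeeping of the idealized cycle can be sketched numerically. The code below takes the idealization at face value: it assumes a mean pressure P = kT/V, a fluctuation-free reversible expansion from half the chamber volume to the full volume, and no other costs. The function names, the temperature of 300 K, and the midpoint-rule integration are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K


def isothermal_work(temperature, v_start, v_end, steps=100_000):
    """Midpoint-rule estimate of the work integral of P dV with P = kT/V."""
    dv = (v_end - v_start) / steps
    total = 0.0
    for i in range(steps):
        v = v_start + (i + 0.5) * dv
        total += K_B * temperature / v * dv
    return total


def szilard_work(temperature, n_cycles, volume=1.0):
    """Total heat converted to work after n idealized cycles (V/2 -> V each)."""
    return n_cycles * isothermal_work(temperature, volume / 2.0, volume)


if __name__ == "__main__":
    T = 300.0  # assumed temperature in kelvin
    per_cycle = szilard_work(T, 1)
    print(f"work per cycle : {per_cycle:.3e} J")
    print(f"kT ln 2        : {K_B * T * math.log(2):.3e} J")
    print(f"after 1e20 cycles: {szilard_work(T, 1e20):.3e} J")
```

The per-cycle figure agrees with kT ln 2, and the total grows without limit as cycles accumulate, which is the sense in which the idealized engine threatens a macroscopic violation.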
|In retrospect, this is a curious reaction. When one considers devices as exotic as this one-molecule engine, why should the Second Law be protected? We do know that the molecular constitution of matter will ultimately force a failure of the second law. Poincaré recurrence assures that a gas expanding into a chamber will eventually--after much time--recompress by an accumulation of spontaneous motions of the individual molecules to one side. Why is it so repugnant to imagine that a device like the one-molecule engine can hasten this violation?||The near-universal reaction to this one-molecule engine was that it must fail. There must be, it was presumed, some hidden dissipation in the operation of the device that compensates for the entropy reduction.
An early consensus was expressed by von Neumann, Brillouin and, apparently, Szilard. To function, the demon had to detect which side of the partition had trapped the molecule. Working backwards, they concluded that this detection would provide one bit of information and that there had to be an ineliminable entropy cost of k ln 2 associated with acquiring it.
This proposal has been called "Szilard's principle," by John Earman and me.
A central tenet of the new Landauer-Bennett orthodoxy is that this principle is incorrect. For if it were correct, then there would be a new source of entropy generation in computing. Every time a memory device is read in a computation, a detect operation is carried out. If Szilard's principle were true, each of these operations would necessarily create k ln 2 of entropy, representing a source of dissipation other than erasure. That would violate the converse form of Landauer's Principle.
This new orthodoxy provides an alternative account of the supposed failure of the engine. The demon operating it must make a record of which side is found to hold the molecule. To complete the cycle and restore the system to its initial state, the demon must erase this one bit memory. This erasure process creates the k ln 2 of entropy needed to balance the entropy books and protect the second law.
The thermodynamics of computation depends on the supposition that one can assemble a composite process by combining processes in some standard inventory.
|"Waiting for Landauer," Studies in History and Philosophy of Modern Physics, 42 (2011), pp. 184-198.||I know of no (other) place in which this inventory is listed. However in my own work elsewhere, I have reviewed papers in the thermodynamics of computation and assembled the following inventory.|
Perform reversible, isothermal expansions and compressions of a one-molecule gas.
Insert and remove a partition in a chamber holding a one molecule gas without dissipation.
Detect the location of the molecule of a one-molecule gas without dissipation.
Shift between equal entropy states without dissipation.
Trigger new processes according to the location detected without dissipation.
My claim is that this inventory of processes neglects relevant thermal processes, most notably thermal fluctuations. I will show below that several of these processes are disrupted by fluctuations and cannot be brought to completion.
The mode of failure is quite robust and will be captured in a result that applies to all reversible processes at these scales. Reversibility means that, at each moment of the infinitely slow process, its systems are in equilibrium so that all forces are perfectly balanced. Thermal fluctuations will be superimposed onto these equilibria for all thermal systems. For macroscopic systems, the size of the fluctuations is sufficiently small that they can be neglected. However the processes considered in the thermodynamics of computation seek to manipulate systems at the molecular level. Then, it turns out, fluctuations are significant enough to disrupt the equilibrium to such a degree that the processes cannot proceed in any definite direction.
In supposing that the processes of the inventory can be completed non-dissipatively, the orthodoxy neglects the very processes that are essential to the recovery of Landauer's principle. A memory device with known data increases in entropy by k ln 2 when thermalized only because thermal fluctuations fling the molecule throughout the larger volume of the chamber now accessible.
If these processes of the inventory are to be effected, we shall see that further dissipative processes are required. Their need then contradicts the central tenet of the Landauer-Bennett orthodoxy, that erasure is the only necessarily dissipative process in computation.
That fluctuations are the key to understanding all these processes is not a new idea. When the molecular threat to the second law of thermodynamics was first addressed, they took center stage. The basic idea of Smoluchowski's analysis of the early 1910s was this: the functioning of any machine that sought to accumulate microscopic violations of the second law would itself be fatally disrupted by fluctuations.
The most celebrated example of many was the Smoluchowski trapdoor. It was a design for a mechanical Maxwell's demon. In place of the demon watching molecules approaching a door, the design employs a lightly spring-loaded door. Fast moving molecules approaching from one side can swing the door open, so that they pass through to the other. The trapdoor does not allow molecules to pass in the other direction.
Its intended operation would then lead to accumulation of more, faster moving molecules on one side, as shown here:
The design fails. The trapdoor itself must be very light and very lightly spring-loaded if a collision with a single molecule is to open it. However the trapdoor is itself a thermal system, carrying energy kT/2 for each degree of freedom. The result is that the trapdoor is flapping about wildly, bouncing off the partition and not remaining closed. It provides no differential obstacle to molecules. They can pass through the trapdoor with equal facility in either direction.
Analogously, fluctuations will disrupt these expansion and compression processes. To see how, we need a simple implementation of the process. The cylinder in which the one-molecule gas will expand is oriented vertically so that the work done by the expanding gas is captured directly in the raising of the mass of the piston.
|This need to adjust the piston mass as the process proceeds would introduce further machinery that complicates the analysis. A better approach is to assume a non-gravitational force field that pulls the piston down with force 2kT/h. The piston then remains in perfect equilibrium with the gas throughout the process.
This force field is also assumed not to act on the single molecule, precluding the complication of a gravitationally induced density gradient in the gas.
For this analysis, see "Waiting for Landauer" Section 7.5.
|For the expansion to be reversible, the weight of the piston must balance exactly the upward pressure of the one molecule gas. This is the condition that a thermodynamically reversible process must meet. Thus, as the expansion proceeds and the gas pressure drops, the weight must be constantly adjusted.
The process we intend will consist of the piston, in perfect equilibrium with the gas pressure, rising very slowly, as work energy is drawn from the gas. Since the internal energy of the gas will remain the same, that withdrawn work energy is in turn replenished by heat that is conveyed to the one-molecule gas from the surroundings.
We can estimate the condition of perfect equilibrium as follows. The gas exerts a mean pressure P = kT/V on the piston. If the piston has area A and is at height h, then we have V = Ah, and the force F exerted by the gas on the piston is
F = P.A = (kT/V).A = kT/h
|This is only an estimate of the condition for equilibrium. The more complete analysis of "Waiting..." shows that we must also consider the effects of the thermal motion of the piston.||The weight of the piston is Mg, if its mass is M. Setting the force and weight equal, we find that the equilibrium height for the mass M is|
h_eq = kT/Mg
What this description ignores is that the piston must be very light if impacts with a single molecule are sufficient to raise it. And it is also a thermal system that will undergo fluctuations in its own right. Since its mass will be comparable to that of the single molecule, we can anticipate that these fluctuations will cause the piston to bounce throughout the chamber. For it is these same fluctuations that lead the single molecule to bounce through the chamber, simulating a chamber-filling gas.
An easy computation affirms that this will happen. Take the case shown above of the piston by itself, that is, without a one-molecule gas trapped beneath it. The piston will have a thermal energy distributed probabilistically according to the Boltzmann distribution. That is, the probability density p(h) that the piston is at height h is given by:
p(h) = (Mg/kT) exp(-Mgh/kT)
The mean of this distribution is kT/Mg, which is equal to heq computed above. That means that the piston will on average already be around this equilibrium position of the expansion without any specific interaction with the gas, merely because the piston is a thermal system whose fluctuations fling it through the chamber.
More telling is the standard deviation of the above distribution. It is also kT/Mg. Since this magnitude gives the scale of the piston's fluctuations, we now see that the piston will be bouncing wildly through the entire size of the cylinder merely because it is a thermal system. It will be a process without discernible start or finish, as shown here:
The outcome is that a reversible isothermal expansion of a one-molecule gas, or its inverse, a reversible isothermal compression, is impossible. They are delicate processes in which all the forces must balance. Yet exactly this fact makes them fall victim to thermal fluctuations.
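The claim that both the mean and the standard deviation of the piston height equal kT/Mg can be checked numerically. The following is a minimal sketch, not part of the original analysis. The piston mass (roughly that of one oxygen molecule), the temperature of 300 K and the function name `piston_height_stats` are illustrative assumptions.

```python
import math

def piston_height_stats(mass_kg, temperature_k, g=9.81, n=200_000):
    """Integrate p(h) = (Mg/kT) exp(-Mgh/kT) numerically for its mean and std."""
    k = 1.380649e-23  # Boltzmann constant, J/K
    scale = k * temperature_k / (mass_kg * g)  # kT/Mg, the analytic mean
    h_max = 40.0 * scale  # beyond 40 scale heights the tail is negligible
    dh = h_max / n
    norm = mean = second = 0.0
    for i in range(n):
        h = (i + 0.5) * dh  # midpoint rule
        p = math.exp(-h / scale) / scale
        norm += p * dh
        mean += h * p * dh
        second += h * h * p * dh
    std = math.sqrt(second - mean * mean)
    return scale, norm, mean, std

# Assumed values: piston mass of about one O2 molecule, at 300 K.
scale, norm, mean, std = piston_height_stats(mass_kg=5.3e-26, temperature_k=300.0)
print(f"kT/Mg = {scale:.3e} m, mean = {mean:.3e} m, std = {std:.3e} m")
```

Under these assumptions the three printed lengths should agree closely, which is the sense in which the fluctuations are as large as the equilibrium displacement itself.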
How are we to perform a compression or expansion? We must apply a force strong enough to overcome the fluctuations in the motion of the piston and use that force to drive the piston from its expanded to compressed position.
But to do that is to use forces that are out of equilibrium. The process is no longer reversible, but irreversible and hence an entropy creating dissipation.
It is essential to the Landauer-Bennett orthodoxy that we can detect the location of a single molecule in a memory device without dissipation.
Bennett has provided several schemes that purport to do this. In one, a keel shaped device is slowly lowered onto the memory device. It is fitted with two pistons, each moving through one side of the memory device. Only one will be resisted by the molecule and this will tip the keel. The location of that molecule is then read from the direction in which the keel tipped. A key shaped device is coupled with the keel as it tips by turning a toothed wheel to raise and lower the key. The final reading of the detection process is given by the location of the key on the locking pin.
Here are the stages of this process:
The difficulty with this scheme is essentially the same one that disrupted the expansion above. If the keel is to be tipped by the pressure exerted by a single molecule, then the keel must be very light. The keel is also a thermal system with its own thermal energy of kT/2 per degree of freedom. Just as the piston in the above compression bounced about because of its thermal energy, so the keel will rock wildly. This wild rocking will persist so that the keel never settles down into a configuration in which it correctly indicates the memory device state.
Other detection schemes described by Bennett fail in the same way. An earlier proposal involves the reading of a ferromagnetic core memory device. The data stored by the magnetic core is held in the direction of its magnetization. That is, its own magnetic field is directed "up" or "down" according to the data bit recorded.
Bennett proposes that the state of the device can be read by a second detector memory core that is brought from a bistable state into a "soft mode" by passing it through a magnetic field. In the soft mode, the detector's magnetization can be altered easily. It is then slowly coupled with the memory device, where it aligns its own field contrary to that of the memory device. After this coupling operation, the detector is returned to a bistable state.
Here is an animation of the process:
Here is the figure from which the animation was developed.
Fluctuations will once again disrupt this scheme. Since the detection must be carried out in a reversible manner, the forces coupling and decoupling the detector with the target memory device must be in perfect equilibrium with those resisting. If the memory device and detector are small and of molecular scales, then thermodynamic fluctuations are superimposed upon the equilibrium state and will drown out any reliable detection.
For example, consider the process through which the detector ferromagnet is coupled with the data ferromagnet. When the detector is in its soft mode, its state can move freely between the "0" and "1" sub-states and thermal fluctuations will lead it to bounce between them. Some other device--let's call it a driver--will bring the detector to couple with the data ferromagnet. The result will be that the detector states will now be restricted to one of "0" or "1" only.
This coupling process is a compression of the detector state space. It is compressed from a state having access to the full range of "0" and "1" sub-states to a state from which only one of them is accessible. Here is how we might expect the process to go:
However it will not proceed this smoothly. Thermodynamically, the coupling has the same behavior as the compression of a one-molecule gas. The detector is the analog of the gas. The driver is the analog of the compressing piston. This coupling will be disrupted by fluctuations in the same way as the gas compression. The compressive force exerted by the driver will be balanced exactly by the detector's resistance to compression. Once thermal fluctuations are superimposed onto this equilibrium, we have a process that fluctuates about so wildly as to have no definite beginning or end:
That fluctuations will have this effect could be shown by a computation dealing with the specifics of the arrangement just described. That turns out to be unnecessary. In so far as the processes of detection just described are to be carried out as thermodynamically reversible processes, they are governed by the "no go" result of Section 6 below, which shows that the processes are fatally disrupted by fluctuations.
How can detection be carried out? What we need is a process that can override fluctuations. Such processes will be dissipative.
The older literature offered many proposals. The best known is the Brillouin torch. In the context of exorcising Maxwell's Demon, Brillouin proposed that we locate a molecule in a kinetic gas by shining a light that will reflect off it, revealing its position.
He argued, however, that this detection operation is thermodynamically dissipative. One needs sufficiently energetic light so that the resulting signal remains detectable above the background thermal radiation. That condition, Brillouin argued, required a dissipation of at least k ln 2 for each bit of information secured.
The other processes of the inventory are troubled. Take, for example, the simple idea that we can insert a partition into the chamber without incurring a thermodynamic cost. A small amount of analysis opens numerous problems.
If the partition is very light, then it will be subject to thermal fluctuations. If we try to drive it into place in a reversible process with perfectly balanced forces, we can expect thermal fluctuations to disrupt our efforts as before.
So let us imagine that it is very massive. Our presumption is that we are dealing with frictionless systems (i.e. conservative Hamiltonians). For otherwise their operation would be accompanied by just the dissipative conversion of work to heat that we are seeking to avoid. That means that a massive partition slowly sliding into place would not cease moving when it strikes the chamber walls. It would simply bounce off.
Some further machinery is needed to halt its motion. It cannot become wedged by friction into a groove, for example. That would mean that it is held by disallowed frictional forces. Nor can we tie it down with ropes or screw it in place, for knots in ropes and screws both depend essentially on friction to hold. (Frictionless shoelaces would spontaneously untie!)
We might consider a mechanical device that operates without friction. For example, a spring-loaded pin might press against the partition wall. When the partition insertion is completed, the pin would align with a hole in the partition and slide into place, locking the partition in place.
We might anticipate its operation to proceed as follows:
It will not work that way. We have simply replicated all the problems of the sliding partition in the sliding pin. If the pin is light, it will have thermal energy sufficiently great to lead to fluctuations that are in turn sufficient to have the pin bounce out and release the partition. If the pin is massive, since no friction will restrain it, it will simply reverse its motion and bounce back.
So far, we've seen a few examples of how fluctuations defeat efforts to conduct the routine business of computation non-dissipatively at microscopic scales. The obvious hunch is that this will always be the case. There seems to be nothing special about the examples we've seen.
That hunch turns out to be right. All the cases looked at so far turn out to be instances of a quite general result. Indeed the result is quite startling for its generality.
|That is not quite right. If the process were in perfect equilibrium at every stage, nothing would happen. There has to be slight disequilibrium to allow it to proceed. Imagine, however, that we make this disequilibrium smaller and smaller, so that the process proceeds slower and slower. We are really concerned with processes that are so far along this sequence that the tiny disequilibria are smaller than any amount we care about.
It is an interesting project in philosophy just to get clear on precisely what these processes are. My analysis is given in "Infinite Idealizations," Prepared for Vienna Circle Institute Yearbook (Springer: Dordrecht-Heidelberg-London-New York).
|The key property of the processes covered by the no go result is that they are non-dissipative. Recalling the key ideas once again:
To be non-dissipative, that is to be a process that creates no new entropy in the universe, the process must be in perfect equilibrium at all of its stages. That means that the forces that drive the process are perfectly balanced.
Processes with this property are called "reversible," a notion introduced by Sadi Carnot in the 19th century at the birth of thermodynamics. They have the property of being able to proceed with equal facility in both directions.
A reversible expansion, for example, is one described earlier in which the weight of the piston perfectly balances the expansive pressure of the gas. If the weight were taken off the piston so the pressure force would overpower it, then we would have a sudden expansion of the gas; it would be an "irreversible process."
The no go result applies to all processes of this type, if implemented at the microscopic level. That means that they are implemented in a way designed to manipulate the properties of individual molecules. Such processes are intended to proceed very slowly through a series of stages indexed by some parameter, λ:
λ=1, λ=2, λ=3, ...
We now add the existence of fluctuations. They impose extra motions on the system, beyond the intended motion through the various stages. For macroscopic bodies, these fluctuations produce a very, very slight probabilistic wobble that is quite indiscernible. For smaller bodies like pollen grains that are visible under the microscope, they introduce larger probabilistic wobbles known as a Brownian motion. For still smaller bodies, these tiny probabilistic wobbles become dominant; they fling the system rapidly over all the stages.
More precisely stated, the result is that all of these stages are equiprobable:
p(λ) = constant
where p is the probability density of stages over λ. What this means is that, if we inspect the process at any moment, we are as likely to find it at any one stage as at any other.
That defeats all efforts to implement the infinitely slow reversible transformation. Imagine for example that we set up the system in some initial state, λ = λ1, intending that over a long time it will slowly evolve into a final state, λ = λ2. That means we are expecting something like this:
That is not what we will get. If, moments after, we inspect the system, we are as likely to find the system already in any of the stages of the process, including the final stage. The process will have no definite start, middle or end. These fluctuation probabilities will be hurling the process over all the intermediate stages, indiscriminately. This is what we will have:
The origin of the no-go result lies in the fact that reversible processes are, at all times, in perfect equilibrium. That is a delicate balance of forces. Thermal systems fluctuate slightly all the time because of their molecular constitution. If we consider macroscopic processes, these fluctuations are minor and rarely need our attention. If, however, the processes are manipulating components at a molecular level, the situation is reversed. The delicate balance needed for equilibrium is overwhelmed by fluctuations.
It is helpful to attach a geometric picture to this no go result. All the processes unfold in a phase space whose individual points represent all possible microscopic states of the system. The evolution of each process is represented by a trajectory through this phase space. Finally, the different stages of the processes are represented by sub-volumes of the phase space.
Perhaps from an analogy with macroscopic processes, what we expect is for the evolution to proceed peacefully, with the system remaining for a while in the first stage λ=1; then moving on to the next, λ=2, where it lingers; then moving on to λ=3; and so on.
However this is not what will happen. We are assuming that the states of the various stages are dynamically remote from one another so the trajectory cannot easily pass between them. That is incorrect. The various stages are dynamically very close and the system can as easily pass to a different stage as remain in the present stage. This figure is closer to what will happen:
This figure also displays the important fact that the volumes of phase space corresponding to each stage overlap. That means that there is considerable arbitrariness in how we assign the stage of the process actualized when we have the system in a state that belongs to many stages.
We can see how this overlapping comes about if we look at the phase space of a gas-piston system. The full phase space of the system will have many dimensions. There will be coordinates for the positions and momenta of both gas and piston. The overlapping is seen, however, if we look at a slice through the phase space that shows just the heights of the molecule and the piston. For the stage in which the gas-piston boundary is at height h = H, the gas molecule occupies heights from 0 to H and the piston occupies heights from H to infinity. These are plotted on the figure below and correspond to the infinitely extending rectangles shown.
We can see from the figure that a single microscopic state of the gas-piston can belong to many stages. For example, if the gas height is 0.25H and the piston height is 1.5H, it will belong to all four of the stages shown.
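The membership rule just described can be written out directly: a state belongs to the stage with boundary h whenever the gas height is at most h and the piston height is at least h. The sketch below is illustrative only. The four boundary values are hypothetical, since the text does not state which stages the figure shows, and `stages_containing` is a name introduced here.

```python
def stages_containing(gas_height, piston_height, boundaries):
    """Return the stage boundaries h with gas_height <= h <= piston_height."""
    return [h for h in boundaries if gas_height <= h <= piston_height]

# Hypothetical stage boundaries, in units of H (assumed, not read off the figure).
boundaries = [0.5, 0.75, 1.0, 1.25]

# The state from the text: gas at 0.25H, piston at 1.5H.
print(stages_containing(0.25, 1.5, boundaries))

# A state with gas and piston close together belongs to fewer stages.
print(stages_containing(0.9, 1.1, boundaries))
```

With these assumed boundaries the first state falls in every stage listed, while the second falls in only one.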
We need two main ingredients to arrive at this no-go result. We need to know how to compute the probabilities of various fluctuations; and we need to know how precisely to characterize thermodynamically reversible processes. When the two are combined, we get the no-go result.
First, I will look at a case that might be more familiar: fluctuations in an isolated system. Then I will turn to the case that arises in the thermodynamics of computation, fluctuations in a system in thermal contact with its environment.
Consider a completely isolated system, such as a gas confined to a chamber, where the chamber is not in thermal or other forms of contact with anything else.
The phase space of such an isolated system can be divided up into parts that correspond to various macrostates. The largest such part will be the one that corresponds to the equilibrium state, such as one in which the molecules of a gas are distributed fairly uniformly over the volume that contains it.
Other non-equilibrium states, such as ones in which the gas spontaneously compresses to one side, will occupy very tiny portions of the phase space.
|The condition of non-intersection is unique to an isolated system. Such systems are deterministic. That means that the state of the system fixes its future time development. Hence there can be only one trajectory through each point of the phase space. If there are intersections, then the intersection points represent states with multiple futures possible, in violation of determinism.||Over time, the system will explore its phase space, tracing out a trajectory that (for the case of an isolated system) never intersects itself. It might look something like this:|
As the exploration proceeds, there will be some probability that the system point is found in any nominated part of the space. An isolated system is microcanonically distributed. That means that this probability is uniformly distributed over the space. So the probability of the system being in any volume V of the phase space is just proportional to the volume V.
Probability ∝ V
For the particular case of the equilibrium state that occupies volume V, it turns out that the thermodynamic entropy S of the state is related to the volume V by a famous relation due ultimately to Boltzmann:
S = k ln V
This relation holds, initially, only for the equilibrium state. For thermodynamic entropy is defined by the Clausius formula dS = dqrev/T only for equilibrium states. However it is common to extend the formula to volumes that do not correspond to equilibrium states. Through that extension, the formula becomes a definition.
Combining them we arrive at what Einstein called "Boltzmann's Principle":
S = k ln (probability) + constant
We can invert this relation to give an expression for probability:
probability ∝ exp(S/k)
If we have two states 1 and 2, we can compute the relative probability of finding the system through fluctuation to be in either of them:
probability2/probability1 = exp(ΔS/k)
where the change of entropy ΔS = S2- S1.
The extension of S to non-equilibrium states turns out to be a very useful definition for our application. We might initialize a process by creating some equilibrium state of known entropy S1. We then release it so that it can become a non-equilibrium state and evolve to another state that we then trap as a new equilibrium state, with another entropy S2. We can then insert those values into the relative probability formula to find out how likely we are to have the process go to the completion state 2 as opposed to remaining in its initial state 1.
|In Einstein's honor I will adhere to the old tradition and use the letter W for probability. That follows Einstein's usage of the German "Wahrscheinlichkeit" = probability.||In his 1905 paper on the light quantum, Einstein provided a famous illustration of the computations of the last section. He considered the case of an ideal gas whose molecules fill some chamber. He considered a fluctuation in which all the molecules found themselves in some sub-volume.|
To make things concrete, imagine an ideal gas consisting of just four molecules. These four molecules will bounce around in their chamber and, by chance, may all happen to collect in one half of the chamber. We start with the equilibrium state in which the gas occupies the full chamber. We are interested in a final state in which the gas is confined to half the chamber (and we manage to trap it there after a fluctuation).
The change in entropy between these two equilibrium states is given by standard thermodynamics as:
ΔS = Shalf - Sfull = -4k ln 2 = k ln (1/2⁴)
If we now apply the relative probability form of Boltzmann's principle, we find:
Whalf/Wfull = exp(ΔS/k) = exp(ln(1/2⁴)) = 1/2⁴
In this simple case, we can check our result independently. The probability that any nominated one of the four molecules is in the specified half is just (1/2).
Since they move independently, the probability that all four of them happen to be in that specified half is:
(1/2)⁴ = 1/16
and that matches the result given by Boltzmann's principle and the entropies of ideal gases.
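The agreement between the two routes to this probability can be checked in a few lines. This is a sketch under the stated ideal gas assumption. The function names are introduced here for illustration, and the counting routine simply enumerates every left/right placement of the molecules as equally likely.

```python
import math
from itertools import product

def ratio_from_entropy(n_molecules, volume_fraction):
    """W_sub/W_full = exp(ΔS/k), with ΔS/k = n ln(volume_fraction) for an ideal gas."""
    delta_s_over_k = n_molecules * math.log(volume_fraction)
    return math.exp(delta_s_over_k)

def ratio_by_counting(n_molecules):
    """Fraction of equally likely left/right arrangements with all molecules on the left."""
    arrangements = list(product("LR", repeat=n_molecules))
    all_left = sum(1 for a in arrangements if all(side == "L" for side in a))
    return all_left / len(arrangements)

print(ratio_from_entropy(4, 0.5), ratio_by_counting(4))
```

Raising `n_molecules` shows how quickly such a fluctuation becomes improbable as the gas approaches macroscopic size.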
This gives an idea of how thermodynamic quantities and probabilities can be connected. Unfortunately, Einstein's S = k ln W cannot be applied to the case at hand. For Einstein's result applies to isolated systems. The systems that concern us are in thermal equilibrium with their surroundings. They are anything but isolated.
However it turns out that a slight adjustment of Einstein's result applies for this case. We must assume that our system is in thermal equilibrium with the environment, that it can freely exchange heat with it, but that it exchanges no work. Then Boltzmann's Principle can be replaced by one that is very close to it. All we need to do is to replace the entropy S by the free energy F = E-TS, where E is the system's mean internal energy.
S - E/T = k ln W + constant
F = -kT ln W + constant
Inverting this formula gives us the expression we need for probabilities
(W = probability) is proportional to exp(-F/kT)
The relative form is
probability2/probability1 = exp(-ΔF/kT)
where ΔF = F2-F1. This now enables us to write for the probability density p(λ) over λ, where the F's below would also now be free energy densities over λ:
p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)
The replacement of entropy S by free energy F comes about as follows. The molecular level picture is the same. We have the isothermal system exploring the possible states that comprise its phase space.
|Since this phase space is a subspace of a larger total phase space, the earlier restriction that the trajectory cannot intersect itself can be dropped. The figure does not show any intersections, but they can happen now. Determinism is still preserved. Which possible future is realized after an intersection is determined by the state of the other parts of the system, not shown.||As before, the equilibrium states will occupy virtually all the volume of the phase space. So most of the system's trajectory will be spent in the equilibrium state with only rare excursions into macroscopically distinguishable non-equilibrium states.|
An isolated system is distributed microcanonically over its phase space. A system in thermal equilibrium with a larger system is distributed canonically over its phase space. That means
(probability of phase point at position x)∝ exp(-E(x)/kT)
where E(x) is the energy of the system when at point x in the phase space. The probability that the system is in some volume V of the phase space is thus:
p(V)∝ ∫V exp(-E(x)/kT) dx = Z(V)
The quantity Z(V) is known as the partition function or partition integral. For equilibrium systems it is related to the free energy by:
F = -kT ln Z
Proceeding as in the earlier case, we can define the free energy F of a non-equilibrium system by the same formula. Combining, we have
F = -kT ln Z = -kT ln p(V) + constant
Inverting gives the earlier result:
(probability = p(V)) is proportional to exp(-F/kT)
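The chain from the canonical distribution to the free energy formula can be traced numerically on a toy system. The sketch below assumes a single degree of freedom with energy E(x) = x, in units where kT = 1, and two regions of its one-dimensional "phase space". These choices and the function names are illustrative assumptions, not part of the original analysis.

```python
import math

def partition_integral(energy, lo, hi, kT=1.0, n=100_000):
    """Midpoint-rule estimate of Z(V) = ∫ exp(-E(x)/kT) dx over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(math.exp(-energy(lo + (i + 0.5) * dx) / kT) for i in range(n)) * dx

def free_energy(z, kT=1.0):
    """F = -kT ln Z, applied to any region of the phase space."""
    return -kT * math.log(z)

def energy(x):
    """Assumed toy energy function E(x) = x, in units of kT."""
    return x

z1 = partition_integral(energy, 0.0, 1.0)  # region V1
z2 = partition_integral(energy, 1.0, 2.0)  # region V2
f1, f2 = free_energy(z1), free_energy(z2)

ratio_direct = z2 / z1                # p(V2)/p(V1) from the canonical distribution
ratio_from_f = math.exp(-(f2 - f1))   # the same ratio from exp(-ΔF/kT)
print(ratio_direct, ratio_from_f)
```

The two ratios coincide by construction, which is the content of the inversion: once F is defined by -kT ln Z for arbitrary regions, the relative probability formula follows.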
The second ingredient is the condition of equilibrium at every stage λ of the process. For a system in thermal equilibrium with its surroundings, this condition is simply the condition that the free energy F remains the same as we proceed through the stages.
F(λ) = constant
Here is how to see that this constancy of F expresses equilibrium. A closed system will spontaneously move to states of higher entropy. So if a closed system is at equilibrium, it must have the same entropy as all accessible nearby states. If a process Δ connects the system to these nearby states, then we must have ΔS = 0.
Our situation is different, but only slightly so. The entropy of our system may not change due to spontaneous, irreversible processes. But it may change since a reversible heat transfer with the surroundings may add or subtract entropy. This entropy change will be given by the usual formula of heat/T, where the heat gained or lost is just the change in internal energy E. That is, we have
ΔS = ΔE/T or ΔS - ΔE/T = 0 or TΔS - ΔE = 0 or Δ(E-TS) = 0 or ΔF=0
Since the process Δ connects the system with neighboring states indexed by λ, another way to write this is as dF(λ)/dλ = 0, which is the same as
F(λ) = constant.
To see that this condition expresses equilibrium conceived as a balance of forces, go to Notes, Expressing Equilibrium.
We now combine these two results to recover the no go result.
We have for any isothermal reversible process that passes from λ1 to λ2 that
p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)
But we also have from the condition for equilibrium that ΔF = F2-F1 = 0. Therefore p(λ2)/p(λ1) = exp(0) = 1 and
p(λ1) = p(λ2)
which is the no go result.
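The effect of the equilibrium condition on the stage probabilities can be illustrated with a small calculation. The sketch below replaces the continuous parameter λ by a finite list of stages, which is a simplifying assumption made here, and the free energy values are hypothetical numbers in units of kT.

```python
import math

def stage_probabilities(free_energies, kT=1.0):
    """Normalized probabilities p_i ∝ exp(-F_i/kT) over a discrete list of stages."""
    f_min = min(free_energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(f - f_min) / kT) for f in free_energies]
    total = sum(weights)
    return [w / total for w in weights]

# Reversible case: F(λ) = constant, so every stage is equally probable.
reversible = stage_probabilities([2.0] * 5)

# Driven case (hypothetical values): F falls by kT per stage, favoring later stages.
driven = stage_probabilities([0.0, -1.0, -2.0, -3.0, -4.0])

print(reversible)
print(driven)
```

In the first case no stage is preferred, so the system has no tendency to be found at the start rather than the end. Only a falling free energy, as in the second case, concentrates probability on the final stage.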
The above analysis makes the tacit assumption that the thermodynamically reversible processes are self-contained. That means that they do not rely on interactions with non-thermal components or with components that are very far from thermodynamic equilibrium.
The reason is that such interactions would compromise the whole idea of seeking the thermodynamic limits of computation. If we allow non-thermal components, then we have added an element not bound by the usual laws of thermodynamics. We do not know how much of our results come from its non-thermal behavior.
If we allow a component that is far from thermal equilibrium, we no longer have a thermodynamically reversible process, for the latter consists of a sequence of equilibrium states or ones minutely removed from them.
We have already seen a common example. If an expanding gas raises a weighted piston, it is standard to imagine some non-thermal process adjusting the weight of the piston so that equilibrium is maintained.
One scheme requires that the piston is weighted by a pile of lead shot. As the piston expands and the gas pressure drops, a non-thermal operator reaches down and removes the pieces of shot one at a time.
As far as I can see, this sort of non-thermal intervention produces no problems if we merely wish to infer the thermodynamic properties of the thermal sub-system. However we cannot include a non-thermal component in the construction of a computing machine since there are no non-thermal components.
A simple example is a scheme for compressing a gas that employs the inertia of a very massive, very slowly moving body.
This slow motion of the mass provides just the force needed to compress the gas slowly. As far as the gas is concerned, the process is a sequence of equilibrium states and thus looks thermodynamically reversible.
However it is quite different for the mass. It is also a thermal system, as is any real body. At equilibrium its kinetic energy is canonically distributed. That means that its velocity fluctuates back and forth about the mean value of zero, which is rest. For a massive body, these fluctuations are imperceptible. We would just see it sitting still, although there would be very slight wobbles.
The inexorable, unidirectional motion displayed comes about when the massive body's velocity is moved into an extreme tail of the distribution; that is, the body is moved to an improbable, non-equilibrium state.
That means the total process displayed above is not a thermodynamically reversible process. A reversible process is a sequence of equilibrium states or states minutely removed from equilibrium. The process shown is a sequence of far-from-equilibrium states.
The no go result tells us that we cannot employ isothermal, reversible processes in our computing machines if they are to function at the microscopic level. What does it take to overcome these fluctuations so that processes at the microscopic level are possible? The fluctuation formula tells us:
p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)
If we want a process to proceed from λ1 to λ2, we need to create a probability density gradient that favors the process advancing. The formula tells us that we can do it by creating a negative free energy change, ΔF < 0. That means that we will have a system out of equilibrium. If, for example, we want to expand a gas, the mean force exerted by the piston must be less than that exerted by the gas, so that the gas will most likely expand spontaneously.
For example, if we want a probability density ratio of at least 20, we have
p(λ2)/p(λ1) > 20 ≈ exp(3)
and we need to set ΔF/kT < -3. Recalling that ΔF = ΔE - TΔS, that amounts to
ΔS - ΔE/T > 3k
To interpret this relation, note that ΔE is the energy change of the system. Therefore -ΔE is the energy change of the surroundings, which must arise through a gain of heat by the surroundings of -ΔE. If that gain were through a reversible process, then the entropy change in the surroundings would be heat/T = -ΔE/T. However since we are no longer restricting ourselves to reversible processes, the entropy change in the surroundings will be -ΔE/T or greater. Combining, this means that the last inequality says
(Entropy increase in system and surroundings) > 3k
This entropy creation of 3k greatly exceeds the k ln 2 = 0.69k of entropy tracked by Landauer's Principle. Thus, if a computing device is functioning at all, it must be creating quantities of entropy that exceed this Landauer limit with each of its steps.
A probability gradient of 20 is not high. Loosely it means that one in twenty times, the process will not go ahead. That is a high failure rate for a system with many steps. It can only be reduced, however, by including dissipative processes that create still more entropy.
Even though it is not high, implementing a process with ΔF < -3kT requires that we do quite some violence to our system. Imagine, for example, that we want to ensure that a one-molecule gas expands to double its volume. Merely upsetting equilibrium by reducing the restraining force on the piston will not suffice. If we remove the piston entirely so the gas expands into a vacuum, the free energy change is still only -kT ln 2 = -0.69 kT. To drive it up to -3kT would require further machinery that would accelerate the expansion, reinforcing the gas' tendency to expand into a vacuum.
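The gap between what a free expansion achieves and what a ratio of 20 demands can be made explicit with the fluctuation formula. This is a minimal sketch. The function name `probability_ratio` is introduced here, and free energy changes are expressed in units of kT.

```python
import math

def probability_ratio(delta_f_over_kT):
    """p(λ2)/p(λ1) = exp(-ΔF/kT), with ΔF given in units of kT."""
    return math.exp(-delta_f_over_kT)

# Doubling the volume by expansion into a vacuum: ΔF = -kT ln 2.
free_expansion = probability_ratio(-math.log(2))

# The target discussed in the text: ΔF = -3kT.
target = probability_ratio(-3.0)

print(f"free expansion ratio: {free_expansion:.2f}")
print(f"ratio at ΔF = -3kT:   {target:.2f}")
```

The free expansion yields a probability ratio of only about 2, an order of magnitude short of the ratio obtained at ΔF = -3kT.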
At first, this no go result seems too strong to be correct. It says, correctly, that isothermal, reversible processes are impossible among systems with molecular constitution. That impossibility, one might think, would be hard to reconcile with their central place in thermodynamic theory. The origin of thermodynamics lay in the analysis of the efficiency of steam engines. The major discovery is that, as a quite general matter, a steam engine is made more efficient by bringing its processes closer to reversible processes. How are we to make sense of that if fluctuations disrupt reversible processes?
The key fact is that disruptions only interfere at molecular scales. They will cause trouble if we try to build a steam engine with molecular-sized components. But they will not as long as our machines are macroscopically sized.
To see this, let us go back to the fluctuation formula
p(λ2)/p(λ1) = exp(-(F2-F1)/kT) = exp(-ΔF/kT)
We can make the process overwhelmingly likely to advance if we have a free energy imbalance of ΔF/kT < -25. For then
p(λ2)/p(λ1) > 7.2 x 10¹⁰ ≈ exp(25)
However the free energy imbalance of 25kT involves a minute quantity of energy. It is merely the mean thermal energy of ten oxygen molecules. Steam engines are systems that employ over 10²⁵ molecules. That is over 10,000,000,000,000,000,000,000,000. The energy of just ten molecules will be invisible on all scales that matter to steam engines!
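The arithmetic behind these figures is easy to reproduce. The sketch below assumes that each diatomic oxygen molecule carries a mean thermal energy of (5/2)kT, from three translational and two rotational degrees of freedom at ordinary temperatures. That assumption is one way to arrive at the count of ten molecules and is not stated in the text.

```python
import math

# Probability ratio for a free energy imbalance of ΔF = -25 kT.
ratio = math.exp(25)

# Assumption: mean thermal energy of one O2 molecule is (5/2) kT.
energy_per_molecule_kT = 5 / 2
molecules_needed = 25 / energy_per_molecule_kT

# Share of a steam engine's 10^25 molecules that this represents.
fraction_of_engine = molecules_needed / 1e25

print(f"exp(25) = {ratio:.3e}")
print(f"molecules supplying 25 kT: {molecules_needed:.0f}")
print(f"fraction of a 1e25-molecule engine: {fraction_of_engine:.0e}")
```

On these assumptions the energy that secures overwhelming odds of completion corresponds to a vanishing fraction of the molecules in a macroscopic engine.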
The difficulty with the sums of the last two sections is that they overlook intermediate states. If we have a 20 to 1 ratio for the probabilities of the final and initial state, we can say that the final state is 20 times more likely than the initial state. But that does not preclude the system ending up in one of the intermediate states with some other probability.
We can design processes that require the minimum entropy creation by the simple expedient of setting the free energies of the intermediate states so high that they become very improbable when compared with the initial and final states. (For an illustration of how this is possible, see the "bead on a wire" illustration below.)
Assume we have such a case. The initial state, "init," is associated with some small interval of λ and the final state, "fin," is associated with some other small interval of λ. All other states are much less probable. We can now write for the probabilities of the system being in the initial or final state:
Pinit + Pfin = 1
(Note: Pinit is not the probability of the system starting in the initial state; it is the probability that a fluctuation carries it back to the initial state once the process has begun.) It follows that their ratio is given by:
Pfin/Pinit = Pfin/(1-Pfin) = Ofin
where Ofin is just the odds of finding the system in the final state. If the free energy difference between the final and initial states is ΔF, we have
Ofin = exp(-ΔF/kT)
Inverting, we have
-ΔF/T = k ln Ofin
If ΔE and ΔS are the energy and entropy changes of the system, we have from F=E-TS that
-ΔF/T = ΔS - ΔE/T
Since the system can exchange no work with the environment, the change of energy ΔE must result from a transfer of heat of -ΔE to the environment. As a result in the best case, the entropy of the environment is increased by ΔSenv = -ΔE/T. Thus we have that:
-ΔF/T = k ln Ofin = ΔS - ΔE/T = ΔS + ΔSenv = ΔStot
where ΔStot is the total increase of entropy in the system and its environment. From this last series of equalities we recover the main result:
ΔStot = k ln Ofin
It is displayed in larger type because it is the controlling relation.
If we have any process at all, no matter how trivial, that we wish to proceed from some initial to some final state, and we want to be assured of its completion with odds Ofin, then we must create at least k ln Ofin of thermodynamic entropy to achieve it.
This applies to the most trivial of processes. If we just wish to do nothing fancier than move a component from one place to another, to be assured of success, we must overcome fluctuations and that requires the creation of thermodynamic entropy.
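To get a feel for the size of this entropy cost, here is a minimal Python sketch of the relation ΔStot = k ln Ofin. The function names and the sample completion probabilities are illustrative assumptions, not values taken from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K

def odds_from_probability(p_fin: float) -> float:
    """Odds Ofin = Pfin / (1 - Pfin), assuming Pinit + Pfin = 1."""
    return p_fin / (1.0 - p_fin)

def min_entropy_created(odds_fin: float) -> float:
    """Least total entropy creation ΔStot = k ln Ofin, in J/K."""
    return K_BOLTZMANN * math.log(odds_fin)

# Hypothetical completion probabilities chosen only for illustration
for p_fin in (0.5, 0.95, 0.999):
    odds = odds_from_probability(p_fin)
    entropy_in_k = min_entropy_created(odds) / K_BOLTZMANN
    print(f"Pfin = {p_fin}: odds = {odds:.1f}, entropy created = {entropy_in_k:.2f} k")
```

Even odds (Pfin = 0.5) cost nothing, since the process is then as likely to fluctuate back as to complete. Each further increase in the assurance of completion raises the entropy cost logarithmically.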
The bead on a wire provides just about the simplest illustration possible of the no-go result.
Consider a bead that can slide frictionlessly along a horizontal wire.
At each position on the wire, it is in a state of perfect equilibrium. No net force presses it to one side or another. Hence the sequence of states of its possible positions along the wire corresponds to an extremely simple thermodynamically reversible process, in which the bead moves across the wire. To get it to move, as with all thermodynamically reversible processes, we need to introduce a minute disequilibrium. In this case, we would tilt the wire minutely.
Now add in the effect of fluctuations.
For the calculations supporting the results shown here, see "All Shook Up: Fluctuations, Maxwell's Demon and the Thermodynamics of Computation," Section 10.
The bead will have a mean thermal energy of kT/2 and, if the bead is small, this will give it a large speed. If the bead has a mass of 100 amu (that is, the mass of an n-heptane molecule), its rms (root mean square) velocity is 157 m/s at 25 °C. For a one meter wire, that means that the bead is flying to and fro rapidly over the full length of the wire. Its position is uniformly distributed over the wire, as the no-go result asserts.
Now consider a macroscopic bead of mass 5g. Its rms velocity is just 9 x 10^-10 m/s at 25 °C. That is so small as to be indiscernible to us macroscopically. While its motion is also a fluctuation motion qualitatively like that of the small mass, on human time scales, the macroscopic mass is at rest.
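Both velocities follow from setting the mean kinetic energy of motion along the wire, (1/2) m <v^2>, equal to kT/2. The following Python sketch reproduces them; the function name `rms_velocity_1d` is hypothetical and the constants are standard SI values.

```python
import math

K_BOLTZMANN = 1.380649e-23   # Boltzmann's constant in J/K
AMU = 1.66053906660e-27      # atomic mass unit in kg
T_ROOM = 298.15              # 25 °C in kelvin

def rms_velocity_1d(mass_kg: float, temperature_k: float = T_ROOM) -> float:
    """rms velocity along the wire from (1/2) m <v^2> = (1/2) kT."""
    return math.sqrt(K_BOLTZMANN * temperature_k / mass_kg)

v_molecular = rms_velocity_1d(100 * AMU)   # bead of 100 amu
v_macroscopic = rms_velocity_1d(5e-3)      # bead of 5 g

print(f"100 amu bead: {v_molecular:.0f} m/s")
print(f"5 g bead:     {v_macroscopic:.1e} m/s")
```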
What does it take to overcome the fluctuations and ensure that the bead moves from one side of the wire to another? We tilt the wire until the force of gravity along the wire is strong enough to do the job.
For the 5g macroscopic bead, the amount of tilting needed is extremely small. The destination end of a one meter wire needs to be depressed by an amount that turns out to be many orders of magnitude smaller than a hydrogen atom.
For the molecular scale bead of 100 amu, it turns out that no mere tilting is adequate. Rotating the wire so that it is vertical will have a negligible effect on the fluctuations. We would need to impose an external field far greater in strength than terrestrial gravity to pull it down to the end. That is something we could have expected. n-heptane is a volatile liquid, so its molecules are perfectly able to keep themselves in a vapor state under ordinary conditions in a terrestrial gravitational field.
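The contrast between the two beads can be sketched numerically. This is only a rough estimate under stated assumptions, not the calculation of the cited paper: it assumes that "overcoming fluctuations" means a gravitational energy drop m g h of 25 kT (the criterion used earlier for an overwhelmingly likely process), standard terrestrial gravity, and the Bohr radius as the size of a hydrogen atom. All function names are hypothetical.

```python
K_BOLTZMANN = 1.380649e-23   # Boltzmann's constant in J/K
AMU = 1.66053906660e-27      # atomic mass unit in kg
T_ROOM = 298.15              # 25 °C in kelvin
G_EARTH = 9.80665            # standard gravity in m/s^2
BOHR_RADIUS = 5.29e-11       # rough size of a hydrogen atom in m

def drop_needed(mass_kg: float, bias_in_kt: float = 25.0) -> float:
    """Height h (in m) for which m g h equals the assumed bias of 25 kT."""
    return bias_in_kt * K_BOLTZMANN * T_ROOM / (mass_kg * G_EARTH)

def bias_from_drop(mass_kg: float, drop_m: float) -> float:
    """Gravitational energy drop m g h expressed in units of kT."""
    return mass_kg * G_EARTH * drop_m / (K_BOLTZMANN * T_ROOM)

# Macroscopic bead: how far must the far end of the wire be lowered?
h_macro = drop_needed(5e-3)
print(f"5 g bead: lower the end by {h_macro:.1e} m "
      f"({h_macro / BOHR_RADIUS:.1e} hydrogen atom radii)")

# Molecular bead: bias available from a fully vertical one meter wire
bias_vertical = bias_from_drop(100 * AMU, 1.0)
print(f"100 amu bead, vertical 1 m wire: bias = {bias_vertical:.1e} kT")
```

On these assumptions the macroscopic bead needs a drop far smaller than an atom, while for the molecular bead even a vertical wire supplies only a tiny fraction of one kT.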
Finally, the least entropy creation formula ΔStot = k ln Ofin applies to the case in which we make the intermediate stages of a process probabilistically inaccessible. The bead on a wire illustrates how that is possible. We bend the wire so that the bead must pass over a high mountain before descending into the lower valley at the destination end.
As long as the mountain has finite height, there is always some small probability that the bead ascends it and passes over. However the presence of the mountain means that the process will take a long time to complete. We must wait for a rare fluctuation of sufficient size that it can fling the bead over the mountain.
The converse principle says:
Converse Landauer's Principle: All logically reversible operations can be implemented by thermodynamically reversible processes, that is, processes that create no entropy.
So we have an association:
logically reversible goes with thermodynamically reversible
You might expect the direct version of the principle to have the corresponding association:
logically irreversible ??? goes with ??? thermodynamically irreversible
However that is not what the direct principle says.
Landauer's Principle: Erasure of one bit of information passes at least k ln 2 of thermodynamic entropy to the surroundings.
It turns out that there are two cases. If the data is "random," then erasure merely passes entropy to the surroundings in a thermodynamically reversible process. However if the data is "known" or "nonrandom," then the process is thermodynamically irreversible and creates entropy.
Here's Bennett's explanation:
"When truly random data (e.g., a bit equally likely to be 0 or 1) is erased, the entropy increase of the surroundings is compensated by an entropy decrease of the data, so the operation as a whole is thermodynamically reversible.
...[I]n computations, logically irreversible operations are usually applied to nonrandom data deterministically generated by the computation. When erasure is applied to such data, the entropy increase of the environment is not compensated by an entropy decrease of the data, and the operation is thermodynamically irreversible."
Bennett, Charles H. (1988). “Notes on the History of Reversible Computation,” IBM Journal of Research and Development, 32 (No. 1), pp. 16-23.
This literature assumes incorrectly that random data and thermalized data have the same thermodynamic entropy. So when random data is reset, it is thought to undergo a reduction in thermodynamic entropy of k ln 2 for each bit. That then matches the increase in thermodynamic entropy of the surroundings. The same does not hold for known data, so its entropy is not decreased when it is erased. Thus, in this second case, the entropy appearing in the surroundings derives from a thermodynamically irreversible process.
This is puzzling.
-- If the erasure program can make use of the fact that the data is known data, then erasure would merely amount to a rearrangement of which parts of phase space are occupied, so that there need be no net increase in thermodynamic entropy.
-- If the erasure program cannot make use of the fact that the data is known data, then it will do exactly the same thing to random and known data. So the amounts of thermodynamic entropy created in each case should be the same.
All I can do is report this awkwardness and explain the muddled reasoning that leads to it. I cannot make sense of it.
We compute the entropy difference between two states by finding a reversible process that connects the two states. The two states are a cell holding data and a thermalized data cell.
A reversible isothermal expansion connects the state of a molecule confined to one half of the chamber and the molecule free to move through the whole chamber.
To carry out this expansion, we introduce a piston at the midpoint of the chamber. The molecule repeatedly collides with the piston, exerting a mean pressure P on it that is given by the familiar ideal gas law
P = kT/V
where V is the volume of the chamber holding the molecule. The work done by the gas in the course of the expansion is given by the integral
Work = ∫P dV = ∫kT/V dV = kT ln(Vfin/Vinit) = kT ln 2
where the integration varies from V = Vinit to V = Vfin = 2*Vinit, since the expansion is a doubling in volume. This work is captured by the raising of a weight.
A characteristic property of an ideal gas of one or more molecules is that its energy is independent of its volume. Hence the work energy drawn from the gas must be replaced by heat drawn from the surroundings. That is, in the process, heat
Qrev = kT ln 2
is drawn into the gas. The subscript "rev" emphasizes that this is a reversible heating.
Finally, the Clausius definition of entropy change ΔS for a reversible process is
ΔS = ∫dQrev/T = k ln 2
That is, thermalizing a memory device increases its entropy by k ln 2.
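The chain from pressure to work to entropy can be verified numerically. The Python sketch below integrates P dV with a simple midpoint rule and compares the result with kT ln 2. The function name, the step count, the temperature and the arbitrary volume units are all assumptions made for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K
T_ROOM = 298.15             # assumed temperature in kelvin

def expansion_work(v_init: float, v_fin: float,
                   temperature_k: float = T_ROOM, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of ∫P dV with P = kT/V for a one-molecule gas."""
    dv = (v_fin - v_init) / steps
    return sum(K_BOLTZMANN * temperature_k / (v_init + (i + 0.5) * dv) * dv
               for i in range(steps))

# Doubling of volume; only the ratio Vfin/Vinit matters, so units are arbitrary
work = expansion_work(1.0, 2.0)

# The energy of an ideal gas is independent of volume, so Qrev = Work
entropy_change = work / T_ROOM   # Clausius: ΔS = Qrev / T at constant T

print(f"Work / kT = {work / (K_BOLTZMANN * T_ROOM):.6f}  (ln 2 = {math.log(2):.6f})")
print(f"ΔS / k    = {entropy_change / K_BOLTZMANN:.6f}")
```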
There is another way to get the result that equilibrium is expressed by
F(λ) = constant.
It looks rather different from the analysis in the main text, but actually expresses the same relations. Here we see that this equilibrium condition directly expresses the idea that all forces balance perfectly in the equilibrium state.
X(λ) = -dF(λ)/dλ
is the generalized force exerted by a system with free energy F(λ) as it passes through the stages of the process. This is the force that enters into the familiar expression of the first law of thermodynamics:
Change of internal energy U = Heat gained - Work done
dU = dq - X dx = TdS - Xdx
dq = heat gained = TdS for the thermodynamically reversible process.
Work done = Xdx, where the generalized force X moves a generalized displacement dx. One example is that X is pressure and x is volume.
In the familiar case of an ideal gas of n molecules with volume V, the free energy is
F = -nkT ln V + constant(T)
where the constant is different for different T. If the process is an isothermal expansion or contraction with parameter V, then the generalized force is just
X(V) = -(d/dV) (-nkT ln V) = nkT/V
where T is held constant in the differentiation. That is just the ordinary (mean) pressure exerted by an ideal gas. The quantity X(λ) generalizes this for other processes.
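As a check on this differentiation, the following Python sketch computes X(V) = -dF/dV by a central finite difference and compares it with nkT/V. The function names, the step size h and the sample values of n, V and T are assumptions chosen for illustration. The term constant(T) is dropped because it does not affect a derivative taken at fixed T.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K

def free_energy(volume: float, n: int, temperature_k: float) -> float:
    """F = -nkT ln V, omitting constant(T), which drops out of dF/dV."""
    return -n * K_BOLTZMANN * temperature_k * math.log(volume)

def generalized_force(volume: float, n: int, temperature_k: float,
                      h: float = 1e-6) -> float:
    """X(V) = -dF/dV estimated by a central finite difference at fixed T."""
    df = (free_energy(volume + h, n, temperature_k)
          - free_energy(volume - h, n, temperature_k))
    return -df / (2.0 * h)

# Hypothetical sample values: one molecule, V = 2 (arbitrary units), T = 298.15 K
n, volume, temperature_k = 1, 2.0, 298.15
numeric = generalized_force(volume, n, temperature_k)
ideal_gas_pressure = n * K_BOLTZMANN * temperature_k / volume

print(f"-dF/dV = {numeric:.6e}")
print(f"nkT/V  = {ideal_gas_pressure:.6e}")
```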
If the system consists of parts, such as a gas and a piston, then the condition of equilibrium is just that the generalized forces exerted by each part should balance precisely. If the system has parts A and B, that means that
XA(λ) + XB(λ) = 0
Since XA(λ) = -dFA(λ)/dλ and XB(λ) = -dFB(λ)/dλ, this amounts to requiring
0 = dFA(λ)/dλ + dFB(λ)/dλ = (d/dλ) (FA(λ) + FB(λ)) = dF(λ)/dλ
which is the same result as before.
Copyright John D. Norton. January 22, 2010. Revised July 12, 2013.