Three pieces – and some puzzles

I just finished a bit of summer reading: two books and an article with very similar scope. The first, by a well-known physical chemist and author, is called “Four Laws that drive the Universe”. The second, by a well-known quantum information expert, is called “Decoding reality: The universe as quantum information”. The third is an article in a philosophy of science journal from July 2019 called “A hot mess”, about what the author believes is a series of fatal flaws in Landauer’s computation of the entropy change due to computation. I discuss these below, and I also thought up some puzzles that might help expand the material.

Atkins’ book (not the nutritional expert, but an Oxford don) is a gem amongst stale books on thermodynamics. He brings the four laws of thermodynamics and their statistical mechanics equivalents to life with lucid descriptions and simple examples. He uses math very judiciously, nothing beyond fractions and algebra, and {\bf {explains}} the material well. It is worth a careful read, not a beach read!

These books (the first, in particular) bring to mind a bunch of puzzles that consumed me when I first learnt cosmology.

Here is {\bf {Puzzle \#1}}.

As we all are aware, the evidence of increasing red-shifts of increasingly distant galaxies points to an expanding universe. When popular books discuss the expanding universe, they immediately say that the universe had a “hot” beginning, and it “cooled, as it expanded”.

Read this peculiar series of posts on an internet site about this topic. This is (yet) another reason why you shouldn’t believe what you read on the internet, including this article, unless you understand it yourself!

In these articles, the analogy is made to a hot gas in a cylinder with a piston, that cools as the piston is withdrawn – the molecules of the gas do work on the piston wall (assume the piston moves out really slowly, so the process does not heat up the piston, just slowly pushes it out). Here’s a picture of the gas-filled container with the piston (marked in green)


and then, here’s a picture of a gas molecule bouncing (elastically, like a good tennis ball) off the piston


It is easy to see why the speeds of the molecule post-bounce off a moving piston are as shown. First, if the piston were {\bf {stationary}} and the molecule approached with velocity {\it V} (+ve to the right, -ve to the left), it would bounce off with velocity {\it -V}, i.e., a speed {\it V} in the opposite direction. The change in velocity of the particle would be -2 \times {\it V}.

Consider the situation if the piston were to move (to the right) with a velocity {\it v}. An observer sitting on the piston (using Galilean/Newtonian relativity), would see that the same particle was approaching the piston with a velocity {\it {V-v}} and leaving it with velocity {\it {-(V -v)}}. Translating this back to the frame of the container (with respect to which the piston is moving to the right, with speed {\it v}), the molecule bounces back with velocity {\it {-V+2 \times v}}. The change in velocity of the molecule is {\it {-2 \times V + 2 \times v}}. Notice the slight difference when you consider a slowly moving piston!
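The frame-hopping argument above can be sketched in a few lines of Python. This is only an illustration: the function name `bounce_velocity` and the numbers are my own choices, and the piston is assumed to be so much heavier than the molecule that its own velocity does not change in the collision.

```python
def bounce_velocity(V, v):
    """Velocity of a molecule after an elastic bounce off a wall moving at v.

    Velocities are positive to the right. The wall is assumed to be
    infinitely heavy compared with the molecule (an idealisation).
    """
    u_in = V - v        # incoming velocity as seen from the piston
    u_out = -u_in       # elastic reflection in the piston's frame
    return u_out + v    # translate back to the frame of the container


V, v = 500.0, 2.0       # illustrative speeds in m/s (assumed values)
after = bounce_velocity(V, v)
print(after)            # -496.0, which is -V + 2*v
print(after - V)        # -996.0, which is -2*V + 2*v
```

Setting `v = 0.0` recovers the stationary-piston result, a bounce from {\it V} to {\it -V}.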

Consider the situation when the piston is stationary and we set {\it v}=0.

If there were {\cal N} collisions of molecules (each of mass m) with the piston per second per unit area of the piston, then, in time \Delta t, the momentum transferred to the piston would be \Delta P = {\cal A} \: m \: {\cal N} \Delta t \times 2 \times {\it {V}}. Here {\cal A} is the area of the piston face. This means the instantaneous force on the (stationary) piston would be {\cal F} = \frac{\Delta P}{\Delta t} = {\cal A} \: m \: {\cal N} \times 2 \times {\it {V}}. The {\bf pressure} on the piston, which is the average force per unit area of the piston’s face, would be {\cal P} = Average \bigg( m \: {\cal N} \times 2 \times {\it {V}} \bigg) .

The upshot is that the pressure on the piston comes from molecules of gas randomly hitting it. This is the “statistical-mechanics” view of the macroscopic quantity “pressure”.

Why does an expanding gas cool? It cools because there is a force from the gas molecules on the piston and when the piston moves to the right, the forces have done work on the piston. This work is produced at the expense of the internal energy of the gas (the molecules are moving slightly slower than they were earlier). To show this, observe (based on the above diagram) that the energy of each molecule goes from \frac{1}{2} m {\it V}^2 to \frac{1}{2} m {\it {(V-2 \times v)}}^2, since each molecule slows down after hitting the piston wall. Then in time \Delta t, the total energy of the gas has gone {\bf {down}} by approximately 2 \times {\cal {NA}} \: m \: {\it {Vv}} \Delta t (which we get by expanding the above square to linear order in {\it v}). Since the pressure is {\cal P} = 2 \times m \: {\cal N} \: {\it V} and the piston sweeps out a volume \Delta {\cal V} = {\cal A} \: {\it v} \: \Delta t, this expression can be re-written as {\cal P} \Delta {\cal V}, where \Delta {\cal V} is the change in the volume of the container.
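Here is a small numerical check of the claim that the energy lost by the molecules matches the work {\cal P} \Delta {\cal V}, to linear order in the piston speed. All the numbers below are made up for illustration (the mass is roughly that of a nitrogen molecule), and every molecule is assumed to hit the piston head-on with the same speed.

```python
m = 4.65e-26     # mass of one molecule in kg (roughly N2, assumed)
V = 500.0        # molecular speed toward the piston in m/s (assumed)
v = 0.01         # piston speed in m/s (assumed, with v much smaller than V)
N = 1.0e27       # collisions per second per unit area (assumed)
A = 0.01         # area of the piston face in m^2 (assumed)
dt = 1.0e-3      # time interval in s (assumed)


def energy_lost_per_bounce(m, V, v):
    """Kinetic energy lost when the speed drops from V to V - 2*v."""
    return 0.5 * m * V**2 - 0.5 * m * (V - 2.0 * v)**2


# Total energy lost by the gas in time dt
collisions = N * A * dt
energy_lost = collisions * energy_lost_per_bounce(m, V, v)

# Work done on the piston, using the stationary-piston pressure P = 2 m N V
pressure = 2.0 * m * N * V
delta_volume = A * v * dt
work = pressure * delta_volume

print(energy_lost, work, energy_lost / work)
```

The ratio printed at the end is 1 - v/V, so the two quantities agree up to the second-order term in {\it v} that was dropped in the expansion.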

So, an expanding gas cools because it performs work on the piston. Where is this “piston” that the contents of the universe performed their work on during the expansion? There is no such thing.

The universe cooled as it expanded because the expansion means something different from what it means for the closed container above. This is explained with a rubber-sheet analogy in a previous post, but to quickly summarize, think of points in the universe like points on a rubber sheet. When the rubber sheet is pulled apart, the points move apart. The {\bf {scale-factor = a(t)}} measures the “scale” of the universe and is usually written as a function of the cosmological time (since the Big-Bang). As the universe expands, if you think of a particle that was travelling at 10 \frac{m}{s}, it now travels a smaller fraction of that distance in the same time, as those meter-grid-points are now further apart! So it travels slower, has lower energy! And if you think of the particle as a wave (think of wave-particle duality), the wavelength of the wave is stretched out as it travels – {\it {ergo}}, the wavelength gets longer, the frequency gets smaller and the energy of the particle represented by the wave gets smaller, in exactly the same way. Hence cosmic expansion “cools” the hot universe.
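To put rough numbers on the stretching argument, here is a sketch that scales a photon's wavelength with the scale factor and uses E = hc/\lambda for its energy. The function names are mine. The numbers (a scale factor of about 1/1101 at the time the microwave background was released, and a present-day background temperature of about 2.725 K) are commonly quoted approximate values rather than anything derived in this post.

```python
PLANCK_H = 6.62607015e-34        # Planck constant in J s
SPEED_OF_LIGHT = 2.99792458e8    # speed of light in m/s


def stretched_wavelength(wavelength_emitted, a_emitted, a_now=1.0):
    """Wavelength after the scale factor grows from a_emitted to a_now."""
    return wavelength_emitted * (a_now / a_emitted)


def photon_energy(wavelength):
    """Photon energy E = h c / wavelength, in joules."""
    return PLANCK_H * SPEED_OF_LIGHT / wavelength


def radiation_temperature(T_now, a, a_now=1.0):
    """Temperature of thermal radiation when the scale factor was a."""
    return T_now * (a_now / a)


a_then = 1.0 / 1101.0            # assumed scale factor at emission
lam_then = 1.0e-6                # assumed emitted wavelength in m
lam_now = stretched_wavelength(lam_then, a_then)

print(lam_now / lam_then)                                 # stretch factor, about 1101
print(photon_energy(lam_now) / photon_energy(lam_then))   # energy ratio, about 1/1101
print(radiation_temperature(2.725, a_then))               # roughly 3000 K
```

The point is that nothing here involves a piston: the energy of each photon drops simply because the wavelength scales with a(t).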

On to {\bf {Puzzle \#2}}.

When one learns thermodynamics, one hears the Clausius definition of entropy. It is a “state-function”, which means it can be uniquely defined at every specific macroscopic state of a system. Such state-functions are valuable since they serve as light-houses for us to compute useful things like how much work can be extracted from a system, or how much heat will accompany such work.

Entropy (or the change thereof) is written as \Delta S = \int \frac{d Q}{T}, where the integral is taken along a reversible path between the two states, dQ is the amount of heat added to the system, and T is the temperature.
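To see what “state-function” means in practice, here is a sketch that takes one mole of an ideal gas between the same two states along two different reversible paths. The setup is my own: a monatomic ideal gas with constant heat capacity C_v = 3R/2, and made-up temperatures and volumes. The heat absorbed depends on the path, but the integral of dQ/T does not.

```python
import math

R = 8.314462618   # gas constant in J/(mol K)


def isothermal_step(n, T, V1, V2):
    """Heat absorbed and integral of dQ/T for a reversible isothermal step."""
    Q = n * R * T * math.log(V2 / V1)
    return Q, Q / T


def isochoric_step(n, Cv, T1, T2):
    """Heat absorbed and integral of dQ/T for heating at constant volume."""
    Q = n * Cv * (T2 - T1)
    return Q, n * Cv * math.log(T2 / T1)


def compare_paths(n=1.0, T1=300.0, T2=600.0, V1=1.0e-3, V2=2.0e-3):
    """Return (heat_A, entropy_A, heat_B, entropy_B) for two paths."""
    Cv = 1.5 * R   # monatomic ideal gas (assumed)

    # Path A: expand at T1, then heat at the final volume
    qa1, sa1 = isothermal_step(n, T1, V1, V2)
    qa2, sa2 = isochoric_step(n, Cv, T1, T2)

    # Path B: heat at the initial volume, then expand at T2
    qb1, sb1 = isochoric_step(n, Cv, T1, T2)
    qb2, sb2 = isothermal_step(n, T2, V1, V2)

    return qa1 + qa2, sa1 + sa2, qb1 + qb2, sb1 + sb2


heat_A, entropy_A, heat_B, entropy_B = compare_paths()
print(heat_A, heat_B)         # different: heat is not a state function
print(entropy_A, entropy_B)   # equal: the integral of dQ/T is
```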

I’ve always wondered, if this is a state-function, why not consider functions whose change is \int \frac{d Q+dU}{T} (dQ, dU are incremental heat added and incremental internal energy), or ones with different powers of T in the denominator? Why aren’t they as useful as the entropy?

Atkins’ book is the only one I have seen that actually mentions anything about variations. The reason the other functions aren’t interesting is not that they fail (for some reason) to be state functions or something like that. It is that the original entropy definition is the only one that can be re-interpreted as being the {\bf {logarithm}} of the number of accessible microscopic states. As it turns out, that is useful for a myriad of other reasons, but it is not the only way to think about thermodynamics.

Here’s an example. In the 1800s a fellow named Carnot proved that a heat engine’s maximum efficiency is achieved by a special kind of engine. For doing so, he used an engine that was run through a cycle of isothermal (constant-temperature) and isentropic (constant entropy) processes. These are depicted in the P-V diagram below. Briefly, one carries out an isothermal expansion, with the gas doing work and absorbing heat from a heat source, followed by isentropic expansion, cooling the gas, but with the gas still performing work. Then one performs isothermal compression, shedding heat to the heat-sink and with work being done on the gas inside the cylinder. This is followed by isentropic compression, with the gas getting warmer. This is roughly the order of operations followed by a four-stroke engine.

The work done by this Carnot engine is equal to the area (depicted by a blue splash) in the above diagram. That area is bounded by different lines that are either isothermal or isentropic processes.

However, if we invented a new quantity, call it “F_{entropy}”, whose change in a process is \Delta F_{entropy} = \int \frac{dQ+dU}{T}, then we would simply find a new set of curves (call them iso-F-entropic processes) in the above P-V graph. Would we find a more efficient F-Carnot engine by this mechanism?

There is a simple argument that we would not. For if we did, we would run the more efficient F-Carnot engine forward, drawing heat from the high-temperature source, doing some work and disposing of some heat in the low-temperature heat-sink. Then we would run the less efficient Carnot engine in reverse, as a refrigerator, using only part of that work to transfer the above disposed-of heat from the low-temperature sink back to the high-temperature source. The heat-sink ends up unchanged, and some work is left over. This would be a machine that would violate the Second Law of Thermodynamics: it would take heat from a single heat source and convert all of it to work. So these other versions of entropy wouldn’t really change anything. It’s enough to use the version that additionally has the connection to the number of microscopic states.
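The bookkeeping behind this kind of argument is easy to check numerically. In the sketch below (the standard textbook form of the argument), a hypothetical engine with efficiency `eta_f` runs forward and a reversible Carnot engine with efficiency `eta_carnot` runs backward as a refrigerator between the same two reservoirs. The function name and the efficiencies are made up for illustration. No such F-engine is claimed to exist, which is the whole point.

```python
def composite_machine(eta_carnot, eta_f, Q_hot=1.0):
    """Energy bookkeeping for a hypothetical engine driving a reversed Carnot engine.

    Returns (net_work, net_heat_from_source, net_heat_to_sink).
    """
    # Hypothetical F-engine running forward
    work_out = eta_f * Q_hot
    Q_cold = Q_hot - work_out               # heat dumped into the sink

    # Reversed Carnot engine pumps exactly Q_cold back out of the sink
    Q_returned = Q_cold / (1.0 - eta_carnot)   # heat delivered to the source
    work_in = Q_returned - Q_cold              # work needed to run it

    net_work = work_out - work_in
    net_heat_from_source = Q_hot - Q_returned
    net_heat_to_sink = Q_cold - Q_cold         # zero by construction
    return net_work, net_heat_from_source, net_heat_to_sink


# Assumed efficiencies: Carnot at 0.5, hypothetical F-engine at 0.6
net_work, from_source, to_sink = composite_machine(0.5, 0.6)
print(net_work, from_source, to_sink)   # about 0.2, about 0.2, and 0.0
```

Whenever `eta_f` exceeds `eta_carnot`, the net work is positive and equals the net heat drawn from the source, with the sink untouched. With `eta_f` equal to `eta_carnot` the net work is zero, as it should be for two reversible engines.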

The last topic I want to discuss, possibly in a future post, has to do with the thermodynamics of computation. Briefly, Rolf Landauer (in the 60s at IBM) deduced that there is an absolute minimum of heat (and entropy) that is generated when a computation is performed. He connected this to something called Maxwell’s demon, which was a thought experiment constructed to explicitly break the Second Law of Thermodynamics. He (Landauer) then showed how ordinary entropy of disorder could be connected to the entropy of information, in a really concrete way. The article referred to above tries to make the case that this connection is weak, primarily because the “message” in information theory is too small, not macroscopic in size. I am not convinced, but think a longer post is essential to discuss it.
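For a sense of scale, the minimum usually quoted under Landauer's name is k_B T \ln 2 of heat per bit erased. The sketch below just evaluates that formula. The function name and the choice of room temperature (300 K) are mine, and whether the bound is as solid as usually claimed is exactly what the article disputes.

```python
import math

BOLTZMANN_K = 1.380649e-23   # Boltzmann constant in J/K


def landauer_limit(temperature, bits=1):
    """Minimum heat in joules for erasing `bits` bits at the given temperature."""
    return bits * BOLTZMANN_K * temperature * math.log(2)


print(landauer_limit(300.0))         # about 2.87e-21 J per bit
print(landauer_limit(300.0, 8e9))    # erasing a gigabyte: about 2.3e-11 J
```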