New kinds of Cash & the connection to the Conservation of Energy And Momentum


It's been difficult to find time to write articles on this blog – what with running a section teaching undergraduates (after 27 years of {\underline {not \: \: doing \: \:  so}}), as well as learning about topological quantum field theory – a topic I always fancied but knew little about.

However, a trip with my daughter brought up something that sparked an interesting answer to questions I got at my undergraduate section. I had taken my daughter to the grocery store – she ran out of the car to go shopping and left her wallet behind. I quickly honked at her and waved, displaying the wallet. She waved back, displaying her phone. And insight struck me – she had the usual gamut of applications on her phone that serve as ways to pay at retailers – who needs a credit card when you have Apple Pay or Google Pay? I clearly hadn't adopted the Millennial ways of life enough to understand that money comes in yet another form, adapted to your cell phone, and isn't only the kind of thing you can see, smell or Visa!

And that's the connection to the Law Of Conservation of Energy, in the following way. There was a set of phenomena that Wolfgang Pauli considered in the 1930s – beta decay. The nucleus was known and so were negatively charged electrons (the ones emitted in these decays were called \beta-particles). People had a good idea of the composition and mass of the nucleus (as being composed of protons and neutrons), the structure of the atom (with electrons in orbit around the nucleus) and also understood Einstein's revolutionary conception of the unity of mass and energy. Experimenters were studying the phenomenon of nuclear radioactive decay. Here, a nucleus abruptly emits an electron, then turns into a nucleus with one higher proton number and one fewer neutron, so roughly the same atomic weight, but with an extra positive charge. This appears to happen spontaneously, but in concert with the "creation" of a proton, an electron is also produced (and emitted from the atom), so the change in the total electric charge is +1 -1 = 0 – it is "conserved".  What seemed to be happening inside the nucleus was that one of the neutrons was decaying into a proton and an electron. Now, scientists had constructed rather precise devices to "stop" electrons, thereby measuring their momentum and energy. It was immediately clear that the total energy we started with – the mass-energy of the neutron (which starts out barely moving in the experiment) – was more than the combined energy of the resulting proton (which also isn't moving very much at the end) and the emitted electron.
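To put a rough number on the "missing" energy, here is a minimal sketch in Python of the bookkeeping for the decay of a free neutron, using standard rest-mass energies; the numbers are purely illustrative and not tied to any particular experiment mentioned above.

# Energy bookkeeping for neutron beta decay: n -> p + e (+ something unseen?)
# Rest-mass energies in MeV
m_n = 939.565   # neutron
m_p = 938.272   # proton
m_e = 0.511     # electron

q_value = m_n - (m_p + m_e)   # energy available to be shared as kinetic energy
print(f"Energy released per decay: {q_value:.3f} MeV")
# Experiments saw the electron carrying anything from nearly zero up to this
# maximum, with the balance apparently vanishing -- the energy that, as it
# turned out, the neutrino silently carries away.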

People were quite confused about all this. What was happening? Where was the energy going? It wasn’t being lost to heating up the samples (that was possible to check). Maybe the underlying process going on wasn’t that simple? Some people, including some famous physicists, were convinced that the Law of Conservation of Energy and Momentum had to go.

As it turned out, much like I was confused in the car because I had neglected that money could be created and destroyed in an iPhone, people had neglected that energy could be carried away or brought in by invisible particles called neutrinos. It was just a proposal, till they were actually discovered in 1956 through careful experiments.

In fact, as has been rather clear since Emmy Noether discovered the connection between a continuous symmetry and a conservation law about a century ago, getting rid of the Law of Conservation of Energy and Momentum is not that easy. Conservation of momentum follows from the invariance of physics under translations in space, and conservation of energy from invariance under translations in time. It is connected to a belief that physics (and the result of Physics experiments) is the same whether done here, on Pluto or in empty space outside one of the galaxies on the Hubble deep field view! As long as you systematically get rid of all "known" differences at these locations – the gravity and magnetic field of the earth, your noisy cousin next door, the tectonic activity on Pluto, or small black holes in the Universe's distant past – the fundamental nature of the universe is translationally \: \: invariant. So if you discover that you have found some violation of the Law of Conservation of Energy and Momentum, i.e., a perpetual motion machine, remember that you are announcing that there is some deep inequivalence between different points in space and time in the Universe.

The usual story is that if you notice some “violation” of this Law, you immediately start looking for particles or sources that ate up the missing energy and momentum rather than announce that you are creating or destroying energy. This principle gets carried into the introduction of new forms of “potential energy” too, in physics, as we discover new ways in which the Universe can bamboozle us and reserve energy for later use in so many different ways. Just like you have to add up so many ways you can store money up for later use!

That leads to a conundrum. If the Universe has a finite size and has a finite lifetime, what does it mean to say that all times and points are equivalent? We can deal with the spatial finiteness – after all, the Earth is finite, but all points on it are geographically equivalent, once you account for the rotation axis (which is currently where Antarctica and the Arctic are, but really could be anywhere). But how do you account for the fact that time seems to start from zero? More on this in a future post.

So, before you send me mail telling me you have built a perpetual motion machine, you really have to be Divine and if so, I am expecting some miracles too.

The Normal Distribution is AbNormal


I gave a talk on this topic exactly two years ago at my undergraduate institution, the Indian Institute of Technology, in Chennai (India). The speech is here, with the accompanying PowerPoint presentation, The Normal Distribution is Abnormal And Other Oddities. The general import of the speech was that the Normal Distribution – a statistical distribution that is often used to model random data of a variety of sorts – is often not particularly appropriate at all. I presented cases where this is so and where the assumption (of a normal distribution) leads to costly errors.

Enjoy!

Mr. Olbers and his paradox


Why is the night sky dark? Wilhelm Olbers asked this question, certainly not for the first time in history, in the 1800s.

That’s a silly question with an obvious answer. Isn’t that so?

Let’s see. There certainly is no sun visible, which is the definition of night, after all. The moon might be, but on a new moon night, the moon isn’t, so all we have are stars in the sky.

Now, let's make some rather simple-minded assumptions. Suppose the stars are distributed uniformly throughout space, at all previous times too. Why do we have to think about previous times? You know that light travels at 300,000 \: km/s, so when you look out into space, you also look back in time. So, one has to make some assumptions about the distribution of stars at prior times.

Then, if you draw a shell around the earth, that has a radius R and thickness \delta R, the volume of this thin shell is 4 \pi R^2 \delta R.

Shell Radius R

Suppose there were a constant density of n stars per unit volume; this thin shell then has n 4 \pi R^2 \delta R stars. Now the further away a star is, the dimmer it seems – the light spreads out in a sphere around the star. A star at a distance R that emits I units of energy per second will project an intensity of \frac{I}{4 \pi R^2} per unit area. So a shell (thickness \delta R) of stars at a radius R will bombard us (on the earth) with intensity \frac {I}{4 \pi R^2} n 4 \pi R^2 \delta R units of energy per unit area, which is = I n \ \delta R units of energy per unit area.

This is independent of R! Since we can do this for successive shells of stars, the brightness of each shell adds! The night sky would be infinitely bright, IF the universe were infinitely big.
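Here is a minimal numerical sketch of the shell-by-shell argument above, in Python; the star density n, luminosity I and shell thickness \delta R are arbitrary placeholder values, since only the scaling with distance matters.

# Add up the intensity from successive shells of stars around the earth
import math

n  = 1.0e-3   # stars per unit volume (arbitrary units)
I  = 1.0      # energy emitted per star per second (arbitrary units)
dR = 1.0      # shell thickness

def total_intensity(R_max):
    total = 0.0
    R = dR
    while R <= R_max:
        stars_in_shell     = n * 4 * math.pi * R**2 * dR
        intensity_per_star = I / (4 * math.pi * R**2)
        total += stars_in_shell * intensity_per_star   # = n * I * dR, independent of R
        R += dR
    return total

for R_max in (10, 100, 1000):
    print(R_max, total_intensity(R_max))

The total simply grows in proportion to how far out you count shells – it never converges, which is the paradox.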

Some assumptions were made in the above description.

  1. We assumed that the stars are distributed uniformly in space, at past times too.
  2. We also assumed, in particular, isotropy, so there are no special directions along which stars lie.
  3. We also assumed that stars all shine with the same brightness.
  4. We didn’t mention it, but we assumed nothing obscures the light of far away stars, so we are able to see everything.
  5. In addition, we also assumed that the universe is infinite in size and that we can see all the objects in it, so it also must have had an infinite lifetime before this moment. Since light travels at a finite speed, light from an object would take some time to reach us; if the universe were infinitely old, we’d see every object in it at the moment we look up.
  6. We also assumed that the Universe isn't expanding – in particular, that far-away stars don't recede from us with a speed that increases with their distance (at least proportionally, but possibly even faster than linearly). In such a universe, if we went far enough, we'd have stars whose recession speed from us exceeds the speed of light. If so, the light from those stars couldn't possibly reach us – like a fellow on an escalator trying hard to progress upwards while the escalator goes down.

There are a tremendous number of population analyses of the distribution of light-emitting objects in the universe that make a convincing case (next post!) that the universe is isotropic and homogeneous on enormously large length scales (as in 100 megaparsecs). We don't see the kind of peculiar distributions that would lead us to assume a conspiracy of the sort implied in point 2.

We have a good idea of the life cycles of stars, but the argument would proceed on the same lines, unless we had a systematic diminution of intrinsic brightness as we looked at stars further and further away. Actually, the converse appears to be true. Stars and galaxies further away had tremendous amounts of hydrogen and little else and appear to be brighter, much brighter.

If there were actually dust obscuring far away stars, then the dust would have absorbed radiation from the stars, started to heat up, then would have emitted radiation of the same amount, once it reached thermodynamic equilibrium. This is not really a valid objection.

The best explanation is that either the universe hasn’t been around infinitely long, or the distant parts are receding from us so rapidly that they are exiting our visibility sphere. Or both.

And that is the start of the study of modern cosmology.

The Great American Eclipse of 2017


I really had to see this eclipse – met up with my nephew at KSU, then eclipse chasing (versus the clouds) all the way from Kansas to central and south-east Missouri. The pictures I got were interesting, but I think the videos (and audio) reflect the experience of totality much better. The initial crescent shaped shadows through the “pinholes” in the leafy branches,

With the slowly creeping moon swallowing the sun

Followed by totality

and the sudden disappearance of sunlight, followed by crickets chirping (listen to the sounds as the sky darkens)

I must confess, it became dark and my camera exposure settings got screwed up – no pictures of the diamond ring. Ah, well, better luck next time!

This is definitely an ethereal experience and one worth the effort to see it. Everybody and his uncle did!

Mr. Einstein and my GPS


I promised to continue one of my previous posts and explain how Einstein’s theories of 1905 and 1915 together affect our GPS systems. If we hadn’t discovered relativity (special and general) by now, we’d have certainly discovered it by the odd behaviour of our clocks on the surface of the earth and on an orbiting satellite.

The previous post ended by demonstrating that the time interval between successive ticks of a clock at the earth’s surface \Delta t_R and a clock ticking infinitely far away from all masses \Delta t_{\infty} are related by the formula

\Delta t_{R} =  \Delta t_{\infty} (1 + \frac{ \Phi(R)}{c^2})

The gravitational potential \Phi(R)=-\frac{G M_E}{R} is a {\bf {negative}} number for all R. This means that the time interval measured by the clock at the earth's surface is {\bf {shorter}} than the time interval measured far away from the earth. If you saw the movie "Interstellar", you will hopefully remember that only a few hours passed on Miller's planet (the one with the huge tidal waves) while 23 years passed for the crew member who stayed far away, since Miller's planet was close to the giant Black Hole Gargantua. So time appears to slow down on the surface of the Earth compared to a clock placed far away.

Time for some computations. The mass of the earth is 5.97 \times 10^{24} \: kg, Earth's radius is R = 6370 \: km \: = 6.37\times 10^6 \: meters and G = 6.67 \times 10^{-11} in MKS units. In addition, the speed of light is c = 3 \times 10^8 \frac {m}{s}. If \Delta t_{\infty} = 1 \: sec, i.e., the clock on an orbiting satellite (assumed to be really far away from the earth) measures one second, then the clock at the surface measures

\Delta t_R = (1 \: sec) \times (1 - \frac {(6.67 \times 10^{-11}) \:  \times \:  (5.97 \times 10^{24} )}{(6.37 \times 10^6 )\: (3 \times 10^8 )^2})

this can be simplified to  0.69 \: nanoseconds less than 1 \: sec. In a day, which is  (24 \times 3600) \: secs, this adds up to 6 \times 10^{-5} \: sec = 60 \: \mu \: seconds (microseconds are a millionth of a second).

In reality, as will be explained below, the GPS satellites are operating at roughly 22,000 \: km above the earth's surface, so what's relevant is the {\bf {difference}} in the gravitational potential at 28,370 \: km and 6,370 \: km from the earth's center. That modifies the difference in clock rates to about 0.54 \: nanoseconds per second, or about 46 \: microseconds in a day.
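The arithmetic is easy to check; here is a short Python sketch of it (the 22,000 \: km altitude is the round number used in this post, not an official GPS figure).

# Gravitational clock-rate difference between the earth's surface and a GPS orbit
G   = 6.67e-11          # m^3 kg^-1 s^-2
M_E = 5.97e24           # kg
c   = 3.0e8             # m/s
R_surface   = 6.37e6    # m
R_satellite = 28.37e6   # m (22,000 km altitude plus the earth's radius)

def potential(r):
    return -G * M_E / r

# fractional rate difference between the two clocks
rate_diff = (potential(R_satellite) - potential(R_surface)) / c**2
print(f"surface clock lags by {rate_diff * 1e9:.2f} ns per second")
print(f"which is {rate_diff * 86400 * 1e6:.1f} microseconds per day")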

How does GPS work? The US (other countries too – Russia, the EU, China, India) launched several satellites into a distant orbit 20,000 \: - 25,000 \: km above the earth’s surface. Most of the orbits are designed to allow different satellites to cover the earth’s surface at various points of time. A few of the systems (in particular, India’s) have satellites placed in a Geo-Stationary orbit, so they rotate around the earth with the earth – they are always above a certain point on the earth’s surface. The key is that they possess rather accurate and synchronized atomic clocks and send the time signals, along with the satellite position and ID to GPS receivers.

If you think about how to locate someone on the earth, if I told you I was 10 miles from the Empire State Building in Manhattan, you wouldn’t know where I was. Then, if I told you that I was 5 miles from the Chrysler building (also in Manhattan), you would be better off, but you still wouldn’t know how high I was. If I receive a third coordinate (distance from yet another landmark), I’d be set.  So we need distances from three well-known locations in order to locate ourselves on the Earth’s surface.

The GPS receiver on your dashboard receives signals from three GPS satellites. It knows how far they are, because it knows when the signals were emitted, as well as what the time at your location is.  Since these signals travel at the speed of light (and this is sometimes a problem if you have atmospheric interference), the receiver can compute how far away the satellites are. Since it has distances to three “landmarks”, it can be programmed to compute its own location.

Of course, if its clock was constantly running slower than the satellite clocks, it would constantly overestimate the distance to these satellites, for it would think the signals were emitted earlier than they actually were. This would screw up the location calculation, by the distance travelled by light in 0.54 \: nanoseconds, which is about 0.16 meters. Over a day, this would become 14 kilometers. You could well be in a different city!
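To see how a clock offset turns into a position error, here is a trivial sketch of the numbers in the paragraph above:

# Ranging error = speed of light x clock error
c = 3.0e8                                    # m/s
for clock_error in (0.54e-9, 46e-6):         # one second of drift vs. a full day of drift
    print(f"{clock_error:g} s of clock error -> {c * clock_error:.2f} meters")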

There's another effect – that of time dilation. To explain this, there is nothing better than the thought experiment below, which I think I first heard of from George Gamow's book. As with {\bf {ALL}} arguments in special and general relativity, the only things observers can agree on are the speed of light and (hence) the order of causally related events. That's what we use in the below.

There’s an observer standing in a much–abused rail carriage. The rail carriage is travelling to the right, at a high speed V. The observer has a rather cool contraption / clock. It is made with a laser, that emits photons and a mirror, that reflects them. The laser emits photons from the bottom of the carriage towards the ceiling, where the mirror is mounted. The mirror reflects the photon back to the floor of the car, where it is received by a photo-detector (yet another thing that Einstein first explained!).

Light Clock On Train

The time taken for this up- and down- journey (the emitter and mirror are separated by a length L) is

\Delta t' = \frac{2 L}{c}

That’s what the observer on the train measures the time interval to be. What does an observer on the track, outside the train, see?

Light Clock Seen from Outside Train

She sees the light traverse the path down in blue above. However, she also sees the light traveling at the same (numerical) speed, so she decides that the time between emission and reception of the photon is found using Pythagoras’ theorem

L^2 = (c \frac{\Delta t}{2})^2 - (V \frac {\Delta t}{2})^2

\rightarrow  \Delta t = \frac {2 L}{c} \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}

So, the time interval between the same two events is computed to be larger on the stationary observer’s clock, than on the moving observer’s clock. The relationship is

\Delta t = \frac {\Delta t'}{ \sqrt{1 - \frac{V^2}{c^2}} }

How about that old chestnut – well, isn’t the observer on the track moving relative to the observer on the train? How come you can’t reverse this argument?

The answer is – who’s going to have to turn the train around and sheepishly come back after this silly experiment runs its course? Well! The point is that one of these observers has to actively come back in order to compare clocks. Relativity just observes that you cannot make statements about {\bf {absolute}} motion. You certainly have to accept relative motion and in particular, how observers have to compare clocks at the same point in space.

From the above, 1 second on the moving clock would correspond to \frac {1}{ \sqrt{1 - \frac{V^2}{c^2}} } seconds on the clock by the tracks. A satellite at a distance D from the center of the earth has an orbital speed of \sqrt {\frac {G M_E}{D} } , which for an orbit 22,000 \: km above the earth's surface (28,370 \: km from the earth's center) would be roughly

\sqrt { \frac {(6.67 \times 10^{-11}) \: (5.97 \times 10^{24})}{28370 \times 10^3} } \approx  3700 \: \frac{meters}{sec}

which means that 1 second on the moving clock would correspond to 1 \: sec + 0.078 \: nanoseconds on the clock by the tracks. Over a day, this would correspond to a drift of about 7 \: microseconds, in the {\bf {opposite}} direction to the above calculation for gravitational slowing.
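Again, a small Python sketch of this arithmetic (same round-number orbit as before):

# Kinematic (special-relativistic) time dilation for the orbiting clock
import math

G   = 6.67e-11
M_E = 5.97e24
c   = 3.0e8
R_satellite = 28.37e6                          # m, as above

v = math.sqrt(G * M_E / R_satellite)           # circular orbital speed
gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)      # time-dilation factor
extra_per_sec = gamma - 1.0                    # extra time on the track-side clock per orbiting-clock second
print(f"orbital speed: {v:.0f} m/s")
print(f"extra {extra_per_sec * 1e9:.3f} ns per second on the earth-bound clock")
print(f"about {extra_per_sec * 86400 * 1e6:.1f} microseconds per day, opposite in sign to the gravitational effect")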

Net result – the satellite clocks run faster by roughly 40 microseconds in a day. They need to be continually adjusted to bring them in sync with earth-based clocks.

So, that’s three ways in which Mr. Einstein matters to you EVERY day!

 

Master Traders and Bayes’ theorem


Imagine you were walking around in Manhattan and you chanced upon an interesting game going on at the side of the road. By the way, when you see these games going on, a safe strategy is to walk on, since they usually reduce to methods of separating a lot of money from you in various ways.

The protagonist, sitting at the table, tells you (and you are able to confirm this by a video taken by a nearby security camera run by a disinterested police officer) that he has tossed the same quarter (an American coin) thirty times and managed to get "Heads" {\bf ALL} of those times. What would you say about the fairness or unfairness of the coin in question?

Next, your good friend rushes to your side and whispers to you that this guy is actually one of a really large number of people (a little more than a billion) that were asked to successively toss freshly minted, scrupulously clean and fair quarters. People that tossed tails were "tossed" out at each successive toss and only those that tossed heads were allowed to toss again. This guy (and one more like him) were the only ones that remained. What can you say now about the fairness or unfairness of the coin in question?

What if the number of coin tosses was 100 rather than 30, with a larger number of initial subjects?
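Before reading on, here is a quick, purely illustrative Python sketch of why your friend's information matters: with a big enough starting pool, somebody surviving 30 straight heads with perfectly fair coins is not just possible but expected.

# Chance of a fair coin giving k heads in a row, and how many survivors to expect
def prob_all_heads(k):
    return 0.5 ** k

pool = 2 ** 30                     # "a little more than a billion" starters
p30 = prob_all_heads(30)
print(f"30 heads in a row: probability {p30:.3g} per person")
print(f"expected survivors out of {pool:,} starters: {pool * p30:.1f}")

p100 = prob_all_heads(100)
print(f"100 heads in a row: you'd need about {1 / p100:.3g} starters to expect one survivor")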

Just to make sure you think about this correctly, suppose you were the Director of a large State Pension Fund and you need to invest the life savings of your state's teachers, firemen, policemen, highway maintenance workers and the like. You are told you have to decide whether to allocate some money to a bet made by an investment manager, based on his or her track record (he successively tossed "Heads" a hundred times in a row). Should you invest money on the possibility that he or she will toss "Heads" again? If so, how much should you invest? Should you stay away?

This question cuts to the heart of how we operate in real life. If you cut out the analytical skills you learnt in school and revert to how your "lizard" brain thinks, you would assume the coin was unfair (in the first instance) and express total surprise at the knowledge of the second fact. In fact, even though the second situation could well lie behind every situation of the first sort we encounter in the real world, we would still operate as if the coin was unfair, as our "lizard" brain would instruct us to behave.

What we are doing unconsciously is using Bayes' theorem. Bayes' theorem is the linchpin of inferential deduction and is often misused even by people who understand what they are doing with it. If you want to read a couple of rather interesting books that use it in various ways, read Gerd Gigerenzer's "Reckoning with Risk: Learning to Live with Uncertainty" or Hans Christian von Baeyer's "QBism". I will discuss a few classic examples. In particular, Gigerenzer's book discusses several such, as well as ways to overcome popular mistakes made in the interpretation of the results.

Here’s a very overused, but instructive example. Let’s say there is a rare disease (pick your poison) that afflicts 0.25 \% of the population. Unfortunately, you are worried that you might have it. Fortunately for you, there is a test that can be performed, that is 99 \% accurate – so if you do have the disease, the test will detect it 99 \% of the time. Unfortunately for us, the test has a 0.1 \% false positive rate, which means that if you don’t have the disease, 0.1 \% of such tested people will mistakenly get a positive result. Despite this, the results look exceedingly good, so the test is much admired.

You nervously proceed to your doctor’s office and get tested. Alas, the result comes back “Positive”. Now, ask yourself, what the chances you actually have the disease? After all, you have heard of false positives!

A simple way to turn the percentages above into numbers is to consider a population of 1,000,000 people. Since the disease is rather rare, only (0.25 \% \equiv ) \: 2,500 have the disease. If they are tested, only (1 \% \equiv ) \: 25 of them will get an erroneous "negative" result, so 2,475 get a correct "Positive". However, if the rest of the population were tested in the same way, (0.1 \% \approx ) \: 1,000 people would get a "Positive" result, despite not having the disease. In other words, of the roughly 3,475 people who would get a "Positive" result, only 2,475 actually have the disease, which is roughly 71\% – so this test can only give you a 7-in-10 chance of actually being diseased, despite its seemingly incredible accuracy. The reason is that the "false positive" rate is low, but not low enough to overcome the extreme rarity of the disease in question.

Notice, as Gigerenzer does, how simple the argument seems when phrased with numbers, rather than with percentages. To do this using standard probability theory, one writes, for Events A and B, the probability that A occurs once we know that B has occurred as P(A/B); then

P(A/B) \: P(B) = P(A \: and \: B)

Using this

P(I \: am \: diseased \: GIVEN \: I \: tested \: positive) = \frac {P(I \: test \: positive \: GIVEN \: I \: am \: diseased) \: P(I \: am \: diseased)}{P(I \: test \: positive)}

and then we note

P(I \: am \: diseased) = 0.25\%

P(I \: test \: positive \: GIVEN \: I \: am \: diseased) = 99\%

P(I \: test \: positive) = 0.25 \% \times 99 \% + 99.75 \% \times 0.1 \%

since I could test positive for two reasons – either I really am among the 0.25 \% positive people and additionally was among the 99 \% that the test caught OR I really was among the 99.75 \% negative people but was among the 0.1 \% that unfortunately got a false positive.

Indeed, \frac{0.25 \% \times 99 \%}{0.25 \% \times 99 \% + 99.75 \% \times 0.1 \%} \approx  0.71

which was the answer we got before.
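Here is the same calculation as a minimal Python sketch, using the prevalence, detection rate and false-positive rate quoted above, both by direct counting and via Bayes' theorem.

# Posterior probability of disease given a positive test
prevalence     = 0.0025   # 0.25% of the population has the disease
sensitivity    = 0.99     # P(test positive | diseased)
false_positive = 0.001    # P(test positive | not diseased)

# Counting version, with a population of one million
N = 1_000_000
diseased        = N * prevalence
true_positives  = diseased * sensitivity
false_positives = (N - diseased) * false_positive
print(true_positives / (true_positives + false_positives))   # ~0.71

# Bayes' theorem version
p_positive = prevalence * sensitivity + (1 - prevalence) * false_positive
print(prevalence * sensitivity / p_positive)                  # ~0.71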

The rather straightforward formula I used in the above is one formulation of Bayes’ theorem. Bayes’ theorem allows one to incorporate one’s knowledge of partial outcomes to deduce what the underlying probabilities of events were to start with.

There is no good answer to the question that I posed at the start. It is true that both a fair and an unfair coin could give results consistent with the first event (someone tosses 30 or even 100 heads in a row). However, if one desires that probability has an objective meaning independent of our experience, based upon the results of an infinite number of repetitions of some experiment (the so-called "frequentist" interpretation of probability), then one is stuck. In fact, based upon that principle, if you haven't heard something contrary to the facts about the coin, your a priori assumption about the probability of heads must be \frac {1}{2}. On the other hand, that isn't how you run your daily life. In fact, the most legally defensible (many people would argue the {\bf {only}} defensible) strategy for the Director of the Pension Fund would be to

  • not assume that prior returns were based on pure chance and would be equally likely to be positive or negative
  • bet on the manager with the best track record

At a minimum, I would advise people to stay away from a stable of managers that simply are the survivors of a talent test where the losers were rejected (oh wait, that sounds like a large number of investment managers in business these days!). Of course, the manager that knows they have a good thing going is likely to not allow investors at all for fear of reducing their returns due to crowding. Such managers also exist in the global market.

The Bayesian approach has a lot in common with our every-day approach to life. It is not surprising that it has been applied to the interpretation of Quantum Mechanics and that will be discussed in a future post.


Fermi Gases and Stellar Collapse – Cosmology Post #6


The most refined Standard Candle there is today is a particular kind of Stellar Collapse, called a Type 1a Supernova. To understand this, you will need to read the previous posts (#1-#5), in particular, the Fermi-Dirac statistics argument in Post #5 in the sequence. While this is the most mathematical of the posts, it might be useful to skim over the argument to understand the reason for the amazing regularity in these explosions.

Type 1a supernovas happen to white dwarf stars. A white dwarf is a kind of star that has reached the end of its starry career. It has burnt through its hydrogen fuel, producing all sorts of heavier elements, through to carbon and oxygen. It has also ceased being hot enough to burn Carbon and Oxygen in fusion reactions. Since these two elements burn rather less efficiently than Hydrogen or Helium in fusion reactions, the star is dense (there is less pressure from light being radiated by fusion reactions in the inside to counteract the gravitational pressure of matter, so it compresses itself) and the interior is composed of ionized carbon and oxygen (all the negatively charged electrons are pulled out of every atom, the remaining ions are positively charged and the electrons roam freely in the star). Just as in a crystalline lattice (as in a typical metal), the light electrons are good at keeping the positively charged ions screened from other ions. In addition, they also use them in turn to screen themselves from other electrons; the upshot is, the electrons behave like free particles.

At this point, the star is being pulled in by its own mass and is being held up by the pressure exerted by the gas of free electrons in its midst. The "lattice" of positive ions also exerts pressure, but that pressure is much less, as we will see. The temperature of the surface of the white dwarf is known from observations to be quite high, \sim 10,000-100,000 \: Kelvin. More important, the free electrons in a white dwarf of mass comparable to the Sun's mass (written as M_{\odot}) are ultra-relativistic, with energy much higher than their individual rest-mass energy. Remember, too, that electrons are a species of "fermion", which obey Fermi-Dirac statistics.

The Fermi-Dirac formula is written as

P(\vec k) = 2 \frac {1}{e^{\frac{\hbar c k - \hbar c k_F}{k_B T}}+1}

What does this formula mean? The energy of an ultra-relativistic electron, that has energy far in excess of its mass, is

E = \hbar c k

where c k is the angular "frequency" corresponding to an electron of momentum \hbar k, while \hbar is the "reduced" Planck's constant (=\frac {h}{2 \pi}, where h is the regular Planck's constant) and c is the speed of light. The quantity k_F is called the Fermi wave-vector.  The function P(\vec k) is the (density of) probability of finding an electron in the momentum state specified by \hbar \vec k . In the study of particles where their wave nature is apparent, it is useful to use the concept of the de Broglie "frequency" (\nu = \frac{E}{h}), the de Broglie "wavelength" (\lambda=\frac {V}{\nu} where V is the particle velocity) and k=\frac{2 \pi}{\lambda}, the "wave-number" corresponding to the particle. It is customary for lazy people to forget to write c and \hbar in formulas; hence, we speak of momentum k for a hyper-relativistic particle travelling at speed close to the speed of light, when it should really be h \frac {\nu}{c} = h \frac{V}{\lambda c} \approx \frac{h}{\lambda} = \frac {h}{2 \pi} \frac{2 \pi}{\lambda} = {\bf {\hbar k}}.

Why a factor of 2? It wasn’t there in the previous post!

From the previous post, you know that fermions don’t like to be in the same state together. We also know that electrons have a property called spin and they can be spin-up or spin-down. Spin is a property akin to angular momentum, which is a property that we understand classically, for instance, as describing the rotation of a bicycle wheel. You might remember that angular momentum is conserved unless someone applies a torque to the wheel. This is the reason why free-standing gyroscopes can be used for airplane navigation – they “remember” which direction they are pointing in. Similarly, spin is usually conserved, unless you apply a magnetic field to “twist” a spin-up electron into a spin-down configuration. So, you can actually have two kinds of electrons – spin-up and spin-down, in each momentum state \vec k . This is the reason for the factor of 2 in the formula above – there are two “spin” states per \vec k state.

Let's understand the Fermi wave-vector k_F. Since the fermions can occupy each momentum state at most two at a time (one per spin), if they are forced into a cube of side L, you can ask how many levels they occupy. They will, like all sensible particles, start occupying levels starting with the lowest energy level, going up till all the fermions available are exhausted. The fermions are described by waves and, in turn, waves are described by wavelength. You need to classify all the possible ways to fit waves into a cube. Let's look at a one-dimensional case to start

Fermion Modes 1D

The fermions need to bounce off the ends of the one-dimensional lattice of length L, so we need the waves to be pinned to 0 at the ends. If you look at the above pictures, the wavelengths of the waves are 2L (for n=1), L (for n=2), \frac{2 L}{3} (for n=3), \frac {L}{2} (for n=4).  In that case, the wavenumbers, which are basically \frac {2 \pi}{\lambda}, need to be of the form \frac {n \pi}{L}, where n is a positive integer (1, 2, 3, \ldots).

Fermion Modes 1D Table

For a cube of side L, the corresponding wave-numbers are described by \vec k = (n_x, n_y, n_z) \frac {\pi}{L} since a vector will have three components in three dimensions. These wave-numbers correspond to the momenta of the fermions (this is basically what’s referred to as wave-particle duality), so the momentum is \vec p = \hbar \vec k. The energy of each level is \hbar c k. It is therefore convenient to think of the electrons as filling spots in the space of k_x, k_y, k_z.

What do we have so far? These "free" electrons are going to occupy energy levels starting from the lowest, k = (1, 1, 1) \frac{\pi}{L}, and so on in a neat symmetric fashion. In k space, which is "momentum space", since we have many, many electrons, we could think of them filling up a sphere of radius k_F in momentum space. This radius is called the Fermi wave-vector. It represents the most energetic of the electrons, when they are all arranged as economically as possible – with the lowest possible energy for the gas of electrons. This would happen at zero temperature (which is the approximation we are going to work with at this time). This comes out from the probability distribution formula (ignore the 2 for this, consider the probability of occupation of levels for a one-dimensional fermion gas). Note that all the electrons are inside the Fermi sphere at low temperature and leak out as the temperature is raised (graphs towards the right).
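If you like, you can check this picture by brute force. Here is a toy Python sketch that counts the standing-wave modes (two spin states each) with momenta inside the Fermi sphere for a box of side L, and compares the count with the formula k_F^3 = 3 \pi^2 N/V that appears below; the value of n_{max} is arbitrary, and the small mismatch is a boundary effect that fades as n_{max} grows.

# Count states with k = (n_x, n_y, n_z) * pi / L inside a sphere of radius k_F = n_max * pi / L
import math

n_max = 100
count = 0
for nx in range(1, n_max + 1):
    for ny in range(1, n_max + 1):
        for nz in range(1, n_max + 1):
            if nx * nx + ny * ny + nz * nz <= n_max * n_max:
                count += 2                      # two spin states per allowed k
print("brute-force count          :", count)
print("formula V k_F^3 / (3 pi^2) :", math.pi / 3 * n_max**3)   # equals (pi/3) n_max^3 for this box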

It is remarkable and you should realize this, that a gas of fermions in its lowest energy configuration has a huge amount of energy. The Pauli principle requires it. If they were bosons, all of them would be sitting at the lowest possible energy level, which couldn’t be zero (because we live in a quantum world) but just above it.

What's the energy of this gas? It's an exercise in arithmetic at zero temperature. Is that good enough? No, but it gets us pretty close to the correct answer and it is instructive.

The total energy of the gas of electrons (in a spherical white dwarf of volume V = \frac {4}{3} \pi R^3, with 2 spins per state) is

E_{Total} =  2 V  \int \frac {d^3 \vec k}{(2 \pi)^3} \hbar c k = V \frac {\hbar c k_F^4}{4 \pi^2}

The total number of electrons is obtained by just adding up all the available states in momentum space, up to k_F

N = 2 V \int \frac {d^3 \vec k}{(2 \pi)^3} \rightarrow k_F^3 = 3 \pi^2 \frac {N}{V}
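In case the jump from the integrals to the results above looks too quick, here is the intermediate step (a sketch of the angular and radial integrations, at zero temperature):

N = \frac{2 V}{(2\pi)^3} \int_{0}^{k_F} 4 \pi k^2 \: dk = \frac{V k_F^3}{3 \pi^2}

E_{Total} = \frac{2 V}{(2\pi)^3} \int_{0}^{k_F} (\hbar c k) \: 4 \pi k^2 \: dk = \frac{V \hbar c k_F^4}{4 \pi^2}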

We need to estimate the number of electrons in the white dwarf to start this calculation off. That’s what sets the value of k_F, the radius of the “sphere” in momentum space of filled energy states at zero temperature.

The mass of the star is M. That corresponds to \frac {M} { \mu_e m_p} electrons, where m_p is the mass of the proton (the dominant part of the mass) and \mu_e is the ratio of atomic weight to atomic number (i.e., of nucleons to electrons) for a typical constituent atom in the white dwarf. For a star composed of Carbon and Oxygen, this is 2. So, N = \frac {M}{\mu_e m_p} = \frac {M}{2 m_p}.

Using all the above

E_{Total} = \frac {4\pi}{3} R^3  \frac {\hbar c}{4 \pi^2} \left(  3 \pi^2 \frac {M}{\mu_e m_p \frac{4\pi}{3} R^3 }\right)^{4/3}

Next, the white dwarf has some gravitational potential energy just because of its existence. This is calculated in high school classes by integration over successive spherical shells from 0 to the radius R, as shown below

Spherical Shell White Dwarf

The gravitational potential energy is

\int_{0}^{R} (-) G \frac {\left( \frac{4 \pi}{3} \rho_m r^3 \right) \left( 4 \pi r^2 \rho_m \right)}{r} dr = - \frac{3}{5} \frac {G M^2}{R}

A strange set of things happens if the energy of the electrons (which is called, by the way, the "degeneracy energy") plus the gravitational energy goes negative. At that point, the total energy can become even more negative as the white dwarf's radius gets smaller – this can continue {\it ad \: infinitum} – the star collapses. This starts to happen when the gravitational potential energy is equal in magnitude to the Fermi gas energy; setting the two equal leads to

 \frac{3}{5} \frac{G M^2}{R} = \frac {4\pi}{3} R^3  \frac {\hbar c}{4 \pi^2} \left(  3 \pi^2 \frac {M}{\mu_e m_p \frac{4\pi}{3} R^3 }\right)^{4/3}

the R (radius) of the star drops out and we are left with a unique mass M where this happens – the calculation above gives an answer of 1.7 M_{\odot}. A more careful calculation, which solves for the actual density profile of the star in hydrostatic equilibrium rather than assuming a uniform sphere, gives 1.44 M_{\odot}.
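As a sanity check, here is a Python sketch that solves the energy-balance condition above for M, with all constants in SI units; it reproduces the crude \sim 1.7 M_{\odot} estimate (not the refined 1.44 M_{\odot}).

# Crude limiting mass from degeneracy energy = gravitational binding energy
import math

G     = 6.674e-11    # m^3 kg^-1 s^-2
hbar  = 1.055e-34    # J s
c     = 2.998e8      # m/s
m_p   = 1.673e-27    # kg
mu_e  = 2.0          # nucleons per electron for a C/O white dwarf
M_sun = 1.989e30     # kg

# Equating (3/5) G M^2 / R with the total electron energy, R cancels, leaving
# M^(2/3) = (5/3) (hbar c / (4 pi^2 G)) (3 pi^2)^(4/3) (3/(4 pi))^(1/3) / (mu_e m_p)^(4/3)
coeff = (5.0 / 3.0) / (4 * math.pi**2) * (3 * math.pi**2)**(4.0 / 3.0) * (3.0 / (4 * math.pi))**(1.0 / 3.0)
M = (coeff * hbar * c / G / (mu_e * m_p)**(4.0 / 3.0))**1.5
print(f"limiting mass ~ {M / M_sun:.2f} solar masses")   # ~1.7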

The famous physicist S. Chandrasekhar, after whom the Chandra X-ray space observatory is named, discovered this (Chandrasekhar limit) while ruminating about the effect of hyper-relativistic fermions in a white dwarf.

chandrasekhar

He was on  a cruise from India to Great Britain at the time and had the time for unrestricted rumination of these sorts!

Therefore, as is often the case, if a white dwarf is surrounded by a swirling cloud of gas of various sorts or has a companion star of some sort that it accretes matter from, it will undergo a cataclysmic explosion precisely when this limit is reached. If so, once one understands the type of light emitted from such a supernova from some nearer location, one has a Standard Candle – it is like having a hand grenade that is of {\bf exactly} the same quality at various distances. By looking at how bright the explosion is, you can tell how far away it is.

After this longish post, I will describe the wonderful results from this analysis in the next post – it has changed our views of the Universe and our place in it, in the last several years.