# Cosmology for Pedestrians

### New kinds of Cash & the connection to the Conservation of Energy And Momentum


It's been difficult to find time to write articles on this blog – what with running a section teaching undergraduates (after 27 years of ${\underline {not \: \: doing \: \: so}}$), as well as learning about topological quantum field theory – a topic I always fancied but knew little about.

However, a trip with my daughter sparked an interesting answer to questions I got at my undergraduate section. I had taken my daughter to the grocery store – she ran out of the car to go shopping and left her wallet behind. I quickly honked and waved, displaying the wallet. She waved back, displaying her phone. And insight struck me – she had the usual gamut of applications on her phone that serve as ways to pay at retailers – who needs a credit card when you have Apple Pay or Google Pay? I clearly hadn't adopted the Millennial ways of life enough to understand that money comes in yet another form, adapted to your cell phone – it isn't only the kind of thing you can see, smell or Visa!

And here is the connection to the Law of Conservation of Energy. There was a set of phenomena that Wolfgang Pauli considered in the 1930s – beta decay. The nucleus was known, and so were negatively charged electrons (the ones emitted in these decays were called $\beta$-particles). People had a good idea of the composition and mass of the nucleus (as being composed of protons and neutrons), of the structure of the atom (with electrons in orbit around the nucleus), and they also understood Einstein's revolutionary conception of the unity of mass and energy. Experimenters were studying nuclear radioactive decay. Here, a nucleus abruptly emits an electron and turns into a nucleus with one more proton and one fewer neutron – roughly the same atomic weight, but with an extra positive charge. This appears to happen spontaneously, but since an electron is produced (and emitted from the atom) in concert with the “creation” of a proton, the change in the total electric charge is $+1 -1 = 0$ – charge is “conserved”. What seemed to be happening inside the nucleus was that one of the neutrons was decaying into a proton and an electron. Now, scientists had constructed rather precise devices to “stop” electrons, thereby measuring their momentum and energy. It was immediately clear that the total energy we started with – the mass-energy of the neutron (which starts out barely moving in the experiment) – was more than the combined energy of the resulting proton (which also is barely moving at the end) and the emitted electron.

People were quite confused about all this. What was happening? Where was the energy going? It wasn’t being lost to heating up the samples (that was possible to check). Maybe the underlying process going on wasn’t that simple? Some people, including some famous physicists, were convinced that the Law of Conservation of Energy and Momentum had to go.

As it turned out, much as I was confused in the car because I had neglected that money could be created and destroyed on an iPhone, people had neglected that energy could be carried away, or brought in, by invisible particles called neutrinos. The neutrino was just a proposal until it was actually discovered in 1956 through careful experiments.

In fact, as has been rather clear since Emmy Noether discovered the connection between symmetries and conservation laws a century ago, getting rid of the Law of Conservation of Energy and Momentum is not that easy. It is connected to the belief that physics (and the result of physics experiments) is the same whether done here, on Pluto, or in empty space outside one of the galaxies in the Hubble Deep Field view! As long as you systematically account for all “known” differences at these locations – the gravity and magnetic field of the earth, your noisy cousin next door, the tectonic activity on Pluto, or small black holes in the Universe's distant past – the fundamental nature of the universe is $translationally \: \: invariant$. So if you discover that you have found some violation of the Law of Conservation of Energy and Momentum, i.e., a perpetual motion machine, remember that you are announcing that there is some deep inequivalence between different points in space and time in the Universe.

The usual story is that if you notice some “violation” of this Law, you immediately start looking for particles or sources that ate up the missing energy and momentum rather than announce that you are creating or destroying energy. This principle gets carried into the introduction of new forms of “potential energy” too, in physics, as we discover new ways in which the Universe can bamboozle us and reserve energy for later use in so many different ways. Just like you have to add up so many ways you can store money up for later use!

That leads to a conundrum. If the Universe has a finite size and has a finite lifetime, what does it mean to say that all times and points are equivalent? We can deal with the spatial finiteness – after all, the Earth is finite, but all points on it are geographically equivalent, once you account for the rotation axis (which is currently where Antarctica and the Arctic are, but really could be anywhere). But how do you account for the fact that time seems to start from zero? More on this in a future post.

So, before you send me mail telling me you have built a perpetual motion machine, you really have to be Divine and if so, I am expecting some miracles too.

### Mr. Olbers and his paradox


Why is the night sky dark? Wilhelm Olbers asked this question, certainly not for the first time in history, in the 1800s.

That’s a silly question with an obvious answer. Isn’t that so?

Let’s see. There certainly is no sun visible, which is the definition of night, after all. The moon might be, but on a new moon night, the moon isn’t, so all we have are stars in the sky.

Now, let’s make some rather simple-minded assumptions. Suppose the stars are distributed equally throughout space, at all previous times too. Why do we have to think about previous times? You know that light travels at $300,000 \: km/s$, so when you look out into space, you also look back in time. So, one has to make some assumptions about the distribution of stars at prior times.

Then, if you draw a shell around the earth, that has a radius $R$ and thickness $\delta R$, the volume of this thin shell is $4 \pi R^2 \delta R$.

Suppose there were a constant density of stars – $n$ stars per unit volume – then this thin shell contains $n 4 \pi R^2 \delta R$ stars. Now the further away a star is, the dimmer it seems – its light spreads out over a sphere around the star. A star at a distance $R$ that emits $I$ units of energy per second will project an intensity of $\frac{I}{4 \pi R^2}$ per unit area at that distance. So a shell (thickness $\delta R$) of stars at a radius $R$ will bombard us (on the earth) with intensity $\frac {I}{4 \pi R^2} n 4 \pi R^2 \delta R$ units of energy per unit area. This is $= I n \ \delta R$ units of energy per unit area.

This is independent of $R$! Since we can do this for successive shells of stars, the brightness of each shell adds. The night sky would be infinitely bright, $IF$ the universe were infinitely big.
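The shell-by-shell bookkeeping above is easy to verify numerically. Here is a minimal Python sketch – the star density $n$, per-star luminosity $I$ and shell thickness are entirely made-up illustrative values – showing that every shell contributes the same $I n \delta R$, whatever its radius:

```python
import math

# Each shell of stars contributes the same intensity I*n*dR at Earth,
# independent of its radius R, so the total grows without bound as
# shells are added. The values of n, I and dR below are made up.

n = 1e-60    # stars per cubic metre (illustrative)
I = 4e26     # watts per star (roughly solar, illustrative)
dR = 1e18    # shell thickness in metres (illustrative)

def shell_intensity(R):
    """Intensity at Earth from a shell of radius R and thickness dR."""
    stars_in_shell = n * 4 * math.pi * R**2 * dR
    intensity_per_star = I / (4 * math.pi * R**2)
    return stars_in_shell * intensity_per_star   # = I * n * dR

near = shell_intensity(1e20)
far = shell_intensity(1e26)
print(near, far)   # the same, whatever the radius
```

The $4 \pi R^2$ in the shell volume cancels the $4 \pi R^2$ dilution of each star's light, which is the whole paradox in one line.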

Some assumptions were made in the above description.

1. We assumed that the stars are distributed uniformly in space, at past times too.
2. We assumed, in particular, isotropy – there are no special directions along which stars lie.
3. We also assumed that stars all shine with the same brightness.
4. We didn’t mention it, but we assumed nothing obscures the light of far away stars, so we are able to see everything.
5. In addition, we also assumed that the universe is infinite in size and that we can see all the objects in it, so it also must have had an infinite lifetime before this moment. Since light travels at a finite speed, light from an object would take some time to reach us; if the universe were infinitely old, we’d see every object in it at the moment we look up.
6. We also assumed that the Universe wasn’t expanding rapidly – in fact with a recession speed for far away stars that increased (at least proportionately, but could be even faster than linear) with their distance. In such a universe, if we went far enough, we’d have stars whose recession speed from us exceeds the speed of light. If so, the light from those stars couldn’t possibly reach us – like a fellow on an escalator trying hard to progress upwards while the escalator goes down.

There are a tremendous number of population analyses of the distribution of light-emitting objects in the universe that make a convincing case (next post!) that the universe is isotropic and homogeneous on enormously large length scales (as in 100 megaparsecs). We don't see the kind of peculiar distributions that would lead us to suspect a conspiracy of the sort implied in point 2.

We have a good idea of the life cycles of stars, and the argument would proceed along the same lines unless there were a systematic diminution of intrinsic brightness as we looked at stars further and further away. Actually, the opposite appears to be true. Stars and galaxies further away had tremendous amounts of hydrogen and little else, and appear to be brighter – much brighter.

If there were actually dust obscuring far away stars, the dust would have absorbed radiation from the stars, heated up, and – once it reached thermodynamic equilibrium – re-emitted radiation of the same amount. So this is not really a valid objection.

The best explanation is that either the universe hasn’t been around infinitely long, or the distant parts are receding from us so rapidly that they are exiting our visibility sphere. Or both.

And that is the start of the study of modern cosmology.

### Fermi Gases and Stellar Collapse – Cosmology Post #6


The most refined Standard Candle there is today is a particular kind of stellar collapse, called a Type Ia supernova. To understand this, you will need to read the previous posts (#1-#5), in particular the Fermi-Dirac statistics argument in Post #5 in the sequence. While this is the most mathematical of the posts, it might be useful to skim over the argument to understand the reason for the amazing regularity in these explosions.

Type Ia supernovas happen to white dwarf stars. A white dwarf is a star that has reached the end of its starry career. It has burnt through its hydrogen fuel, producing all sorts of heavier elements, through to carbon and oxygen. It has also ceased being hot enough to burn carbon and oxygen in fusion reactions. Since these two elements burn rather less efficiently than hydrogen or helium in fusion reactions, the star is dense (there is less radiation pressure from fusion reactions in the interior to counteract the gravitational pressure of the matter, so it compresses itself) and the interior is composed of ionized carbon and oxygen – the negatively charged electrons are pulled out of every atom, the remaining ions are positively charged, and the electrons roam freely through the star. Just as in a crystalline lattice (as in a typical metal), the light electrons are good at screening the positively charged ions from other ions, and the ions in turn screen the electrons from one another; the upshot is that the electrons behave like free particles.

At this point, the star is being pulled in by its own mass and is held up by the pressure exerted by the gas of free electrons in its midst. The “lattice” of positive ions also exerts pressure, but much less, as we will see. The temperature of the surface of the white dwarf is known from observations to be quite high, $\sim 10,000-100,000 \:$ Kelvin. More important, the free electrons in a white dwarf of mass comparable to the Sun's mass (written as $M_{\odot}$) are ultra-relativistic, with energy much higher than their individual rest-mass energy. Remember, too, that electrons are a species of “fermion”, which obey Fermi-Dirac statistics.

The Fermi-Dirac formula is written as

$P(\vec k) = 2 \frac {1}{e^{\frac{\hbar c k - \hbar c k_F}{k_B T}}+1}$

What does this formula mean? The energy of an ultra-relativistic electron, that has energy far in excess of its mass, is

$E = \hbar c k$

where $c k$ is the angular “frequency” corresponding to an electron of momentum $\hbar k$, while $\hbar$ is the “reduced” Planck's constant $(=\frac {h}{2 \pi})$ (here $h$ is the regular Planck's constant) and $c$ is the speed of light. The quantity $k_F$ is called the Fermi wave-vector. The function $P(\vec k)$ is the (density of) probability of finding an electron in the momentum state specified by $\hbar \vec k$. In the study of particles whose wave nature is apparent, it is useful to use the concept of the de Broglie “frequency” $(\nu = \frac{E}{h})$, the de Broglie “wavelength” $(\lambda=\frac {V}{\nu}$, where $V$ is the particle velocity$)$ and $k=\frac{2 \pi}{\lambda}$, the “wave-number” corresponding to the particle. It is customary for lazy people to forget to write $c$ and $\hbar$ in formulas; hence we speak of momentum $k$ for a hyper-relativistic particle travelling at a speed close to the speed of light, when it should really be $h \frac {\nu}{c} = h \frac{V}{\lambda c} \approx \frac{h}{\lambda} = \frac {h}{2 \pi} \frac{2 \pi}{\lambda} = {\bf {\hbar k}}$.

Why a factor of 2? It wasn’t there in the previous post!

From the previous post, you know that fermions don’t like to be in the same state together. We also know that electrons have a property called spin and they can be spin-up or spin-down. Spin is a property akin to angular momentum, which is a property that we understand classically, for instance, as describing the rotation of a bicycle wheel. You might remember that angular momentum is conserved unless someone applies a torque to the wheel. This is the reason why free-standing gyroscopes can be used for airplane navigation – they “remember” which direction they are pointing in. Similarly, spin is usually conserved, unless you apply a magnetic field to “twist” a spin-up electron into a spin-down configuration. So, you can actually have two kinds of electrons – spin-up and spin-down, in each momentum state $\vec k$. This is the reason for the factor of 2 in the formula above – there are two “spin” states per $\vec k$ state.

Let's understand the Fermi wave-vector $k_F$. Since the fermions occupy momentum states two at a time, if they are confined to a cube of side $L$, you can ask how many levels they occupy. They will, like all sensible particles, start occupying levels from the lowest energy level upwards, until all the available fermions are exhausted. The fermions are described by waves, and waves, in turn, are described by wavelengths. You need to classify all the possible ways to fit waves into a cube. Let's look at a one-dimensional case to start.

The fermions need to bounce off the ends of the one-dimensional lattice of length $L$, so we need the waves to be pinned to 0 at the ends. If you look at the above pictures, the wavelengths of the waves are $2L$ (for n=1), $L$ (for n=2), $\frac{2 L}{3}$ (for n=3), $\frac {L}{2}$ (for n=4). In that case, the wavenumbers, which are basically $\frac {2 \pi}{\lambda}$, need to be of the form $\frac {n \pi}{L}$, where $n$ is a positive integer $(1, 2, 3, ...)$.

For a cube of side L, the corresponding wave-numbers are described by $\vec k = (n_x, n_y, n_z) \frac {\pi}{L}$ since a vector will have three components in three dimensions. These wave-numbers correspond to the momenta of the fermions (this is basically what’s referred to as wave-particle duality), so the momentum is $\vec p = \hbar \vec k$. The energy of each level is $\hbar c k$. It is therefore convenient to think of the electrons as filling spots in the space of $k_x, k_y, k_z$.

What do we have so far? These “free” electrons are going to occupy energy levels starting from the lowest, $k = (1, 1, 1) \frac{\pi}{L}$, and so on in a neat, symmetric fashion. In $k$ space, which is “momentum space”, since we have many, many electrons, we can think of them as filling up a sphere of radius $k_F$ in momentum space. This radius is called the Fermi wave-vector. It represents the most energetic of the electrons when they are all arranged as economically as possible – with the lowest possible total energy for the gas of electrons. This would happen at zero temperature (which is the approximation we are going to work with at this time). This comes out of the probability distribution formula (ignore the 2 for this; consider the probability of occupation of levels for a one-dimensional fermion gas). Note that all the electrons are inside the Fermi sphere at low temperature and leak out across $k_F$ as the temperature is raised.
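To see the sharp Fermi sphere and its thermal smearing concretely, here is a small Python sketch that simply evaluates the occupation formula above; the value of $k_F$ and the two temperatures are made up, purely for illustration:

```python
import math

# Evaluate P(k) = 2 / (exp((hbar*c*k - hbar*c*k_F)/(k_B*T)) + 1)
# for an illustrative (made-up) Fermi wave-vector and two temperatures,
# to see the sharp Fermi sphere and its thermal smearing.

HBAR_C = 197.3e-9    # eV * m
K_B = 8.617e-5       # eV / K

def occupation(k, k_F, T):
    x = HBAR_C * (k - k_F) / (K_B * T)
    if x > 500:      # far outside the Fermi sphere: empty
        return 0.0
    if x < -500:     # deep inside: full (both spin states)
        return 2.0
    return 2.0 / (math.exp(x) + 1.0)

k_F = 1e10           # 1/m, illustrative

# cold star: essentially a step at k = k_F
low = [occupation(f * k_F, k_F, 1e4) for f in (0.9, 1.0, 1.1)]
# much hotter: electrons leak out across k_F
high = [occupation(f * k_F, k_F, 1e8) for f in (0.9, 1.0, 1.1)]
print(low)    # close to [2.0, 1.0, 0.0]
print(high)   # intermediate values near 1
```

At the low temperature the occupation is a step function (the filled Fermi sphere); at the high temperature the edge smears out, which is exactly the leaking described above.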

It is remarkable – and you should realize this – that a gas of fermions in its lowest energy configuration has a huge amount of energy. The Pauli principle requires it. If they were bosons, all of them would be sitting at the lowest possible energy level, which couldn't be zero (because we live in a quantum world) but just above it.

What's the energy of this gas? It's an exercise in arithmetic at zero temperature. Is that good enough? No, but it gets us pretty close to the correct answer and it is instructive.

The total energy of the gas of electrons (in a spherical white dwarf of volume $V = \frac {4}{3} \pi R^3$, with $2$ spins per state) is

$E_{Total} = 2 V \int_{k \leq k_F} \frac {d^3 \vec k}{(2 \pi)^3} \hbar c k = V \frac {\hbar c k_F^4}{4 \pi^2}$

The total number of electrons is obtained by just adding up all the available states in momentum space, up to $k_F$

$N = 2 V \int_{k \leq k_F} \frac {d^3 \vec k}{(2 \pi)^3} \rightarrow k_F^3 = 3 \pi^2 \frac {N}{V}$
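The replacement of the discrete sum over standing-wave modes by this integral can be checked by brute force. The Python sketch below (the cutoff $n_{max}$ is an arbitrary choice) counts modes $(n_x, n_y, n_z)$ in the positive octant out to radius $n_{max}$, with two spins each, and compares with $N = V k_F^3/(3\pi^2)$ for $k_F = n_{max} \pi / L$ – the box size $L$ cancels:

```python
import math

# Count standing-wave modes (n_x, n_y, n_z positive integers) inside a
# sphere of radius n_max in "n-space", with 2 spin states per mode, and
# compare with the continuum result N = V k_F^3 / (3 pi^2), where
# k_F = n_max * pi / L and V = L^3 (the box size L cancels).

n_max = 40    # arbitrary cutoff; the agreement improves as it grows
count = 0
for nx in range(1, n_max + 1):
    for ny in range(1, n_max + 1):
        for nz in range(1, n_max + 1):
            if nx * nx + ny * ny + nz * nz <= n_max * n_max:
                count += 1

N = 2 * count                        # two spins per mode
N_formula = math.pi * n_max**3 / 3   # V k_F^3/(3 pi^2), with L cancelled
print(N, N_formula, N / N_formula)   # ratio tends to 1 as n_max grows
```

The discrepancy of a few percent is a surface effect of the finite sphere of modes; it shrinks as $n_{max}$ grows, which is why the integral is a good approximation for the astronomically many electrons in a star.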

We need to estimate the number of electrons in the white dwarf to start this calculation off. That’s what sets the value of $k_F$, the radius of the “sphere” in momentum space of filled energy states at zero temperature.

The mass of the star is $M$. That corresponds to $\frac {M} { \mu_e m_p}$ electrons, where $m_p$ is the mass of the proton (the dominant part of the mass) and $\mu_e$ is the ratio of atomic weight to atomic number – the number of nucleons per electron – for a typical constituent atom in the white dwarf. For a star composed of carbon and oxygen, this is $2$. So, $N = \frac {M}{\mu_e m_p} = \frac {M}{2 m_p}$.

Using all the above

$E_{Total} = \frac {4\pi}{3} R^3 \frac {\hbar c}{4 \pi^2} \left( 3 \pi^2 \frac {M}{\mu_e m_p \frac{4\pi}{3} R^3 }\right)^{4/3}$

Next, the white dwarf has some gravitational potential energy just because of its existence. This is calculated in high school classes by integration over successive spherical shells from $0$ to the radius $R$, as shown below

The gravitational potential energy is

$\int_{0}^{R} (-G) \frac {\left(\frac{4 \pi}{3} \rho_m r^3\right) \left(4 \pi r^2 \rho_m\right)}{r} dr = - \frac{3}{5} \frac {G M^2}{R}$
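This shell-by-shell integral is easy to check numerically. The Python sketch below uses arbitrary units ($G = R = \rho_m = 1$) and a midpoint sum over thin shells:

```python
import math

# Numerical check of the shell-by-shell gravitational self-energy of a
# uniform sphere against the closed form -(3/5) G M^2 / R.
# Arbitrary units: G = R = rho = 1.

G, R, rho = 1.0, 1.0, 1.0
steps = 100000
dr = R / steps
U = 0.0
for i in range(steps):
    r = (i + 0.5) * dr                            # midpoint rule
    m_interior = (4.0 / 3.0) * math.pi * rho * r**3   # mass inside r
    dm_shell = 4.0 * math.pi * r**2 * rho * dr        # mass of the shell
    U -= G * m_interior * dm_shell / r

M = (4.0 / 3.0) * math.pi * rho * R**3
U_exact = -(3.0 / 5.0) * G * M**2 / R
print(U, U_exact)   # agree to high accuracy
```

Each shell's contribution is the interior mass times the shell mass over the radius, exactly the integrand above.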

A strange thing happens if the energy of the electrons (which is called, by the way, the “degeneracy energy”) plus the gravitational energy goes negative. At that point, the total energy becomes even more negative as the white dwarf's radius gets smaller – this can continue ${\it ad \: infinitum}$ – the star collapses. The threshold is found by setting the gravitational potential energy equal to the Fermi gas energy, which leads to

$\frac{3}{5} \frac{G M^2}{R} = \frac {4\pi}{3} R^3 \frac {\hbar c}{4 \pi^2} \left( 3 \pi^2 \frac {M}{\mu_e m_p \frac{4\pi}{3} R^3 }\right)^{4/3}$

the $R$ (radius) of the star drops out and we are left with a unique mass $M$ where this happens – the calculation above gives an answer of $1.7 M_{\odot}$. A more careful calculation, which accounts for the star's actual density profile, gives $1.44 M_{\odot}$.
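Under the stated assumptions (zero temperature, uniform density, fully ultra-relativistic electrons), the algebra of the last equation can be carried through numerically. The Python sketch below solves for the mass at which $R$ drops out, and reproduces the rough $1.7 M_{\odot}$ figure:

```python
import math

# Back-of-the-envelope Chandrasekhar mass: set the zero-temperature
# degeneracy energy equal to the gravitational self-energy; the radius R
# drops out, leaving
#   M^(2/3) = (5/3) * (hbar*c/G) * (3*pi^2)^(4/3) * (3/(4*pi))^(1/3)
#             / (4*pi^2 * (mu_e*m_p)^(4/3))
# which is just the algebra of the equation above.

HBAR = 1.054571817e-34    # J s
C = 2.99792458e8          # m / s
G = 6.67430e-11           # m^3 kg^-1 s^-2
M_P = 1.67262192e-27      # kg (proton mass)
MU_E = 2.0                # nucleons per electron for carbon/oxygen
M_SUN = 1.989e30          # kg

coeff = (5.0 / 3.0) * (3.0 * math.pi**2) ** (4.0 / 3.0) \
        * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0) / (4.0 * math.pi**2)
M_two_thirds = coeff * (HBAR * C / G) / (MU_E * M_P) ** (4.0 / 3.0)
M = M_two_thirds ** 1.5
print(M / M_SUN)          # ~1.7 solar masses
```

Notice that the answer is built entirely out of $\hbar$, $c$, $G$ and the proton mass – a limiting stellar mass from fundamental constants alone.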

The famous physicist S. Chandrasekhar, after whom the Chandra X-ray space observatory is named, discovered this (Chandrasekhar limit) while ruminating about the effect of hyper-relativistic fermions in a white dwarf.

He was on a cruise from India to Great Britain at the time and had the time for unrestricted rumination of this sort!

Therefore, as is often the case, if a white dwarf is surrounded by a swirling cloud of gas of various sorts, or has a companion star that it accretes matter from, it will collapse cataclysmically precisely when this limit is reached. Once one understands the type of light emitted by such a supernova at some nearer location, one has a Standard Candle – it is like having a hand grenade of ${\bf exactly}$ the same quality at various distances. By looking at how bright the explosion appears, you can tell how far away it is.

After this longish post, I will describe the wonderful results from this analysis in the next post – it has changed our views of the Universe and our place in it, in the last several years.

### A digression on statistics and a party with Ms. Fermi-Dirac and Mr. Bose – Post #5


To explain the next standard candle, I need to digress a little into the math of statistics of lots of particles. The most basic kind is the statistics of distinguishable particles.

Consider the following scenario. You’ve organized a birthday party for a lot of different looking kids (no twins, triplets, quadruplets, quintuplets …). Each kid has equal access to a large pot of M&Ms in the center of the room. Each kid can grab M&Ms from the pot and additionally when they bounce off each other while playing, can exchange a few M&Ms with each other. After a long while, you notice that all the M&Ms in the central pot are gone.  Let’s suppose there are truly a ${\underline {large}}$ number of kids $(K)$ and a truly ${\underline {humongous}}$ number of M&Ms $(N)$.

Interesting question – how many M&Ms is each kid likely to have? A simpler question might be – how many kids likely have 1 M&M? How many likely have 2?..How many likely have 55?… How many likely have 5,656,005?…

How do we answer this question?

If you use the notation $n_i$ for the number of kids that have $i$ M&Ms, then we can easily write down

$\sum\limits_{i} n_i = K$

is the total number of kids.

$\sum\limits_i i n_i = N$

is the total number of M&Ms.

But that isn't enough to tell! We need some additional principle to find the most likely distribution of M&Ms (clearly this wouldn't work if I were there; I would have all of them and the kids would be looking at the mean dad that took the pot home, but that's for a different post). The result, which Ludwig Boltzmann discovered at the end of the 19th century, was ${\bf not}$ simply the one where everybody has an equal number of M&Ms. The most likely distribution is the one with the largest number of possible ways to exchange the roles of the kids and still have the same distribution. In other words, maximize the combinatoric number of ways

${\it \Omega} = \frac {K!} {n_1! n_2! n_3! ...n_{5,005,677}! ...}$

which is the way of distributing these kids so that $n_1$ have $1$ M&M, $n_2$ have $2$ M&Ms, $n_3$ have $3$ M&Ms…, $n_{5,005,677}$ have $5,005,677$ M&Ms and so on.

Boltzmann had a nervous breakdown a little after he invented the statistical mechanics, which is this method and its consequences, so don’t worry if you feel a little ringing in your ears. It will shortly grow in loudness!

How do we maximize this ${\it \Omega}$?

The simplest thing to do is to maximize the logarithm of ${\it \Omega}$, which means we maximize

$\log \Omega = \log K! - \sum\limits_{i} \log n_i!$

but we have to satisfy the constraints

$\sum\limits_{i} n_i = K, \hspace{5 mm} \sum\limits_i i n_i = N$

The solution (a little algebra is required here) is that $n_i \propto e^{-\beta i}$ where $\beta$ is some constant for this ‘ere party. For historical reasons and since these techniques were initially used to describe the behavior of gases, it is called the inverse temperature. I much prefer “inverse gluttony” – the lower $\beta$ is, the larger the number of kids with a lot of M&Ms.

Instead of the quantity $i$, which is the number of M&Ms the children have, if we considered $\epsilon_i$, which is (say) the dollar value of $i$ M&Ms, then the corresponding number of kids with value $\epsilon_i$ is $n_i \propto e^{-\beta \epsilon_i}$.

Few kids have a lot of M&Ms, many have very few – so there you go, Socialists, doesn’t look like Nature prefers the equal distribution of M&Ms either.

If you thought of these kids as particles in a gas and $\epsilon_i$ as one of the possible energy levels (“number of M&Ms”) the particles could have, then the fraction of particles that have energy $\epsilon_i$ would be

$n(\epsilon_i) \propto e^{- \beta \epsilon_i}$

This distribution of particles into energy levels is called the Boltzmann distribution (or the Boltzmann rule). The essential insight is that for several ${\bf distinguishable}$ particles the probability that a particular particle is in a state of energy $\epsilon$ is proportional to $e^{-\beta \epsilon}$.
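You can watch the Boltzmann distribution emerge in a toy version of the party. The Python sketch below (all parameters made up) starts every kid with one M&M and lets randomly chosen pairs exchange them; after many swaps, the number of kids holding $i$ M&Ms falls off roughly geometrically in $i$, just as the $e^{-\beta i}$ rule predicts:

```python
import random

# Toy version of the party: K kids each start with one M&M, then randomly
# chosen pairs exchange M&Ms one at a time. The long-run histogram of
# "kids holding i M&Ms" falls off roughly geometrically, as the
# Boltzmann argument predicts. All parameters are made up.

random.seed(42)
K = 5000
kids = [1] * K                       # everyone starts with 1 M&M

for _ in range(500000):
    giver = random.randrange(K)
    taker = random.randrange(K)
    if kids[giver] > 0:              # can't give what you don't have
        kids[giver] -= 1
        kids[taker] += 1

counts = [kids.count(i) for i in range(6)]
print(counts)   # roughly halves at each step, like e^{-beta i}
```

The total number of M&Ms is conserved throughout; only the exchanges randomize who holds them, and that alone produces the exponential fall-off.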

After Boltzmann discovered this, the situation was static till the early $1920s$ when people started discovering particles in nature that were ${\bf indistinguishable}$. It is a fascinating fact of nature that every photon or electron or muon or tau particle is $exactly$ identical to every other photon or electron or muon or tau particle (respectively and for all other sub-atomic particles too). While this fact isn’t “explained” by quantum field theory, it is used in the construction of our theories of nature.

Back to our party analogy.

Suppose, instead of a wide variety of kids, you invited the largest $K$-tuplet the world has ever seen. $K$ kids that ${\bf ALL}$ look identical. They all have the same parents (pity them ${\bf please})$, but hopefully were born in some physically possible way, like test-tubes. You cannot tell the kids apart, so if one of them has 10 M&Ms, its indistinguishable from $any$ of the kids having 10 M&Ms.

Now what's the distribution of the number of kids $n_i$ with $\epsilon_i$ value in M&Ms? The argument I am going to present is one I read on Lubos Motl's blog (I wouldn't be surprised if it is more widely known, given the age of the field) and it is a really cute one.

There are a couple of possibilities.

Suppose there was a funny rule (made up by Ms. Fermi-Dirac, a well known and strict party host) that said that there could be at most $1$ kid that had, say, $\epsilon_i$ value in M&Ms (for every $i$). Suppose $P_0(\epsilon_i)$ were the probability that ${\underline {no}}$ kid had $\epsilon_i$ of value in M&Ms. Then the probability that one kid has $\epsilon_i$ of value in M&Ms is $P_0(\epsilon_i) e^{-\beta \epsilon_i}$ – remember the Boltzmann rule! Now, no other possibility is allowed (and if one kid has that value in M&Ms, it is indistinguishable from any of the other kids, so you can't ask which one it is), so

$P_0(\epsilon_i) + P_0(\epsilon_i) e^{-\beta \epsilon_i} = 1$

since there are only two possibilities, the sum of the probabilities has to be 1.

This implies

$P_0(\epsilon_i) = \frac {1}{1 + e^{-\beta \epsilon_i}}$

And we can find the probability of there being $1$ kid with value $\epsilon_i$ in M&Ms. It would be

$P_1({\epsilon_i}) = 1 - P_0({\epsilon_i}) = \frac {e^{-\beta \epsilon_i}}{1 + e^{-\beta \epsilon_i}}$

The $expected$ number of kids with value $\epsilon_i$ in M&Ms would be

${\bar{\bf n}}(\epsilon_i) = 0 P_0(\epsilon_i) + 1 P_1({\epsilon_i}) = {\bf \frac {1}{e^{\beta \epsilon_i}+1} }$

But we could also invite the fun-loving Mr. Bose to run the party. He has no rules! Take as much as you want!

Now, with the same notation as before, again keeping in mind that we cannot distinguish between the particles,

$P_0(\epsilon_i) + P_0(\epsilon_i) e^{-\beta \epsilon_i} + P_0(\epsilon_i) e^{-2 \beta \epsilon_i} + .... = 1$

which is an infinite (geometric) series. The sum is

$\frac {P_0(\epsilon_i) }{1 - e^{-\beta \epsilon_i} } = 1$

which is solved by

$P_0(\epsilon_i) = 1 - e^{-\beta \epsilon_i}$

The expected number of kids with value $\epsilon_i$ in M&Ms is

${\bar{\bf n}}(\epsilon_i) = 0 P_0(\epsilon_i) + 1 P_0(\epsilon_i) e^{-\beta \epsilon_i} + 2 P_0(\epsilon_i) e^{-2 \beta \epsilon_i} + ...$

which is

${\bar{n}}(\epsilon_i) = P_0(\epsilon_i) \frac {e^{-\beta \epsilon_i} } {(1 - e^{-\beta \epsilon_i})^2} = {\bf \frac {1}{e^{\beta \epsilon_i} -1}}$
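Both closed forms can be checked directly against the probability bookkeeping they came from. The Python sketch below picks an arbitrary value of $\beta \epsilon_i$ and compares the two-term (Fermi-Dirac) and infinite-series (Bose) expected occupations with the formulas above:

```python
import math

# Compare the expected occupations against the probability bookkeeping
# they were derived from, at an arbitrary value of beta*epsilon.

x = 0.7                  # beta * epsilon_i, arbitrary
b = math.exp(-x)         # the Boltzmann factor e^{-beta epsilon_i}

# Fermi-Dirac: only n = 0 or n = 1 allowed, P_0 + P_0*b = 1
P0_fd = 1.0 / (1.0 + b)
n_fd = 0 * P0_fd + 1 * (1 - P0_fd)

# Bose: n = 0, 1, 2, ... with P(n) = P_0 * b^n and P_0 = 1 - b
P0_be = 1.0 - b
n_be = sum(n * P0_be * b**n for n in range(2000))   # truncated series

print(n_fd, 1.0 / (math.exp(x) + 1.0))   # these two agree
print(n_be, 1.0 / (math.exp(x) - 1.0))   # and so do these
```

The Bose series is truncated at 2000 terms, but the geometric tail is so small that the agreement with the closed form is essentially exact.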

Now, here's a logical question. If you followed the argument above, you could ask this – could we perhaps have a slightly less strict host, say Ms. Fermi-Dirac-Bose-2, who allows up to 2 kids to possess a number of M&Ms whose value is $\epsilon_i$? How about a general number $L$ of kids allowed to possess M&Ms of value $\epsilon_i$ (the host being the even more generous Ms. Fermi-Dirac-Bose-L)? More about this on a different thread. But the above kinds of statistics are the only ones Nature seems to allow in our 4-dimensional world (three space and one time). Far more are allowed in 3-dimensional worlds (two space and one time) and that will also be in a different post (the sheer number of connections one can come up with is fantastic!).

The thing to understand is that particles that obey Fermi-Dirac statistics (a maximum of one particle in every energy state) have a “repulsion” for each other – they don't want to be in the same state as another Fermi-Dirac particle, because Nature forces them to obey Fermi-Dirac statistics. If the states were characterized by position in a box, they would want to stay apart. This leads to a kind of outwards pressure. This pressure (described in the next post) is called Fermi-degeneracy pressure – it's what keeps a peculiar kind of dense star called a white dwarf from collapsing onto itself. However, beyond a certain limit of mass (called the Chandrasekhar limit after the scientist who discovered it), the pressure isn't enough and the star collapses on itself – leading to a colossal explosion.

These explosions are the next kind of “standard candle”.

${\bf {Addendum}}$:

I feel the need to address the question I asked above, since I have been asked informally. Can one get new statistics by choosing different values of $L$ in the above party? The answer is “No”. The reason is this – suppose you have $KL$ kids at the party, with a maximum of $L$ kids that can carry M&Ms of value $\epsilon_i$. Then we should be able to divide all our numbers by $L$ (making a scale model of our party that is $L$ times smaller) to get a party with $K$ kids, where a maximum of $1$ kid is allowed to hold M&Ms of value $\epsilon_i$. You'd expect the expected number of kids with M&Ms of value $\epsilon_i$ to be, correspondingly, $L$ times smaller! Then the expected number of particles in a state (with a limit of $L$ particles per state) is just $L$ times the expected number with a limit of $1$ particle per state – nothing new.

So all we have are the basic Fermi-Dirac and Bose statistics (1 or many), in our three-space-dimensional party!

### Cosmology: Cepheid Variables – or why Henrietta couldn’t Leavitt alone …(Post #4)


Having exhausted the measurement capabilities for small angles, to proceed further, scientists really needed to use the one thing galaxies and stars put out in plenty – light. The trouble is, to do so, we either need detailed, correct theories of galaxy and star life-cycles (so we know when they are dim or bright) or we need a “standard candle”. That term needs explanation.

If I told you to estimate how far away a bulb was, you could probably make an estimate based on how bright the bulb seemed. For this you need two things. You need to know how bright the bulb is ${\bf intrinsically}$ – this is the absolute luminosity, and it's measured in watts, i.e., joules per second. Remember, however, that a 100 watt bulb right next to you appears brighter (and hotter) than the same 100 watt bulb ten miles away! To account for that, you could use the fact that the bulb distributes its light almost uniformly over a sphere around itself, to compute what fraction of the light energy you are actually able to intercept – we might have a patch of CCD (like the little sensor inside your video camera) of area $A$ capturing the light emitted by the bulb. Putting these together, as in the figure below, the amount of light captured is $I_{Apparent}$ watts while the bulb puts out $I_{Intrinsic}$ watts.

$I_{Apparent} = I_{Intrinsic} \frac{CCD \: Area}{Sphere \: Surface \: Area}$

$I_{Apparent} = A \frac {I_{Intrinsic}}{4 \pi R^2}$

where, if you dig into your memory, you should recall that the surface area of a sphere of radius $R$ is $4 \pi R^2$! Rearranging, you can compute $R$:

$R = \sqrt{A \frac {I_{Intrinsic}}{4 \pi I_{Apparent}}}$

You know how big your video camera’s sensor area is (it is in that manual that you almost threw away!) You know how much energy you are picking up every second (the apparent luminosity) – you’d need to buy a multimeter from Radio Shack for that (if you can find one now). But to actually compute the distance, you need to know the ${\bf Intrinsic}$ or ${\bf actual}$ luminosity of the light source!
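As a quick sketch of this bookkeeping (the bulb wattage, captured power and sensor area below are made-up numbers, and the function name is mine, chosen for illustration):

```python
import math

def distance_from_flux(intrinsic_watts, captured_watts, sensor_area_m2):
    # Invert I_apparent = A * I_intrinsic / (4 pi R^2) for R (in metres)
    return math.sqrt(sensor_area_m2 * intrinsic_watts
                     / (4 * math.pi * captured_watts))

# A 100 W bulb whose light deposits 1 nanowatt on a 1 cm^2 sensor:
R = distance_from_flux(100.0, 1e-9, 1e-4)  # about 892 metres
```

The point of the exercise: two easily measured local quantities plus one assumed intrinsic luminosity give you a distance.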

That’s the problem! To do this, we need a set of “standard candles” (a light source of known actual luminosity in watts!) distributed around the universe. In fact the story of cosmology really revolves around the story of standard candles.

The first “standard candles” could well be the stars. If you assume you know how far away the Sun is, and if you assume other stars are just like our Sun, then you could make the first estimates of the size of the Universe.

We already know that the method of parallax could be used with the naked eye to calculate the distance to the moon. Hipparchus calculated that distance to be 59 earth radii. Aristarchus measured the distance to the sun (the method is a tour de force of elementary trigonometry and I will point to a picture here as an exercise!)

His calculation of the Earth-Sun distance was only 5 million miles – a fine example of a large experimental error: the one angle he had to measure, $\alpha$, he got wrong by a factor of 20. Of course, he was wise – he would have been blinded if he had tried to be very accurate and look at the sun’s geometric center!

Then, if you blindly used this estimate and ventured bravely on to calculate distances to other stars based on their apparent brightness relative to the sun, the results were startlingly large (and of course, still too small!) – and people knew this as early as 200 B.C. The history of the world might well have been different if people had taken these observers seriously. It was not until the Renaissance in Europe that quantitative techniques for distance measurements to the stars were re-discovered.

The problem with the technique of using the Sun as a “standard candle” is that stars differ quite a bit in their luminosity based on their composition, their size, their age and so on. The classification of stars and the description of their life-cycle was completed with the Hertzsprung-Russell diagram in 1910. In addition, the newly discovered nebulae had been resolved into millions of stars, so it wasn’t clear there was a simple way to think of stellar “standard candles” unless someone had a better idea of the size of these stellar clusters. However, some of the nearby galaxy companions of the Milky Way could have their distances estimated approximately (the Magellanic Cloud, for instance).

Enter Henrietta Leavitt. Her story is moving and representative of her time – from her Radcliffe College education to her \$0.30 / hour salary for her work studying variable stars (she was a human “computer” for her academic boss), to the parsimonious recognition for her work while she was alive. She independently discovered that a class of variable stars called Cepheids, observed in the Magellanic clouds, appeared to obey a universal relation between their intrinsic luminosity and the time period of their brightness oscillation. Here’s a typical graph (Cepheids are much brighter than the Sun and can be observed individually in many galaxies)

If you inverted the graph, you simply had to observe a Cepheid variable’s period to determine the absolute luminosity. Voila! You had a standard candle.
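To make the standard-candle step concrete, here is a minimal sketch. The slope and zero point of the period-luminosity line below are representative round numbers I am assuming purely for illustration (not Leavitt’s actual fit); the distance then follows from the standard distance-modulus relation $m - M = 5 \log_{10}(d / 10\,pc)$.

```python
import math

# Assumed period-luminosity calibration: M = a * (log10(P) - 1) + b
A_SLOPE, B_ZERO = -2.43, -4.05   # illustrative values only

def cepheid_distance_pc(period_days, apparent_mag):
    # Period -> absolute magnitude, via the (assumed) P-L relation
    absolute_mag = A_SLOPE * (math.log10(period_days) - 1) + B_ZERO
    # Distance modulus m - M = 5 log10(d / 10 pc), solved for d
    return 10 ** ((apparent_mag - absolute_mag) / 5 + 1)

# A Cepheid pulsing with a 10-day period, seen at apparent magnitude 10:
d = cepheid_distance_pc(10.0, 10.0)  # roughly 6500 parsecs
```

Observe the period, read off the intrinsic brightness, compare with the apparent brightness – that is the whole trick.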

A little blip occurred in 1952, when Walter Baade discovered that the stars that had set the zero point of the period-luminosity calibration were older, intrinsically dimmer stars (called Population II, as opposed to the younger, brighter Population I Cepheids actually observed in galaxies like Andromeda). When the Luminosity vs. Period graph was redrawn to account for this, it implied those galaxies were actually even further away! The size of the universe roughly doubled (as it turned out) overnight!

Henrietta Leavitt invented the first reliable light-based distance measurement method for galaxies. Edwin Hubble and Milton Humason used data collected mainly from an analysis of Cepheids to derive the equation now known as Hubble’s law.

Next post will be about something called Olbers’ paradox before we start studying the expansion of the Universe, the Cosmic Microwave background and the current belief that we constitute just 4% of the universe  – the rest being invisible to us and not (as far as we can tell) interacting with us.

### Cosmology: Distance Measurements – Parallax (Post #3)

Posted on Updated on

This post describes the cool methods people use to figure out how far away stars and galaxies are. Figuring out how far away your friend lives is easy – you walk or drive at a constant speed in a straight line from your home to their house – then once you know how much time this took, you multiply speed times the time of travel to get the distance to your friend’s house.

This might seem like an excessively detailed description of a simple task, but don’t forget that the ancients would have had difficulty with several parts of it – how do you travel at a constant speed, and how do you measure the time of travel? The first seems feasible, but measuring time is harder. Humans have invented many ways to measure time – water clocks (reported in Greece, China and India), sand clocks, burning knotted ropes. The Antikythera mechanism, if confirmed to be an astronomical device, would be close to a true mechanical clock, but it took the ability to work metal, and the Industrial Revolution, to reliably mass-produce clocks.

This was the most effective way to measure distances for many years; just travel there and keep notes!

The heavenly object closest to us appears to be the moon. Very early – to some extent by Aristarchus, but really by Edmond Halley (whose comet is more famous than he is) – it was realized that parallax could be used to figure out the distance to far-away objects, without actually traveling there. Parallax is illustrated below – it’s the perceived angular shift in an object’s position relative to far-away things when you shift your viewing position. You experience this all the time when you see nearby things shift as you look with one eye, then the other.

The diagram above is a little busy, so let me explain it. $L$ is the distance that we are trying to measure, between the Earth (where the fellow with the telescope is) and the bright blue star. $R$ is the distance to the starry background, that is ${\bf really}$ far away. Since $R$ is ${\bf much}$ bigger than $L$,  you should be able to convince yourself that the angles $\alpha$ and $\beta$ are very close to each other. From basic geometry, to a good approximation

$D = \alpha L$

which means $L = \frac {D}{\alpha}$. We just need to measure $\alpha$, but it is roughly equal to $\beta$, and $\beta$ is just the angular separation of the stars $P$ and $Q$, which you could measure with, for instance, a sextant.

We know $D$, which is the baseline of the measurement. If you use your two eyes, it is a few inches. You could get ambitious and make measurements in summer and winter, when the baseline would be the diameter of the Earth’s orbit (OK, the orbit is very nearly a circle). The result is that you can figure out how far away the bright blue star is by computing the perceived angular shift.
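A minimal sketch of the arithmetic, using the Earth-orbit baseline (the parallax value below is the well-known one for the nearest star, used purely as an illustration):

```python
import math

AU_KM = 1.496e8    # one astronomical unit, in kilometres
LY_KM = 9.461e12   # one light year, in kilometres

def parallax_distance_km(baseline_km, alpha_radians):
    # Small-angle formula from the text: L = D / alpha
    return baseline_km / alpha_radians

# Proxima Centauri shifts by about 0.77 arcseconds against the
# background when the baseline is 1 AU; convert to radians first.
alpha = 0.77 / 3600 * math.pi / 180
L = parallax_distance_km(AU_KM, alpha)
light_years = L / LY_KM   # about 4.2 light years
```

A shift of well under a thousandth of a degree already corresponds to the nearest star – which is why stellar parallaxes went unmeasured for so long.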

The farther away something is, the smaller the perceived angular shift. For a long time, people could not measure angular shifts for really distant objects and made the assumption that the method was wrong for some reason, for they couldn’t believe stars could be that far away.

The state of the art in parallax measurement was the Hipparcos satellite and is currently the Gaia satellite (as well as Hubble). Distances up to 30,000 light years can now be measured this way. For reference, we think the Andromeda galaxy is 2.5 million light years away, and the Milky Way’s dark matter halo extends out to 180,000 light years. Measuring out to those distances needs different techniques, which will be discussed in the next post.

### Cosmology and the Expanding Universe ..(Post #2)

Posted on Updated on

The previous post discussed what Cosmological Red Shift is (and we defined $z$, the red-shift parameter). The saga of cosmology begins with general speculations, over thousands of years, about what those points of light in the sky really were. The construction of the first telescope around 1608, followed by visual explorations (by people like Galileo) of the Moon, Venus, Jupiter, Saturn and their moons, led to increasing certainty that the heavens were made of the same materials as those found on the earth. By the way, it is indeed surprising (as you will see) that to some extent cosmology has come full circle – it appears that the heavens might be composed of different “stuff” than us on Earth.

Anyway, as I alluded to in the first post, the first mystery of modern cosmology was discovered in the light from distant galaxies. If we make the entirely reasonable assumption that those galaxies are composed of stars like our sun, the light from those stars should be similar in composition (the mix of colors, etc.) to the light from our sun. Of course, it was entirely reasonable to expect that some of those stars might be smaller/bigger/younger/older than our sun, so if you had a good idea of how stars produced their light, you could figure out what the light should look like. In the 1910s, 1920s and 1930s, which is the era we are talking about, people didn’t really understand nuclear fusion, so there was some speculation about what made the stars shine. However, one thing was clear – stars contain lots of hydrogen, so we should be able to see the colors (the wavelengths) typical of emission from hot hydrogen atoms. Vesto Slipher was the first to note that the light emitted by the hydrogen (and some other light elements) in the stars of distant galaxies appeared to be red-shifted, i.e., to be redder than expected. This was puzzling if you expected that hydrogen and other elements had the same properties as on the Earth. The most sensible explanation was that the galaxies were receding from the earth. Edwin Hubble did some more work and discovered the famous correlation now known as Hubble’s Law – the more distant a galaxy, the faster it seems to be receding from us. If ${\bf V_{recession}}$ is the recession speed of a far-away galaxy, $D$ is how far away it is and ${\it H_0}$ is Hubble’s constant,

${\bf V_{recession}} = {\it H_0} D$

Hubble’s constant is currently believed to be around $70 \frac {km/sec}{MegaParsec}$. A MegaParsec is a million parsecs – a parsec is a convenient distance unit in cosmology, roughly 3.26 light years. To interpret the formula: if a galaxy were 1 MegaParsec away, it would be rushing away from us at $70 \: km/sec$. In terms of miles, 1 MegaParsec is about 19 million trillion miles.
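As a sketch of the law in action, and of the rough timescale hidden in $H_0$ (the code is just unit conversions; the constants are the approximate values quoted above):

```python
H0 = 70.0               # km/sec per MegaParsec
KM_PER_MPC = 3.086e19   # kilometres in one MegaParsec
SECONDS_PER_YEAR = 3.156e7

def recession_speed_km_s(distance_mpc):
    # Hubble's law: V = H0 * D
    return H0 * distance_mpc

v = recession_speed_km_s(20.0)   # a galaxy 20 Mpc away: 1400 km/sec

# The inverse of H0 (in consistent units) sets a rough expansion
# timescale, which comes out near 14 billion years:
hubble_time_years = KM_PER_MPC / (H0 * SECONDS_PER_YEAR)
```

That $1/H_0$ timescale is no accident – it is a crude first estimate of the age of the Universe, as we will see when we study the expansion properly.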

The story of how Edwin Hubble and others discovered how ${\bf far}$ away the galaxies are (the right side of this equation) is interesting in its own right and features people such as Henrietta Leavitt. This will be the subject of my next post. Probably the best discussion of this is by Isaac Asimov, in a book called “The Guide to  Science – Physical Sciences”.

Getting back to our discussion, we don’t think we are somehow specially located in the Universe. This, by the way, was a philosophical principle that really traces back to the Copernican idea that the Earth wasn’t the Center of the Solar System. If we aren’t in some special place in the Universe, and if we see the galaxies receding away from us, it must be that ALL galaxies are receding from each other with a relative speed proportional to their mutual distance.

Thus was born the theory of the Expanding Universe.

One way to think of the Expanding Universe is to think of a rubber sheet, that is being stretched from all sides. Think of a coordinate system drawn on this rubber sheet, with the coordinates actually marked $1,2,3 ...$. The actual distance between points on the sheet is then, not just the coordinate difference, but a “scale factor” times the coordinate difference. This “scale factor”, which is usually referred to as $a$ in cosmological discussions, is usually assumed to be the same number for all points in space at the same point of time in the Universe’s life.

In this picture, the grid spacing stays 1 as the Universe expands. However, the actual distance between the grid points is $a$ times the grid spacing of 1. In the picture, $a$ is initially 1, but it increases to 4 as the expansion continues.
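The rubber-sheet bookkeeping can be sketched in a couple of lines (a purely illustrative helper; the coordinate values match the grid in the picture):

```python
def proper_distance(coord_a, coord_b, scale_factor):
    # actual distance = scale factor * (coordinate difference)
    return scale_factor * abs(coord_b - coord_a)

# Two grid points separated by 2 coordinate units:
d_early = proper_distance(1, 3, 1.0)  # a = 1 early on -> 2.0
d_late  = proper_distance(1, 3, 4.0)  # a = 4 after expansion -> 8.0
```

Nothing "moved" on the grid – the coordinates of the two points never changed; only the scale factor grew. That is the sense in which all galaxies recede from each other at once.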

Next post, after I talk about distance measurements in the Universe, I’ll discuss the ideas of homogeneity and isotropy – two important concepts  that we use when studying the Universe.