The Indian musical drums

There is a well-known paper by the famous scientist and Nobel laureate C. V. Raman about the harmonic drums of India – the mridangam and the tabla. While the paper was written in the 1930s, it is quite detailed and refreshing in its clear description of how these instruments work. This post attempts to popularize his description.

The idea of a tuned drum is quite alien to music outside India. The general feeling is that drums are percussive instruments, while tuning is something performed on instruments that are musical in nature. But it is clear that the principle of all musical instruments is some kind of stretched “thing”, which is forced to vibrate with a fundamental and several harmonics – the various harmonics being excited by the design of the instrument. Hence, for instance, note the excitement over expensive violins and cellos – they are expensive precisely because they have a particularly pleasing combination of harmonics when the air columns in the instrument vibrate sympathetically with a bow plied over the stretched string.

A stretched membrane, as found in a regular (timpani) drum, is not particularly harmonious sounding except for its percussive, repetitive property. The modes of the uniform, circular membrane can be solved – the solution is an exercise in solving the wave equation in polar coordinates (the answers are Bessel functions). The modes are described in the paper referred to above – a modern, color picture is below.

The dashed lines represent nodes of the membrane, where the membrane isn’t moving. The adjacent segments, across the nodes, move in opposite directions. The fundamental note is the mode named {\bf 01} in the above – the whole membrane vibrates as a whole – to excite this mode, you would bang the drum right in the center. The next mode is the mode named {\bf 11}; it has a frequency 1.59 times the frequency of the fundamental. This is a little more difficult to create. You’d have to find a way to limit the vibration down a diameter, then bang the drum a quarter of the way away from the diametrical line to excite that mode. However, its frequency is 1.59 times the fundamental – is that any good?
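The ratio of 1.59 quoted above can be checked numerically. For an ideal uniform circular membrane, the mode frequencies are proportional to the zeros of Bessel functions, so the ratio of mode 11 to mode 01 is the first zero of J_1 divided by the first zero of J_0. The sketch below is only an illustration under that ideal-membrane assumption; the helper names `bessel_j` and `bessel_zero` are made up for this example, and the mode label "mn" is read as m nodal diameters and the n-th zero.

```python
import math

def bessel_j(m, x, terms=40):
    """Bessel function J_m(x) from its power series (adequate for x below about 10)."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + m)) * (x / 2) ** (2 * k + m)
        for k in range(terms)
    )

def bessel_zero(m, n, step=0.01):
    """n-th positive zero of J_m: scan for a sign change, then refine by bisection."""
    x, found = step, 0
    while x < 20:
        if bessel_j(m, x) * bessel_j(m, x + step) < 0:
            found += 1
            if found == n:
                lo, hi = x, x + step
                for _ in range(60):
                    mid = (lo + hi) / 2
                    if bessel_j(m, lo) * bessel_j(m, mid) <= 0:
                        hi = mid
                    else:
                        lo = mid
                return (lo + hi) / 2
        x += step
    raise ValueError("zero not found in the scanned range")

# Mode label -> (m, n); frequencies are quoted as multiples of the 01 mode.
modes = {"01": (0, 1), "11": (1, 1), "21": (2, 1), "02": (0, 2)}
fundamental = bessel_zero(0, 1)
ratios = {name: bessel_zero(m, n) / fundamental for name, (m, n) in modes.items()}

for name, ratio in ratios.items():
    print(f"mode {name}: {ratio:.3f} x fundamental")
```

None of the ratios beyond the fundamental is a simple fraction, which is the numerical version of the statement that a plain membrane does not sound harmonious.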

Now, if you know anything about how musical notes are organized, you will understand that there is a whole lot of personal choice involved in what combination of notes in a scale sound “pleasing”. The only uniformity is the central, organizing parts of the scale.

The central note in the scale could be chosen as, say the middle “C”, in Western music. In Indian music, any note could be chosen as the “reference” note – the center of “your” octave.

Then, the note whose frequency is double the frequency of your centered note is the upper end of this “middle” octave. It should really be called a “duplex”, but it is (by Western tradition at least), the eighth note (inclusive) from the middle “C”, so the name is not totally inappropriate.

The note with 50 \% higher frequency, the “Pa” in Indian music (“so” in Western music), is particularly pleasing when played with the central (or “reference”) note.

This principle, as described, is very common in all forms of Indo-European musical traditions, which Indian and Western styles belong to (as do Iranian and Middle-Eastern music). In fact, in the Indian musical drone (the tanpura), the four strings play the “reference” note, its “Pa” (the note of 50\% higher frequency) and the upper octave “reference” (double the frequency of the “reference” note).

The other notes are between the “reference”, the “pa” and the upper octave “reference” (of double frequency). The complexity of Indian music (and indeed other kinds of music in the Middle East, for instance) is buried in the larger number of notes “in-between” compared to Western music. Western music has three flat notes between the “reference” and the “so”, then two more between “so” and the upper octave’s start. South Indian music has closer to twenty one; I have never bothered to count. In addition, notes sound different because the “attack” (how the note is approached) and “gamaka” (how the note is shaken) are different for different ragas.

The intermediate notes are picked in different ways. The “equal tempered” scale (with the same ratio, the twelfth root of 2, between successive notes) favored by Western orchestral instruments is a “medium” to allow different instruments to play together. The “harmonic” scale, with simple fractional ratios between various notes, sounds better (and has sounded better since the days of Pythagoras) and is the basis of most non-Western musical traditions.
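To see how close the two scales are, here is a small sketch that compares a few equal-tempered notes with simple-fraction ratios. The 3/2 for “Pa” and the 2 for the upper octave come from the text above; the fractions 5/4 and 4/3 for the third and fourth are a common choice that I am assuming for illustration, not something stated in this post. Differences are given in cents (hundredths of an equal-tempered semitone).

```python
import math
from fractions import Fraction

# name -> (simple-fraction ratio, number of equal-tempered semitones above the reference)
notes = {
    "Sa (do)": (Fraction(1, 1), 0),
    "Ga (mi)": (Fraction(5, 4), 4),    # assumed ratio, for illustration
    "Ma (fa)": (Fraction(4, 3), 5),    # assumed ratio, for illustration
    "Pa (so)": (Fraction(3, 2), 7),
    "upper Sa": (Fraction(2, 1), 12),
}

def equal_tempered_ratio(semitones):
    """Frequency ratio after the given number of steps of the twelfth root of 2."""
    return 2 ** (semitones / 12)

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

deviation = {}
for name, (just_ratio, semitones) in notes.items():
    tempered = equal_tempered_ratio(semitones)
    deviation[name] = cents(tempered / float(just_ratio))
    print(f"{name:9s} just = {float(just_ratio):.4f}  tempered = {tempered:.4f}  "
          f"difference = {deviation[name]:+.2f} cents")
```

The tempered “Pa” comes out about 2 cents flat of the pure 3/2, while the tempered third is about 14 cents sharp of 5/4, which is one way to quantify the compromise the equal-tempered scale makes.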

Now, let’s look at how the mridangam produces sound. Look at its vibrational modes, first (as detailed in Raman’s paper),

The numbers below each mode are the frequencies of each mode, as multiples of the fundamental frequency. Note that they are all “pleasing” multiples of the fundamental. There are nine such modes. There are higher modes, but since they involve vibrations with shorter distances between the “nodes”, they are easy to suppress by weighting the stretched membrane near its anchors at the ends of the cylindrical case.

The purpose of the black iron-oxide/gum paste at the center of the mridangam and tabla face, as well as the width of the border of the membrane, is to ensure the frequencies are as above and not as is usual for a plain stretched membrane. Note also, that to excite the various modes, you need to place some fingers on the membrane at some places, while striking it (“smartly”, to quote Raman) at other specific spots. In addition, if you have watched players tuning a mridangam, you will have noticed that they rotate it on their thighs while testing the tension in the sixteen strings holding the membranes in place. This is because if the tension is not uniform, you might not get the appropriate harmonic unless you hit the “right” diametric spots. That becomes difficult to adjust to, especially when you are playing in situations where the temperature varies from place to place (from outside to inside an auditorium for instance).

Learning to play involves understanding how to strike the membrane, as well as producing adequate volume from the places you do strike. In addition, there are many variables that can be played with – the material used for the face, the diameter of the instrument, the attachment of the face to the body, the exact composition of the material used for the central black patch. All these serve to change the frequency of the fundamental, to suit the pitch of the singer that these instruments are supposed to accompany.

There are some videos that describe the techniques in detail on YouTube, linked to here and here.

{\bf Note added post publication:}

A couple of Indian physicists (then graduate students) studied the frequencies produced by a real mridangam using discrete Fourier transform techniques (sampling the sound at 200 \mu s intervals) as well as a computer simulation of a “loaded” stretched membrane. They report that they notice the general pattern of a fundamental that is 7-10 \% higher in frequency than expected, but with the harmonics related by integer fractions to each other, i.e., the frequencies are 1.075, 2, 3, 4.025… in one experiment. This result appears to be borne out in the numerical simulation they perform and they see similar results for the ratio between the fundamental and the harmonics, as well as between harmonics. They speculate that audiences are simply not able to discern the difference between the expected fundamental and the “real” fundamental.

First, the numerical simulation is of a simple membrane with a denser central region that mimics the iron-oxide spot in the center of the mridangam’s face. However, the mridangam also has a third stretched membrane under the basic membrane, separated by short wooden sticks. This is a rather complex setup that doesn’t precisely match the simulated system.

Also, given the extreme sophistication of audiences, as well as performers (and especially critics!), in discerning “sruti” lapses in performances, it is frankly hard to believe that a 10 \% error in the tuned fundamental would not be noticed (this would imply the “reference” note is a sharp (second) “ri” that is played with the upper octave’s “sa” – this is discernibly dissonant!). Tuning a mridangam is difficult and keeping it tuned is hard and it is not clear from the paper what methods were used to keep the instrument (whose sound was sampled) tuned after it was initially set up. This needs more research, maybe another such study – stay tuned.
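For readers who have not seen the kind of analysis mentioned in the note above, here is a minimal sketch of a discrete Fourier transform applied to a sound sampled every 200 microseconds. The signal is synthetic and entirely hypothetical (it is not data from the study): I have simply built it from partials at 1.075, 2 and 3 times a nominal 200 Hz, mimicking the reported pattern, so that the code has something to find.

```python
import cmath
import math

SAMPLE_RATE_HZ = 5000      # one sample every 200 microseconds
N = 1000                   # 0.2 s of signal, so the frequency resolution is 5 Hz
NOMINAL_HZ = 200.0         # hypothetical "expected" fundamental

# Synthetic partials (frequency in Hz, amplitude): 1.075x, 2x and 3x the nominal pitch
partials = [(215.0, 1.0), (400.0, 0.6), (600.0, 0.4)]
signal = [
    sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE_HZ) for f, a in partials)
    for n in range(N)
]

def dft_magnitudes(samples):
    """Magnitude of the discrete Fourier transform for the lower half of the bins."""
    count = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * m / count) for m in range(count)]
    return [
        abs(sum(samples[n] * twiddle[(k * n) % count] for n in range(count)))
        for k in range(count // 2)
    ]

magnitudes = dft_magnitudes(signal)

# Pick the three strongest bins and convert bin numbers to frequencies
strongest = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)[:3]
peak_freqs = sorted(k * SAMPLE_RATE_HZ / N for k in strongest)

for f in peak_freqs:
    print(f"{f:.0f} Hz = {f / NOMINAL_HZ:.3f} x nominal")
```

With a real recording the peaks would be broader and noisier, and deciding what counts as the “expected” fundamental is exactly the issue raised above.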

Pics. courtesy:

Stretched membrane:

Tanpura: Ravi Maharjan

The Rule of 72 – and what does the Swiss National Bank have to do with it

I was listening to an academic talk and someone mentioned the “Rule of 72”. Apparently invented by Einstein, it is a simple numerical approximation that helps you understand the power of compound interest. This, according to legend, became popular when interest rates offered on deposits by the Swiss National Bank dropped to 2-3 \%  in the 1930s. Only the Germans appeared to be suffering hyperinflation, the Swiss clearly weren’t (though that was before the advent of modern monetary policy, which made the connection between interest rates and inflationary expectations).

Einstein is also touted as the source of a quote on compound interest – “Compound interest is the eighth wonder of the world. He who understands it, earns it…he who doesn’t …pays it”. By the way, I have seen several of the physical wonders of the world and have learnt several of the wonders of theoretical physics. While, for instance, the power of the exponential is to be seen to be believed (read Perelman’s book for how to hold a ship with a few loops of rope around a post, as in the image above), one can see it also in the ability of tiny humans to combine forces to build buildings as big as the Pyramid of Khufu, the Madurai Meenakshi temple, or the Burj Khalifa building in Dubai (below).


I would point to those things, rather than the mere accumulation of interest, as a more picturesque depiction of the power of the exponential. Also, as the website seems to indicate, the attribution of the quote to Einstein might be an urban legend.

For what it is worth, the Rule of 72 is as follows. If you want to know how long it will take to double your money at an interest rate of X \%, the number of years is \frac{72}{X}. Obviously, this rule doesn’t apply if the interest rate is negative, as has been the case in some European countries in the last several years.

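A quick check on Excel tells you that, at low interest rates like the ones mentioned above, a better approximation is to use 69 or 70 in the numerator (the small-rate limit is 100 \ln 2 \approx 69.3); 72 is closest for rates around 8 \%. Using 72 also harkens back to a predilection for multiples of 12, something that dates back to Babylonian times.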
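The same check can be done outside Excel. The sketch below compares the exact doubling time for annual compounding, \ln 2 / \ln(1 + X/100), with the rule-of-thumb value for a few numerators; the function names are my own and the choice of annual compounding is an assumption.

```python
import math

def exact_doubling_years(rate_percent):
    """Years to double at the given annual rate, compounded once a year."""
    return math.log(2) / math.log(1 + rate_percent / 100)

def rule_of_thumb_years(rate_percent, numerator=72):
    """The 'Rule of 72' estimate (or 69.3, 70, ... if another numerator is passed)."""
    return numerator / rate_percent

print("rate   exact   72/X   70/X   69.3/X")
for rate in (1, 2, 3, 5, 8, 10, 12):
    print(f"{rate:3d}%  {exact_doubling_years(rate):6.2f} "
          f"{rule_of_thumb_years(rate):6.2f} "
          f"{rule_of_thumb_years(rate, 70):6.2f} "
          f"{rule_of_thumb_years(rate, 69.3):6.2f}")
```

At 2 \% the exact answer is about 35 years (so 70 wins), while at 8 \% it is about 9 years (so 72 wins).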

You can also use this formula to deduce the ruinous effects of inflation.

A complete description, that I found after preparing this article, is to be found in this page. The Rule of 72 actually appears in articles aimed at investors in the NASDAQ market, as well as in bank advertisements.


The photograph of the rope at dock: Pratik Panda at

The photograph of the pyramid : By Nina – Own work, CC BY 2.5,

The photograph of the Meenakshi temple: Wikipedia

The photograph of the Burj Khalifa building: Wikipedia


Bucking down to the Bakhshali manuscript

The Bakhshali manuscript is an artifact discovered in 1881, near the town of Peshawar (then in British India, now in Pakistan). It is beautifully described in an article in the online magazine of the American Mathematical Society and I spent a few hours fascinated by the description in the article (written excellently by Bill Casselman of the University of British Columbia). It is not clear how old the 70 page manuscript fragment is – its pieces date (using radiocarbon dating) to between 300 and 1200 AD. However, I can’t imagine the mathematics is only as old as that – the mathematical tradition it appears to represent, simply by its evident maturity, is much older.

Anyway, read the article to get a full picture, but I want to focus on generalizing the approximation technique described in the article. One of the brilliant sentences in the article referred to above is “Historians of science often seem to write about their subject as if scientific progress were a necessary sociological development. But as far as I can see it is largely driven by enthusiasm, and characterized largely by randomness”. My essay below is in this spirit!

The technique is as follows (and it is described in detail in the article)

Suppose you want to find the square root of an integer N, which is not a perfect square. Let’s say you find an integer p_1 whose square is close to N – it doesn’t even have to be the exact one “sandwiching” the number N. Then define an error term \mathcal E_1 in the following way,

N = p_1^2 + \mathcal E_1

Next, in the manuscript, you are supposed to define, successively

p_2 = p_1 + \frac{\mathcal E_1}{2 p_1}

and compute the 2^{nd} level error term using simple algebra

\mathcal E_2 = - \frac{\mathcal E_1^2}{4 p_1^2}

This can go on: following the algorithm detailed in the manuscript, write

 p_3 = p_2 + \frac{\mathcal E_2}{2 p_2}

and compute the error term with the same error formula above, with p_2 replacing p_1. The series converges extremely fast (the p_n approaches the true answer quickly).
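Here is a minimal sketch of the square-root iteration exactly as written above, using exact fractions rather than decimals (closer in spirit to how the manuscript’s author would have worked). The function name and the example N = 41 with starting guess 6 are my own choices for illustration.

```python
from fractions import Fraction

def bakhshali_sqrt_steps(N, p1, steps):
    """Apply p_{n+1} = p_n + E_n / (2 p_n), with E_n = N - p_n^2, in exact arithmetic."""
    p = Fraction(p1)
    approximations = [p]
    for _ in range(steps):
        error = N - p * p          # the error term E_n
        p = p + error / (2 * p)    # the next approximation
        approximations.append(p)
    return approximations

approx = bakhshali_sqrt_steps(41, 6, 3)
for n, p in enumerate(approx, start=1):
    print(f"p_{n} = {p} = {float(p):.12f}, error = {float(41 - p * p):.3e}")
```

Starting from 6, the first two steps give 77/12 and 11833/1848; the numerators and denominators grow quickly, which gives a feel for how much arithmetic the fractions demand.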

Here comes the generalization (this is not in the Bakhshali manuscript, I don’t know if it is a known technique) –  you can use this technique to find higher roots.

Say you want to find the cube root of N. Start with a number p_1 whose cube is close to N. Then we define the error term \mathcal E_1 using

N = p_1^3 + \mathcal E_1

Next, define p_2 = p_1 + \frac{\mathcal E_1}{3 p_1^2}

and compute the 2^{nd} level error term using simple algebra

\mathcal E_2 = -3 \frac{\mathcal E_1^2}{9 p_1^3} - \frac{\mathcal E_1^3}{27 p_1^6}

and carry on to define p_3 = p_2 +  \frac{\mathcal E_2}{3 p_2^2} and so on.

The series can be computed in Excel – it converges extremely fast. I computed the cube root of 897373742731 to Excel’s numerical accuracy in six iterations.

Carrying on, we can compute the fourth root of N, as should be obvious now, by finding a potential fourth root p_1 that is a close candidate and then

p_2 = p_1 + \frac{\mathcal E_1}{4 p_1^3}

and so on.

You can generalize this to a general q^{th} root by choosing p_2=p_1 + \frac{\mathcal E_1}{q p_1^{q-1}} and carrying on as before. It is a little complex to write down the error term, but it just needs you to brush up on the binomial theorem expansion a little. I used this to compute the 19^{th} root of 198456, starting from the convenient 2, rather than 1 (see below!). It is 1.900309911. If I started from the initial number 3, convergence is slower but it still gets there quite fast (in twelve iterations).

Here’s a bit of the Excel calculation with a starting point of 2

p E
p_1 2 -325832
p_2 1.93458156 -80254.8113
p_3 1.90526245 -10060.9486
p_4 1.90042408 -226.670119
p_5 1.90030997 -0.12245308
p_6 1.90030991 -3.6118E-08
p_7 1.90030991 0

and here’s a starting point of 3. Notice how much larger the initial error is (under the column E).

p E
p_1 3 -1162063011
p_2 2.84213222 -415943298
p_3 2.69261765 -148847120
p_4 2.55108963 -53231970.3
p_5 2.41732047 -19003715.9
p_6 2.29140798 -6750923.06
p_7 2.17425159 -2365355.74
p_8 2.06867527 -797302.733
p_9 1.98149708 -240959.414
p_10 1.92430861 -53439.1151
p_11 1.90282236 -5045.06319
p_12 1.90033955 -58.8097322
p_13 1.90030991 -0.00825194
p_14 1.90030991 0 : CONVERGED!

The neat thing about this method is, unlike what Bill Casselman states, you don’t need to start from an initial point that sandwiches the number N. If you start reasonably close (though never at 1, for obvious reasons!), you do pretty well. The reason why these iterations converge is that the correction term \frac{\mathcal E_m}{q p_m^{q-1}} keeps shrinking: once you are close, each new error is roughly proportional to the square of the previous one (as in the formula for \mathcal E_2 above).
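The generalized iteration is easy to reproduce outside Excel. The sketch below is one way to write it in floating point; the function name, the stopping tolerance and the starting guess of 10000 for the cube root are my own assumptions, not part of the original calculation.

```python
def bakhshali_root(N, q, p, max_steps=100, tol=1e-12):
    """Iterate p -> p + E / (q p^(q-1)) with E = N - p^q, recording (p, E) at each step."""
    history = []
    for _ in range(max_steps):
        error = N - p ** q
        history.append((p, error))
        if abs(error) <= tol * N:      # stop once the error is negligible relative to N
            break
        p = p + error / (q * p ** (q - 1))
    return p, history

# The 19th root of 198456, from the two starting points used above
root_from_2, steps_from_2 = bakhshali_root(198456, 19, 2.0)
root_from_3, steps_from_3 = bakhshali_root(198456, 19, 3.0)

for n, (p, error) in enumerate(steps_from_2, start=1):
    print(f"p_{n} {p:.8f} {error:.6g}")

# The cube root mentioned earlier, from an assumed starting guess
cube_root, _ = bakhshali_root(897373742731, 3, 10000.0)
print(root_from_2, root_from_3, cube_root)
```

The first rows printed for the starting point 2 should match the table above; the starting point 3 takes more steps because the early corrections are small compared with the distance to the root.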

The actual writer of the Bakhshali manuscript did not use decimal notation, but used rational numbers (fractions) to represent decimals, which required an order of magnitude more work to get the arithmetic right! It is also interesting how the numerals used in the manuscript are so close to the numerals I grew up using in Hindi-speaking Mumbai!

Note added: Bill Casselman wrote to me that the use of rational numbers with a large number of digits instead of decimals represents “several orders of magnitude” more difficulty. I have no difficulty in agreeing with that sentiment – if you want the full analysis, read the article.

There is also a scholarly analysis by M N Channabasappa (1974) that predates the AMS article that analyzes the manuscript very similarly.

A wonderful compendium of original articles on math history is to be found at Math10.



Schrodinger’s Zoo

I have been enjoying reading Richard Muller’s “Now: The Physics of Time” – Muller is an extremely imaginative experimental physicist and his writings on the “arrow of time” are quite a nice compendium of the various proposed solutions. Even though none of those solutions is to my liking, they are certainly worth a read.

Meanwhile, his chapter on the famous Schrödinger’s cat experiment is interesting though I think I can come up with a far better explanation. Some of these ideas have come out of discussions with two colleagues at Rutgers (though I would hate to have them labeled as the reason for any errors), so thanks to them for being open to discussing these oft-ignored subjects.

The “cat” experiment is simple. An unfortunate kitty is stuffed into a strongly built box, where a radioactive atom is also placed. The atom has a chance, but only a chance, to decay and when it does (through, let’s say, beta decay) produces an electron. The electron triggers a cathode-ray amplifier that sets off a bomb and “poof!”😪 – poor kitty is chasing mice in a different universe!

The kerfuffle, as Muller writes, is that if we describe the system using standard quantum mechanics, it appears that we should describe the system (in the future) as

|\Psi> = \frac{|cat \: alive>+|cat \: dead>}{\sqrt{2}}

To be an accurate depiction of what he means, the initial state of the system is

|\Psi(0)> = |cat \: alive, undecayed \: atom>

and due to the passage of time and the possibility of decay of the atom, we should write the state in the future (at time t) as

|\Psi(t)>=|cat \: alive, undecayed \: atom > \sqrt{1 - p(t)} + \\  \: \: \: \: \: \: |cat \: dead, decayed \: atom> \sqrt{p(t)}

Here, p(t) is the probability that the atom has decayed in the time t. Clearly, p(0)=0 since the atom starts out undecayed and p(\infty)=1, since the atom will eventually decay. A “superposition” of states is much more than “this or that”. The system truly has to be thought of as being in both states and cannot be even conceived of as being in one of these states at that instant. Thinking of it as being in “one” or the “other” leads to logical and experimental contradictions, due to the presence of “interference” between the two states.
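A small numerical sketch may help fix ideas. It assumes two things that are standard but not spelled out above: that the decay probability follows the usual exponential law with some half-life, and that the coefficients of the two states are the square roots of the probabilities, so that the squared coefficients add up to one. The function names and the half-life of one hour are hypothetical.

```python
import math

def decay_probability(t, half_life):
    """Probability that the atom has decayed by time t (exponential decay law, assumed)."""
    return 1 - 0.5 ** (t / half_life)

def cat_state(p):
    """Coefficients of |alive, undecayed> and |dead, decayed> for decay probability p."""
    return math.sqrt(1 - p), math.sqrt(p)

HALF_LIFE_HOURS = 1.0   # hypothetical value, for illustration only

for t in (0.0, 0.5, 1.0, 2.0, 10.0):
    p = decay_probability(t, HALF_LIFE_HOURS)
    alive, dead = cat_state(p)
    print(f"t = {t:4.1f} h  p = {p:.4f}  alive = {alive:.4f}  dead = {dead:.4f}  "
          f"norm = {alive ** 2 + dead ** 2:.4f}")
```

After exactly one half-life, both coefficients equal 1/\sqrt{2}, which is the equal-weight state quoted from Muller earlier.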

To be honest, if someone asked you if cats were intelligent, you would probably say “Yes” and they certainly seem quite capable of understanding that they are alive, or facing imminent demise in a rapidly exploding fireball. Stating that the system is in a state where the cat is either alive or dead and worse, in a superposition of these states, is truly silly. Einstein found this galling, asking in eloquent terms, “Do you believe that the moon is not there when you are not looking at it?”.

Having the state of the system “collapse” into one of several alternatives is picturesquely referred to as “wave-function collapse”. Many physicist-hours have been spent thinking about how wave-functions must collapse (physicist-hours are like man-hours, except they can be charged to the NSF and other funding organizations). Solutions include the GRW mechanism, which says that wave-functions can spontaneously collapse (like stock prices can randomly jump); this is modeled by a stochastic jump process (just as with stock prices). There is also the QBism approach (of David Mermin and others) that suggests that wave-function descriptions of nature are in your head and that they represent YOUR knowledge and beliefs about the state of the system. While that seems like an acceptable way to handle the “cat” experiment above, it begs the question why we all seem to agree on the Hamiltonian and state variables for systems in general. Could we manufacture, from our sense impressions, an alternative-reality explanation of the same situation that agrees in detail with the “real” description? This is one that a lot of Trump supporters would certainly embrace! But it doesn’t give one a warm and fuzzy feeling that there is “a reality” out there to explain.

Other solutions rely on the consciousness of the observer and how his/her consciousness collapses wave-functions. To which, I would ask (paraphrasing Aharonov, a famous physicist), “If you believe that you collapse wave-functions yourself, is this an inherited ability? Did your parents collapse wave-functions too?” If you believe that only you, the observer, have the right to decide when a “wave-function has collapsed”, it seems to give little meaning to my experience and the experience of millions of others. At that rate, the Taj Mahal only exists the moment you yourself see it. And America didn’t exist till Columbus saw it, unless you were Native American of course!

The key, in my opinion, to understanding all this is to recognize (admit, rather), that we, as observers, are just another cog in the quantum universe. We participate in measurements, we are part of wave-functions and a wave-function is supposed to “collapse” to varying degrees based on how many interactions the system it represents has had with the surroundings. So, if the only thing that happened was that a radioactive atom decayed (that’s one atom that emitted an electron and an anti-neutrino and created a proton out of a neutron), little, if anything, in the universe was altered. This piffle of a disturbance isn’t big enough to collapse anything, let alone a wave-function. If the electron subsequently collided with a couple of other atoms on its way to the anode of the amplifier, that’s piffle too. The “state” of the system, which includes the original atom, the electron, the anti-neutrino, the proton and the couple of other atoms, is still in a superposition of several quantum states, including states where the original atom hasn’t even decayed.

The fun begins when the disturbance is colossal enough to affect a macroscopic number of atomic-sized particles. Now, there is one state (in the superposition of quantum states) where nothing happened and no other particles were affected. There are a {\bf multitude} of other states where a colossal number of other particles were affected. At some point, we reach no-return – the number of “un-doings” we would need to do to “un-do” the effect of the (potential) decay on the (colossal number of) other particles is so large and the energy expended to reverse ALL the after-effects would be so humongous that there is no comparison between the two states. The wave-function has collapsed. Think of it this way – if the bomb exploded and the compartment where the cat was sitting was wrecked, the amount of effort you would need to put in to “un-do” the bomb would be macroscopically large. This is what we would refer to as wave-function collapse. Of course, if you were an observer the size of our galaxy and to you this energy was a piddlingly small amount, you might say – it is still too small! However, this energy is much much larger (by colossal amounts) than the energy you would need to expend to not do anything if the decay were {\bf not} to happen.

This explanation seems to capture the idea of “macroscopic effect” as being the way to understand wave-function collapse. It also allows one to quantitatively think of this problem and relate to it the way one relates to, say, statistical mechanics.

So, if you think one little kitty-cat is “piffle”, consider a zoo of animals in a colossal compartment, along with their keepers – Schrodinger’s Zoo, to be precise. Now, consider the effect of the detonation on this macroscopic collection of particles. It should be clear that even if you, as an observer, are far from the scene, a macroscopic number of variables were affected and you should consider the wave-function to have collapsed – even if you didn’t know about it at the time, since you were clearly not measuring things properly. No one said that physics needs to explain incorrect measurements, did we?

I’d like to thank Scott Thomas and Thomas Banks, both of Rutgers and both incomparably good physicists, for educating me through useful discussions. Tom, in particular, has a forthcoming book on Quantum Mechanics that has an extensive discussion of these matters.

Why do chocolate wrappers stick to things

Here’s something I saw while lazily surfing the net this morning. Someone throws a candy wrapper towards the floor and it sticks to the curtain or a book cover. How long will it stick?

First, the reason this happens is because of static electricity – and this is why this rarely happens in humid climates (except inside the confines of an air-conditioned room, which is usually much drier). Why static electricity – when you eagerly unwrap a piece of chocolate, some charges jump in or out. Usually slow moving positive ions stay where they are – the electrons jump and which direction they jump in is a simple question with a complicated answer.

Clearly the answer to how long it will stick depends on how dry the air is. The charge leaks back from the wrapper to the curtain (and on to the ground) at a rate that depends on the resistance to current flow offered by the air between the wrapper and the curtain, and on the capacitance of the wrapper-curtain capacitor.

Since I finished teaching “RC” circuits to a whole lot of undergraduates last month, I thought it would be fun to carry out a small calculation.

What does the system look like – let’s idealize the wrapper/curtain capacitor to the below picture.


The effective area of contact is A, separated by a distance d from the curtain. This is a parallel plate capacitor, whose capacitance is (from your dim memory of high school physics) C = \frac{ \epsilon_0 A}{d} where \epsilon_0 is the permittivity of the vacuum and I have used a dielectric constant of 1 for air. Using Meter-Kilogram-Seconds-SI units, if you make the entirely reasonable assumption that d = 1 \: mm and A = 1 \; mm^2, this translates to a capacitance of \approx 10^{-14} farads. The resistivity of air (between the plates of this parallel plate capacitor) is \rho \approx 10^{16} \Omega -m and again the resistance is R = \rho \frac{d}{A}, which approximates to 10^{19} \Omega. The value of R \times C \approx 10^5 secs \approx 1 \: \: day which is an enormously long time constant – post one day, chances are the force keeping it up (the electrical attraction) is not big enough to withstand gravity and it is likely to fall. So a candy wrapper could remain stuck for a day at least – I would bet on more since the contact area might actually be much more than 1 mm^2 given lots of folds in the wrapper and the curtain, as well as all the sticky saliva on the wrapper from licking the candy off….reminds me I need to get back to finishing the candy!
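The estimate above is quick to reproduce. The sketch below uses the same assumed numbers (d = 1 mm, A = 1 mm^2, air resistivity of about 10^{16} \Omega-m); the exponential discharge law for an RC circuit is the standard textbook result, and the variable names are mine.

```python
import math

EPSILON_0 = 8.854e-12      # permittivity of the vacuum, in farads per metre
RHO_AIR = 1e16             # resistivity of dry air in ohm-metres (rough value used above)
AREA = 1e-6                # contact area: 1 mm^2 expressed in m^2
GAP = 1e-3                 # separation: 1 mm expressed in m

capacitance = EPSILON_0 * AREA / GAP      # C = epsilon_0 A / d
resistance = RHO_AIR * GAP / AREA         # R = rho d / A
time_constant = resistance * capacitance  # tau = R C, in seconds

def charge_fraction_remaining(t_seconds):
    """Fraction of the original charge left after t seconds of RC discharge."""
    return math.exp(-t_seconds / time_constant)

print(f"C   = {capacitance:.2e} F")
print(f"R   = {resistance:.2e} ohm")
print(f"tau = {time_constant:.2e} s = {time_constant / 86400:.2f} days")
print(f"charge left after one day: {charge_fraction_remaining(86400):.0%}")
```

One thing the code makes obvious: in this idealized model A and d cancel in the product, so R \times C = \rho \epsilon_0 regardless of the geometry, and the answer depends only on how resistive (that is, how dry) the air is.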


Is the longest day the warmest day?

I woke up to a snowy day on the 30th of December, here in New Jersey and immediately realized two things! It was colder and darker than at the same time on the shortest day of the year, the 21st of December. I suppose you could blame the colder weather on the polar vortex swinging its awful oscillation down towards us yokels in the New York metro area, rather than simply harass a bunch of Brahminical New Englanders. But was it really darker? Was I missing something? Is the shortest day for us folks living at 41^{\circ} North {\bf different} from the 21^{st} of December? After all, the winter solstice is defined as the day when the sun is over its extreme southward excursion, when it is over the Tropic of Capricorn.

And if you average out the vagaries of weather patterns, is the shortest day really the coldest day? Conversely, is the longest day (June 21^{st}) actually the warmest day?

Let’s look at why we have solstices and equinoxes, first. If you look at the Earth’s orbit around the sun, it looks like this


This is a view looking down over the North pole (a colorful view is here). Seen from there, the earth spins counter-clockwise and also revolves around the sun in the counter-clockwise direction. The angular momentum associated with the spinning of the earth around its own axis is conserved. This phenomenon, which we know from tops and gyroscopes, keeps the axis pointed in the same absolute direction (towards Polaris, the Pole Star). This is not exactly true, due to something called precessional motion that causes the axis to slowly point along different directions (which I will discuss later), but it is rather slow so we can ignore it for now.

In the position marked {\bf A}, the Northern Hemisphere experiences summer, as the Sun is directly overhead the Tropic of Cancer (an imaginary line that is drawn at roughly 23.5^{\circ} North) – this is on the 21^{st} of June. At the position marked {\bf B}, which follows after the Northern Summer, the Sun is directly above the Equator; this is called the autumnal (or fall) equinox (21^{st} September). Next, the position marked {\bf C} is the Northern winter (21^{st} December) which also corresponds to the Southern summer, while position {\bf D} corresponds to the Spring Equinox (the 21^{st} of March).

Why is the summer solstice the longest day? And why is it called a “sol-stice” – which is Latin for “the sun is still”? Look at the picture below for the earth on that day.


The yellow lines on the right depict the Sun’s rays. The vertical (blue) line is called the Terminator, not after the eponymous 90’s movie, but a line that separates those folk that can see the Sun from those that cannot. It is clear that at the maximum angle to the sun’s rays, as depicted, the folks in the Northern hemisphere experience the longest daytime, from leaving the Terminator behind them on one side to entering it on the other.

Why does the sun seem stationary? Well, it certainly still rises in the east and sets in the west due to the Earth’s rotation, so it does not seem stationary in that respect. However, it is stationary insofar as its North-South oscillation is concerned. Notice that on the 21^{st} of June, the Sun has proceeded (due to the Earth’s orbital motion of course) as far North as it can possibly get. That is a maximum of the sun’s latitudinal displacement and as you might remember from high school math, if something is at a maximum, the slope (the rate of change) is zero. {\it Ergo}, the sun appears to not move along the North-South direction.
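Both statements (the longest day and the sun “standing still”) can be seen in a few lines of code. The sketch below uses a common textbook approximation for the sun’s latitude (its declination), \delta \approx -23.44^{\circ} \cos(2\pi (n+10)/365) for day number n, together with the standard sunrise formula \cos H = -\tan\phi \tan\delta. It assumes a circular orbit and ignores atmospheric refraction and the size of the sun’s disc, so treat the numbers as rough.

```python
import math

def declination_deg(day_of_year):
    """Approximate latitude at which the sun is overhead (circular-orbit approximation)."""
    return -23.44 * math.cos(2 * math.pi * (day_of_year + 10) / 365)

def day_length_hours(latitude_deg, day_of_year):
    """Approximate hours of daylight, ignoring refraction and the sun's finite size."""
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg(day_of_year))
    cos_hour_angle = -math.tan(phi) * math.tan(delta)
    cos_hour_angle = max(-1.0, min(1.0, cos_hour_angle))  # polar day or polar night
    return 24 / math.pi * math.acos(cos_hour_angle)

JUNE_21, DECEMBER_21, MARCH_21 = 172, 355, 80   # day numbers in a non-leap year

print(f"declination on June 21:     {declination_deg(JUNE_21):+.2f} deg")
print(f"declination on December 21: {declination_deg(DECEMBER_21):+.2f} deg")
print(f"daily change near June 21:  {declination_deg(JUNE_21 + 1) - declination_deg(JUNE_21):+.3f} deg")
print(f"daily change near March 21: {declination_deg(MARCH_21 + 1) - declination_deg(MARCH_21):+.3f} deg")
print(f"day length at 41 N, June 21:     {day_length_hours(41, JUNE_21):.2f} h")
print(f"day length at 41 N, December 21: {day_length_hours(41, DECEMBER_21):.2f} h")
```

Near the equinox the sun’s latitude changes by about 0.4 degrees a day; near the solstice the daily change is essentially zero, which is the “standing still”.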

Is the Northern hemisphere the warmest on this day? Let’s ignore the fact that water and soil take time to heat up; this is called the Lag of the Seasons, but that is only part of the answer. For it to feel warm, the sun has to be right atop your head, at the zenith of the sky. It is very clear that when the Tropic of Cancer has the sun at its zenith, other parts of the Earth, to the south, do {\bf not!}. So, there are going to be two hottest days of the year for parts of the earth between the Tropic of Cancer and the Equator (and similarly between the Tropic of Capricorn and the Equator). They will be the days that the Sun is overhead, but on its journey to the higher latitudes, then on its journey away from them! So May and July might actually be warmer than June!

There is one small problem – the Earth is also not always the same distance from the Sun. It is closest to the Sun in January and furthest in July. That also causes the temperature to vary a lot. But one can learn a lot by looking at annual charts of Sun Hours and Average Temperature for two cities, one at the Tropic of Cancer and the other between that and the Equator. Indore (India) is at {23 \frac{1}{2}}^{\circ} while Port Blair (also an Indian territory, but close to Indonesia) is at {11 \frac{1}{2}}^{\circ}, halfway between. Look at the dip in June between May and July for Port Blair vs. Indore (which has no such dip).
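To find the two days on which the sun passes overhead at a given latitude, one can scan the year for the days on which the sun’s latitude crosses that value. The sketch below reuses the same rough circular-orbit approximation for the sun’s latitude, \delta \approx -23.44^{\circ} \cos(2\pi (n+10)/365); the function names and the use of a non-leap calendar year are my own assumptions, so the dates are good to a day or two at best.

```python
import math
from datetime import date, timedelta

def declination_deg(day_of_year):
    """Approximate latitude at which the sun is overhead (circular-orbit approximation)."""
    return -23.44 * math.cos(2 * math.pi * (day_of_year + 10) / 365)

def overhead_days(latitude_deg):
    """Day numbers on which the sun's latitude crosses the given latitude."""
    days = []
    for day in range(1, 365):
        before = declination_deg(day) - latitude_deg
        after = declination_deg(day + 1) - latitude_deg
        if before * after <= 0:        # sign change: the sun passes overhead
            days.append(day)
    return days

def to_date(day_of_year, year=2023):
    """Convert a day number to a calendar date in a (non-leap) reference year."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

port_blair_days = overhead_days(11.5)
print("Port Blair (11.5 N):", [to_date(d).strftime("%d %B") for d in port_blair_days])
print("New Jersey (41 N):  ", overhead_days(41.0))
```

For Port Blair this gives one date in late April and another in late August, on either side of the June solstice; for any latitude beyond the tropics the list is empty, since the sun is never overhead there.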


By the way, several websites as well as people discuss this topic, as does the physicist Hitoshi Murayama in a neat essay. Other websites discuss the solstice, and you can find a list of cities by latitude here. You will find temperature data here.

Can you travel faster through time?

If you watch science fiction movies, the most dramatic effects are obtained through some form of time travel. Pick some time in the future, or the past and a fabulous machine or spell swoops you away to that time.

I have always had a problem with this simple approach to time travel. One obvious objection would be this. Remember that the earth is rotating around its axis at 1000 \frac{km}{hr}, revolving around the Sun at 100,000 \frac{km}{hr}, spinning around the center of the Milky Way at 792,000 \frac{km}{hr} and being dragged at 2.1 million \frac{km}{hr} towards the Great Attractor in Leo/Virgo together with the other denizens of the Milky Way! If someone took you away for a few seconds and plonked you back at the same {\bf {SPOT}} in the universe after a couple of minutes, but several thousand years ago or ahead, you sure would need a space suit. Earth would be several billions of kilometers away, you’d be either totally in empty space or worse, somewhere inside a sun or something. I can’t imagine the Connecticut Yankee in King Arthur’s court carried spare oxygen or a shovel to dig himself out of an asteroid!

OK, so if you built a time machine without an attached space capsule to bring you back to the earth, woe is you. In addition, here is a simple way to sort out the paradoxes of time travel – just put down a rule that you can travel back in time, but you will land so far away that you can’t affect your history! This should surely be possible! Only time will tell.

Anyway, I didn’t really mean to share with you my proposal for how to make time travel possible. I was pondering a peculiarity of Einstein’s theory of relativity that isn’t often made clear in basic courses.

In Einstein’s world view, as modified by his teacher Hermann Minkowski, we live in a four dimensional world, where there are three space axes and one time axis. In this peculiar space, {\bf {Events}} are labeled by (t, \vec x) describing the precise time they occurred and their position (three coordinates give you a vector). In addition, if there are two such events (t_1, \vec x_1) and (t_2, \vec x_2), then the “distance” (t_1-t_2)^2 - (\vec x_1- \vec x_2)^2 is preserved in all reference frames. Why the peculiar minus sign between the time and space pieces? That’s what preserves the speed of light between all reference frames, as Minkowski realized was the key to Einstein’s reformulation of the geometry of space time. It is interesting that Einstein did not think of this initially and decided that Minkowski and other mathematicians were getting their dirty hands on his physical insights and turning them into complicated beasts – he came around very quickly of course, he was Einstein!
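Here is a small numerical illustration of that invariance. It is only a sketch: it assumes one space dimension, units where c = 1, and the standard Lorentz boost formulas, none of which are spelled out in this post. The function names are made up for the example.

```python
import math

def boost(t, x, v):
    """Coordinates of the event (t, x) as seen by an observer moving at speed v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

def interval(event_1, event_2):
    """The 'distance' (t_1 - t_2)^2 - (x_1 - x_2)^2 between two events."""
    dt = event_1[0] - event_2[0]
    dx = event_1[1] - event_2[1]
    return dt * dt - dx * dx

event_a = (0.0, 0.0)
event_b = (5.0, 3.0)

# Every observer, whatever their speed, computes the same interval
for v in (0.0, 0.3, 0.6, 0.9):
    moved_a = boost(event_a[0], event_a[1], v)
    moved_b = boost(event_b[0], event_b[1], v)
    print(v, interval(moved_a, moved_b))
```

If you flip the minus sign in `interval` to a plus, the printed numbers stop agreeing, which is the whole point of the peculiar sign.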

To generalize this idea further, physicists invented the idea of a 4-vector. A regular 3-vector (example \vec x) describes, for instance, the position of a particle in space. A 4-vector for this particle would be (t,\vec x) and it describes not just where it is, but when it is. Writing equations of relativity in vector form is useful. We can write them down without reference to one particular observer, without specifying how that observer is traveling.

In notation, the 4-vector for position of a particle is x^{\mu} \equiv (x^0, \vec x) = (t, \vec x).

Before we go any further, it is useful to consider the concept of the “length” of a vector. An ordinary 3-vector has the length |\vec x| = \sqrt{x^2+y^2+z^2}, but in the 4-vector situation, the appropriate length definition has a minus sign, |x^{\mu}| = \sqrt{c^2 t^2 - x^2 - y^2 - z^2}. The relative minus sign again comes from the essential consideration that the speed of light is constant between reference frames. Note that if you look at the 4-vector for an event – it has four independent components – the time and the position in three space.

Next, one usually needs a notion of velocity. Usually we would just write down \frac{d \vec x}{dt} with 3-vectors. However, when we differentiate with respect to t, we are using coordinates particular to one observer. A better quantity to use is one that all observers would agree on – the time measured by a clock traveling with the particle. This time is called the “proper time” and it is denoted by \tau. So the relativistic 4-velocity is defined as V^{\mu} = \frac{d x^{\mu}}{d \tau}. In terms of coordinates used by an observer standing in a laboratory, watching the particle, this is

V^{\mu} \equiv (V^0,\vec V) = (\frac{1}{\sqrt{1-v^2/c^2}}, \frac{\vec v}{\sqrt{1-v^2/c^2}}).

If you haven’t seen this formula before, you have to take it from me – it is a fairly elementary exercise to derive.

The peculiar thing about this formula is that if you look at the four components of the velocity 4-vector, there are only three independent components. Given the last three, you can compute the first one exactly.

In the usual way that this is described, people say that the magnitude of this vector is (V^0)^2-(\vec V)^2=1 (in units where c=1), as you can quickly check.
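The "quick check" can also be done numerically. This is a sketch in units where c = 1; the function names are invented for the illustration.

```python
import math

def four_velocity(vx, vy, vz):
    """4-velocity (V^0, V^1, V^2, V^3) of a particle with 3-velocity (vx, vy, vz), c = 1."""
    speed_squared = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - speed_squared)
    return (gamma, gamma * vx, gamma * vy, gamma * vz)

def minkowski_norm_squared(vec):
    """(V^0)^2 minus the squared length of the spatial part."""
    return vec[0] ** 2 - (vec[1] ** 2 + vec[2] ** 2 + vec[3] ** 2)

V = four_velocity(0.3, 0.4, 0.5)
print("norm squared:", minkowski_norm_squared(V))

# Given the three spatial components, the time component is fixed
recovered_v0 = math.sqrt(1.0 + V[1] ** 2 + V[2] ** 2 + V[3] ** 2)
print("V^0 directly:", V[0], " V^0 from the other three:", recovered_v0)
```

Whatever 3-velocity you feed in (as long as it is slower than light), the norm comes out the same, and the first component can always be rebuilt from the other three.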

But it is peculiar. After all x^{\mu}, the position, does have four independent components. Why does the velocity vector V^{\mu} only have three independent components? Those three are the velocities along the three spatial directions. What about the velocity in the “time” direction?

Aha! That is \frac{dt}{dt}=1. By definition, or rather, by construction, you travel along time at 1 day per day, or 1 year per year or whatever unit you prefer. The way the theory of relativity is constructed, it is {\bf {incompatible}} with any other rate of travel. {\bf {You \: \: cannot \: \: travel \: \: faster \: \: or \: \: slower, \: \: or \: \: even \: \: backwards, \: \: in \: \: time \: \: without \: \: violating \: \: the \: \: classical \: \: theory \: \: of \: \: relativity}}.

The only relativistically correct way one can traverse the time axis slower is to rotate the time axis – that’s what happens when the observer sitting in a laboratory hops onto a speeding vehicle or spaceship, i.e., performs a Lorentz transformation upon him/herself. That’s what produces the effects of time dilation.

Your only consolation is that virtual quantum particles can violate relativity for short periods of time inversely proportional to their energy. However, they are just that – virtual particles. And you cannot create them for communication.

Oh, well!

Coffee, anyon?

Based on the stats I receive, most readers of this blog live in the US, India and the UK. In addition, there are several readers in Canada, Saudi Arabia, China, Romania, Turkey, Nigeria and France. Suppose you live in one of the first three of these countries. In addition, let’s say you go to your favorite coffee shop (Starbucks, Cafe Coffee Day or your friendly neighbourhood Costa) and get a cup of coffee. Your coffee is brewed with hot water – usually filtered and quite pure, let’s even assume it is distilled, so it contains only H_2 O.

Now, here is an interesting question. Can you tell, from tasting the water, where it came from, which animals in the long history of the planet passed it through their digestive systems, or which supernova or neutron star collision chucked out the heavier elements that became these particular water molecules?

No, obviously, you can’t. Think of the chaos it would create – your coffee would taste different every day, for instance!

This is an interesting observation! It indicates that the way materials behave, they lose their memory of their past state. But isn’t this in apparent contradiction with the laws of classical mechanics that determine {\bf all} final conditions based on initial conditions? In fact, until the understanding of chaos and “mixing” came along, physicists and mathematicians spent a lot of time trying to reconcile this apparent contradiction.

When quantum mechanics came along, it allowed physicists to make sense of this in a rather simple way. Quantum states are superpositions of several microstates. In addition, when a measurement is made on a quantum system, it falls into one of the several microstates. Once it does, it has no memory of its previous superposition of states. However, there is yet another reason why history may not matter. This reason is only relevant in two spatial dimensions.

Let’s study some topology.

Suppose you have two particles, say two electrons. A quantum-mechanical system is described by a wave-function \psi, which is a function of the coordinates of the two particles, i.e., \psi(\vec x_1, \vec x_2). If the particles are {\bf {identical}}, then one can’t tell which particle is the “first” and which the “second” – they are like identical twins and physically we cannot tell them apart from each other. So, the wave function \psi might mean by its first argument, the first particle, or the second particle. How should the wave-function with one assumed order of particles be related to the wave-function with the “exchanged” order of particles? Are they the same wave function?

In three or more dimensions, the argument that’s made is as follows. If you exchange the two particles, then exchange them back again, you recover the original situation. So you should be back to your original wave function, correct? Not necessarily – the wave function is a complex number. It is not exactly the quantity with physical relevance, only the square of the magnitude (which yields the probability of the configuration) is. So, the wave function could get multiplied by a complex phase – a number like e^{i \theta} which has magnitude 1 – and still be a perfectly good description of the same situation.

So if we had the particles A & B like this and switch them


the wave function could get multiplied by a phase e^{i \theta} with the wave function describing an identical situation. But if we did the same operation and switched them again, we have to come back to exactly the same wave function as before – we might need to multiply by the square of the above complex number (e^{i \theta})^2; however, this square has to be equal to 1. Why in this case? Well, remember, it looks like we are back to the identical original situation and it would be rather strange if the same situation were described by a different wave function just because someone might or might not have carried out a double “switcheroo”.

If the square of some number is 1, that number is either 1 or -1. Those two cases correspond to either bosons or fermions – yes the same guys we met in a previous post. Now, why the exchange property determines the statistical properties is a subtle theorem called the spin-statistics theorem and it will be discussed in a future post.

However, in two dimensions, it is very obvious that something very peculiar can happen.

If you have switched two particles in three dimensions, look at the picture below.


then let’s mould the switching paths for the total double switch as below


we can move the “switcheroo” paths around and shrink each to a point. Do the experiment with  strings to convince yourself.

So, exchanging the two particles twice is equivalent to doing nothing at all.

However, if we were denizens of “Flatland” and lived on a two-dimensional flat space, we {\bf {cannot}} shrink those spiral shaped paths to a point – there isn’t an extra dimension to unwind the two loops and shrink them to a point. We cannot pull one of the strings out into the third dimension and unravel it from the other! So, there is no reason for two switches to bring you back to the original wave function for the two identical particles. {\bf {The \: \: particles \: \: remember \: \: where \: \: they \: \: came \: \: from}}. This is why the quantum physics of particles in two dimensions is complicated – you can’t make the usual approximations to neglect the past and start a system off from scratch, the past be damned.

Particles in two dimensions are called anyons. Coffee with anyons in two dimensions would taste mighty funny and you would smell every damned place in two dimensions it had been in before it entered your two-dimensional mouth!

The unreasonable importance of 1.74 seconds

1.74 seconds.

If you know what I am talking about, you can discontinue reading this – it’s old news. If you don’t, it’s interesting what physicists can learn from 1.74 seconds. It’s all buried in the story about GW170817.

A few days ago, the people who constructed the LIGO telescope observed gravitational waves from what appears to be the collision and collapse of a pair of neutron stars (of masses that are believed to be 1.16 M_{\odot} and 1.6 M_{\odot}). The gravitational wave observation is described in this Youtube posting, as is a reconstruction of how it might sound in audio (of course, it wasn’t something you could {\bf {hear}}!).

As soon as the gravitational wave was detected, a search was done through data recorded at several telescopes and satellites for  coincident optical or gamma ray (high frequency light waves) emissions – the Fermi (space-based) telescope did record a Gamma ray burst 1.7 seconds later. Careful analysis of the data followed.

Sky & Telescope has a nice article discussing this. To summarize it and some of the papers briefly, it turns out that a key part of unravelling the exact details about the stars involved is to figure out the distance. To find the distance, we need to know where to look (was the Fermi telescope actually observing the same event?). There are three detectors currently at work under the LIGO collaboration and two of them detected the event. This is the minimum needed to detect an event anyway: given the extremely high noise, one needs the near-simultaneous detection at two widely separated detectors to confirm that we have seen something. All the detectors have blind spots due to the angles at which they are placed, so the fact that two saw something and the third {\bf {didn't}} indicates something was afoot in the blind-spot of the third detector. That didn’t localize the event enough though. Enter Fermi’s observation – which only localized the event to tens of degrees (about twenty times the size of the moon or sun). But the combination was enough to mark out a small field of view as the “region of interest”. Optical telescopes then looked at the region and discovered the “smoking gun” – the actual increasingly bright, then dimming star. The star appears to be in a suburb of the NGC 4993 galaxy, some 130 million light years away – note that our nearest galactic neighbour is the Andromeda galaxy which is roughly 2 million light years away. Finding the distance makes the precision in the spectra, the masses of the involved neutron stars etc. much higher, so one can actually do a lot more precise analysis. The red-shift to the galaxy is 0.008, which looks small, but this connection helps understand the propagation of the light and gravitational waves from the event to us on the Earth.

Now, on to the simple topic of this post. If you have an event from which you received gravitational waves and photons and the photons reached us 1.74 seconds after the gravitational waves, we can estimate a limit on the difference in the mass of the photon versus the graviton (hypothesized particle that is the force carrier of gravity). If we, for simplicity, assume the mass of the graviton is zero, then we make the following argument:

From special relativity, if a particle has speed v, momentum p, energy E and the speed of light is c,

\frac{v}{c} = \frac{pc}{E} = \frac{\sqrt{E^2 - m^2c^4}}{E} \approx 1 - \frac{m^2c^4}{2E^2}

which implies

\frac{mc^2}{E} \approx \sqrt{2 \left(1 - \frac{v}{c}\right)}

Since the gravitational waves reached us 1.74 seconds earlier, after travelling for 130 million years (about 4.1 \times 10^{15} seconds), that translates into a differential of speed, i.e., \frac {\delta v}{c}, of roughly 4 \times 10^{-16}, i.e.,

\frac{\delta(m)c^2}{E} \approx \sqrt{2 \times 4 \times 10^{-16}} \approx 3 \times 10^{-8}

Next, we need to compute the energy of the typical photons in this event. They follow an approximate black-body spectrum and the peak wavelength of a black-body spectrum follows the famous Wien’s law. To cut to the chase, the peak energy emission was in the 10 keV range (kilo-electronvolt), which means our mass differential is about 3 \times 10^{-4} eV (this is measuring mass in energy terms, as Einstein instructed us to). While the current limits on the photon’s mass are much tighter, this is an interesting way to get a bound on the mass. In general, the models for this sort of emission indicate that gamma rays are emitted roughly a second after the collision, so the limits will get tighter as the models sharpen their pencils through observation.
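For those who like to see the arithmetic, here is a back-of-the-envelope sketch. It starts from the expansion \frac{v}{c} \approx 1 - \frac{m^2c^4}{2E^2} written earlier, assumes a year of about 3.156 \times 10^7 seconds, and takes the 10 keV photon energy as a round number; the variable names are mine.

```python
import math

SECONDS_PER_YEAR = 3.156e7          # assumed length of a year in seconds

delay_s = 1.74                      # photons arrived this long after the gravitational waves
travel_time_s = 130e6 * SECONDS_PER_YEAR   # 130 million years of travel

# Fractional speed difference between the two signals
delta_v_over_c = delay_s / travel_time_s

# From v/c ~ 1 - (m c^2)^2 / (2 E^2):  m c^2 / E ~ sqrt(2 * (1 - v/c))
mass_over_energy = math.sqrt(2.0 * delta_v_over_c)

photon_energy_ev = 1.0e4            # ~10 keV, a round number for the typical photon
mass_bound_ev = mass_over_energy * photon_energy_ev

print("delta v / c   :", delta_v_over_c)
print("m c^2 / E     :", mass_over_energy)
print("mass bound, eV:", mass_bound_ev)
```

Because the mass enters through a square root, shaving the delay down (say, by modelling the second or so it takes the gamma rays to be emitted) tightens the bound only slowly.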

However, the models are already improving bounds on various theories that rely on modifying Einstein’s (and Newton’s) theories of gravity. Keep a sharp eye out!

New kinds of Cash & the connection to the Conservation of Energy And Momentum

It’s been difficult to find time to write articles on this blog – what with running a section teaching undergraduates (after 27 years of {\underline {not \: \: doing \: \:  so}}), as well as learning about topological quantum field theory – a topic I always fancied but knew little about.

However, a trip with my daughter brought up something that sparked an interesting answer to questions I got at my undergraduate section. I had taken my daughter to the grocery store – she ran out of the car to go shopping and left her wallet behind. I quickly honked at her and waved, displaying the wallet. She waved back, displaying her phone. And insight struck me – she had the usual gamut of applications on her phone that serve as ways to pay at retailers – who needs a credit card when you have Apple Pay or Google Pay? I clearly hadn’t adopted the Millennial ways of life enough to understand that money comes in yet another form, adapted to your cell phone, and isn’t only the kind of thing you can see, smell or Visa!

And that’s the connection to the Law Of Conservation of Energy, in the following way. There was a set of phenomena that Wolfgang Pauli considered in the 1930s – beta decay. The nucleus was known and so were negatively charged electrons (these were called \beta-particles). People had a good idea of the composition and mass of the nucleus (as being composed of protons and neutrons), the structure of the atom (with electrons in orbit around the nucleus) and also understood Einstein’s revolutionary conceptions of the unity of mass and energy. Experimenters were studying the phenomenon of nuclear radioactive decay. Here, a nucleus abruptly emits an electron, then turns into a nucleus with one more proton and one fewer neutron, so roughly the same atomic weight, but with an extra positive charge. This appears to happen spontaneously, but in concert with the “creation” of a proton, an electron is also produced (and emitted from the atom), so the change in the total electric charge is +1 -1 = 0 – it is “conserved”. What seemed to be happening inside the nucleus was that one of the neutrons was decaying into a proton and an electron. Now, scientists had constructed rather precise devices to “stop” electrons, thereby measuring their momentum and energy. It was immediately clear that the total energy we started with – the mass-energy of the neutron (which starts out not moving very much in the experiment) – was more than the combined energy of the resulting proton (which also wasn’t moving very much at the end) and electron.

People were quite confused about all this. What was happening? Where was the energy going? It wasn’t being lost to heating up the samples (that was possible to check). Maybe the underlying process going on wasn’t that simple? Some people, including some famous physicists, were convinced that the Law of Conservation of Energy and Momentum had to go.

As it turned out, much like I was confused in the car because I had neglected that money could be created and destroyed in an iPhone, people had neglected that energy could be carried away or brought in by invisible particles called neutrinos. It was just a proposal, till they were actually discovered in 1956 through careful experiments.

In fact, as has been rather clear since Emmy Noether discovered the connection between a symmetry and this principle years ago, getting rid of the Law of Conservation of Energy and Momentum is not that easy. It is connected to a belief that physics (and the result of Physics experiments) is the same whether done here, on Pluto or in empty space outside one of the galaxies on the Hubble deep field view! As long as you systematically get rid of all “known” differences at these locations – the gravity and magnetic field of the earth, your noisy cousin next door, the tectonic activity on Pluto, or small black holes in the Universe’s distant past – the fundamental nature of the universe is translationally invariant. So if you discover that you have found some violation of the Law of Conservation of Energy and Momentum, i.e., a perpetual motion machine, remember that you are announcing that there is some deep inequivalence between different points and times in the Universe.

The usual story is that if you notice some “violation” of this Law, you immediately start looking for particles or sources that ate up the missing energy and momentum rather than announce that you are creating or destroying energy. This principle gets carried into the introduction of new forms of “potential energy” too, in physics, as we discover new ways in which the Universe can bamboozle us and reserve energy for later use in so many different ways. Just like you have to add up so many ways you can store money up for later use!

That leads to a conundrum. If the Universe has a finite size and has a finite lifetime, what does it mean to say that all times and points are equivalent? We can deal with the spatial finiteness – after all, the Earth is finite, but all points on it are geographically equivalent, once you account for the rotation axis (which is currently where Antarctica and the Arctic are, but really could be anywhere). But how do you account for the fact that time seems to start from zero? More on this in a future post.

So, before you send me mail telling me you have built a perpetual motion machine, you really have to be Divine and if so, I am expecting some miracles too.

The Normal Distribution is AbNormal

I gave a talk on this topic exactly two years ago at my undergraduate institution, the Indian Institute of Technology, in Chennai (India). The speech is here, with the accompanying powerpoint presentation: The Normal Distribution is Abnormal And Other Oddities. The general import of the speech was that the Normal Distribution, a statistical distribution that is often used to model random data of a variety of sorts, is often not particularly appropriate at all. I presented cases where the assumption of a normal distribution fails and leads to costly errors.


Mr. Olbers and his paradox

Why is the night sky dark? Wilhelm Olbers asked this question, certainly not for the first time in history, in the 1800s.

That’s a silly question with an obvious answer. Isn’t that so?

Let’s see. There certainly is no sun visible, which is the definition of night, after all. The moon might be, but on a new moon night, the moon isn’t, so all we have are stars in the sky.

Now, let’s make some rather simple-minded assumptions. Suppose the stars are distributed equally throughout space, at all previous times too. Why do we have to think about previous times? You know that light travels at 300,000 \: km/s, so when you look out into space, you also look back in time. So, one has to make some assumptions about the distribution of stars at prior times.

Then, if you draw a shell around the earth, that has a radius R and thickness \delta R, the volume of this thin shell is 4 \pi R^2 \delta R.

Shell Radius R

Suppose there were a constant density of stars, n stars per unit volume; then this thin shell has n 4 \pi R^2 \delta R stars. Now the further away a star is, the dimmer it seems – the light spreads out in a sphere around the star. A star that emits I units of energy per second will project an intensity of \frac{I}{4 \pi R^2} per unit area at a distance R away. So a shell (thickness \delta R) of stars at a radius R will bombard us (on the earth) with intensity \frac {I}{4 \pi R^2} n 4 \pi R^2 \delta R units of energy per unit area. This equals I n \ \delta R units of energy per unit area.

This is independent of R! Since we can do this for successive shells of stars, the brightness of each shell adds! The night sky would be infinitely bright, IF the universe were infinitely big.
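To see the shells pile up, here is a toy calculation. It is a sketch under the same simple-minded assumptions (uniform density, identical stars, nothing in the way); the function names and the numbers plugged in are arbitrary illustrations.

```python
import math

def shell_flux(radius, thickness, star_density, luminosity):
    """Energy per unit area per second reaching us from one thin shell of stars."""
    stars_in_shell = star_density * 4.0 * math.pi * radius ** 2 * thickness
    flux_per_star = luminosity / (4.0 * math.pi * radius ** 2)
    return stars_in_shell * flux_per_star

def sky_brightness(max_radius, thickness, star_density, luminosity):
    """Add up the contributions of all shells out to max_radius."""
    number_of_shells = int(round(max_radius / thickness))
    total = 0.0
    for k in range(number_of_shells):
        mid_radius = (k + 0.5) * thickness
        total += shell_flux(mid_radius, thickness, star_density, luminosity)
    return total

# Arbitrary units: 2 stars per unit volume, each emitting 3 units of energy per second
print("near shell:", shell_flux(1.0, 0.1, 2.0, 3.0))
print("far shell :", shell_flux(1000.0, 0.1, 2.0, 3.0))
for max_radius in (100.0, 200.0, 400.0):
    print(max_radius, sky_brightness(max_radius, 1.0, 2.0, 3.0))
```

The near and far shells contribute the same amount, so doubling how far out you count doubles the total brightness, with no end in sight.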

Some assumptions were made in the above description.

  1. We assumed that the stars are distributed uniformly in space, at past times too.
  2. We also assumed, in particular, isotropy, so there are no special directions along which stars lie.
  3. We also assumed that stars all shine with the same brightness.
  4. We didn’t mention it, but we assumed nothing obscures the light of far away stars, so we are able to see everything.
  5. In addition, we also assumed that the universe is infinite in size and that we can see all the objects in it, so it also must have had an infinite lifetime before this moment. Since light travels at a finite speed, light from an object would take some time to reach us; if the universe were infinitely old, we’d see every object in it at the moment we look up.
  6. We also assumed that the Universe wasn’t expanding rapidly – in fact with a recession speed for far away stars that increased (at least proportionately, but could be even faster than linear) with their distance. In such a universe, if we went far enough, we’d have stars whose recession speed from us exceeds the speed of light. If so, the light from those stars couldn’t possibly reach us – like a fellow on an escalator trying hard to progress upwards while the escalator goes down.

There are a tremendous number of population analyses of the distribution of light-emitting objects in the universe that make a convincing case (next post!) that the universe is isotropic and homogeneous on enormously large length scales (as in 100 Mega parsecs). We don’t see the kind of peculiar distributions that would lead us to assume a conspiracy of the sort implied in point 2.

We have a good idea of the life cycles of stars, but the argument would proceed on the same lines, unless we had a systematic diminution of intrinsic brightness as we looked at stars further and further away. Actually, the converse appears to be true. Stars and galaxies further away had tremendous amounts of hydrogen and little else and appear to be brighter, much brighter.

If there were actually dust obscuring far away stars, then the dust would have absorbed radiation from the stars, started to heat up, then would have emitted radiation of the same amount, once it reached thermodynamic equilibrium. This is not really a valid objection.

The best explanation is that either the universe hasn’t been around infinitely long, or the distant parts are receding from us so rapidly that they are exiting our visibility sphere. Or both.

And that is the start of the study of modern cosmology.

The Great American Eclipse of 2017

I really had to see this eclipse – met up with my nephew at KSU, then eclipse chasing (versus the clouds) all the way from Kansas to central and south-east Missouri. The pictures I got were interesting, but I think the videos (and audio) reflect the experience of totality much better. The initial crescent shaped shadows through the “pinholes” in the leafy branches,

With the slowly creeping moon swallowing the sun

Followed by totality

and the sudden disappearance of sunlight, followed by crickets chirping (listen to the sounds as the sky darkens)

I must confess, it became dark and my camera exposure settings got screwed up – no pictures of the diamond ring. Ah, well, better luck next time!

This is definitely an ethereal experience and one worth the effort to see it. Everybody and his uncle did!

Mr. Einstein and my GPS

I promised to continue one of my previous posts and explain how Einstein’s theories of 1905 and 1915 together affect our GPS systems. If we hadn’t discovered relativity (special and general) by now, we’d have certainly discovered it by the odd behaviour of our clocks on the surface of the earth and on an orbiting satellite.

The previous post ended by demonstrating that the time interval between successive ticks of a clock at the earth’s surface \Delta t_R and a clock ticking infinitely far away from all masses \Delta t_{\infty} are related by the formula

\Delta t_{R} =  \Delta t_{\infty} (1 + \frac{ \Phi(R)}{c^2})

The gravitational potential \Phi(R)=-\frac{G M_E}{R} is a {\bf {negative}} number for all R. This means that the time interval measured by the clock at the earth’s surface is {\bf {shorter}} than the time interval measured far away from the earth. If you saw the movie “Interstellar”, you will hopefully remember that a year passed on Miller’s planet (the one with the huge tidal waves) while 23 years passed on the Earth, since Miller’s planet was close to the giant Black Hole Gargantua. So time appears to slow down on the surface of the Earth compared to a clock placed far away.

Time for some computations. The mass of the earth is 5.97 \times 10^{24} \: kg, Earth’s radius is R = 6370 \: km \: = 6.37\times 10^6 \: meters and Newton’s constant is G = 6.67 \times 10^{-11} \: MKS \: units. In addition, the speed of light c = 3 \times 10^8 \frac {m}{s}. If \Delta t_{\infty} = 1 \: sec, i.e., the clock on an orbiting satellite (assumed to be really far away from the earth) measures one second, the clock at the surface measures

\Delta t_R = (1 \: sec) \times (1 - \frac {(6.67 \times 10^{-11}) \:  \times \:  (5.97 \times 10^{24} )}{(6.37 \times 10^6 )\: (3 \times 10^8 )^2})

this can be simplified to 0.69 \: nanoseconds less than 1 \: sec. In a day, which is (24 \times 3600) \: secs, this is 60 \times 10^{-6} \: secs = 60 \: \mu \: seconds (microseconds are a millionth of a second).

In reality, as will be explained below, the GPS satellites are operating at roughly 22,000 \: km above the earth’s surface, so what’s relevant is the {\bf {difference}} in the gravitational potential at 28,370 \: km and 6,370 \: km from the earth’s center. That modifies the difference in clock rates to roughly 0.53 \: nanoseconds per second, or 46 \: microseconds in a day.
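These numbers are easy to reproduce. The sketch below uses the rounded constants quoted in this post and the weak-field formula from above, applied once at the surface and once at the orbit radius of 28,370 km; the names are mine, and the rounding means the last digit can wobble.

```python
G = 6.67e-11               # Newton's constant, MKS units
M_EARTH = 5.97e24          # mass of the earth, kg
C = 3.0e8                  # speed of light, m/s
R_SURFACE = 6.37e6         # earth's radius, m
R_ORBIT = 2.837e7          # 22,000 km above the surface, measured from the center, m
SECONDS_PER_DAY = 24 * 3600

def potential_over_c_squared(r):
    """Phi(r) / c^2 = -G M / (r c^2), the fractional clock-rate shift at radius r."""
    return -G * M_EARTH / (r * C * C)

# Surface clock compared with a clock infinitely far away
surface_vs_infinity = -potential_over_c_squared(R_SURFACE)

# Surface clock compared with a clock at the satellite's orbit
surface_vs_orbit = potential_over_c_squared(R_ORBIT) - potential_over_c_squared(R_SURFACE)

for label, shift in (("infinity", surface_vs_infinity), ("orbit", surface_vs_orbit)):
    print(label,
          "| ns lost per second:", shift * 1e9,
          "| microseconds lost per day:", shift * SECONDS_PER_DAY * 1e6)
```

Moving the comparison clock from infinity down to the orbit trims the effect by about a quarter, since the satellite is itself still some way down the earth's gravitational well.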

How does GPS work? The US (other countries too – Russia, the EU, China, India) launched several satellites into a distant orbit 20,000 \: - 25,000 \: km above the earth’s surface. Most of the orbits are designed to allow different satellites to cover the earth’s surface at various points of time. A few of the systems (in particular, India’s) have satellites placed in a Geo-Stationary orbit, so they rotate around the earth with the earth – they are always above a certain point on the earth’s surface. The key is that they possess rather accurate and synchronized atomic clocks and send the time signals, along with the satellite position and ID to GPS receivers.

If you think about how to locate someone on the earth, if I told you I was 10 miles from the Empire State Building in Manhattan, you wouldn’t know where I was. Then, if I told you that I was 5 miles from the Chrysler building (also in Manhattan), you would be better off, but you still wouldn’t know how high I was. If I receive a third coordinate (distance from yet another landmark), I’d be set.  So we need distances from three well-known locations in order to locate ourselves on the Earth’s surface.

The GPS receiver on your dashboard receives signals from three GPS satellites. It knows how far they are, because it knows when the signals were emitted, as well as what the time at your location is.  Since these signals travel at the speed of light (and this is sometimes a problem if you have atmospheric interference), the receiver can compute how far away the satellites are. Since it has distances to three “landmarks”, it can be programmed to compute its own location.

Of course, if its clock was constantly running slower than the satellite clocks, it would constantly overestimate the distance to these satellites, for it would think the signals were emitted earlier than they actually were. This would throw off the location calculation by the distance travelled by light in 0.53 \: nanoseconds, which is 0.16 meters, for every second that passes. Over a day, this would become 14 kilometers. You could well be in a different city!
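The arithmetic here is just distance = speed of light × timing error. A small sketch, with a rounded value for c and a hypothetical helper name of my own, reproduces both figures:

```python
C = 3.0e8  # speed of light, m/s (rounded)


def range_error_m(clock_offset_s):
    """Distance error caused by a timing error of clock_offset_s seconds."""
    return C * clock_offset_s


per_second = range_error_m(0.53e-9)  # clock offset built up over one second
per_day = range_error_m(46e-6)       # clock offset built up over one day
print(f"{per_second:.2f} m after one second, {per_day / 1000:.1f} km after one day")
```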

There’s another effect – that of time dilation. To explain this, there is nothing better than the thought experiment below, which I think I first came across in George Gamow’s book. As with {\bf {ALL}} arguments in special and general relativity, the only things observers can agree on are the speed of light and (hence) the order of causally related events. That’s what we use in the below.

There’s an observer standing in a much–abused rail carriage. The rail carriage is travelling to the right, at a high speed V. The observer has a rather cool contraption / clock. It is made with a laser that emits photons and a mirror that reflects them. The laser emits photons from the bottom of the carriage towards the ceiling, where the mirror is mounted. The mirror reflects the photon back to the floor of the car, where it is received by a photo-detector (yet another thing that Einstein first explained!).

Light Clock On Train

The time taken for this up- and down- journey (the emitter and mirror are separated by a length L) is

\Delta t' = \frac{2 L}{c}

That’s what the observer on the train measures the time interval to be. What does an observer on the track, outside the train, see?

Light Clock Seen from Outside Train

She sees the light traverse the path drawn in blue above. However, she also sees the light traveling at the same (numerical) speed, so she decides that the time between emission and reception of the photon is found using Pythagoras’ theorem

L^2 = (c \frac{\Delta t}{2})^2 - (V \frac {\Delta t}{2})^2

\rightarrow  \Delta t = \frac {2 L}{c} \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}
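For anyone who wants the algebra between those two lines spelled out (nothing new is assumed, it is just a matter of collecting the \Delta t terms and solving):

L^2 = \left( \frac{\Delta t}{2} \right)^2 (c^2 - V^2)

\rightarrow \Delta t = \frac {2 L}{\sqrt{c^2 - V^2}} = \frac {2 L}{c} \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}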

So, the time interval between the same two events is computed to be larger on the stationary observer’s clock, than on the moving observer’s clock. The relationship is

\Delta t = \frac {\Delta t'}{ \sqrt{1 - \frac{V^2}{c^2}} }

How about that old chestnut – well, isn’t the observer on the track moving relative to the observer on the train? How come you can’t reverse this argument?

The answer is – who’s going to have to turn the train around and sheepishly come back after this silly experiment runs its course? Well! The point is that one of these observers has to actively come back in order to compare clocks. Relativity just observes that you cannot make statements about {\bf {absolute}} motion. You certainly have to accept relative motion and in particular, how observers have to compare clocks at the same point in space.

From the above, 1 second on the moving clock would correspond to \frac {1}{ \sqrt{1 - \frac{V^2}{c^2}} } seconds on the clock by the tracks. A satellite at a distance D from the center of the earth has an orbital speed of \sqrt {\frac {G M_E}{D} } . For an orbit 22,000 \: km above the earth’s surface, that is 28,370 \: km from the earth’s center, this would be roughly

\sqrt { \frac {(6.67 \times 10^{-11}) (5.97 \times 10^{24})}{28370 \times 10^3} } \approx 3700 \: \frac{meters}{sec}

which means that 1 second on the moving clock would correspond to 1 \: sec + 0.078 \: nanoseconds on the clock by the tracks. Over a day, this would correspond to a drift of between 6 and 7 \: microseconds, in the {\bf {opposite}} direction to the above calculation for gravitational slowing.
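Here is a short sketch of that calculation, using the same rounded constants as in the formula for the orbital speed. The variable names are mine and the constants are approximate, so treat the output as a sanity check rather than a precise figure.

```python
import math

G = 6.67e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.97e24   # mass of the earth, kg
C = 3.0e8           # speed of light, m/s (rounded)
R_ORBIT = 28370e3   # distance of the satellite from the earth's center, m

# Circular-orbit speed, sqrt(G M / D)
speed = math.sqrt(G * M_EARTH / R_ORBIT)

# Time dilation factor 1 / sqrt(1 - V^2 / c^2)
gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

extra_ns_per_second = (gamma - 1.0) * 1e9         # nanoseconds per second
extra_us_per_day = (gamma - 1.0) * 86400 * 1e6    # microseconds per day
print(f"speed {speed:.0f} m/s, {extra_ns_per_second:.3f} ns per second, "
      f"{extra_us_per_day:.1f} microseconds per day")
```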

Net result – the satellite clocks run faster by 40 microseconds in a day. They need to be continually adjusted to bring them in sync with earth-based clocks.

So, that’s three ways in which Mr. Einstein matters to you EVERY day!


Master Traders and Bayes’ theorem

Imagine you were walking around in Manhattan and you chanced upon an interesting game going on at the side of the road. By the way, when you see these games going on, a safe strategy is to walk on, since they usually reduce to methods of separating a lot of money from you in various ways.

The protagonist, sitting at the table, tells you (and you are able to confirm this from a video taken by a nearby security camera run by a disinterested police officer) that he has tossed the same quarter (an American coin) thirty times and managed to get “Heads” {\bf ALL} of those times. What would you say about the fairness or unfairness of the coin in question?

Next, your good friend rushes to your side and whispers to you that this guy is actually one of a really large number of people (a little more than a billion) that were asked to successively toss freshly minted, scrupulously clean and fair quarters. People that tossed tails were “tossed” out at each successive toss and only those that tossed heads were allowed to toss again. This guy (and one more like him) were the only ones that remained. What can you say now about the fairness or unfairness of the coin in question?

What if the number of coin tosses was 100 rather than 30, with a larger number of initial subjects?
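To get a feel for the numbers in the friend’s story, here is a rough sketch. It assumes every coin is perfectly fair and every toss independent; the function names and the starting crowd of exactly 2^{30} people (a little more than a billion) are my own choices for illustration.

```python
def expected_survivors(players, tosses):
    """Expected number of players who toss heads every single time (fair coins)."""
    return players * 0.5 ** tosses


def prob_at_least_one_survivor(players, tosses):
    """Chance that at least one player is still standing after all the tosses."""
    p_streak = 0.5 ** tosses
    return 1.0 - (1.0 - p_streak) ** players


crowd = 2 ** 30  # a little more than a billion people
print(expected_survivors(crowd, 30))
print(prob_at_least_one_survivor(crowd, 30))

# For a streak of 100 heads you would need a crowd of about 2**100 people
print(expected_survivors(2 ** 100, 100))
```

With a crowd of that size, a lone survivor with thirty straight heads is roughly what you would expect from fair coins alone.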

Just to make sure you think about this correctly, suppose you were the Director of a large State Pension Fund and you need to invest the life savings of your state’s teachers, firemen, policemen, highway maintenance workers and the like. You get told you have to decide whether to allocate some money to a bet made by an investment manager, based on his or her track record (he or she successively tossed “Heads” a hundred times in a row). Should you invest money on the possibility that he or she will toss “Heads” again? If so, how much should you invest? Should you stay away?

This question cuts to the heart of how we operate in real life. If we cut out the analytical skills we learnt in school and revert to how our “lizard” brain thinks, we would assume the coin was unfair (in the first instance) and express total surprise on learning the second fact. In fact, even though the second situation could well lie behind every situation of the first sort we encounter in the real world, we would still operate as if the coin was unfair, as our “lizard” brain would instruct us to behave.

What we are doing unconsciously is using Bayes’ theorem. Bayes’ theorem is the linchpin of inferential deduction and is often misused even by people who understand what they are doing with it. If you want to read a couple of rather interesting books that use it in various ways, read Gerd Gigerenzer’s “Reckoning with Risk: Learning to Live with Uncertainty” or Hans Christian von Baeyer’s “QBism”. I will discuss a few classic examples. In particular, Gigerenzer’s book discusses several such, as well as ways to overcome popular mistakes made in the interpretation of the results.

Here’s a very overused, but instructive example. Let’s say there is a rare disease (pick your poison) that afflicts 0.25 \% of the population. Unfortunately, you are worried that you might have it. Fortunately for you, there is a test that can be performed, that is 99 \% accurate – so if you do have the disease, the test will detect it 99 \% of the time. Unfortunately for us, the test has a 0.1 \% false positive rate, which means that if you don’t have the disease, 0.1 \% of such tested people will mistakenly get a positive result. Despite this, the results look exceedingly good, so the test is much admired.

You nervously proceed to your doctor’s office and get tested. Alas, the result comes back “Positive”. Now, ask yourself, what are the chances you actually have the disease? After all, you have heard of false positives!

A simple way to see this is to turn the percentages above into numbers. Suppose you consider a population of 1,000,000 people. Since the disease is rather rare, only (0.25 \% \equiv ) \: 2,500 have the disease. If they are tested, only (1 \% \equiv ) \: 25 of them will get an erroneous “negative” result, so 2,475 will correctly test “Positive”. However, if the rest of the population were tested in the same way, (0.1 \% \equiv ) \: roughly 1,000 people would get a “Positive” result, despite not having the disease. In other words, of the 3,475 people who would get a “Positive” result, only 2,475 actually have the disease, which is roughly 71\% – so such an accurate test can only give you a 7-in-10 chance of actually being diseased, despite its incredible accuracy. The reason is that the “false positive” rate is low, but not low enough to overcome the extreme rarity of the disease in question.

Notice, as Gigerenzer does, how simple the argument seems when phrased with numbers, rather than with percentages. To do this using standard probability theory, write the probability that Event A occurs once we know that Event B has occurred as P(A/B). The probability that both A and B occur can be written in two ways, so

P(A/B) P(B) = P(B/A) P(A)

Using this

P(I \: am \: diseased \: GIVEN \: I \: tested \: positive) = \frac {P(I \: test \: positive \: GIVEN \: I \: am \: diseased) \: P(I \: am \: diseased)}{P(I \: test \: positive)}

and then we note

P(I \: am \: diseased) = 0.25\%

P(I \: test \: positive \: GIVEN \: I \: am \: diseased) = 99 \%

P(I \: test \: positive) = 0.25 \% \times 99 \% + 99.75 \% \times 0.1 \%

since I could test positive for two reasons – either I really was among the 0.25 \% diseased people and additionally was among the 99 \% that the test caught OR I really was among the 99.75 \% healthy people but was among the 0.1 \% that unfortunately got a false positive.

Indeed, \frac{0.25 \% \times 99 \%}{0.25 \% \times 99 \% + 99.75 \% \times 0.1 \%} \approx  0.71

which was the answer we got before.

The rather straightforward formula I used in the above is one formulation of Bayes’ theorem. Bayes’ theorem allows one to incorporate one’s knowledge of partial outcomes to deduce what the underlying probabilities of events were to start with.
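If you would rather let a computer do the bookkeeping, here is a minimal sketch of the disease-test calculation. The function and parameter names are mine; the three inputs are the 0.25 \% prevalence, the 99 \% detection rate and the 0.1 \% false positive rate from the example. The result comes out at a little over 0.71, the same 7-in-10 chance as the head count.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(diseased GIVEN tested positive), computed with Bayes' theorem."""
    true_positives = prior * sensitivity                     # diseased and caught
    false_positives = (1.0 - prior) * false_positive_rate   # healthy but flagged
    return true_positives / (true_positives + false_positives)


p = posterior(prior=0.0025, sensitivity=0.99, false_positive_rate=0.001)
print(f"{p:.3f}")
```

Try making the disease ten times more common (prior=0.025) and watch how quickly the same test becomes far more convincing.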

There is no good answer to the question that I posed in the first paragraph. It is true that both a fair and an unfair coin could give results consistent with the first event (someone gets 30 or even 100 heads in a row). However, if one desires that probability has an objective meaning independent of our experience, based upon the results of an infinite number of repetitions of some experiment (the so-called “frequentist” interpretation of probability), then one is stuck. In fact, based upon that principle, if you haven’t heard anything to the contrary about the coin, your a priori assumption about the probability of heads must be \frac {1}{2}. On the other hand, that isn’t how you run your daily life. In fact, the most legally defensible (many people would argue the {\bf {only}} defensible) strategy for the Director of the Pension Fund would be to

  • not assume that prior returns were based on pure chance and would be equally likely to be positive or negative
  • bet on the manager with the best track record
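One way to see why a long streak sways us so strongly is to run Bayes’ theorem on the coin itself. The sketch below is only an illustration under assumptions of my own: there are just two possibilities, a fair coin or a rigged coin that always lands heads, and you start with some prior belief that the coin is rigged (the one-in-a-million figure is made up). It deliberately ignores the friend’s story, in which you are told outright that every coin is fair.

```python
def posterior_rigged(prior_rigged, heads_in_a_row):
    """P(coin always lands heads GIVEN an unbroken run of heads), via Bayes' theorem."""
    likelihood_rigged = 1.0                    # a rigged coin always produces the run
    likelihood_fair = 0.5 ** heads_in_a_row    # a fair coin rarely does
    evidence = (prior_rigged * likelihood_rigged
                + (1.0 - prior_rigged) * likelihood_fair)
    return prior_rigged * likelihood_rigged / evidence


# Hypothetical prior: one coin in a million is rigged
for run in (0, 10, 20, 30):
    print(run, posterior_rigged(1e-6, run))
```

Even with such a skeptical starting point, thirty heads in a row pushes the belief that the coin is rigged above 99.9 \%.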

At a minimum, I would advise people to stay away from a stable of managers that simply are the survivors of a talent test where the losers were rejected (oh wait, that sounds like a large number of investment managers in business these days!). Of course, the manager that knows they have a good thing going is likely to not allow investors at all for fear of reducing their returns due to crowding. Such managers also exist in the global market.

The Bayesian approach has a lot in common with our everyday approach to life. It is not surprising that it has been applied to the interpretation of Quantum Mechanics and that will be discussed in a future post.