Gauge invariance, Global and Local Symmetry

This post, written for people with some knowledge of Maxwell’s equations, aims to connect a handful of concepts that are all central to how we understand the universe today. Nearly every word in the title has the status of a buzzword, but for good reason – they help organize the ideas well. Some of these ideas are being challenged, but nothing concrete has emerged as a convincing alternative.

When you study Maxwell’s equations in a sophomore course in college, you are presented with something like

\vec \nabla . \vec E = \frac{\rho}{\epsilon_0} \: \: , \: \:  \vec \nabla \times \vec E = -  \frac{\partial \vec B}{\partial t} \: \: , \: \:  \vec \nabla . \vec B = 0\: \: , \: \:  c^2 \vec \nabla \times \vec B = \frac{\vec j}{\epsilon_0} + \frac{\partial \vec E}{\partial t}

You are then told that these equations can be much simplified by introducing something called vector and scalar potentials ( \vec A, \phi), such that \vec B = \vec \nabla \times \vec A and \vec E = - \vec \nabla \phi - \frac{1}{c} \frac{\partial \vec A}{\partial t}.

Then we discover this peculiar property (I say this the way I first heard of it): you can amend these potentials in the following way

\vec A \rightarrow \vec A - \vec \nabla \chi \: \: , \: \: \phi \rightarrow \phi + \frac{1}{c} \frac{\partial \chi}{\partial t} for {\bf any} field \chi and the physically measurable electric (\vec E) and magnetic (\vec B) fields are {\bf completely} unchanged, and so are Maxwell’s equations.
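A quick check of this claim, as a sketch that uses only the definitions of \vec E and \vec B in terms of the potentials given above, plus two standard facts: the curl of a gradient vanishes and partial derivatives commute.

```latex
\begin{aligned}
\vec B' &= \vec \nabla \times (\vec A - \vec \nabla \chi)
         = \vec \nabla \times \vec A - \vec \nabla \times \vec \nabla \chi
         = \vec B, \\
\vec E' &= -\vec \nabla \left( \phi + \frac{1}{c} \frac{\partial \chi}{\partial t} \right)
           - \frac{1}{c} \frac{\partial}{\partial t} \left( \vec A - \vec \nabla \chi \right) \\
        &= -\vec \nabla \phi - \frac{1}{c} \frac{\partial \vec A}{\partial t}
           - \frac{1}{c} \vec \nabla \frac{\partial \chi}{\partial t}
           + \frac{1}{c} \frac{\partial}{\partial t} \vec \nabla \chi
         = \vec E.
\end{aligned}
```

The two \chi terms in \vec E' cancel each other, which is why the shift in \phi has to accompany the shift in \vec A.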

I presume when Maxwell discovered this property (invariance of the electric and magnetic fields upon a “gauge transformation”), which was later named “gauge invariance”, it seemed like a curiosity. The vector potential seems to have a redundancy – it is only relevant up to the addition of \vec \nabla \chi for any \chi (and in conjunction with the scalar potential). But the physical properties of matter and charge only care about the electric and magnetic fields, which are insensitive to the above freedom. If you solve for the vector and scalar potential that applies to a particular problem, you can compute the electric/magnetic fields and go on without thinking about the potentials again.

Quantum mechanics was the first blip on the sedate view above. It turns out that a charged particle notices the vector potential, in a typically beautiful way (in Nature) that preserves the above (gauge) redundancy, yet is still noticeable in its physical effects. We could have a magnetic field far, far away, but the vector potential that produces that magnetic field could exist here, close to an electron, and there would be measurable effects due to that far-away magnetic field. It is called the Aharonov-Bohm effect, but it is a standard feature of quantum mechanics. The vector potential seems to have physical significance, but there is the curious question of what gauge invariance actually means. And the correct quantum mechanical version of Schrödinger’s equation in the presence of an electromagnetic field is not

\frac{1}{2m} (-i \hbar \frac{\partial }{\partial x})(-i \hbar \frac{\partial }{\partial x}) \psi + V(x) \psi = E \psi,

but

\frac{1}{2m} (-i \hbar \frac{\partial }{\partial x} - q A_x)(-i \hbar \frac{\partial }{\partial x} - q A_x) \psi + V(x) \psi = E \psi.

This is all we have to go on.

If we require this “gauge-invariance” to be a generic property of electromagnetism, it implies that the energy must be a simple function of the physically relevant \vec E‘s and \vec B‘s, not some term involving \vec A‘s that might not be gauge-invariant. No term like \vec A. \vec A can appear in the energy, because such a term would not be the same after you perform a gauge transformation on  \vec A, i.e., add a \vec \nabla \chi term as in the above. So what? This will become clear in the next paragraphs.
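To see concretely why such a term fails, here is a short worked expansion (a sketch using the gauge transformation \vec A \rightarrow \vec A - \vec \nabla \chi quoted earlier):

```latex
\vec A \cdot \vec A \;\rightarrow\;
(\vec A - \vec \nabla \chi) \cdot (\vec A - \vec \nabla \chi)
= \vec A \cdot \vec A
  - 2 \, \vec A \cdot \vec \nabla \chi
  + \vec \nabla \chi \cdot \vec \nabla \chi
```

The last two terms depend on the arbitrary function \chi and do not vanish in general, so an energy containing \vec A \cdot \vec A would change under a transformation that leaves \vec E and \vec B untouched.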

Let’s leave this aside for a bit.

The next big development in the theory of fundamental physics was quantum field theory. This theory has many things going for it, but the initial motivation was the discovery that particles can be created and destroyed, seemingly out of pure energy (or photons) and what’s more, every particle of a particular type is precisely identical to any other particle of the same type. All electrons look/feel the same and behave in exactly the same way. Whilst we have no idea why this is the way things are, we can model this behavior very nicely. The theory of quantum mechanics started out by considering a simple harmonic oscillator. The energy of the harmonic oscillator is E = \frac{p^2}{2m} + \frac{m \omega^2 x^2}{2}, where  \omega= the frequency of the oscillator, m= the mass of the thing that’s oscillating while x, p are the position and momentum of the thing that is oscillating.

It turns out that one can think of states of higher and higher energy of the harmonic oscillator as having more and more “quanta” of energy, since energy seems to be absorbed in packets (this is what was discovered in the quantum theory). These quanta appear identical to each other. If you have these harmonic oscillators at every point in space, then we have a “field” of oscillators. Then, with some simple construction, the “quanta” that are constructed from these several harmonic oscillators can be given a charge and a mass. The energy of this field can be written, for a field in only one dimension (the x-dimension) as E = \frac{1}{2}(\frac{\partial \phi}{\partial x})^2 + \frac{1}{2} m^2 \phi^2 :  there is a “strain energy” that’s the first term and a “mass” energy, which is related to the second term. We can treat the universe as being composed of these “fields”, though the universe seems to have scalar, vector, spinor and tensor fields too. They can each be mapped onto all known particles, so we can invent a quantum field for every single type of fundamental particle. Voila, all these particles are identical to each other and have exactly the same properties  – they are just “quanta” of these fields. Studying the properties of these particles is, basically, the business of quantum field theory.

If we had a field that wasn’t a scalar \phi, but a vector \vec A, the corresponding mass term would be a coefficient times \vec A. \vec A. But we already rejected such a term as it wouldn’t preserve gauge invariance, as described above. You might then persist and ask why we should try to preserve gauge invariance. Let me come back to this point a little further on. Right now, let me just say that the Aharonov-Bohm effect already tells us that Nature appears to care about this fig-leaf. In particular, it seemed for a long time that it was not possible to write a gauge-invariant theory of a massive (i.e., not massless) particle described by a vector field.

The simplest kind of field is a scalar field, something akin to the temperature inside every point in a room. There is one number attached to every point in space.

Let’s simplify this to a simple collection of balls and springs. We could do this in any number of dimensions, but to start, let’s do this in one-dimension.


Imagine little springs connecting the balls on the lattice. At equilibrium, the balls like to be equidistant from each other, separated by the lattice constant “a”. The “field” \phi at every point is the displacement of the ball from the equilibrium point. The equilibrium value of this “field” is 0 at every point.

We can express this by saying that each ball is sitting in a potential energy well along the transverse directions (perpendicular to the one-dimensional lattice) that looks like this


which has the result that the ball doesn’t like to roll away from the lattice point it is supposed to be in.

Now, suppose there is an additional wrinkle in this lattice. Let’s assume each ball has a small charge. That doesn’t change the equilibrium situation, since like charges repel and the balls would still keep a distance “a” apart (don’t forget there are still the springs holding the balls together) .

Suppose, now, that there is a small fuzzy charge donut around each ball, with radius \delta, and assume the charge in the fuzzy donut is small and opposite to that of the ball. In that case, the ball is attracted away from its equilibrium position and would rather sit at the point \delta away from its usual position on the lattice.

The potential energy of each ball went from being zero, centered at the lattice point, to being positive (at the lattice site) with minima a little radius \delta around the lattice point. This may be called a “broken” symmetry phase, the symmetry of the smooth parabola is replaced with this “Mexican hat”.

The position of the ball can now be specified by a complex number: it is |\phi|, the along-the-chain displacement of the ball, times e^{i \theta}, where \theta is the angular position on the donut-shaped equilibrium surface. Why complex? It’s just a simple representation of the position of the ball with this particular geometry.

PE-Broken Symmetry

Another depiction of the potential is here

Mexican hat

and the chain is

PE-chain with broken symmetry

It’s not the particular mathematical function that is relevant here, just the idea that this sort of shape-shifting can happen due to natural evolution of parameters in the potential energy function, as we change external conditions.

But let’s assume the springs are pretty taut and hard to pull apart – when one ball decides to move to this new minimum, the others will all follow. This is a global shift of the entire system to the new potential energy well.  But where in this new well? There is a minimum all the way around and chain of balls could line up, ramrod straight, with all the balls on the same corresponding spot on the Mexican hat potential, at any point on the channel around the central hat.

The energy function of the scalar field in this case is E = \frac{1}{2} \frac{\partial \phi^{*}}{\partial x}  \frac{\partial \phi}{\partial x} {\bf -}  \frac{1}{2} \mu^2 \phi^{*} \phi + \frac{\lambda}{4} (\phi^{*} \phi)^2 or some similar function. Remember that \phi is complex and we have just written this in a form that gives a “real” number for the energy. In the particular case of this function, the new equilibrium would be |\phi|=\sqrt{\frac{\mu^2}{\lambda}}; remember |\phi| represents how far the ball is from the lattice point it started out at, so to choose a concrete spot, x=0, y= \sqrt{\frac{\mu^2}{\lambda}}, z=0. And the “global” symmetry here is that we could change the angular position \theta of {\bf all} the balls and it wouldn’t affect the total energy of the system.
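The location of the new equilibrium can be checked numerically. The sketch below looks only at the non-derivative part of the energy, written as a function of r = |\phi|, and compares a brute-force grid minimum with the formula \sqrt{\mu^2/\lambda}. The parameter values and the function names are illustrative assumptions, not taken from the post.

```python
import math


def potential(r, mu2, lam):
    """Non-derivative part of the energy as a function of r = |phi|.

    V(r) = -(1/2) mu^2 r^2 + (lambda/4) r^4, using phi* phi = r^2.
    It depends on r only, so every angle theta gives the same energy.
    """
    return -0.5 * mu2 * r**2 + 0.25 * lam * r**4


def grid_minimum(mu2, lam, r_max=5.0, steps=100_000):
    """Brute-force search for the minimum of V on a uniform grid in [0, r_max]."""
    best_r, best_v = 0.0, potential(0.0, mu2, lam)
    for i in range(1, steps + 1):
        r = r_max * i / steps
        v = potential(r, mu2, lam)
        if v < best_v:
            best_r, best_v = r, v
    return best_r


# Illustrative parameters (assumed, not from the post).
mu2, lam = 2.0, 0.5
r_numeric = grid_minimum(mu2, lam)
r_formula = math.sqrt(mu2 / lam)
print("grid minimum:", r_numeric, " formula:", r_formula)
```

Because the potential never sees \theta, the same minimum value is reached all the way around the channel, which is the global symmetry described above.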

Now, for a more daring and remarkable idea, which undoubtedly only came about because Chen Ning Yang and Robert Mills were playing around with the idea of gauge invariance before they came to this realization.

Suppose we let each ball be in any spot it wants to be in that channel (donut shaped minimum) of the Mexican hat potential, but we require that the energy of the total system be the same. In that case, remember there are springs between the balls! They don’t want the balls to get far apart – they need to be allowed to relax a bit.

Concretely, we are saying, let \theta be a function of x. If so, we have a problem. The terms proportional to \phi^{*} \phi are unaffected, since they remain |\phi|^2 e^{i \theta} e^{-i \theta} = |\phi|^2. So are the terms proportional to (\phi^{*} \phi)^2.  Not so for the “strain” energy terms – the springs doth protest!

\frac{1}{2} \frac{\partial \phi^{*}}{\partial x}  \frac{\partial \phi}{\partial x}

which was \frac{1}{2} \frac{\partial |\phi|}{\partial x}  \frac{\partial |\phi|}{\partial x} when \theta is independent of x, becomes, instead,

\frac{1}{2} (\frac{\partial |\phi|}{\partial x}- i |\phi| \frac{\partial \theta}{\partial x})  (\frac{\partial |\phi|}{\partial x}+ i |\phi| \frac{\partial \theta}{\partial x})
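Multiplying out the two brackets makes the cost explicit (a worked step, with nothing assumed beyond \phi = |\phi| e^{i \theta}):

```latex
\frac{1}{2}
\left( \frac{\partial |\phi|}{\partial x} - i |\phi| \frac{\partial \theta}{\partial x} \right)
\left( \frac{\partial |\phi|}{\partial x} + i |\phi| \frac{\partial \theta}{\partial x} \right)
= \frac{1}{2} \left( \frac{\partial |\phi|}{\partial x} \right)^2
+ \frac{1}{2} |\phi|^2 \left( \frac{\partial \theta}{\partial x} \right)^2
```

The second term is never negative, so letting \theta vary from ball to ball always adds strain energy unless something else compensates for it.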

This seems like a disaster for this idea – except that the idea that rescues it is to say that there is a field (let’s, for lack of imagination, call it A) that does two things

  • enters into the energy expression through the derivative term, i.e., \frac{\partial \phi}{\partial x} - i q A \phi.
  • has a property that the field A has a peculiar kind of freedom – we can change A to A + \frac{\partial \chi}{\partial x}.

But this is exactly the way the vector potential appears in the energy function and Schrodinger’s equation of a charged particle in an electromagnetic field, as was noted a few paragraphs above! With such a field, we can make a “gauge transformation” by \frac{\partial \theta}{\partial x} and “remove” the effect of the spatially varying \theta term. The “gauge field” A allows you to turn the global symmetry into a local symmetry.
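Here is how the cancellation works, as a sketch. It assumes the identification \chi = \theta / q between the gauge function \chi in the list above and the local phase \theta; that identification is my notation, chosen so the two bullet points fit together.

```latex
\begin{aligned}
\phi &\rightarrow e^{i \theta(x)} \phi ,
\qquad
A \rightarrow A + \frac{1}{q} \frac{\partial \theta}{\partial x} , \\
\frac{\partial \phi}{\partial x} - i q A \phi
&\rightarrow
e^{i \theta} \left( \frac{\partial \phi}{\partial x} + i \frac{\partial \theta}{\partial x} \phi \right)
- i q \left( A + \frac{1}{q} \frac{\partial \theta}{\partial x} \right) e^{i \theta} \phi
= e^{i \theta} \left( \frac{\partial \phi}{\partial x} - i q A \phi \right) .
\end{aligned}
```

The combination picks up only an overall phase, which drops out when it is multiplied by its complex conjugate in the energy.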

That was the connection that established that turning a global symmetry into a local symmetry (in a quantum field \phi in this case) establishes that a gauge-transforming field A must exist, must be coupled to the quantum field and must transform  following the mysterious “gauge transformation” formula to be consistent. And that is the model that all fundamental theories in particle physics have followed ever since.

In addition, if you expand out the energy function for such a coupled field,

E = \frac{1}{2} (\frac{\partial \phi^{*}}{\partial x}- i  A \phi^{*})  (\frac{\partial \phi}{\partial x}+ i A \phi) - \frac{\mu^2}{2} \phi^{*} \phi + \frac{\lambda}{4} (\phi^{*} \phi)^2

we get a very peculiar term, proportional to A^2. The coefficient of this term is \phi^{*} \phi. But this is exactly what a mass term is supposed to look like, for the “gauge field” A and in particular, since \phi is stuck at a non-zero absolute magnitude (with the peculiar potential we drew above), we can get a mass-term in the energy function, but in a gauge-invariant fashion! This is called the Englert-Brout-Higgs-Guralnik-Hagen-Kibble-Anderson-Nambu-Polyakov mechanism, but only Higgs and Englert were recognized for it with a Nobel Prize in 2013.
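Written out, the derivative part of the energy expands as follows (a worked step; the second line simply substitutes the equilibrium value |\phi|^2 = \mu^2/\lambda found earlier):

```latex
\begin{aligned}
\frac{1}{2}
\left( \frac{\partial \phi^{*}}{\partial x} - i A \phi^{*} \right)
\left( \frac{\partial \phi}{\partial x} + i A \phi \right)
&= \frac{1}{2} \frac{\partial \phi^{*}}{\partial x} \frac{\partial \phi}{\partial x}
 + \frac{i}{2} A \left( \phi \frac{\partial \phi^{*}}{\partial x} - \phi^{*} \frac{\partial \phi}{\partial x} \right)
 + \frac{1}{2} A^2 \phi^{*} \phi , \\
\frac{1}{2} A^2 \phi^{*} \phi \, \Big|_{|\phi|^2 = \mu^2 / \lambda}
&= \frac{\mu^2}{2 \lambda} A^2 .
\end{aligned}
```

So once \phi settles into the channel of the Mexican hat, the A^2 term has a fixed, non-zero coefficient.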

And why does a theory of a vector field like A need gauge invariance as a condition? This is a little harder work to understand, but a hint is that the theory is over specified right at the start. A vector field in space-time has four components at every point in space, while the spin-1 particle it describes only has three independent components. If something like gauge-invariance didn’t exist, we’d have to invent it, to reduce the excessive freedom in a theory of a vector field.

The next question (we never run out of questions!) one might have is why the potential takes the form we drew, with the Mexican hat shape. This is a profound realization – the concrete proof is that this has been seen in actual experimental studies of phase transitions. In addition, there is a connection to another set of realizations in physics – that of renormalization. That is such an interesting topic that it deserves a future post of its own.

p-‘s and q-‘s redux

Continuing our saga, trying to be intellectually honest, while a little prurient (Look It Up!, to adopt a recent political slogan), let’s look at the ridiculous “measured” correlation in point 3 of this public post. Let’s call it the PPP-GDP correlation! The scatter graph with data is displayed below

PPP GDP graph

Does it make sense? As in all questions about statistics, a lot of the seminal work traces back to Fisher and Karl Pearson. The data in the graph can be (painfully) transcribed and the correlation computed in an  Excel spreadsheet. The result is a negative correlation – around -34% – higher GDP implies lower PPP. Sorry, men in rich countries. You don’t measure up!

Some definitions first. If you take two normally distributed random variables X and Y with means \mu_X, \mu_Y and standard deviations \sigma_X, \sigma_Y, and you collect N samples of pairs (x_i, y_i),  then the Pearson coefficient

r = \frac{\sum_{i=1}^N(x_i - \mu_X^S)(y_i - \mu_Y^S)}{ \sqrt{\sum_{i=1}^N (x_i-\mu_X^S)^2} \sqrt{ \sum_{j=1}^N (y_j-\mu_Y^S)^2} }
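The formula translates almost line for line into code. This is a minimal sketch in plain Python; the function name `pearson_r` and the toy data are made up for illustration and are not the PPP-GDP data.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation of paired observations (x_i, y_i)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n  # sample mean of X
    mean_y = sum(ys) / n  # sample mean of Y
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


# Toy data (hypothetical): a perfect straight line and its mirror image.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson_r(xs, [2 * x + 1 for x in xs]))   # perfectly correlated
print(pearson_r(xs, [-3 * x + 7 for x in xs]))  # perfectly anti-correlated
```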

measures the correlation between the pairs of variables measured by considering this sample. If the sample were infinitely large, you would expect to recover the “actual” correlation \rho between the two variables. They could then be assumed to be distributed through a joint (technical term “bivariate”) distribution for two random variables.

In general, this quantity r has a rather complicated distribution. However, Fisher discovered that a certain variable (called the “Fisher” transform of r) is approximately normally distributed.

F(r) = \frac{1}{2} \ln \frac{1+r}{1-r} = arctanh(r)

with mean F(\rho)=arctanh(\rho) and standard deviation \frac{1}{\sqrt{N-3}}.

Once we know something is (at least approximately) normally distributed, we can throw all sorts of simple analytic machinery at it. For instance, we know that

  • The probability that the variable is within 1.96 standard deviations of the mean (on either side) is 95%
  • The probability that the variable is within 3 standard deviations of the mean (on either side) is 99.73%
  • The probability that the variable is within 5 standard deviations of the mean (on either side) is 99.99994%
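These three coverage numbers can be reproduced with the error function from Python's standard library. For a normal variable, the probability of landing within k standard deviations of the mean is erf(k/\sqrt{2}); the helper name `coverage` is just an illustrative choice.

```python
import math


def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1.96, 3.0, 5.0):
    print(f"within {k} sd: {100 * coverage(k):.5f}%")
```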

The z-score is the number of standard deviations a sample is away from the mean of the distribution.

So, if we were sensible, we would start with a null hypothesis – there is no correlation between PPP and GDP.

If so, the expected correlation \rho between these sets of variables, is 0.

The Fisher transform of \rho=0 is F(0)=arctanh(0)=0.

The standard deviation of the Fisher transform is \frac{1}{\sqrt{N-3}}. In the graph above, N=75, so the standard deviation is 0.11785.

If you measured a correlation of -0.34 from the data, that corresponds to a Fisher transform of -0.354, which is a little more than 3 standard deviations away from the “expected” mean of 0!

If you went to the average Central banker with a 3 standard deviation calculation of what could go wrong (at least before the 2008 Financial crisis), he or she would have beamed at you in fond appreciation. Now, of course, we realize (as particle physicists have) that the world needs to be held to a much higher standard. In fact, if you hold to a 5 standard deviation bound, the correlation could be between -52.9% and +52.9%.
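The arithmetic of the last few paragraphs fits in a few lines. This sketch uses only the Fisher transform and the 1/\sqrt{N-3} standard deviation quoted above; the function names are illustrative.

```python
import math


def fisher_sd(n):
    """Approximate standard deviation of the Fisher transform for n sample pairs."""
    return 1.0 / math.sqrt(n - 3)


def fisher_z_score(r, n, rho0=0.0):
    """Number of standard deviations between arctanh(r) and arctanh(rho0)."""
    return (math.atanh(r) - math.atanh(rho0)) / fisher_sd(n)


def correlation_band(n, k_sigma, rho0=0.0):
    """Range of sample correlations lying within k_sigma standard deviations of rho0."""
    centre = math.atanh(rho0)
    half_width = k_sigma * fisher_sd(n)
    return math.tanh(centre - half_width), math.tanh(centre + half_width)


n, r = 75, -0.34  # values quoted in the text
print("sd of Fisher transform:", fisher_sd(n))
print("z-score of r = -0.34:", fisher_z_score(r, n))
print("5-sigma band for r:", correlation_band(n, 5.0))
```

The band is obtained by stepping five standard deviations either side of arctanh(0) and mapping back with tanh, the inverse of the Fisher transform.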

So, if you fondly held the belief expressed in graph 3 of the post alluded to above, you might need to think again.

If you had been following the saga of the 750 GeV “bump” discovered in the data from the Large Hadron Collider a few years ago, that was roughly at the 3.5 standard deviation level. If you held off from publishing a theory early describing exactly which particle had been observed, you would have been smart. The data, at that level, was as believable as the PPP vs GDP data above. The puzzling thing, in my opinion, is why the “independent” detectors ATLAS and CMS saw the same sort of bump in the same places. Speaks to a level of cross-talk which is not supposed to happen!

The above calculation leads to a very simple extension of the p-value concept to correlation. It’s just the probability of seeing correlations more extreme than the one observed, given the null hypothesis. The choice of the null hypothesis doesn’t necessarily have to be a correlation of 0. It might be reasonable to expect, for instance in the case of the correlation between the Japanese equity index and the exchange rate between the Yen and the dollar, that there is some stable (non-zero) correlation over at least one business cycle.
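One way to turn this into a number, as a sketch: treat the Fisher transform as normally distributed and take the two-sided tail probability. The function name, the `rho0` parameter for a non-zero null, and the choice of a two-sided tail are my assumptions about how one might set it up, not a prescription from the post.

```python
import math


def correlation_p_value(r, n, rho0=0.0):
    """Two-sided p-value for a sample correlation r from n pairs.

    Null hypothesis: the true correlation is rho0. Relies on the Fisher
    transform being approximately normal with standard deviation 1/sqrt(n-3).
    """
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    # erfc(|z|/sqrt(2)) is the probability of a normal deviate beyond |z| on either side.
    return math.erfc(abs(z) / math.sqrt(2.0))


print(correlation_p_value(-0.34, 75))             # null: no correlation
print(correlation_p_value(-0.34, 75, rho0=-0.3))  # null: a stable correlation of -0.3 (hypothetical)
```

With a non-zero null such as the hypothetical -0.3 above, the same measured correlation is far less surprising.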

Featured image courtesy Maximilian Reininghaus

Buzzfeed post by Ky Harlin (Director of Data Science, Buzz Feed)

I haven’t bothered, in this analysis, to consider how the data was collected and whether it is even believable. We probably have a lot of faith in how the GDP increase data was computed, though methods have obviously changed in the last sixty years. However, did they use cadavers, to measure “lengths”?  Did they wander around poor villages with tape measures? How many of the data collectors survived the experience? This graph has all the signs of being an undergraduate alcohol-fueled experiment with all the attendant bias risks.


Minding your p-‘s and q-‘s

In the practice of statistical inference, the concept of p-value (as well as something that needs to exist, but doesn’t yet, called q-value), is very useful. So is a really important concept you need to understand if you want to fool people (or prevent yourself from being fooled!) – it’s called p-hacking.

The first (p-value) concerns the following kind of question (I have borrowed this example from a public lecture at the Math Museum by Jen Rogers in September 2018) – suppose I have a deadly disease where it is known that, if you perform no treatment of any kind, 40% of the people that contract it die, while the others survive, i.e., the probability of dying is 40 \%. On the other hand, a medical salesperson shows up at your doorstep and informs you about the new miracle cure “XYZ”. They (the manufacturer) gave the drug to 10 people (that had the disease) and 7 of them survived (probability of dying with the new medical protocol appears to be 30 \%). Would you be impressed? What if she told you that they gave the drug to 1000 people and 700 of them survived? Clearly, the second seems more plausibly to have some real effect. How do we make this quantitative?

The second (I call this a q-value) concerns a sort of problem that crops up in finance. There are many retail investors that don’t have the patience to follow the market or follow the rise and fall of companies that issue stocks and bonds. They get ready-made solutions from their favorite investment bank – these are called structured notes. Structured notes can be “structured” any which way you want.

Consider one such example. Say you buy a 7-year US-dollar note exposed to the Nikkei-225 Japanese 225-stock index. The N225 index is the Japanese equivalent of the S&P500 index in the US. Usually, you pay in $100 for the note, the bank unburdens you of $5 to feed the salesman and other intermediaries, then invests $70 in a “zero-coupon” US Treasury bond that will expire in 7 years. The Treasury bond is an IOU issued by the US Treasury – you give them $70 now (at the now prevailing interest rates) and they will return $100 in 7 years.
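As a small aside on the arithmetic, the $70-now-for-$100-later bond implies a particular annually compounded yield. The sketch below backs it out; the annual-compounding convention and the function name are assumptions made for illustration, and the dollar amounts are the post's round numbers rather than market prices.

```python
def implied_annual_yield(price, face, years):
    """Annually compounded yield at which `price` grows to `face` over `years`."""
    return (face / price) ** (1.0 / years) - 1.0


fee, bond, option = 5.0, 70.0, 25.0  # split of the $100 note described in the text
print("total paid in:", fee + bond + option)
print("implied zero-coupon yield:", implied_annual_yield(bond, 100.0, 7))
```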

As far as we know right now, the US Treasury is a rock-solid investment, they are not expected to default, ever. Of course, governing philosophies change and someone might look at this article in a hundred years and wonder what I was thinking!

The bank then uses the remaining $25 to invest in a 7-year option that pays off (some percentage P) of the relative increase (written as P \times \frac{ \yen N225_{final}-\yen N225_{initial}}{\yen N225_{initial}}) in the Nikkei-225 index. This variety of payoff, that became popular in the early 1990s, was called a “quanto” option – note that \yen N225 is the Nikkei index in its native currency, so it is around 22,500 right now.

For a regular payoff (non-quanto), you would receive, not the expression above, but something similar converted into US dollars. This would make sense, since it would be natural (for an option buyer) to convert the $25 into Japanese yen, buy some units of the Nikkei index, keeping only the increase (not losing money if it falls below the initial level), then converting the profits back to US dollars after 7 years. If we wrote this as a “non-quanto” option payoff, it would be P \times \frac{\$ N225_{final}-\$ N225_{initial}}{\$ N225_{initial}}, where \$ N225 is the Nikkei-225 index expressed in US dollars. If the \yen N225 index were 22,500, then the \$ N225 index is currently \frac{\yen N225}{Yen/Dollar} = \frac{22,500}{112} \approx 201. You would convert the index to US dollars after 7 years at the “then” Yen-dollar rate, to compute the “final” \$ N225 index value, which you would plug into the formula.

If  you buy a “quanto” option, you bear no exposure to the vagaries of the FX rate between the US dollar and the Japanese yen, so it is easy to explain and sell to investors. Just look at the first payoff formula above.  The second payoff formula, though natural, is a more complex formula.
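A side-by-side sketch of the two payoff formulas may help. The final index level (27,000) and final exchange rate (100 yen per dollar) are made-up scenario values; the floor at zero reflects my reading of "keeping only the increase", and the function names are illustrative.

```python
def quanto_return(n_initial_yen, n_final_yen, participation=1.0):
    """Quanto payoff: relative rise of the index in yen, floored at zero."""
    rise = max(n_final_yen - n_initial_yen, 0.0)
    return participation * rise / n_initial_yen


def dollar_return(n_initial_yen, n_final_yen, fx_initial, fx_final, participation=1.0):
    """Non-quanto payoff: relative rise of the index converted to dollars.

    fx_initial and fx_final are exchange rates in yen per dollar.
    """
    usd_initial = n_initial_yen / fx_initial
    usd_final = n_final_yen / fx_final
    rise = max(usd_final - usd_initial, 0.0)
    return participation * rise / usd_initial


# Initial values from the text; final values are a hypothetical scenario.
n0, fx0 = 22_500.0, 112.0
n1, fx1 = 27_000.0, 100.0
print("dollar index today:", n0 / fx0)
print("quanto payoff:", quanto_return(n0, n1))
print("non-quanto payoff:", dollar_return(n0, n1, fx0, fx1))
```

In this scenario the yen strengthens while the index rises, so the non-quanto holder earns more than the quanto holder; a weakening yen would do the opposite. That difference is the FX exposure the quanto buyer has handed to someone else.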

However, as you should know, in finance, if there is a risk in the activity that you do, but you find that you don’t bear this risk in the instrument you have bought, it is because someone else has (presumably without your knowledge) bought this risk from you and has paid (much) less than what it is worth, through the assumptions used in pricing the instrument you just bought.

It turns out that the option pricing formula invented by Fischer Black, Myron Scholes and Robert Merton can be expanded to value these sorts of “quanto” options. The formula depends on some extra parameters. One of these is the volatility (standard deviation per year) of the Yen-dollar exchange rate. The other is the correlation between two quantities – the \# Yen / Dollar and \# Yen / N225 \: index. That graph might look like this (not real data, but a common observation for these correlations).

Correlation JPYUSD vs JPYNikkei

You are asked to buy this correlation, in competition with others. How much would you pay? If you were in an uncompetitive environment, you might “buy” this correlation  at -100 \%. If you heard that someone paid  -30 \%, would you think it makes sense?

How seriously should one take this correlation? Consider the cases considered in this fantastic post. A correlation between Manoj “Night” Shyamalan’s movies and newspaper reading? Really? What correlations are sensible and what should we pay less heed to?

The idea of p-values answers the first question. The way to think about the miracle drug is this – suppose you did nothing and you assume (from your prior experience) that the results of doing nothing are – the probability of a patient dying of the deadly disease is p = 0.4, i.e., the probability of survival is 1- p  = 0.6. Then, if you assume that the patients live or die independent of each other, what is the probability that out of a pool of 10 patients, exactly 7, 8, 9 or 10 people would survive? Well, that would be (it’s called the p-value)

{10 \choose 7} (0.6)^7 (0.4)^3 + {10 \choose 8} (0.6)^8 (0.4)^2 +{10 \choose 9} (0.6)^9 (0.4)^1 +{10 \choose 10} (0.6)^{10} (0.4)^0 = 0.38

You might choose to add up the probability that you might get a result of 5 survivals and lower too (in case you are interested in a deviation of 1 or more from the average, rather than just a higher number).

{10 \choose 5} (0.6)^5 (0.4)^5 + {10 \choose 4} (0.6)^4 (0.4)^6 +{10 \choose 3} (0.6)^3 (0.4)^7 +{10 \choose 2} (0.6)^2 (0.4)^8 +{10 \choose 1} (0.6)^1 (0.4)^9 +{10 \choose 0} (0.6)^0 (0.4)^{10} = 0.37

The sum of these two (called the symmetrical p-value) is 0.75, i.e., there is 75% probability that such (and even more hopeful) results are explainable by the “null hypothesis”, that the miracle drug had absolutely no effect and that the disease simply took its usual course.
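The two sums above are binomial tail probabilities, and they are easy to reproduce. This is a minimal sketch with illustrative function names, using `math.comb` from the standard library.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k survivors out of n, each surviving with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """Probability of k or more survivors."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


def lower_tail(k, n, p):
    """Probability of k or fewer survivors."""
    return sum(binom_pmf(j, n, p) for j in range(0, k + 1))


p_survive = 0.6  # null hypothesis: the drug does nothing
up = upper_tail(7, 10, p_survive)
down = lower_tail(5, 10, p_survive)
print("7 or more survive:", round(up, 2))
print("5 or fewer survive:", round(down, 2))
print("symmetrical p-value:", round(up + down, 2))
```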

If we repeated the same test with 1000 patients, of whom 700 survived, this has a dramatically different result. The same calculations would yield

{1000 \choose 700} (0.6)^{700} (0.4)^{300} + {1000 \choose 701} (0.6)^{701} (0.4)^{299} + \ldots + {1000 \choose 1000} (0.6)^{1000} (0.4)^0 \approx 3 \times 10^{-11}

Notice how small this number is. If you also add the probability of repeating the experiment and getting 500 or fewer survivals, that would be \approx 10^{-10}.

The symmetrical p-value in this case is \approx 10^{-10}. Consider how tiny this is compared to the 0.75 number we had before. This is clearly a rather effective drug!
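For 1000 patients the individual terms are too small to handle comfortably in floating point (powers like 0.4^{1000} underflow), so the sketch below does the sum in exact integer arithmetic and divides only at the end. Writing the survival probability as the fraction 3/5 and the function name `tail_exact` are my choices for illustration.

```python
from math import comb


def tail_exact(n, k_lo, k_hi, p_num=3, p_den=5):
    """Probability of between k_lo and k_hi survivors (inclusive) out of n.

    The survival probability is the exact fraction p_num / p_den (3/5 = 0.6).
    All terms are summed as integers; Python's int division returns a
    correctly rounded float even when both operands are astronomically large.
    """
    q_num = p_den - p_num
    numerator = sum(
        comb(n, k) * p_num**k * q_num ** (n - k) for k in range(k_lo, k_hi + 1)
    )
    return numerator / p_den**n


upper = tail_exact(1000, 700, 1000)  # 700 or more survive
lower = tail_exact(1000, 0, 500)     # 500 or fewer survive
print("upper tail:", upper)
print("lower tail:", lower)
print("symmetrical p-value:", upper + lower)
```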

The p-value is just the total probability that the “null hypothesis” generates the observed event or anything even more extreme than observed. Seems reasonable, doesn’t it? If this p-value is less than some lower threshold (say 0.05), you might decide this is acceptable as “evidence”. The \frac{700}{1000} test appears as if it proves that “XYZ” is an excellent “miracle” drug.

Next, we come to the underside of p-values. It’s called p-hacking. Here’s a simple way to do it. Consider the test where you obtained a \frac{7}{10} result. Let’s say you decided, post-hoc, that the last person that died, actually had a fatal pre-existing condition that you didn’t detect. No autopsies were performed, so that patient might well have died of the condition. In that case, maybe we should exclude that person from the 10 people who were in the survey? And one other guy that died had a really bad attitude, didn’t cooperate with the nurses, maybe didn’t take his medication regularly! We should exclude him too? So we had 7 successful results out of 8 “real” patients. The p-value has now dropped to 0.106 for the 7 and above case and 0.17 for the 3 and below case, for a total p-value of about 0.28. Much better! And we didn’t have to do any work, just some Monday morning quarter-backing. Wait, maybe that is exactly what Monday morning quarter-backing is.
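To watch the p-value slide as inconvenient patients are dropped, here is a small sketch that recomputes the tail probabilities for 7 survivors out of 10, 9 and 8 patients. The 9-patient row is an intermediate step I added for illustration; the helper name is hypothetical.

```python
from math import comb


def tail(n, k_lo, k_hi, p=0.6):
    """Probability of between k_lo and k_hi survivors (inclusive) out of n."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_lo, k_hi + 1))


# Same 7 survivors, progressively fewer "real" patients.
for n_patients in (10, 9, 8):
    print(n_patients, "patients: P(7 or more survive) =", round(tail(n_patients, 7, n_patients), 3))

# The lower tail used in the text for the 8-patient case.
print("8 patients: P(3 or fewer survive) =", round(tail(8, 0, 3), 3))
```

Nothing about the drug changed between the rows; only the bookkeeping did.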

Another example of p-hacking is one that I gave in this post. For convenience, I reproduce it here –

Imagine you were walking around in Manhattan and you chanced upon an interesting game going on at the side of the road. By the way, when you see these games going on, a safe strategy is to walk on, since they usually reduce to methods of separating a lot of money from you in various ways.

The protagonist, sitting at the table tells you (and you are able to confirm this by a video taken by a nearby security camera run by a disinterested police officer), that he has managed to toss the same quarter (an American coin) thirty times and managed to get “Heads” {\bf ALL} of those times. And it was a fair coin!

Next, your good friend rushes to your side and whispers to you that this guy is actually one of a really large number of people (a little more than a billion) that were asked to successively toss freshly minted, scrupulously clean and fair quarters. People that tossed tails were “tossed” out at each successive toss and only those that tossed heads were allowed to toss again. This guy (and one more like him) were the only ones that remained.

What if the number of coin tosses was 100 rather than 30, with a larger number of initial subjects?

Clearly, you would be p-hacked if you ignored your friend.
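The numbers behind the story, as a sketch: the chance that one person tosses 30 heads in a row, and how many people you expect to be left standing if a crowd of 2^30 (a little more than a billion) plays the elimination game. The function names are illustrative.

```python
def prob_all_heads(tosses):
    """Probability that a single fair coin shows heads on every one of `tosses` tosses."""
    return 0.5**tosses


def expected_survivors(people, tosses):
    """Expected number of people still in the game after `tosses` rounds of elimination."""
    return people * prob_all_heads(tosses)


crowd = 2**30
print("crowd size:", crowd)
print("P(30 heads in a row) for one person:", prob_all_heads(30))
print("expected survivors after 30 tosses:", expected_survivors(crowd, 30))
print("crowd needed for one expected survivor after 100 tosses:", 2**100)
```

Seen alone, the survivor looks miraculous; seen as one of the crowd, he is exactly what the null hypothesis predicts.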

p-values are used throughout science, but it is desperately easy to p-hack. It still takes a lot of intellectual honesty and, yes, seat of the pants reasoning and experience to know when you are p-hacking and when you are simply being rational in ignoring certain classes of data.

The q-value is a quantity that describes when a correlation is outside the bounds of the “null hypothesis” – for instance, one might have an economic reason why the fx/equity index correlation is a certain number. Maybe it is linked to the size of trade in/out-flows, tariff structure, the growth in the economy and other aspects. But then, it moves around a lot and clearly follows some kind of random process – just not the one described by the binomial model. It would clarify a lot of the nonsense that goes into pricing and estimating economic value in products such as quanto options.

More on this in a future post.

Front image : courtesy Hilda Bastian, from this article