I haven’t exactly delivered what I planned to in January. I kept on thinking about the two topics I intended to discuss (quantum mechanical measurements and the future of French in North America) and I just kept on uncovering new subtleties and changing my mind. Whenever I settle, I’ll be sure to let you know!
There is something else that has been on my mind that I wish to discuss. I am becoming increasingly involved in the search for the Higgs boson at the ATLAS experiment, and I have been thinking a lot about blind analyses. I want to write down my thoughts on the subject and share them with you in the process. It is a very important topic and a great exercise in critical thinking.
What do I mean by a blind analysis? Simply this: do not look at the end result before the whole experimental procedure has been devised and executed. In a blind analysis, you are not allowed to change anything in the data analysis once you have looked at the final result. This is not the same as a double-blind trial, although the aim is the same: to protect the science from the biases of the scientists conducting the experiment. It may not be as effective as a double-blind trial, though.
I like to think of science as the art of not fooling ourselves. Such an art requires a deep awareness of all the ways we can be fooled, which is one of the reasons I find psychology fascinating. Psychology tells us a lot about how we think and why we think that way. We have a powerful associative machine in our heads that works without our awareness and allows us to instantly recognize various elements of our environment. This associative machine works with all types of information, whether visual, auditory or conceptual in nature. It is astonishingly accurate, but it operates subconsciously, and it takes only a little familiarity with optical and auditory illusions to know that it is not flawless. We are also wired to recognize some things more readily than others: faces, voices, things that fit our current beliefs. Since the associative machine is automatic and fast, it can take a lot of training to recognize the circumstances in which it can be mistaken. Some illusions, especially the cognitive ones, are more pernicious than others.
In a particle physics analysis, there is a background-only hypothesis and a signal+background hypothesis. In the case of the Higgs boson search, the background-only hypothesis comprises all the physical processes that can happen in proton-proton collisions and that previous experiments have already established as real. These earlier results are succinctly and elegantly summarized in the Standard Model of Particle Physics*. The signal+background hypothesis would show an excess on top of the background somewhere. In order to detect that excess (or rule it out if it doesn't exist), an analysis must be designed to be sensitive in the region where the excess from the signal is expected. It is by carefully accounting for the potential errors and biases, whether statistical, instrumental or theoretical, that we can determine how sensitive we are to a signal. If we are sensitive and we find an excess, we call the signal+background hypothesis the more likely one. If we don't find the signal despite being sensitive, the background-only hypothesis is preferred.
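To make the idea of sensitivity concrete, here is a minimal sketch of a counting experiment. Given an expected signal yield s on top of an expected background yield b, a widely used asymptotic formula estimates the median significance with which the background-only hypothesis would be rejected if the signal is real. The yields below are invented purely for illustration; they are not ATLAS numbers.

```python
import math

def expected_significance(s, b):
    """Asymptotic median significance (in sigmas) for a simple counting
    experiment with expected signal yield s on expected background b."""
    return math.sqrt(2 * ((s + b) * math.log(1 + s / b) - s))

# Hypothetical yields, purely illustrative:
print(expected_significance(10, 100))   # ~ 0.98 sigma: not sensitive
print(expected_significance(50, 100))   # ~ 4.65 sigma: quite sensitive
```

If the expected significance is small, as in the first case, the analysis cannot distinguish the two hypotheses no matter what the data says, and it must be redesigned (tighter selections, more data, better background rejection) before anyone looks at the signal region.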
The problem is that the bias of the experimentalist herself is not taken into account in this procedure. That is why we do blind analyses. Imagine for a moment that an experimentalist looks at the signal region and sees no excess. Since she is expecting to see the Higgs boson, she figures that something must be wrong with her analysis and fiddles with it until the signal appears. This scenario can just as easily be reversed: what if you don't expect to see an excess but get one nevertheless? You figure that maybe you are just underestimating the errors in this region, and you make the excess vanish in your accounting of uncertainties.
The question is: when all is said and done at the end of the analysis, how do you know that you took everything into account correctly? This is why we look at control regions. The background-only hypothesis also predicts how the background will look in regions other than where you expect your signal to be. You can look at these regions and see whether the background-only hypothesis agrees with the data you are analyzing. You can do this at every step of the analysis, looking carefully all around the expected signal region but never touching it. If the data agrees with the background-only hypothesis in every control region you choose, you start to build confidence that you are estimating the background correctly and that your analysis is not introducing unexpected biases.
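A toy version of such a control-region check (region names and all yields invented for illustration) is to compare the observed count in each control region to the background-only prediction and compute a "pull", the deviation in units of the statistical uncertainty. Pulls scattered within roughly ±1 suggest the background model is behaving:

```python
import math

def pull(observed, expected):
    """Deviation of observed from expected, in units of the
    Poisson statistical uncertainty sqrt(expected)."""
    return (observed - expected) / math.sqrt(expected)

# Hypothetical yields in a few control regions around (but excluding)
# the signal region; names and numbers are made up for this sketch.
control_regions = {
    "low_mass_sideband":  (242, 250.0),   # (observed, expected)
    "high_mass_sideband": (195, 180.0),
    "same_sign_leptons":  ( 55,  60.0),
}

for name, (obs, exp) in control_regions.items():
    print(f"{name}: pull = {pull(obs, exp):+.2f}")
```

In a real analysis the uncertainty would also include systematic terms, but the principle is the same: every region except the blinded signal region gets scrutinized this way.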
But what if there is an actual, unexpected signal in one of your control regions, and an unaccounted-for bias hides it? You see, you can never know for sure that your analysis is airtight. You can only acquire so much confidence. This is why we require replication of the results by other experiments.
Unexpected biases are the main reason why outstanding scientific claims get challenged and retracted. We have seen many of these recently. The arsenic-based lifeforms announced by NASA may have been the most dramatic, but there were also claims of an anomalous excess of same-sign muons at the D0 experiment, and of a potential new particle at the CDF experiment, both at the Tevatron. None of these have been seen by the experiments at the LHC, despite their being sensitive to these signatures. CDF and D0 were nevertheless claiming very high statistical significance for both results. I also predict that the faster-than-light neutrinos of the OPERA experiment will soon go down the drain of non-replicable results, that is, if OPERA doesn't find its unaccounted-for error before independent results are announced.
My point with all of this is that statistical significance isn't all there is. We tend to be very impressed by a 5-sigma discovery, which means there is roughly a 1 in 3.5 million chance (one-sided) that an excess at least that large would arise from a statistical fluctuation alone. But statistical fluctuations aren't the only thing that can throw us off. There are also potential errors in the instrumentation, in the theoretical knowledge that goes into precisely stating the background-only hypothesis, and in the psychological biases of the experimentalists. This last one is the only source of error that is not accounted for quantitatively in particle physics. We like to think that, as scientists, we are impervious to psychological biases, but we really are not. Psychological studies have found that even professional statisticians fall victim to the same statistical biases as everyone else in their daily lives.
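For reference, the correspondence between "n sigma" and the one-sided probability of a fluctuation at least that large, under the usual Gaussian approximation, can be computed directly:

```python
import math

def p_value(n_sigma):
    """One-sided tail probability of a standard normal beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

for n in (3, 5):
    p = p_value(n)
    print(f"{n} sigma: p = {p:.2e}  (about 1 in {1 / p:,.0f})")
# 3 sigma: p = 1.35e-03  (about 1 in 741)
# 5 sigma: p = 2.87e-07  (about 1 in 3.5 million)
```

The 3-sigma line is why particle physicists only talk about "evidence" at that level: one-in-741 fluctuations happen all the time when many analyses each look at many places.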
Psychology experiments and medical trials have a wonderful way of reducing that kind of error with double-blind setups. The best we have in particle physics is the hope that there are enough people of opposite biases that the discussion between the two sides will make each side more careful. Nevertheless, the best trick we have for producing accurate results is still the replication of results by independent experiments, which is why we have both the ATLAS and the CMS collaborations looking for the Higgs boson, rather than one single, gigantic, higher-performance experiment.
*Strictly speaking, the Standard Model of Particle Physics does include the Higgs boson, even though it has not yet been verified. So the background-only hypothesis is really the whole Standard Model minus the Higgs boson, while the signal+background hypothesis is the entire Standard Model.