Falsification, Experimentation, and Popper

Author: Ishan Goel

Let us start with a game. The first player, let’s call her Alice, thinks of a rule that is used to generate triplets: a, b, and c. To start the game, Alice gives an instance that conforms to the rule, let’s say, 2, 4, and 6. The second player, Bob, now has 10 chances to propose triplets, and for each one Alice says Yes or No depending on whether the triplet satisfies the rule. At any point, Bob can also use one of his 3 chances to guess the rule. The game goes on until Bob exhausts all his chances or guesses the correct rule. The table below shows the gameplay. The first two columns show the triplets proposed by Bob and Alice’s answers, and the last two columns show the guesses Bob made and whether they were correct.

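| Combination | Satisfies the Rule? | Guess | Rule Identified? |
|---|---|---|---|
| 2,4,6 (initial by Alice) | | | |
| 4,6,8 | Yes | | |
| 6,8,10 | Yes | | |
| 8,10,12 | Yes | | |
| | | Successive even numbers? | No |
| 3,5,7 | Yes | | |
| 7,9,11 | Yes | | |
| 13,15,17 | Yes | | |
| | | Arithmetic Progression of +2 | No |
| 13,14,15 | Yes | | |
| 13,17,18 | Yes | | |
| 13,12,14 | No | | |
| | | Sequence of Increasing Numbers | Yes |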

Observe the combinations generated by Bob and try to think why Bob chose them. Bob’s gameplay highlights one of the most common mistakes that players make in this game. Players hypothesize a rule, and as soon as a combination fits, they generate more and more triplets that fit the rule they have in mind. Once they feel they have enough evidence, they make their guess. In trying to confirm the rule they have in mind, they never try a triplet that contradicts it. More often than not, a single piece of counter-evidence tells you far more than a long string of confirmations.
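The point about confirming versus contradicting triplets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`alice_rule`, `bob_hypothesis`, `count_refutations`) are hypothetical, the hidden rule is the one revealed in the table (increasing numbers), and Bob's hypothesis is his first guess (successive even numbers). A triplet can only teach Bob something when his hypothesis and Alice's answer disagree on it.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[int, int, int]


def alice_rule(t: Triplet) -> bool:
    """Alice's hidden rule, as revealed in the table: increasing numbers."""
    a, b, c = t
    return a < b < c


def bob_hypothesis(t: Triplet) -> bool:
    """Bob's first guess: successive even numbers."""
    a, b, c = t
    return a % 2 == 0 and b == a + 2 and c == b + 2


def count_refutations(
    tests: List[Triplet],
    hypothesis: Callable[[Triplet], bool],
    rule: Callable[[Triplet], bool],
) -> int:
    """Count the triplets where the hypothesis predicts the wrong answer."""
    return sum(1 for t in tests if hypothesis(t) != rule(t))


# Triplets chosen because they FIT the hypothesis (Bob's first three moves).
confirming_tests = [(4, 6, 8), (6, 8, 10), (8, 10, 12)]

# Triplets chosen because the hypothesis says they should FAIL.
probing_tests = [(3, 5, 7), (13, 14, 15), (13, 12, 14)]

confirming_refutations = count_refutations(confirming_tests, bob_hypothesis, alice_rule)
probing_refutations = count_refutations(probing_tests, bob_hypothesis, alice_rule)

print("Refutations from confirming tests:", confirming_refutations)
print("Refutations from probing tests:", probing_refutations)
```

The confirming triplets agree with both the hypothesis and the real rule, so they can never expose the wrong guess. Two of the three probing triplets get a Yes from Alice even though the hypothesis predicted No, which is enough to discard the hypothesis.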

In the words of Nassim Nicholas Taleb, one of the best ways to decipher how a machine works is by trying to break it. But we humans rarely do that. We observe the world, hypothesize the patterns that we see, and then go on iteratively searching for confirmatory evidence that strengthens our beliefs. We tend to seek confirmations but rarely seek contradictions. We fail to realize that searching for counter-evidence and iteratively updating our beliefs is a much faster way to learn. This idea has deep implications for developing the right mindset and the right culture of experimentation.

The 20th-century philosopher Karl Popper took this idea to its logical conclusion by stating that anything that is unfalsifiable is not scientific at all. In other words, for a hypothesis to be scientific, there should be some possible evidence that can prove it wrong. Karl Popper’s theory came to be known as The Theory of Falsification, and in this blog post, I explain the theory and its implications for scientific experimentation.


The Theory of Falsification

Sir Karl Raimund Popper was a social, political, and scientific philosopher who tried to find the line that separates scientific theories from non-scientific ones. He concluded that what separates them is whether a theory can potentially be disproved. Observe the two statements below:

  1. The sun rises every morning.
  2. There is life after death.

The first statement is a cosmological truth that has been confirmed over and over again for thousands of years. Still, if the sun does not rise tomorrow, you will be forced to part ways with the belief, and the millions of days that confirmed it will be rendered useless. The first statement is hence a scientific statement because it can potentially be disproved.

The second statement, interestingly, is unfalsifiable. If you argue for it being true, you might be able to gather countless stories of resurrection, but there is no observation that could ever disprove it. Such a statement is what Popper called unfalsifiable and hence non-scientific.

The scientific process hence relies on developing hypotheses and then finding ways to falsify them by designing an experiment. An experiment is much more useful if designed to contradict existing beliefs rather than to confirm them. The theory of falsification hence has countless implications in modern experimentation.

Implications in Modern Experimentation

The theory of falsification is a broader mindset that has influenced many philosophical ideas. In this section, I list some of its implications for experimentation that are not yet widely recognized and appreciated in the experimentation community.

  1. An experiment result can never be 100% trustworthy: This often seems counterintuitive to experimenters, but it has always been the norm in science. Countless scientific results are invalidated decades after being published, and many experiment results are likewise found to be wrong due to unseen factors. The same rule applies to modern-day A/B testing. There is no checklist that will ensure that the results of an A/B test are 100% reliable and cannot be invalidated by any future knowledge. Rather than trying to ensure that the result of an A/B test is perfectly reliable, one should invest energy in trying to invalidate A/B testing results wherever possible. All A/B tests that survive such scrutiny are more likely to be trustworthy.
  2. Human intuition can often be unfalsifiable: Human intuition and beliefs about things are often unfalsifiable. In such situations, we run experiments only to confirm our beliefs, and if an experiment returns a surprising result, we are more likely to reject the result as unreliable. While data and A/B testing can also be flawed, we need to understand the source of our beliefs and identify the ones that are fundamentally unfalsifiable. A healthy skepticism should be developed, and things that survive stronger scrutiny should generally be considered more reliable. A culture of experimentation and a data-driven mindset should be built around contradicting human intuition, not confirming it.
  3. Experimentation should be driven by surprises: If you know something will work, and it works, you haven’t learned anything new. Similarly, if you know something will not work, and it does not work, you haven’t learned anything new. Successful experimentation thrives on surprising results, and surprises should be carefully studied and validated by demanding stronger evidence. In general, your most valuable learnings will come when experiments do not go in line with expectations. Make sure that such evidence is highlighted and investigated further, because surprises hold valuable insights.
  4. Don’t be afraid to break things when experimenting: When you make a change that causes a statistically significant reduction in core metrics, it is scary from a business perspective. However, it is a goldmine of learning from the scientific perspective. Most changes that you make on a website will not impact your core metrics in any direction, so when you make a change that actually reduces core metrics, you have found a causal link that will be useful in understanding customer behavior. If the stakes are large, an experiment can always be run on a small proportion of visitors, and harmful changes can always be reverted. True learning always comes at a cost and keeps giving a return long after the costs have been covered. Build the courage to let an experiment run its whole course even if it shows a reduction in core metrics.
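To make "a statistically significant reduction in core metrics" concrete, here is a minimal sketch using a two-sided, pooled two-proportion z-test. It is not the method of any particular A/B testing tool. The function name `two_proportion_z_test` and the visitor and conversion counts are assumptions made up for illustration. The test is two-sided on purpose: it treats a drop in the metric as evidence worth recording, not just a lift.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> Tuple[float, float]:
    """Two-sided pooled z-test for a difference in conversion rates.

    Returns (z, p_value). A negative z means variant B converts worse than A.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: control converts 500 of 10,000 visitors,
# the variant converts 420 of 10,000.
z, p_value = two_proportion_z_test(500, 10_000, 420, 10_000)

if p_value < 0.05 and z < 0:
    print("Significant reduction: a causal link worth studying.")
elif p_value < 0.05:
    print("Significant lift.")
else:
    print("No detectable effect in either direction.")
```

With these made-up counts the variant performs significantly worse. In the spirit of the list above, that outcome is a finding to be replicated and understood, not just a change to be rolled back and forgotten.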

Conclusion

The turkey that is being raised to be served as dinner on Thanksgiving is often raised with great care and generosity by its owner. It is given shelter and fed regularly every day so that it becomes fat and healthy. From the perspective of the turkey, every new day is an empirical confirmation that its guardian is a good person who cares about its survival. Thanksgiving day comes as a surprise to the turkey when the guardian arrives, sharpening his knife. From the perspective of the guardian, there was nothing surprising at all about what happened on Thanksgiving day.

Nassim Nicholas Taleb uses the story of the turkey to tell us that what has not happened yet is much more important than what has been happening all along. Taleb calls such events Black Swan events. Navigate the world through what you cannot see. Try to falsify all that you believe in, so that you are left with the most reliable hypotheses. Don’t be the turkey in the big wide world.
