A Whole-Brain Look at the Challenges of Measurement and Motivation
Note: This is Part 3 in a four-part series on the challenges of being data-driven. To understand the purpose of this series and read from the beginning, start with Part 1.
In Part 1 and Part 2 of this series, I detailed data-driven decision-making using the Blue (The What) and Green (The How) aspects of the Whole-Brain Model. I discussed challenges in choosing what to measure and how to measure it.
Part 3 of this series is where the rubber meets the road. A far greater challenge lies in how we interpret metrics once they’re gathered.
The Why — Human Fallibility and Interpretation
OK, you’ve decided what to measure, and you’ve figured out how to gather your data. It’s all downhill from here, right? The answer would be yes if people were rational, but the level of reason in decision-making is a lot like my level of tolerance in homeschooling during the coronavirus lockdown — it depends on how fired up my emotions are.
We know our emotions play an outsized role in our decision-making when we answer questions with the phrase “I feel like…” We feel our answers, thoughts and beliefs more often than we think them.
Our Biases
As human beings, our brains are not wired to be rational. If they were, no one would buy a lottery ticket or believe in rally caps during the 9th inning. We carry significant biases that cloud our objective interpretation of the world around us. I’ve written previously about confirmation bias and its impacts on rational thinking. A perfect example of this is the differing opinions on the seriousness of COVID-19 between the Fox News and MSNBC crowds — the statistics on mortality and transmission rates are the same for both audiences. Still, they yield different interpretations based on tribal mentality.
It’s not just confirmation bias, though. There are multiple types of biases at play when we’re interpreting information and forming our beliefs:
Hindsight Bias — In “Fooled By Randomness,” Nassim Nicholas Taleb describes a theory about the role of emotions in behavior from Joseph LeDoux, a neuroscientist at NYU’s Center for Neural Science. The theory says the connections from the emotional systems to the cognitive systems are stronger than the connections from the cognitive systems to the emotional systems. The implication is that we “feel” emotions first, then find an explanation for them.¹
In essence, we’re really good at finding a narrative that satisfies our emotions and fits with the events around us. Sports enthusiasts might call this armchair quarterbacking. In a business context, when KPIs for a particular strategy are measured, it becomes easy to say, “I knew it would happen,” no matter which direction the KPIs move and despite the uncertainty of the strategy ahead of time.
Attribution Bias — Once again, in “Fooled By Randomness,” Taleb brings up a convenient argument people make, especially those in leadership positions. We attribute our successes to skill, but our failures to randomness or “bad luck.” In describing why we behave this way, Taleb says “It is a human heuristic that makes us actually believe so in order not to kill our self-esteem and keep us going against adversity.”²
If our KPIs trend positively, we can’t be too quick to pin the result on our great skill or ideas. We have to ask critical questions to make sure the results we’re observing are actually attributable to our efforts. Similarly, if our KPIs trend negatively, we shouldn’t be too quick to dismiss the result as chance or bad luck when our efforts might be directly causing the outcome.
Here’s a good example of attribution bias. I am an occasional poker player. I enjoy the strategies and have gained some skills in a game that features both skill and luck. When I happen to place well in a tournament, I may brag to my wife about the money I “earned” based on my good play, but when I lose, I may point out a “bad beat” on a particular hand, which is a situation when I’m in a dominant position but lose due to an opponent drawing a winning card. I wonder how many bad beats I gave out on my way to past victories?
Publication Bias — Here is an interesting situation from the book “Naked Statistics,” by Charles Wheelan, that occurs in the world of medical publishing. In 2011, the Wall Street Journal ran a front-page story on what it described as one of the “dirty little secrets” of medical research, stating that “most results, including those that appear in top-flight peer-reviewed journals, can’t be reproduced.”
One of the reasons cited for this was positive publication bias. Let’s say 100 studies are performed about a topic — one of them is likely to turn up a positive statistical result based solely on chance. The problem is that the 1 study with positive results will get published, while the 99 other studies with contradictory results will go unpublished or get ignored. An example of this involved depression drugs. The New York Times ran an article stating, “The makers of antidepressants like Prozac and Paxil never published the results of about a third of the drug trials that they conducted to win government approval, misleading doctors and consumers about the drugs’ true effectiveness.” Ninety-four percent of the studies with positive results were published, while only 14% of studies with negative results were published.
Wheelan says that researchers, either consciously or unconsciously, sway the results, whether because of a strongly held belief or because a positive result would be “better” for their careers. Returning to the Wall Street Journal article, John Ioannidis, a Greek doctor, examined 49 studies published in three prominent medical journals. Each study had been cited in the medical literature at least 1,000 times, yet about one-third of the research was subsequently refuted by later work.³
When interpreting results, even those in top-flight journals, we must look beyond the outcome of a single study and consider it in the context of similar studies, even if those other results are hard to find.
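To make the filtering mechanism concrete, here is a minimal sketch of the 100-studies thought experiment above. The 1% per-study chance of a spurious positive is an illustrative assumption on my part, chosen to mirror the one-in-a-hundred framing, not a figure from the book.

```python
# A minimal sketch of positive publication bias: each study of a treatment
# with no real effect has a small chance of producing a spurious positive
# result, and only the positives get published. The 1% per-study rate is an
# illustrative assumption, not a figure from "Naked Statistics".
import random

num_studies = 100
spurious_positive_rate = 0.01

results = [random.random() < spurious_positive_rate for _ in range(num_studies)]
published = sum(results)               # positive studies that make it into journals
file_drawer = num_studies - published  # negative studies that go unpublished or ignored

print(f"Studies run: {num_studies}")
print(f"Published (positive): {published}")      # typically around 1
print(f"Unpublished (negative): {file_drawer}")  # typically around 99
# A reader of the published literature sees only the positives and may
# conclude the treatment works, even though it does nothing.
```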
Expert Bias — In 2005, the Canadian-born psychologist Philip Tetlock published a book titled “Expert Political Judgment.” It is a dense, scientific look at roughly 27,500 forecasts, gathered over 18 years and spanning far more than politics, that quantifies just how poorly experts fare at forecasting.
How can experts, who rely on loads of metrics to make forecasts, get things wrong when they’re supposed to be the most knowledgeable about their domains? For starters, they are less likely to recognize their own biases. If someone has dedicated their career to a subject area, they may not consider alternative ideas that don’t fit their views.
But a larger reason for bad predictions has to do with the complexity of the domain being forecasted. Areas like national economies and the stock market are complex, adaptive systems with many interacting variables, and randomness plays a significant role in their outcomes. That makes it harder for forecasts to rely on expertise or skill alone.
Daniel Kahneman and Gary Klein, two of the world’s leading psychologists, have differing views on the role of intuition in expertise. Kahneman believes people tend to rely too much on intuition, leading to predictable mistakes, and thus he is skeptical of expert forecasts. Klein, by contrast, studied expert intuition in fields like medicine and found that intuition informed by lots of experience can lead to good decision-making. While the psychologists have opposing views on expertise and intuition, they agreed on one thing in their joint paper, “Conditions for Intuitive Expertise: A Failure to Disagree.” They concluded expertise is only valid in narrow conditions, where cause and effect are clear and consistent and where practice produces clear and reliable feedback.⁴
If we’re looking at metrics — especially forecasts — in a complex domain, we have to balance the interpretations of data from experts with the degrees of chance or randomness that may nullify their predictions.
Other Brain Games
Our emotional wiring isn’t the only way our brains affect our interpretation of metrics. Sometimes our basic intellectual faculties fail us. Here is a look at some common mistakes.
The Causation Conundrum
Statisticians love to remind us that “correlation does not mean causation.” For us left-brain people, analysis is a process of linear thinking, but cause and effect in data aren’t always linear. For example, my children practice soccer every day, but that doesn’t mean they improve in even increments during each practice. Instead, their improvement shows up as long plateaus of skill punctuated by ah-ha moments and sudden leaps in capability, more like waves of progress than a steady climb.
According to Taleb, in “Fooled by Randomness,” “Our brain is not structured for nonlinearities…our emotional apparatus is designed for linear causality.”⁵ Michael J. Mauboussin, in “The Success Equation,” puts it this way: psychologists have figured out that our minds use shortcuts to learn. Because we learn by experience, and we cannot possibly experience everything, our minds leverage cause and effect to bridge the gap. That’s why we use stories and narratives — they are digestible versions of cause and effect.⁶
When we see metrics showing a correlation between two variables, we must be careful not to assume one causes the other. Wheelan gives a simple example of this in “Naked Statistics” by showing a correlation between height and weight, meaning the taller a person is, the more they tend to weigh. However, we know that height doesn’t cause weight.⁷
Mauboussin gives a more nuanced example in “The Success Equation,” provided by the psychologist Daniel Kahneman. Ask someone to explain the following true statement: “highly intelligent women tend to marry men who are less intelligent than they are.” The initial reaction will be to find some sort of cause and effect behind this rather provocative statement. But then make a second statement: “The correlation between the intelligence scores of spouses is less than perfect.” This has the same meaning as the first statement, but it is trivial and far less controversial, so it elicits much less of an urge to find a cause.⁸
We must be especially careful about inferring causation when interpreting metrics that involve highly complex systems. If, for example, a stock price rises six months after the implementation of a CEO’s new strategy, we cannot say definitively that the strategy caused the rise in price, because so many other factors could have played a role.
Putting Too Much Weight on the Noise
There is a lot of randomness and variation in our world, and sometimes we misinterpret this noise as something more concrete than it is. Taleb gives an example of this — say you have an investment with an expected 15% annual return and 10% volatility. You would have a 93% chance of success in a given year, but only a 54% chance of success on any given day. That’s a lot of daily noise, and it leads to misinterpretation of trends if we look at too short a timeframe.⁹
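For readers who want to see where numbers like these come from, here is a rough sketch of the arithmetic. It assumes returns are normally distributed and that a year has roughly 252 trading days; both are my assumptions for illustration, not details spelled out in Taleb’s example.

```python
# A rough sketch of the noise example: an investment with a 15% expected
# annual return and 10% annual volatility. Assumptions (mine, for illustration):
# returns are normally distributed and a year has ~252 trading days.
from math import erf, sqrt

def prob_positive(mean, sigma):
    """Probability that a normally distributed return is greater than zero."""
    z = mean / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))

annual_mean, annual_sigma = 0.15, 0.10
daily_mean = annual_mean / 252
daily_sigma = annual_sigma / sqrt(252)  # volatility scales with the square root of time

print(f"Chance of a positive year: {prob_positive(annual_mean, annual_sigma):.0%}")  # ~93%
print(f"Chance of a positive day:  {prob_positive(daily_mean, daily_sigma):.0%}")    # ~54%
```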
There is a parallel to the news cycle here. Sometimes (COVID-19 coverage included) watching the daily news can be head-spinning and jarring — murders, crime, war, a Lakers loss — and leave us with the impression that things are really bad. But if we zoom out and look at generational trends, we’re on a much more positive track as a society. Hans Rosling dedicated an entire book, “Factfulness,” to this subject. Among the many positive global trends he highlighted: the percentage of people living in extreme poverty has been cut in half in the last 40 years. This is hardly what we would expect from listening to the noise on the daily news.
When looking at KPIs, be sure to evaluate the noise in the numbers, especially when they exhibit significant variation.
Reversion to the Mean
Most of us have heard of the Sports Illustrated jinx. If an athlete graced the cover of Sports Illustrated (back in the old days when we turned pages instead of swiping right), they tended to have a poor subsequent season. There’s a perfectly reasonable explanation for this. The athlete must have had an exceptional season to appear on the cover, so it stands to reason their performance is likely to move back toward average the following year.
This is simply reversion to the mean in action. When we see a metric overperforming, we tend to assume the good news will beget more good news, when the metric may simply be riding randomness and may revert to the average through nothing more than the end of its run of good luck. Financial investing is full of examples — because of the randomness in returns, a money manager may have a period of good returns followed by a gradual degradation of performance. Eddie Lampert comes to mind; many thought he would become the next Warren Buffett. He had stellar returns as an investor and purchased Sears and other declining retail chains much the same way that Buffett purchased declining textile mills early in his career. But time and reversion caught up to Lampert, and today he’s sitting on a pile of increasingly worthless equity in Sears.
In “The Success Equation,” Mauboussin describes a 1933 book by Horace Secrist called “The Triumph of Mediocrity in Business,” which argued that mediocrity tends to be highly prevalent in the conduct of business. Both expenses and profits approach the mean — in the presence of competition, advantageous positions eventually erode. Mauboussin reminds us that when the correlation coefficient between successive results is low, reversion to the mean is very powerful.¹⁰ If I throw a 100 mph fastball, I can assume a high correlation between my skill and strikeouts and expect my performance to keep excelling as long as my arm doesn’t give out. If I play Old Maid with my kids, I can assume a low correlation between my strategy and the outcome. If I win the first game, there’s a good chance the old lady will end up in my hand in the next one.
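One way to see how the correlation coefficient governs the strength of reversion is a simple shrinkage-toward-the-mean estimate. The function and the numbers below are my own illustration, not a formula quoted from Mauboussin’s book.

```python
# A simple illustration of how the correlation coefficient governs reversion
# to the mean. The shrinkage estimate below is a standard statistical rule of
# thumb, not a formula quoted from "The Success Equation"; the numbers are made up.
def expected_next(observed, average, correlation):
    """Expected next-period result, pulled toward the average as correlation falls."""
    return average + correlation * (observed - average)

# A fund returns 30% in a year when the long-run average is 8%.
print(expected_next(30, 8, correlation=0.9))  # ~27.8 -> mostly skill, little reversion expected
print(expected_next(30, 8, correlation=0.1))  # ~10.2 -> mostly luck, strong reversion expected
```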
Sometimes We Just Get It Wrong
As if our emotions and flawed thinking didn’t make it hard enough to be data-driven, sometimes we’re just plain wrong in our interpretation because we don’t understand the numbers.
Taleb gives a great example of this in “Fooled by Randomness” by describing a famous quiz given to doctors to see if they have a good understanding of probability. Here is the scenario:
- A test for a disease has a false-positive rate of 5%
- The disease infects 1 in 1,000 people in the population
- People are tested at random, regardless of whether they show symptoms of the disease
If a patient tests positive, what is the probability of the patient actually having the disease? Most people think the answer is 95%, based on the 5% false-positive rate. But the real answer is about 2%! Here’s why: thinking in terms of expected values, if 1,000 people take the test, one will have the disease (and, assuming the test catches every true case, will test positive). About fifty of the rest (5%) will get a false positive. The equation to calculate the probability of actually having the disease given a positive test is:
Number of people who actually have the disease / Total number of people testing positive, or
1 / 51 = 1.96%¹¹
To be honest, I didn’t get the answer right when I first read this (I chose 95%). It is an excellent example of a situation in which I simply misinterpreted the data.
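For anyone who wants to double-check the arithmetic, here is a quick sketch that mirrors the reasoning above. It makes the same simplifying assumptions as the example: the test has no false negatives, and the 5% false-positive rate is applied to the whole group of 1,000, giving the round figure of 50.

```python
# A quick check of the quiz arithmetic. Assumption (implicit in the example):
# the test has no false negatives, so the one true case always tests positive.
population = 1000
prevalence = 1 / 1000        # 1 in 1,000 people has the disease
false_positive_rate = 0.05   # 5% of people without the disease test positive anyway

true_positives = population * prevalence             # 1 person
false_positives = population * false_positive_rate   # ~50 people (the example's rounding)

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(f"P(disease | positive test) = {p_disease_given_positive:.2%}")  # ~1.96%
```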
Here’s one more great example from “Naked Statistics.” Wheelan discusses the Monty Hall problem, from the TV show “Let’s Make a Deal.” Anyone Gen-X or older will remember the show. You’re given three doors to choose from; behind one is a great prize, and behind the other two are goats. After you make your first selection, Monty Hall does something tricky — he reveals a goat behind one of the other two doors (that you didn’t choose) and asks if you want to change your selection. Here’s the question — should you take him up on his offer?
The answer is yes. By switching to the other unopened door, you increase your odds of making the right selection from 1/3 to 2/3. Here’s how: when you choose the first door (say, door 1), you have a 1 in 3 chance of being right. After Monty Hall reveals the goat behind one of the other doors (say, door 2), switching your selection to door 3 is the same as if you had been allowed to choose both doors 2 and 3 from the start. That gives you a 2 in 3 chance of making the right selection.¹²
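If the two-thirds result still feels wrong, a quick simulation makes it hard to argue with. Below is a minimal Monte Carlo sketch of the game; the door numbering and trial count are arbitrary choices for illustration.

```python
# A minimal Monte Carlo sketch of the Monty Hall problem: play many games and
# compare the win rate of staying with the first pick vs. switching doors.
import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        choice = random.randrange(3)
        # Monty opens a door that is neither the player's choice nor the prize
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Switch to the one remaining unopened door
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

print(f"Win rate if you stay:   {play(switch=False):.3f}")  # ~0.333
print(f"Win rate if you switch: {play(switch=True):.3f}")   # ~0.667
```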
Summary
What is the moral of the story? Even when we pick the right thing to measure, and even when we calculate the metric correctly, we still risk not benefiting from our good work if we misinterpret the results. For everyday metrics involving simple calculations like click rates or sales trends, this risk is low. But not all metrics are so simple, so hopefully this article will help you navigate the challenges of interpretation on your data-driven journey.
In the last article in this series, Part 4, we’ll tie together everything we’ve learned in Parts 1, 2 and 3 and explore the holy grail of data-driven decision-making: changing motivations and behaviors based on metrics.
Bibliography
1. Fooled by Randomness, Nassim Nicholas Taleb — Chapter 11
2. Fooled by Randomness, Nassim Nicholas Taleb — Chapter 13
3. Naked Statistics, Charles Wheelan — Chapters 12 and 7
4. The Success Equation, Michael J. Mauboussin — Chapter 8
5. Fooled by Randomness, Nassim Nicholas Taleb — Chapter 10
6. The Success Equation, Michael J. Mauboussin — Chapter 11
7. Naked Statistics, Charles Wheelan — Chapter 11
8. The Success Equation, Michael J. Mauboussin — Chapter 10
9. Fooled by Randomness, Nassim Nicholas Taleb — Chapter 3
10. The Success Equation, Michael J. Mauboussin — Chapter 10
11. Fooled by Randomness, Nassim Nicholas Taleb — Chapter 11
12. Naked Statistics, Charles Wheelan — Chapter 5½