I am a huge fan of Gerd Gigerenzer who preaches numeracy and uncertainty education. One of Prof. Gigerenzer’s pivotal theses is “Fast and Frugal Heuristics” which is also popularized in his book “Gut Feelings” (listen to this podcast if you don’t want to read the book). I like this approach.
Today, I listened to the latest episode of the Brainfluence podcast that hosted the psychologist Dr. Gleb Tsipursky who wrote an extensive book called “Never Trust your Gut” with a seemingly contradicting thesis. I added this book to my TOREAD list.
This is another post in the series Because You Can. This time, I will claim that the fact that you can put error bars on a bar chart doesn’t mean you should.
It started with a paper by prof. Gerd Gigerenzer whose work in promoting numeracy I adore. The paper, “Natural frequencies improve Bayesian reasoning in simple and complex inference tasks” contained a simple graph that meant to convince the reader that natural frequencies lead to more accurate understanding (read the paper, it explains these terms). The error bars in the graph mean to convey uncertainty. However, the data visualization selection that Gigerenzer and his team selected is simply wrong.
First of all, look at the leftmost bar, it demonstrates so many problems with error bars in general, and in error bars in barplots in particular. Can you see how the error bar crosses the X-axis, implying that Task 1 might have resulted in negative percentage of correct inferences?
The irony is that Prof. Gigerenzer is a worldwide expert in communicating uncertainty. I read his book “Calculated risk” from cover to cover. Twice.
Why is this important?
Communicating uncertainty is super important. Take a look at this 2018 study with the self-explaining title “Uncertainty Visualization Influences how Humans Aggregate Discrepant Information.” From the paper: “Our study repeatedly presented two [GPS] sensor measurements with varying degrees of inconsistency to participants who indicated their best guess of the “true” value. We found that uncertainty information improves users’ estimates, especially if sensors differ largely in their associated variability”.
Also recall the surprise when Donald Trump won the presidential elections despite the fact that most of the polls predicted that Hillary Clinton had higher chances to win. Nobody cared about uncertainty, everyone saw the graphs!
First of all, error bars tend to be symmetric (although they don’t have to) which might lead to the situation that we saw in the first example above: implying illegal values.
Secondly, error bars are “rigid”, implying that there is a certain hard threshold. Sometimes the threshold indeed exists, for example a threshold of H0 rejection. But most of the time, it doesn’t.
More specifically to bar plots, error lines break the bar analogy and are hard to read. First, let me explain the “bar analogy” part.
The thing with bar charts is that they are meant to represent physical bars. A physical bar doesn’t have soft edges and adding error lines simply breaks the visual analogy.
Another problem is that the upper part of the error line is more visible to the eye than the lower one, the one that is seen inside the physical bar. See?
But that’s not all. The width of the error bars separates the error lines and makes the comparison even harder. Compare the readability of error lines in the two examples below
The proximity of the error lines in the second example (take from this site) makes the comparison easier.
Are there better alternatives?
Yes. First, I recommend reading the “Error bars considered harmful” paper that I already mentioned above. It not only explains why, but also surveys several alternatives