This is another post in the series Because You Can. This time, I will claim that the fact that you can put error bars on a bar chart doesn’t mean you should.
It started with a paper by prof. Gerd Gigerenzer whose work in promoting numeracy I adore. The paper, “Natural frequencies improve Bayesian reasoning in simple and complex inference tasks” contained a simple graph that meant to convince the reader that natural frequencies lead to more accurate understanding (read the paper, it explains these terms). The error bars in the graph mean to convey uncertainty. However, the data visualization selection that Gigerenzer and his team selected is simply wrong.
First of all, look at the leftmost bar, it demonstrates so many problems with error bars in general, and in error bars in barplots in particular. Can you see how the error bar crosses the X-axis, implying that Task 1 might have resulted in negative percentage of correct inferences?
The irony is that Prof. Gigerenzer is a worldwide expert in communicating uncertainty. I read his book “Calculated risk” from cover to cover. Twice.
Why is this important?
Communicating uncertainty is super important. Take a look at this 2018 study with the self-explaining title “Uncertainty Visualization Influences how Humans Aggregate Discrepant Information.” From the paper: “Our study repeatedly presented two [GPS] sensor measurements with varying degrees of inconsistency to participants who indicated their best guess of the “true” value. We found that uncertainty information improves users’ estimates, especially if sensors differ largely in their associated variability”.
Also recall the surprise when Donald Trump won the presidential elections despite the fact that most of the polls predicted that Hillary Clinton had higher chances to win. Nobody cared about uncertainty, everyone saw the graphs!
Uncertainty is one of the most neglected aspects of number-based communication and one of the most important concepts in general numeracy. Comprehending uncertainty is hard. Visualizing it is, apparently, even harder.
Last week I read a paper called Value-Suppressing Uncertainty Palettes, by M.Correll, D. Moritz, and J. Heer from the Data visualization and interactive analysis research at the University of Washington. This paper describes an interesting approach to color-encoding uncertainty.
Visualizing uncertainty is one of the most challenging tasks in data visualization. Uncertain
The graph clearly “communicates” the uncertainty but does it really convey it? Would you consider the lines and their corresponding confidence intervals very uncertain had you not seen the points?
What if I tell you that there’s a 30% Chance of Rain Tomorrow? Will you know what it means? Will a person who doesn’t operate on numbers know what it means? The answer, to both these questions, is “no”, as is shown by Gigerenzer and his collaborators in a 2005 paper.
Communicating uncertainty is not a new problem. Until recently, the biggest “clients” of uncertainty communication research were the weather forecasters. However, the recent “data era” introduced uncertainty to every aspect of our personal and professional lives. From credit risk to insurance premiums, from user classification to content recommendation, the uncertainty is everywhere. Simply “doing” uncertainty communication, as Sigrid Keydana from the Recurrent Null blog suggested isn’t enough. The huge public surprise caused by the 2016 US presidential election is the best evidence for that. Proper uncertainty communication is a complex topic. A good starting point to this complex topic is a paper Visualizing Uncertainty About the Future by David Spiegelhalter.