You’ve probably heard the skeptical aphorism about statistics: “There are three kinds of lies: lies, damned lies, and statistics.” Unfortunately, I worry that we may soon hear “data visualization” tacked on to that list as the fourth and most deceiving way of communicating information. This is a problem.
A couple of months ago, I was reading through some materials about a company’s financial health, and they included a dual-axis chart showing the company’s Revenue and EBITDA from 2008 to 2014 (projected). (See below.)
Looking at the chart, it appears that the two measures increase in near-parallel fashion. The slopes of their lines are comparable, particularly in the later years on the chart. The problem is that this misrepresents what is actually happening. The axis on the left increases in increments of 20 while the one on the right does so in increments of 10, so the same vertical distance stands for a different amount on each axis, and the apparent relationship between the lines is an artifact of the scaling.
When we chart the same data on a single axis, we see that EBITDA does not increase nearly as dramatically between 2012 and 2014 as revenue does. (See below.) If I’m evaluating the health of this company and its future prospects, that difference may be important!
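To make the mechanics concrete, here is a minimal Python sketch of how an axis scale turns a change in value into a rise on the page. The figures are made up for illustration and are not the company's actual numbers. The names `plotted_height` and `rise`, the axis ranges, and the assumption that revenue sits on the left axis and EBITDA on the right are all my own.

```python
def plotted_height(value, axis_min, axis_max, plot_height=100.0):
    """Map a data value to a vertical position (0 = bottom, plot_height = top)."""
    return (value - axis_min) / (axis_max - axis_min) * plot_height


def rise(series, axis, start=2012, end=2014):
    """Vertical distance a line climbs between two years on a given axis."""
    axis_min, axis_max = axis
    return (plotted_height(series[end], axis_min, axis_max)
            - plotted_height(series[start], axis_min, axis_max))


# Hypothetical values, NOT taken from the chart discussed in this post.
revenue = {2012: 80.0, 2014: 120.0}  # assumed to use the left axis
ebitda = {2012: 20.0, 2014: 40.0}    # assumed to use the right axis

left_axis = (0.0, 120.0)   # ticks every 20 (assumed range)
right_axis = (0.0, 60.0)   # ticks every 10 (assumed range)

# Dual-axis chart: each series is scaled against its own axis.
dual_revenue_rise = rise(revenue, left_axis)
dual_ebitda_rise = rise(ebitda, right_axis)

# Single-axis chart: both series share the left axis.
single_ebitda_rise = rise(ebitda, left_axis)

print(f"Dual axis:   revenue rises {dual_revenue_rise:.1f}, EBITDA rises {dual_ebitda_rise:.1f}")
print(f"Single axis: revenue rises {dual_revenue_rise:.1f}, EBITDA rises {single_ebitda_rise:.1f}")
```

With these made-up numbers, the two lines climb by the same amount on the dual-axis version even though revenue grew by 40 and EBITDA by only 20. On a shared axis, the EBITDA line climbs half as far as the revenue line.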
Adding a second axis seems like such a simple, innocuous thing, but it changes how the data might be interpreted and understood. This is just one example of the substantial impact a seemingly small design decision can have.
Why should we care about this? There are two main reasons:
- As consumers of more and more visual data, we need to be aware of situations like this one, where visualization design decisions may obscure (or at least distract from) certain critical pieces of information. Just because it’s data (data never lies!) and you can see it (my eyes would never deceive me!) doesn’t mean it is presented in an objective way.
- As more of us are in roles where we create data visualizations, we need to be aware that if we are careless, we run the risk of misleading our audience or imposing (hopefully unintentionally) our own viewpoint on the information we present.
Data visualization will likely be one of those things many of us try to do without any formal training, and I worry that, as a result, a lot of folks will do it badly. Am I being overly paranoid? I hope so. But this particular example doesn’t do much to allay that paranoia.