This is part seven of a series on dashboard visualization.
Yesterday, I covered useful and interesting (and sometimes both!) visuals. Today, I cover visuals which are fun to hate.
Of course I’m going to start with pie charts. Pie charts are the whipping boy of data visualization and for good reason.
The best use of a pie chart is to show a simple share of a static total. Here, we can see that Daredevil has almost half of the critics’ reviews, and that The Punisher and Jessica Jones are split.
This simple pie chart also shows some of the problems of pie charts. The biggest issue is that people have trouble judging angles, which makes it hard to compare the relative sizes of slices. For example, is Jessica Jones’s slice larger or is The Punisher’s? It’s really hard to tell in this case, and if that difference is significant, you’re making life harder for your viewers.
Second, as slice percentages get smaller, it becomes harder to differentiate slices. In this case, we can see all three pretty clearly, but if we start getting 1% or 2% slices, they end up as slivers on the pie, making it hard to distinguish one slice from another.
Third, pie charts usually require one color per slice. This can lead to an explosion of color usage. Aside from the risk of using colors which in concert are not CVD-friendly, adding all of these colors has yet another unintended consequence. If you use the same color in two different pie charts to mean different things, you can confuse people: they associate a color with a category, so if they see the same color twice, they will implicitly assign both things to the same category. Yes, careful reading of your legend disabuses people of that notion, but by the time they see the legend, they’ve already implicitly mapped out what this color represents.
Fourth, pie charts often require legends, which increases eye scanning. I took the legend off of the pie chart above, but more common is a pie chart like the following:
I particularly love this chart because I have a legend, I didn’t sort things by relative size, and I have two pairs of almost-equal slices where you can feel that there’s a difference. The problem is, it’s hard for people to tell which slice is bigger: A or C? B or D?
There are several alternatives to pie charts that you should consider. If you have a large number of categories you wish to compare, a treemap might be your better option. If you just have a few categories, I’d use a column chart or a bar chart, depending upon how long the labels are.
And if you have a two-class visual, just use a pair of percentages or numbers. For example, suppose you are visualizing a Congressional race with two candidates, where one candidate received 58% of the vote and the other received 42%. A visual which reads “58% – 42%” is at least as clear to a person as a pie chart, and often clearer.
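As a minimal sketch of that idea, here is one way to turn two raw counts into that kind of text label. The function name and the vote counts are hypothetical, picked only to reproduce the 58/42 example; the second percentage is derived from the first so the pair always sums to 100.

```python
def two_way_label(votes_a: int, votes_b: int) -> str:
    """Format a two-way split as a plain-text label such as '58% – 42%'."""
    total = votes_a + votes_b
    if total <= 0:
        raise ValueError("need at least one vote")
    pct_a = round(100 * votes_a / total)
    pct_b = 100 - pct_a  # derived, so the two figures always sum to 100
    return f"{pct_a}% – {pct_b}%"


# Hypothetical vote counts chosen to match the 58/42 example.
print(two_way_label(5800, 4200))  # 58% – 42%
```

In a dashboard, a string like this would go into a card or text visual in place of the pie chart.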
Gauges are another highly risky visual, as they tend to get misused. Their job is to show either progress toward a goal or the percent fill at a point in time. In the following example, I have a goal of 50 entries and currently have 34, so I’m inching my way toward the goal.
The above is what I would consider a good use, but even then, I don’t like it very much. By default, Power BI takes your current value as the 50% marker, so I have this arbitrary selection of 68 as the maximum at the end of the gauge. But if I do want to indicate that I can go above 50, I have to pick some end point somewhere.
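The arithmetic behind that gauge is simple enough to sketch. The function below is hypothetical and is not a Power BI API; it only encodes the behavior described above (no maximum set means the gauge ends at twice the current value) and shows that the number a viewer actually cares about is progress toward the target, not the fill of the dial.

```python
from typing import Optional


def gauge_summary(value: float, target: float, maximum: Optional[float] = None) -> dict:
    """Summarize a gauge: its end point, how full it looks, and progress to target."""
    if maximum is None:
        # Assumption taken from the text: the current value sits at the 50% mark.
        maximum = 2 * value
    if target <= 0 or maximum <= 0:
        raise ValueError("target and maximum must be positive")
    return {
        "maximum": maximum,
        "fill_fraction": value / maximum,       # how full the dial appears
        "progress_to_target": value / target,   # how close we are to the goal
    }


summary = gauge_summary(value=34, target=50)
print(summary["maximum"])             # 68
print(summary["fill_fraction"])       # 0.5
print(summary["progress_to_target"])  # 0.68
```

The dial reads as half full while the goal is 68% complete, which is why the choice of end point matters.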
We’re familiar with gauges in real life, of course:
Gauges themselves are fine, but they need to show you either how far along you are toward your goal or an easily-understandable current status. Quite often, people will misuse gauges by showing things without real targets, like the following:
In this case, we’re at 34 of 68. We’re halfway there…but what does “there” mean? We don’t really know whether this is good, bad, or indifferent. Even if I switch the vague and generic “entries” with something like revenue or profit, it’s still hard to make heads or tails of the visual alone.
Stacked Area Charts
Another pretty visual that I don’t like is stacked area charts.
Stacked area charts take the numbers for each category and lay them on top of one another, so it’s like having multiple line charts which, instead of starting from the origin, start from the prior line. This can lead to a very appealing visual effect. Stacked area charts are best used when you want to show the relative and absolute differences of data which changes over time but has relatively few periods, like my 13-episode season.
But here’s the problem with the stacked area chart: looking at this data, what can I tell you with certainty? I can definitely tell you the total number of critic reviews by episode: that’s the top of The Punisher’s line. Aside from that, I can tell you how many critic reviews there were for each Daredevil episode. But to figure out how many reviews there were of Jessica Jones or The Punisher requires mentally calculating a difference.
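To make that mental subtraction concrete, here is a small sketch. The review counts are made up for illustration (they are not the data behind the chart), and the stacking order assumes Daredevil on the bottom and The Punisher on top, as described above.

```python
from itertools import accumulate

# Hypothetical review counts for a single episode, bottom of the stack first.
episode = {"Daredevil": 30, "Jessica Jones": 22, "The Punisher": 25}

# What the stacked area chart actually draws: the running total at each boundary.
tops = dict(zip(episode, accumulate(episode.values())))
print(tops)  # {'Daredevil': 30, 'Jessica Jones': 52, 'The Punisher': 77}


def read_back(stacked_tops: dict) -> dict:
    """Recover each series from the stacked boundaries, as a viewer has to."""
    recovered = {}
    previous_top = 0
    for name, top in stacked_tops.items():
        recovered[name] = top - previous_top  # one subtraction per band
        previous_top = top
    return recovered


print(read_back(tops))  # {'Daredevil': 30, 'Jessica Jones': 22, 'The Punisher': 25}
```

Only the bottom band and the grand total can be read straight off the axis; every other band costs the viewer a subtraction.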
While putting this post together, I also noticed that the y-axis started at 20 instead of 0. That’s something you can change in Power BI, but it goes back to the line chart / dot plot versus bar / column chart issue. If you’re using area as an indicator of value (which we certainly are doing with a stacked area chart), then you want to start at the origin. Daredevil looks like it has a suspiciously small number of reviews, but that’s because there’s a solid block of 20 sitting below our viewpoint.
This visual format also makes it harder to do exactly what it was intended to do: allocate proportional responsibility for changes over time. We can see that there was definitely a drop between episode 7 and episode 9. Which shows were responsible for that drop by losing reviews? Which shows held steady? Were there any which gained reviews? It’s hard to tell.
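A rough sketch of the distortion, using made-up counts rather than the chart's real data: when the axis starts above zero, only the bottom band loses thickness, so it can look like the smallest series even when it is the largest.

```python
def visible_thickness(bands: dict, baseline: float) -> dict:
    """Thickness of each stacked band that remains visible above the axis baseline."""
    visible = {}
    lower = 0
    for name, value in bands.items():  # bottom of the stack first
        upper = lower + value
        # Clip the band [lower, upper] to the part at or above the baseline.
        visible[name] = max(0, upper - max(lower, baseline))
        lower = upper
    return visible


# Hypothetical review counts for one episode.
bands = {"Daredevil": 30, "Jessica Jones": 22, "The Punisher": 25}

print(visible_thickness(bands, baseline=20))
# {'Daredevil': 10, 'Jessica Jones': 22, 'The Punisher': 25}
print(visible_thickness(bands, baseline=0))
# {'Daredevil': 30, 'Jessica Jones': 22, 'The Punisher': 25}
```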
The best alternative to a stacked area chart is a line chart. Here’s the same data in line form:
We can see here that The Punisher kept a steady number of reviews, but the other two dropped between episode 7 and episode 9. That would have been hard to tease out of a stacked area chart with three categories; if you have five or six categories which fluctuate in both directions, it becomes close to impossible.
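The question the line chart answers at a glance is a per-show difference between two episodes. As a sketch, with invented counts that merely mimic the pattern described (one show flat, two dropping):

```python
# Hypothetical review counts; not the actual data behind the charts.
episode_7 = {"Daredevil": 30, "Jessica Jones": 24, "The Punisher": 25}
episode_9 = {"Daredevil": 22, "Jessica Jones": 18, "The Punisher": 25}


def change_by_show(before: dict, after: dict) -> dict:
    """Change in review count for each show between two episodes."""
    return {show: after[show] - before[show] for show in before}


deltas = change_by_show(episode_7, episode_9)
print(deltas)                # {'Daredevil': -8, 'Jessica Jones': -6, 'The Punisher': 0}
print(sum(deltas.values()))  # -14
```

A stacked area chart shows only that last number, the overall drop, directly; the per-show breakdown is what the viewer has to reconstruct.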
The downside to using a line chart is that I don’t have a totals line. But if I add that, I get the same information as that stacked area chart and more.
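Computing that totals line is a one-line sum per episode. The sketch below uses hypothetical counts for three episodes and a made-up helper name; in Power BI you would more likely add a measure, so treat this only as an illustration of the calculation.

```python
def with_total(series: dict, label: str = "Total") -> dict:
    """Return a copy of the series with an extra per-period totals series."""
    totals = [sum(values) for values in zip(*series.values())]
    return {**series, label: totals}


# Hypothetical review counts for three consecutive episodes.
reviews = {
    "Daredevil": [30, 28, 22],
    "Jessica Jones": [24, 23, 18],
    "The Punisher": [25, 25, 25],
}

lines = with_total(reviews)
print(lines["Total"])  # [79, 76, 65]
```

Each entry in `lines` can then be drawn as its own line, giving the per-show values and the overall total on one chart.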
I could also use a ribbon chart:
Ribbon charts give more information, especially for desktop users who can mouse over the ribbon sections and get a lot of info from the tooltip. They also show relative changes in rank from period to period. This is a fairly boring ribbon chart because there was only one change, but this will let you see who’s currently in which position.
Otherwise, ribbon charts suffer from the same flaws as the stacked area chart and also tend to dominate a screen. If your ribbon chart is the single focal point of your dashboard and you expect people will mouse over the sections to gain more insight, then it can work as an exciting visual. Otherwise, it tends to be simply too much.
A common theme today is “pretty but useless.” Dashboards are fundamentally utilitarian in nature: they exist to provide people timely and accurate information so that those people can do their jobs. The more noise you throw on—even if it’s pretty noise—the harder it is for people to do their jobs.
This was not an exhaustive look at risky or often-bad visuals, and there are more that I am willing to throw on the pyre (like donut charts) and others which get misused more than they get correctly used (like waterfall charts), but I hope that this smattering of examples serves the purpose of explaining that not all visuals deserve to show up on your dashboard.