Types Of Visuals, Part 1

This is part six of a series on dashboard visualization.

Today, we will look at a few of the many types of visuals available to us. For each of these, I’ll cover some general information about the visual, as well as good uses on a dashboard. All of my examples here will be in Power BI and will use a data set of IMDB ratings and votes for each Marvel Netflix series.

Tables And Matrices

The first type of visual is the table or matrix, which I’ll look at as a unit instead of breaking them out separately. These are great when users need to compare large amounts of data directly. Tables and matrices do not belong on strategic dashboards (because tl;dr), but they can belong on tactical or operational dashboards.

Figure (matrix): Don’t forget about the venerable table of numbers.

A table or matrix is a compact method for disseminating a good amount of information to an informed audience, and that’s why it shows up as the basis of so many reports. Just try to avoid them on high-priority strategic dashboards.

Bar And Column Charts

Column Charts

Next, we have column charts. Column charts are great when you have a relatively small number of categories but a fairly large number of data points. In the following example, I compare across three shows how many critic reviews each episode got.

Figure (column chart): A column chart covering episodic critic reviews.

We have three shows but a fairly large number of episodes, and I want to compare the trend for each show. If I were to do this for all six Netflix shows, that would be too much, in my opinion. Also, note that we need a legend to explain each show. That’s a downside to using a column chart to show periodic data like this, but it’s not a deal-killer.

Bar Charts

Bar charts run left-to-right whereas column charts run bottom-to-top. The fact that bar charts run left-to-right makes them much better when you have a large number of categories, as well as when the labels themselves are lengthy. Notice how clear the following bar chart of critic reviews by show is:

This is much better than how it’d look as a column chart because we can put the labels in their natural format: left-to-right and flowing to the visual element.

Choosing A Bar Or A Column

This leads me to a little bit of advice for choosing bars versus columns. You will want to choose a bar chart if the following are true:

Category names are long, where by “long” I mean more than about three characters.
You have a lot of categories.
You have relatively few periods—ideally, you’ll only have one period with a bar chart.

By contrast, you would choose a column chart if:

Viewing across periods is important. For example, I want to see the number of critic reviews fluctuate across the season for each of the TV shows.
You have many periods with relatively few categories. The more periods and the fewer categories, the more likely you are to want a column chart.
Category names are short, by which I mean approximately 1-3 characters.

Some people will rotate text 90 degrees to try to turn a bar chart into a column chart. I don’t like that because then people need to rotate the page or crane their necks. In that case, just use the bar chart.
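To make those rules of thumb concrete, here is a minimal Python sketch. It has nothing to do with Power BI itself. The function name suggest_chart and the many_categories cutoff of six are my own assumptions for illustration, and the three-character label limit is simply the number quoted above.

```python
def suggest_chart(category_names, period_count,
                  max_short_label=3, many_categories=6):
    """Suggest "bar" or "column" from the rules of thumb in this section.

    The thresholds are illustrative assumptions, not hard limits.
    """
    if not category_names or period_count < 1:
        raise ValueError("need at least one category and one period")

    long_labels = any(len(name) > max_short_label for name in category_names)
    many = len(category_names) >= many_categories

    # Several periods with only a few categories: a column chart shows the
    # trend, and a legend can carry the long names.
    if period_count > 1 and not many:
        return "column"
    # Long labels or lots of categories read better left to right.
    if long_labels or many:
        return "bar"
    return "column"


print(suggest_chart(["Daredevil", "Jessica Jones", "The Punisher"], period_count=13))
print(suggest_chart(["Daredevil", "Jessica Jones", "Luke Cage", "Iron Fist",
                     "The Defenders", "The Punisher"], period_count=1))
```

The first call suggests a column chart (three shows across a season) and the second suggests a bar chart (six long names, one period). Treat the output as a starting point rather than a verdict.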

Cleveland Dot Plot

The Cleveland dot plot is a minimalist take on what a bar or column chart would normally visualize. It’s not built into Power BI, but there are a couple of custom visuals which implement the dot plot, and I think you’ll want to check them out.

For example, here is a bar chart with average rating by show:

Figure (bar chart of ratings): Bars dominate the picture.

By contrast, here is a dot plot of the same information using the MAQ Software Dot Plot:

Figure (dot plot of ratings): Dots don’t dominate like bars do.

The dot plot gives us the same results as the bar chart, but we can scrunch the visual down into a much more compact space without losing any of the information about series variance. For example, in this picture, I can safely cut the visual down to about a third of its original size and still clearly see the differences in shows. The reason for this is that dot plots don’t need to start at the origin, whereas bar and column charts really do.

Let’s get into that for a moment. When we see bar charts, we immediately compare the relative areas of those bars and get information from that comparison. If I show you a bar which looks to be about twice the size of another bar, we automatically assume that the first element has about twice the value of the second. When we start at the origin, this is true. But if we start from a different spot, we exaggerate the differences and can lead viewers to the wrong conclusion. Yes, if you read the numbers you’ll see the difference, but you immediately put the image into someone’s mind before that person even gets a chance to see the numbers.
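A quick bit of arithmetic shows how much a truncated axis can exaggerate. This Python sketch computes the ratio of the drawn bar lengths for two values given where the axis starts. The function name apparent_ratio and the ratings of 9.0 and 8.0 are hypothetical values picked for illustration, not numbers from the data set.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar lengths for values a and b.

    A bar is drawn from axis_start up to its value, so its length is
    (value - axis_start). Both values must sit above axis_start.
    """
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)


# Hypothetical ratings: 9.0 versus 8.0.
print(apparent_ratio(9.0, 8.0))                  # axis starts at the origin
print(apparent_ratio(9.0, 8.0, axis_start=7.0))  # axis truncated at 7
```

Starting at the origin, the first bar is 1.125 times as long as the second, which matches the real difference. Starting the axis at 7, the same two values produce a bar that is twice as long.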

By contrast, if you see dots, you aren’t assuming relative sizes and can orient yourself to the axes more easily. In the above picture, we can see that our shows hover between just under 8 and about 9 on the ratings scale. We can also see a pretty big gap separating The Punisher and Daredevil from the other shows.

Cleveland dot plots are a nice way of contrasting values for a relatively large number of categories and can let you get finer-grained than a bar chart. I’d use this type of visual if most of the categories are relatively close together, like our ratings examples are.

One downside to the dot plot visuals in Power BI is that they aren’t as complete as the built-in visuals. For example, the dot plot that I showed above doesn’t appear to let you sort the data by average rating, which is a bit surprising to me. But that doesn’t change the fact that in principle, dot plots are useful.

Radar Charts

I like radar charts, but there are very few good uses for them.

The best use for a radar chart is giving a normalized comparison of measures across several categories. For example, the above chart shows normalized critic reviews and normalized user reviews by show. In other words, both measures are normalized to 1.0, and it turns out that Daredevil season 1 had the greatest number of critic reviews and the greatest number of user reviews. Then, the other shows follow. Luke Cage had almost as many critic reviews, but the smallest proportion of user reviews; Jessica Jones was the opposite, having the smallest number of critic reviews but the second-largest number of user reviews.
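One simple way to get each measure onto a 0 to 1 scale is to divide by the largest value, so the top category lands at exactly 1.0. The Python sketch below assumes that approach; the function name normalize_to_max and the show names and counts are made up for illustration and are not the real review counts.

```python
def normalize_to_max(values):
    """Scale a {category: value} mapping so the largest value becomes 1.0."""
    if not values:
        raise ValueError("values must not be empty")
    peak = max(values.values())
    if peak <= 0:
        raise ValueError("largest value must be positive")
    return {name: value / peak for name, value in values.items()}


# Hypothetical review counts, one measure at a time.
critic_reviews = {"Show A": 200, "Show B": 150, "Show C": 50}
print(normalize_to_max(critic_reviews))
```

You would run this once per measure (critic reviews, user reviews) so that each measure has its own 1.0, which is what lets two measures with very different magnitudes share one radar chart.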

I consider this a fun type of chart but one that probably doesn’t end up on many dashboards.

Line Charts

Line charts are great for time series data which stretches over a large number of periods. The trick is that you don’t want to have many categories. For example, the following line chart shows average rating by episode for Daredevil versus The Punisher:

This line chart does a great job comparing two shows across a full season. It also shows the three-episode stretch where the crew behind Daredevil ran out of material and padded one episode out into three before giving us a fantastic finale.

Line charts start to get risky after you get past 4-5 categories. At that point, lines start overlapping and it becomes tricky to read. But for a two- or three-category comparison over time, line charts are hard to beat. I could also use column charts here, but if you want to emphasize the time series nature of the data, lines are great because they indicate flowing from left to right.

Line And Column Charts

Combining a line chart with a column chart can be risky, but under the right circumstances, it can pay off handsomely:

Many people voted on episode 12 of The Punisher, and they loved it.

This works because we have two interrelated but distinct variables that we are measuring over time. We have the number of votes as a column chart taking up the background, and we have a line chart representing rating per episode in the foreground.

Another place that you will see line and column charts is dealing with stocks, where you typically see a line with the daily closing price and a column chart showing volume traded per day.

Scatter Plots

Scatter plots are great for showing relationships between two variables over a relatively small number of categories. In the following example, I show two variables per episode: the number of user votes on the x axis and the number of critic reviews on the y axis:

In this case, a scatter plot is great because it shows just how distinct one show was from the other two. We see episodes of The Punisher and Jessica Jones overlap, but Daredevil blows them both away in terms of number of votes and number of reviews. This makes sense given that Daredevil season 1 came out two years before The Punisher and a year before Jessica Jones, so there was more time to rack up votes. We can also see a fairly linear relationship for each series, showing that the number of user votes correlates positively with the number of critic reviews.
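If you want a number to go with the eyeball test, the Pearson correlation coefficient measures how linear a relationship is, from -1 to 1. Here is a minimal Python sketch; the function name pearson is my own, and the vote and review counts are invented placeholders, not values from the IMDB data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / sqrt(spread_x * spread_y)


# Invented per-episode values for a single series.
user_votes = [1000, 1200, 1500, 2100]
critic_reviews = [10, 11, 14, 19]
print(round(pearson(user_votes, critic_reviews), 3))
```

Since each series forms its own cluster on the scatter plot, it makes sense to compute this per series rather than across all episodes at once.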

Bubble Charts

Bubble charts are the three-variable versions of scatter plots. We have x and y axis variables as well as a third variable which controls the bubble size.

A bubble plot which also includes number of user reviews.

This bubble chart includes the number of user reviews as the bubble size. This particular example also shows why I’m not typically a fan of bubble charts: they get too busy too fast. If there are only a few points on the chart, they’re not bad, but even with just 39 points, it’s hard to make out much information here. It’s also a lot harder to get approximate values for the different points, given how big some of the bubbles can get.

Treemaps

Treemaps are another example of a risky chart that I really like in the right circumstance. Here is an example of a treemap:

A treemap breaking down entries by state and then by city.

Treemaps work if the following circumstances hold:

You have categorical data. For example, my treemap is data about states and provinces.
That data is typically hierarchical. This is optional, but hierarchies tend to make treemaps a bit more interesting. Here my hierarchy is state to city, and you can see how each city makes up a certain percentage of the state totals.
You have a medium to large number of categories. This works best with 20-40 categories. If you have a lot more than that, then you probably don’t want to use a treemap.
A relatively small percentage of categories dominate. In this example, we can see that the four biggest states (Florida, Ohio, Virginia, and New York) take up about 40% of the total. If everything were approximately equal, that would make this treemap a little harder to follow.
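Two of these conditions are easy to check before you build anything: the category count and whether a few categories dominate. The Python sketch below does both. The function names, the 20 to 40 range (taken from the guideline above), and the 40% cutoff for the top four (borrowed from the example) are illustrative assumptions, and the region totals are made up.

```python
def top_share(totals, k=4):
    """Fraction of the grand total held by the k largest categories."""
    if not totals:
        raise ValueError("totals must not be empty")
    ranked = sorted(totals.values(), reverse=True)
    grand_total = sum(ranked)
    if grand_total <= 0:
        raise ValueError("grand total must be positive")
    return sum(ranked[:k]) / grand_total


def treemap_checks(totals, k=4, min_top_share=0.4,
                   min_categories=20, max_categories=40):
    """Rough pre-flight checks for a treemap; thresholds are assumptions."""
    return {
        "category_count_ok": min_categories <= len(totals) <= max_categories,
        "dominant_few": top_share(totals, k) >= min_top_share,
    }


# Made-up totals: four large regions and sixteen small ones.
example = {f"Region {i}": 10 for i in range(4)}
example.update({f"Region {i}": 2 for i in range(4, 20)})
print(treemap_checks(example))
```

If dominant_few comes back False, everything is roughly the same size, and a sorted bar chart will probably be easier to read than a treemap.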

Treemaps have their downsides, of course (starting with all of that color), but they do a pretty good job at letting people estimate relative value by area. People aren’t fantastic at estimating relative areas in general, but they are pretty decent at getting relative sizes of similarly shaped squares and rectangles right, and that’s what a treemap gives us.

Conclusion

Today, we looked at a series of visuals that I like on dashboards. I wouldn’t use all of these all of the time, and there are other good visuals that I didn’t include here, but you can see that you have some options available to you. Each one of these has its own good and bad use cases, but they all work. Tomorrow, we’re going to look at some other types of visuals which range from “very risky” to “If you use this, you’re out of the family.”

4 thoughts on “Types Of Visuals, Part 1”

oneguythought says:

January 21, 2018 at 5:31 pm

Nice intro to different chart types.

The big thing about line charts is to illustrate a trend or a numeric relationship. Don’t use it if you have only arbitrary categories; you need the X axis to be “the independent variable” e.g. time. Lines draw more attention to the direction of the line than to the actual data points.

Line-and-column is a special scenario. It’s well-suited to stocks as there’s a strong interest in the trends of stock price, while the trade volumes are more seen as events that occur at certain points in time. Remember the point above regarding lines though – they’re about trends with a numerical or time-based X axis.

Radar charts: I disagree with your example, but agree with most of your comment, especially that “there are very few good uses”. I would use radar to describe multiple independent facets of an object. You can then view the total area as a value, and compare it to other radars. e.g. for projects you could have dimensions for ROI, resource availability, management backing, risk containment, customer demand. When all those rate highly you get a big fat shape. Any low values pull it in. You’d draw a separate radar for each project. As a proposal evolves you could overlay initial & current values in different colours, or perhaps overlay the average values for successful and unsuccessful projects.

Kevin Feasel says:
  
  January 23, 2018 at 4:38 pm
  
  Thanks for your thoughts.
  
  On radar charts, I have a more “traditional” example coming up in February as part of a different series which is closer to what you’re describing.
  
