Stacked area graphs are not your friend. Seriously. I want to make this abundantly clear.

I’m going to expound on some of the work of Stephen Few here and lay out what stacked area graphs are, why they are a poor type of data visualization, and what some good alternatives are.

#### What is a stacked area graph?

A stacked area graph plots a quantitative variable against another variable (usually time, *i.e.* on the *x*-axis), broken up across more than one value of a categorical variable (or into different “data series” in MS Excel’s parlance) which make up the whole. The different shaded areas are stacked on top of one another, so that the height of each shaded area represents the value for each particular category, and the total height is their sum.

Pretty, no?
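To make the stacking concrete, here is a minimal sketch of how the top edge of each band is computed. The data and the `stack_tops` helper are made up for illustration; they are not the numbers behind the figure above.

```python
from itertools import accumulate

# Hypothetical sample data: one list of values per group, one value per date.
SERIES = {
    "Group A": [3, 4, 9, 4, 3],
    "Group B": [2, 2, 2, 2, 2],
    "Group C": [1, 2, 3, 2, 1],
}


def stack_tops(series):
    """Return the top edge of each band, stacking in dictionary order."""
    names = list(series)
    # One tuple per x position, holding every group's value at that position.
    columns = zip(*series.values())
    # Running sum up the stack at each x position.
    tops_by_x = [list(accumulate(column)) for column in columns]
    # Transpose back to one list per group.
    return {name: [tops[i] for tops in tops_by_x] for i, name in enumerate(names)}


tops = stack_tops(SERIES)
print(tops["Group B"])  # [5, 6, 11, 6, 5]
print(tops["Group C"])  # [6, 8, 14, 8, 6], which is also the total
```

Only the bottom band's top edge equals its own values. Every other edge is a running sum, and the top edge of the last band is the total. If you use matplotlib, its `stackplot` function draws bands of this kind for you.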

#### Shortcomings of Stacked Area Graphs

For instance, in the example graph I produced above, it can be easy to think there are very well-defined peaks in all the series around Jan 9 and Jan 22. This is because of the effect just mentioned. Look at the same graph if I selectively shuffle the order of the stacking of the areas:

We still see those peaks at the times mentioned, because those are the peaks for the total, but look at the series for Group D (in purple). Do you still feel the same about how it fluctuates between the 8th and the 22nd as you did before, in the first figure?

Because of the inclination to interpret the top of the area as quantity, interpreting the trend in the different areas of a stacked area graph is usually quite difficult.
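A small numeric sketch of this effect, using made-up data in which a hypothetical Group D is perfectly flat and Group A has a spike. The `top_edge` and `swing` helpers are names I invented for the example.

```python
# Hypothetical data: Group A spikes in the middle, Group D never changes.
SERIES = {
    "Group A": [3, 4, 9, 4, 3],
    "Group D": [2, 2, 2, 2, 2],
}


def top_edge(series, order, name):
    """Top edge of `name`'s band when stacked in `order` (bottom band first)."""
    included = order[: order.index(name) + 1]  # `name` plus everything below it
    return [sum(values) for values in zip(*(series[n] for n in included))]


def swing(values):
    """Difference between the highest and lowest value."""
    return max(values) - min(values)


on_top = top_edge(SERIES, ["Group A", "Group D"], "Group D")
on_bottom = top_edge(SERIES, ["Group D", "Group A"], "Group D")

print(on_top, swing(on_top))        # [5, 6, 11, 6, 5] 6
print(on_bottom, swing(on_bottom))  # [2, 2, 2, 2, 2] 0
```

Group D's values are identical in both cases. Stacked on top of Group A, its upper edge swings by 6 and looks like it peaks; stacked on the bottom, it is visibly flat. The apparent trend depends on the stacking order, not on the data for that group.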

#### Alternative Approaches

When the graph gets a bit noisy like this it might also be a good idea to thin the lines.

Okay, that’s better. But as the number of values of the categorical variable increases the graph is going to get increasingly noisy. What do we do in those cases?

Well, as I often have to remind myself, nowhere does it say that you have to tell your story all in one graph. There’s nothing stopping us from breaking up this one graph into smaller individual graphs, one for each group and one for the total. The disadvantage here is that it’s not as easy to compare between the different groups; however, we can make it easier by using the same axis scaling for the graphs for each individual group.

Here there was an odd number of graphs, so I chose to keep the graph for the total larger (giving it emphasis) and maintain their original aspect ratios. You could just as easily make a panel of 6 with equal sizes if you had a different number of graphs, or put them all in a tall or wide graphic in a column or row.

Also, now that each individual graph depicts the value for a different group, we don’t need the colours on the figures on the right anymore; that information is in each individual plot title. So we can ditch the colour. I’ll keep the total black to differentiate between the total and the values for the individual groups.
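If you want to try this layout yourself, here is a rough sketch, assuming you have matplotlib installed. The data and the function names (`shared_ylim`, `column_totals`, `plot_small_multiples`) are made up for illustration, and the layout is a simple row of panels rather than the exact arrangement in my figure.

```python
# Hypothetical sample data: one list of values per group, one value per date.
GROUPS = {
    "Group A": [3, 4, 9, 4, 3],
    "Group B": [2, 2, 2, 2, 2],
    "Group C": [1, 2, 3, 2, 1],
}


def shared_ylim(groups, pad=0.05):
    """One (low, high) y-range that covers every group's values."""
    top = max(max(values) for values in groups.values())
    return (0.0, top * (1 + pad))


def column_totals(groups):
    """Sum across groups at each x position."""
    return [sum(column) for column in zip(*groups.values())]


def plot_small_multiples(groups, path="small_multiples.png"):
    """Draw the total in black, then one gray panel per group on a common y-scale."""
    import matplotlib.pyplot as plt  # imported here so the helpers work without it

    names = list(groups)
    fig, axes = plt.subplots(1, len(names) + 1, figsize=(3 * (len(names) + 1), 3))
    x = range(len(next(iter(groups.values()))))

    # The total gets its own scale and stays black for emphasis.
    axes[0].plot(x, column_totals(groups), color="black", linewidth=1)
    axes[0].set_title("Total")

    # Every group panel uses the same y-range so the panels stay comparable.
    ylim = shared_ylim(groups)
    for ax, name in zip(axes[1:], names):
        ax.plot(x, groups[name], color="gray", linewidth=1)
        ax.set_title(name)  # the title replaces the colour legend
        ax.set_ylim(*ylim)

    fig.tight_layout()
    fig.savefig(path)
    return path
```

Calling `plot_small_multiples(GROUPS)` writes the figure to `small_multiples.png`. Setting the same limits on every group panel is what keeps the groups comparable; the total is left on its own scale because it is always at least as large as any single group.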

As the number of values for the categorical variable gets very large, you go from multiple figures into true small multiples (trellis plot) territory, like in the figure below:

Another option, if you have the benefit of more dynamic visualization tools available, would be to use interactivity and gray out series in the background, such as in this amazing visualization of housing prices from the New York Times:

Click me for dataviz goodness.

*relative proportions of the whole*.

#### Concluding Remarks

In my opinion, and my experience working with data visualization, you are almost always better served by the simpler, more minimalistic types of visualizations (the fundamental three being the bar chart, line graph and scatterplot) than more complicated ones. This has been an example of that, as stacked area graphs are really just a combination of area graphs, which are, in turn, an extension of the line graph.

#### References

*Quantitative Displays for Combining Time-Series and Part-to-Whole Relationships.*