So I have a shocking confession to make: I love Skittles.
This post is not sponsored, endorsed, compensated, or paid for in any way, shape or form, by Skittles Candy. I’m not particular – I like other types of candy that are similar – you know, those ones that are chocolate covered in a hard shell, whether they be the kind where you eat the red ones last or not.
Anyhow, I got to thinking about how, abstractly, each individual candy can be viewed like a pixel of a different color. So you can make art using candy, just like artists make a mosaic. There’s lots of this on the internet you can already see: in fact, Skittles has done print advertising this way.
But…. each individual candy can also represent something else: a unit of measurement. I thought it would be cool to go through some data visualization fundamentals using the candy in this way. So let’s dive in.
Data Visualization using only 1 bag of Skittles
A perhaps more useful way to do the same would be to organize each colour in rows, with each row a set number (like tally marks). Here it’s not only easy to see the relative proportions of the different colours in the bag, but also count them as each row and group is a set number (5 & 10, respectively). This is equivalent to a pictogram, with each Skittle representing, well, 1 Skittle:
It’s not a big stretch of the imagination to collapse those groups together into groups of a set height. So here we have a proportional bar chart, where the length of each bar represents the percentage of the bag that is each colour. Note that because I didn’t slice Skittles in half, the physical analogue is not exactly the same as what you’d put down on paper or in Excel (there is one additional unit for yellow and orange):
And, as I both often have to remind people of this rule, and also observe many people not following it, it is best practice to sort the bars in descending order for maximum clarity / comparative value (assuming there is not another more important ordering):
And, if we want to transform our proportional bar chart into one comparing absolute quantities, it is not a giant stretch of the imagination to break apart the different bars so they are only one ‘pixel’ high:
Here it’s much easier to get an idea of the absolute number of each colour in the bag, but harder to tally that numbers exactly – for that we’d need to add an axis or data labels.
More bags please
So let’s look at some more visualization fundamentals, where we required comparing not only across a categorical variable (colour) but also between groups.
Here is the equivalent to our first graph from before, only showing the different numbers of Skittles in each bag. You can see there’s actually a fair amount of variance; the smallest bag had 89 pieces of candy, whereas the largest had 110.
Now let’s make a bubble graph which not only compares the sizes between the different bags, but also their makeups by colour. The end result is actually closer to a collection of pie charts:
We can also group by colour only to see the overall makeup for the whole group of bags. Whereas orange dominated in the first bag we looked at, you can see here that orange and yellow are approximately at parity overall.
Now let’s look at the tally mark / pictograph method. Here each row represents a bag:
You can see there’s a fair bit of variance in the different colours. I also tried rearranging things so they result was less like a pictograph and more like a treemap:
Really the best way to compare would be a bar graph. Here’s a stacked area graph. I didn’t bother sorting by length, because at this point I was pretty tired of shuffling Skittles around: