The importance of colour in data visualisation
When it comes to data visualisation, whether it’s a complex org chart or a traditional bar chart showing average salary across grades, colour is a crucial component. Beyond aesthetics, the colours we use in conjunction with other visual elements tell the story in the data: the information we are trying to convey, along with its patterns and relationships. Despite this importance, many people pick arbitrary colours that happen to look good without giving them much thought. Used effectively, colour helps readers understand the information readily and discover insights. Used poorly, it can confuse or even mislead them. Although most data visualisation tools today come with a feature for controlling colours, using it well is not always straightforward. In this blog, I will discuss four key functions of colour, and the theory behind them, that can help you find the right set of colours to communicate your data more effectively.
1. Colours set the context for primary information
A data visualisation consists of various visual elements, each of which falls into one of two kinds: data and non-data (or primary and secondary). Bars in a bar chart, lines in a line chart and dots in a scatter chart are data, or primary, elements. Elements such as grids, labels and axes, meanwhile, ‘set the stage’ for the data points, the main ‘actors’; for this reason the non-data elements are also called contextual elements. The best practice is to use unobtrusive yet still legible colours, such as light grey, for the non-data elements so that they do not draw attention away from the data. As can be seen in Figure 1 below, ‘silencing’ the contextual elements makes the chart on the right much clearer.
2. Colours measure numerical values
When using colour to represent a numerical scale (i.e. values that increase or decrease continuously), a single hue (e.g. blue, green or red) varying evenly from light to dark is often the safest choice. Because people tend to read darker colours as meaning ‘more’, a lightness ramp visually conveys differences between values. The key is to keep the perceptual distance between successive values equal.
Figure 2b shows a colour scheme that should be avoided: the perceived difference between the first and second units is smaller than that between the third and fourth units.
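As a rough illustration of an evenly stepped single-hue ramp, here is a minimal Python sketch that uses only the standard library. The function name sequential_ramp, the hue of 210 degrees (a blue) and the lightness and saturation defaults are my own assumptions for illustration, not values taken from the figures. It spaces HLS lightness evenly, which only approximates equal perceptual steps; a colour space such as CIELAB would be a better basis for a production palette.

```python
import colorsys


def _to_hex(r, g, b):
    """Convert RGB floats in the range 0..1 to a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def sequential_ramp(hue_degrees, steps, lightest=0.90, darkest=0.25,
                    saturation=0.60):
    """Return `steps` hex colours of one hue, ordered from light to dark.

    All defaults are illustrative assumptions, not recommended values.
    """
    if steps < 2:
        raise ValueError("a ramp needs at least two steps")
    colours = []
    for i in range(steps):
        # Evenly spaced HLS lightness: every step changes by the same amount.
        lightness = lightest + (darkest - lightest) * i / (steps - 1)
        r, g, b = colorsys.hls_to_rgb(hue_degrees / 360.0, lightness,
                                      saturation)
        colours.append(_to_hex(r, g, b))
    return colours


# Five shades of blue, lightest first (lower values) to darkest (higher values).
blues = sequential_ramp(210, 5)
print(blues)
```

Because the hue and saturation stay fixed, lightness is the only thing that changes, so darker consistently reads as ‘more’.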
Figure 3a is a heat map showing how performance scores vary across grades and number of absence days. The intensity of the colour indicates the average performance score at each intersection. As this example shows, different shades of a single colour (hue) are reasonably effective at revealing patterns. However, if you want to emphasise variation in the data, use two or more hues, as they give a stronger colour contrast between data points. Figure 3b visualises the same data using multi-colour gradients. Finding a set of shades with constant increments when using more than two colours is not an easy task. Fortunately, free online tools can help you get a palette that is both logical and aesthetically pleasing (see the last section of this blog).
Another scenario where you may want more than one colour is when your data has a critical breakpoint and every other data point sits either above or below it. Examples include showing growth either side of zero, or comparing values to a reference point, the mean or the median. The best practice is to join two hues, each varying in lightness, with a neutral colour at the midpoint.
For example, by using red and blue shades, Figure 4a lets readers quickly identify where performance is below or above a reference point (5 in this case). On social media platforms you will often encounter maps painted with a wide range of colours (known as rainbow colour schemes), as shown in Figure 4b. Although this looks beautiful, studies have shown that a rainbow scheme is no more effective than a carefully selected scheme using only one or two colours. See whether Figure 4b makes it any easier for you to identify patterns and variations in the data. When colour serves a functional purpose, it is very rare to need more than two or three colours.
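The diverging scheme described above, two hues meeting at a neutral midpoint, can be sketched in a few lines of standard-library Python. Everything named here is an assumption for illustration: the function diverging_colour, the red (0 degrees) and blue (220 degrees) hues, the lightness and saturation values, and the 0 to 10 score range around the reference point of 5. As with any HLS-based sketch, the steps are only approximately even to the eye.

```python
import colorsys


def _to_hex(r, g, b):
    """Convert RGB floats in the range 0..1 to a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def diverging_colour(value, midpoint, low, high, low_hue=0, high_hue=220,
                     neutral_lightness=0.95, end_lightness=0.35,
                     saturation=0.65):
    """Map a value to a hex colour on a two-hue diverging scale.

    Values below `midpoint` use `low_hue`, values above use `high_hue`,
    and the midpoint itself is a neutral light grey.
    """
    if not low < midpoint < high:
        raise ValueError("midpoint must lie strictly between low and high")
    value = min(max(value, low), high)  # clamp out-of-range values
    if value < midpoint:
        hue = low_hue
        distance = (midpoint - value) / (midpoint - low)
    else:
        hue = high_hue
        distance = (value - midpoint) / (high - midpoint)
    # The further from the midpoint, the darker and more saturated the colour.
    lightness = neutral_lightness + (end_lightness - neutral_lightness) * distance
    r, g, b = colorsys.hls_to_rgb(hue / 360.0, lightness,
                                  saturation * distance)
    return _to_hex(r, g, b)


# Hypothetical performance scores on an assumed 0-10 scale, reference point 5.
for score in (0, 2.5, 5, 7.5, 10):
    print(score, diverging_colour(score, midpoint=5, low=0, high=10))
```

Scaling saturation together with distance is a design choice: it keeps values near the reference point visually quiet, so the eye is drawn to the extremes on either side.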
3. Colours label items
When using colours to label categories (e.g. department, gender), the colours should look distinct from one another. This can be done by varying hue and saturation while keeping lightness constant. We should also avoid implying an inherent order or association through our colour choices. One common mistake is to use eye-popping neon colours and soft colours in the same chart. In Figure 5, for example, the bright magenta makes the Head Hunter category stand out the most for no reason.
Due to the limits of human perception, the maximum number of categories that can be distinguished by colour is around 10 to 12, and fewer in practice. If you have to label more than that, relying solely on colour won’t work; you may need to incorporate other cues such as patterns, or label items directly with text.
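To make the idea of distinct hues at a constant lightness concrete, here is a minimal standard-library Python sketch. The function name categorical_palette and its default lightness and saturation are assumptions of mine; for simplicity it varies only the hue and holds saturation fixed, and it refuses more than 12 categories in line with the limit mentioned above. HLS lightness is not the same as perceived brightness (a yellow will still look lighter than a blue), so treat the output as a starting point to check by eye.

```python
import colorsys


def _to_hex(r, g, b):
    """Convert RGB floats in the range 0..1 to a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def categorical_palette(n, lightness=0.55, saturation=0.60, start_hue=0.0):
    """Return `n` hex colours with evenly spaced hues and equal HLS lightness.

    Keeping lightness and saturation fixed stops any one category from
    standing out for no reason; the defaults are illustrative assumptions.
    """
    if not 1 <= n <= 12:
        raise ValueError("colour alone works for roughly 1 to 12 categories")
    colours = []
    for i in range(n):
        # Spread the hues evenly around the colour wheel.
        hue = (start_hue / 360.0 + i / n) % 1.0
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colours.append(_to_hex(r, g, b))
    return colours


# Hypothetical department names, used purely for illustration.
departments = ["Sales", "Finance", "HR", "IT", "Legal", "Operations"]
palette = dict(zip(departments, categorical_palette(len(departments))))
print(palette)
```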
The type of chart also matters. Pastels can fail to present data clearly in charts with small or thin shapes, such as scatter plots. The scatter plot examples in Figures 6a and 6b show that colour alone cannot reliably label items as small as individual dots, although the colouring does indicate clusters and trends. Conversely, in a bar chart or area chart, where the shapes are larger than dots, you may want to tone down the overall colour.
When categories have an inherent sequence (i.e. ordinal data), such as grade levels, capability ratings or sales stages, varying the intensity of a hue rather than assigning random colours makes the chart easier to understand (see Figure 7 below). As you can see in Figure 7c, increasing the intensity of the hue with seniority (in this case sales person, sales manager, senior sales manager and sales director) gives readers the most intuitive way to understand the chart.
4. Colours bring meanings and emotions
Although some colours may evoke different meanings and emotions in different cultural or business contexts, there seem to be some common threads. Figure 8, which visualises temperature variation across the year, is one example of a colour scheme that tends to cut across cultural boundaries. It is also an example of a palette with a wide range of hues being used in a useful way, in contrast to Figure 4b. It is commonly understood that warm colours denote high temperatures and cool colours low temperatures. Choosing an entirely unrelated colour scheme may make the visualisation less intuitive.
With these theories in mind, should we go on and create colour palettes from scratch? We can, but it takes a lot of practice and skill to get right. Useful tools already exist, such as ColorBrewer2, a web-based application that offers well-designed colour palettes with a colour-blind-safe option. If you already know enough about colour and would like to experiment and create your own palettes, try Colorgorical. Movies in Color is for those who would like some fun and inspiration.
The next time you create a visualisation, make sure you use colour deliberately. Why not revisit some of your charts and check whether their colours are doing their job?