Why rainbow colors aren't the best option for data visualizations
Data visualizations are beautiful, exciting ways to tell stories. But you have to choose carefully in designing a map or chart, and one of the biggest mistakes is misusing rainbow colors.
Rainbow color schemes -- also called spectral color schemes -- are frequent choices for visualizing data, both because they look bold and exciting and because they're the default for many visualization software tools. But they usually do more harm than good. Detecting the colors at all is a problem for more readers than you might guess, and the rest of the audience will find it easier to understand the visualization if it's presented with a different palette.
Rainbow color schemes are “almost always the wrong choice,” Anthony C. Robinson, geography professor at Pennsylvania State University, wrote in an online class on Coursera that taught students how to use geospatial technologies to map data.
Here are some reasons why rainbow colors are the "wrong choice":
Colorblindness and ordering colors
People who are colorblind have difficulty telling certain colors apart, particularly red and green. (Try this color vision test to see if you're one of them.) Colorblindness affects up to 10 percent of men. That means if you're serving up visuals to an audience of hundreds of thousands, a large slice of that audience may not be able to read them.
Even though most people aren’t colorblind, rainbow color schemes can be confusing because there’s no clear "greater than" or "less than" logic to ordering the colors, warn computer science researchers David Borland and Russell M. Taylor II. People generally agree on the progression from light to dark, but sort colors differently, as shown here:
Changes can be hard to see
Visualizations tell the story behind changes in data; their job is to simplify complex patterns into an illustration that lets you understand -- ideally at a glance -- what's going on. But human eyes aren’t good at detecting the edges of different colors sitting side by side. We’re better at seeing small changes within single color ranges because luminance and saturation values change smoothly where colors do not, wrote Robert Kosara, visual analysis researcher at Tableau and an expert on how we see color, on his personal website, EagerEyes.
The details get technical very quickly, but the key lesson is that rainbow colors show differences only where the hue itself changes, while color gradients let people see gradual changes.
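For readers who want to check the luminance point themselves, here is a minimal Python sketch. It builds a seven-step rainbow and a seven-step single-hue blue gradient, then measures how bright each step is using the relative luminance formula from the WCAG accessibility guidelines. The two palettes, the function names and the choice of seven steps are assumptions made for illustration; they do not come from Kosara or the other researchers cited here.

```python
import colorsys

def relative_luminance(rgb):
    """Relative luminance (WCAG definition) of an sRGB color with channels in 0..1."""
    def linearize(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def rainbow(n):
    # Illustrative rainbow: hue sweeps from red to violet at full saturation.
    return [colorsys.hsv_to_rgb(0.75 * i / (n - 1), 1.0, 1.0) for i in range(n)]

def single_hue(n, hue=0.6):
    # Illustrative gradient: one blue hue, lightness falling from 0.9 to 0.2.
    return [colorsys.hls_to_rgb(hue, 0.9 - 0.7 * i / (n - 1), 0.8) for i in range(n)]

def is_monotonic(values):
    # True if the values only rise or only fall from one step to the next.
    rising = all(a <= b for a, b in zip(values, values[1:]))
    falling = all(a >= b for a, b in zip(values, values[1:]))
    return rising or falling

rainbow_lum = [relative_luminance(c) for c in rainbow(7)]
gradient_lum = [relative_luminance(c) for c in single_hue(7)]

print("rainbow: ", [round(v, 2) for v in rainbow_lum], is_monotonic(rainbow_lum))
print("gradient:", [round(v, 2) for v in gradient_lum], is_monotonic(gradient_lum))
```

In this sketch the rainbow's brightness rises, dips and then drops sharply as the hue moves toward blue, while the single-hue gradient darkens steadily, so a reader can order its steps by lightness alone.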
Depending on your audience, the wrong choice can have serious consequences. In a Harvard study, researchers found 2-D diagrams of heart arteries that used a gradient from black to red were more effective tools for doctors making diagnoses than 3-D models using rainbow colors. Clinical studies showed the diagrams that used a gradient increased the accuracy of doctors' diagnoses of atherosclerosis and heart disease from 39 percent to 91 percent.
Not every data visualization is used in making critical medical calls, but rainbow colors can still mislead readers when journalists use them to represent quantitative data.
"Rainbow colors are not bad if you're using them for categorical data," Drew Skau, visualization architect at Visual.ly, told Poynter in a video interview. “They're bad if you use them to represent continuous data.”
What's the difference? Continuous data is quantitative and described by numbers; categorical data is qualitative and described by words. For example, compare these groupings:
- Exotic pets: chinchilla, ocelot, scorpions, hissing cockroaches, pythons
- Temperature in Fahrenheit: -459.67°F, 32°F, 212°F
- Electoral votes during elections: 206, 270, 332
The exotic pets are related to each other, but not continuous -- you can’t measure the difference between a chinchilla and an ocelot. The temperature readings, on the other hand, are continuous -- they're numbers on a scale with measurable distances.
Electoral votes are continuous data, but they're also divergent. We want to know where the midpoint is (270 electoral votes) because whoever receives more than 50 percent of the votes wins. Thus, the data visualization usually shows blue to represent Democrats on one end and red for Republicans on the other end, which is the ideal way to represent divergent data.
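One way to picture a two-ended scale like this is as two gradients joined at the midpoint. The Python sketch below is an assumption-laden illustration, not a method taken from any of the sources quoted here: the function names, the specific red, white and blue RGB values, and the choice to scale a candidate's electoral votes from 0 to 538 around 270 are all hypothetical.

```python
# Illustrative endpoint and midpoint colors (hypothetical values for this sketch).
RED = (178, 24, 43)
WHITE = (247, 247, 247)
BLUE = (33, 102, 172)

def blend(start, end, t):
    """Linear blend between two RGB colors; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))

def diverging_color(value, low, mid, high):
    """Map a value to red below the midpoint, white at it, and blue above it."""
    value = max(low, min(high, value))  # clamp out-of-range values
    if value <= mid:
        return blend(RED, WHITE, (value - low) / (mid - low))
    return blend(WHITE, BLUE, (value - mid) / (high - mid))

# Electoral vote totals from the example list, as one candidate's share of 538.
for votes in (206, 270, 332):
    print(votes, diverging_color(votes, low=0, mid=270, high=538))
```

Because each half of the scale is a simple light-to-dark gradient, the reader can tell at a glance both which side of the midpoint a value falls on and how far from it.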
This exercise from Robinson shows how spectral colors make it much harder to tell the difference in volume of tweets (which is quantitative data) during the 2012 presidential elections:
But rainbow colors are often used to illustrate quantitative data, even by NASA scientists. Academics have urged the scientific community to stop using spectral colors, and scientists and engineers are worried about the accuracy of color use. As journalists, we can learn from both the research and the arguments.
Help from the experts
Many data experts have built useful tools to help you pick colors:
- ColorBrewer by Cynthia Brewer, Mark Harrower and Penn State helps you design color palettes for maps; you can choose the number of data items, the type of data, and even colorblind-safe colors.
- Color Tool, created by former NASA researchers, offers a professional-grade app for complex infographics and aeronautical displays.
- Adobe's Kuler is a slick color wheel that offers color schemes.
- Poynter's NewsU's digital tools catalog has a range of tools with which you can get started visualizing data.
Colors are wonderful -- in researching this article, I discovered things about them I never knew, such as the fact that yellow is the brightest color of the rainbow and that people who speak other languages may see colors English speakers can't. Colors help make visualizations exciting, but a few wise color choices can ensure those visualizations are, more importantly, informative.