Dual axes time series plots may be ok sometimes after all

Are they really as bad as all that?

I’ve been mulling over time series charts with two different y axes, which are widely deprecated in the world of people who see ourselves as professional data workers. Looking down on dual axis time series charts is one of the things that mark one as a member of a serious data visualiser - after shaking our heads at pie charts, and cringing in horror at three-dimensional chart junk. But I’ve come to the surprising conclusion (surprising for me) that the arguments against them don’t stack up - at least not to the stage of justifying a blanket ban.

Before I go further, here’s my best example of one of these charts with real data, as a talking point. The data come from the Reserve Bank of New Zealand, and my chart below is essentially a version of their graphic that I’ve enhanced to meet my own style requirements. For the record, my improvement steps were:

choose vertical scales that make the two series as comparable as possible, equivalent graphically to how they would be drawn it they were indexed rather than original values
remove distracting shadows
remove borders
make the data stand out more
colour code the axes to match the data series, using colours with equal chroma and luminance so they are perceptually equal

Stephen Few looks at the issue of dual axes plots in a well reasoned piece and concludes cautiously that they are never justified. A nice piece by Keran Healy looks at a particular example and concludes that the dual axes encourage sloppy thinking with regard to time series causality. There are a lot of shorter pieces such as this one pouring scorn on the technique too. Hadley Wickham is famously opposed to dual axes and has made it difficult to use them in ggplot2 (I have to say, in the case of this example I agree with Professor Wickham’s comment “MY EYES. MY EYES. OH THE HUMANITY”).

Here’s my attempt at a paraphrase of the case against dual axis charts:

The designer has to make choices about scales and this can have a big impact on the viewer
In particular, “cross-over points” where one series cross another are results of the design choices, not intrinsic to the data, and viewers (particularly unsophisticated viewers) will not appreciate this and think there is more significance in cross over than is actually the case
They make it easier to lazily associate correlation with causation, not taking into account autocorrelation and other time-series issues
Because of the issues above, in malicious hands they make it possible to deliberately mislead
They often look cluttered and aesthetically unpleasing

All of these points have some substance, but having carefully thought it through I think that they are not conclusive. I reached this position not by disagreeing with the statements above, but by observing that the same criticisms can be made of most if not all standard methods of representing the sort of data for which a dual axis time series chart would be considered. Most importantly, in preparing this post I came up with the idea (surely not original to me but I can’t find the prior references) that one can mitigate many of the problems of dual axes time series plots by choosing the limits of the two y axes in such a way that the two series are rendered on the page as though they had been converted to indices first.

Let’s look at the alternatives to dual axes charts in the case of the foreign exchange data plotted earlier.

The most commonly recommended alternative to dual axes is a free scaled facet plot like this one:

This is a good graphic, no doubt about it, and I think will often be my preferred approach to this type of data. Interestingly, this is still basically a dual axis plot. We’ve let the vertical axes take two different scales, but by displacing the two facets from eachother we avoid problems like the false “cross-over” points, and we improve the uncluttered aesthetic appeal. However, it’s worth noting that the choice of scale is still a design choice. This graphic, drawn with ggplot2 which has good defaults and does not allow easy configuration of faceted vertical axes, rightly chooses vertical scales to maximise use of the page - but the difficulty in over-riding that is a ggplot feature, not a feature of faceted scales in general.

Because the choice of scale is a choice by the designer, objections 1 and 4 (arbitrary differing scales, and vulnerable to malicious misleading) apply to this sort of graphic as much as a dual axis plot. Consider this alternative version:

It’s a similar plot; I’ve just changed the vertical limits of the bottom of the two charts. I had to exit the facet paradigm of ggplot2 to do this because it rightly makes it difficult to warp the y axes of individual facets; but with a little effort I managed to make it look to a naive viewer as though the US exchange rate is more volatile than the trade weighted index.

What the facet version does do is make it harder to do micro comparisons. For example, compared to my original superimposed dual axis plot, it’s no longer possible to tell in either of the facet versions the periods when changes to the trade weighted index preceded those to the US exchange rate and vice versa.

Finally, splitting the plot into facets does precisely nothing about avoiding problem #3. The fact is, confusing correlation with causation is rife, and making the correlation less obvious is not a sustainable solution.

Indexes don’t solve the problem fully

Here’s a second common way of dealing with this problem - converting both variables to a common index so they can use the same scale. In fact, I’m the author of the ‘stat_index()’ function in the ggseas package which makes it easy to do this in the ggplot2 universe, and I’m a fan of this approach in the right circumstance too. Here’s how it looks with this data:

[By the way, if the shape of this graphic looks familiar, that’s because it is basically identical to my first dual axis chart, which I started the blog post with. I chose the range of the scales shown by my two dual axes in that chart in such a way that the two series would be drawn effectively identically to this index plot.]

With this indexing, we’ve now solved problem #1 about the arbitrary scaling of the axes. Unlike in the standard dual (free) axis version or its close cousin the free vertical axis facet plot, there is now a natural vertical scale which we can’t change for just one data series at a time. However, problems 2, 3, 4, 5 are not addressed in any way by this. Consider how the choice of reference period for the index impacts on the crossover point of the series, as shown in this alternative:

This revised plot is still ok, but an unsophisticated viewer might see it as telling a different story from the first in terms of which series “catches up” with the other and when (the cross-over objection again). By constraining ourselves to an index we leave a few less options for arbitrarily changing scales in misleading ways, but we still to need to make design choices that materially impact on the look of the data.

You might think the answer is “always make the earliest point in the series the reference point of the index”, so these plots are always showing growth from the beginning of time, but it doesn’t take long to think of times when that’s going to be an unhelpful restriction.

In converting to an index we haven’t addressed the problem of spurious correlation (of course not - how could a graphic do that anyway, except by failing to show the patterns?). We have still left the designer in charge of key decision about cross over of the series. And in the process, we have taken a decisive hit in interpretability by losing the original scale. Many of my stakeholders hate indexes with a passion, because of their abstract nature and the difficulty in explaining how they relate to units they are familiar with.

Below, I suggest a method of dual axis plots that has the non-misleading scale properties of an index chart while retaining interpretability.

Connected scatterplots are cluttered, and too complex for most audiences

For analytical purposes I think a connected scatter plot should be the standard choice for examining two continuous variables change over time (but see below for caveats on this). Here’s the continuous scatter plot in action for this particular dataset:

The strengths and weakenesses of this are apparent. It’s a good plot for seeing the relationship of the two (although a connected scatter plot of the first differences, or growth rates, is often better) but it’s cluttered; the data is drawn over itself as time goes on; and its going to be an unfamiliar visual shock for most non-specialist audiences. It gets worse with longer time series and with seasonal data. This is a good plot for the exploratory and analytical phases of data analysis but unlikely to be a final presentation graphic.

How not to use dual axes

Before I go on to good practice, let’s clear up some potential straw men. Here’s some examples of bad practice that I agree with critics should be very definitely avoided.

Don’t muck up one of the scales

Those are both instances of arbitrarily playing with the scales of one of the axes to use less vertical space and hence artificially misleading - it looks like one of the series has less variance than the other. As I pointed out earlier, this can be done even if you have two separate charts; I would argue that the dual axes in fact is revealed as having power precisely because the misleading effect is so important when done with the data series crossing eachother. With great data comes great responsibility. In this case, the responsibility is to use a justifiable, objective-if-possible, algorithm to set the scales of the two vertical axes.

Don’t use two completely conceptually different scales (and in particular, don’t scrawl growth rates all over the original data)

Here’s another case that I think is probably never justified, although it is very commonly used:

I hate this format that overlays growth rates on the original data - I find it a crime against aesthetics to start with, but more importantly I find it very difficult to read and I’m sure that’s the case for many viewers. When the scales are so different I see no justification for overlaying the two data series - use two facets in this case.

The choice to use them

Here’s what I think should be taken into account in the choice to use dual axis time series charts:

As Few suggests, the case to use these charts is strongest when the eye is drawn to the pattern over time of an individual series ie horizontal rather than vertical comparisons. Vertical comparisons between the two series are misleading and need to be discouraged.
Don’t use two axes when the unit of measure is the same (eg pass rates out of 100 in exams in two subjects). There’s just too much scope for confusion. Use a single axis.
Don’t use two axes the scales are fundamentally different (eg temperature, and change in temperature). Use two charts (or two facets, depending on how you look at it).
Do consider their use when the scales are comparable (eg two things in which getting larger on the graphic has a similar interpretation - like the two exchange rate variables in my examples)
Use dual axes rather than an index when the interpretability cost of converting to a common index is greater than acceptable (sorry, I don’t have a guideline for when that would be….)
Don’t use when you really have three data series to show, not just two (this should be obvious, but…)

Good practice when you do use them

In general, playing around with these has led me to the following conclusions for good dual axes charts:

Use a consistent process to choose the limits of the two vertical axes, an objective, automated algorithm if possible. One way of doing this, which I’ve made the default in my R function below, is to set the limits of the two axes so that the actual lines drawn look the same as if they were of indices, even though the scales give you actual values.
In particular, make the scales of the axes conceptually similar. Don’t show growth on one and original data on the other. Don’t use logarithms on one and natural scale on the other. And so on. If meeting this principle is difficult because of the complex concepts, measurement and scales you have, you probably don’t want to use dual axes.
Colour-code the axes. This is the single biggest thing that would make most of the dual-axes graphics I see more readable. And even if you’re using Excel it should still be possible. I’m surprised it’s not standard (and believe me it isn’t even common).
Use colours with equal chroma and luminance, so only the hue varies. This gives equal perceptual weight to both the catogories.
Avoid the acronyms “LHS” and “RHS” for left-hand side and right-hand side. This is exclusive jargon that puts off many users.

The `dualplot()` function

To help me produce good quality dual axes time series charts with scales that aren’t misleading and good colour schemes, I built a little function in R.

The dualplot() function lets you choose the end points of the two vertical scales in such a way that the data are rendered on the screen as though they had been converted to an index. In fact, you can think of the result as a graphic of indexed series - as recommended by many critics of dual-axes plots - that just has the original scales back-transformed and added as annotations (this isn’t how it works under the hood but the effect is the same). The ylim.ref argument let’s you specify which element in the time series to use as the reference point for indexing (it defaults to 1, and won’t work for values other than 1 when the two series to be plotted are of different lengths).

Here’s an example output of dualplot() which replicates the visual impression of the index chart I used before with January 2014 as the reference point for indexing - this now becomes one of the crossover points of the two series, and the vertical scales are forced to be non-misleading (ie the same on the page as they would be for an index chart):

You can load the dualplot() function by:

source("https://gist.githubusercontent.com/ellisp/4002241def4e2b360189e58c3f461b4a/raw/9ab547bff18f73e783aaf30a7e4851c9a2f95b80/dualplot.R")

…or here’s the original code:

You’re welcome to use it or adapt it but I can’t guarantee its stability just yet - I may change it as developing and testing continues. More thorough testing on a wider range of data is needed, but would make this blog post too long so I’ll get round to it in the next month or so.

Code

Here’s the code in R that downloaded the data and produced all the graphics above.

library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)
library(ggseas) # for stat_index
library(grid)
library(gridExtra)

# Download data from the Reserve Bank of New Zealand
download.file("http://www.rbnz.govt.nz/-/media/ReserveBank/Files/Statistics/Key%20graphs/graphdata.xlsx?la=en",
              destfile = "rbnz.xlsx", mode = "wb")

# Import some of that data into R and create a numeric TimePeriod variable from the original
# string that shows year and month:
forex <- read_excel("rbnz.xlsx", sheet = "8_NZDUSD", skip = 4) %>%
   mutate(year = as.numeric(substring(DATE, 1, 4)),
          month = as.numeric(substring(DATE, 6, 7)),
          TimePeriod = year + (month - 0.5) / 12) %>%
   select(-DATE, -year, -month)

# Tidy up names:
names(forex)[1:2] <- c("NZDUSD", "TWI")

# Create a long, thin ("tidy") version for use with ggplot2:
forex_m <- forex %>%  gather(variable, value, -TimePeriod) 

# Set the basic foundation of the coming ggplot graphics:
basicplot <- ggplot(forex_m, aes(x = TimePeriod, y = value, colour = variable)) +
   labs(x = "", caption = "Data from RBNZ; graphic by http://ellisp.github.io", colour = "")


#-----------------facet versions-------------------
# Good facet plot:
basicplot +
   geom_line() +
   facet_wrap(~variable, scales = "free_y", ncol = 1) +
   ggtitle("Comparing two time series with facets may reduce comparability")

# Misleading facet plot from playing around with one of the scales:
p1 <- forex_m %>%
   filter(variable == "NZDUSD") %>%
   ggplot(aes(x = TimePeriod, y = value)) +
   geom_line() +
   labs(x = "", y = "USD purchased for one NZD") +
   ggtitle("NZDUSD")

p2 <- forex_m %>%
   filter(variable != "NZDUSD") %>%
   ggplot(aes(x = TimePeriod, y = value)) +
   geom_line() +
   labs(x = "", y = "Trade weighted index") +
   ylim(c(0, 100)) +
   ggtitle("TWI")

grid.arrange(p1, p2, ncol = 1)
#--------------index--------------

# Good index plot:
basicplot +
   stat_index(index.ref = 1) +
   labs(y = "Index (January 1984 = 100)") +
   ggtitle("Usually accepted version of comparing two time series",
           subtitle = "Converted to an index, reference period first point in time")

# Also a good index plot, but showing that arbitrary choices are still being made:
basicplot + 
   stat_index(index.ref = 361) +
   labs(y = "Index (January 2014 = 100)") +
   ggtitle("But then, a different picture?",
           subtitle = "Converted to an index, reference period chosen arbitrarily later in the series")

#---------------connected scatterplot-------------
forex %>%
   mutate(label = ifelse(round(TimePeriod - floor(TimePeriod), 3) == 0.042, substring(TimePeriod, 1, 4), "")) %>%
   ggplot(aes (x = NZDUSD, y = TWI, label = label, colour = TimePeriod)) +
   geom_path() +
   geom_text(fontface = "bold") +
   scale_colour_gradientn("", colours = c("grey75", "darkblue")) +
   ggtitle("Connected scatter plot may be the best analytically\nbut is intimidating to non-specialists")


#------------dual axis version-------------
# As we're drawing a number of these, we want a function to make it easier.
# Here's one I prepared earlier:
source("https://gist.githubusercontent.com/ellisp/4002241def4e2b360189e58c3f461b4a/raw/9ab547bff18f73e783aaf30a7e4851c9a2f95b80/dualplot.R")     

# bad:
with(forex, dualplot(x1 = TimePeriod, y1 = NZDUSD, y2 = TWI, 
                     colgrid = "grey90", ylim2 = c(20, 200)))

# bad:
with(forex, dualplot(x1 = TimePeriod, y1 = NZDUSD, y2 = TWI, 
	colgrid = "grey90", ylim1 = c(0, 1)))

# verybad:
forex2 <- forex %>%
   mutate(NZDUSD_growth = c(NA, diff(NZDUSD)) / NZDUSD * 100)
with(forex2[-1, ], dualplot(x1 = TimePeriod, y1 = NZDUSD, y2 = NZDUSD_growth, 
	colgrid = "grey90", ylim2 = c(-15, 15)))   

# ok:
with(forex, dualplot(x1 = TimePeriod, y1 = NZDUSD, y2 = TWI, lwd = 1.2, colgrid = "grey90", 
                     main = "NZ dollar exchange rate & trade-weighted index",
                     ylab1 = "US dollars for one NZ dollar",
                     ylab2 = "Index",
                     yleg1 = "NZD / USD exchange rate (left axis)",
                     yleg2 = "Trade-weighted index (right axis)",
                     mar = c(5,6,3,6)))
mtext(side = 1, "Data from RBNZ; graphic by http://ellisp.github.io", 
	adj = 1, cex = 0.8, line = 3)

# ok again, equivalent to reference point for indexing of January 2014
with(forex, dualplot(x1 = TimePeriod, y1 = NZDUSD, y2 = TWI, lwd = 1.2, colgrid = "grey90", 
				 main = "NZ dollar exchange rate & trade-weighted index",
				 ylim.ref = c(361, 361), 
                 ylab1 = "US dollars for one NZ dollar",
				 ylab2 = "Index",
				 yleg1 = "NZD / USD exchange rate (left axis)",
				 yleg2 = "Trade-weighted index (right axis)",
				 mar = c(5,6,3,6)))
mtext(side = 1, "Data from RBNZ; graphic by http://ellisp.github.io", 
	adj = 1, cex = 0.8, line = 3)

#--------tidy up-----------
unlink("rbnz.xlsx")

Dual axes time series plots may be ok sometimes after all

At a glance:

Are they really as bad as all that?

Facets can sometimes make comparisons more difficult

Indexes don’t solve the problem fully

Connected scatterplots are cluttered, and too complex for most audiences

How not to use dual axes

Don’t muck up one of the scales

Don’t use two completely conceptually different scales (and in particular, don’t scrawl growth rates all over the original data)

The choice to use them

Good practice when you do use them

The `dualplot()` function

Code

Dual axes time series plots may be ok sometimes after all

At a glance:

Are they really as bad as all that?

Facets can sometimes make comparisons more difficult

Indexes don’t solve the problem fully

Connected scatterplots are cluttered, and too complex for most audiences

How not to use dual axes

Don’t muck up one of the scales

Don’t use two completely conceptually different scales (and in particular, don’t scrawl growth rates all over the original data)

The choice to use them

Good practice when you do use them

The dualplot() function

Code

The `dualplot()` function