Declining sea ice in the Arctic

24 Nov 2016

At a glance:

Adding a (small amount of) polish to a well known chart of seasonal Arctic sea declining over the years.

A number of data visualisations are circulating showing the disturbing rise in temperature at the North Pole and drop in coverage of Arctic sea ice. The current level of interest is credited to a tweet from Zack Labe, whose Twitter page is a great source of interesting visualisations on sea ice. Secondary examples chosen more or less at random include James Renwick’s daily tweet which also combines the Arctic and Antarctic extent, the Washington Post, and the Economist. We even have ‘meta’ articles like this really useful one in The Verge.

I wanted to familiarise myself with the data - published by the USA’s National Snow and Ice Data Center - to see if the presentations are reasonable and faithful (spoiler - they are). As there’s some mild controversy about adding together the Antarctic and Arctic sea ice levels I focused on just one and chose the Arctic. I think “sea ice” is an easier concept to deal with in the Arctic, without the complication of ice-covered polar land mass in the south.

Improving a graphic

Here’s my own graph of the data:


Some of the improvements I’ve attempted from the many - all of them excellent - prior versions include:

  • Include each individual year, not just a summary of +/- two standard deviations and 2012 picked out. Many of the graphics have a grey area showing the “usual” past distribution, but miss the chance to show the steady downwards decline over the years.
  • Colour code years in a sequential fashion. Versions that I’ve seen that colour code the years do so in a way that isn’t easy to follow the progress of time (eg rainbow colours), and I chose instead the carefully crafted viridis colours scheme.
  • Carefully design the legend to make it easy for the eye to follow the downwards pattern in time - there’s no point in having each individual colour for 30+ years listed in the legend.
  • Include all the data for each year, not just five or so months (a problem with a couple of the circulating examples).
  • Include as many years’ data as are available since regular estimates are available (bi-monthly at first, then monthly).
  • Careful use of transparency on all the years other than the latest year, to make it stand out.
  • A bit more detail on the source of the data on the graphic itself.
  • Minimal use of tick marks, axes, etc.

Not big things, and many of the graphics out there already have some of these, but the careful combination of them all makes a nice graphic and did help me get my head around the data. The data are well curated, uncontroversial in themselves, and readily available. The R code at the bottom of this post should work end to end for downloading the data and reproducing the image.

Time series analysis

Presented with decades worth of daily data, I couldn’t resist pulling out the time series analytical toolkit. To start with, here’s a seasonal decomposition down with the help of Cleveland’s stl loess-based seasonal decomposition, and my ggseas package which integrates it into a ggplot2 graphic workflow. The stl algorithm struggles to distinguish between trend and randomness for the first year of the data, then picks up a compelling story:


The downwards trend in the last few decades is if anything even more prominent in this graphic.

I also did a two year forecast using the seasonal auto-regressive integrated moving average (ARIMA) modelling approach. I adapt the approach set out by Rob Hyndman in a post on forecasting daily data, of using a regular Fourier cycle of artificial variables rather than relying on lagging variables by 365.25 days. This seems to work ok and gets plausible if dispiriting results:

  • a likely record low peak Arctic sea ice in March 2017 after the coming cold season
  • a non-zero chance that for a brief moment in 2018 there will be zero Arctic sea ice at all.


Ho hum. Melting sea ice doesn’t directly impact on sea level (because the ice is already floating on the water before it melts), but is a huge warning sign of serious climate change happening as we speak. Loss of sea ice reflection of the suns rays is also a warming factor itself (so I understand - I’m an amateur here). Plus it’s sad for the Arctic eco-system.

This blog post was only possible because of open data from the amazing USA National Snow and Ice Data Centre with extensive datasets from NASA and others.

Here’s the R code that did all the above (minus some fonts); it should work out of the box:

library(lubridate) # for yday


mon <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# This is the latest incomplete year's "near real time" data:
              destfile = "seaice_nrt.csv")

# And this is the earlier, fully definitive years' data
              destfile = "seaice_final.csv")

seaice_nrt <- read.csv("seaice_nrt.csv", skip = 2, header = FALSE)[ , 1:5]
seaice_final <- read.csv("seaice_final.csv", skip = 2, header = FALSE)[ , 1:5]

seaice <- rbind(seaice_final, seaice_nrt)
names(seaice) <- c("year", "month", "day", "extent", "missing")

# test none missing:
expect_equal(sum(seaice$missing == 0), nrow(seaice))
seaice <- seaice %>%
   mutate(date = as.Date(paste(year, month, day, sep = "-"))) %>%
   group_by(month) %>%
   mutate(monthday = month + day / max(day)) %>%
   ungroup() %>%
   mutate(month = factor(month, labels = mon)) %>%
   arrange(year, month, day) %>%
   mutate(timediff = c(NA, diff(date)),
          dayofyear = yday(date))

# clean up (unless you want to keep the csvs)

# try to improve on

# difference between area and extent explained at
# Basically, scientists prefer extent, and that is the only one in the daily data (monthly has area too).

# There is a trick in the below, equally spacing the months.  This only slightly warps
# the visual though(makes February look about 8% longer than it is), not noticeable
seaice %>%
   mutate(latestyear = (year == max(year))) %>%
   ggplot(aes(x = monthday, y = extent, group = as.factor(year), 
			  colour = year, alpha = latestyear)) +
   geom_line() +
   scale_alpha_discrete(range = c(0.25, 1), guide = "none") +
   scale_color_viridis("", direction = -1, guide = guide_legend(reverse = FALSE))  +
   scale_x_continuous(breaks = 1:12, labels = mon) +
   theme_tufte() +
   theme(legend.position = c(0.2, 0.3)) +
   # x coordinate in the next line is a magic number, would need manual updating if done with more data:
   annotate("text", x = 12, y = seaice[nrow(seaice), ]$extent, label = maxyear) +
   ggtitle("The extent of Arctic sea ice has been steadily declining...",
           subtitle = "...but this year is special.") +
   labs(y = "Area of ocean with at least 15% sea ice \n(millions of square kilometres)",
        x = "",
        caption = "Source: USA National Snow and Ice Data Center\n") +
   theme(plot.caption = element_text(colour = "grey50"))

#===============timeseries analysis==========
# the data is not actually all daily but starts as every second day until about 1987.
# Latest date that is 2 days ahead of the previous date:
seaice %>%
   filter(timediff == 2)  %>%
   filter(date == max(date)) %>%

seaice_daily <- seaice %>%
   filter(date > as.Date("1987-08-20")) 

# Decomposition:
ggsdc(seaice_daily, aes(x = date, y = extent), method = "stl", 
		s.window = 7, frequency = 365.25) +
   geom_line() +
   ggtitle("Seasonal decomposition of Arctic seasonal ice coverage",
           subtitle = "Some difficult patterns when daily data was new, then a steady decline") +
   labs(x = "", y = "Sea ice extent\n",
        caption = "Analysis by; data from National Snow and Ice Data Center")

# Forecasting:		
# adapting the method at
# going to pretend no frequency in the call to ts we use later in auto.arima
seaice_ts1 <- ts(seaice_daily$extent, frequency = 1, start = 1987 + (8 + 20 / 31) / 12 )
seaice_ts365 <- ts(seaice_daily$extent, frequency = 365.25, start = 1987 + (8 + 20 / 31) / 12 )

# and instead create a set of fourier seasonal terms
z <- fourier(seaice_ts365, K = 5)
zf <- fourier(seaice_ts365, K = 5, h = 2 * 365)

# Fit model:
model <- auto.arima(seaice_ts1, xreg = z)

# Perform forecast:
seaice_fc <- forecast(model, 2 * 365, xreg = zf)

# Graph results:
autoplot(seaice_fc) + 
      labs(x = "", y = "Arctic ice extent") +
      ggtitle("Forecast Actic ice coverage to late 2018") +
      labs(caption = "Analysis by; data from National Snow and Ice Data Center",
           x = "Days since 20 August 1987 (when daily measurements began)")