Thursday, July 25, 2013

ggplot2 axis mark labels: dates

I banged my head on this for a while, so I'll drop things here for future use.

I had a dataset like this (in case someone wants to reproduce things).

dset <- structure(list(
  period = structure(c(15340, 15340, 15431, 15431, 15522, 15522, 15614, 
                       15614, 15706, 15706, 15340, 15340, 15431, 15431, 
                       15522, 15522, 15614, 15614, 15706, 15706), class = "Date"), 
  mylevel = c("first", "second", "first", "second", "first", "second", 
              "first", "second", "first", "second", "first", "second", 
              "first", "second", "first", "second", "first", "second", 
              "first", "second"), 
  mygroup = c("first", "first", "first", "first", "first", "first", 
              "first", "first", "first", "first", "second", "second", 
              "second", "second", "second", "second", "second", "second", 
              "second", "second"), 
  myval = c(5.68927789934355, 12.4668435013263, 8.50574712643678, 
            9.21052631578947, 5.79964850615114, 11.864406779661, 
            6.63507109004739, 8.27067669172932, 7.60233918128655, 
            10.3030303030303, 11.5713243721996, 12.868193989884, 
            10.9409799554566, 12.2498118886381, 10.3649843170801, 
            13.6053288925895, 10.0093381580483, 13.4111885477491, 
            10.2109430074291, 11.9337016574586), 
  mylower = c(4.27785665208779, 9.30578687083796, 6.73766221814479, 
              6.20777175527346, 4.02547019162226, 8.40474543043271, 
              5.05063053061992, 5.25570937453824, 5.91602875207158, 
              7.24127641498182, 11.1220283594508, 12.0765169377083, 
              10.4880238235312, 11.4706823400243, 9.86527996838732, 
              12.7476946734056, 9.5638481418534, 12.505411895047, 
              9.77751066855676, 11.14516485388), 
  myupper = c(7.39412079167001, 16.2315194276058, 10.5605825092892, 
              13.036419049039, 8.04876321621977, 16.112730355332, 
              8.53001582404356, 12.2542112131, 9.58747177688514, 
              14.0992747034508, 12.0322972656053, 13.6922322688037, 
              11.4066255686044, 13.0622336650376, 10.8811950353039, 
              14.4984825068148, 10.4684536436971, 14.3573893347113, 
              10.6569666040485, 12.7574674651682)), 
  .Names = c("period", "mylevel", "mygroup", "myval", "mylower", "myupper"), 
  row.names = c(NA, 20L), class = "data.frame")

I needed a chart like the following:




Which I quickly obtained with the code:

library(ggplot2)
ggplot(data=dset, aes(x=period, y=myval)) +
  geom_line(lwd=1, aes(col=mygroup)) +
  geom_ribbon(aes(ymin=mylower, ymax=myupper, fill=mygroup), alpha=.5) +
  facet_grid(mylevel ~ .) +
  xlab("period") +
  ylab("value") +
  theme(legend.title = element_blank())

(Note: I have an Italian locale--the months displayed are January, April, July and January again)

Since the period column is actually the first day of a quarter, I wanted to change the labels on the x-axis to quarters, with the help of the zoo package.

However, when I submitted the code:

library(zoo)
ggplot(data=dset, aes(x=period, y=myval)) +
  geom_line(lwd=1, aes(col=mygroup)) +
  geom_ribbon(aes(ymin=mylower, ymax=myupper, fill=mygroup), alpha=.5) +
  facet_grid(mylevel ~ .) +
  xlab("period") +
  ylab("value") +
  scale_x_continuous(breaks=unique(dset$period)
                     , labels=dset$period) +
  theme(legend.title = element_blank())


I got the following error:

Error: Discrete value supplied to continuous scale

After some trial and error, I found a way to get the result I wanted, which requires converting the dates to numbers.

ggplot(data=dset, aes(x=as.integer(period), y=myval)) +
  geom_line(lwd=1, aes(col=mygroup)) +
  geom_ribbon(aes(ymin=mylower, ymax=myupper, fill=mygroup), alpha=.5) +
  facet_grid(mylevel ~ .) +
  xlab("period") +
  ylab("value") +
  scale_x_continuous(breaks=as.integer(unique(dset$period))
                     , labels=unique(as.yearqtr(dset$period))) +
  theme(legend.title = element_blank())

And here's my desired result.
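
To make it clearer what the breaks and labels arguments end up containing, here is a small sketch. The `periods` vector is just the five unique dates from the dataset retyped by hand, and the names `brks` and `labs` are mine, chosen for illustration only.

```r
library(zoo)

# The five unique quarter start dates from the dataset, retyped by hand
periods <- as.Date(c("2012-01-01", "2012-04-01", "2012-07-01",
                     "2012-10-01", "2013-01-01"))

# A Date is stored as the number of days since 1970-01-01, so
# as.integer() yields plain numbers that a continuous scale accepts as breaks
brks <- as.integer(periods)

# as.yearqtr() maps each date to its quarter; format() turns that into text
labs <- format(as.yearqtr(periods))

print(brks)
print(labs)
```

The two vectors have the same length and the same order, so each numeric break gets the quarter label of the date it came from.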


I don't know if there is a less awkward way to get the chart above. Let me know in the comments below if I missed something obvious.
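
One route that might be less awkward is to keep the x values as dates and let a date scale do the labelling. This is only a sketch: it assumes your ggplot2 version lets `scale_x_date()` take a vector of Date breaks and a function for `labels`. The `toy` data frame and the `quarter_labels` helper are made up for this example (the values are the first series from the dataset above, rounded to two decimals).

```r
library(ggplot2)
library(zoo)

# Toy data: the five quarter start dates, with the first series'
# values rounded to two decimals (only here to keep the sketch runnable)
toy <- data.frame(
  period = as.Date(c("2012-01-01", "2012-04-01", "2012-07-01",
                     "2012-10-01", "2013-01-01")),
  myval  = c(5.69, 8.51, 5.80, 6.64, 7.60)
)

# Hypothetical helper: takes the break dates and returns text like "2012 Q1"
quarter_labels <- function(d) format(as.yearqtr(d))

# x stays a Date; the date scale places one break per quarter start
p <- ggplot(toy, aes(x = period, y = myval)) +
  geom_line() +
  scale_x_date(breaks = unique(toy$period), labels = quarter_labels)

p
```

If this works in your setup, the same `scale_x_date()` call should be able to replace the `scale_x_continuous()` line in the full chart, with `x=period` left as it is in `aes()`.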