Monday, August 26, 2013

Georouting with Rome2rio's API

As I was looking for transit routes for a trip, I stumbled on the Rome2Rio website, which provides tons of results when you query for a trip. It provides directions for several travel modes (flights, trains/buses, driving), and it currently gathers transit info from more providers than Google Maps.

The great thing is they also provide an API, so it's quite straightforward to access their data from R.

Just sign up for an API key on their website, and the following few lines are what you need for a first exploration.

library(RJSONIO)
r2rK <- "<your-API-key-here>"
r2rS <- "<server-address-here>"
 
from <- "<origin-location-here>"
to <- "<destination-location-here>"
 
url_string <- URLencode(paste0("http://", r2rS, "/api/1.2/json/Search?key=", r2rK, "&oName=", from, "&dName=", to, "&currency=EUR"))
 
trip <- fromJSON(paste(readLines(url_string), collapse = ""))
Created by Pretty R at inside-R.org


You need to provide your key and the server address (both provided by Rome2Rio when you sign up), and the origin and destination (simply type a city name, it will be geocoded!).

Check the API documentation if you need help to understand what's inside the trip object.
Note that paths are encoded Google Maps style... but you already know how to deal with that!

Thursday, July 25, 2013

ggplot2 axis mark labels: dates

I banged my head on this for a while, so I'll drop things here for future use.

I had a dataset like this (in case someone wants to reproduce things).

dset <- structure(list(period = structure(c(15340, 15340, 15431, 15431, 
                                            15522, 15522, 15614, 15614, 15706, 15706, 15340, 15340, 15431, 
                                            15431, 15522, 15522, 15614, 15614, 15706, 15706), class = "Date"), 
                       mylevel = c("first", "second", "first", "second", "first", 
                                   "second", "first", "second", "first", "second", "first", 
                                   "second", "first", "second", "first", "second", "first", 
                                   "second", "first", "second"), mygroup = c("first", "first", 
                                                                             "first", "first", "first", "first", "first", "first", "first", 
                                                                             "first", "second", "second", "second", "second", "second", 
                                                                             "second", "second", "second", "second", "second"), myval = c(5.68927789934355, 
                                                                                                                                          12.4668435013263, 8.50574712643678, 9.21052631578947, 5.79964850615114, 
                                                                                                                                          11.864406779661, 6.63507109004739, 8.27067669172932, 7.60233918128655, 
                                                                                                                                          10.3030303030303, 11.5713243721996, 12.868193989884, 10.9409799554566, 
                                                                                                                                          12.2498118886381, 10.3649843170801, 13.6053288925895, 10.0093381580483, 
                                                                                                                                          13.4111885477491, 10.2109430074291, 11.9337016574586), mylower = c(4.27785665208779, 
                                                                                                                                                                                                             9.30578687083796, 6.73766221814479, 6.20777175527346, 4.02547019162226, 
                                                                                                                                                                                                             8.40474543043271, 5.05063053061992, 5.25570937453824, 5.91602875207158, 
                                                                                                                                                                                                             7.24127641498182, 11.1220283594508, 12.0765169377083, 10.4880238235312, 
                                                                                                                                                                                                             11.4706823400243, 9.86527996838732, 12.7476946734056, 9.5638481418534, 
                                                                                                                                                                                                             12.505411895047, 9.77751066855676, 11.14516485388), myupper = c(7.39412079167001, 
                                                                                                                                                                                                                                                                             16.2315194276058, 10.5605825092892, 13.036419049039, 8.04876321621977, 
                                                                                                                                                                                                                                                                             16.112730355332, 8.53001582404356, 12.2542112131, 9.58747177688514, 
                                                                                                                                                                                                                                                                             14.0992747034508, 12.0322972656053, 13.6922322688037, 11.4066255686044, 
                                                                                                                                                                                                                                                                             13.0622336650376, 10.8811950353039, 14.4984825068148, 10.4684536436971, 
                                                                                                                                                                                                                                                                             14.3573893347113, 10.6569666040485, 12.7574674651682)), .Names = c("period", 
                                                                                                                                                                                                                                                                                                                                                "mylevel", "mygroup", "myval", "mylower", "myupper"), row.names = c(NA, 
                                                                                                                                                                                                                                                                                                                                                                                                                    20L), class = "data.frame")
Created by Pretty R at inside-R.org

I needed a chart like the following:




Which I quickly obtained with the code:

library(ggplot2)
ggplot(data=dset, aes(x=period, y=myval)) +
  geom_line(lwd=1, aes(col=mygroup)) +
  geom_ribbon(aes(ymin=mylower, ymax=myupper, fill=mygroup), alpha=.5) +
  facet_grid(mylevel ~ .) +
  xlab("period") +
  ylab("value") +
  theme(legend.title = element_blank())
Created by Pretty R at inside-R.org

(Note: I have an Italian locale--the months displayed are January, April, July and January again)

Since the period column is actually the first day of a quarter, I wanted to change the labels on the x-axis to quarters, with the help of the zoo package.

However, when I submitted the code:

library(zoo)
ggplot(data=dset, aes(x=period, y=myval)) +
  geom_line(lwd=1, aes(col=mygroup)) +
  geom_ribbon(aes(ymin=mylower, ymax=myupper, fill=mygroup), alpha=.5) +
  facet_grid(mylevel ~ .) +
  xlab("period") +
  ylab("value") +
  scale_x_continuous(breaks=unique(dset$period)
                     , labels=dset$dtFrom) +
  theme(legend.title = element_blank())
Created by Pretty R at inside-R.org


I got the following error:

Error: Discrete value supplied to continuous scale

After some time of trials and errors, I discovered a way to get to my desired results, which requires the conversion of dates to numbers.

ggplot(data=dset, aes(x=as.integer(period), y=myval)) +
  geom_line(lwd=1, aes(col=mygroup)) +
  geom_ribbon(aes(ymin=mylower, ymax=myupper, fill=mygroup), alpha=.5) +
  facet_grid(mylevel ~ .) +
  xlab("period") +
  ylab("value") +
  scale_x_continuous(breaks=as.integer(unique(dset$period))
                     , labels=unique(as.yearqtr(dset$dtFrom))) +
  theme(legend.title = element_blank())
Created by Pretty R at inside-R.org

And here's my desired result.


I don't know if there is a less awkward way to get the chart above. Let me know in the comments below if I missed something obvious.

Sunday, June 16, 2013

ggmap + animation = fly from Bologna to Copenhagen

This is something that one should do with something different than R.
I was looking for a free tool for creating a video from Google Maps: my goal was to input a start location, an end location, and obtaining a video zooming out from the start location then zooming in to the end location.

Not having found anything of my liking, I turned to R and a couple of its packages: ggmap and animation.

Surprisingly I did not need too many lines to get something close to the desired results.
Here's what I have done.

# load the packages
library(animation)
library(ggmap)
 
# you need ffmpeg.exe installed to obtain an .mp4 video!
# tell R where to find it on your PC
oopts <- ani.options(ffmpeg = "~your/path/to/ffmpeg.exe")
 
# starting location
startLoc <- "Aeroporto Marconi, Bologna, Italy"
# destination
endLoc <- "Kastrup Airport, Denmark"
 
# minimum level of detail (how high do you want to fly?)
minDetail <- 4
 
# maximum level of detail 
maxDetail <- 18
 
# convert location do lat/lon coordinates
startCoord <- geocode(startLoc)
endCoord <- geocode(endLoc)
 
# save video in .mp4 format (it takes a while to finish!)
saveVideo({
  # from ground to sky...
  for(i in maxDetail:minDetail){
    map <- get_map(location=c(startCoord$lon, startCoord$lat), zoom=i, maptype="satellite")
    print(ggmap(map, extent="device"))
  }  
  # ...from sky to ground
  for(i in minDetail:maxDetail){
    map <- get_map(location=c(endCoord$lon, endCoord$lat), zoom=i, maptype="satellite")
    print(ggmap(map, extent="device"))
  }
}, video.name="BLQ_CPH.mp4", other.opts = "-b 1200k"  # set the file name and the video bitrate
          , outdir="~my/video/folder")  # set the folder where you want to save the video
Created by Pretty R at inside-R.org


And here's a sample result, taking us from Bologna Airport to Kastrup Airport, near Copenhagen.



 I'd certainly like to have a smoother animation, but for that some more rolling-up-the-sleeves is needed!

Feel free to tell me what you think and to improve the code.

Thursday, February 21, 2013

Melbourne R Users Youtube Channel

The Melbourne R Users Youtube Channel contains a lot of videos of R presentations.

It currently has stuff on, among others, reproducible research, package writing, spatial data, forecasting time series, ...

I should certainly watch several of them!

Wednesday, January 9, 2013

Cluster generation

I could find this data handy when building model for pitch classification.

clusterGeneration: random cluster generation (with specified degree of separation).

Thursday, January 3, 2013

Picture to LEGO!

If you have never been to Legoland you have missed a lot.
I was in Billund a couple of summers ago, on my way to a vacation in Copenhagen, and spent the night at the hotel inside Legoland Park.

Here is a picture of me beside Mona Lisa, Lego version, featured in the hotel hall.



And here (actually in a few paragraphs) is R code to translate a picture into a Lego painting.

A few notes before the actual code.

  1. I have heavily drawn from codes posted at is.R(), particularly from this post and this one.
  2. I limited the colors for the resulting Lego painting to the ones available on the Pick a Brick Lego Shop. I created one png file for each color: here is the zip containing them all (I could have specified RGB values directly in R.)
  3. The code only uses 1x2 and 1x1 plate pieces (or 2x2 and 1x2 if you want your Lego painting ticker.) That was to avoid over-complication. However I did not neglect the basics of Lego construction, so identical pieces are not on top of one another.
  4. It requires some time to run, so try it on small pictures first.
  5. It does not always produce satisfying results.
  6. Several improvements can be done on the code, but right now I'm happy with it. Should you make some changes, let everybody know.
# package loading
library(png)
library(ReadImages)
library(reshape)
 
# image to be rendered as a Lego painting
sourceImage <- "C:/yourFolder/monalisa.jpg"
 
# folder containing color files (provided in the zip above)
colFolder <- "C:/yourFolder/LegoColorFiles"
 
# files with the lego colors 
pngURLs <- list.files(colFolder, full.names=T)
pngCOLs <- list.files(colFolder, full.names=F)
 
# loading the color files into a list
pngList <- list()
for(ii in 1:length(pngURLs)){
  tempName <- paste("Pic", ii)
  tempPNG <- readPNG(pngURLs[ii]) 
  pngList[[tempName]] <- tempPNG 
}
 
# calculating the mean RGB values for each color file
# should actually be useless 
meanRGB <- t(sapply(pngList, function(ll){
  apply(ll[, , -4], 3, mean)[1:3]
}))
 
# reading image
readImage <- read.jpeg(sourceImage)
 
# processing image
longImage <- melt(readImage)
rgbImage <- reshape(longImage, timevar = "X3",
                    idvar = c("X1", "X2"), direction = "wide")
rgbImage[, 1:2] <- rgbImage[, 2:1]
rgbImage[,2] <- dim(readImage)[1] - rgbImage[,2] + 1
  #plot(rgbImage[, 1:2], col = rgb(rgbImage[, 3:5]), pch = 19, asp = 1)
 
# "Snap" colors to a smaller number of nearby colors
nNearbyCol <- 3
rgbImage[, c(3:5)] <- round(rgbImage[, c(3:5)] * nNearbyCol) / nNearbyCol
plot(rgbImage[, 1:2], col = rgb(rgbImage[, 3:5]), pch = 19, asp = 1)
 
 
# set image size 
pictureWd <- dim(readImage)[2]
pictureHt <- dim(readImage)[1]
canvasHt <- 600 # change this to improve LEGO resolution (it's the height of the Lego painting in millimeters)
canvasWd <- round(canvasHt / pictureHt * pictureWd)
 
# calculating the number of Lego pieces needed
piecesWd <- round(10 * canvasWd / 158)
piecesHt <- round(10 * canvasHt / 32)
 
 
# odd/even function
is.odd <- function(x) x %% 2 == 1
 
# empty plot
plot(c(0,piecesWd), c(0,piecesHt), type = "n", asp = 32/158)
 
# draw bricks and keep count of them
pieces <- data.frame(size=numeric(0), color=character(0))
for(ht in 1:piecesHt){
  cat(paste("doing line:", ht, "of", piecesHt, "\n"))
  flush.console()
  if(is.odd(ht)){
    for(wd in 1:piecesWd){
      #relative brick area
      brick <- c((wd-1)/piecesWd, (ht-1)/piecesHt, wd/piecesWd, ht/piecesHt)
      #area on the picture
      area <- ceiling(c(brick[1] * pictureWd, brick[2] * pictureHt, brick[3] * pictureWd, brick[4] * pictureHt))
      #picture portion
      picPort <- subset(rgbImage, X1 %in% ((area[1]:area[3])+1) & X2 %in% ((area[2]:area[4])+1))
      #portion color
      portCol <- c(mean(picPort$value.1),mean(picPort$value.2),mean(picPort$value.3))
      #closest Lego color
      nearestPic <- which.min(rowSums(sweep(meanRGB, 2, portCol)^2))
      #keep note of the piece
      if(length(nearestPic)) piece <- data.frame(size = 2, color = pngCOLs[nearestPic])
      pieces <- rbind(pieces, piece)
      #plot
      rect(wd-1, ht-1, wd, ht, col=rgb(meanRGB[nearestPic,1]
                          , meanRGB[nearestPic,2]
                          , meanRGB[nearestPic,3])
           , border = "grey70"
           )
     }
   } else {
     for(wd in seq(.5, piecesWd+1, 1)){
 
       #relative brick area
       brick <- c((wd-1)/piecesWd, (ht-1)/piecesHt, (wd+1)/piecesWd, (ht+1)/piecesHt)
       #area on the picture
       area <- ceiling(c(brick[1] * pictureWd, brick[2] * pictureHt, brick[3] * pictureWd, brick[4] * pictureHt))
       #picture portion
       picPort <- subset(rgbImage, X1 %in% ((area[1]:area[3])+1) & X2 %in% ((area[2]:area[4])+1))
       #portion color
       portCol <- c(mean(picPort$value.1),mean(picPort$value.2),mean(picPort$value.3))
       #closest Lego color
       nearestPic <- which.min(rowSums(sweep(meanRGB, 2, portCol)^2))
       #keep note of the piece
       piece <- data.frame(size = ifelse(wd %in% c(.5, piecesWd+.5), 1, 2), color = pngCOLs[nearestPic])
       pieces <- rbind(pieces, piece)
 
       #plot
       rect(max(wd-1,0), ht-1, min(wd,piecesWd), ht, col=rgb(meanRGB[nearestPic,1]
                                        , meanRGB[nearestPic,2]
                                        , meanRGB[nearestPic,3])
            , border = "grey70"
       )
 
     }
   }
}
 
# how many bricks of each size/color do you need?
table(pieces$color, pieces$size)
Created by Pretty R at inside-R.org

And now, let's test it with Mona Lisa herself!

Here is the original picture I used.



Here is the result for a 600mm height (click to enlarge):

Not the worst thing in the world, but I would have hoped for better. It requires over 5,000 pieces and, at the current price at  Pick a Brick, would cost over 350€.

Doubling the height (and thus the Lego resolution) produces a better result—but at a very high cost.
Again, click to enlarge.

Enjoy!

PS: Let me know if you come up with nice looking Lego paintings—especially if you actually build them!