Thursday, December 27, 2012

Mixing symbols and characters in plots

When you pass integers to the pch parameter of a plot you get the nice R symbols drawn on your plot.
Alternatively you can pass characters and have them used as the plotting symbol.
But what if you want to mix symbols and characters?

Here it is: the example uses the first two symbols (circle and triangle) and the letter "c" as plotting symbols.


# generate some data
foo <- data.frame(x = rnorm(100, 0, 1), y = rnorm(100, 0, 1), grp = letters[1:3][c(runif(100, 1, 4))])
 
# plot them with lattice
library(lattice)
xyplot(y ~ x, data = foo, groups = grp
  , pch = c(1, 2, as.integer(charToRaw("c")))
)

The trick is done by the charToRaw function.
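
To see why this works, here is a small sketch of what charToRaw returns and how a mixed set of symbols and letters can be turned into a single integer vector for pch. The helper name to_pch is made up for this note, and it assumes each letter is a single ASCII character.

```r
# charToRaw() gives the byte behind a character; as.integer() turns it into
# the character code that pch accepts alongside the usual symbol numbers
as.integer(charToRaw("c"))

# hypothetical helper: convert a mixed list of symbol numbers and
# single ASCII letters into one integer vector usable as pch
to_pch <- function(symbols) {
  vapply(symbols, function(s) {
    if (is.character(s)) as.integer(charToRaw(s)) else as.integer(s)
  }, integer(1))
}

mixed <- to_pch(list(1, 2, "c"))
mixed
```

The resulting vector can be passed as pch in the xyplot call above.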

Sunday, December 16, 2012

Namespace fixing

This is a don't-try-this-at-home type of thing. However, I needed to do it and it may happen again, so I'll drop the note here for future reference.

The fixInNamespace function lets you modify objects in a given namespace.

Saturday, December 15, 2012

Google Maps API - decoding polylines for drawing routes

What happens when ggplot2 and Google Maps have babies?
Answer: the ggmap package, by David Kahle and Hadley Wickham.

A few weeks ago, the mapdist function in the above package happened to be very useful to me at work.
The magic it does is calculating distances between locations. And you can specify if you want the driving, walking or bicycling distance.

You know what happens when you show people a cat who can sing? They say "well, but he can't play the piano!"
That's me. In fact I wondered why there was no option for distance using public transit.

I looked at Google API documentation and got all the info I needed. It's just a matter of building the right URL and you get a JSON file that is easily read by R.

The JSON contains all one would look for, including distance and time required to cover it for all legs.
But my attention was caught by lines that looked like this:

[1] "es~mGe|scAEXpCBbHHtAE|ATpFnB`DjA~B~@tDlAbBh@p@R^F\\??HBNDLJJPBLCLIDKDQ?UEQAEPEhBk@bI}CRG@BB@HDJ?JEHKFM?GbAc@`DqAfA]xBo@dAKh@KfA]xAu@pA_AzAo@\\Mj@[XQrAkAxAwAzAiAf@St@Wl@ENER?n@?d@Fr@RlAp@h@LN?\\G`@[lAiA|@cAf@o@v@o@^SlBg@v@O^AZCn@?j@F`ATdAd@n@T^@f@Al@IvASvAc@jCwAv@W|Bi@f@QhAq@n@k@X_@Zq@xAaF|@mChAqCpCgHRi@Lm@Jm@PcAJ[h@e@rGgC\\KJ@f@FXLh@\\Zl@JRAB?F?LDP@DDDHFH@JAHEFKDM@OAQEMCC?g@Jm@tAkCdAqA\\WpCyA`Ay@Z_@Zc@TMtA[hEs@n@Kh@SbBmAfAUpGZhBBp@AVEbAa@dHyCjB}@nCqAfKyDnCoAlA_@|A_@~AWnAe@pAy@^U`Be@xGmAnBWbFc@pDSfJo@tIm@fBCnBDjBTzBj@bBn@d@JbANlCGpB]vAa@dAe@lBw@xAk@fA]^Gn@IfDKpBUhJuB~AG`A@lBHr@NRJn@^p@v@T\\x@jBfCnGh@~ApAtC`ChFtDrK\\z@n@t@h@h@f@VdAXnAD~BBjCL|@LpBv@zAx@nAkCENoEoBmDiBaC}AcBuAk@k@eAiA{AqBeR_]umG_aLqC_C_D}ByBuAmBcAyDiBgCaAeDeAaB_@iB[{BUsEQcFBmDP{r@bGsDTqA@cCKmC]iCs@iBs@mAo@a|Au}@cEmBsAi@yC_AcGmAaCYmCSoCIcC?cEL}V|AsBFoCGcD_@{\\yFcBQ_x@}EkBQgB[iAYcC_AwAu@iAu@}b@{\\uCcC{BwBwB_Cwb@gg@qC}CeDwCsD_CaCeAgBm@o^wJmh@kNoCm@{Bc@iG}@sIq@mGQ_NFyoAfBqDJoBRsB`@o@PgC`AqBdA_BfAsAhAwAvAiAvAkBrCmBxDw@pBeAfDmOlg@sFnPgQtl@_@bBq@rDwA`Kw@hGu@`Hs@tHgAdPg@bDJBdBsBB@tAVLBBc@ZkDf@}EZ_DjBf@HHF@^H^P~@X|B`@xHzAxFpAtDp@`HrA~GrA~FbA"

Again, it's all explained in Google API's documentation. All those funny looking characters are the coordinates of the route after going through an encoder.
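
To get a feel for the encoding, here is a minimal sketch that decodes a single number by hand. As I understand the format, each character carries 5 bits of payload after subtracting 63, the chunks come least significant first, and the lowest bit of the assembled value marks a negative number. The function name decode_value is made up for this note, and the two test strings are the worked examples from Google's documentation.

```r
# hypothetical helper: decode ONE encoded number (not a whole polyline)
decode_value <- function(chunk) {
  b <- utf8ToInt(chunk) - 63L      # undo the ASCII offset
  bits <- bitwAnd(b, 31L)          # keep the 5 payload bits of each character
  # chunks are stored least significant first
  raw <- sum(bits * 2^(5 * (seq_along(bits) - 1)))
  # an odd value encodes a negative number (the bits were inverted)
  signed <- if (raw %% 2 == 1) -(raw %/% 2 + 1) else raw %/% 2
  signed * 1e-5                    # coordinates are stored as integer * 1e5
}

decode_value("_p~iF")    # a latitude
decode_value("`~oia@")   # a negative longitude
```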
I promptly found some decoding functions written in JavaScript online (I can't retrieve the links anymore, so apologies to the authors for not crediting them; I'll try to make up for this), and rolled up my sleeves to translate them into R.

Here's the decoding function I have come up with:


decodeLine <- function(encoded){
  # bitops provides bitAnd, bitOr, bitShiftL and bitShiftR
  require(bitops)

  vlen <- nchar(encoded)
  vindex <- 0      # position of the last character read
  varray <- NULL   # decoded coordinates, one row per point
  vlat <- 0        # running latitude (in 1e-5 degrees)
  vlng <- 0        # running longitude (in 1e-5 degrees)

  while(vindex < vlen){
    # read the latitude delta: 5 bits per character,
    # a character code of 32 or more means another chunk follows
    vb <- NULL
    vshift <- 0
    vresult <- 0
    repeat{
      if(vindex + 1 <= vlen){
        vindex <- vindex + 1
        vb <- as.integer(charToRaw(substr(encoded, vindex, vindex))) - 63
      }

      vresult <- bitOr(vresult, bitShiftL(bitAnd(vb, 31), vshift))
      vshift <- vshift + 5
      if(vb < 32) break
    }

    # the lowest bit flags a negative value
    dlat <- ifelse(
      bitAnd(vresult, 1)
      , -(bitShiftR(vresult, 1)+1)
      , bitShiftR(vresult, 1)
    )
    vlat <- vlat + dlat

    # read the longitude delta in the same way
    vshift <- 0
    vresult <- 0
    repeat{
      if(vindex + 1 <= vlen) {
        vindex <- vindex+1
        vb <- as.integer(charToRaw(substr(encoded, vindex, vindex))) - 63
      }

      vresult <- bitOr(vresult, bitShiftL(bitAnd(vb, 31), vshift))
      vshift <- vshift + 5
      if(vb < 32) break
    }

    dlng <- ifelse(
      bitAnd(vresult, 1)
      , -(bitShiftR(vresult, 1)+1)
      , bitShiftR(vresult, 1)
    )
    vlng <- vlng + dlng

    # deltas are cumulative; convert back to decimal degrees
    varray <- rbind(varray, c(vlat * 1e-5, vlng * 1e-5))
  }
  coords <- data.frame(varray)
  names(coords) <- c("lat", "lon")
  coords
}
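
For a quick sanity check without the bitops dependency, here is an alternative sketch of the same idea in base R only. It is not the function above: decode_polyline is a made-up name, it replaces the bit shifts with plain arithmetic, and it assumes a well-formed input string. The test string is the example polyline from Google's documentation, which should give three points.

```r
# hypothetical base-R variant of the decoder (arithmetic instead of bit shifts)
decode_polyline <- function(encoded, precision = 1e-5) {
  codes <- utf8ToInt(encoded) - 63L    # one 6-bit code per character
  deltas <- numeric(0)                 # signed deltas: lat, lon, lat, lon, ...
  result <- 0
  shift <- 0
  for (b in codes) {
    # add the 5 payload bits at the current position
    result <- result + bitwAnd(b, 31L) * 2^shift
    shift <- shift + 5
    if (b < 32L) {                     # no continuation bit: the number is complete
      half <- result %/% 2
      deltas <- c(deltas, if (result %% 2 == 1) -(half + 1) else half)
      result <- 0
      shift <- 0
    }
  }
  is_lat <- seq_along(deltas) %% 2 == 1
  # each point is stored as an offset from the previous one
  data.frame(lat = cumsum(deltas[is_lat]) * precision,
             lon = cumsum(deltas[!is_lat]) * precision)
}

pts <- decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@")
pts
```

Growing the vector with c() is fine for short routes; for very long polylines, preallocating would be kinder to memory.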




I tested the decodeLine function on my home-to-work route.

# set the origin, destination and travel mode
# if travel mode is "transit" you need to specify a departure (or arrival) time
origin <- "Sasso Marconi, Italy"
destination <- "Bologna, Italy"
travelMode <- "transit"
departureTime <- Sys.time() #I want to leave now!
 
# build the URL
baseUrl <- "http://maps.googleapis.com/maps/api/directions/json?"
origin <- gsub(" ", "+", origin)
destination <- gsub(" ", "+", destination)
finalUrl <- paste(baseUrl
                  , "origin=", origin
                  , "&destination=", destination
                  , "&sensor=false"
                  , "&mode=", travelMode
                  , "&departure_time=", as.integer(departureTime)
                  , sep = "")
 
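# get the JSON returned by Google and convert it to an R list
# (fromJSON() is assumed to come from the rjson package;
# RJSONIO offers a function with the same name)
library(rjson)
url_string <- URLencode(finalUrl)
trip <- fromJSON(paste(readLines(url_string), collapse = ""))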
 
# get the encoded coordinates for the full trip
tripPathEncoded <- trip$routes[[1]]$overview_polyline$points
tripPathEncoded
 
[1] "es~mGe|scAEXpCBbHHtAE|ATpFnB`DjA~B~@tDlAbBh@p@R^F\\??HBNDLJJPBLCLIDKDQ?UEQAEPEhBk@bI}CRG@BB@HDJ?JEHKFM?GbAc@`DqAfA]xBo@dAKh@KfA]xAu@pA_AzAo@\\Mj@[XQrAkAxAwAzAiAf@St@Wl@ENER?n@?d@Fr@RlAp@h@LN?\\G`@[lAiA|@cAf@o@v@o@^SlBg@v@O^AZCn@?j@F`ATdAd@n@T^@f@Al@IvASvAc@jCwAv@W|Bi@f@QhAq@n@k@X_@Zq@xAaF|@mChAqCpCgHRi@Lm@Jm@PcAJ[h@e@rGgC\\KJ@f@FXLh@\\Zl@JRAB?F?LDP@DDDHFH@JAHEFKDM@OAQEMCC?g@Jm@tAkCdAqA\\WpCyA`Ay@Z_@Zc@TMtA[hEs@n@Kh@SbBmAfAUpGZhBBp@AVEbAa@dHyCjB}@nCqAfKyDnCoAlA_@|A_@~AWnAe@pAy@^U`Be@xGmAnBWbFc@pDSfJo@tIm@fBCnBDjBTzBj@bBn@d@JbANlCGpB]vAa@dAe@lBw@xAk@fA]^Gn@IfDKpBUhJuB~AG`A@lBHr@NRJn@^p@v@T\\x@jBfCnGh@~ApAtC`ChFtDrK\\z@n@t@h@h@f@VdAXnAD~BBjCL|@LpBv@zAx@nAkCENoEoBmDiBaC}AcBuAk@k@eAiA{AqBeR_]umG_aLqC_C_D}ByBuAmBcAyDiBgCaAeDeAaB_@iB[{BUsEQcFBmDP{r@bGsDTqA@cCKmC]iCs@iBs@mAo@a|Au}@cEmBsAi@yC_AcGmAaCYmCSoCIcC?cEL}V|AsBFoCGcD_@{\\yFcBQ_x@}EkBQgB[iAYcC_AwAu@iAu@}b@{\\uCcC{BwBwB_Cwb@gg@qC}CeDwCsD_CaCeAgBm@o^wJmh@kNoCm@{Bc@iG}@sIq@mGQ_NFyoAfBqDJoBRsB`@o@PgC`AqBdA_BfAsAhAwAvAiAvAkBrCmBxDw@pBeAfDmOlg@sFnPgQtl@_@bBq@rDwA`Kw@hGu@`Hs@tHgAdPg@bDJBdBsBB@tAVLBBc@ZkDf@}EZ_DjBf@HHF@^H^P~@X|B`@xHzAxFpAtDp@`HrA~GrA~FbA"
 
 
# decode the encoded coordinates
tripPathCoords <- decodeLine(tripPathEncoded)
head(tripPathCoords)
 
       lat      lon
1 44.39875 11.24819
2 44.39878 11.24806
3 44.39805 11.24804
4 44.39659 11.24799
5 44.39616 11.24802
6 44.39569 11.24791
 
# draw a map
library(ggmap)
map <- get_map(location = "Bologna, Italy")
ggmap(map) +
  geom_path(aes(lon, lat), data=tripPathCoords, lwd = 2)



Here it is! I actually take a shorter path when I commute, but the route shown here is not the fault of the decodeLine function: for some reason the train line I use is not available on Google Maps (even though it is run by the same company that runs the displayed one!)

While writing this post, I discovered someone else had already invented this wheel, so you can use Diego Valle's code if you prefer.

Friday, December 14, 2012

Fuzzy clustering

On the 13th day of the R-advent (by is.R()), a post on fuzzy clustering.
I'm sure one day I'll need this (maybe for pitch classification), so I leave a trace of it here.

Fuzzy clustering with fanny()
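
As a reminder to my future self, here is a minimal sketch of what a fanny() call looks like. It is not the code from the linked post: the data are made up, and it assumes the cluster package is installed (it usually ships with R).

```r
library(cluster)  # provides fanny()

set.seed(42)
# two well-separated 2-D blobs of 50 points each (made-up data)
pts <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.5), ncol = 2)
)

fit <- fanny(pts, k = 2)

# each row holds the degrees of membership of one point in the two clusters
head(round(fit$membership, 3))

# the "hard" assignment takes the cluster with the highest membership
table(fit$clustering)
```

Unlike k-means, every point gets a membership value for every cluster, so borderline cases show up as rows with no clear winner.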

Thursday, December 13, 2012

Roger Peng's Coursera videos

While I have taken a couple of Coursera courses, I did not enroll in the Computing for Data Analysis one.

Well, that course is over. It will likely be offered again on Coursera, but the lessons are one click away on YouTube, at Roger Peng's page.

Hat tip: David Smith.

By the way, Coursera ROCKS!

Wednesday, December 12, 2012

Welcome to my sandbox

I have a simple goal with this blog: keeping my R stuff in a single place.
Thus I'm basically talking to my future self. But if the commutative property holds for the Schwab/Claerbout sentence below the blog logo, I'm also talking to anyone passing by, so feel free to say what you think in the comments.

Speaking of the logo, it just says what this blog is: a space for R notes.

Bottom line. The only promise I make is to myself: anytime I find some R code that solves a problem I'm facing (or promises to be useful in the future), I'm putting it here instead of saving a .R file I won't be able to retrieve when I need it.

And maybe I can make some friends along the way.