Sunday, May 18, 2014

T-Shirts ... designed with R!

On Friday, I saw David Smith's post on a competition to design this year's useR! conference t-shirt. The goal is to create a design generated using an R script, which will be featured on the back of the shirt.

Having a bit of time this weekend, I decided to try plotting the R logo, using base graphics, represented by a scatter of points – one for each package published on CRAN. Having recently posted on a very similar idea for visualizing twitter followers, I realized I could take advantage of my past code. With a bit of tweaking, I came up with the image above.

The code I used is below – you should feel free to tweak / improve / experiment with it! Before running, you'll need to install the EBImage package available on bioconductor. Roughly, the script works by first downloading a copy of the R logo (this step might make the entry illegal for the purposes of the contest..), as well as the current number of R packages. Next, there are a few functions to simplify the colors presented in the image – this part probably isn't necessary, but I think it makes the final result look a bit better. Finally, the image is actually generated by sampling pixels from the modified image, and replotting.

If you're interested in trying out your own ideas (which you definitely should!) you can submit entires to the contest as pull requests on Github.

# Script by Mark T Patterson
# May 17, 2014
# twitter: @M_T_Patterson
 
# General Notes:
 
# This script creates an image of the R logo 
# represented by n points, 
# where n is the current number of packages on CRAN
 
 
# note: this script requries the EBImage package
# available from bioconductor:
# http://bioconductor.wustl.edu/bioc/html/EBImage.html
 
# approximate run time: 2 mins
 
#### initialize ####
 
# clear workspace
rm(list = ls())
 
 
# load libraries
library(EBImage)
 
# coordinate the version of the program:
set.seed(2014)
 
#### gather web data: reference image and CRAN package count ####
 
# load the R logo, save the rgb values:
img = readImage("http://www.thinkr.spatialfiltering.com/images/Rlogo.png")
img.2 = img[,,1:3]
 
cran.site = "http://cran.r-project.org/web/packages/"
lns = readLines(cran.site)
ref.line = grep(lns, pattern = "CRAN package repository features")
package.count = as.numeric(strsplit(lns[ref.line],split = "\\s")[[1]][7])
 
 
#### helper functions ####
 
# functions for color simplification:
num.to.let = function(x1){
  ref.dat = data.frame(num = 10:15, let = LETTERS[1:6])
  out = as.character(x1)
  if(x1 %in% 10:15){out = as.character(ref.dat$let[which(ref.dat$num == x1)])}
  return(out)
}
 
rgb.func = function(vec){
  #note: vec is a triple of color intensities
  r1 = floor(255*vec[1])
  g1 = floor(255*vec[2])
  b1 = floor(255*vec[3])
 
  x1 = r1 %/% 16
  x2 = r1 %% 16
  x3 = g1 %/% 16
  x4 = g1 %% 16
  x5 = b1 %/% 16
  x6 = b1 %% 16
 
  x1 = num.to.let(x1)
  x2 = num.to.let(x2)
  x3 = num.to.let(x3)
  x4 = num.to.let(x4)
  x5 = num.to.let(x5)
  x6 = num.to.let(x6)
 
  out = paste("#",x1,x2,x3,x4,x5,x6, sep = "")
  return(out)
 
}
 
 
im.func.1 = function(image, k.cols = 5, samp.val = 3000){
  # creating a dataframe:
  test.mat = matrix(image,ncol = 3)
  df = data.frame(test.mat)
  colnames(df) = c("r","g","b")
  df$y = rep(1:dim(image)[1],dim(image)[2])
  df$x = rep(1:dim(image)[2], each = dim(image)[1])
 
  samp.indx = sample(1:nrow(df),samp.val)
  work.sub = df[samp.indx,]
 
  # extracting colors:
  k2 = kmeans(work.sub[,1:3],k.cols)
 
  # adding centers back:
  fit.test = fitted(k2)
 
  work.sub$r.pred = fit.test[,1]
  work.sub$g.pred = fit.test[,2]
  work.sub$b.pred = fit.test[,3]
 
  return(work.sub)
 
}
 
add.cols = function(dat){
  apply(dat,1,rgb.func)
}
 
# general plotting function
plot.func = function(dat){
  # assumes dat has colums x, ym cols
  plot(dat$y,max(dat$x) - dat$x, col = dat$cols,
       main = "A point for each CRAN package",
       xaxt='n',
       yaxt="n",
       xlab = "useR!",
       ylab = "2014",
       cex.lab=1.5,
       cex.axis=1.5,
       cex.main=1.5,
       cex.sub=1.5)
}
 
#### simplify colors; sample n points ###
 
temp = im.func.1(img.2, samp.val = 25000, k = 12)
temp$cols = add.cols(temp[,6:8])
 
final = temp[sample(1:nrow(temp), package.count),]
 
 
#### generate plot ####
 
plot.func(final)

Created by Pretty R at inside-R.org

2 comments:

  1. A blog must be connected to the person in need. It is really important to understand the actual feel of such necessity and the essence of objective behind it. Author must give proper time to understand every topic before writing it.
    polo shirts

    ReplyDelete
  2. When we see the logos of Toyota, Nike, Target, BMW, Mercedes, and so on we in a split second remember them and realize which organization every logo speaks to. logo design service

    ReplyDelete