Decisions and R<br />
Morse Code Converter<br />
Posted by Mark T Patterson, 2015-02-08<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5_cjPvlMzkxEYDrRoruNhRXTfcoSKHc02zQElgNhZzwt3XWmCqUGBXDCE_lo_3s6F-rRfPWShlUULQQDKaeRYu0PUs5HSuwPZYgf5drcpOXrWGTBFSCK4D2txjjNO2s_KRNoqGGTJ1LoV/s1600/morse.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5_cjPvlMzkxEYDrRoruNhRXTfcoSKHc02zQElgNhZzwt3XWmCqUGBXDCE_lo_3s6F-rRfPWShlUULQQDKaeRYu0PUs5HSuwPZYgf5drcpOXrWGTBFSCK4D2txjjNO2s_KRNoqGGTJ1LoV/s640/morse.jpg" /></a></div>
A few months ago, I finally got a chance to see The Imitation Game (the new Alan Turing movie), which gave me an idea for a Sunday morning R hacking session. The movie features a bunch of scenes with bustling rooms full of workers intercepting (and documenting) encrypted radio transmissions, which are then passed along to Turing’s decryption device (the bombe). The process seemed to be:<br />
<ol style="list-style-type: decimal;">
<li>Listen to Morse code from a live wire</li>
<li>Write down the series of short and long beeps on a piece of paper</li>
<li>Hand the paper to someone who runs the code over to another tent, ultimately to be sent to the decryption device.</li>
</ol>
All this got me thinking: wouldn’t it be great if the ‘wire listening’ part of this process could be automated too? So here’s the toy problem: could I write an R function that takes a sound file (.wav) containing Morse code and returns the decoded text?<br />
To get started, I found a set of example Morse code sound files maintained by the ARRL, the National Association for Amateur Radio (<a href="http://www.arrl.org/code-practice-files">http://www.arrl.org/code-practice-files</a>).<br />
<br />
To get these into R, I used the readWave function from Uwe Ligges’ very cool tuneR package. The only bother here was needing to convert the sound files from .mp3 to .wav first, which ultimately wasn’t that bad.<br />
Figuring out how to deal with the converted audio file was actually a lot of fun – after a few hours of tinkering, I arrived at a solution (copied below) which converts audio Morse code to text. It’s still pretty fragile: I’ve only been using computer-generated example sound files, so the function won’t really work on Morse code ‘in the wild’ yet.<br />
<br />
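To make the on/off detection idea concrete, here’s a minimal, self-contained sketch on a synthetic signal rather than a real recording. The sample rate, tone frequency, unit length and on/off pattern are all made-up values for illustration; the window size and variance threshold mirror the ones used in the code further down. It measures variance in short windows, flags a window as ‘on’ when the variance is high, and then uses rle to turn the flags into run lengths:<br />

```r
# Synthetic "Morse" signal: a tone switched on and off.
# All of these values are assumptions for illustration only.
rate    <- 8000                             # samples per second
unit    <- 400                              # samples per Morse time unit
pattern <- c(1, 0, 1, 1, 1, 0, 0, 0, 1)     # dot, gap, dash, letter gap, dot
envelope <- rep(pattern, each = unit)
tone     <- sin(2 * pi * 600 * seq_along(envelope) / rate)
signal   <- envelope * tone * 10000

# Variance in a window of +/- 50 samples around every 100th sample
centers <- seq(from = 100, to = length(signal) - 50, by = 100)
win.var <- sapply(centers, function(t) var(signal[(t - 50):(t + 50)]))

# A window counts as 'on' when its variance is above the threshold
on <- win.var > 100000

# Run-length encode the on/off flags: TRUE runs are beeps, FALSE runs are pauses
runs <- rle(on)
beep.durs  <- runs$lengths[runs$values]
pause.durs <- runs$lengths[!runs$values]
print(beep.durs)
print(pause.durs)
```

Long ‘on’ runs correspond to dashes and short ones to dots; the full function below picks the cut-offs with kmeans rather than hard-coding them.<br />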
If you’d like to give it a spin, try downloading one of the example Morse code files, e.g. <a href="http://www.arrl.org/files/file/Morse/Archive/10%20WPM/140625_10WPM.mp3">http://www.arrl.org/files/file/Morse/Archive/10%20WPM/140625_10WPM.mp3</a><br />
<br />
Next, convert the .mp3 file to .wav (and make sure to change the file path for the sf.1 object to wherever the .wav file is stored on your machine), then execute my code. The decoded output starts with a bit of a strange stamp, but the rest should be pretty easy to read.<br />
<br />
If you’ve got ideas for how to improve things, definitely let me know on Twitter – I’m <span class="citation">@M_T_Patterson</span>.<br />
Here’s the code:
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#### initialize ####</span>
<span style="color: #666666; font-style: italic;"># clear workspace:</span>
<a href="http://inside-r.org/r-doc/base/rm"><span style="color: #003399; font-weight: bold;">rm</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a> = <a href="http://inside-r.org/r-doc/base/ls"><span style="color: #003399; font-weight: bold;">ls</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## loading libraries:</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/tuneR"><span style="">tuneR</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/RCurl"><span style="">RCurl</span></a><span style="color: #009900;">)</span>
morseref.url <span style=""><-</span> getURL<span style="color: #009900;">(</span><span style="color: #0000ff;">"https://raw.githubusercontent.com/MarkTPatterson/Blog/master/Morse/morseref.csv"</span><span style="color: #339933;">,</span>
ssl.verifypeer = <span style="color: #000000; font-weight: bold;">FALSE</span><span style="color: #009900;">)</span>
ref.df <span style=""><-</span> <a href="http://inside-r.org/r-doc/utils/read.csv"><span style="color: #003399; font-weight: bold;">read.csv</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/text"><span style="color: #003399; font-weight: bold;">text</span></a> = morseref.url<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># helper function:</span>
var_find = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>vec<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/t"><span style="color: #003399; font-weight: bold;">t</span></a><span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/mgcv/s"><span style="color: #003399; font-weight: bold;">s</span></a><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
var.out = <a href="http://inside-r.org/r-doc/stats/var"><span style="color: #003399; font-weight: bold;">var</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">[</span><span style="color: #009900;">(</span>t<span style="">-</span><a href="http://inside-r.org/r-doc/mgcv/s"><span style="color: #003399; font-weight: bold;">s</span></a><span style="color: #009900;">)</span><span style="">:</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/t"><span style="color: #003399; font-weight: bold;">t</span></a><span style="">+</span><a href="http://inside-r.org/r-doc/mgcv/s"><span style="color: #003399; font-weight: bold;">s</span></a><span style="color: #009900;">)</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>var.out<span style="color: #009900;">)</span><span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">## loading reference files</span>
<span style="color: #666666; font-style: italic;">## note: you'll need to change the file path for the sf.1 file</span>
<span style="color: #666666; font-style: italic;">## sound file can be downloaded here:</span>
<span style="color: #666666; font-style: italic;">## http://www.arrl.org/files/file/Morse/Archive/10%20WPM/140625_10WPM.mp3</span>
<span style="color: #666666; font-style: italic;">## note: you'll need to convert the file to .wav for the function to work.</span>
sf.1 = readWave<span style="color: #009900;">(</span><span style="color: #0000ff;">"C:/Users/Mark/Desktop/RInvest/Morse Code/Sound Files/140625_10WPM.wav"</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># defining the morse to text function:</span>
m.to.text.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>sound.file<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># read data into a dataframe</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>indx = <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>sound.file<span style="">@</span>left<span style="color: #009900;">)</span><span style="color: #339933;">,</span> vec = sound.file<span style="">@</span>left<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># points to sample:</span>
sample.points = <a href="http://inside-r.org/r-doc/base/seq"><span style="color: #003399; font-weight: bold;">seq</span></a><span style="color: #009900;">(</span>from = <span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/by"><span style="color: #003399; font-weight: bold;">by</span></a> = <span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span> to = <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>vec<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># applying the variance finder at the sampled points:</span>
tiny.df = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/var"><span style="color: #003399; font-weight: bold;">var</span></a> = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>sample.points<span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span>var_find<span style="color: #009900;">(</span>vec = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>vec<span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/base/t"><span style="color: #003399; font-weight: bold;">t</span></a> = x<span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/mgcv/s"><span style="color: #003399; font-weight: bold;">s</span></a> = <span style="color: #cc66cc;">50</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># decide which points are 'on'</span>
tiny.df<span style="">$</span>on = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span>tiny.df<span style="">$</span>var <span style="">></span> <span style="color: #cc66cc;">100000</span><span style="color: #009900;">)</span>
tiny.df<span style="">$</span>indx = <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span>tiny.df<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># create a vector of changes in on:</span>
raw.vec = <a href="http://inside-r.org/r-doc/base/diff"><span style="color: #003399; font-weight: bold;">diff</span></a><span style="color: #009900;">(</span>tiny.df<span style="">$</span>on<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># create indices for change instances -- these will be 1 and -1</span>
beep.start.vals = <a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>raw.vec <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span>
beep.stop.vals = <a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>raw.vec <span style="">==</span> <span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># converting indices to durations:</span>
beep.durs = beep.stop.vals <span style="">-</span> beep.start.vals
pause.durs = beep.start.vals<span style="color: #009900;">[</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span> <span style="">-</span> beep.stop.vals<span style="color: #009900;">[</span><span style="">-</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>beep.stop.vals<span style="color: #009900;">)</span><span style="color: #009900;">]</span>
<span style="color: #666666; font-style: italic;">## note: for some files, there seem to be a few </span>
<span style="color: #666666; font-style: italic;">## beep durs that are only 1; for now, hard coding these out:</span>
beep.durs = beep.durs<span style="color: #009900;">[</span>beep.durs<span style="">></span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
pause.durs = pause.durs<span style="color: #009900;">[</span>pause.durs<span style="">></span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
<span style="color: #666666; font-style: italic;">## recoding beep durs </span>
<span style="color: #666666; font-style: italic;">## note: this step needs to take the beep.durs data and the pause.durs data</span>
<span style="color: #666666; font-style: italic;">## and return duration barriers. </span>
<span style="color: #666666; font-style: italic;">## first, creating pause barriers:</span>
raw.tab = <a href="http://inside-r.org/r-doc/base/table"><span style="color: #003399; font-weight: bold;">table</span></a><span style="color: #009900;">(</span>pause.durs<span style="color: #009900;">)</span>
pause.centers.raw = <a href="http://inside-r.org/r-doc/stats/kmeans"><span style="color: #003399; font-weight: bold;">kmeans</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>raw.tab<span style="color: #009900;">[</span>raw.tab <span style="">></span> <span style="color: #cc66cc;">5</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span><span style="">$</span>centers<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
pause.centers = pause.centers.raw<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/order"><span style="color: #003399; font-weight: bold;">order</span></a><span style="color: #009900;">(</span>pause.centers.raw<span style="color: #339933;">,</span>decreasing = F<span style="color: #009900;">)</span><span style="color: #009900;">]</span>
pause.levels = <a href="http://inside-r.org/r-doc/base/as.vector"><span style="color: #003399; font-weight: bold;">as.vector</span></a><span style="color: #009900;">(</span>pause.centers<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># determining separator values:</span>
pause.sep.1 = <a href="http://inside-r.org/r-doc/base/mean"><span style="color: #003399; font-weight: bold;">mean</span></a><span style="color: #009900;">(</span>pause.levels<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
pause.sep.2 = <a href="http://inside-r.org/r-doc/base/mean"><span style="color: #003399; font-weight: bold;">mean</span></a><span style="color: #009900;">(</span>pause.levels<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## similar exercise for beep.durs:</span>
raw.tab = <a href="http://inside-r.org/r-doc/base/table"><span style="color: #003399; font-weight: bold;">table</span></a><span style="color: #009900;">(</span>beep.durs<span style="color: #009900;">)</span>
beep.centers.raw = <a href="http://inside-r.org/r-doc/stats/kmeans"><span style="color: #003399; font-weight: bold;">kmeans</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>raw.tab<span style="color: #009900;">[</span>raw.tab <span style="">></span> <span style="color: #cc66cc;">5</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span><span style="">$</span>centers<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
beep.centers = beep.centers.raw<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/order"><span style="color: #003399; font-weight: bold;">order</span></a><span style="color: #009900;">(</span>beep.centers.raw<span style="color: #339933;">,</span>decreasing = F<span style="color: #009900;">)</span><span style="color: #009900;">]</span>
beep.levels = <a href="http://inside-r.org/r-doc/base/as.vector"><span style="color: #003399; font-weight: bold;">as.vector</span></a><span style="color: #009900;">(</span>beep.centers<span style="color: #009900;">)</span>
beep.sep = <a href="http://inside-r.org/r-doc/base/mean"><span style="color: #003399; font-weight: bold;">mean</span></a><span style="color: #009900;">(</span>beep.levels<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## creating the letter and word end vectors:</span>
letter.ends = <a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>pause.durs <span style="">></span> pause.sep.1<span style="color: #009900;">)</span>
word.ends = <a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span>pause.durs<span style="color: #009900;">[</span>pause.durs <span style="">></span> pause.sep.1<span style="color: #009900;">]</span> <span style="">></span> pause.sep.2<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># recoding beep durations to long and short:</span>
beep.durs.let = beep.durs
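<span style="color: #666666; font-style: italic;"># compare against the numeric beep.durs: after the first assignment</span>
<span style="color: #666666; font-style: italic;"># beep.durs.let is a character vector, so comparing it would be string-based</span>
beep.durs.let<span style="color: #009900;">[</span>beep.durs <span style="">></span> beep.sep<span style="color: #009900;">]</span> = <span style="color: #0000ff;">"l"</span>
beep.durs.let<span style="color: #009900;">[</span>beep.durs <span style=""><=</span> beep.sep<span style="color: #009900;">]</span> = <span style="color: #0000ff;">"s"</span>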
<span style="color: #666666; font-style: italic;">## grouping the beep duration letters (l's and s's) into letters</span>
<span style="color: #666666; font-style: italic;">## based on the letter ends vector</span>
empty.list = <a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span>
start.val = <span style="color: #cc66cc;">1</span>
<span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>letter.ends<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
cur.points = beep.durs.let<span style="color: #009900;">[</span>start.val<span style="">:</span>letter.ends<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="color: #009900;">]</span>
empty.list<span style="color: #009900;">[</span><span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span>cur.points<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a> = <span style="color: #0000ff;">""</span><span style="color: #009900;">)</span>
start.val = letter.ends<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span> <span style="">+</span> <span style="color: #cc66cc;">1</span>
<span style="color: #009900;">}</span>
letter.vec = <a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span>empty.list<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span>ref.df<span style="">$</span>letter<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>ref.df<span style="">$</span>code <span style="">==</span> x<span style="color: #009900;">)</span><span style="color: #009900;">]</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## grouping letters into words based on word.ends vec:</span>
start.val = <span style="color: #cc66cc;">1</span>
empty.list = <a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>word.ends<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
cur.points = letter.vec<span style="color: #009900;">[</span>start.val<span style="">:</span>word.ends<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="color: #009900;">]</span>
empty.list<span style="color: #009900;">[</span><span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span>cur.points<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a> = <span style="color: #0000ff;">""</span><span style="color: #009900;">)</span>
start.val = word.ends<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span> <span style="">+</span> <span style="color: #cc66cc;">1</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">## saving as a new vector, with spacing:</span>
out = <a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span>empty.list<span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a> = <span style="color: #0000ff;">" "</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;"># examples:</span>
m.to.text.func<span style="color: #009900;">(</span>sf.1<span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>
Quarterback Completion Heatmap Using dplyr<br />
Posted by Mark T Patterson, 2014-10-26<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxQ_3yO7yigi9ClRB_XaTJ8ZRTJb0hmKhaVc9rEI3hQBJwvokZG_lMjsuBd2nh_Ohuqc-w57mfrhqwkHn7wP3djcXlK19EZU4gRKqzryaTTMxgYvF4dxZac10WmOB1mAcD9g15vw34SVCx/s1600/QBComp.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxQ_3yO7yigi9ClRB_XaTJ8ZRTJb0hmKhaVc9rEI3hQBJwvokZG_lMjsuBd2nh_Ohuqc-w57mfrhqwkHn7wP3djcXlK19EZU4gRKqzryaTTMxgYvF4dxZac10WmOB1mAcD9g15vw34SVCx/s640/QBComp.png" /></a></div>
<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta http-equiv="x-ua-compatible" content="IE=9" >
<title></title>
</head>
<body>
<p>Several months ago, I found Bryan Povlinkski's <a href="https://www.dropbox.com/sh/xb78cdculs39rdr/HgU1Vip9dV">(really nicely cleaned) dataset</a> with 2013 NFL play-by-play information, based on data released by Brian Burke at <a href="http://www.advancedfootballanalytics.com/">Advanced Football Analytics</a>.</p>
<p>I decided to browse QB completion rates based on Pass Location (Left, Middle, Right), Pass Distance (Short or Deep), and Down. I ended up focusing on the 5 quarterbacks with the most passing attempts.</p>
<p>The plot above (based on code below) shows a heatmap based on completion rate. Darker colors correspond to a better completion percentage.</p>
<p>Because we've only got data from one year, even looking at the really high-volume passers means that the data are pretty sparse for some combinations of these variables. It's a little rough, but in these cases, I decided not to plot anything. This plot could definitely be improved by plotting gray areas instead of white.</p>
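<p>One way to handle those sparse cells is to compute the number of attempts alongside the completion rate and blank out any cell with too few attempts. Here's a minimal base R sketch on made-up data (the column names, the values and the <code>min.attempts</code> cut-off are all assumptions for illustration, not the columns of the real dataset or the code below):</p>

```r
# Toy play-by-play data; column names and values are made up for illustration
plays <- data.frame(
  passer   = c("A", "A", "A", "A", "A", "B", "B", "B"),
  distance = c("Short", "Short", "Short", "Deep", "Deep", "Short", "Short", "Deep"),
  complete = c(1, 1, 0, 0, 1, 1, 0, 0)
)

# Attempts and completion rate for each passer/distance cell
cells <- aggregate(complete ~ passer + distance, data = plays,
                   FUN = function(x) c(attempts = length(x), rate = mean(x)))

# aggregate() returns the two summaries as a matrix column; flatten it
cells <- do.call(data.frame, cells)
names(cells) <- c("passer", "distance", "attempts", "rate")

# Blank out cells with too few attempts (hypothetical cut-off)
min.attempts <- 2
cells$rate[cells$attempts < min.attempts] <- NA
print(cells)
```

<p>Cells with an NA rate could then be drawn in gray rather than left white, for example through the <code>na.value</code> argument of ggplot2's <code>scale_fill_gradient</code>.</p>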
<p>There are a few patterns here – first, it's interesting to look at each player's success with Short compared to Deep passes. Every player, as we would expect, has more success with Short than with Deep passes, but this difference seems especially pronounced for Drew Brees (who seems to have more success with Short passes than the other players). Brees seems to have pretty uniform completion rates across the three pass locations at short distance too – most other players have slightly better completion rates to the outside, especially at short distance.</p>
<p>As we would expect, we can also see a fairly pronounced difference in completion rates for deep throws on 3rd down vs. 1st and 2nd down. The sample size is small, so the estimates aren't very precise, but the pattern does show up – it's probably best exemplified by Tom Brady and Peyton Manning's data.</p>
<p>As a next step, it would be interesting to make the same plot with pass attempts rather than completion rates.</p>
</body>
</html>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>dplyr<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># note: change path to the dataset</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/utils/read.csv"><span style="color: #003399; font-weight: bold;">read.csv</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"C:/Users/Mark/Desktop/RInvest/nflpbp/2013 NFL Play-by-Play Data.csv"</span><span style="color: #339933;">,</span>
stringsAsFactors = FALSE<span style="color: #009900;">)</span>
passers = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> %<span style="">></span>% <a href="http://inside-r.org/r-doc/stats/filter"><span style="color: #003399; font-weight: bold;">filter</span></a><span style="color: #009900;">(</span>Play.Type <span style="">==</span> <span style="color: #0000ff;">"Pass"</span><span style="color: #009900;">)</span> %<span style="">></span>% group_by<span style="color: #009900;">(</span>Passer<span style="color: #009900;">)</span> %<span style="">></span>% summarize<span style="color: #009900;">(</span>n.obs = <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>Play.Type<span style="color: #009900;">)</span><span style="color: #009900;">)</span> %<span style="">></span>% arrange<span style="color: #009900;">(</span>desc<span style="color: #009900;">(</span>n.obs<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
top.passers = <a href="http://inside-r.org/r-doc/utils/head"><span style="color: #003399; font-weight: bold;">head</span></a><span style="color: #009900;">(</span>passers<span style="">$</span>Passer<span style="color: #339933;">,</span><span style="color: #cc66cc;">5</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> %<span style="">></span>% <a href="http://inside-r.org/r-doc/stats/filter"><span style="color: #003399; font-weight: bold;">filter</span></a><span style="color: #009900;">(</span>Play.Type <span style="">==</span> <span style="color: #0000ff;">"Pass"</span><span style="color: #339933;">,</span>
Passer <span style="">%in%</span> top.passers<span style="color: #009900;">)</span> %<span style="">></span>%
mutate<span style="color: #009900;">(</span>Pass.Distance = <a href="http://inside-r.org/r-doc/base/factor"><span style="color: #003399; font-weight: bold;">factor</span></a><span style="color: #009900;">(</span>Pass.Distance<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/levels"><span style="color: #003399; font-weight: bold;">levels</span></a> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"Short"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"Deep"</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> %<span style="">></span>%
group_by<span style="color: #009900;">(</span>Down<span style="color: #339933;">,</span>Passer<span style="color: #339933;">,</span>Pass.Location<span style="color: #339933;">,</span> Pass.Distance<span style="color: #009900;">)</span> %<span style="">></span>% summarize<span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/SHARE"><span style="">share</span></a> = <span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>Pass.Result <span style="">==</span> <span style="color: #0000ff;">"Complete"</span><span style="color: #009900;">)</span> <span style="">/</span> <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>Pass.Result<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
n.obs = <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>Pass.Result<span style="color: #009900;">)</span><span style="color: #009900;">)</span> %<span style="">></span>%
<a href="http://inside-r.org/r-doc/stats/filter"><span style="color: #003399; font-weight: bold;">filter</span></a><span style="color: #009900;">(</span>n.obs <span style="">></span> <span style="color: #cc66cc;">5</span><span style="color: #009900;">)</span> %<span style="">></span>%
<a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>.<span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>Pass.Location<span style="color: #339933;">,</span> Pass.Distance<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_tile<span style="color: #009900;">(</span>aes<span style="color: #009900;">(</span>fill = <a href="http://inside-r.org/packages/cran/SHARE"><span style="">share</span></a><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
colour = <span style="color: #0000ff;">"white"</span><span style="color: #009900;">)</span> <span style="">+</span>
facet_wrap<span style="color: #009900;">(</span>Passer <span style="">~</span> Down<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/ncol"><span style="color: #003399; font-weight: bold;">ncol</span></a> = <span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="">+</span>
scale_fill_gradient<span style="color: #009900;">(</span>low = <span style="color: #0000ff;">"white"</span><span style="color: #339933;">,</span> high = <span style="color: #0000ff;">"steelblue"</span><span style="color: #339933;">,</span> limits = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> theme_bw<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span>
ggtitle<span style="color: #009900;">(</span><span style="color: #0000ff;">"NFL QB completion by Pass Distance, Location, and Down"</span><span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com1tag:blogger.com,1999:blog-8973439534644845561.post-40872247130256030702014-08-27T06:18:00.000-07:002014-08-27T06:18:20.088-07:00The first rule of brainstorming is..<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0IgIwHbcsbC7qKXytOVWh2BMqfDxKMkIfX-vcn11iV_BgVv4QdZZtnzDWjUYYhzAila5gBbW1uTX1Mk39QdOIpFjSyGJOdKSNhHJXg-fE10Z1dPNo-pZwHDW1YIbYidFa22X4WngCOSlV/s1600/secondcity.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0IgIwHbcsbC7qKXytOVWh2BMqfDxKMkIfX-vcn11iV_BgVv4QdZZtnzDWjUYYhzAila5gBbW1uTX1Mk39QdOIpFjSyGJOdKSNhHJXg-fE10Z1dPNo-pZwHDW1YIbYidFa22X4WngCOSlV/s400/secondcity.jpg" /></a></div>
About a year ago I was at a workshop on behavioral economics and public policy, and Mike Norton, who was leading the session, laid out a 'first rule' of brainstorming that has really resonated with me ever since. Mike's first rule was taken from the rules of improv comedy -- no matter how ridiculous / impractical / nonsensical the idea a person comes up with, you respond with "Yes, and..." There's nothing that kills a brainstorming session like someone pointing out that a particular candidate solution won't work, or doesn't quite solve the initial problem. As well, it's often the really off-the-wall ideas that actually lead to solutions.
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-56884535447591806682014-08-26T09:41:00.002-07:002014-08-26T09:41:17.602-07:00Replication wiki for econ papersI found this on <a href="http://andrewgelman.com/2014/08/22/replication-wiki-economics/">Andrew Gelman's Blog</a> -- a page for replications of experimental results in economics. This seems like a great idea!
<a href="http://replication.uni-goettingen.de/wiki/index.php/Main_Page">Here</a>'s the linkMark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-90167163804919373412014-08-26T07:47:00.000-07:002014-08-26T07:47:43.441-07:00Prizes in public health -- the new kaggle?This morning I saw a <a href="http://www.newyorker.com/magazine/2014/08/25/ebolanomics">short piece</a> by James Surowiecki on the absence of a vaccine for the Ebola virus. Surowiecki points out that the incentive structure for pharmaceutical companies rewards work on drugs that are likely to be taken i) by a large number of people, ii) by Westerners, and iii) over a long period of time. This means, Surowiecki argues, that under the present incentive scheme we are unlikely to develop cures for things like Ebola, which is essentially confined to the developing world and, up until recently, did not affect many people.<br />
<br />
As a possible solution, Surowiecki offers 'prizes' -- sponsored by governments -- which compensate firms in exchange for the right to manufacture the resulting pharmaceuticals. Put differently, the government can intervene on the market value of development in these areas by paying 'more than the going rate.'<br />
<br />
I think Surowiecki is surely right about the potential benefits of this sort of approach -- paying companies more will definitely give them an incentive to change their research priorities. Thinking about all of this, though, reminded me of the Kaggle-style data competitions that are growing increasingly popular. Here, a problem is posted (sometimes with a large prize for the best solution), and data scientists of all stripes work on solutions. The competition that really vaulted these into the mainstream was the <a href="http://en.wikipedia.org/wiki/Netflix_Prize">Netflix Prize</a> -- offering $1 million for the best improvement in predictive film ratings. I remember going to a talk a few years ago by the 'BellKor' team that ended up winning the competition, where the winners remarked that while the prize seemed big, the tremendous time the team put into the challenge meant that they were actually working at a really low wage rate... and they were the ones who actually won!<br />
<br />
I bring all this up because it seems that, at least in the case of data competitions, it's not really the money that's driving entry (or work) on these problems. The competition, collaboration, and social value of doing well seem like much more important causes. Now, data science is quite different from pharmaceutical research -- the startup costs are MUCH lower, and it can be a much more individual activity -- but after reading Surowiecki's article, I'm especially curious to see whether some of the non-monetary incentives that we're seeing at work in the data science world might emerge if public health adopts a similar incentive structure around sponsored prizes. Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-11407437868563693622014-06-20T14:37:00.001-07:002014-06-20T14:37:59.867-07:00dplyr tutorial with baseball data<iframe allowfullscreen="" frameborder="0" height="270" src="//www.youtube.com/embed/FI_BUJPtCG8" width="480"></iframe>Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com1tag:blogger.com,1999:blog-8973439534644845561.post-10889494289474328152014-05-18T06:16:00.000-07:002014-05-18T06:16:11.672-07:00T-Shirts ... designed with R!<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRWMBKZGUhlAzdqGYLcY3Tj-_haoDGsjB4HoR26cH7W8sHnnZ9IDu9FO7KUlLLANX8Mo5mOUf1GdKF0gml8RVwthGeKnOxNLO5quWSthnWq6IL1g2IIlxB0B6nSuPCpreLdjbhRzMIcnCC/s1600/MPattersonShirtImage.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRWMBKZGUhlAzdqGYLcY3Tj-_haoDGsjB4HoR26cH7W8sHnnZ9IDu9FO7KUlLLANX8Mo5mOUf1GdKF0gml8RVwthGeKnOxNLO5quWSthnWq6IL1g2IIlxB0B6nSuPCpreLdjbhRzMIcnCC/s640/MPattersonShirtImage.png" /></a></div>
<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta http-equiv="x-ua-compatible" content="IE=9" >
<title></title>
</head>
<body>
<p>On Friday, I saw <a href="http://blog.revolutionanalytics.com/2014/05/design-the-t-shirt-for-user-2014.html">David Smith's post</a> on a competition to design this year's useR! conference t-shirt. The goal is to create a design generated using an R script, which will be featured on the back of the shirt. </p>
<p>Having a bit of time this weekend, I decided to try drawing the R logo in base graphics as a scatter of points, one for each package published on CRAN. Having <a href="http://decisionsandr.blogspot.com/2014/04/visualizing-twitter-followers-using.html">recently posted</a> on a very similar idea for visualizing Twitter followers, I realized I could take advantage of my past code. With a bit of tweaking, I came up with the image above.</p>
<p>The code I used is below, and you should feel free to tweak / improve / experiment with it! Before running it, you'll need to install the EBImage package, <a href="http://www.bioconductor.org/packages/release/bioc/html/EBImage.html">available on Bioconductor</a>. Roughly, the script works by first downloading a copy of the R logo (this step might make the entry illegal for the purposes of the contest...) and reading the current number of R packages from the CRAN website. Next, a few functions simplify the colors in the image by clustering sampled pixels with k-means; this part probably isn't necessary, but I think it makes the final result look a bit better. Finally, the image is generated by sampling pixels from the modified image and replotting them.</p>
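<p>The sample, simplify, and replot steps described above can be sketched in a few lines of base R. This is a simplified stand-in rather than the script itself: the image is a random array instead of the real logo (so EBImage isn't needed), the values of <code>n</code> and <code>k</code> are arbitrary, and the hex colors come from base R's <code>rgb()</code>, which rounds intensities slightly differently from the hand-written conversion in the script.</p>

```r
set.seed(2014)

# Stand-in for an image: a 20 x 20 x 3 array of RGB intensities in [0, 1]
# (hypothetical data; the script reads the real logo with EBImage)
img = array(runif(20 * 20 * 3), dim = c(20, 20, 3))

# One row per pixel, with its position and colour
pixels = data.frame(
  x = rep(seq_len(dim(img)[1]), times = dim(img)[2]),
  y = rep(seq_len(dim(img)[2]), each  = dim(img)[1]),
  r = as.vector(img[, , 1]),
  g = as.vector(img[, , 2]),
  b = as.vector(img[, , 3])
)

# Sample n pixels (n plays the role of the CRAN package count)
n = 150
samp = pixels[sample(nrow(pixels), n), ]

# Simplify to k colours: replace each pixel's colour by its cluster centre
k = 5
km = kmeans(samp[, c("r", "g", "b")], centers = k)
centres = km$centers[km$cluster, ]
samp$hex = rgb(centres[, 1], centres[, 2], centres[, 3])

# Replot the sampled pixels in base graphics
plot(samp$x, samp$y, col = samp$hex, pch = 19,
     axes = FALSE, xlab = "", ylab = "")
```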
<p>If you're interested in trying out your own ideas (which you definitely should!), you can submit entries to the contest as pull requests <a href="https://github.com/user2014/t-shirt">on GitHub</a>.</p>
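<p>One part of the script that may need tweaking is the package count, which is read by splitting a sentence from the CRAN page on whitespace and taking the seventh word. A slightly more forgiving sketch pulls out the first run of digits instead. The example sentence below is an assumption about the page's wording (and the number in it is made up), so treat it as an illustration only.</p>

```r
# Hypothetical example of the sentence the script looks for on the CRAN page
cran.line = "Currently, the CRAN package repository features 5573 available packages."

# Extract the first run of digits instead of relying on word position
package.count = as.numeric(regmatches(cran.line, regexpr("[0-9]+", cran.line)))
package.count
```

<p>With a network connection, <code>nrow(available.packages())</code> is another way to get a package count without parsing the page at all, though it may not match the number shown on the website exactly.</p>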
</body>
</html>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># Script by Mark T Patterson</span>
<span style="color: #666666; font-style: italic;"># May 17, 2014</span>
<span style="color: #666666; font-style: italic;"># twitter: @M_T_Patterson</span>
<span style="color: #666666; font-style: italic;"># General Notes:</span>
<span style="color: #666666; font-style: italic;"># This script creates an image of the R logo </span>
<span style="color: #666666; font-style: italic;"># represented by n points, </span>
<span style="color: #666666; font-style: italic;"># where n is the current number of packages on CRAN</span>
<span style="color: #666666; font-style: italic;"># note: this script requires the EBImage package</span>
<span style="color: #666666; font-style: italic;"># available from bioconductor:</span>
<span style="color: #666666; font-style: italic;"># http://bioconductor.wustl.edu/bioc/html/EBImage.html</span>
<span style="color: #666666; font-style: italic;"># approximate run time: 2 mins</span>
<span style="color: #666666; font-style: italic;">#### initialize ####</span>
<span style="color: #666666; font-style: italic;"># clear workspace</span>
<a href="http://inside-r.org/r-doc/base/rm"><span style="color: #003399; font-weight: bold;">rm</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a> = <a href="http://inside-r.org/r-doc/base/ls"><span style="color: #003399; font-weight: bold;">ls</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># load libraries</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>EBImage<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># set the random seed so the sampled image is reproducible:</span>
<a href="http://inside-r.org/r-doc/base/set.seed"><span style="color: #003399; font-weight: bold;">set.seed</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">2014</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#### gather web data: reference image and CRAN package count ####</span>
<span style="color: #666666; font-style: italic;"># load the R logo, save the rgb values:</span>
img = readImage<span style="color: #009900;">(</span><span style="color: #0000ff;">"http://www.thinkr.spatialfiltering.com/images/Rlogo.png"</span><span style="color: #009900;">)</span>
img.2 = img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span>
cran.site = <span style="color: #0000ff;">"http://cran.r-project.org/web/packages/"</span>
lns = <a href="http://inside-r.org/r-doc/base/readLines"><span style="color: #003399; font-weight: bold;">readLines</span></a><span style="color: #009900;">(</span>cran.site<span style="color: #009900;">)</span>
ref.line = <a href="http://inside-r.org/r-doc/base/grep"><span style="color: #003399; font-weight: bold;">grep</span></a><span style="color: #009900;">(</span>lns<span style="color: #339933;">,</span> pattern = <span style="color: #0000ff;">"CRAN package repository features"</span><span style="color: #009900;">)</span>
package.count = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>lns<span style="color: #009900;">[</span>ref.line<span style="color: #009900;">]</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">"<span style="color: #000099; font-weight: bold;">\\</span>s"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">7</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#### helper functions ####</span>
<span style="color: #666666; font-style: italic;"># functions for color simplification:</span>
num.to.let = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x1<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
ref.dat = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>num = <span style="color: #cc66cc;">10</span><span style="">:</span><span style="color: #cc66cc;">15</span><span style="color: #339933;">,</span> let = <span style="color: #000000; font-weight: bold;">LETTERS</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">6</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
out = <a href="http://inside-r.org/r-doc/base/as.character"><span style="color: #003399; font-weight: bold;">as.character</span></a><span style="color: #009900;">(</span>x1<span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>x1 <span style="">%in%</span> <span style="color: #cc66cc;">10</span><span style="">:</span><span style="color: #cc66cc;">15</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>out = <a href="http://inside-r.org/r-doc/base/as.character"><span style="color: #003399; font-weight: bold;">as.character</span></a><span style="color: #009900;">(</span>ref.dat<span style="">$</span>let<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>ref.dat<span style="">$</span>num <span style="">==</span> x1<span style="color: #009900;">)</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
rgb.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;">#note: vec is a triple of color intensities</span>
r1 = <a href="http://inside-r.org/r-doc/base/floor"><span style="color: #003399; font-weight: bold;">floor</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
g1 = <a href="http://inside-r.org/r-doc/base/floor"><span style="color: #003399; font-weight: bold;">floor</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
b1 = <a href="http://inside-r.org/r-doc/base/floor"><span style="color: #003399; font-weight: bold;">floor</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
x1 = r1 <span style="">%/%</span> <span style="color: #cc66cc;">16</span>
x2 = r1 %% <span style="color: #cc66cc;">16</span>
x3 = g1 <span style="">%/%</span> <span style="color: #cc66cc;">16</span>
x4 = g1 %% <span style="color: #cc66cc;">16</span>
x5 = b1 <span style="">%/%</span> <span style="color: #cc66cc;">16</span>
x6 = b1 %% <span style="color: #cc66cc;">16</span>
x1 = num.to.let<span style="color: #009900;">(</span>x1<span style="color: #009900;">)</span>
x2 = num.to.let<span style="color: #009900;">(</span>x2<span style="color: #009900;">)</span>
x3 = num.to.let<span style="color: #009900;">(</span>x3<span style="color: #009900;">)</span>
x4 = num.to.let<span style="color: #009900;">(</span>x4<span style="color: #009900;">)</span>
x5 = num.to.let<span style="color: #009900;">(</span>x5<span style="color: #009900;">)</span>
x6 = num.to.let<span style="color: #009900;">(</span>x6<span style="color: #009900;">)</span>
out = <a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"#"</span><span style="color: #339933;">,</span>x1<span style="color: #339933;">,</span>x2<span style="color: #339933;">,</span>x3<span style="color: #339933;">,</span>x4<span style="color: #339933;">,</span>x5<span style="color: #339933;">,</span>x6<span style="color: #339933;">,</span> sep = <span style="color: #0000ff;">""</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
im.func.1 = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #339933;">,</span> k.cols = <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span> samp.val = <span style="color: #cc66cc;">3000</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># creating a dataframe:</span>
test.mat = <a href="http://inside-r.org/r-doc/base/matrix"><span style="color: #003399; font-weight: bold;">matrix</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/ncol"><span style="color: #003399; font-weight: bold;">ncol</span></a> = <span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>test.mat<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/colnames"><span style="color: #003399; font-weight: bold;">colnames</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"r"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"g"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"b"</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>y = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>x = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> each = <a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
samp.indx = <a href="http://inside-r.org/r-doc/base/sample"><span style="color: #003399; font-weight: bold;">sample</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span><span style="color: #339933;">,</span>samp.val<span style="color: #009900;">)</span>
work.sub = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">[</span>samp.indx<span style="color: #339933;">,</span><span style="color: #009900;">]</span>
<span style="color: #666666; font-style: italic;"># extracting colors:</span>
k2 = <a href="http://inside-r.org/r-doc/stats/kmeans"><span style="color: #003399; font-weight: bold;">kmeans</span></a><span style="color: #009900;">(</span>work.sub<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>k.cols<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># adding centers back:</span>
fit.test = <a href="http://inside-r.org/r-doc/stats/fitted"><span style="color: #003399; font-weight: bold;">fitted</span></a><span style="color: #009900;">(</span>k2<span style="color: #009900;">)</span>
work.sub<span style="">$</span>r.pred = fit.test<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
work.sub<span style="">$</span>g.pred = fit.test<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
work.sub<span style="">$</span>b.pred = fit.test<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>work.sub<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
add.cols = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>dat<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/apply"><span style="color: #003399; font-weight: bold;">apply</span></a><span style="color: #009900;">(</span>dat<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span>rgb.func<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;"># general plotting function</span>
plot.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>dat<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># assumes dat has columns x, y, cols</span>
<a href="http://inside-r.org/r-doc/graphics/plot"><span style="color: #003399; font-weight: bold;">plot</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>y<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>x<span style="color: #009900;">)</span> <span style="">-</span> dat<span style="">$</span>x<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/col"><span style="color: #003399; font-weight: bold;">col</span></a> = dat<span style="">$</span>cols<span style="color: #339933;">,</span>
main = <span style="color: #0000ff;">"A point for each CRAN package"</span><span style="color: #339933;">,</span>
xaxt=<span style="color: #0000ff;">'n'</span><span style="color: #339933;">,</span>
yaxt=<span style="color: #0000ff;">"n"</span><span style="color: #339933;">,</span>
xlab = <span style="color: #0000ff;">"useR!"</span><span style="color: #339933;">,</span>
ylab = <span style="color: #0000ff;">"2014"</span><span style="color: #339933;">,</span>
cex.lab=<span style="color: #cc66cc;">1.5</span><span style="color: #339933;">,</span>
cex.axis=<span style="color: #cc66cc;">1.5</span><span style="color: #339933;">,</span>
cex.main=<span style="color: #cc66cc;">1.5</span><span style="color: #339933;">,</span>
cex.sub=<span style="color: #cc66cc;">1.5</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">#### simplify colors; sample n points ###</span>
temp = im.func.1<span style="color: #009900;">(</span>img.2<span style="color: #339933;">,</span> samp.val = <span style="color: #cc66cc;">25000</span><span style="color: #339933;">,</span> k = <span style="color: #cc66cc;">12</span><span style="color: #009900;">)</span>
temp<span style="">$</span>cols = add.cols<span style="color: #009900;">(</span>temp<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">6</span><span style="">:</span><span style="color: #cc66cc;">8</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
final = temp<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/sample"><span style="color: #003399; font-weight: bold;">sample</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span>temp<span style="color: #009900;">)</span><span style="color: #339933;">,</span> package.count<span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span>
<span style="color: #666666; font-style: italic;">#### generate plot ####</span>
plot.func<span style="color: #009900;">(</span>final<span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com2tag:blogger.com,1999:blog-8973439534644845561.post-43245093621028302802014-05-02T06:48:00.001-07:002014-05-02T06:48:18.070-07:00Function for rounding a group of numbers<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta http-equiv="x-ua-compatible" content="IE=9" >
<title></title>
<style type="text/css">
body, td {
font-family: sans-serif;
background-color: white;
font-size: 12px;
margin: 8px;
}
tt, code, pre {
font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
}
h1 {
font-size:2.2em;
}
h2 {
font-size:1.8em;
}
h3 {
font-size:1.4em;
}
h4 {
font-size:1.0em;
}
h5 {
font-size:0.9em;
}
h6 {
font-size:0.8em;
}
a:visited {
color: rgb(50%, 0%, 50%);
}
pre {
margin-top: 0;
max-width: 95%;
border: 1px solid #ccc;
white-space: pre-wrap;
}
pre code {
display: block; padding: 0.5em;
}
code.r, code.cpp {
background-color: #F8F8F8;
}
table, td, th {
border: none;
}
blockquote {
color:#666666;
margin:0;
padding-left: 1em;
border-left: 0.5em #EEE solid;
}
hr {
height: 0px;
border-bottom: none;
border-top-width: thin;
border-top-style: dotted;
border-top-color: #999999;
}
@media print {
* {
background: transparent !important;
color: black !important;
filter:none !important;
-ms-filter: none !important;
}
body {
font-size:12pt;
max-width:100%;
}
a, a:visited {
text-decoration: underline;
}
hr {
visibility: hidden;
page-break-before: always;
}
pre, blockquote {
padding-right: 1em;
page-break-inside: avoid;
}
tr, img {
page-break-inside: avoid;
}
img {
max-width: 100% !important;
}
@page :left {
margin: 15mm 20mm 15mm 10mm;
}
@page :right {
margin: 15mm 10mm 15mm 20mm;
}
p, h2, h3 {
orphans: 3; widows: 3;
}
h2, h3 {
page-break-after: avoid;
}
}
</style>
</head>
<body>
<p>Sometimes when I'm creating summary statistics for factor variables (usually demographics), I find I need to round percentages a bit. If I round each number individually, I occasionally (and frustratingly) change the total sum. For example, suppose I've got information on how many individuals are in each of four groups:</p>
<pre><code class="r">group.totals = c(13, 39, 16, 11)
</code></pre>
<p>and I'd like to report the distribution as a share of the total number of individuals:</p>
<pre><code class="r">(tab = prop.table(group.totals))
</code></pre>
<pre><code>## [1] 0.1646 0.4937 0.2025 0.1392
</code></pre>
<p>However, I only want to report two digits after the decimal point:</p>
<pre><code class="r">(rounded.tab = round(tab, 2))
</code></pre>
<pre><code>## [1] 0.16 0.49 0.20 0.14
</code></pre>
<p>Here, the rounding process (annoyingly) changes the sum:</p>
<pre><code class="r">sum(tab)
</code></pre>
<pre><code>## [1] 1
</code></pre>
<pre><code class="r">sum(rounded.tab)
</code></pre>
<pre><code>## [1] 0.99
</code></pre>
<p>To fix this (a bit), here's a quick function which rounds a group of numbers together:</p>
<pre><code class="r">round.group = function(vec, digits) {
r.vec = round(vec, digits)
total.resid = sum(vec) - sum(r.vec)
sq.diffs = ((r.vec + total.resid) - vec)^2
indx = which.min(sq.diffs)
r.vec.copy = r.vec
r.vec.copy[indx] = r.vec.copy[indx] + total.resid
out = r.vec.copy
return(out)
}
</code></pre>
<p>This solves some of the problems:</p>
<pre><code class="r">(group.rounded.tab = round.group(tab, 2))
</code></pre>
<pre><code>## [1] 0.17 0.49 0.20 0.14
</code></pre>
<pre><code class="r">sum(group.rounded.tab)
</code></pre>
<pre><code>## [1] 1
</code></pre>
<p>But it behaves oddly for some inputs, because the whole rounding residual is added to a single element:</p>
<pre><code class="r">bug.vec = c(0.4, 0.4, 0.4, 0.4, 9.2, 9.2)
round.group(bug.vec, 0)
</code></pre>
<pre><code>## [1] 2 0 0 0 9 9
</code></pre>
<p>Despite being a bit buggy, this function does well enough for my purposes. If you'd like to find a better version, or are generally interested, <a href="http://stackoverflow.com/questions/792460/how-to-round-floats-to-integers-while-preserving-their-sum">here</a>'s a link to a nice discussion on group rounding at Stack Overflow.</p>
<p><a href="http://stackoverflow.com/questions/792460/how-to-round-floats-to-integers-while-preserving-their-sum">http://stackoverflow.com/questions/792460/how-to-round-floats-to-integers-while-preserving-their-sum</a></p>
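<p>One common alternative is the largest-remainder approach, which hands out the rounding residual one unit at a time instead of adding it all to one element. The function below is only a minimal sketch of that idea, not a drop-in replacement: the name <code>largest.remainder.round</code> is made up for this example, and it assumes the total of the scaled values should itself be rounded to a whole number of units.</p>
<pre><code class="r"># Sketch of largest-remainder rounding (hypothetical helper, not round.group)
largest.remainder.round = function(vec, digits = 0) {
  scale = 10^digits
  scaled = vec * scale
  floors = floor(scaled)
  # how many whole units are still missing from the total
  n.extra = round(sum(scaled)) - sum(floors)
  # give one unit each to the elements with the largest fractional parts
  bump = order(scaled - floors, decreasing = TRUE)[seq_len(n.extra)]
  floors[bump] = floors[bump] + 1
  floors / scale
}

bug.vec = c(0.4, 0.4, 0.4, 0.4, 9.2, 9.2)
lr.out = largest.remainder.round(bug.vec, 0)
sum(lr.out)
</code></pre>
<p>On <code>bug.vec</code> this keeps the total at 20 while moving every element by less than one unit.</p>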
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-83421744540617652262014-04-21T07:46:00.001-07:002014-04-21T07:47:57.194-07:00Organizing data-cleaning scripts<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2UrCwVfyII4jhyDsi0KAj3NtvDLXbp5VOclUJduKA9rVlLbPiI_eCxE3cpcEb9ImP_8Cev9aKqMsbFn5IBsS_MP7EmZ9gTf5Xf21MmCQCEzCBUYKZU-m74XbgLv4t0EM6n_rmye0S8nOI/s1600/Untitled.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2UrCwVfyII4jhyDsi0KAj3NtvDLXbp5VOclUJduKA9rVlLbPiI_eCxE3cpcEb9ImP_8Cev9aKqMsbFn5IBsS_MP7EmZ9gTf5Xf21MmCQCEzCBUYKZU-m74XbgLv4t0EM6n_rmye0S8nOI/s1600/Untitled.jpg" height="640" width="510" /></a></div>
<br />
About two years ago it finally dawned on me that having a single gigantic R file for a project wasn't all that practical. Since then, I've been trying out a few systems for breaking the larger project into smaller scripts. Today, I came across <a href="http://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf">this</a> introduction to data cleaning in R, which nicely divides the project into several steps (in the figure above). The authors suggest (at a minimum) saving the data at each of these stages, which seems totally reasonable.<br />
<br />
Roughly, the stages of the data cleaning process can be broken down into:<br />
<ol style="list-style-type: decimal;">
<li>Raw data: this is the format of the original data source. It's possible that some sort of conversion is necessary before the data can even be read into R, or that once the data are loaded, the variable types or column names have problems.</li>
<li>Technically correct data: this is the result of the most basic cleaning process. At the very least, your data should be the "shape" you expect (the right number of rows and columns if you're expecting rectangular data), numbers should look like numbers rather than strings, etc. Technically correct data, despite the proper formatting, may have erroneous values; these may range from 'outlawed' values (like negative durations) to suspicious values (e.g. an individual's height entered as 9').</li>
<li>Consistent data: the erroneous values have been dealt with, and the data are ready for analysis.</li>
</ol>
<br />
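As a rough illustration of saving the data at each stage, here is a minimal sketch in base R. The toy data frame, the file names, and the use of <code>tempdir()</code> as a stand-in for a project folder are all assumptions made for this example, not something taken from the linked introduction.<br />
<pre><code class="r"># Toy raw data: everything arrives as character strings (made-up values)
raw = data.frame(id = c("1", "2", "3"),
                 duration = c("12", "-3", "8"),
                 stringsAsFactors = FALSE)
out.dir = tempdir()  # stand-in for a project data folder
saveRDS(raw, file.path(out.dir, "01_raw.rds"))

# Technically correct: numbers are stored as numbers
tech = transform(raw, id = as.integer(id), duration = as.numeric(duration))
saveRDS(tech, file.path(out.dir, "02_technically_correct.rds"))

# Consistent: 'outlawed' values (negative durations) are set to NA
consistent = tech
consistent$duration[sign(consistent$duration) == -1] = NA
saveRDS(consistent, file.path(out.dir, "03_consistent.rds"))
</code></pre>
Each later script can then start from the saved output of the previous stage with <code>readRDS()</code> rather than re-running everything from the raw file.<br />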
Clearly this division won't be sufficient for all file-organization needs, but it seems like a nice thing to keep in the back of your mind.<br />
<br />
<br />Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com1tag:blogger.com,1999:blog-8973439534644845561.post-84362216747286578272014-04-19T11:13:00.000-07:002014-04-19T12:03:20.304-07:00Play 2048... using R!<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPX0JFdYAErbP5aNmBocDIuzw1a7AfNh0XowQjcmspiuBCGdaGYxt75CUZOzzLHc7uwk4MtOKPAy1bvT1KILPjIikbbD6bC7bYWWoOXLMsVHg3_bhTCvQ0op6I4sDjJyzgx8wh-FiMdFFN/s1600/2048_Screenshot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPX0JFdYAErbP5aNmBocDIuzw1a7AfNh0XowQjcmspiuBCGdaGYxt75CUZOzzLHc7uwk4MtOKPAy1bvT1KILPjIikbbD6bC7bYWWoOXLMsVHg3_bhTCvQ0op6I4sDjJyzgx8wh-FiMdFFN/s640/2048_Screenshot.png" /></a></div>
<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta http-equiv="x-ua-compatible" content="IE=9" >
<title></title>
</head>
<body>
<p>I've lost about 100 hours over the past week to the black hole of 2048. In an attempt to extricate myself, I thought I'd try writing an R script to play for me. While there are already <a href="http://stackoverflow.com/questions/22342854/what-is-the-optimal-algorithm-for-the-game-2048">a ton of great algorithms for the game</a>, I haven't seen any implemented in R. </p>
<p>There's a recent package, <a href="http://cran.fhcrc.org/web/packages/RSelenium/index.html">RSelenium</a>, that allows you to drive your browser through R, so we can jump right into playing the game. As an aside, it's definitely worth browsing through the really nice <a href="http://cran.fhcrc.org/web/packages/RSelenium/vignettes/RSelenium-basics.html">vignette</a> to get a sense of just how cool this package is.</p>
<p>The code below is divided into a few sections. Section I loads RSelenium and navigates to the 2048 site. Make sure that in the <code>remoteDriver</code> call that creates <code>remDr</code>, you specify the right sort of browser – I'm using Firefox, but it's pretty easy to adjust this (see the vignette).</p>
<p>Sections II - VI break down different steps in the development of an algorithm to play the game. The rough steps are (SII) writing a function to predict the next board states depending on which move is selected, (SIII) writing a few functions to report on features of these boards (for example, the sum of the scores on the tiles in a particular column), (SIV) writing functions to score the various potential future positions based on the features of the boards, and (SV, SVI) putting these together to actually play.</p>
<p>The algorithm I've written is mediocre at best – its mean score is around 7000, but it's won once or twice. It's a pretty thrown-together attempt, but I think it's fun to watch. Hopefully this'll be the antidote I've needed… code is below.</p>
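<p>To make the board-prediction step (SII) a bit more concrete, here is a minimal sketch of how a single row might be collapsed when the board is moved left. It is not the code used below: the name <code>slide.left</code> and the convention that 0 marks an empty cell are assumptions made for this example.</p>
<pre><code class="r"># Sketch: collapse one row of a 2048 board to the left (0 = empty cell)
slide.left = function(row) {
  tiles = row[row != 0]   # drop the gaps
  runs = rle(tiles)       # runs of equal neighbouring tiles
  # a run of n equal tiles gives n %/% 2 merged tiles, plus one leftover if n is odd
  merged = unlist(mapply(function(v, n) c(rep(2 * v, n %/% 2), rep(v, n %% 2)),
                         runs$values, runs$lengths, SIMPLIFY = FALSE))
  # pad with empty cells back to the original row length
  c(merged, rep(0, length(row) - length(merged)))
}

slide.left(c(2, 2, 2, 0))
slide.left(c(4, 4, 8, 0))
</code></pre>
<p>The other three moves can be handled with the same helper by reversing or transposing the board before and after the call.</p>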
</body>
</html>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#### SECTION I: fire up the game ####</span>
<a href="http://inside-r.org/r-doc/base/require"><span style="color: #003399; font-weight: bold;">require</span></a><span style="color: #009900;">(</span>RSelenium<span style="color: #009900;">)</span>
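<span style="color: #666666; font-style: italic;"># note: checkForServer() and startServer() are defunct in newer RSelenium releases, where rsDriver() is the usual replacement</span>
checkForServer<span style="color: #009900;">(</span><span style="color: #009900;">)</span>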
startServer<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
remDr <span style=""><-</span> remoteDriver<span style="color: #009900;">(</span>remoteServerAddr = <span style="color: #0000ff;">"localhost"</span>
<span style="color: #339933;">,</span> port = <span style="color: #cc66cc;">4444</span>
<span style="color: #339933;">,</span> browserName = <span style="color: #0000ff;">"firefox"</span>
<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/Sys.sleep"><span style="color: #003399; font-weight: bold;">Sys.sleep</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span>
remDr<span style="">$</span>open<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># navigate to page</span>
remDr<span style="">$</span>navigate<span style="color: #009900;">(</span><span style="color: #0000ff;">"http://gabrielecirulli.github.io/2048/"</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#### SECTION II: functions for predicting board states ####</span>
<span style="color: #666666; font-style: italic;"># functions to determine current board state:</span>
pos.strip = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>string<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
first.cut = <a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>string<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">" tile-position-"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span>
val.sub = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>first.cut<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">"-"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
pos.sub = first.cut<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
second.cut = <a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>pos.sub<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">" "</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
third.cut = <a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>second.cut<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">"-"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span>
conv.to.num = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span>third.cut<span style="color: #009900;">)</span>
rev.order = <a href="http://inside-r.org/r-doc/base/rev"><span style="color: #003399; font-weight: bold;">rev</span></a><span style="color: #009900;">(</span>conv.to.num<span style="color: #009900;">)</span>
out = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span>rev.order<span style="color: #339933;">,</span>val.sub<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
conv.to.frame = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>htmlParsedPage<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
n1 = xpathSApply<span style="color: #009900;">(</span>htmlParsedPage<span style="color: #339933;">,</span><span style="color: #0000ff;">"//div[@class='tile-container']"</span><span style="color: #339933;">,</span>xmlValue<span style="color: #009900;">)</span>
n2 = xpathSApply<span style="color: #009900;">(</span>htmlParsedPage<span style="color: #339933;">,</span><span style="color: #0000ff;">"//div[@class='tile-container']//@class"</span><span style="color: #009900;">)</span>
n2 = n2<span style="color: #009900;">[</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
curr.len = <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>n2<span style="color: #009900;">)</span>
n2 = n2<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span>curr.len %% <span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span>
mat = <a href="http://inside-r.org/r-doc/base/t"><span style="color: #003399; font-weight: bold;">t</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>n2<span style="color: #339933;">,</span>pos.strip<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/rownames"><span style="color: #003399; font-weight: bold;">rownames</span></a><span style="color: #009900;">(</span>mat<span style="color: #009900;">)</span> = <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span>mat<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## pos.strip returns the position reversed, so the first column is the matrix row and the second is the matrix column:</span>
<a href="http://inside-r.org/r-doc/base/colnames"><span style="color: #003399; font-weight: bold;">colnames</span></a><span style="color: #009900;">(</span>mat<span style="color: #009900;">)</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"row"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"col"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"val"</span><span style="color: #009900;">)</span>
mat = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>mat<span style="color: #009900;">)</span>
empty.frame = <a href="http://inside-r.org/r-doc/base/matrix"><span style="color: #003399; font-weight: bold;">matrix</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">16</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a> = <span style="color: #cc66cc;">4</span><span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span>mat<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.frame<span style="color: #009900;">[</span>mat<span style="">$</span>row<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="color: #339933;">,</span>mat<span style="">$</span>col<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="color: #009900;">]</span> = mat<span style="">$</span>val<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>empty.frame<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">## predicting next board state:</span>
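<span style="color: #666666; font-style: italic;">## merge equal neighbours in one row of four; vec must already be packed to the right (NAs first), as collect.right does before calling this:</span>
comb.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">)</span><span style="color: #009900;">{</span>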
empty.vec = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">)</span>
four.three = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> <span style="">==</span> vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
three.two = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> <span style="">==</span> vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
two.one = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span> <span style="">==</span> vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
layout.vec = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span>two.one<span style="color: #339933;">,</span>three.two<span style="color: #339933;">,</span>four.three<span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span> = vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="">:</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="">:</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">2</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span> = vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
empty.vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span> = vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/all"><span style="color: #003399; font-weight: bold;">all</span></a><span style="color: #009900;">(</span>layout.vec <span style="">==</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
empty.vec = vec
<span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>empty.vec<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
collect.right = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
first.move = <a href="http://inside-r.org/r-doc/base/t"><span style="color: #003399; font-weight: bold;">t</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/apply"><span style="color: #003399; font-weight: bold;">apply</span></a><span style="color: #009900;">(</span>board<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
n.na = <a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/is.na"><span style="color: #003399; font-weight: bold;">is.na</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
stripped = x<span style="color: #009900;">[</span><span style="">!</span><a href="http://inside-r.org/r-doc/base/is.na"><span style="color: #003399; font-weight: bold;">is.na</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">]</span>
comb = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span>n.na<span style="color: #009900;">)</span><span style="color: #339933;">,</span>stripped<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
second.move = <a href="http://inside-r.org/r-doc/base/t"><span style="color: #003399; font-weight: bold;">t</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/apply"><span style="color: #003399; font-weight: bold;">apply</span></a><span style="color: #009900;">(</span>first.move<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span>comb.func<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>second.move<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
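## Illustrative sketch only, not part of the original script and not called by it.
## collect.right above slides each row to the right and then uses comb.func to
## enumerate the eight possible merge patterns of a 4-tile row. Assuming the same
## conventions (NA = empty cell, tiles collect toward the right end), the same
## slide-and-merge step can be written for a row of any length.
## The name merge.row.sketch is hypothetical.
merge.row.sketch = function(vec){
  tiles = rev(vec[!is.na(vec)])          # non-empty tiles, rightmost first
  out = c()
  while(length(tiles) != 0){
    if(isTRUE(tiles[1] == tiles[2])){    # equal neighbours merge exactly once
      out = c(out, 2*tiles[1])
      tiles = tiles[-(1:2)]
    } else {                             # otherwise the tile simply slides
      out = c(out, tiles[1])
      tiles = tiles[-1]
    }
  }
  c(rep(NA, length(vec) - length(out)), rev(out))   # pad with NAs on the left
}
## Example: merge.row.sketch(c(NA, 2, 2, 4)) merges the two 2s and slides them
## right, which is what comb.func does with the same input.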
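<span style="color: #666666; font-style: italic;">## rotate a 4x4 board by 90 degrees counter-clockwise (the right column becomes the top row):</span>
ninety.rot = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>mat<span style="color: #009900;">)</span><span style="color: #009900;">{</span>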
empty = <a href="http://inside-r.org/r-doc/base/matrix"><span style="color: #003399; font-weight: bold;">matrix</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">16</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a> = <span style="color: #cc66cc;">4</span><span style="color: #009900;">)</span>
empty<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span> = mat<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
empty<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span> = mat<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span>
empty<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span> = mat<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
empty<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span> = mat<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>empty<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
collect.down = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp.turn = ninety.rot<span style="color: #009900;">(</span>board<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a> = collect.right<span style="color: #009900;">(</span>temp.turn<span style="color: #009900;">)</span>
turn.back = ninety.rot<span style="color: #009900;">(</span>ninety.rot<span style="color: #009900;">(</span>ninety.rot<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>turn.back<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
collect.up = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp.turn = ninety.rot<span style="color: #009900;">(</span>ninety.rot<span style="color: #009900;">(</span>ninety.rot<span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a> = collect.right<span style="color: #009900;">(</span>temp.turn<span style="color: #009900;">)</span>
turn.back = ninety.rot<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>turn.back<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
collect.left = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp.turn = ninety.rot<span style="color: #009900;">(</span>ninety.rot<span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a> = collect.right<span style="color: #009900;">(</span>temp.turn<span style="color: #009900;">)</span>
turn.back = ninety.rot<span style="color: #009900;">(</span>ninety.rot<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/nlme/collapse"><span style="color: #003399; font-weight: bold;">collapse</span></a><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>turn.back<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
count.tiles = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span><span style="">!</span><a href="http://inside-r.org/r-doc/base/is.na"><span style="color: #003399; font-weight: bold;">is.na</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
preds.lst = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>Parsed<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
board.temp = conv.to.frame<span style="color: #009900;">(</span>Parsed<span style="color: #009900;">)</span>
preds = <a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a><span style="color: #009900;">(</span>orig = board.temp<span style="color: #339933;">,</span>
left = collect.left<span style="color: #009900;">(</span>board.temp<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
right = collect.right<span style="color: #009900;">(</span>board.temp<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
up = collect.up<span style="color: #009900;">(</span>board.temp<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
down = collect.down<span style="color: #009900;">(</span>board.temp<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>preds<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
allowed.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>lst<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># note: this is a function of the output from preds.lst</span>
<span style="color: #666666; font-style: italic;"># returns the directions that are currently allowed.</span>
vals = <a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span>lst<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="">:</span><span style="color: #cc66cc;">5</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/base/identical"><span style="color: #003399; font-weight: bold;">identical</span></a><span style="color: #009900;">(</span>x<span style="color: #339933;">,</span>lst<span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># a direction is allowed only if it changes the board</span>
allowed = <a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>vals<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="">!</span>vals<span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>allowed<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
legal.sub = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>Parsed<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
preds = preds.lst<span style="color: #009900;">(</span>Parsed<span style="color: #009900;">)</span>
moves = allowed.func<span style="color: #009900;">(</span>preds<span style="color: #009900;">)</span>
out = preds<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>preds<span style="color: #009900;">)</span> <span style="">%in%</span> moves<span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
prep.to.send = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>choice.arrow<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span>choice.arrow<span style="color: #339933;">,</span><span style="color: #0000ff;">"_arrow"</span><span style="color: #339933;">,</span>sep = <span style="color: #0000ff;">""</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
send.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>prepped.choice<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
remDr<span style="">$</span>sendKeysToActiveElement<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a><span style="color: #009900;">(</span>key = prepped.choice<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
comb.move = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>arrow<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>send.func<span style="color: #009900;">(</span>prep.to.send<span style="color: #009900;">(</span>arrow<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">#### SECTION III: functions to determine properties of boards ####</span>
tiles.in.fourth = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span><span style="">!</span><a href="http://inside-r.org/r-doc/base/is.na"><span style="color: #003399; font-weight: bold;">is.na</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
tot.sum.in.fourth = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
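<span style="color: #666666; font-style: italic;"># the four functions below return the tile value in rows 4, 3, 2 and 1 of column 4</span>
<span style="color: #666666; font-style: italic;"># (an empty, NA, cell scores 0 because of na.rm = TRUE)</span>
bottom.right.val = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>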
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
bottom.right.third.val = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
bottom.right.sec.val = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
bottom.right.first.val = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
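<span style="color: #666666; font-style: italic;"># number of rows in which columns 3 and 4 hold equal tiles, i.e. pairs that could merge on a later horizontal move</span>
prep.for.next = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>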
<a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span> <span style="">==</span> board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
prep.for.next.third = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span> <span style="">==</span> board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
prep.for.next.second = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span> <span style="">==</span> board<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">#### SECTION IV: scoring boards ####</span>
top.val.moves = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>score.vec<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
raw.scores = score.vec
temp.max = <a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>raw.scores<span style="color: #009900;">)</span>
indx = <a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>raw.scores <span style="">==</span> temp.max<span style="color: #009900;">)</span>
maxima = <a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>raw.scores<span style="color: #009900;">)</span><span style="color: #009900;">[</span>indx<span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>maxima<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
score.em = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>legal.board<span style="color: #339933;">,</span> FUN<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span>legal.board<span style="color: #339933;">,</span>FUN<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">#### SECTION V: algorithm for a single play ####</span>
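play.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>parsed<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># cascade of tie-breakers: at each step keep only the legal moves with the top score,</span>
<span style="color: #666666; font-style: italic;"># play as soon as a single move remains, and pick at random if moves still tie at the end</span>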
legal.boards = legal.sub<span style="color: #009900;">(</span>parsed<span style="color: #009900;">)</span>
bottom.right = score.em<span style="color: #009900;">(</span>legal.boards<span style="color: #339933;">,</span> bottom.right.val<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>bottom.right<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
bottom.right.third = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> bottom.right.third.val<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>bottom.right.third<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
bottom.right.sec = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> bottom.right.sec.val<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>bottom.right.sec<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
tot.fourth.scores = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> tot.sum.in.fourth<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>tot.fourth.scores<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
prep.scores = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> prep.for.next<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>prep.scores<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;"># fewer tiles is better: subtracting from 20 flips the count so that top.val.moves (a maximiser) prefers emptier boards</span>
tile.tots = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span><span style="color: #cc66cc;">20</span> <span style="">-</span> count.tiles<span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>tile.tots<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
prep.scores.third = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> prep.for.next.third<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>prep.scores.third<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
prep.scores.second = score.em<span style="color: #009900;">(</span>leftover.boards<span style="color: #339933;">,</span> prep.for.next.second<span style="color: #009900;">)</span>
leftover.moves = top.val.moves<span style="color: #009900;">(</span>prep.scores.second<span style="color: #009900;">)</span>
leftover.boards = legal.boards<span style="color: #009900;">[</span>leftover.moves<span style="color: #009900;">]</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
rand.choice = leftover.moves<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/sample"><span style="color: #003399; font-weight: bold;">sample</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>leftover.boards<span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>comb.move<span style="color: #009900;">(</span>rand.choice<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
execute = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp = htmlParse<span style="color: #009900;">(</span>remDr<span style="">$</span>getPageSource<span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
play.func<span style="color: #009900;">(</span>temp<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;">#### SECTION VI Playing the game ####</span>
grand.play = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
remDr<span style="">$</span>navigate<span style="color: #009900;">(</span><span style="color: #0000ff;">"http://gabrielecirulli.github.io/2048/"</span><span style="color: #009900;">)</span>
temp2 = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"Continue"</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">while</span><span style="color: #009900;">(</span>temp2<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span> <span style="">!</span>= <span style="color: #0000ff;">"Game over!"</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp = htmlParse<span style="color: #009900;">(</span>remDr<span style="">$</span>getPageSource<span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
execute<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
temp2 = xpathSApply<span style="color: #009900;">(</span>temp<span style="color: #339933;">,</span><span style="color: #0000ff;">"//p"</span><span style="color: #339933;">,</span>xmlValue<span style="color: #009900;">)</span>
curr.score = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>xpathSApply<span style="color: #009900;">(</span>temp<span style="color: #339933;">,</span><span style="color: #0000ff;">"//div[@class='score-container']"</span><span style="color: #339933;">,</span>xmlValue<span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">"<span style="color: #000099; font-weight: bold;">\\</span>+"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>curr.score<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;"># example:</span>
grand.play<span style="color: #009900;">(</span><span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com3tag:blogger.com,1999:blog-8973439534644845561.post-61736106920031314432014-04-11T08:13:00.000-07:002014-04-11T08:13:10.601-07:00Rblogger Posting Patterns Analyzed with R<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGtMIXNEqKM4BcvfADUUty1U26hNZXo7wB9Vv4wN4urLpvCW6oicB-s4Q-G1j5clq-kDIJuXO1Qh-fATskdVaHLmqbC_2ztyGTZfSlQcwhD7qF94cCO27i1tvlEbLJ4vO2fUoy3cSVXahD/s1600/avgDelay.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGtMIXNEqKM4BcvfADUUty1U26hNZXo7wB9Vv4wN4urLpvCW6oicB-s4Q-G1j5clq-kDIJuXO1Qh-fATskdVaHLmqbC_2ztyGTZfSlQcwhD7qF94cCO27i1tvlEbLJ4vO2fUoy3cSVXahD/s640/avgDelay.png" /></a></div>
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta http-equiv="x-ua-compatible" content="IE=9" >
<title></title>
</head>
<body>
<p>I've been a big fan of R-bloggers for quite some time, but have only recently started contributing myself. After my first post yesterday, I immediately started wondering how long most other bloggers go between posts.</p>
<p>I decided to gather the list of past R-bloggers posts to investigate a bit. I've posted the data (as of yesterday evening) <a href="https://github.com/MarkTPatterson/Blog">here</a> – I'm a bit new to GitHub, but the file (RBloggersData.csv) should be there.</p>
<p>I started by using plyr to calculate the average delay between each author's posts. This distribution is heavily right-skewed, and looks fairly normal (or at least mound-shaped; see plot above) once logged. Depending on how 0s are handled (an average delay of 0 days has a log of -Inf), the average (log) delay between posts is around 3.5 to 3.75. Since exp(3.5) is about 33 days and exp(3.75) is about 43 days, the typical author posts roughly once every month to month and a half.</p>
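<p>As a quick illustration of the delay calculation, here is a minimal base R sketch. The helper name <code>avg.gap.days</code> and the example dates are made up for illustration (they are not from RBloggersData.csv), and it uses <code>diff()</code> rather than the <code>difftime()</code> approach in the full script further down.</p>

```r
# Average gap, in days, between consecutive posts by one author.
# Returns NA when there are fewer than two posts.
# (avg.gap.days is a hypothetical helper, not part of the original script.)
avg.gap.days = function(dates) {
  gaps = as.numeric(diff(sort(dates)))   # day differences between neighbouring dates
  if (length(gaps) == 0) return(NA)
  mean(gaps)
}

# Hypothetical post dates for a single author
one.author = as.Date(c("2014-03-02", "2014-01-01", "2014-01-31"))
avg.gap.days(one.author)   # 30 days

# Back-transform the average log delays quoted above into days
exp(c(3.5, 3.75))          # roughly 33 and 43 days
```

<p>Averaging on the log scale and then exponentiating gives a geometric mean, which is much less sensitive to a handful of authors with very long gaps than a plain average would be.</p>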
<p>Next, still being pretty new to blogging, I wondered which days of the week people post on most. The distribution shows that weekends have markedly fewer posts than weekdays, and there's a fairly strong downward trend over the course of the week. I'm guessing most people (like me) end up experimenting with data over the weekend and scraping together a post for Monday. (See first figure below.)</p>
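<p>The day-of-week tally can also be sketched without lubridate. The dates below are invented for illustration, and the base R <code>as.POSIXlt()</code> route is just one assumed alternative to the <code>wday()</code> call in the full script.</p>

```r
# Hypothetical post dates; in the real data these would be the date.format column
post.dates = as.Date(c("2014-04-07", "2014-04-07", "2014-04-08", "2014-04-12"))

# as.POSIXlt()$wday codes days as 0 = Sunday, 1 = Monday, ..., 6 = Saturday
wday0 = as.POSIXlt(post.dates)$wday

# Reorder so the week starts on Monday, and keep days with zero posts
dow = factor(wday0, levels = c(1:6, 0),
             labels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))
posts.by.dow = table(dow)
posts.by.dow
```

<p>Fixing the factor levels up front keeps empty days in the table, so a bar chart built from it always shows all seven days in order.</p>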
<p>Finally, even though I've been seeing the feed of R-bloggers posts for a while, I'd never really tracked the total number of posts per day. When I aggregated the data to the day level, I was surprised by the explosive growth the site has had since around 2009. After fitting a nonparametric regression line (see second figure below), we can see the average posts per day roughly double from 2009 to 2010, and double again between 2010 and 2012! Below are the figures and the code used to generate them.</p>
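<p>To give a feel for what the smoothed trend line does, here is a small sketch on synthetic data. Note the swap: it uses base R's <code>lowess()</code> as a stand-in smoother, not the <code>npreg()</code> kernel regression from the np package that the actual analysis uses, and the simulated counts are an assumption, not the R-bloggers data.</p>

```r
set.seed(1)

# Synthetic daily post counts with a rising average (not the real data)
time = 0:199
tot.posts = rpois(200, lambda = 2 + time / 20)

# Locally weighted smoother; f is the fraction of points used for each local fit
fit = lowess(time, tot.posts, f = 0.5)

# fit$x holds the sorted time values and fit$y the smoothed posts-per-day
head(fit$y)
tail(fit$y)
```

<p>Plotting <code>tot.posts</code> against <code>time</code> and overlaying <code>lines(fit)</code> would give a rough base-graphics analogue of the posts-per-day figure.</p>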
</body>
</html>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfxyogis1DXU9Wz8PrPs2PYedbWTo5FiewaaAO77ARYHsWg_SG99L6ekICHMroCn6NFpvPgAm8y0yaQ9HkoR0eQUVGVKg4W1dKVtXYEVj017Y_y9OvOHi_UhRxs_4mtdCNd7xtoI25V7Bg/s1600/dayOfWeek.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfxyogis1DXU9Wz8PrPs2PYedbWTo5FiewaaAO77ARYHsWg_SG99L6ekICHMroCn6NFpvPgAm8y0yaQ9HkoR0eQUVGVKg4W1dKVtXYEVj017Y_y9OvOHi_UhRxs_4mtdCNd7xtoI25V7Bg/s640/dayOfWeek.png" /></a></div>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOaKifh0LH2ZRihyphenhyphen5p0jmoHGp1HO25yuQEHnDPEkUW-zEnCEl003ZzOLo5YBBBurRRlzPM728gRbDbuxFHKR24l62vN8NXRJ2D5kZSRi2OQhJZ_LGIC2W6HKI8m7UrOeraUmlEhtsz5Grg/s1600/postsPerDay.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOaKifh0LH2ZRihyphenhyphen5p0jmoHGp1HO25yuQEHnDPEkUW-zEnCEl003ZzOLo5YBBBurRRlzPM728gRbDbuxFHKR24l62vN8NXRJ2D5kZSRi2OQhJZ_LGIC2W6HKI8m7UrOeraUmlEhtsz5Grg/s640/postsPerDay.png" /></a></div>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><a href="http://inside-r.org/r-doc/base/load"><span style="color: #003399; font-weight: bold;">load</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"C:/Users/Mark/Desktop/RInvest/WebScraping/rblogger.RData"</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/plyr"><span style="">plyr</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>lubridate<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/np"><span style="">np</span></a><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## find.avg: mean gap (in days) between consecutive entries of one author's post dates; NA for a single post</span>
find.avg = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>post.inputs<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>post.inputs<span style="color: #009900;">)</span> <span style="">==</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>out = <span style="color: #000000; font-weight: bold;">NA</span><span style="color: #009900;">}</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">{</span>
diffs.raw = <a href="http://inside-r.org/r-doc/base/difftime"><span style="color: #003399; font-weight: bold;">difftime</span></a><span style="color: #009900;">(</span>post.inputs<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span>post.inputs<span style="color: #009900;">[</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/utils/tail"><span style="color: #003399; font-weight: bold;">tail</span></a><span style="color: #009900;">(</span>post.inputs<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/base/units"><span style="color: #003399; font-weight: bold;">units</span></a> = <span style="color: #0000ff;">"days"</span><span style="color: #009900;">)</span>
diffs = diffs.raw<span style="color: #009900;">[</span><span style="">-</span><a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>diffs.raw<span style="color: #009900;">)</span><span style="color: #009900;">]</span>
out = <a href="http://inside-r.org/r-doc/base/mean"><span style="color: #003399; font-weight: bold;">mean</span></a><span style="color: #009900;">(</span>diffs<span style="color: #009900;">)</span><span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
delay.frame = ddply<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="color: #339933;">,</span>.<span style="color: #009900;">(</span>author<span style="color: #009900;">)</span><span style="color: #339933;">,</span>summarize<span style="color: #339933;">,</span>
avg.delay = <a href="http://inside-r.org/r-doc/base/round"><span style="color: #003399; font-weight: bold;">round</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span>find.avg<span style="color: #009900;">(</span>date.format<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
tot.posts = <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span>date.format<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>delay.frame<span style="color: #339933;">,</span>aes<span style="color: #009900;">(</span>x = <a href="http://inside-r.org/r-doc/base/log"><span style="color: #003399; font-weight: bold;">log</span></a><span style="color: #009900;">(</span>avg.delay<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_density<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
p <span style="">+</span> xlab<span style="color: #009900;">(</span><span style="color: #0000ff;">"average delay between posts (log days)"</span><span style="color: #009900;">)</span> <span style="">+</span> theme_bw<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
ggsave<span style="color: #009900;">(</span><span style="color: #0000ff;">"avgDelay.png"</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## average log delay; authors with an average delay of 0 days (log = -Inf) are counted as 0</span>
log.delay = <a href="http://inside-r.org/r-doc/base/log"><span style="color: #003399; font-weight: bold;">log</span></a><span style="color: #009900;">(</span>delay.frame<span style="">$</span>avg.delay<span style="color: #009900;">)</span>
log.delay<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>log.delay <span style="">==</span> <span style="">-</span><span style="color: #000000; font-weight: bold;">Inf</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">0</span>
<a href="http://inside-r.org/r-doc/base/mean"><span style="color: #003399; font-weight: bold;">mean</span></a><span style="color: #009900;">(</span>log.delay<span style="color: #339933;">,</span>na.rm = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="">$</span>dow = wday<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="">$</span>date.format<span style="color: #339933;">,</span> label = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="">$</span>month = month<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="">$</span>date.format<span style="color: #339933;">,</span> label = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="">$</span>year = year<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="">$</span>date.format<span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>x = dow<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_bar<span style="color: #009900;">(</span>fill = <span style="color: #0000ff;">"blue"</span><span style="color: #009900;">)</span>
p <span style="">+</span> theme_bw<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span> xlab<span style="color: #009900;">(</span><span style="color: #0000ff;">"day of week"</span><span style="color: #009900;">)</span> <span style="">+</span> ylab<span style="color: #009900;">(</span><span style="color: #0000ff;">"total posts"</span><span style="color: #009900;">)</span>
ggsave<span style="color: #009900;">(</span><span style="color: #0000ff;">"dayOfWeek.png"</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## how many posts per day?</span>
post.per.day.frame = ddply<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base"><span style="color: #006600; font-weight: bold;">base</span></a><span style="color: #339933;">,</span>.<span style="color: #009900;">(</span>date.format<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
summarize<span style="color: #339933;">,</span>
tot.posts = <a href="http://inside-r.org/r-doc/base/length"><span style="color: #003399; font-weight: bold;">length</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/title"><span style="color: #003399; font-weight: bold;">title</span></a><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
post.per.day.frame<span style="">$</span>time = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/difftime"><span style="color: #003399; font-weight: bold;">difftime</span></a><span style="color: #009900;">(</span>post.per.day.frame<span style="">$</span>date.format<span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/min"><span style="color: #003399; font-weight: bold;">min</span></a><span style="color: #009900;">(</span>post.per.day.frame<span style="">$</span>date.format<span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span>post.per.day.frame<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/base/units"><span style="color: #003399; font-weight: bold;">units</span></a> = <span style="color: #0000ff;">"days"</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
np.1 = npreg<span style="color: #009900;">(</span>tot.posts <span style="">~</span> <a href="http://inside-r.org/r-doc/stats/time"><span style="color: #003399; font-weight: bold;">time</span></a><span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/utils/data"><span style="color: #003399; font-weight: bold;">data</span></a> = post.per.day.frame<span style="color: #009900;">)</span>
post.per.day.frame<span style="">$</span>pred = <a href="http://inside-r.org/r-doc/stats/predict"><span style="color: #003399; font-weight: bold;">predict</span></a><span style="color: #009900;">(</span>np.1<span style="color: #339933;">,</span> newdata = post.per.day.frame<span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>post.per.day.frame<span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>x = date.format<span style="color: #339933;">,</span>
y = tot.posts<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_point<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span>
geom_line<span style="color: #009900;">(</span>aes<span style="color: #009900;">(</span>x = date.format<span style="color: #339933;">,</span> y = pred<span style="color: #009900;">)</span><span style="color: #339933;">,</span> color = <span style="color: #0000ff;">"red"</span><span style="color: #339933;">,</span> size = <span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span>
p <span style="">+</span> theme_bw<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span> xlab<span style="color: #009900;">(</span><span style="color: #0000ff;">"date"</span><span style="color: #009900;">)</span> <span style="">+</span> ylab<span style="color: #009900;">(</span><span style="color: #0000ff;">"total posts"</span><span style="color: #009900;">)</span>
ggsave<span style="color: #009900;">(</span><span style="color: #0000ff;">"postsPerDay.png"</span><span style="color: #009900;">)</span></pre></div></div>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-38721398841479409132014-04-10T06:57:00.000-07:002014-04-10T07:41:36.945-07:00Visualizing Twitter Followers Using Pointillism<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta http-equiv="x-ua-compatible" content="IE=9" >
<title></title>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmDYvRNx5ORKaUOX0-xBMmzfeADJXYO0ViRMiL61CcbaGBxqUPA8qSZIjtI8_eQ4jIe7YscoaTHpLTRnR_dvyuUny3rAoHPGTV_wvzOSm2H3DG7be-Z0O6X77B7r668YRe9UNuD_pSlBg2/s1600/revoDavid.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmDYvRNx5ORKaUOX0-xBMmzfeADJXYO0ViRMiL61CcbaGBxqUPA8qSZIjtI8_eQ4jIe7YscoaTHpLTRnR_dvyuUny3rAoHPGTV_wvzOSm2H3DG7be-Z0O6X77B7r668YRe9UNuD_pSlBg2/s640/revoDavid.png" /></a></div>
</head>
<body>
<p>A funny thing about social media in the 21st century is that it allows us to connect with a lot of people. By a lot, I mean so many that it's easy to lose any sense of scale. Maybe others are better at this, but I have a hard time wrapping my head around what (say) 8,000 Twitter followers looks like.</p>
<p>To try to get a grip on this, I thought it'd be fun to try to represent the number of followers a person has by creating plots with a point for each follower. Using R, this turns out to be really easy!</p>
<p>To make these plots a bit more interesting than just a mass of dots, I decided to use Twitter profile pictures as a source of color. The result is pretty cool – we get a plot with a 'pointillist' representation of the profile picture. To try this out, I've created representations for a few famous R bloggers – David Smith above, Tal Galili and Hadley Wickham below.</p>
<p>There are a bunch of ways to do this; the code below (roughly) samples a set of (x,y) coordinates within the picture's dimensions, finds a 'close color' in the original image for each one, and then plots this set of points.</p>
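<p>The core sampling idea can be sketched in a few lines of base R. This is a simplified illustration, not the script itself: it uses a random array as a stand-in for a decoded JPEG, and it takes the exact pixel color at each sampled location instead of searching for a 'close color'. All variable names here are made up.</p>

```r
set.seed(42)

# Stand-in for a decoded JPEG: height x width x 3 (r, g, b) with values in [0, 1]
h = 40
w = 60
img = array(runif(h * w * 3), dim = c(h, w, 3))

# One point per follower; 500 is an arbitrary example count
n.points = 500
rows = sample(h, n.points, replace = TRUE)
cols = sample(w, n.points, replace = TRUE)

# A 3-column index matrix picks one (row, col, channel) cell per sampled point
pts = data.frame(
  x   = cols,
  y   = h - rows + 1,   # flip so image row 1 ends up at the top of the plot
  col = rgb(img[cbind(rows, cols, 1)],
            img[cbind(rows, cols, 2)],
            img[cbind(rows, cols, 3)]),
  stringsAsFactors = FALSE
)
head(pts)
```

<p>With ggplot2 loaded, something like <code>ggplot(pts, aes(x, y, colour = col)) + geom_point() + scale_colour_identity()</code> would then draw each point in its sampled color.</p>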
<p>In order to preserve the pointillist aesthetic for very small and very large numbers of followers, the size of the points is a decreasing function of the number of followers.</p>
<p>There are a bunch of ways this function could be improved – right now, it only works if the original image is a JPEG file. Also, I've limited the number of points the function will visualize to 30,000.</p>
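<p>The two choices above (shrinking the points as the follower count grows, and capping the plot at 30,000 points) could look something like this. The square-root rule, the size bounds, and the helper names are assumptions for illustration; the actual script may use a different formula.</p>

```r
# Cap on the number of plotted points, as stated in the post
max.points = 30000

# Number of points to draw for a given follower count (hypothetical helper)
n.to.draw = function(followers) min(followers, max.points)

# Hypothetical size rule: shrink with the square root of the point count,
# clamped to the range [0.5, 6] so points never vanish or swamp the plot.
# Written for a single count at a time (min/max are not vectorized).
point.size = function(n, base.size = 40) {
  max(0.5, min(6, base.size / sqrt(n)))
}

n.to.draw(8000)     # 8000
n.to.draw(50000)    # 30000
point.size(100)     # 4
point.size(30000)   # 0.5, the lower bound
```

<p>A square-root rule is a natural guess because the area covered by the dots then stays roughly constant as the number of points changes.</p>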
<p>The script pulls Twitter profile pictures and follower counts using Jeff Gentry's twitteR package. Getting this running can be a bit of a pain, but there's help <a href="https://sites.google.com/site/dataminingatuoc/home/data-from-twitter/r-oauth-for-twitter">here</a>.</p>
<p>Here's the code – please feel free to improve on it. It's pretty hacked together:</p>
</body>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3A1qGmQ_tIPCdiB_m03mur9e_ZKReqcLb70l5dqj0duLnLp2ft1wV46TQPI8DyyxvZwUjKoVfEiST0n1iDb1hQ2gezsAkqFKhNtPyFo-K8T0wExo8Z7C2UDlYOPrXfqC5hmLImct1Mp4L/s1600/TG.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3A1qGmQ_tIPCdiB_m03mur9e_ZKReqcLb70l5dqj0duLnLp2ft1wV46TQPI8DyyxvZwUjKoVfEiST0n1iDb1hQ2gezsAkqFKhNtPyFo-K8T0wExo8Z7C2UDlYOPrXfqC5hmLImct1Mp4L/s640/TG.png" /></a></div>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEFD0vP3zAEkJUkDVQLxsPGWjJIAM-AM13GTnmR-6pNvAlL7n3AaEEPsiNyw9l0-RVkMXF4FTfiCoNrF9INSnjcU_G5TU4zsO68hlU0gaE5i7tyPj9wJ98JrqYMBaq6As9z-lsJKsbYDpt/s1600/HW.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEFD0vP3zAEkJUkDVQLxsPGWjJIAM-AM13GTnmR-6pNvAlL7n3AaEEPsiNyw9l0-RVkMXF4FTfiCoNrF9INSnjcU_G5TU4zsO68hlU0gaE5i7tyPj9wJ98JrqYMBaq6As9z-lsJKsbYDpt/s640/HW.png" /></a></div>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">## required libraries:</span>
<span style="color: #666666; font-style: italic;">## note: you need to register twitteR credentials before running!</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>EBImage<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/grDevices/jpeg"><span style="color: #003399; font-weight: bold;">jpeg</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/grid"><span style="color: #003399; font-weight: bold;">grid</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/twitteR"><span style="">twitteR</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>ROAuth<span style="color: #009900;">)</span>
im.func.1 = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #339933;">,</span> k.cols = <span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># creating a dataframe:</span>
test.mat = <a href="http://inside-r.org/r-doc/base/matrix"><span style="color: #003399; font-weight: bold;">matrix</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/ncol"><span style="color: #003399; font-weight: bold;">ncol</span></a> = <span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>test.mat<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/colnames"><span style="color: #003399; font-weight: bold;">colnames</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"r"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"g"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"b"</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>y = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>x = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> each = <a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/image"><span style="color: #003399; font-weight: bold;">image</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># extracting colors:</span>
k2 = <a href="http://inside-r.org/r-doc/stats/kmeans"><span style="color: #003399; font-weight: bold;">kmeans</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>k.cols<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># adding centers back:</span>
fit.test = <a href="http://inside-r.org/r-doc/stats/fitted"><span style="color: #003399; font-weight: bold;">fitted</span></a><span style="color: #009900;">(</span>k2<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>r.pred = fit.test<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>g.pred = fit.test<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>b.pred = fit.test<span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
num.to.let = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x1<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># x1: a single hex digit value (0-15); 0-9 are returned as characters unchanged, 10-15 are mapped to "A"-"F"</span>
ref.dat = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>num = <span style="color: #cc66cc;">10</span><span style="">:</span><span style="color: #cc66cc;">15</span><span style="color: #339933;">,</span> let = <span style="color: #000000; font-weight: bold;">LETTERS</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">6</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
out = <a href="http://inside-r.org/r-doc/base/as.character"><span style="color: #003399; font-weight: bold;">as.character</span></a><span style="color: #009900;">(</span>x1<span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>x1 <span style="">%in%</span> <span style="color: #cc66cc;">10</span><span style="">:</span><span style="color: #cc66cc;">15</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>out = <a href="http://inside-r.org/r-doc/base/as.character"><span style="color: #003399; font-weight: bold;">as.character</span></a><span style="color: #009900;">(</span>ref.dat<span style="">$</span>let<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>ref.dat<span style="">$</span>num <span style="">==</span> x1<span style="color: #009900;">)</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
rgb.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;">#note: vec is a triple of color intensities</span>
r1 = <a href="http://inside-r.org/r-doc/base/floor"><span style="color: #003399; font-weight: bold;">floor</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
g1 = <a href="http://inside-r.org/r-doc/base/floor"><span style="color: #003399; font-weight: bold;">floor</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
b1 = <a href="http://inside-r.org/r-doc/base/floor"><span style="color: #003399; font-weight: bold;">floor</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">*</span>vec<span style="color: #009900;">[</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
x1 = r1 <span style="">%/%</span> <span style="color: #cc66cc;">16</span>
x2 = r1 %% <span style="color: #cc66cc;">16</span>
x3 = g1 <span style="">%/%</span> <span style="color: #cc66cc;">16</span>
x4 = g1 %% <span style="color: #cc66cc;">16</span>
x5 = b1 <span style="">%/%</span> <span style="color: #cc66cc;">16</span>
x6 = b1 %% <span style="color: #cc66cc;">16</span>
x1 = num.to.let<span style="color: #009900;">(</span>x1<span style="color: #009900;">)</span>
x2 = num.to.let<span style="color: #009900;">(</span>x2<span style="color: #009900;">)</span>
x3 = num.to.let<span style="color: #009900;">(</span>x3<span style="color: #009900;">)</span>
x4 = num.to.let<span style="color: #009900;">(</span>x4<span style="color: #009900;">)</span>
x5 = num.to.let<span style="color: #009900;">(</span>x5<span style="color: #009900;">)</span>
x6 = num.to.let<span style="color: #009900;">(</span>x6<span style="color: #009900;">)</span>
out = <a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"#"</span><span style="color: #339933;">,</span>x1<span style="color: #339933;">,</span>x2<span style="color: #339933;">,</span>x3<span style="color: #339933;">,</span>x4<span style="color: #339933;">,</span>x5<span style="color: #339933;">,</span>x6<span style="color: #339933;">,</span> sep = <span style="color: #0000ff;">""</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span>out<span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
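# Aside: a minimal sketch of a shorter route to the same hex string, using only base R's sprintf.
# Assumptions: vec is a triple of intensities in [0, 1], as in rgb.func above;
# rgb.func.alt is a hypothetical name that the rest of this script does not use.
rgb.func.alt = function(vec){
  # scale to 0-255 and floor as rgb.func does, then format each channel as two uppercase hex digits
  v = floor(255*vec[1:3])
  sprintf("#%02X%02X%02X", v[1], v[2], v[3])
}
# usage: rgb.func.alt(c(1, 0.5, 0))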
dot.size.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>n<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># n: follower count; fewer followers get larger dots, and NA is returned above 30000</span>
<a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">1</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style="">></span><span style="color: #cc66cc;">10000</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">.5</span><span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style=""><</span><span style="color: #cc66cc;">5000</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">2</span><span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style=""><</span><span style="color: #cc66cc;">2000</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">3</span><span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style=""><</span><span style="color: #cc66cc;">1000</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">4</span><span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style=""><</span><span style="color: #cc66cc;">500</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">5</span><span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style=""><</span><span style="color: #cc66cc;">200</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #cc66cc;">6</span><span style="color: #009900;">}</span>
<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>n<span style="">></span><span style="color: #cc66cc;">30000</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a> = <span style="color: #000000; font-weight: bold;">NA</span><span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/grDevices/dot"><span style="color: #003399; font-weight: bold;">dot</span></a><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
general.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>user<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># user: a Twitter screen name; draws one randomly placed dot per follower, colored from the clustered profile image, beside the original image</span>
get.em = getUser<span style="color: #009900;">(</span>user<span style="color: #339933;">,</span> cainfo = <span style="color: #0000ff;">"cacert.pem"</span><span style="color: #009900;">)</span>
img = readImage<span style="color: #009900;">(</span>get.em<span style="">$</span>profileImageUrl<span style="color: #009900;">)</span>
n.follow = get.em<span style="">$</span>followersCount
dot.size = dot.size.func<span style="color: #009900;">(</span>n.follow<span style="color: #009900;">)</span>
dat1 = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>x = <a href="http://inside-r.org/r-doc/stats/runif"><span style="color: #003399; font-weight: bold;">runif</span></a><span style="color: #009900;">(</span>n.follow<span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
y = <a href="http://inside-r.org/r-doc/stats/runif"><span style="color: #003399; font-weight: bold;">runif</span></a><span style="color: #009900;">(</span>n.follow<span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/dim"><span style="color: #003399; font-weight: bold;">dim</span></a><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
radius = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span>dot.size<span style="color: #339933;">,</span>n.follow<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
temp = im.func.1<span style="color: #009900;">(</span>img<span style="color: #339933;">,</span>k.cols = <span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span>
dat1<span style="">$</span>x.round = <a href="http://inside-r.org/r-doc/base/round"><span style="color: #003399; font-weight: bold;">round</span></a><span style="color: #009900;">(</span>dat1<span style="">$</span>x<span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span>
dat1<span style="">$</span>y.round = <a href="http://inside-r.org/r-doc/base/round"><span style="color: #003399; font-weight: bold;">round</span></a><span style="color: #009900;">(</span>dat1<span style="">$</span>y<span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span>
dat1<span style="">$</span>x.round<span style="color: #009900;">[</span>dat1<span style="">$</span>x.round <span style="">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">1</span>
dat1<span style="">$</span>y.round<span style="color: #009900;">[</span>dat1<span style="">$</span>y.round <span style="">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">1</span>
dat1 = <a href="http://inside-r.org/r-doc/base/merge"><span style="color: #003399; font-weight: bold;">merge</span></a><span style="color: #009900;">(</span>dat1<span style="color: #339933;">,</span>temp<span style="color: #339933;">,</span>by.x = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"x.round"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"y.round"</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> by.y = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"x"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"y"</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># splice in the colors:</span>
dat1<span style="">$</span>col = <a href="http://inside-r.org/r-doc/base/apply"><span style="color: #003399; font-weight: bold;">apply</span></a><span style="color: #009900;">(</span>dat1<span style="color: #009900;">[</span><span style="color: #cc66cc;">9</span><span style="">:</span><span style="color: #cc66cc;">11</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span>rgb.func<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">## trying out a different plot:</span>
dat1<span style="">$</span>x = <a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>dat1<span style="">$</span>x<span style="color: #009900;">)</span> <span style="">-</span> dat1<span style="">$</span>x
g = rasterGrob<span style="color: #009900;">(</span>img<span style="color: #339933;">,</span> interpolate = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>dat1<span style="color: #339933;">,</span>aes<span style="color: #009900;">(</span>x = y<span style="color: #339933;">,</span> y = x<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/col"><span style="color: #003399; font-weight: bold;">col</span></a> = <a href="http://inside-r.org/r-doc/base/col"><span style="color: #003399; font-weight: bold;">col</span></a><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_point<span style="color: #009900;">(</span>size = dat1<span style="">$</span>radius<span style="color: #009900;">)</span> <span style="">+</span> scale_colour_identity<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span>
ylim<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/min"><span style="color: #003399; font-weight: bold;">min</span></a><span style="color: #009900;">(</span>temp<span style="">$</span>x<span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>temp<span style="">$</span>x<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span>
xlim<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/min"><span style="color: #003399; font-weight: bold;">min</span></a><span style="color: #009900;">(</span>temp<span style="">$</span>y<span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>temp<span style="">$</span>y <span style="">+</span> <span style="color: #cc66cc;">50</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span>
theme_bw<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span>
theme<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/line"><span style="color: #003399; font-weight: bold;">line</span></a> = element_blank<span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/graphics/text"><span style="color: #003399; font-weight: bold;">text</span></a> = element_blank<span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/graphics/title"><span style="color: #003399; font-weight: bold;">title</span></a> = element_blank<span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span>
annotation_custom<span style="color: #009900;">(</span>g<span style="color: #339933;">,</span> xmin = <a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>temp<span style="">$</span>y<span style="color: #009900;">)</span> <span style="">+</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span> xmax = <a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span>temp<span style="">$</span>y<span style="color: #009900;">)</span> <span style="">+</span> <span style="color: #cc66cc;">50</span><span style="color: #339933;">,</span> ymin = <span style="">-</span><span style="color: #000000; font-weight: bold;">Inf</span><span style="color: #339933;">,</span> ymax = <span style="color: #000000; font-weight: bold;">Inf</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/print"><span style="color: #003399; font-weight: bold;">print</span></a><span style="color: #009900;">(</span>p<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
t.start = <a href="http://inside-r.org/r-doc/base/Sys.time"><span style="color: #003399; font-weight: bold;">Sys.time</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span>
general.func<span style="color: #009900;">(</span><span style="color: #0000ff;">"revodavid"</span><span style="color: #009900;">)</span>
t.end = <a href="http://inside-r.org/r-doc/base/Sys.time"><span style="color: #003399; font-weight: bold;">Sys.time</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span>
t.end <span style="">-</span> t.start
ggsave<span style="color: #009900;">(</span><span style="color: #0000ff;">"RD.png"</span><span style="color: #339933;">,</span> width = <span style="color: #cc66cc;">8.25</span><span style="color: #339933;">,</span> height =<span style="color: #cc66cc;">4.42</span><span style="color: #009900;">)</span></pre></div></div>
</html>
Posted by Mark T Patterson (http://www.blogger.com/profile/13617500268567880565)<br />
College Basketball: Presence in the NBA over Time (published 2013-11-07T08:32:00.001-08:00, updated 2013-11-07T08:33:35.379-08:00)<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZ1yNasgz9VeBx7CE84fWuuJCzu_7w2pksLZpfx62ljUerVwWKWhB75ZqGG9xXDTp5zVxx078FZJxXUsUkBBMYegYJnVyweZJ_9MGmHsZLrUhQdqhpE8U3XpQjZsC-2nlk9Xoh-zmDMDNn/s1600/topCollegeNBA.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZ1yNasgz9VeBx7CE84fWuuJCzu_7w2pksLZpfx62ljUerVwWKWhB75ZqGG9xXDTp5zVxx078FZJxXUsUkBBMYegYJnVyweZJ_9MGmHsZLrUhQdqhpE8U3XpQjZsC-2nlk9Xoh-zmDMDNn/s640/topCollegeNBA.png" /></a></div>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
<p>Interested in practicing a bit of web-scraping, I decided to make use of a nice dataset provided by <a href="http://www.databasebasketball.com/players/playerbycollege.htm">Databasebasketball.com</a> in order to examine the representation of various college programs in the NBA/ABA over time. This dataset only includes retired players, and ends in 2010, so I decided to only plot data through 2000.</p>
<p>Originally, I was excited to try out a <a href="http://code.google.com/p/google-motion-charts-with-r/">googleVis motion chart</a> using this data, but the result turned out less exciting than I expected.</p>
<p>Here, I've restricted my attention to college programs which (at some point) had at least 11 players in the league simultaneously; this turns out to limit the inclusion to a handful of programs.</p>
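<p>As a rough illustration of that restriction, here is a minimal sketch on made-up numbers. The data frame <code>counts</code>, its column names, and its values are assumptions for illustration only, not the scraped data; the idea is simply to take each program's peak head count and keep the programs whose peak reaches 11.</p>
<pre><code class="r"># hypothetical toy data: number of players in the league per program and year
counts = data.frame(
  college = rep(c("UCLA", "Kentucky", "Small State"), each = 3),
  year = rep(c(1980, 1981, 1982), times = 3),
  n.players = c(10, 12, 11, 9, 11, 8, 2, 3, 1)
)
# peak number of simultaneous players for each program
peak = tapply(counts$n.players, counts$college, max)
# keep only the programs whose peak reaches the threshold of 11
keep = names(peak)[peak >= 11]
plot.dat = counts[counts$college %in% keep, ]
</code></pre>
<p>With these toy numbers, <code>keep</code> holds Kentucky and UCLA, and <code>plot.dat</code> retains every year for those two programs, including the years in which they fell below the threshold.</p>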
<p>While enthusiasts of NBA history surely will not need this plot to recall the periods when particular schools had a strong presence in the league, I think the plot nicely captures the story behind several programs. It's easy to see the relatively recent emergence of Georgia Tech and Arizona, the slow climb of UNC and Michigan, and the powerhouse years of Kentucky (1950s) and UCLA (1980s).</p>
<p>Generating code is below.</p>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># data scrape:</span>
site = <span style="color: #0000ff;">"http://www.databasebasketball.com/players/playerbycollege.htm"</span>
<span style="color: #666666; font-style: italic;"># turn off warnings:</span>
<a href="http://inside-r.org/r-doc/base/options"><span style="color: #003399; font-weight: bold;">options</span></a><span style="color: #009900;">(</span>warn = <span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># readlines:</span>
tab = <a href="http://inside-r.org/r-doc/base/readLines"><span style="color: #003399; font-weight: bold;">readLines</span></a><span style="color: #009900;">(</span>site<span style="color: #009900;">)</span>
trim = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># x: one line of the index page; drops the first and last 8 characters, keeps the text before the first ">", and prepends the site root to form a full URL</span>
temp = <a href="http://inside-r.org/r-doc/base/substr"><span style="color: #003399; font-weight: bold;">substr</span></a><span style="color: #009900;">(</span>x<span style="color: #339933;">,</span><span style="color: #cc66cc;">9</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nchar"><span style="color: #003399; font-weight: bold;">nchar</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="">-</span><span style="color: #cc66cc;">8</span><span style="color: #009900;">)</span>
temp2 = <a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>temp<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">">"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/paste"><span style="color: #003399; font-weight: bold;">paste</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"http://www.databasebasketball.com"</span><span style="color: #339933;">,</span>temp2<span style="color: #339933;">,</span>sep = <span style="color: #0000ff;">""</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/base/sub"><span style="color: #003399; font-weight: bold;">sub</span></a> = tab<span style="color: #009900;">[</span><span style="color: #cc66cc;">81</span><span style="">:</span><span style="color: #cc66cc;">553</span><span style="color: #009900;">]</span>
sites = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sub"><span style="color: #003399; font-weight: bold;">sub</span></a><span style="color: #339933;">,</span> trim<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># find lines around players:</span>
dates.grab = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>s1<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp = <a href="http://inside-r.org/r-doc/base/readLines"><span style="color: #003399; font-weight: bold;">readLines</span></a><span style="color: #009900;">(</span>s1<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/start"><span style="color: #003399; font-weight: bold;">start</span></a> = <a href="http://inside-r.org/r-doc/base/grep"><span style="color: #003399; font-weight: bold;">grep</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"listed separately"</span><span style="color: #339933;">,</span>temp<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/end"><span style="color: #003399; font-weight: bold;">end</span></a> = <a href="http://inside-r.org/r-doc/base/grep"><span style="color: #003399; font-weight: bold;">grep</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"font class=foot"</span><span style="color: #339933;">,</span>temp<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/sub"><span style="color: #003399; font-weight: bold;">sub</span></a> = temp<span style="color: #009900;">[</span><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/start"><span style="color: #003399; font-weight: bold;">start</span></a><span style="">+</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span><span style="">:</span><span style="color: #009900;">(</span>end<span style="">-</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span>
pattern = <span style="color: #0000ff;">"[[:digit:]]+-[[:digit:]]+"</span>
m = <a href="http://inside-r.org/r-doc/base/gregexpr"><span style="color: #003399; font-weight: bold;">gregexpr</span></a><span style="color: #009900;">(</span>pattern<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/sub"><span style="color: #003399; font-weight: bold;">sub</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span>regmatches<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/sub"><span style="color: #003399; font-weight: bold;">sub</span></a><span style="color: #339933;">,</span>m<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span>sites<span style="color: #339933;">,</span>dates.grab<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"years"</span><span style="color: #009900;">)</span>
test = <a href="http://inside-r.org/r-doc/base/rownames"><span style="color: #003399; font-weight: bold;">rownames</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span>
clean.school = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>name<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
temp = <a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>name<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">">"</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/substr"><span style="color: #003399; font-weight: bold;">substr</span></a><span style="color: #009900;">(</span>temp<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nchar"><span style="color: #003399; font-weight: bold;">nchar</span></a><span style="color: #009900;">(</span>temp<span style="color: #009900;">)</span><span style="">-</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>school = <a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/rownames"><span style="color: #003399; font-weight: bold;">rownames</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span><span style="color: #339933;">,</span>clean.school<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/rownames"><span style="color: #003399; font-weight: bold;">rownames</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span> = <span style="color: #cc66cc;">1</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.start = <a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>years<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/base/substr"><span style="color: #003399; font-weight: bold;">substr</span></a><span style="color: #009900;">(</span>x<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.end = <a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/lapply"><span style="color: #003399; font-weight: bold;">lapply</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>years<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/base/substr"><span style="color: #003399; font-weight: bold;">substr</span></a><span style="color: #009900;">(</span>x<span style="color: #339933;">,</span><span style="color: #cc66cc;">6</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="">:</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.start = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.start<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.end = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.end<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/min"><span style="color: #003399; font-weight: bold;">min</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.start<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/max"><span style="color: #003399; font-weight: bold;">max</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>year.end<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#looking for players in 1946:</span>
was.playing.func = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>years<span style="color: #339933;">,</span>test.year<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span>test.year <span style="">%in%</span> years<span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">:</span>years<span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
<span style="color: #666666; font-style: italic;"># 65 years</span>
mat = <a href="http://inside-r.org/r-doc/base/matrix"><span style="color: #003399; font-weight: bold;">matrix</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span><span style="">*</span><span style="color: #cc66cc;">65</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/ncol"><span style="color: #003399; font-weight: bold;">ncol</span></a> = <span style="color: #cc66cc;">65</span><span style="color: #009900;">)</span>
<span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #cc66cc;">65</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span>
mat<span style="color: #009900;">[</span><span style="color: #339933;">,</span>i<span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/apply"><span style="color: #003399; font-weight: bold;">apply</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">[</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="">:</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span>was.playing.func<span style="color: #009900;">(</span>x<span style="color: #339933;">,</span><span style="color: #009900;">(</span>i <span style="">+</span> <span style="color: #cc66cc;">1945</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
copy = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a>
copy = <a href="http://inside-r.org/r-doc/base/cbind"><span style="color: #003399; font-weight: bold;">cbind</span></a><span style="color: #009900;">(</span>copy<span style="color: #339933;">,</span> mat<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>copy<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">4</span><span style="">:</span><a href="http://inside-r.org/r-doc/base/ncol"><span style="color: #003399; font-weight: bold;">ncol</span></a><span style="color: #009900;">(</span>copy<span style="color: #009900;">)</span><span style="color: #009900;">]</span> = <span style="color: #cc66cc;">1946</span><span style="">:</span><span style="color: #cc66cc;">2010</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/reshape"><span style="color: #003399; font-weight: bold;">reshape</span></a><span style="color: #009900;">)</span>
mdata = melt<span style="color: #009900;">(</span>copy<span style="color: #339933;">,</span> id = <span style="color: #0000ff;">"school"</span><span style="color: #009900;">)</span>
mdata = mdata<span style="color: #009900;">[</span><span style="">-</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>mdata<span style="">$</span>variable <span style="">%in%</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"year.start"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"year.end"</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/names"><span style="color: #003399; font-weight: bold;">names</span></a><span style="color: #009900;">(</span>mdata<span style="color: #009900;">)</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"school"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"year"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"players"</span><span style="color: #009900;">)</span>
mdata<span style="">$</span>year = <a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/as.character"><span style="color: #003399; font-weight: bold;">as.character</span></a><span style="color: #009900;">(</span>mdata<span style="">$</span>year<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/plyr"><span style="">plyr</span></a><span style="color: #009900;">)</span>
comb = ddply<span style="color: #009900;">(</span>mdata<span style="color: #339933;">,</span>.<span style="color: #009900;">(</span>school<span style="color: #339933;">,</span>year<span style="color: #009900;">)</span><span style="color: #339933;">,</span>summarise<span style="color: #339933;">,</span>tot.players = <a href="http://inside-r.org/r-doc/base/sum"><span style="color: #003399; font-weight: bold;">sum</span></a><span style="color: #009900;">(</span>players<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># looking at a subset:</span>
comb2 = comb<span style="color: #009900;">[</span>comb<span style="">$</span>year<span style=""><</span><span style="color: #cc66cc;">2001</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span>
top.sub = <a href="http://inside-r.org/r-doc/base/unique"><span style="color: #003399; font-weight: bold;">unique</span></a><span style="color: #009900;">(</span>comb2<span style="">$</span>school<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>comb2<span style="">$</span>tot.players <span style="">></span> <span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
df2 = comb2<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>comb2<span style="">$</span>school <span style="">%in%</span> top.sub<span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>df2<span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>x = year<span style="color: #339933;">,</span> y = tot.players<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/col"><span style="color: #003399; font-weight: bold;">col</span></a> = school<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_line<span style="color: #009900;">(</span>lwd = <span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span> <span style="">+</span>
facet_grid<span style="color: #009900;">(</span>school<span style="">~</span>.<span style="color: #009900;">)</span>
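<span style="color: #666666; font-style: italic;"># opts() and theme_blank() were removed from ggplot2; theme() and element_blank() replace them.</span>
<span style="color: #666666; font-style: italic;"># the theme has to be added to the plot inside print(), not to the value print() returns.</span>
<a href="http://inside-r.org/r-doc/base/print"><span style="color: #003399; font-weight: bold;">print</span></a><span style="color: #009900;">(</span>p <span style="">+</span> ylab<span style="color: #009900;">(</span><span style="color: #0000ff;">"players in the NBA/ABA"</span><span style="color: #009900;">)</span> <span style="">+</span> theme<span style="color: #009900;">(</span>strip.text.y = element_blank<span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>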
ggsave<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/file"><span style="color: #003399; font-weight: bold;">file</span></a> = <span style="color: #0000ff;">"topCollegeNBA.png"</span><span style="color: #339933;">,</span>height = <span style="color: #cc66cc;">8</span><span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>
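<p>The loop above that fills <code>mat</code> year by year can also be written without an explicit loop. The following is only a sketch: the <code>toy</code> data frame holds made-up rows standing in for the scraped <code>df</code>, and <code>played</code> and <code>years</code> are names introduced here for illustration. It uses base R's <code>outer()</code> to compare every career span against every year, and <code>rowsum()</code> to total the players per school.</p>

```r
# Toy stand-in for the scraped data frame (made-up rows, not real players).
toy <- data.frame(
  school     = c("A", "A", "B"),
  year.start = c(1946, 1950, 1999),
  year.end   = c(1948, 1950, 2003)
)
years <- 1946:2010

# outer() compares every player row against every year at once:
# entry [i, j] is TRUE if player i's career span covers years[j].
played <- outer(toy$year.start, years, "<=") & outer(toy$year.end, years, ">=")

# Convert TRUE/FALSE to 1/0, matching the as.numeric() call in was.playing.func.
mat <- played * 1
colnames(mat) <- years

# Players per school per year: sum the rows that belong to each school.
tot.players <- rowsum(mat, group = toy$school)
print(tot.players[, c("1946", "1950", "2000")])
```

<p>Like <code>was.playing.func</code>, this treats both the start and end year as inclusive. The result is a school-by-year matrix rather than the long data frame that <code>ddply</code> returns, so it would still need reshaping before plotting with ggplot2.</p>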
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-76396978436189731642013-11-06T08:56:00.000-08:002013-11-06T08:56:09.429-08:00Twitter follower counts are log-normal<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgU1k1XPlz-8UIbBQgrh47K6X7_rswybQAt7e5J1innRcqkovv2jFLAC-y5GUBIEsLJzP9FBkH9wKlFd9j6mHhmpTo3FaW3fg62Yot5b9qKGSh8oNcG0nPFh8zJ2srbiDbeKajBpRoxlkAj/s1600/followerDensity.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgU1k1XPlz-8UIbBQgrh47K6X7_rswybQAt7e5J1innRcqkovv2jFLAC-y5GUBIEsLJzP9FBkH9wKlFd9j6mHhmpTo3FaW3fg62Yot5b9qKGSh8oNcG0nPFh8zJ2srbiDbeKajBpRoxlkAj/s640/followerDensity.png" /></a></div>
<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title></title>
</head>
<body>
<p><a href="http://decisionsandr.blogspot.com/2013/11/using-r-to-find-obamas-most-frequent.html">Continuing my investigation</a> of Jeff Gentry's <a href="http://cran.r-project.org/web/packages/twitteR/twitteR.pdf">twitteR package</a>, I decided to look at the distribution of follower counts among Twitter users.</p>
<p>As a rough place to start, I examined the distribution of followers <em>for those who follow me</em>: I first gathered a data frame of all my followers, then looked at how many followers each of those users has. Fortunately, Jeff's package makes this really easy (see my code below).</p>
<p>Since I expected a distribution with a very long right tail, I decided to plot the logarithm of the number of followers.</p>
<p>The result was an almost perfect normal distribution, which suggests that the follower counts themselves are roughly log-normal. This was surprising given my small sample size (I have about 650 followers).</p>
<p>As reference points, I added the log-follower counts of some famous folks (and plotted my own as well).</p>
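<p>One way to go beyond eyeballing the density is to check the logged counts against a normal distribution directly. The sketch below is not part of my original analysis: it uses simulated counts (drawn with <code>rlnorm</code> and made-up parameters) in place of real Twitter data, then draws a normal Q-Q plot and runs a Shapiro-Wilk test on the logged values.</p>

```r
# Simulated stand-in for follower counts (NOT real Twitter data):
# draws from a log-normal distribution with made-up parameters.
set.seed(42)
followers <- round(rlnorm(650, meanlog = 6, sdlog = 1.5))

# Drop zero counts, since log(0) is -Inf.
log.followers <- log(followers[followers > 0])

# If the counts are log-normal, the logged values should look normal:
# the points of a normal Q-Q plot should fall close to the reference line.
qqnorm(log.followers)
qqline(log.followers)

# A formal (if blunt) check: Shapiro-Wilk test on the logged counts.
sw <- shapiro.test(log.followers)
print(sw$p.value)
```

<p>With real data, <code>followers</code> would be replaced by the <code>followers</code> column of the data frame built below. A small p-value would argue against normality of the logged counts; a large one only means the test found no evidence against it.</p>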
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># note: you'll need to save the 'credentials' file, and load</span>
<span style="color: #666666; font-style: italic;"># it before you can access twitter data. </span>
<span style="color: #666666; font-style: italic;"># for help with this, see this post: </span>
<span style="color: #666666; font-style: italic;">#https://sites.google.com/site/dataminingatuoc/home/data-from-twitter/r-oauth-for-twitter</span>
<span style="color: #666666; font-style: italic;"># load the twitteR package, which provides registerTwitterOAuth() and getUser():</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>twitteR<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/load"><span style="color: #003399; font-weight: bold;">load</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"C:/Users/Mark/Documents/twitteR_credentials"</span><span style="color: #009900;">)</span>
registerTwitterOAuth<span style="color: #009900;">(</span>Cred<span style="color: #009900;">)</span>
me = getUser<span style="color: #009900;">(</span><span style="color: #0000ff;">"M_T_Patterson"</span><span style="color: #339933;">,</span> cainfo = <span style="color: #0000ff;">"cacert.pem"</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#this works</span>
<span style="color: #666666; font-style: italic;"># What can I learn about a user?</span>
me<span style="">$</span>getFavorites<span style="color: #009900;">(</span>cainfo = <span style="color: #0000ff;">"cacert.pem"</span><span style="color: #009900;">)</span>
fl = me<span style="">$</span>getFollowers<span style="color: #009900;">(</span>cainfo = <span style="color: #0000ff;">"cacert.pem"</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>name = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>fl<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span> x<span style="">$</span>screenName<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
id = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>fl<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span> x<span style="">$</span>id<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
<span style="color: #666666; font-style: italic;">#last.tweet.date = sapply(fl,function(x) x$lastStatus$created),</span>
followers = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>fl<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span> x<span style="">$</span>followersCount<span style="color: #009900;">)</span><span style="color: #339933;">,</span>
location = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>fl<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span> x<span style="">$</span>location<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># sorting by number of followers:</span>
df.f = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/order"><span style="color: #003399; font-weight: bold;">order</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>followers<span style="color: #339933;">,</span>decreasing = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/utils/head"><span style="color: #003399; font-weight: bold;">head</span></a><span style="color: #009900;">(</span>df.f<span style="color: #339933;">,</span><span style="color: #cc66cc;">50</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#(Not run)</span>
<span style="color: #666666; font-style: italic;">#library(ggplot2)</span>
<span style="color: #666666; font-style: italic;">#p = ggplot(df.f, aes(x = log(followers))) + geom_density()</span>
<span style="color: #666666; font-style: italic;">#p + geom_text(aes(log(refs$followers), y = 0.3, label = refs$name, fill = "blue", size = 5))</span>
<span style="color: #666666; font-style: italic;"># this is interesting -- it looks like a log-normal distribution.</span>
<span style="color: #666666; font-style: italic;"># adding some references:</span>
refs = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span>
name = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"Graduate Student:<span style="color: #000099; font-weight: bold;">\n</span>Mark Patterson"</span><span style="color: #339933;">,</span>
<span style="color: #0000ff;">"Famous R Statistician:<span style="color: #000099; font-weight: bold;">\n</span>Hadley Wickham"</span><span style="color: #339933;">,</span>
<span style="color: #0000ff;">"Famous Journalist:<span style="color: #000099; font-weight: bold;">\n</span>Thomas Friedman"</span><span style="color: #339933;">,</span>
<span style="color: #0000ff;">"Famous Heartthrob:<span style="color: #000099; font-weight: bold;">\n</span>Justin Bieber"</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>
followers = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">656</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">5446</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">234686</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">46602072</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># a bit more on the density of the distribution at various points:</span>
dens = <a href="http://inside-r.org/r-doc/stats/density"><span style="color: #003399; font-weight: bold;">density</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/log"><span style="color: #003399; font-weight: bold;">log</span></a><span style="color: #009900;">(</span>df.f<span style="">$</span>followers<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
refs<span style="">$</span>log.followers = <a href="http://inside-r.org/r-doc/base/log"><span style="color: #003399; font-weight: bold;">log</span></a><span style="color: #009900;">(</span>refs<span style="">$</span>followers<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># find.closest dens.value:</span>
dens.lookup = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>val<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
dens<span style="">$</span>y<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which.min"><span style="color: #003399; font-weight: bold;">which.min</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/abs"><span style="color: #003399; font-weight: bold;">abs</span></a><span style="color: #009900;">(</span>val <span style="">-</span> dens<span style="">$</span>x<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span>
<span style="color: #009900;">}</span>
refs<span style="">$</span>dens = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>refs<span style="">$</span>log.followers<span style="color: #339933;">,</span> <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span>dens.lookup<span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">}</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>df.f<span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>x = <a href="http://inside-r.org/r-doc/base/log"><span style="color: #003399; font-weight: bold;">log</span></a><span style="color: #009900;">(</span>followers<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_density<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
p <span style="">+</span> geom_text<span style="color: #009900;">(</span>data = refs<span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>x = log.followers<span style="color: #339933;">,</span> y = dens<span style="color: #339933;">,</span> label = name<span style="color: #009900;">)</span><span style="color: #339933;">,</span> size = <span style="color: #cc66cc;">5</span><span style="color: #339933;">,</span> color = <span style="color: #0000ff;">"blue"</span><span style="color: #009900;">)</span> <span style="">+</span>
theme<span style="color: #009900;">(</span>legend.position = <span style="color: #0000ff;">"none"</span><span style="color: #009900;">)</span> <span style="">+</span> scale_x_continuous<span style="color: #009900;">(</span>limits = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">20</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span>
labs<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/title"><span style="color: #003399; font-weight: bold;">title</span></a> = <span style="color: #0000ff;">"Density of log(Followers) on twitter"</span><span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>
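<p>The <code>dens.lookup</code> helper above returns the density at the nearest grid point produced by <code>density()</code>. As a rough sketch of an alternative (not part of the original analysis), the same lookup can be done by linear interpolation with <code>approx()</code>. The follower counts here are simulated stand-ins for <code>df.f$followers</code>, and <code>dens.interp</code> is a made-up name:</p>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;">set.seed(1)
# simulated follower counts (assumption: roughly log-normal, like the real data)
followers = round(rlnorm(500, meanlog = 6, sdlog = 1.5)) + 1
dens = density(log(followers))
# nearest grid point, as in the lookup above
dens.lookup = function(val){
  dens$y[which.min(abs(val - dens$x))]
}
# linear interpolation between neighbouring grid points
dens.interp = function(val){
  approx(x = dens$x, y = dens$y, xout = val)$y
}
ref = log(656)
c(nearest = dens.lookup(ref), interpolated = dens.interp(ref))</pre></div></div>
<p>With the default 512-point grid the two values should be nearly identical, so the nearest-point version is fine for placing labels on a plot. Note that <code>approx()</code> returns <code>NA</code> for values outside the range of the grid.</p>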
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com3tag:blogger.com,1999:blog-8973439534644845561.post-49245252876226925972013-11-04T14:42:00.000-08:002013-11-04T14:42:21.513-08:00Using R to find Obama's most frequent twitter hashtags<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbsYLMRqC_kbHXuFL25jcTcZp0mkRVxw9Lu84To-axXXhiz2GRbLf_D42rK4WMwS5K1kEBnY-9hXsRxBcxygguKfO8U6Bp3AFJlVZkmQJdFQA0t-mX8kN8_cBIzHzxOwQPmVJSPf_jJ-NC/s1600/ObamaTweetFreqs.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbsYLMRqC_kbHXuFL25jcTcZp0mkRVxw9Lu84To-axXXhiz2GRbLf_D42rK4WMwS5K1kEBnY-9hXsRxBcxygguKfO8U6Bp3AFJlVZkmQJdFQA0t-mX8kN8_cBIzHzxOwQPmVJSPf_jJ-NC/s640/ObamaTweetFreqs.png" /></a></div>
<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title></title>
<style type="text/css">
body, td {
font-family: sans-serif;
background-color: white;
font-size: 12px;
margin: 8px;
}
tt, code, pre {
font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
}
h1 {
font-size:2.2em;
}
h2 {
font-size:1.8em;
}
h3 {
font-size:1.4em;
}
h4 {
font-size:1.0em;
}
h5 {
font-size:0.9em;
}
h6 {
font-size:0.8em;
}
a:visited {
color: rgb(50%, 0%, 50%);
}
pre {
margin-top: 0;
max-width: 95%;
border: 1px solid #ccc;
white-space: pre-wrap;
}
pre code {
display: block; padding: 0.5em;
}
code.r, code.cpp {
background-color: #F8F8F8;
}
table, td, th {
border: none;
}
blockquote {
color:#666666;
margin:0;
padding-left: 1em;
border-left: 0.5em #EEE solid;
}
hr {
height: 0px;
border-bottom: none;
border-top-width: thin;
border-top-style: dotted;
border-top-color: #999999;
}
@media print {
* {
background: transparent !important;
color: black !important;
filter:none !important;
-ms-filter: none !important;
}
body {
font-size:12pt;
max-width:100%;
}
a, a:visited {
text-decoration: underline;
}
hr {
visibility: hidden;
page-break-before: always;
}
pre, blockquote {
padding-right: 1em;
page-break-inside: avoid;
}
tr, img {
page-break-inside: avoid;
}
img {
max-width: 100% !important;
}
@page :left {
margin: 15mm 20mm 15mm 10mm;
}
@page :right {
margin: 15mm 10mm 15mm 20mm;
}
p, h2, h3 {
orphans: 3; widows: 3;
}
h2, h3 {
page-break-after: avoid;
}
}
</style>
</head>
<body>
<p>I've been exploring Jeff Gentry's <a href="http://cran.r-project.org/web/packages/twitteR/twitteR.pdf">twitteR package</a>, which has a ton of great functionality for interacting with twitter data in R. Today, I thought a bit about a problem I've noticed several times on twitter: users' profiles are often only noisy signals of the content they tweet about!</p>
<p>I decided that a table of a user's most commonly used hashtags might give a better sense of the content a user tweets about. My code to extract the hashtags is below (note: you'll need to load the twitteR package and complete the OAuth authentication first. If you're having trouble with this, try visiting <a href="https://dev.twitter.com/discussions/1596">this page</a>).</p>
<p>Here's the code I used:</p>
<p><div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;">tw = userTimeline<span style="color: #009900;">(</span><span style="color: #0000ff;">"BarackObama"</span><span style="color: #339933;">,</span> cainfo = x1<span style="color: #339933;">,</span> n = <span style="color: #cc66cc;">3200</span><span style="color: #009900;">)</span>
tw = twListToDF<span style="color: #009900;">(</span>tw<span style="color: #009900;">)</span>
vec1 = tw<span style="">$</span>text
extract.hashes = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>vec<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
hash.pattern = <span style="color: #0000ff;">"#[[:alpha:]]+"</span>
have.hash = <a href="http://inside-r.org/r-doc/base/grep"><span style="color: #003399; font-weight: bold;">grep</span></a><span style="color: #009900;">(</span>x = vec<span style="color: #339933;">,</span> pattern = hash.pattern<span style="color: #009900;">)</span>
hash.matches = <a href="http://inside-r.org/r-doc/base/gregexpr"><span style="color: #003399; font-weight: bold;">gregexpr</span></a><span style="color: #009900;">(</span>pattern = hash.pattern<span style="color: #339933;">,</span>
<a href="http://inside-r.org/r-doc/graphics/text"><span style="color: #003399; font-weight: bold;">text</span></a> = vec<span style="color: #009900;">[</span>have.hash<span style="color: #009900;">]</span><span style="color: #009900;">)</span>
extracted.hash = regmatches<span style="color: #009900;">(</span>x = vec<span style="color: #009900;">[</span>have.hash<span style="color: #009900;">]</span><span style="color: #339933;">,</span> m = hash.matches<span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/base/data.frame"><span style="color: #003399; font-weight: bold;">data.frame</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/table"><span style="color: #003399; font-weight: bold;">table</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/tolower"><span style="color: #003399; font-weight: bold;">tolower</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span>extracted.hash<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/colnames"><span style="color: #003399; font-weight: bold;">colnames</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span> = <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"tag"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"freq"</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a> = <a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/order"><span style="color: #003399; font-weight: bold;">order</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="">$</span>freq<span style="color: #339933;">,</span>decreasing = <span style="color: #000000; font-weight: bold;">TRUE</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #009900;">]</span>
<a href="http://inside-r.org/r-doc/base/return"><span style="color: #003399; font-weight: bold;">return</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/stats/df"><span style="color: #003399; font-weight: bold;">df</span></a><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
dat = <a href="http://inside-r.org/r-doc/utils/head"><span style="color: #003399; font-weight: bold;">head</span></a><span style="color: #009900;">(</span>extract.hashes<span style="color: #009900;">(</span>vec1<span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">50</span><span style="color: #009900;">)</span>
dat2 = <a href="http://inside-r.org/r-doc/base/transform"><span style="color: #003399; font-weight: bold;">transform</span></a><span style="color: #009900;">(</span>dat<span style="color: #339933;">,</span>tag = <a href="http://inside-r.org/r-doc/stats/reorder"><span style="color: #003399; font-weight: bold;">reorder</span></a><span style="color: #009900;">(</span>tag<span style="color: #339933;">,</span>freq<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>dat2<span style="color: #339933;">,</span> aes<span style="color: #009900;">(</span>x = tag<span style="color: #339933;">,</span> y = freq<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_bar<span style="color: #009900;">(</span>stat = <span style="color: #0000ff;">"identity"</span><span style="color: #339933;">,</span> fill = <span style="color: #0000ff;">"blue"</span><span style="color: #009900;">)</span>
p <span style="">+</span> coord_flip<span style="color: #009900;">(</span><span style="color: #009900;">)</span> <span style="">+</span> labs<span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/graphics/title"><span style="color: #003399; font-weight: bold;">title</span></a> = <span style="color: #0000ff;">"Hashtag frequencies in the tweets of the Obama team (@BarackObama)"</span><span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p></p>
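<p>To try the extraction step without a Twitter connection, here is a minimal sketch that applies the same regular expression to a few made-up tweets (the example strings and the <code>wide.pattern</code> variant are my own illustrations, not part of the analysis above):</p>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"># made-up tweets standing in for tw$text
tweets = c("Time to act. #ActOnClimate #GetCovered",
           "No tags in this one",
           "Enroll today: #getcovered")
hash.pattern = "#[[:alpha:]]+"
# gregexpr finds every match per tweet; regmatches pulls out the matched text
tags = unlist(regmatches(tweets, gregexpr(hash.pattern, tweets)))
counts = sort(table(tolower(tags)), decreasing = TRUE)
counts
# the letters-only pattern stops at the first digit or underscore:
short = regmatches("#Obama2012", regexpr(hash.pattern, "#Obama2012"))
# a wider pattern (an assumption about what should count as a tag) keeps them:
wide.pattern = "#[[:alnum:]_]+"
full = regmatches("#Obama2012", regexpr(wide.pattern, "#Obama2012"))
c(short, full)</pre></div></div>
<p>Lower-casing before tabulating is what merges <code>#GetCovered</code> and <code>#getcovered</code> into a single row. Whether digits should be part of a tag is a judgment call; the letters-only pattern is the one used in the function above.</p>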
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com3tag:blogger.com,1999:blog-8973439534644845561.post-77434307542593188702013-11-03T07:10:00.001-08:002013-11-03T07:10:41.774-08:00Simulating Abstract Art with R<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title></title>
</head>
<body>
<p>Piet Mondrian <em>Composition with Red, Blue, Black, Yellow, and Gray (1921)</em>:</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSZDE7F0o9FvzislsCpKIGKo21KTuSZUW50_n-R3TMRVdpKjFxCCkX9CjrpnsnRiEkTRkQcbOEWxDFIHFZbT7mb236rL4Zm_AO9fFfEK7F-e6fDRbZD0yMDc05NL24-r_668SEJUVrEFdr/s1600/mondrian.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSZDE7F0o9FvzislsCpKIGKo21KTuSZUW50_n-R3TMRVdpKjFxCCkX9CjrpnsnRiEkTRkQcbOEWxDFIHFZbT7mb236rL4Zm_AO9fFfEK7F-e6fDRbZD0yMDc05NL24-r_668SEJUVrEFdr/s400/mondrian.jpg" /></a></div>
<p>An example draw from my simulation function: </p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaUztpG_1N_dHXamT0ZQGdyAiZruAFV2uhwiKmeQtk11MsQd2fOmLbPBaCNj4J12zrbKZ7zQhx_8MWVtQ0CwjsKLJesfnEeIIXvdm5kkN1L4d1JXia63pVUvBiQpibz5uWKM_076lj9eGu/s1600/mondrian1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaUztpG_1N_dHXamT0ZQGdyAiZruAFV2uhwiKmeQtk11MsQd2fOmLbPBaCNj4J12zrbKZ7zQhx_8MWVtQ0CwjsKLJesfnEeIIXvdm5kkN1L4d1JXia63pVUvBiQpibz5uWKM_076lj9eGu/s640/mondrian1.png" /></a></div>
<p>We're in the midst of planning our spring course on Empirical Research Methods, and as a result, I've found myself spending a lot of time thinking about some of the <em>first ideas</em> in statistics – for example, that the data we observe are actually draws from some (usually unobserved) generating function.</p>
<p>On a recent museum trip, I started thinking about how we could apply this idea to abstract art. Here, I thought, we could think of a particular painting as a single manifestation of some set of generating rules. </p>
<p>Piet Mondrian has a ton of work in a style I thought would be interesting to explore – <a href="https://www.google.com/search?q=piet+mondrian+gallery&client=firefox-a&hs=MXK&rls=org.mozilla:en-US:official&tbm=isch&tbo=u&source=univ&sa=X&ei=9Vd2UvrBOPPOsASzs4CwBA&ved=0CD4QsAQ&biw=1280&bih=654">his work</a> involves combinations of a small number of lines and filled rectangles.</p>
<p>In order to experiment, I decided to write a function which would simulate <em>Composition with Red, Blue, Black, Yellow, and Gray (1921)</em>, featured above.</p>
<p>My first idea was to maintain the same set of lines and colors, while varying (slightly) the locations and widths of those lines – we can think of these locations and widths as parameters drawn from specified distributions. To start, I only use uniform distributions, with means near the values in the original.</p>
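<p>As a minimal sketch of that idea (not the code used for the figures; the helper name <code>draw.near</code> is a hypothetical of mine), each location or width can be drawn from a uniform distribution centered on a chosen value. The two ranges used here, 40 to 60 for the left edge and 2 to 3 for the width, are the ones the full function uses for its central vertical line:</p>
<pre><code class="r"># hypothetical helper: one uniform draw centered on `center`,
# allowed to move at most `spread` in either direction
draw.near = function(center, spread) {
    runif(1, center - spread, center + spread)
}

set.seed(1921)  # fix the seed so a particular draw can be reproduced
left.edge = draw.near(50, 10)     # same range as runif(1, 40, 60)
line.width = draw.near(2.5, 0.5)  # same range as runif(1, 2, 3)
right.edge = left.edge + line.width
c(left = left.edge, right = right.edge)
</code></pre>
<p>A wider <code>spread</code> lets the draws wander further from the original; a spread of zero always returns the center itself.</p>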
<p>For the color, I found <a href="http://www.yasoypintor.com/abstract-painting-masters-rational-vs-emotional-painting/">a site</a> that extracts hexadecimal colors, which R recognizes.</p>
<p>The result, I think, is interesting. While the variation is still quite constrained, this lets us examine the abstract art... abstractly (ha!)</p>
<p>Here are a few more draws from the function:</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggVFTvbb8Kz2r4DCWeWcQtwAlyW7QfM1LrckuuKsZ0Bt-OLRAy5Js3HvDgdnuRvKwWE3MwDAazu6xnKIfSalRwIhKYljuZGFm6zP1snsYBNOwGzGYH8YLmQiNu_SCsX85uq2XWzx_OA8ZF/s1600/mondrian2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggVFTvbb8Kz2r4DCWeWcQtwAlyW7QfM1LrckuuKsZ0Bt-OLRAy5Js3HvDgdnuRvKwWE3MwDAazu6xnKIfSalRwIhKYljuZGFm6zP1snsYBNOwGzGYH8YLmQiNu_SCsX85uq2XWzx_OA8ZF/s320/mondrian2.png" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjakqencbUGcu7k1b0nbReQs15TcDr5MCnzyf7s2T5ZJVsuk-EGsjURLQ8YE8wBYL7LfdY3teJWJZw2kpU7sBNO6Egxm88tpBhx1zG0xxfUNT8k5v6QlujTCYx3meGJR2d0NXCAXPiUf6Qf/s1600/mondrian3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjakqencbUGcu7k1b0nbReQs15TcDr5MCnzyf7s2T5ZJVsuk-EGsjURLQ8YE8wBYL7LfdY3teJWJZw2kpU7sBNO6Egxm88tpBhx1zG0xxfUNT8k5v6QlujTCYx3meGJR2d0NXCAXPiUf6Qf/s320/mondrian3.png" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaLlQimluug_hb_R4XdUlt7NIVBOjDf08nCcc0xBjefO6N_X65nOezAOneLUIWBvj8CSIjoD3yAJhzehrE-zX4ajfNGfcAZmGDHfRp0KgbY9X0225cLQ07FxgfJmEnZDct9aWSvfFT9Yjj/s1600/mondrian4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaLlQimluug_hb_R4XdUlt7NIVBOjDf08nCcc0xBjefO6N_X65nOezAOneLUIWBvj8CSIjoD3yAJhzehrE-zX4ajfNGfcAZmGDHfRp0KgbY9X0225cLQ07FxgfJmEnZDct9aWSvfFT9Yjj/s320/mondrian4.png" /></a></div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiA_cq4s40NuKLPPHfNq4JMITPOVbRxZrmUav_0P1Q9duL3tP6DBTkzYjFUZdm6Fns0YM0c5cls5Kx0le1kfBLOSurbzKws2XVKxzOQv2Hwp9mDS5Wl3d9zfiyckEQMSivA-B1luLv5J0KF/s1600/mondrian5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiA_cq4s40NuKLPPHfNq4JMITPOVbRxZrmUav_0P1Q9duL3tP6DBTkzYjFUZdm6Fns0YM0c5cls5Kx0le1kfBLOSurbzKws2XVKxzOQv2Hwp9mDS5Wl3d9zfiyckEQMSivA-B1luLv5J0KF/s320/mondrian5.png" /></a></div>
<p>Here is the code for the simulation (it's a bit of a mess). Feel free to download, experiment with, and improve it!</p>
<pre><code class="r">sim.func = function() {
# pick a place (near the middle) for central anchor line:
left.anchor = runif(1, 40, 60)
width = runif(1, 2, 3)
right.anchor = left.anchor + width
# plot:
plot(0, 0, type = "n", xlim = c(0, 100), ylim = c(0, 10), xaxt = "n", yaxt = "n",
ann = FALSE)
polygon(c(left.anchor, left.anchor, right.anchor, right.anchor), c(-1, 11,
11, -1), col = "#1c1b23")
upper.left = runif(1, 7, 9)
lower.left = runif(1, 2, 5)
small.line.height = runif(1, 0.1, 0.2)
polygon(c(-100, -100, left.anchor, left.anchor), c(lower.left, lower.left +
small.line.height, lower.left + small.line.height, lower.left), col = "#1c1b23")
polygon(c(-100, -100, left.anchor, left.anchor), c(upper.left, upper.left +
small.line.height, upper.left + small.line.height, upper.left), col = "#1c1b23")
polygon(c(-10, -10, left.anchor, left.anchor), c(upper.left + small.line.height,
12, 12, upper.left + small.line.height), col = "#cbccd4")
polygon(c(-10, -10, left.anchor, left.anchor), c(-1, lower.left, lower.left,
-1), col = "#cbccd4")
upper.right = runif(1, 7, 9)
polygon(c(left.anchor, left.anchor, 200, 200), c(upper.right, upper.right +
small.line.height, upper.right + small.line.height, upper.right), col = "#1c1b23")
polygon(c(left.anchor + width, left.anchor + width, 200, 200), c(upper.right +
small.line.height, 20, 20, upper.right + small.line.height), col = "#db5b2c")
polygon(c(-10, -10, left.anchor, left.anchor), c(lower.left + small.line.height,
upper.left, upper.left, lower.left + small.line.height), col = "#273f70")
lowest.left = runif(1, 0.1, 1.2)
polygon(c(-100, -100, left.anchor, left.anchor), c(lowest.left, lowest.left +
small.line.height, lowest.left + small.line.height, lowest.left), col = "#1c1b23")
lh.ref = runif(1, 5, 20)
polygon(c(lh.ref, lh.ref, lh.ref + width, lh.ref + width), c(lowest.left,
upper.left, upper.left, lowest.left), col = "#1c1b23")
lh.mid = lower.left + runif(1, 0.1, 0.4) * (upper.left - lower.left)
polygon(c(lh.ref, lh.ref, 200, 200), c(lh.mid, lh.mid + small.line.height,
lh.mid + small.line.height, lh.mid), col = "#1c1b23")
lowest.right = runif(1, 0.1, 1.6)
polygon(c(left.anchor + width, left.anchor + width, 105, 105), c(lowest.right,
lowest.right + small.line.height, lowest.right + small.line.height,
lowest.right), col = "#1c1b23")
rh.ref = runif(1, 75, 90)
polygon(c(left.anchor + width, left.anchor + width, rh.ref, rh.ref), c(-1,
lowest.right, lowest.right, -1), col = "#1c1b23")
polygon(c(rh.ref, rh.ref, 200, 200), c(-1, lowest.right, lowest.right, -1),
col = "#dccd1e")
polygon(c(right.anchor, right.anchor, 200, 200), c(lowest.right + small.line.height,
lh.mid, lh.mid, lowest.right + small.line.height), col = colors()[358])
polygon(c(right.anchor, right.anchor, 200, 200), c(lh.mid + small.line.height,
upper.right, upper.right, lh.mid + small.line.height), col = "#cbccd4")
top.left = runif(1, 0.1, 0.3) * (10 - upper.left) + upper.left + small.line.height
polygon(c(-100, -100, left.anchor, left.anchor), c(top.left, top.left +
small.line.height, top.left + small.line.height, top.left), col = "#1c1b23")
polygon(c(-100, -100, left.anchor, left.anchor), c(upper.left + small.line.height,
top.left, top.left, upper.left + small.line.height), col = colors()[358])
}
</code></pre>
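<p>To look at several draws side by side, something like the following should work. This is a sketch rather than the code used for the images in this post; <code>draw.grid</code> is a hypothetical helper name, and it assumes the plotting function takes no arguments, as <code>sim.func</code> does:</p>
<pre><code class="r"># hypothetical helper: call a no-argument plotting function n times,
# arranging the results in a two-row grid
draw.grid = function(draw.fn, n = 4, seed = 1) {
    set.seed(seed)  # makes the set of draws reproducible
    old.par = par(mfrow = c(2, ceiling(n / 2)), mar = c(1, 1, 1, 1))
    on.exit(par(old.par))  # restore the previous layout afterwards
    for (i in seq_len(n)) draw.fn()
    invisible(n)
}

# usage, once sim.func has been defined as above:
# draw.grid(sim.func, n = 4)
</code></pre>
<p>Changing <code>seed</code> gives a different set of paintings, while the same seed always gives the same set.</p>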
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-3957690210423513402013-11-02T09:57:00.001-07:002013-11-02T09:57:48.162-07:00First Looks at R: Downloading R and R Studio<iframe width="560" height="315" src="//www.youtube.com/embed/2qOCMVfKKC0?rel=0" frameborder="0" allowfullscreen></iframe>Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-85032651980573336532013-11-01T05:40:00.002-07:002013-11-01T05:41:06.135-07:00Quarterback Wonderlic Scores by Institution (Academic) Strength<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title></title>
</head>
<body>
<pre><code>## geom_smooth: method="auto" and size of largest group is <1000, so using
## loess. Use 'method = x' to change the smoothing method.
</code></pre>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAAA9lBMVEUAAAAAADoAAGYAOmYAOpAAZrYzZv86AAA6ADo6AGY6OgA6Ojo6OmY6OpA6ZrY6kJA6kNtmAABmADpmAGZmOgBmOjpmOpBmZjpmZmZmkJBmtv9/f39/f5V/f6t/lcF/q9aQOgCQOjqQOmaQZgCQkDqQkGaQtpCQ27aQ29uQ2/+Vf3+VlcGVweurf6urlZWr1v+2ZgC2Zjq2tma225C2/7a2/9u2///BlX/BlZXBlavBwdbB6//Wq3/W///bkDrb29vb/7bb/9vb///l5eXrwZXr1qvr///y8vL/tmb/tpD/1qv/25D/68H//7b//9b//9v//+v////ZesTXAAAACXBIWXMAAAsSAAALEgHS3X78AAAbrUlEQVR4nO2dC3/byHXFKdWS7DbeSFa6zYpeW0pqp0rliNv1auNYYhpJtVmxtITv/2U6T7wxmAvMBUDgnJ+XK4LAnJn5Yy7mAYKzCJqkZn1nAOpHAD9RAfxEBfATFcBPVAA/UQH8RAXwExXAT1QAP1EB/EQF8BMVwE9UAD9RAfxEBfATFcBPVAA/UY0J/OPpbDbbE39sjsQfs90rtXUht6xmh+J1+ey27LiV2dNq8+pc/GPObN8aEfj1gYS7EBQ3RyfyL41Z0V78fk+eGIelB5aA585r/xoPeItVtHAN3vBcv7iKHs/+dHylgMqoIM6Ezav3B7OdcxUcdt6JHc32x7N3s50/6Ra/PpjZCLGcqZ3NlvS+5+bN1mk84CVg83/T4k/U+8fTk2j98h/y9cXV4+meCgWbIwFb/F+eLfJvddYs5Xt1VkjwIhF7LslTSPxnttg01L72wL5K3VjjAb8yta+YyWv8zAT2xWG0OoyW4t+ePjsEQHVqCJYqKtj/i1NEcdTgUxcA82cqhET2tLAHdl7cthopeAliuaMv1as92fjXL2/F60qH6kMLXrVVQXI506eKQqjBp5qxDOcnccO2aah97YE9FLidxgO+EOr1q9zwUV7gj385O68GbyiXgxdaiMt5KfgtjPJK4wGvI+9e0rmz4B/P3r+UHfsfxKsz1EcZ8Lm+vki+EOpPiiOCrdF4wOvhnGyZBnncGBf/ogbx8jXp3Glo8r3u3ImdV7KPHoOXuyz3VAKS7vq53ZJ07k5kgvrA3krdVCMCrydw5MArM4EjZ2/kebA+OInioZgFL9/Hw7md83SLl4M3m8Ripi7yZosdzp3aBLeQ+6jAG/3vFmLoXiMED/kI4CcqgJ+oAH6iGiH4v7feIbjqHf9e3JE3m52Ct1MqWS2cE93lMyTJyml+DbVq7TWzQ2lGVo6ZV/95mpI981kqLvuqPL06T+9oNnm60tU/eDNJUqXaKl+/zEyaNge/cGQkKPjyPOV2rD+onToG/4OaAlFVbIBvjq/UzIidIVGT5+ZtbrF8/fzdbLa3knfZpNfLF6aG1KK53HUvs1KeW3r/Re6gMmKnXcx+CzPlY+d2nEv252ajfpPLbzw7pLerLEWpdXs1QWQSTzK9efUfOmvqYJ3NkhsIQqlj8KI2lnpm3C5lSv6qsIt4wdu+zS+Wrw/2RDzeU7OnqfVysyxnFs31inlqpdwuvW+ODgWJJFV7h46dgbUtPgbvXLKPE7PTgKn8Jikk29Pr9irzOvFUpk2oTyaTzabsDQTBWHQf6s0SiVlMSyJaaqkk/Ta1WC4nXeWhMoXUIoq5Epr3cS3bo+L1GFlrS4snWV+xay5F8I4l+3RihfymU7cM0+v2duE4KXE1+PyqUjAW3YKXiGSJRB0vU9Wsb4/cjRdH029Ti+WipaskNPh4Z1OdZtFcg0+tlMcrsOqmyyJ4u8paBb50yT5JLJ99uyqsUtfbZeLpdfsU+FSmy8Hn15GDsegH/PrFLybSq02ifg7jOs69TS2WV4K3rW6hrpcWvL2ihgKfXrLPg
M/nN0lBb1fgU1E6BT6V6VGDN6FelOQHc9uELpAsXRL4Mm9TV4As+LgWUjc+qdsjDHj7edBQb+2SxPLZTx9ptseh3lZDBrzN9KhDfdxDWc7M2Cm+HK8PduyCt32bXyzPgo/Xy1PXeLFHDN4elanJoyL4fOdOvn88TfCVLtkniZ1EJfnVKSTbDzPr9rlrvMm0AW8PLnTuthq8HJOomKcXx6UUOzmU+ih7fHrB277NLZZnwcfr5fZmO7NoLs6pzEp5XGVqsCX2XcZjJtuE9UDLjuPlfu9TV+qyJXubmH6Tz69JId6+NMM5MxxLt/gk07JIckd78DI7nNtq8InsDXIhvr2wIMx0rJ6FGxCFTaxr9QR+mcCqmbmrVW7mrlIr1VoDTYcFTawXVYO/ns9ffxav4iW01gepttLypnT/iLEMeRt00MT6UCX4p79K4A9vvoh/ZtNdQSWb6vSVfgjdpROTBi591pcv+G9/+fP8bXR/EX37cCPezoXan2bQYFQJ/uHHm+j+8v4yevp0Yzb5nltOocVzm7Rs8Qr+RdziAT6EyzaAf7iIRIvHNT6oyzaAl736iyjTq/e1cArguU0ChPqcfC2cAnhuE4BnNgF4kgCe2wTgmU0AniSA5zYBeGYTgCcJ4LlNAJ7ZBOBJAnhuE4BnNgF4kgCe26Qj8LNZTwXp3WTa4OXXQ/opSO8mAN9PQXo3mTZ4hPrBmaBzx2wC8CQBPLcJwDObADxJAM9tAvDMJgBPEsBzmwA8swnAkwTw3CYAz2wC8CQBPLcJwDObADxJAM9tAvDMJgBPEsBzmwA8swnAkwTw3CYAz2wC8CQBPLcJwDObADxJAM9tAvDMJgBPEsBzmwA8swnAkwTw3CYAz2wyHfBfCyrZxKBOXMZjUuGCFh/KZDot3tfCKYDnNgF4ZhOAJwnguU0AntkE4EkCeG4TgGc2AXiSAJ7bBOCZTQCeJIDnNgF4ZhOAJwnguU0AntkE4EkCeG4TgGc2AXiSegTvfAhnwaT+kZ0AT1F/4N2P3c2beDykF+ApAniaAL69CUL9RMEHNgF4kgCe2wTgmU0AniSA5zYBeGYTgCdpcOBN7x3gJwbejtcBHuAZXIwAniSE+hAuAB/KBOBJAnhuE4BnNgF4kgCe2wTgmU0AnqRwTBzrpgBfsnUs4F13SgB8yVaAJ5g4BfAUIdRzm4wefN8mAE8SwHObADyzCcCTBPDcJgDPbDJe8PcXUXQ9f/0Z4DOKRxONwNffrp9VD+Af5hfRw5sv4h/Ap5TMHzQB7/EFnay6B/9///k/F7LRf/twI97NhRzBYUKS6Po6OqAqcyF4Pwjwl9HTpxuzyffccmrrW/zYQ/29bOMXcYsH+BAuW9K5e8A1PrDL9oBHrz6oy5aAz8vXwimA5zYBeGYTgL+r7aumPp4yeGqPvpFJp+BrpiXSH08YPHnyponJHcBzmwD8HUK9n8YX6gmaMvgRdu78BfDcJgDPbALwJAE8twnAM5sAPEnNCpLvD9f0j3sDX5Ovdia+g4IRgc+PgOtGxH2Br8tXKxPvaQCAp5mQBfDNCuIthHqvxGtcthP8EE3QuSMJ4LlNAJ7ZBOBJAnhuE4BnNgH4O8qCI8Bzm9xFpTRYwBNuMQB4bpMKGgAfygTg7xDqh2TSZagnCOC5TdCrZzYBeJIAntsE4JlNAJ4kgOc2mQj40h5ssK6wUwBPUWDw5WPWYINfpwCeIoDnNpkGeIR6X5exge/PBOBJAnhuE4BnNgF4kgCe2wTgmU0A3leq3+woSFW/muRSZ1LlVDCp7+U3Al+SrNNpBOD1SLm6IJUjaYpLnUmlU97EY1zfBHxJsm6nIYD/WlDJpmrJAjb6nOJSZ1K5U97EKyGqorJkgzuV11d/LR6hfqKhXgudO24TgGc2AXiSAJ7bBOCZTQCeJIDnNpkY+EzXGOBLto4TfHYWBOBLtgJ8Y5OcAJ4ihHpuk4mB794E4EkCeG4TgGc2AXiSAJ7bBOCZTQCeIt35p
t3DXuFSnojXsqy/iUsA7y893CZ+a6XcpTwRrztw/E2cAnh/ATy/ySDBI9TzmwwTPDp37CYAz2wC8CQBPLcJwDObADxJAM9t0gn42T41V6mCeP3eiNgYkX+VNcSDERrdV9/op0m28L56MWimkk8K4vULQ3Jj6RePPE28Rf29o1KXRj9GNPyvUJXnmUge4GkHDRO8jI+NwSPU+xzUEfjH09mz/3517g1evtDIo3PHbdIE/OPp4frl7erZLQU8jTzAc5s0Ab85vhLgxSsJPIk8wHObNG/xS2KLJ5EHeG6Tptf42ayCu2sCx588wHObdDpz5z+oqylIroer3vbTq+dwGWyvvurqXgeeMJx3FyQ3ptVvexnHs7gMdhz/eFYxkqsH70se4GnOHbX4I5mL2S6xV6+jlR95hPqCBhDq3aqz8CKPzh23SQ/Lsj7kAZ7bpNPhnBbAD8Gk4QSOeKVP4Fh5kAd4bpPmwznylG2ievIAz23StsVfz+eX8vX1ZwJ4J/nqW95rV7GTHZIb8wPfVx+kV99oWbbOhJqzdtf4h7fRtw83D2++iH8E8A7y1V9yqRlDp8fxyVdxAn+TJsg4vtGNGHUm5Jy17tUL8PcX8lX8PRdy7JrSftUH6mYa4ieFHfRfFnyDpOi56zqRThJ1gb+eX0T3l9HTpxuzwfPcqmzzCPXDDfWr2Um03LHztvcXcYsngHf38NC54zZp1KtXd12tX4he/cOFBE+/xku5yAM8t0mzXv2JbPa2V/+W3qvXcpAHeG6TRqFerdJUrNEQwDvIAzy3Sa9foQL4/kz6/e5cQj7XIacWRN9XT1TWxKtX7GfS8vmZviVJ2XQEfvnsdjmbnbQHH5PPz70QC6K/SUNVxsRvAsTLJJsUG/i0TWe9evFP9erbgrfkAZ5q0gv44yvR5sOAT8gn5WlQEIT6QC5O8NFytnO+ChLqhfbLunjo3HGb9Nu50yohD/DcJkMAX0Ie4LlNBgG+SB7guU2GAb5AHuC5TQYCPj+VQ7uv3rdXnzusrrZK+vk93lcf2CTAsmwQ8LmpHNI3aXzH8fnDas+uIuX+vkkT2qTlsmw48NmpHIDnNmm7LBsOfGYqB6Ge26R8/qSjZdnqrKBzx2qyLzSQzp3JUMOC0Fw6NRkgeN3ABgU+Jg/wPCayqe+7XBzgN8cfm35N2itnpII0dLnrzmQ44BPmDpe+WvydQQ/woU0KfbkBgbc95v1MQXyfJVnhkt49/zhSz3xlTaiHfa1zLiQYHHyurbtcHODN8zDCh/pkjLy//7Vka/n7Gpf07tlDGwzJ7+7oz1tRTFzOxQRDglcX9dIsD6fFp2sgOUN7B7+f1faAjzty5VludgcOB/jsHSX7+8Wtpe9rXBqG+oR0zqRko1t9hPpcHktL2mTmruFTr4gFIVVvwM6di2zksY+fi0OtwHvnrEmLZxzOZQpCQh8EfF2tZUz86rgz8MRYNJxrfL4gUgTyAcDXV1vRpLa2uwBPvfxUuwwFPKHRtwPvWXUVJhU9goKLn0gl0a4drcc3+aECmlIF8UXfGDyludTfPpgTJ/hUxrtalm3yQwUuFbremYK06ayUGMT6WsPcZ1m2bmznigVVypck61GaomPCy9fFbHWBb/hDBdUqDrYLw5OmBSkaxEnWwfC5EcNjVG/H8YVQUKlcSZRHzYHVE16+LnZrfYtv/py7gmrBK06NClI0SJgHuAOHAD63uXhNiBW5P64y8c5R4lKy1QW+5ZMtS+QO9UY1pfcK9ZlKDHEHTn0tl0zglMkB2oNkN6HeLV8L34Kk5Dz161yKDWdSq3OeLsMEL1UJv9ol3EBr0uDZVufSqltm9ChI/fUR4Eu2Olv8Qj3SdK+XFq9VwjOKP/AdPgF8yVYX+NYPMa6X761EKb4RcbQM8KVbXeAbP7bcfyXbUZBcIkkTp66UN7wDh5p6LZPAy7KcvXo1nKuI9A7wBDLVtVWZiO54eKZPy00ibyZJ6vWDxqA3YrCO4536WpDdJ
PNU/JSoykQ0+AAJuVRSurapF/f0NmllXO7SArzj3OII9YkLQn15chVq0uJXPQ/nyrM8TJMxde42RxXPPQL4IC7DBc90s2VKAM9t0nwCB+C5XAYLvrObLUkC+BAu7hbvlK+FU6Vrme7+alR4PGaNKucDHDt09dMkXjaZ7yF47phxKds6APDEmx6Kv0LlVsU+7kQazPoUmdQl4vk9rfROTvDBvkkT/kaMggB+iODb/sSohxDqPVLuPNQPZHWutQs6dyVbh9DiaQL4EC5O8B1d44kC+BAubvBO+Vo4BfDcJo3uuXNN1QN8a5eBghdaOCJ9HfiaMVSxIO2WGStUefN+SBOtjItXYcpNnId2GOpXzaZsS8eUxZtngt1RUq6Kr+uENTFKu/gVptTEfWhX4OVsfVW8d1sAfEOTAYAX1KseWV4PHqG+qUnvoX5T9cV4P/B+QueO26RBi5ejeMeCvK+FUwDPbdKwc7fEBA6fy5DBN+7VewrguU0a32Vb9X0KgG/tMlDwolNfCT0seNmPtf/VaDbzXszMmdDktyyb2Yns0uR+/1E9GGGWUs3uchf684UbgCffIUF3aXCvxzC+QuVrUVsQgPfXqMAj1PtrVKGeKHTuQrgAfCgTgCcJ4LlNAJ7ZBOBJAnhukxGAN93Ykl59TQe3zqTsvnpCvjxdimpnMplevR24FsfxdUPaGpOyw4cPfgzjeD8B/N00wSPU300z1Dd3QeeuZCvABzIBeJIAntsE4JlNAJ4kgOc2aQn+6df5/M2X6Hr++nN48LT1Sa9l2XYLpqGKUv9ghEYmNLUD//A2iq4vH958Ef9CgyfekeBzI0a7WyTuAhWl/lEojUyIah/q7y/vL6JvH27En3Mh164kqdtqwh5ATjKQsr595aKBXPkUjf7+Mnr6dGPe+55bTiHUNzChqW2LvxbBPm7xYcHThM5dCBdf8E+/XspGz3KNJwrgQ7j4gr+WV/ULpl49TQAfwsU/1Bfka+EUwHObADyzCcCT1Kwg9cuyffXqh2cyJvD1N2L0NYEzQBOAJ5qQBfAUIdRzm4wK/BBNAJ4kgOc2AXhmE4AnCeC5TQCe2QTg3ap+smXlPj4uznseth2890+TEFy6Bl/9LNvqfTxc3M9R2XLw6aIBfPEQgKe4dA0eob6ByRhCfV7o3HGbADyzCcCTBPDcJgDPbALwJAE8t8nwwAf+1ZAh9upHcV99aPB6ABasIEMcx4/lK1QA7xTAewqhfqLgldC54zYBeGYTgCcJ4LlNAJ7ZBOBJAnhuE4APYRL2936rBqaOQwC+F5PAq/70OwsAvh8TgJ8oeIT6qYIP64LOHUkAH8IF4EOZADxJAM9tAvDMJgBPUrOCdPDbsvRDAJ6iRgXp4NekAb5aXwsq2cSg6KsEz27Sgbqqr9Kt29jiEerbu2wn+CGaADxJAM9tAvDMJgBPEsBzmwA8swnA+2oE99W3fH5mO/Db+tuyI/gmTdsn5rYCv7W/Jg3wEwWPUD/RUF8oiK/QuQvhAvChTACeJIDnNgF4ZhOAJwnguU1GAN50Y4mDRqKJEcAPB7wduFJcAk8WOAXwFAE8t8n2g0eob2QyAvCNXQC+ZCvABzIBeJIAntsE4JlNAJ4kgOc2AXiXCkucEfHWfR+XognVAuADm9T9HHUQl2KSAE8SwNME8A4h1E8UfAATdO5IAnhuE4BnNgF4kgCe2wTgmU0AniS+7zq0NGHo1edFtwD4OrX9jgvHOD6ABcDXCeATTQo8Qn2iaYHv3gSdO5IAntsE4JlNAJ4kgOc2AXhmk9GC//bhJoqu568/A7y3yxh+W/Zh/uNN9PDmi/gH8J4uY/g16ae/PX26ie4vdMOP5kLVwQHSkuD7zoOnHPlU4C/V/7R8zy2nRt3iRxHqo0yLB/gQLtsDHtf4oC7bAx69+qAu2wG+IF8LpwCe2wTgmU0AniSA5zYBeGYTgCcJ4LlNAJ7ZBOBJAnhuE4BnNgF4kgCe2wTgmU0AniSA5zYBe
GYTgCcJ4LlNAJ7ZBOBJAnhuE4BnNgF4kgCe2wTgmU0AniSA5zYBeGYTgCcJ4LlNAJ7ZBOBJAnhuE4BnNgF4kgCe2wTgmU0AniSA5zYBeGYTgCcJ4LlNhg7e/zGPAB/CZSjgCQ92BfgQLgAfygTgSUKo5zYZOnh/AXwIF4APZQLwJAE8twnAM5sAPEkAz20C8MwmAE8SwHObhAf/taCSTQzqxGU8JhUuaPGhTKbT4n0tnAJ4bhOAZzYBeJIAntsE4JlNAJ4kgOc2AXhmE4AnCeC5TQCe2WQ64Ivq5hfJOnEZj4mfC8CPzgTgJ2oC8BM16QI8tLUC+IkK4CcqgJ+oAH6iagM+/buzTLqeSw9WI/Vr2dqBz8eYsBbm6df5/M0X36K0AJ/5pWkePf31M7fRw/xH+6PZfD7KhLswD28F70vforQAn/lteR59+8uf529ZjZ7+Jn80Wzuw+WgT/sIIJpe+RWkD/lL/0jijHkRDEWVhNVLglQOjj0y3g8KIRu9blGG3eKmHC16jDlp8FFPgLcz128i7KMO+xj9cyMDCa6QaI/M1XpswF+bp18so8i7K8Hv1F8xGqjFy9+qNCWthRPLSgL9XD22zAH6iAviJCuAnKoCfqHoAv35xZV43R7PZ7pXadnASf+Ktx1N50EK8rGbyZa9gYnZZH8in6u06Ek/ylN+s/2Xe+2Tt7Nz+uTyszv/s0ORO5FJUxrPbOC8rld/1y1sPs0bqE/zmSJR3qUq7PpBYGoGXFbv8/WGuhrPg1Rvt5MpTdVbdu5VkLQbvYKdT27w6j9bPz2VOl/LUXSni4iP5rvqsaas+was/ZLnlu//ai6PA7pWsBVEZ0erZ7SpuqrplrF/+0TaQf/qdovry9vHsvXw51wdH69/8bvdnnZTZJR1jxA6b43+f7ZxHye4mHsg8meSVrfz85xcf1auEYf8yx5ldI2V4lcmeyMzjqTrRZEAybTopikpAvMhMqEC1ONkca9yLnffm7FL5PSY1BYL6BG+qxmwTNaQKfihPdXGmL0VdLQ9lyVVD0C1DkDk4lOeD3E8FeFk3m+//8YcruedCfSavGyYpvUvS4nXqmyNxQu3md49SyRvbONSvFXj7lz1O50RfqLLZE+AXqrHKhMxHSVFsAnH8EHvIyGAbgd681GcEE4Ver/GyDeycm22b72VNyLKLGhLV8NO7PVF9ui6s5CcagaxFHcfFTuvvRAWJF1PLhV30Nf6Zrll5ohypNpndPYqjkMlGvCEPvuy4XPYez/5ZB2kb6aVrXJRUAkqPp/oUSoNfH6iaYYv1/YKX/39uC7s8VNf9mTwZNscfj0WQleHvwJwb4uyXoTJBYJvD8mR5Eq0ORRXJjfI8yO8i/1qZ2K5SlxUsTpXM7lEavLEtB192XC57j6e//V4R1yFhobtqB8lpbhOQ2hwd6jMk2+LVuzGBN8V7eau74TEacaXWrUlt/eG7x7N3OjDqypPNNNUk4+YsLpI/CUTf/XRe3eJlVS5340umzIG7xRtbQovPZk+krpEpnkcnqTSjfIvXIxp7jU/1IGXNjAl8tNjT0U316lNnuera68ufuAjoq7z6W19IZZU+j6s7voBHm38Vlft49pur9LWzeI1/PN2zO2yO9uJ+QrJ7GryxTViLXC53c9f4DPhs9mTnTnXsLU/xUVKUzDVec1cVonsAKj7ZwD+ma7yKfGoEKyNvEvxMo9R9XdWnl33gRdwVXs5kJ91WshgE6y67JqrOJtvdzu9ia/jQpL559W/pXr3a1fbq7UmjbMXmn2Xc3r0S5r89vtLvczY2oqSzJ6mbC8CJ/ShVlCQB9eFMdvqz4/ilrplR9er7V7bHyKt2czCjGsf3ry7Bt2I3rpk7aAgC+IkK4CcqgJ+oAH6iAviJCuAnKoCfqP4fa2i9S0YgEX0AAAAASUVORK5CYII=" alt="plot of chunk unnamed-chunk-1"/> </p>
<p>I remember my dad telling me that when he was at Northwestern in the mid-70s, the team was essentially winless. As a small consolation, he remembered that the football team had actually been full of good students.</p>
<p>Some time ago, I stumbled across <a href="http://nflcombineresults.com/nflcombinedata.php?year=&pos=&college=">nflcombineresults.com</a>. Among other measures, the site reports Wonderlic scores for a bunch of players entering the NFL. </p>
<p>By adding data on a school's institutional strength (from US News and World Report), I can look for an association between a quarterback's Wonderlic score (a measure of cognitive ability) and the (academic) strength of that quarterback's alma mater.</p>
<p><strong>Results</strong>
While this is fairly rough, it looks like there <em>is</em> a relationship here – quarterbacks who attended better-ranked schools tend to have higher Wonderlic scores, though the relationship only seems to hold among top-50 schools.</p>
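<p>One rough way to check that last impression is to compute the rank correlation separately on each side of the cutoff. This is a sketch only, not part of the original analysis: the helper name <code>cor.by.cutoff</code> and the choice of Spearman correlation are my assumptions, and it expects a data frame with the <code>usnewsrank</code> and <code>wonderlic</code> columns used in the code at the end of this post.</p>
<pre><code class="r"># hypothetical helper: Spearman correlation between school rank and
# Wonderlic score, computed separately for schools inside the cutoff
# ("top") and outside it ("rest")
cor.by.cutoff = function(d, cutoff = 50) {
    # drop rows where either value is missing
    d = d[!is.na(d$usnewsrank) & !is.na(d$wonderlic), ]
    group = ifelse(d$usnewsrank > cutoff, "rest", "top")
    sapply(split(d, group), function(g) {
        cor(g$usnewsrank, g$wonderlic, method = "spearman")
    })
}

# usage, once `merged` has been built by the code at the end of the post:
# cor.by.cutoff(merged)
</code></pre>
<p>Because a smaller rank number means a stronger school, a negative correlation in the <code>top</code> group and a value near zero in the <code>rest</code> group would be consistent with the pattern in the plot. With this few quarterbacks per group, either number should be read cautiously.</p>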
<p>There are a bunch of causal relationships that could give rise to this pattern, and we really don't have the data to separate these stories. It could be that better students choose better schools, or that attending <em>some</em> schools will increase a Wonderlic score. Alternatively, it may be that students at top schools are more likely to attend college for 4 years. </p>
<p>While the assessment is really preliminary, it looks like there might be something here.</p>
<p>I've attached my code below. At some point, I'll add the csv file with institutional strength (this information is freely available at US News as well).</p>
<pre><code class="r">
# loading libraries:
library(plyr)
library(ggplot2)
# gathering wonderlic data:
library(XML)
url = "http://nflcombineresults.com/nflcombinedata.php?year=&pos=&college="
test = readHTMLTable(url)
dat = test[[1]]
## cleaning data:
names(dat) = tolower(names(dat))
# replacing spaces in variable names:
names(dat) = gsub(pattern = "\\s", replacement = ".", x = names(dat))
# adjusting vert.leap.(in)
names(dat)[10] = "vert.leap.in"
# cleaning individual columns:
dat$year = as.numeric(as.character(dat$year))
dat$name = as.character(dat$name)
dat$wonderlic = as.numeric(as.character(dat$wonderlic))
dat$bench.press = as.numeric(as.character(dat$bench.press))
dat$vert.leap.in = as.numeric(as.character(dat$vert.leap.in))
# separating out the individuals with wonderlic scores:
dat.sub = dat[!is.na(dat$wonderlic), ]
# reordering
dat.sub = dat.sub[order(dat.sub$wonderlic, decreasing = TRUE), ]
# examining the scores by position:
pos.dat = ddply(dat.sub, .(pos), summarise, mean.wonderlic = mean(wonderlic,
na.rm = TRUE), count = length(wonderlic))
# note: not really enough to compare by position.
qb.dat = dat.sub[dat.sub$pos == "QB", ]
# reading in qbschools:
qb.schools = read.csv("qbschools.csv")
# merging:
qb.dat$college = as.character(qb.dat$college)
qb.schools$school = as.character(qb.schools$school)
merged = merge(qb.dat, qb.schools, by.x = "college", by.y = "school")
names(merged) = tolower(names(merged))
merged$usnewsrank = as.numeric(as.character(merged$usnewsrank))
# to generate plot (not run):
# p = ggplot(merged, aes(x = usnewsrank, y = wonderlic)) +
#   geom_point() + geom_smooth()
# p + ggtitle('QB Wonderlic score \n by (academic) strength of undergraduate institution') +
#   xlab('US News and World Report institution rank (as of 2013)') +
#   ylab('Wonderlic score')
# ggsave('wonderlic.jpg')
</code></pre>
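<p>To put a rough number on the pattern described above, here is a minimal sketch (not the analysis behind the plot) that computes a Spearman rank correlation between school rank and Wonderlic score, separately for schools inside and outside a rank cutoff. The helper name <strong>rank.cor.by.tier</strong> and the <strong>toy</strong> data frame are hypothetical; the toy values are made up purely to show the call, and the real input would be the <strong>merged</strong> data frame from the code above.</p>
<pre><code class="r">
# hypothetical helper (not part of the original analysis):
# Spearman correlation of school rank vs. wonderlic within each tier
rank.cor.by.tier = function(d, cutoff = 50) {
  # label each row by whether its school falls outside the cutoff
  tier = ifelse(d$usnewsrank > cutoff, "rest", "top")
  sapply(split(d, tier), function(g) {
    cor(g$usnewsrank, g$wonderlic, method = "spearman", use = "complete.obs")
  })
}
# made-up toy values, only to demonstrate the call (NOT real data):
toy = data.frame(usnewsrank = c(5, 12, 30, 45, 60, 90, 120, 150),
                 wonderlic = c(38, 35, 30, 27, 25, 26, 24, 27))
rank.cor.by.tier(toy)
# with the real data: rank.cor.by.tier(merged)
</code></pre>
<p>Since a smaller rank number means a stronger school, a <em>negative</em> correlation within a tier is what "better school, higher score" would look like.</p>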
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-56317109574777756372013-10-31T06:19:00.000-07:002013-10-31T06:19:59.131-07:00Simulation of an Oxford (Undergrad) Interview Question<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title></title>
<style type="text/css">
body, td {
font-family: sans-serif;
background-color: white;
font-size: 12px;
margin: 8px;
}
tt, code, pre {
font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
}
h1 {
font-size:2.2em;
}
h2 {
font-size:1.8em;
}
h3 {
font-size:1.4em;
}
h4 {
font-size:1.0em;
}
h5 {
font-size:0.9em;
}
h6 {
font-size:0.8em;
}
a:visited {
color: rgb(50%, 0%, 50%);
}
pre {
margin-top: 0;
max-width: 95%;
border: 1px solid #ccc;
white-space: pre-wrap;
}
pre code {
display: block; padding: 0.5em;
}
code.r, code.cpp {
background-color: #F8F8F8;
}
table, td, th {
border: none;
}
blockquote {
color:#666666;
margin:0;
padding-left: 1em;
border-left: 0.5em #EEE solid;
}
hr {
height: 0px;
border-bottom: none;
border-top-width: thin;
border-top-style: dotted;
border-top-color: #999999;
}
@media print {
* {
background: transparent !important;
color: black !important;
filter:none !important;
-ms-filter: none !important;
}
body {
font-size:12pt;
max-width:100%;
}
a, a:visited {
text-decoration: underline;
}
hr {
visibility: hidden;
page-break-before: always;
}
pre, blockquote {
padding-right: 1em;
page-break-inside: avoid;
}
tr, img {
page-break-inside: avoid;
}
img {
max-width: 100% !important;
}
@page :left {
margin: 15mm 20mm 15mm 10mm;
}
@page :right {
margin: 15mm 10mm 15mm 20mm;
}
p, h2, h3 {
orphans: 3; widows: 3;
}
h2, h3 {
page-break-after: avoid;
}
}
</style>
</head>
<body>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAAA3lBMVEUAAAAAADoAAGYAOmYAOpAAZrY6AAA6ADo6AGY6OmY6OpA6ZrY6kNtmAABmADpmAGZmOgBmZmZmkJBmtv9/f39/f5V/f6t/lcF/q9aQOgCQOjqQkGaQtpCQ2/+Vf3+Vf6uVlcGVq9aVweurf5Wrf6urlZWrlcGrq6ur1v+2ZgC2Zjq225C2/7a2/9u2///BlX/BlZXBlavBq8HBwdbB6//Wq3/Wq5XW///bkDrb29vb/7bb///l5eXrwZXr1qvr///y8vL/tmb/1qv/25D/68H//7b//9b//9v//+v///+fjrsFAAAACXBIWXMAAAsSAAALEgHS3X78AAAPXElEQVR4nO3dAUPbxhmHcSULNOsgaYtToC1d66Wt2VLSdBB7Ix5gB7C+/xfaSbKNsaTzCXTGd//nXUaxg/Vg/3ySKOAmKSM5yVN/AszTDPCiA7zoAC86wIsO8KIDvOgALzrAiw7wogO86AAvOsCLDvCiA7zoAC86wIsO8I+c/zz1J/DAiQH++tWReTt+eXL/itq5PdyZfdiKj7x6flLcIMnmxXntpsKbKOFXzIKWK/xeWo0M/FPOHP761Q/bybOj/Irr3eTZD+aq3b3CL1u10yV7f8Ufb5l3B1v5bTPn8bZZ3OZG2Qa+X4RPr8ztp5uZbvZ3c2lrvulBksUDmbjgdw3U8Ytzc0WGm12cwefYg0L+PnzxrNjLbzuY3uC4uEF+XboIf3toniZZYWGzs01nG5ruIgKYyOALjinn3NFcyC9PAe/DZx8xu23+Jr27wb1d/fXuTnE8MR+1CD/bdDjo2cQKny/uhasG+elZkosvndyZff1gq9hI/jdmH2/2+bMNFDfIb7xV7O2zj1qEn206+6i9p3sUGo4M/MIp+RL8+OXvZkXP4A37zvwGc/hi918Df7fp43AO8jHAF3viq7sj72xXf7W0q59//D3428NvjO9sV5/TVu7qM/naXf3Spjd/YoDPjuVpcTI2h8/OwrJzs+yft4fZPw3oVbEel7+OH2Q78dmJYcY43n4230Bxg+IYn22qOLm722x2cldsOr/lF6z4dc5xcfhegM+Py18/P5l9WZdfnu6HZ4fsGfx4O1/sXxdflJltPX9nNmQ+avnLuYUv3WabHUyvyzZ9nAR0kI8DvmYcz7Nn+2/fn85GDfBmV5/t+YGPaJzgx9v5WTnw9fOpNBVXrZrL5jeh0kYFeNEK8KIV4EUrwItWgBetAC9aAV60ArxoBXjRiiv85H2n003Ts86bU+BjqLjCDw+Mem+0f2H+AB9BxRk+W+5d8/bmbd9c7Jip+1AmwLHu6n/rDnvp5EN/epXbM2vFBLVKYqq4wmczvFvxwIdecYUfHaQ3/zjlGB9NxXnFn3U6Pc7q46k02dUvjVtgxQT1YMVUWQt8kty/VD/3/rru/Yqb2f6y5Zut70bttZ4I/n7aw11lVg3wovM08Ozqn/oTrDBZC/ynpPbCZekDk4ULle/Hftq1ngrwNRN7ZS3wyT1sO3xSfQH4litrhi8dcZbhl04Dq96PnWQ9lfXCl0817t+NZMk6qXo/dpL1VICvmdgr64X/VPraorSrr7mw8H7sJOuprBve091YMVTK1wGvWQFetLIO+OTTXBz4TamsAX7xX7T7uhsrhkr5Ou/w9m8XBPVgxVQBXrTiH55d/UZW1gBfPqNr/26sGCrl64DXrAAvWgFetOIZPkns7mE9WDFV/MIXP/u3hruxYqiUrwNes+IXnl39xlY8w38CfkMrwItWvMNzjN/MCvCiFc/wySfgN7MCvGjlEfCXpSlfleT/a3cqwh4m9gorXrTiH97qHtaDFVMFeNGKd3j7nj6sByumCitetOIbvuYVWNq+GyuGSvk64DUrvuHZ1W9oxTv8eu4GlaYVv/Ar3cN6sGKqAC9aAV604
g8+WXli1+LdoNK04g2+/lVUfdwNKk0rwItWvMGzq9/sij94hzO79u4GlaYV4EUrwItWgBetAC9aAV60ArxoBXjRij94J/ewHqyYKsCLVoAXrQAvWgFetAK8aAV40QrwohXgRSvAi1aAF604w5913pzO3gIffsUVfrR/Mf8DfAQVV/jPP12MDtJhN7152zcXO2Zqdw4rNsVs4NRrjTpmqQ976eRDf3rNimcWK36jK67ww4P089/78xUPfOgVZ/icnGN8NBVX+Mn7TqfHWX08FVf4ilkRAH6jK97g3dzDerBiqgAvWgFetAK8aAV40QrwohXgRSu+4B1eBaXNu0GlacUTvMvrHrV5N6g0rQAvWvEEz65+0yve4Nd7N6g0rQAvWgFetAK8aMUPvOupXWAPVkwVL/DOX8wF9mDFVAFetOIFnl395lf8wDuf24X1YMVUAV60ArxoBXjRCvCiFeBFK8CLVoAXrQAvWgFetAK8aAV40YofeGf3sB6smCrAi1YeAX9ZmvlVSfnv2puKMJXGFVa8aAV40QrwohXgRSvAi1aAF60AL1oBXrQCvGgFeNEK8KIV4EUrwItWgBetAC9aAV60ArxoBXjRCvCiFeBFK8CLVoAXrQAvWgFetOIF3t09rAcrpgrwohXgRSvAi1aAF60AL1oBXrQCvGgFeNEK8KIV4EUrrvDDjpluetZ5c7oS3v0/TBLYgxVTpcGKn/x5MdrP/qyAb/CfIgrswYqp0gD+42k67KY3b/vm/Wz9131cBm/ZDLOJYxEbHZgdfi+dfOhPr6h9ZrGrD6DiDm8W/N2Kt8JzchdAxRn+5ldzbHc6xgMfQsUZ/vPP2Vuns3rgA6i47+pLUx8AfvMrwItWgBetAC9aAV60ArxoBXjRig/4Bu5hPVgxVYAXrdjhBy/OB0myB3x8FSv89asj82f88gT46Cp2+NcnZs0DH2PFCp8OkmdHV+zqY6zY4a1TGwA+gArwohU7/O1h8uK/r46Aj69ihb893Bl/eX714hz46CpWeHNWb+DNW+Cjqzis+AErPsKKFT47xidJjTvwQVfs8NapDQAfQMUKXxzdOcbHWLHAX+8mxXCMj7DisOLrpjYAfAAVK7x9agPAB1Cxr/hiZ/+cY3x8FYcVP9hhxcdXcYDnrD7GigP8Fbv6CCtW+Okxnh/EiLDisOLrpi7QxD2sByumCvCiFTv8g75JA3wIFSv87WH2lVzTb8sCH0LFCm//Js1laYqrkvJftDoVYSqNK6x40YoVnmN8vBU7vHXqAsCHULHCP+zbssCHULHC335X8yP1NvgmL2ga2IMVU8W+4h/wbdlGr10d2IMVU8UKb5/qAPBhVFqHZ1cfRsUBvun344EPodL+igc+iArwohU7/IN+TRr4ECpW+If9mjTwIVSs8A/7NWngQ6g4rHi+OxdjxQrPd+firdjhrVMXAD6ECvCiFSv8w353DvgQKg4rvunvzgEfQsUBni/nYqw4wDf93TngQ6hY4R/2u3PAh1BxWPF1UxcAPoQK8KIVK/z8da8qj/J1AeBDqNhX/GBr9sYdvpF7WA9WTBX7in/ICxwCH0TFCj/93TlWfIQVK3zx3bkad+CDrtjhrVMTAD6ICvCiFQv89et3D/nuHPBBVFjxohXgRSt2+Ku7Xf2w0zlI07POm1PgY6hY4a9359+YG2XqvdH+hfkDfAQVO/zdv7Mb/pat+GE3vXnbNxc7ZmqOAE2OFsymzJLa8fynrs7Mih/2hr108qE/varmmcWKD6JihV/4YUtDno668xUPfOgV+4q/m1G+4jnGR1Nxhc/O6ruc1cdTcd3VV0xNAPggKg4rvuHP1QMfRMUBnh/EiLHiAN/w5+qBD6JihX/Qz9UDH0TFYcXXTU0A+CAqwItWgBetAC9aAV60ArxoBXjRCvCiFeBFK8CLVoAXrQAvWgFetAK8aAV40QrwohXgRSutwzdzD+vBiqkCvGgFeNEK8KIV4EUrwItWgBetAC9aAV60A
rxoBXjRyiPgL0uTXZWUr255KsJUGldY8aIV4EUrwItWgBetAC9aaRs+AT6MSsvw2UvmPMXdoNK0ArxopWV4dvWhVNqG5+QukArwohXgRSvAi1aAF60AL1oBXrQCvGgFeNEK8KIV4EUrwItWgBetAC9aAV60ArxoBXjRCvCiFeBFK8CLVoAXrQAvWgFetAK8aAV40QrwohXgRSvO8GedzptT89a8AT6Ciiv85I8MfLR/Yf4AH0HFFf7mlx87B+mwm9687ZuLHTOVH9fkWMFsztS6jb7tp8PesJdOPvSnV1U+sxou+LBWSUwVV/gcvztf8cCHXnFe8d3UrPjVx3jgA6k4r3hzVt9NV5/VAx9IpcmufmkqA8AHUgFetAK8aAV40QrwohXgRSvAi1aAF60AL1oBXrQCvGgFeNEK8KIV4EUrwItWgBetAC9aAV60ArxoBXjRCvCiFeBFK8CLVoAXrQAvWgFetAK8aAV40QrwohXgRSvAi1aAF60AL1p5BPxlacxVSfnatqciTKVxhRUvWgFetAK8aAV40QrwohXgRSvAi1aAF60AL1oBXrQCvGgFeNEK8KIV4EUrwItWgBetAC9aAV60ArxoBXjRCvCiFeBFK8CLVlqGb+oe1oMVUwV40QrwohXgRSvAi1aAF60AL1oBXrQCvGgFeNEK8KIV4EUrwItWGsAPu2l61nlzCnwMFXf4UaebjvYvzB/gI6g4w3/+6X/dbNHfvO2bSx0zVR/V5FDBbNDUwhnvkYHvpZMP/elVVc8sVnwoFVf4YbbGu/MVD3zoFVd4MyOO8RFVmsFzVh9NpQH88lQFgA+lArxoBXjRCvCiFeBFK8CLVoAXrQAvWmkXPgE+lEqr8EnSWD6oByumCvCilVbh2dWHU2kXvvnnFNSDFVMFeNEK8KIV4EUrwItWgBetAC9aAV60ArxoBXjRCvCiFeBFK8CLVoAXrQAvWgFetAK8aAV40coj4MtT+eoorQ8VDxXgRSvAi1aAF63wcmWiA7zoAC86wIsO8KLzGPjF18BreybvO539C5PIGkXIR668fQ+V4kVCPd+X/AVIy3ekNvQI+Huvetn2jA7MJ92b/HE6D/nIlbfv6U5N/rzwe19GnW/7FXekPvQI+Huvc+tjhr2bX37sHExDPnLl7Xu6Ux9PK1otbn/y7+y1pst3pD70GPjFV7b2MGbRj8zTeNgrQj5y5e37uVPZ/svzfcnhS3ekPrS5K/7soPjnqOt1LS5t30/l42lVq9XEGle812P85H0vLV5Gedjzd/Qtb9/Lnbr59cL7fcng13SM93pWfzY/E+56PN+u2L6Pyuefq1ttTr5HX89ZPRPyAC86wIsO8KIDvOhow98e7j31p/BUA7zoCMNf7yZ/+Wpv/Nevnp+Mt5NkL3sWjL84Sq9enF8lyfOTp/78/I4w/PFOepXsjbf30utXR+n45clgJx0ke+lg5/r1STrYeurPz+/owme6ZpEb8NnF8Zfn//x+6/a7o+yJEPvowufix1P442zffv363et/vXxnnhBm1/8scntd+IUVf727lz8Pjr/52+133xf7eHOkf+JP0O/ows+O8QY8/392WpcUR/kMHfho5/YwP6vPdvWDJHu3wM9O5485q2ciHeBFB3jRAV50gBcd4EUHeNEBXnSAFx3gRef/KvK9PLG6hrQAAAAASUVORK5CYII=" alt="plot of chunk unnamed-chunk-1"/> </p>
<p>A friend of mine, who's an economics teacher in London, is responsible for preparing some of his students for interviews at Oxford and Cambridge. He told me that, at these schools, students have to go through a nerve-wracking experience where faculty pepper them with quiz questions meant to assess the ability to think quickly under pressure. </p>
<p>It turns out that these schools publish example questions, which are much more fun to think about in your own home, <strong>without</strong> an accomplished faculty waiting in silence. </p>
<p>I've found the list of <a href="http://www.cs.ox.ac.uk/ugadmissions/how_to_apply/sample_interview_problems.html">practice questions in Computer Science</a> interesting – these questions don't require computer assistance to solve, but they offer a nice chance to practice programming skills nonetheless.</p>
<p>As an example, here's the 'lilypad problem':</p>
<p><strong>Eleven lily pads are numbered from 0 to 10. A frog starts on pad 0 and wants to get to pad 10. At each jump, the frog can move forward by one or two pads, so there are many ways it can get to pad 10. For example, it can make 10 jumps of one pad, 1111111111, or five jumps of two pads, 22222, or go 221212 or 221122, and so on. We'll call each of these ways different, even if the frog takes the same jumps in a different order. How many different ways are there of getting from 0 to 10?</strong></p>
<p>While this question is really a counting problem, best answered with combinatorics, I thought it'd be fun to try writing a short script that uses <strong>simulation</strong> to solve it.</p>
<p>My approach was to first write a function (<strong>solution.gen</strong>) which generates a random route. Next, I draw from this function (a large number of times, using <strong>draw.func</strong>), and examine the number of unique solutions. </p>
<p>To test the approach, I plot the number of unique solutions found at each of 200 sample sizes, ranging from 50 to 10000 draws in steps of 50.</p>
<p><strong>Result</strong>: It looks like this process uncovers 89 distinct routes. With a few hiccups, it looks like about 1000 draws recover these, and more than 2000 draws recover them reliably. </p>
<p>While simulation isn't always the best approach, it's a nice tool to have if the analytic solution is too opaque (or in this case, just tiresome) to figure out.</p>
<p>Here's the code:</p>
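<pre><code class="r">solution.gen = function() {
  # ten jumps of 1 or 2 pads always cover at least 10 pads
  draw = rbinom(10, 1, 0.5) + 1
  # keep the leading jumps that land on pad 9 or earlier
  sub = which(cumsum(draw) <= 9)
  draw = draw[sub]
  # finish with the one jump that lands exactly on pad 10
  if (sum(draw) == 9) {
    draw = c(draw, 1)
  } else {
    draw = c(draw, 2)
  }
  return(draw)
}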
draw.func = function(x) {
# simplify = FALSE keeps the routes in a list even if they all share a length
length(unique(replicate(x, solution.gen(), simplify = FALSE)))
}
draws = seq(from = 50, to = 10000, by = 50)
unique.routes = sapply(draws, draw.func)
df = data.frame(draws, unique.routes)
# (not run) code to create plot:
# library(ggplot2)
# p = ggplot(df, aes(x = draws, y = unique.routes)) + geom_point() + geom_line()
# p + labs(title = 'Unique Lilypad Routes')
</code></pre>
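<p>As a cross-check on the simulated count, the number of routes can also be computed exactly. This is a small sketch rather than part of the simulation: the function name <strong>count.routes</strong> is my own, and it relies on the observation that the last jump onto pad n comes either from pad n - 1 (a one-pad jump) or from pad n - 2 (a two-pad jump), so the number of routes to pad n is the sum of the counts for those two pads.</p>
<pre><code class="r">
# hypothetical helper: exact number of routes from pad 0 to pad n.pads
count.routes = function(n.pads) {
  stopifnot(n.pads >= 2)
  # ways[k + 1] holds the number of routes from pad 0 to pad k
  ways = numeric(n.pads + 1)
  ways[1] = 1  # pad 0: the empty route
  ways[2] = 1  # pad 1: a single one-pad jump
  for (k in 2:n.pads) {
    ways[k + 1] = ways[k] + ways[k - 1]
  }
  ways[n.pads + 1]
}
count.routes(10)
# same count by choosing where k two-pad jumps go among the 10 - k jumps:
sum(choose(10 - 0:5, 0:5))
</code></pre>
<p>Both lines return 89, which matches the number of distinct routes the simulation settles on.</p>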
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-14745828115714820152013-04-17T19:33:00.001-07:002013-04-17T19:33:50.395-07:00CrossFit weights: gender matters less than you'd think<!DOCTYPE html>
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>Exploring Gaussian Mixture Models</title>
</head>
<body>
<h2>Exploring Gaussian Mixture Models</h2>
<p>This week in the Empirical Research Methods course, we've been talking a lot about measurement error. The idea of having some latent variable of interest, coupled with 'flawed' measures reminded me of a section of Cosma's course I really enjoyed, but haven't gotten a chance to go back to – mixture models.</p>
<p>The rough idea here is that we have some observed measure for a population, but suspect the population is composed of a number of distinct types, each with its own distribution on our measure. Cosma has a nice example using precipitation data – here, we might suspect that snowy days result in a different amount of precipitation than rainy days. The overall distribution of precipitation, then, is just a weighted sum of a few distributions – one for each type.</p>
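<p>To make the idea concrete, here is a minimal base-R sketch of a Gaussian mixture density: each type contributes a normal density, scaled by that type's share of the population. The function name <strong>dmix</strong> and every parameter value below are made up for illustration; they are not estimates from any data in this post.</p>
<pre><code class="r">
# hypothetical helper: density of a Gaussian mixture
# lambda: mixing weights (sum to 1); mu, sigma: component means and sds
dmix = function(x, lambda, mu, sigma) {
  stopifnot(isTRUE(all.equal(sum(lambda), 1)))
  # one column per component: weight times that component's normal density
  dens = sapply(seq_along(lambda), function(k) lambda[k] * dnorm(x, mu[k], sigma[k]))
  # reshape so a single x value is handled the same way as a vector
  rowSums(matrix(dens, nrow = length(x)))
}
# illustrative parameters only (made up, not estimated from data):
lambda = c(0.3, 0.7)
mu = c(0, 4)
sigma = c(1, 1.5)
# sanity check: a proper density integrates to 1
integrate(dmix, -Inf, Inf, lambda = lambda, mu = mu, sigma = sigma)
# (not run) code to plot the mixture density:
# curve(dmix(x, lambda, mu, sigma), from = -4, to = 10)
</code></pre>
<p>Fitting a mixture model runs this in reverse: given only the pooled observations, estimate the weights, means and standard deviations.</p>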
<p>Interested in trying to fit my own mixture models, I thought modeling weight by gender might be a reasonable domain for application.</p>
<p>Looking around a bit for open datasets, I found <a href="http://xfit2011.blogspot.com/2012/02/crossfit-open-2011-dataset.html">CrossFit data</a>, which needed only a bit of cleaning before I had a data frame composed of weights and genders for about 14000 CrossFit athletes.</p>
<p>Before fitting my model, I wanted to make sure the data were fairly bimodal, so I decided to create a density plot of weight:</p>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAAAw1BMVEUAAAAAADoAAGYAOmYAOpAAZpAAZrY6AAA6ADo6AGY6OpA6ZrY6kNtmAABmADpmAGZmOjpmZgBmkJBmtv9/f39/f5V/f6t/lcF/q9aQOgCQOjqQkGaQ2/+Vf3+VlcGVweurf6urlZWrlcGr1v+2ZgC2Zjq225C2/7a2///BlX/BlZXBlavBwdbB6//Wq3/W///bkDrb/7bb///l5eXrwZXr1qvr///y8vL/tmb/1qv/25D/68H//7b//9b//9v//+v///8VOY+HAAAACXBIWXMAAAsSAAALEgHS3X78AAAPdklEQVR4nO3djV8U1xXG8WtqTU01tpVETWhJY9qYJgp1KQJF2P//r8rMLvvGvJ1zX86dmft7PgSHAXzu2S93d1DWuCUpMi73AkieuNwLIHnici+A5InLvQCSJy73AkieuNwLIHnici+A5InLvQCSJy73AkieuNwLIHnici+A5InLvQCSJy73AkieOP9PXTTTdq4/n9Sf4VFCS/0K+DJbgC+0BfhCW4AvtAX4QluAL7QF+EJbgC+0BfhCW4AvtAX4QluAL7QF+EJbgC+0BfhCW4AvtAX4QluAL7QF+EJbgG+LM2lpBnjtGJFbXEMe+BLgXXPPA18AvFsAXyK823udrqU1wGvHiNkCfJnw7uCXVC3tAV47RrwW9+DXNC0dAV47RrQW1zhI0dIV4LVjRGsBHviELV0BXjtGrBbXchS/pTPAa8eI1QI88AlbOgO8doxILa7jGHjgI7R0B3jtGJFagC8T3nW+ATzwwS09AV47RpwW4IF/+BbwM4Z3PW8CD3xgS1+A144RpQV44BtvAg98YEtfgNeOEaUFeOAbbwJfDvzB28ADH9jSF+C1Y0RpAR74xtvAzxf+oTvwwAMPfGhLb4DXjhGjBXjgmyeAny18wx34UuH3TwEPfFBLf4DXjhGhBXjgW04BD3xQS3+A144R3tLiDjzwwHfnUzNt56InUonrP2cyimVLPPgoX8DseKsW4O8DPPBt54AHPqRlIMBrxwhvAR74tnPAAx/SMhDgtWOEtwBfJnybO/DAAw98SMtQgNeOEdzSCr93FnjgA1qGArx2jOAW4MuEb3cHHnjggfdvGQzw2jFCW4AHvv008LOE73AHvlT43TuAB967ZTjAa8cIbAEe+I53AA+8d8twgNeOEdgCPPAd7wAeeO+W4QCvHSOwBfgy4Tvdd+8CHnjfFkGA144R1gI88A8DPPChLYIArx0jqKXHHXjgQ1skAV47RlAL8MA3AjzwgS2SAK8dI6Slzx144ANbRAFeO0ZIC/DANwM88GEtogCvHSOkBfgy4XvdgQc+rEUW4LVjBLT0w2/eDTzwXi2yAK8dI6AFeODbAjzwIS2yAK8dI6AFeODbAjzwIS2yAK8dI6BlAP7+/cDPDX7IHXjgA1qEAV47hn8L8MC3B3jg/VuEAV47hn8L8GXCD7rffwjwwHu0SAO8dgzvFuCB7wrwwPu2SAO8dgzvFuDLhBe4Aw+8b4s4Y4T/cPTN+/2Dzz+c7p0Evq9FnBHCX726qF52B1dH357uTgLf2yLOCOHPT9Z7/P7g7re7X093J4+qdN5XjDku0sdMPq7rHedvlzX17mAFvzlZJ8oXMDveqkUMf7jj7+E3J4HvbRFnhPAPH+NX8DzGy1rEGSH8+gL+848Xmyv51X08V/WiFnHGCD+YKHOMEX71QcADr28RB3jtGJ4tInfggfdrkQd47RieLcAD3xPggfdqkQd47Rh+LTJ34EuFrz8OeOC1LYoArx3DrwX4MuGl7sAD79GiCfDaMbxagAe+P8ADr2/RBHjtGF4tYvjqI4EHXtmiCfDaMbxagAd+IMDPCV7uDjzw6hZVgNeO4dMCPPBDAX5G8Ar36oOBB17Vogvw2jE8WoAHfjjAzwZe5
Q488LoWZYDXjqFvAR54QYAHXtWiDPDaMfQtwAMvCPDAq1qUAV47hrpF5w488KoWbYDXjqFuAR54UbQfvwDeZAxti9oReODlLeoArx1D2wI88LIAPwt4PSPwwItb9J8CvHYMZQvwZcJ7KC71nwO8wRi6AA+8uAV44IUB3mAMXYAHXtwCPPDCAG8whi4+8PpPAt5gDFWcVwvwwMsCvMEYqgC/KBLe+bUAD7wswBuMoQnwq1fFwTvPFuCBlwV4gzEUAX4Y/ub5E+A3LSXBL5eXzj06nhW8820pC77K7RvnXgJfGPz1l/WOv3nxbi7wzr9FKz9h+Jvnjz8K9vo6n5ppOxc9ghJXZXPo3+L7qboWg6xa+uFXO12038e841db1bntITt+0bvjb567dWTbPsocKW4st/cr8AvRXb1sr48bfuflnAtpKQhelShzJIUPbCkG/ubFz+s7+y8m/RgPfGvL7Hd8G5dni1IeeIMxugN8e0s//Nnjj2eyP7cDPlaLMmmu6r8+rl6un074Mb4VC/jhb+eqPQ/8ojT45Zl7dHw56bt64Dta+uE1iTJH7Bur3cq3RScPvMEYXQG+q6Uf/nLqf4ADfFdLL/zNc9nD+2jhO6SAn/tf0gDf2dILv/zpGfB7LcXA30z9L2mA72zp3/GaRJkDeG2AbwvwnS398Ldv3OP/fi37wfoocwCvTRL42zfPrr/6eDndn7kDvrOlF776dq6Cn+5P2XZBAS/a8WeT3fGx4XXyE4ZfPX1K+NPVwEdrUYWr+pYA390CvKalDPjtM2mm+id3nUzAi/6s/kz2ryNEmQN4bRL+7dxkv52LD6+SnzB89e3ccvUz1sCXBb/+dk747+BEmQN4bbiqb6QbCXjglS3AAz8Y4A3GaEsKeI088AZjtAX4vhbgdS3AAz8U4A3GaAvwfS3A61qAB34owBuM0Rbg+1qA17UAP3r4HiLggde2AA/8UIA3GKMlwPe2AK9skcsDbzBGS4DvbZktfB8Q8MCrW4AHfiDAG4zRDPD9LcBrW8TywBuM0Qzw/S3Aa1uAHzV8Lw/wwOtbgAe+P8AbjNEI8AMtwKtbpPLAG4zRCPADLcCrW4AHvjfAG4zRCPADLcCrW4AvFF4qD7zBGI0AP9ACvL6lLPgPR9+83zvYvN6cBT52izCp4a9eXVQv24P1y91/3u8+IsocsW6sfhvg5fDnJ8vPP5xuD9avP//z70ev6/ceVem8r8gRN9nfPFNc1zvO3y7vfj3dHqxfX317Wr+9TpQvYHa8Nll2fP2OqxPgk7QIk+UxvkZnxydqEcbmqv7zjxcPr+o3Gx74yC3C8H38QYAfapkn/AAN8MB7tcjkgTcY40GAH2wB3qMFeOATtsgC/H6AH2wB3qdFJA+8wRiHGXIBHni/FuCBT9giCvB7AX64BXivFok88AZjHAb44RbgvVqAHyP8oArwwHu2AA98whZJgN8FeEEL8F4twAOfsEUS4HcBXtACvF+LQB54gzEOArygBXi/FuCBT9giCPC7WMAL5IE3GOMgwAtagPdsAR74hC3DAX4bIxKTL6/hAL8N8JIW4D1bgAc+ZctggN8GeEkL8J4twAOfsmUwwG9i9afowBcKb/DT+5IAvwnwohbg/VoERcAbjLEXu78pH2gC3mCMvQAvagHer0XQBLzBGHsBXtQyO3iBezSS/i7gDcbYBXhZC/BeLZIu4A3G2MUSPun/5EwY4O8DvKwFeK8WSRnwBmPsArysBXivFkkZ8AZj7AK8rAV4rxZJGfAGY+xiCt/bBrzBGLsAL2sB3qtF0ga8wRi72ML31QFvMMY2EnfgF8D7tYj6gDcYYxtr+J5C4A3G2AZ4YQvwPi2iQuANxtgGeGEL8D4tokLgDcbYROQO/AJ4r5aDdFaWAv+pmbZz0XNQ4kxaDhKz0uQGW7fEg4/yBcyO1yb/jo8yxwThOzuBNxhjE+ClLfOCl7nHJuloBd5gjPsAL24BXt/yMMCHzwG8NsDXAV7cA
ry+5WGAD59jkvAdtcAbjHEf4MUtwOtbmmntBd5gjPsAL24BXt/SDPChc0wUvrUYeIMx1hG6A78A3qOlLcAHzjFV+LZq4A3GWAd4eQvw6pbWAB82B/DaAC93B34BvL6lPcCHzTFZ+JZy4A3GWAV4RQvw2pauNNqBNxijjtgd+AXw6pbOAB8yx4ThG/3AG4xRB3hNy3zg5e7AL4DXtvQE+IA5pgz/cAXAG4yxAF7ZAryupTeHSwDeYIwF8MoW4HUt/TlYA/AGYyyAV7bMBl7hnpBkfxXAG4wBvLYFeFXLYPaWAbzBGKOB31sH8AZjqNyBXwCvaxEEeK85pg+/WwnwBmOo3BOTbNYCvMEYwGtbgFe0CONMWhZhLcArWqRxJi2LoBbgFS3SAG82xrjg18sBPv0YOncDEmfSsghoAV7eookD3mAMpTvwC+AVLbo4B3zqMbTuRntRvSzglWOMFV4vD7xqjJHCf9IvDHjVGKOFV2964DVj6O9R7S67gE84xpjhlXseeEU8Lp5Nv9HSrA94eZzhjeXVotnzwMszenjNngdengnAyzc98OI4yxvLv0VID7w0zvTGCmlxq6RuEWX68KsbciLw6wzYAy/K+kacFPzifuuvf0nY0peJw29ut6nB17lXdw/v/8cI/+Hom/d7B/uvzeEPbq4pwh/Gbb8ARgh/9eqietke7L/0wDt99m6Ivo95MIbNjZW4xePWar9Jmr9lcxYp/PnJ8vMPp9uD/dfVyaMqbZ8VMgqJmC5WAfzb5d2vp9uD/dc9O177NT/WvTjDljg7HvjJtUjh/R7jzcagRdsihV9fwH/+8WIMV/WBJbQspvx9vH8JLQvgi20BvtAW4AttAb7QFuALbQG+0BbgC20BvtAW4AttAb7QFuALbQG+0BbgC20BvtCWePAtaf05vNgxKZl9i0veED1zJ7FpcckbomfuJDYtLnlD9MydxKbFmTST0cXlXgDJE5d7ASRPXO4FkDxxuRdA8sRF/L32n22RIqtnbzWf2BE1d78cHb26SN1S/c5Hb5PPslw9/62jxcUrOXh+VYJcHX17umx5KlfcktfVbfTWoKX6Kk7dUt9kJ123mIvXcvCMyvi5+61+om7Lkzej5/ytQUv7E1Hj5v//+N9J1y3m4tUcPIc6RVbwzadrx061HdO3fKj2YuqW+k6ls8XF60m84+/hk+/FD6+XNvcrJ6lbzut/vqKrxcXrSf0Yv4JP/bh490t10ZX+SuKkhk//GF8XpX+MT35Vv7qvSnwl/GG1Syyu6l9bXNVfWVzVkynF5V4AyROXewEkT1zuBZA8cbkXQPLE5V7A2HL99N3h0e7ErOJyL2C8AX7OuX3zcnn9p+Pl5eOPN8/dF+9WztXRH/5yfP30r869XJ+dX1zuBeTO2bPlmXtZ//JTdfRkBV8dXT46vv7yWf31wI6fZ66/+viv75/cfnd88/Xx8ubFu8q5er2sTtTi1X/AzzM3L35+8e+nP794V92nO/eo5q6+FoCff376259vv/v+Sb3b6zfZ8aXk0q0f5VeP7OuH9PvHeOBnndU1fX3hXt3XPzreXNX/cbvjb99wVV9QZrrPd3G5FzDC3L6pr/JyryJxXO4FkDxxuRdA8sTlXgDJE5d7ASRPXO4FkDxxuRdA8sTlXgDJE5d7ASRPfgeDZ8961OY6OQAAAABJRU5ErkJggg==" alt="plot of chunk unnamed-chunk-2"/> </p>
<p>This turned out to be not nearly as bimodal as I'd hoped. Inspecting the gender distribution shows that men make up the vast majority of the CrossFit athletes in the data:</p>
<pre><code>##
## F M
## 0.27 0.73
</code></pre>
<p>Despite the smaller share of women, I was optimistic that a mixture model would still place Gaussian components reasonably close to the mean weight for each gender. Before I fit the model, I again plotted the weight densities, this time including gender information:</p>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAAA81BMVEUAAAAAADoAAGYAOjoAOmYAOpAAZmYAZpAAZrYAv8Q6AAA6ADo6AGY6Ojo6OmY6OpA6ZrY6kNtmAABmADpmAGZmOjpmOmZmZgBmkJBmkLZmkNtmtv9/f39/f5V/f6t/lcF/q9aQOgCQOjqQOmaQZpCQkGaQ27aQ29uQ2/+VlcGVweurf6urlcGr1v+2ZgC2Zjq2kDq2tma225C2/7a2/9u2///BlX/BlavBwdbB6//Wq3/W///bkDrbkGbb25Db/7bb/9vb///l5eXrwZXr1qvr///y8vL4dm3/tmb/1qv/25D/68H//7b//9b//9v//+v///9Ozai3AAAACXBIWXMAAAsSAAALEgHS3X78AAAS10lEQVR4nO3dC3sb13GAYUipWkF2a0tWm7SkI8tJGka25Lau7Fhim4iqCIqlKPz/X5O94DJnb3OAPTPay/c9bgyQ4Hg1LxcAL1AXa5pli099APRpAn6mAT/TgJ9pwM804Gca8DMN+JkG/EwDfqYBP9OAn2nAz7Sj4C/Cqtfbuoy8Xep5gz/AYl5qWSXgjxgIvOEeEs4b/AECbzNv8AcIvM28wR8g8DbzBn+AwNvMG/wBAm8zb/AHCLzNvMEfIPA28wZ/gMDbzBv8AQJvM2/wBwi8zbzBHyDwNvMGf4DA28wb/AECbzNv8AcIvM28wR8g8DbzBn+AwNvMG/wBAm8zb/AHCHz3vNWR84BvaETwK+ATNib46ikPfI+AP2Ig8IZ7qAd8yoA/YiDwhnuoB3zKxgO/uqh+QQd8j4A/YiDwhnuoBXzSxgUfygPfI+CPGAi84R5qrcT/HjYP+IaAP2Ig8IZ7qAV80oA/YiDwhnuoBXzSgD9iIPCGe6i1Cv51yDzgGwJeBLySxx5qAZ+00cCvKv8+YB7wDR0FfxlWvd63xnmryr97D+yRybzUskqc8SLOeCWPPVRb1S4A3yfgRcAreeyhGvBpA14EvJLHHqoBnzbgRcAreeyh0qrhEvA9Al4EvJLHHioBnzjgRcAreeyhEvCJA14EvJLHHioBnzjgRcAreeyh0qrhIvA9Al4EvJLHHioBnzjgRcAreeyhEvCJA14EvJLHHioBnzjgRcAreeyhEvCJA14EvJLHHioBnzjgRcAreeyhEvCJA14EvJLHHioBnzjgRcAreewhbNV0BfgeAS8CXsljD2HApw54EfBKHnsIAz51wIuAV/LYQxjwqQNeBLySxx7CgE/dGOG314DvEfAi4JU89hAGfOqAFwGv5LGHMOBTB7wIeCWPPYQBnzrgRcAreewhDPjUAS8CXsljD2HApw54EfBKHnsIAz51wIuAV/LYQxjwqQNeBLySxx7CgE8d8CLglTz2ELRqvAp8j4AXAa/ksYcg4JMHvAh4JY89BAGfvFHCb64D3yPgRcAreewhCPjkAS8CXsljD0HAJw94EfBKHnsIAj55wIuAV/LYg6zqDnz/gBcBr+SxBxnw6WuHf3X69S/7Cx9/Oj198m7zLo89yIBPXyv8+yfv3hfS5YX332SfAc837/PYgwz49LXCv322/vDda3lh/TaHP83yOrhtK/UNdGjt8M/XH39+LS/kJ32Zxwkg44xPX/wZ/2rnDrzFPA9tUexj/Mefnu/f57EHGfDpU57Vf/j+XXnhVf7Y/mzzLo89yIBPH1/Hi4BX8tiDDPj0AS8CXsljD7IafPkW4Hs0Vfhl58CWgFfy2IMM+PRNFH4JvNJU4YNTHvh6wIuAV/LYgwz49AEvAl7JYw8y4NM3UvjiTe3zlhfBF3TA1wNeBLySxx5EDe7A9w14EfBKHnsQHQy/3P1P88C2gFfy2IMIeIOAFwGv5LEHEfAGTQ4+/7kc8HpTgy++ZQe83vTgtz+LB76zKcLvLrUNbAt4JY89iA6Er10CviHgRcAreexBB
LxBE4NfNl4Evt5Y4fM3dsOLy8DXA14EvJLHHkTAGwS8CHgljz2IgDcIeBHwSh57EAFvEPAi4JU89iAC3iDgRcAreexBFA+/bL4CfD3gRcAreexBBLxBI4BvdAe+Z8CLgFfy2MM+4C2aFPyy5Rrw9UYLn70Z+B5NGH5/Ffh6wIuAV/LYwz7gLQJeBLySxx72AW8R8CLglTz2sA94i8YLf7ECvkfAi4BX8tjDvqPhd9eBrzd8+Db3OnzVHfiOgBcBr+Sxh13AmwS8CHgljz3sAt4k4EXAK3nsYRfwJgEvAl7JYw+7gDcJeBHwSh572AW8SROCr7kD39GI4avvAf6QJg2/fRPw9YAXAa/ksYddsfAN7sC3B7wIeKXLsOr1voXzVq23C9+zbLjFsmlg/0zmpZZV4owXccYreexhF/AmAS8CXsljD7uANwl4EfBKHnvYBbxJk4Fvct++dYTwN48WD+X1H87ElRcP103dPDrZXzm/320IvGhA8NefSen11R1x9ears3VTAfz15y87DYEXDQf+9unizr8/WizuvVmfLxb386tnL7IL66vF4u5/3n2ZveFXD07W2Zserq/u/npReUt2ox/lZ4EKf/NIuYMA3mxeuOLyjM9O9Oyf7EzOL9x9efPo4dXiZH1+78353ZfXD07O8zfl7ysv79+S3ej2afPjQTP8Ov9kudN8RzJK+PLNI4V/kVtkyuVnQH7hxf38Pv9FBn/vTQac3SA7wYvPiJPgLdmNbp92n8MNd/XZncai+27CYw+7Zguf+ZZnfHby7s/4zHR/xpefFCW8fEsBf9gZf/0gP+NvHnc+M/DYw652+Ev5runBZ3e++aP2ef5If/1g+xif6VwVj+h//+gkP0Wzz44CvnyM37wlJzzwMb74jNHy2MO2dvdJw3dVPqtXZA97Vl+e6cr5Dvyn/gbOi4fZHfOi+xw95Ov4m0eLMu2099jDtkj4FvfRwq9qdZMcXNMZr+axh21zha/8roYxfGQee9gGfOF+aQl/8/jH8s7+7vge46cNn13jjBfzIuCLd4wdPr8CvJg3E/jisjF8/k0h7ft2Q4RvdZ8AfHnR+Fn9V2fZP9rX/sBbzGuD31xa5d9UXSy6vw97PPzjl9k5D7ySJ/z2Qgb/ZSr0Ovz6fHHn7Go0d/X7d04X/tIHPi6PPWybOfxKwGd39doX2vEBHwyMyw9+5XXGX43qGziTh5fP7U3htR/ibvLYw7Y5wwdfzNvCj+uHNBOHD797l4y8rHJX3/Ib25U89rCtL3z+rnHCV75dn0p8U/WufnKP8aOFr/58zhQ+Mo89bJsrfO0H8im0RcAHA+NygF/WSqEtqsDfPl3c+0vLK3T2eexh21zhKye8Mfzt04fXX7y5Gsvv3O3e2eE+DfjlpS189uVcBj+a37KNgs/eOXr47JrHGX8+qTN+AvD5FfvHeP23q4G3mBeueFlxt4aPy2MP24AvLy7X1w/y765d6a9pjQr4YGBcrvCbSxn8P/1L9tzrh39MD797Jc3ovnM3YfjthQz+i/84WV//67cmZ3zxvXrtZVfAW8xrht9/AmTw//vl+n/+aAI/shdNTh9e3ONn8H/93X///q8m8OWr6Yf05VyH+wzg5VO8df6F9m/+7dbmrr74ck79e3A89rBp1vDB13Q5/PVnZ0bwcXnsYdOc4cNv4qTQFgEfDIzLB77yXdsU2iLgg4FxucBXf0yTQls0bvjNuzvds3ePEb72c9kU2iLgw4FRTfAXMeLy2MOmucKbB3w4MCrgDfewCXijgA8HRgW84R42xcAr7sA3BXw4MCrgDfewCXijgA8HRgW84R7KutyB7xPw4cCogDfcQ1kSeP0G+4FRAW+4hzLgrQI+HBgV8IZ7KAPeKuDDgVEBb7iHsjTw60h54JU89lAWA6+rAl8P+HBgVMAb7qEMeKuADwdGBbzhHsqAt2rk8PkNgD8m4CsDY5o2/KvTr3+RFz5893r7Lo89lAFvVSv8+yfvsn/2F
96f/hb47bzI240S/u2zzTleXvj4548/F/CnWW5Ht1b/BtfsBjGvMUn9OpTx1w7/fF1Sby9s4PM8ToCyRGd85HfrOePXtTMe+GBeXKOErz7GDxY+whT4esqz+g/fv9s+qwdezItrnPAdeeyhDHirxg5/sQL+qGYCHycPvJLHHsqAt2r08DGiwNebC3zU7YBX8thDGfBWAV8dGDkvJuCP2EMZ8FYNG77THfg+AV8dGDkvJuCP2EMR8GYBXx0YOS8m4I/YQ1FC+JgbAq/ksYci4M0aO/yy+yZiIPBBwFcGxgS84R6KgDdrPvAR8sAreeyhKAJeuY0YCLwM+MrAmIA33EMR8GaNHH6p3UYMBF4GfDgwKuAN91AEvFnAhwOjAt5wD0Ux8BHywNcDPhwYFfCGeyhKCq/LA6/ksYci4M0CPhwYFfCGeygC3izgw4FRAW+4hyLgzQI+HBgV8IZ7KALerEHDK6SXJSXwxwR8MDAu4A33kJcYXpUHXsljD3nA2zUBeF0e+HrABwPjAt5wD3nA2zUreE0eeCWPPeQBbxfwwcC4gDfcQx7wdgEfDIwLeMM9XOigwPcI+GBgXMAb7uEiAn4ZdzvgGwJeDtQGVecpAX/EHi7i4dUbAl9vXvCKPPBKHnu4AN404OVAZU59XnfAH7GHC+BNGzP8ThH4w5sEvHZL4OsBLwJeyWMPFybw3fLAK3ns4QJ404AXAa/ksYcLG/hOeeCVPPZwAbxpwIuAV/LYwwXwpg0YXv0xO/A9Al4EvJLHHiLu6cW87tsGB9ghD7ySxx6Atw14EfBKHnsA3raj4C/Dqtf7tpm3Um62FJe7bxsc4LLtVvGZ/IFTyyrN74zvOOU545U89gC8bcCLgFfy2IMdfLs88EoeezgMvvvGwNcDXgS8ksceVPjlBfA9Al4EvJLHHg6E77x19QDb5IFX8thDxK9hAN8j4EXAK3nsAXjbRgu/vAC+T8CLgFfy2IMlfJs88EoeewDetsHCR/xSPfA9mid8izzwSh57OBS+6wOArwe8CHgljz0Ab9tY4Ze1eQfBN8sDr+SxB+BtA14EvJLHHozhG+WBV/LYA/C2zRa+SR54JYc9xPw1KMD3aDrwHR8CfL2Rwi8b5h0I3yAPvJLDHoA3DngR8EoOezgCvv1jgK83UPgY977wdXnglez3ALx1c4avyQOvZL8H4K0DXgS8kv0ejoJv/Sjg6wEvAl7Jfg/d8MuWeYfCV+WBV7LfA/DWTQq+7cPaDzCUB17Jfg/AWzdM+Kjndk3zmj8Q+HpjhN9qAd+jicE3fyTw9WYOH8oDr2S+h074nRXwPZo7fCAPvJL5Ho6Hb/zQzgMU8sArme8BePMGCR/n3jKv4YO7D1AbeOi8fcAfuAdfeOUupCHgjfbgDN/59WFTwBvtAXj7hggf6d42r/7h6gG2f/O/MeBt9gC8Q2ODj/h+C/AxTRC+/vH6Abb8El9LwNvsoS98bUDEATa+GK8t4E32sOr9MxXg9YAXk4FXst1Df/iqfBT8cgm8lu0eOuCjf2EmlI87wOUSeCXTPaxS/KbUMfA5fdztgLfYQwf8Aa9/COSjDzBSHniDPaw65h3ywpeVoI8/wDh54A320AF/2Gscj4Lv+P84LALeYA/t8If+zSV7+YMOMOKkBz79Hlbt8w7+K2t29/aHHSDwrRnuoR3+iL+dbCt/4AGq8sCn30MbfNMdsD5vI3/oAS7zOm4HfOo9lFAN8xodoj6RiiJuWBvYoQ982j2s2k7QFoDYvV6si8G7z4DWT4b6wPy/vNwk5kX+d8cJ/+r061/Ehd3VdQ1+ta3XHvYfHszvuNeNh99+UsnjbDjepoG7/7zQnzT8+yfvsn92F3ZX8+q7aar29ssL8Umyavmo+pjee22Fqn62XgbvaPxcjjjgyh+v8b/ccA/i472rFf7ts/WH717vLuyunmZVbhpLSEGmrmrt8M/XH39+vbuwu5oXeUJV+1Q//Br8AY7ijM/z2EPCeYM/w
EHBH/AYP/S9Dv4ABwVfPo3/8P07/Vn90Pc6+AMcFnxHHntIOG/wBwi8zbzBHyDwNvMGf4DA28wb/AECbzNv8AcIvM28wR8g8DbzBn+AwNvMG/wBAm8zb/AHCLzNvMEfIPA28wZ/gMDbzBv8AQJvM2/wBwi8zbzBHyDwNvMGf4Cjga9U/R28oc0b/AEm/wNHBPwE58UE/ATnxQT8BOfFlAKeRhjwMw34mQb8TAN+pvWHl6+06FnxIq366zeO7eNPp6dPml4QcmyvTk+fpzzA4vVpSefF1xs+eG1Vz1Gnv33d9Iqto+d9k+3zebqB2bzsUzPhAWZ/4mcp/8CH1Bs+eDVlrz7+OX89bsNrNPv09nnSgbtJSeb9/5/+71nqP3Bs/eHl66d7VsDXX5Xdo+wkTTnwVXaGppuX332knHdQAzrjN/ApT4BX3zS+zLtHb5+lm/c2/0smEs47qCE9xhfwCR/yPv6UPRVLODA7PTP4pI/J78f7GJ/y6WhxZ5f0SXh2RqUd+E3aZ+Hvx/usnsYZ8DMN+JkG/EwDfqZNGv7685fhpf0bZt+k4fcBX21K8LdPT9bXn52tr+69uXm0uPuycM4u/d0/n11//uvF4qR8K+VNCX59/nB9vjjJ//Uiu3S/gM8uXd05u37wMP984IzfNSn46y/e/PCH+7ffnt18dba+efwyc87+d529IRfP/g/4XZOCv3n84+P/+vzHxy+z+/TF4k7OnX0uAN/UpODXL37z5e23f7ifn+35Vc749qYFf7UoH+WLR/byIX3zGA98pWnBF8/p8yfu2X39nbPts/p/2J3xt095Vr9pWvDNcZ43NHX426f5s7xPfRQDbOrw1BLwMw34mQb8TAN+pgE/04CfacDPtL8BjHk3RjvTB3oAAAAASUVORK5CYII=" alt="plot of chunk unnamed-chunk-4"/> </p>
<p>This plot looked much more promising. Finally, before running the model, I used the <strong>plyr</strong> package to find the mean and standard deviation of weight for each gender:</p>
<pre><code>## gender mean.weight sd.weight
## 1 F 135.7 17.64
## 2 M 184.4 24.26
</code></pre>
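<p>One way to see why the pooled density shows so little bimodality is to rebuild it from the summary numbers above. The sketch below is not part of the original analysis: it assumes each gender's weights are roughly normal, plugs in the reported means, standard deviations and gender shares, and counts the local maxima of the resulting curve on a grid. The grid range and the names <code>grid</code>, <code>pooled</code> and <code>n.modes</code> are illustrative choices.</p>
<pre><code class="r"># Reported shares, means and sds (F, M), copied from the tables above
share = c(F = 0.27, M = 0.73)
mu    = c(F = 135.7, M = 184.4)
sigma = c(F = 17.64, M = 24.26)

# Pooled density on a grid of weights in pounds (the grid range is an assumption)
grid   = seq(80, 300, by = 0.5)
pooled = share["F"] * dnorm(grid, mu["F"], sigma["F"]) +
         share["M"] * dnorm(grid, mu["M"], sigma["M"])

# A local maximum is a point where the slope flips from positive to negative
n.modes = sum(diff(sign(diff(pooled))) == -2)
n.modes
</code></pre>
<p>With these inputs the count should come out as a single mode, sitting near the male mean. Under the normality assumption, then, the two groups are too close together (relative to their spreads and unequal shares) to produce two separate peaks, which is consistent with the first density plot.</p>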
<p>Next, I used the <strong>mixtools</strong> package to fit two Gaussians to the data. Note that the EM algorithm will not yield the same estimates on every run: the fit is iterative and, by default, starts from randomly chosen initial values. Here are a few draws from the two-component model:</p>
<pre><code>## number of iterations= 210
</code></pre>
<pre><code>## summary of normalmixEM object:
## comp 1 comp 2
## lambda 0.983262 0.0167383
## mu 170.985903 197.1966562
## sigma 30.039475 68.0108365
## loglik at estimate: -69972
</code></pre>
<pre><code>## number of iterations= 256
</code></pre>
<pre><code>## summary of normalmixEM object:
## comp 1 comp 2
## lambda 0.0167324 0.983268
## mu 197.1994160 170.986019
## sigma 68.0174677 30.039619
## loglik at estimate: -69972
</code></pre>
<pre><code>## number of iterations= 109
</code></pre>
<pre><code>## summary of normalmixEM object:
## comp 1 comp 2
## lambda 0.913756 0.0862443
## mu 170.277126 183.5844313
## sigma 32.336239 8.7492152
## loglik at estimate: -70004
</code></pre>
<p>Here, I was quite surprised. From the plot of the weight densities by gender, I was fairly sure the mixture model would work. The first few times I ran the model, though, the EM algorithm returned a mixture of two Gaussians, one centered around 170 and another around 195 (for a bit of background on this algorithm, and mixture models in general, I highly recommend <a href="http://www.stat.cmu.edu/%7Ecshalizi/uADA/12/lectures/ch20.pdf">Cosma's Notes</a>).</p>
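<p>To make the run-to-run variation concrete, here is a minimal base R sketch of EM for a Gaussian mixture, run from several random starting points. It is an illustration under stated assumptions, not the mixtools implementation: the helper names (<code>mix.loglik</code>, <code>em.gauss.mix</code>), the simulated sample (drawn from the gender summaries above rather than the real data) and the choice of starting values are all hypothetical.</p>
<pre><code class="r">set.seed(1)

# Log-likelihood of a univariate Gaussian mixture
mix.loglik = function(x, lambda, mu, sigma) {
  dens = sapply(seq_along(mu), function(j) lambda[j] * dnorm(x, mu[j], sigma[j]))
  sum(log(rowSums(dens)))
}

# Plain EM: alternate posterior weights (E-step) and weighted updates (M-step)
em.gauss.mix = function(x, lambda, mu, sigma, maxit = 1000, epsilon = 0.001) {
  ll.old = mix.loglik(x, lambda, mu, sigma)
  for (iter in seq_len(maxit)) {
    dens = sapply(seq_along(mu), function(j) lambda[j] * dnorm(x, mu[j], sigma[j]))
    post = dens / rowSums(dens)                  # E-step: n x k responsibilities
    nk = colSums(post)
    lambda = nk / length(x)                      # M-step: mixing weights
    mu = colSums(post * x) / nk                  #         component means
    sigma = sqrt(colSums(post * outer(x, mu, "-")^2) / nk)  # component sds
    ll.new = mix.loglik(x, lambda, mu, sigma)
    if (epsilon > ll.new - ll.old) break         # stop once the gain is tiny
    ll.old = ll.new
  }
  list(lambda = lambda, mu = mu, sigma = sigma, loglik = ll.new, iterations = iter)
}

# Simulated weights built from the gender summaries (an assumption, not the real data)
n = 5000
female = rbinom(n, 1, 0.27) == 1
x = ifelse(female, rnorm(n, 135.7, 17.64), rnorm(n, 184.4, 24.26))

# Five runs, each started from two randomly drawn data points as initial means
fits = lapply(1:5, function(i) {
  em.gauss.mix(x, lambda = c(0.5, 0.5), mu = sample(x, 2), sigma = rep(sd(x), 2))
})

# Sort components by mean so that label order does not matter when comparing runs
t(sapply(fits, function(f) {
  o = order(f$mu)
  round(c(loglik = f$loglik, mu = f$mu[o], lambda = f$lambda[o]), 2)
}))
</code></pre>
<p>Two things are worth checking in a table like this. Runs that finish at different log-likelihoods have settled in different local optima, so keeping the run with the highest log-likelihood is a common safeguard. And runs that differ only in which component is called "comp 1" are the same solution: the first two draws printed above are an example, with the labels swapped.</p>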
<p>When I instead moved to a model with three components, I started reliably picking up a Gaussian that's fairly close to the mean female weight:</p>
<pre><code>## number of iterations= 403
</code></pre>
<pre><code>## summary of normalmixEM object:
## comp 1 comp 2 comp 3
## lambda 0.0669021 0.144703 0.788395
## mu 191.9951136 128.521504 177.554028
## sigma 50.1322568 10.480415 24.355337
## loglik at estimate: -69672
</code></pre>
<pre><code>## number of iterations= 294
</code></pre>
<pre><code>## summary of normalmixEM object:
## comp 1 comp 2 comp 3
## lambda 0.14469 0.788445 0.0668648
## mu 128.52087 177.553775 191.9985466
## sigma 10.48005 24.356546 50.1385850
## loglik at estimate: -69672
</code></pre>
<pre><code>## number of iterations= 125
</code></pre>
<pre><code>## summary of normalmixEM object:
## comp 1 comp 2 comp 3
## lambda 0.911272 0.0719816 0.0167468
## mu 170.170622 181.9974482 194.2480171
## sigma 32.328408 7.6615752 8.4118644
## loglik at estimate: -70003
</code></pre>
<p>It's a bit difficult to tell what to make of all this. </p>
<p>The most obvious question in my mind is why the model failed to pick up the types (i.e. gender) that I expected. </p>
<p>Taking a look at my assumptions, though, I've got to admit that I know essentially nothing about CrossFit. From a bit of browsing around, I found a few blog posts that suggested certain body types would be better for different exercises. </p>
<p>Returning to the model fits, it might make sense to think of different subpopulations based on weight: burly athletes who have a competitive advantage in strength-related events, and thinner athletes who have an advantage in speed-related events.</p>
<p>I should also mention that I originally chose two components based on my expectations about weight distribution following from gender. Cosma has a nice section in the notes on using cross-validation to choose the number of components, and that might be a reasonable way to extend the analysis.</p>
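<p>As a rough sketch of that extension (not something done in the original analysis), the functions below score each candidate number of components by its average held-out log-likelihood across folds. The names <code>heldout.loglik</code> and <code>cv.mixture</code>, the five folds and the candidate range of 2 to 4 components are assumptions for illustration; the fitting call reuses <code>normalmixEM</code> from mixtools with the same <code>maxit</code> and <code>epsilon</code> settings as the code at the end of this post.</p>
<pre><code class="r"># Log-likelihood of held-out points under a fitted normalmixEM object,
# using its lambda, mu and sigma components
heldout.loglik = function(fit, x.new) {
  dens = sapply(seq_along(fit$mu), function(j) {
    fit$lambda[j] * dnorm(x.new, fit$mu[j], fit$sigma[j])
  })
  sum(log(rowSums(dens)))
}

# Average held-out log-likelihood for each candidate k (hypothetical helper).
# Assumes every fold contains more than one observation.
cv.mixture = function(x, k.values = 2:4, folds = 5) {
  fold.id = sample(rep(seq_len(folds), length.out = length(x)))
  scores = sapply(k.values, function(k) {
    mean(sapply(seq_len(folds), function(f) {
      fit = mixtools::normalmixEM(x[fold.id != f], k = k,
                                  maxit = 1000, epsilon = 0.001)
      heldout.loglik(fit, x[fold.id == f])
    }))
  })
  names(scores) = paste0("k=", k.values)
  scores
}
</code></pre>
<p>Calling <code>cv.mixture(dat$weight)</code> on the cleaned data would return one score per candidate, with higher values indicating better out-of-sample fit. Because each EM fit depends on its random start, the scores will move a little between calls, so it is probably worth repeating the procedure a few times before reading much into small differences.</p>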
<p>As a final note, I've got this voice ringing in my head warning against reifying the result: the 'types' our model uncovers are only useful insofar as they relate to other variables of interest. It'd be ambitious to conclude here that there actually <em>are</em> these two subpopulations. Instead, we might do better to use this information to think heuristically about CrossFit athletes.</p>
<p>Here's the code I used for this post:</p>
<div style="overflow:auto;"><div class="geshifilter"><pre class="r geshifilter-R" style="font-family:monospace;"><a href="http://inside-r.org/r-doc/base/rm"><span style="color: #003399; font-weight: bold;">rm</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/list"><span style="color: #003399; font-weight: bold;">list</span></a> = <a href="http://inside-r.org/r-doc/base/ls"><span style="color: #003399; font-weight: bold;">ls</span></a><span style="color: #009900;">(</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/mixtools"><span style="">mixtools</span></a><span style="color: #009900;">)</span>
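<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/packages/cran/ggplot2"><span style="">ggplot2</span></a><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/library"><span style="color: #003399; font-weight: bold;">library</span></a><span style="color: #009900;">(</span>plyr<span style="color: #009900;">)</span> <span style="color: #666666; font-style: italic;"># provides ddply() and summarise(), used below</span>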
<span style="color: #666666; font-style: italic;"># Note: data source: </span>
<span style="color: #666666; font-style: italic;"># http://dl.dropbox.com/u/23802677/xfit2011.csv</span>
dat = <a href="http://inside-r.org/r-doc/utils/read.csv"><span style="color: #003399; font-weight: bold;">read.csv</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"http://dl.dropbox.com/u/23802677/xfit2011.csv"</span><span style="color: #339933;">,</span>
stringsAsFactors = <span style="color: #000000; font-weight: bold;">FALSE</span><span style="color: #339933;">,</span>
header = <span style="color: #000000; font-weight: bold;">FALSE</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Creating gender variable:</span>
dat<span style="">$</span>gender = <a href="http://inside-r.org/r-doc/base/substr"><span style="color: #003399; font-weight: bold;">substr</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>V5<span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span>
correct.gender.codes = <span style="color: #009900;">(</span>dat<span style="">$</span>gender <span style="">%in%</span> <a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"F"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"M"</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span>
dat<span style="">$</span>gender<span style="color: #009900;">[</span><a href="http://inside-r.org/r-doc/base/which"><span style="color: #003399; font-weight: bold;">which</span></a><span style="color: #009900;">(</span>correct.gender.codes <span style="">==</span> <span style="color: #000000; font-weight: bold;">FALSE</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span> = <span style="color: #000000; font-weight: bold;">NA</span>
<span style="color: #666666; font-style: italic;">#table(dat$gender)</span>
<span style="color: #666666; font-style: italic;">#Eyeballing the data, it looks like weights are reported in either kilograms or pounds.</span>
kilog.vec = <a href="http://inside-r.org/r-doc/base/grep"><span style="color: #003399; font-weight: bold;">grep</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>V7<span style="color: #339933;">,</span> pattern = <span style="color: #0000ff;">"Kilograms"</span><span style="color: #009900;">)</span>
Lb.vec = <a href="http://inside-r.org/r-doc/base/grep"><span style="color: #003399; font-weight: bold;">grep</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>V7<span style="color: #339933;">,</span> pattern = <span style="color: #0000ff;">"Pounds"</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Defining function to excise number from string:</span>
excise.number = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>string<span style="color: #009900;">)</span><span style="color: #009900;">{</span>
<span style="color: #666666; font-style: italic;"># Note: assumes number starts the string</span>
<a href="http://inside-r.org/r-doc/base/as.numeric"><span style="color: #003399; font-weight: bold;">as.numeric</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/unlist"><span style="color: #003399; font-weight: bold;">unlist</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/strsplit"><span style="color: #003399; font-weight: bold;">strsplit</span></a><span style="color: #009900;">(</span>string<span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/split"><span style="color: #003399; font-weight: bold;">split</span></a> = <span style="color: #0000ff;">" "</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #009900;">}</span>
dat<span style="">$</span>weight = <a href="http://inside-r.org/r-doc/base/rep"><span style="color: #003399; font-weight: bold;">rep</span></a><span style="color: #009900;">(</span><span style="color: #000000; font-weight: bold;">NA</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/nrow"><span style="color: #003399; font-weight: bold;">nrow</span></a><span style="color: #009900;">(</span>dat<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
dat<span style="">$</span>weight<span style="color: #009900;">[</span>Lb.vec<span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>V7<span style="color: #009900;">[</span>Lb.vec<span style="color: #009900;">]</span><span style="color: #339933;">,</span>excise.number<span style="color: #009900;">)</span>
dat<span style="">$</span>weight<span style="color: #009900;">[</span>kilog.vec<span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>V7<span style="color: #009900;">[</span>kilog.vec<span style="color: #009900;">]</span><span style="color: #339933;">,</span>excise.number<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;">#note: weight in kilograms still needs to be converted:</span>
kilog.to.lb = <a href="http://inside-r.org/r-doc/base/function"><span style="color: #003399; font-weight: bold;">function</span></a><span style="color: #009900;">(</span>x<span style="color: #009900;">)</span><span style="color: #009900;">{</span><a href="http://inside-r.org/r-doc/base/round"><span style="color: #003399; font-weight: bold;">round</span></a><span style="color: #009900;">(</span>x<span style="">*</span><span style="color: #cc66cc;">2.20462</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">)</span><span style="color: #009900;">}</span>
dat<span style="">$</span>weight<span style="color: #009900;">[</span>kilog.vec<span style="color: #009900;">]</span> = <a href="http://inside-r.org/r-doc/base/sapply"><span style="color: #003399; font-weight: bold;">sapply</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>weight<span style="color: #009900;">[</span>kilog.vec<span style="color: #009900;">]</span><span style="color: #339933;">,</span>kilog.to.lb<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Final (Clean) data:</span>
dat = <a href="http://inside-r.org/r-doc/stats/na.omit"><span style="color: #003399; font-weight: bold;">na.omit</span></a><span style="color: #009900;">(</span>dat<span style="color: #009900;">[</span><span style="color: #339933;">,</span><a href="http://inside-r.org/r-doc/base/c"><span style="color: #003399; font-weight: bold;">c</span></a><span style="color: #009900;">(</span><span style="color: #0000ff;">"weight"</span><span style="color: #339933;">,</span><span style="color: #0000ff;">"gender"</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Weight density:</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>dat<span style="color: #339933;">,</span>aes<span style="color: #009900;">(</span>x = weight<span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_density<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
p
<span style="color: #666666; font-style: italic;"># Gender Distribution:</span>
<a href="http://inside-r.org/r-doc/base/round"><span style="color: #003399; font-weight: bold;">round</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/prop.table"><span style="color: #003399; font-weight: bold;">prop.table</span></a><span style="color: #009900;">(</span><a href="http://inside-r.org/r-doc/base/table"><span style="color: #003399; font-weight: bold;">table</span></a><span style="color: #009900;">(</span>dat<span style="">$</span>gender<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Weight density by gender:</span>
p = <a href="http://inside-r.org/packages/cran/ggplot"><span style="">ggplot</span></a><span style="color: #009900;">(</span>dat<span style="color: #339933;">,</span>aes<span style="color: #009900;">(</span>x = weight<span style="color: #339933;">,</span> color = <a href="http://inside-r.org/r-doc/base/factor"><span style="color: #003399; font-weight: bold;">factor</span></a><span style="color: #009900;">(</span>gender<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span> <span style="">+</span> geom_density<span style="color: #009900;">(</span><span style="color: #009900;">)</span>
p
<span style="color: #666666; font-style: italic;"># mean and sd for weight by gender:</span>
ddply<span style="color: #009900;">(</span>dat<span style="color: #339933;">,</span>.<span style="color: #009900;">(</span>gender<span style="color: #009900;">)</span><span style="color: #339933;">,</span>summarise<span style="color: #339933;">,</span> mean.weight = <a href="http://inside-r.org/r-doc/base/mean"><span style="color: #003399; font-weight: bold;">mean</span></a><span style="color: #009900;">(</span>weight<span style="color: #009900;">)</span><span style="color: #339933;">,</span> sd.weight = <a href="http://inside-r.org/r-doc/stats/sd"><span style="color: #003399; font-weight: bold;">sd</span></a><span style="color: #009900;">(</span>weight<span style="color: #009900;">)</span><span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Mixture model with 2 components:</span>
weight.k2 = normalmixEM<span style="color: #009900;">(</span>dat<span style="">$</span>weight<span style="color: #339933;">,</span>k=<span style="color: #cc66cc;">2</span><span style="color: #339933;">,</span>maxit = <span style="color: #cc66cc;">1000</span><span style="color: #339933;">,</span> epsilon = <span style="color: #cc66cc;">0.001</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/summary"><span style="color: #003399; font-weight: bold;">summary</span></a><span style="color: #009900;">(</span>weight.k2<span style="color: #009900;">)</span>
<span style="color: #666666; font-style: italic;"># Mixture model with 3 components:</span>
weight.k3 = normalmixEM<span style="color: #009900;">(</span>dat<span style="">$</span>weight<span style="color: #339933;">,</span>k=<span style="color: #cc66cc;">3</span><span style="color: #339933;">,</span>maxit = <span style="color: #cc66cc;">1000</span><span style="color: #339933;">,</span> epsilon = <span style="color: #cc66cc;">0.001</span><span style="color: #009900;">)</span>
<a href="http://inside-r.org/r-doc/base/summary"><span style="color: #003399; font-weight: bold;">summary</span></a><span style="color: #009900;">(</span>weight.k3<span style="color: #009900;">)</span></pre></div></div><p><a href="http://www.inside-r.org/pretty-r" title="Created by Pretty R at inside-R.org">Created by Pretty R at inside-R.org</a></p>
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-45008011723244718452013-03-19T18:04:00.000-07:002013-03-19T18:04:25.848-07:00Behavioral Economics and Beer... highly correlated<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
<base target="_blank"/>
<style type="text/css">
body, td {
font-family: sans-serif;
background-color: white;
font-size: 12px;
margin: 8px;
}
tt, code, pre {
font-family: Consolas, 'Lucida Console', 'DejaVu Sans Mono', 'Droid Sans Mono', Monaco, monospace;
}
h1 {
font-size:2.2em;
}
h2 {
font-size:1.8em;
}
h3 {
font-size:1.4em;
}
h4 {
font-size:1.0em;
}
h5 {
font-size:0.9em;
}
h6 {
font-size:0.8em;
}
a:visited {
color: rgb(50%, 0%, 50%);
}
pre {
margin-top: 0;
max-width: 95%;
border: 1px solid #ccc;
}
.r {
background-color: #F8F8F8;
}
blockquote {
color:#666666;
margin:0;
padding-left: 1em;
border-left: 0.5em #EEE solid;
}
hr {
height: 0px;
border-bottom: none;
border-top-width: thin;
border-top-style: dotted;
border-top-color: #999999;
}
/*
* highlight.styles.css
*
* RStudio style for highlight.js in HTML preview. Initial template based
* on highlight.js VS style by JasonDiamond, tweaked to look more like
* the default RStudio TextMate theme.
*
* Copyright (C) 2009-12 by RStudio, Inc.
* Copyright (C) Jason Diamond <jason@diamond.name>
*
* This program is licensed to you under the terms of version 3 of the
* GNU Affero General Public License. This program is distributed WITHOUT
* ANY EXPRESS OR IMPLIED WARRANTY, INCLUDING THOSE OF NON-INFRINGEMENT,
* MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Please refer to the
* AGPL (http://www.gnu.org/licenses/agpl-3.0.txt) for more details.
*
*/
pre code {
display: block; padding: 0.5em;
}
pre .operator,
pre .paren {
color: rgb(104, 118, 135)
}
pre .literal {
color: rgb(88, 72, 246)
}
pre .number {
color: rgb(0, 0, 205);
}
pre .comment,
pre .annotation,
pre .template_comment,
pre .diff .header,
pre .chunk,
pre .apache .cbracket {
color: rgb(76, 136, 107);
}
pre .keyword,
pre .id,
pre .title,
pre .built_in,
pre .aggregate,
pre .smalltalk .class,
pre .winutils,
pre .bash .variable,
pre .tex .command {
color: rgb(0, 0, 255);
}
pre .string,
pre .title,
pre .parent,
pre .tag .value,
pre .rules .value,
pre .rules .value .number,
pre .ruby .symbol,
pre .ruby .symbol .string,
pre .ruby .symbol .keyword,
pre .ruby .symbol .keymethods,
pre .instancevar,
pre .aggregate,
pre .template_tag,
pre .django .variable,
pre .addition,
pre .flow,
pre .stream,
pre .apache .tag,
pre .date,
pre .tex .formula {
color: rgb(3, 106, 7);
}
pre .ruby .string,
pre .decorator,
pre .filter .argument,
pre .localvars,
pre .array,
pre .attr_selector,
pre .pseudo,
pre .pi,
pre .doctype,
pre .deletion,
pre .envvar,
pre .shebang,
pre .preprocessor,
pre .userType,
pre .apache .sqbracket,
pre .nginx .built_in,
pre .tex .special,
pre .input_number {
color: rgb(43, 145, 175);
}
pre .phpdoc,
pre .javadoc,
pre .xmlDocTag {
color: rgb(128, 159, 191);
}
pre .vhdl .type { font-weight: bold; }
pre .vhdl .string { color: #666666; }
pre .vhdl .literal { color: rgb(163, 21, 21); }
@media print {
* {
background: transparent !important;
color: black !important;
filter:none !important;
-ms-filter: none !important;
}
body {
font-size:12pt;
max-width:100%;
}
a, a:visited {
text-decoration: underline;
}
hr {
visibility: hidden;
page-break-before: always;
}
pre, blockquote {
padding-right: 1em;
page-break-inside: avoid;
}
tr, img {
page-break-inside: avoid;
}
img {
max-width: 100% !important;
}
@page :left {
margin: 15mm 20mm 15mm 10mm;
}
@page :right {
margin: 15mm 10mm 15mm 20mm;
}
p, h2, h3 {
orphans: 3; widows: 3;
}
h2, h3 {
page-break-after: avoid;
}
}
</style>
<script type="text/javascript">
</script>
<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script type="text/javascript">MathJax.Hub.Config({tex2jax: {processEscapes: true, processEnvironments: false, inlineMath: [ ['$','$'] ], displayMath: [ ['$$','$$'] ] }, asciimath2jax: {delimiters: [ ['$','$'] ] }, "HTML-CSS": {minScaleAdjust: 125 } });</script>
</head>
<body>
<p><strong>Short</strong>: <br/>
I plot the frequency of Wikipedia searches for “Behavioral Economics” and “Beer” – who knew the correlation would be 0.7!</p>
<p>Data reference:<br/>
Daily search counts for any Wikipedia term (back to 2007) are available at <a href="http://glimmer.rstudio.com/pssguy/wikiSearchRates/">http://glimmer.rstudio.com/pssguy/wikiSearchRates/</a>. The website lets you download hits per day as a csv file, which is what I've done here.</p>
<pre><code class="r"># Behavioral Economics and Beer:
# Author: Mark T Patterson Date: March 18, 2013
# Clear Workbench:
rm(list = ls())
# libraries:
library(lubridate)
library(ggplot2)
</code></pre>
<pre><code class="no-highlight">## Find out what's changed in ggplot2 with
## news(Version == "0.9.1", package = "ggplot2")
</code></pre>
<pre><code class="r">
# data:
curr.wd = getwd()
setwd("C:/Users/Mark/Desktop/Blog/Data")
ts = read.csv("BehavEconBeer.csv", header = TRUE)
setwd(curr.wd)
# cleaning the dataset: str(ts)
ts$date = as.character(ts$date)
ts$date = mdy(ts$date)
</code></pre>
<pre><code class="no-highlight">## Using date format %m/%d/%Y.
</code></pre>
<pre><code class="r">ts = ts[, -1]
</code></pre>
<p>Note: the <em>mdy</em> function is in the lubridate package, which cleanly handles time/date data. I've also dropped the first column of data, which just holds row names inherited from Excel.</p>
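<p>If lubridate isn't available, base R can do the same conversion. Here is a minimal sketch, assuming the dates in the csv are month/day/year strings (as the message above suggests); the two sample strings are made up for illustration:</p>
<pre><code class="r"># Hypothetical date strings in month/day/year form
raw = c("3/1/2013", "3/18/2013")
# as.Date() needs the format spelled out, whereas mdy() guesses it
parsed = as.Date(raw, format = "%m/%d/%Y")
parsed
</code></pre>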
<pre><code class="r">p = ggplot(ts, aes(x = date, y = count)) + geom_line(aes(color = factor(name)),
size = 2)
p
</code></pre>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAABIFBMVEUAAAAAADoAAGYAOjoAOmYAOpAAZpAAZrYAv8Q6AAA6ADo6AGY6OgA6Ojo6OpA6ZmY6ZrY6kNtmAABmADpmAGZmOgBmZgBmZjpmZmZmtv9/f39/f5V/f6t/lcF/q9aQOgCQOjqQOmaQZpCQ27aQ2/+Vf3+Vf5WVf6uVlZWVlauVlcGVq6uVq9aVweurf3+rf6urlZWrlaurq5Wrq6urwcGrwdarweur1v+2ZgC2Zjq2tma2/7a2///BlX/BlZXBlavBq8HBwdbB6+vB6//Wq3/Wq5XWwZXW/+vW///bkDrbtmbb29vb/7bb///l5eXrwZXrwavr1qvr68Hr/+vr///y8vL4dm3/tmb/1qv/25D/68H//7b//9b//9v//+v////XfTJvAAAACXBIWXMAAAsSAAALEgHS3X78AAAbX0lEQVR4nO2dC3sbWVKGlcAQZYdZcHZYsHaScMnAMKzIclsHCBOW8YINjARsJMtO4v7//4K+SX1uVX0uddQt9/c9fuTu01Wn2t+r6m5ZLXtWQJPUbOgdgIYRwE9UAD9RAfxEBfATFcBPVAA/UQH8RAXwExXAT1QAP1EB/EQF8BNVLPiVJmP1oA0xTsWzSVGVxHaNinePizLKIoBPjAd4CXfZpKhKAE8I4BPjAV7CXTYpqhLAEwL4xHiAl3CXTYqqBPCEAD4xHuAl3GWToioBPCGAT4wHeAl32aSoSgBPiAa/XixeFMXl4mff64+N8rjLJkVVAnhCJPhdRf1i9/y98dVuzuMumxRVCeAJkeDXf1d1/HpZfHx9pT6WmxaljriLUA6R4C/Ljl9frC+K+3dX6mO7OU9bsUlRldDxhOiOvygP90tnx1fK4y6bFFUJ4Amx5/g1zvG98Q8OfHVVv8RVfW/8wwPPK4+7bFJUpehd2263XvEAH+NuUFJUpdhd224N8gDfSMTdsKSoSgBPCODd8QBPSMTdsKSoSkngVfIA30jE3bCkqEoATwjg3fEAT0jE3bCkqEpmwoEmwMeJdbfXrcHAdzi9wCvkAb4R526/WwA/uADeGb8FeEKcu/1uRYGvOQC8kE4HfAOiUFZ9KumlFJxGwmGLDr6rAfCNGHd7XZEA73vR7Qm+Aw3wvBh3PXBEg98KgTffe0sGb8wnyiiLpgnezE4Gb04oyiiLTg38Vl07LMqB3xrgD1EA34hx13Klh4Znkk6iW24WvMAfpjA5dZzjwFtPJFFGWXSi4JXlZikI/NYEpTwdAJ4X7a7lSg8NzyQdxbjAW9wB3oOGZ1IP+K0zaRUOfmtErnouIgCecqWPhmdSH3iKfAB4BaAa1XwD+Ea0u6YrfTQ8kyjwTpBEKQq8jtisBvCqaHf7GQ4FXudugl8Fg9/uTwEA78dQDvw2A3jlGdB8o8B34wDvx3C04M3WV8A7XjYA/PHBK8vZwW8V8FoiwJ8UeFeCL3g90RjXq45epwXeIhICfkuB79Y8wasXgBsXd4AnxsPBqyhSwLsSDJauUAO8ctSZGviNJmO1V6Hxm01jsvatXlTR9JQyc7fGBmXJCm22a6vduKu6KKQcOvGOb/vP0XN2KaVVtYRtt6J1M9fx3Sljah3vB3KU4I3l/Rx9ZQ5bdfBbgPdlyMTTSTx4hnyhzUCC7y9jbgV4BmQm8Ab3APD2ytY1DvCk/ED2gtcca530BK9dY1veH1Z7wOvcDfD6Sd24AgD4VTx4G4YFXgXU82pez2rXs4N3XmGIMsqi8YBXLdYJioDXp94a8waBN041AJ8LvPv0qqS4+TnAW9MGgN8CfCUR8CpnvblYEiYXgl88eDdhgK8kDX6/6AS/JcB3EziwAjyrocB3lI1XVS4S9sHdC/xWCjz7fFBqAXwveMVJBbzGUkNog9cBupOUXTPoCIHXSwG8J3jFM/MpYF/xWZ5HgDcBu
1//29UAvlMO8Jr1fuBXdr7BDuDdGga8m6e+0Q+8ARLgfTUC8CaAjeKlyf3I4PVrS4AvpMFr5m2UVRqMY0InOzf4/Xoc+O5lA8C7FQReSUoBr0283zUrJwS8XbZbAXin8oF3TaguHqIKc7uW0PfEAnhTxs9JAO75qPveNN27jem3K9U1oxMpwLs1HvBqkmarM9U1oxIA8H16IODtk0MieGNagN/L+DndgE8f/EqdFeALWfCGdUngt2ZqIPiVe36A38v4Od2Aez8G5+IuAF4diANP7rK2aiwqEmWURYOBb46iBPjQhtQTbPB2UjPgmCsOvBkvyiiLBgSvGZoH/FYF74iPBe94FaFLlFEWjRq8w9SCBW91ZSR49/QAXxwTvFGJA+9iA/BuDQDeakuAH0BDgScM3SclgDdGQsEXrtmta4tuZQPwDvWCd708kwWvXeKbO+Eq4P5JJgT+clH9u/gM/z9eADwJRt2kjIiDXx2WHxz4++8qyLvn742vdrOXW8OC72A5jhKp4Lu9fWjgP/7i68WLYr0sPr6+Uh/LTYtSCSVrp7QVZV2Lobe4J7U2tSOOHLoAs9PuWN8pxiUS/O6rq2J9sb4o7t9dqY/tZq82Sex4V5OFdPyh1R05ch3vDD8SvQSxF3e7pbPjK3m5JQHedHTltJ8AT/MlK0we/G5ZlB0/6DneDd4tIhzgKbFX9ctC4qreefw1/TeSRMHzW/wKTAg8Lz8cG8cVtb4O8AMpN3jbMCd408pg8Bt3ODkPwEfmheEYDDzJF+Aj87zcigcfymVfyRqnuAN8ZJ6XWw8IPPGuHcAzbiWBtxwlKgF8oCYC3m/Xen8UV2FntCijLDoSePNNMzPCslIKvOsNGm3XfAsAfKMAtzrLtoe3yswQy8o48NQukLtmjwM8rwC3DpY1CyMBv3EPA3yPvNzquA8HnriVKvgPawN8Iy+3DNI0eNtKOfBUAjEO8Ly83HKBr1WYMZaVweCJt2vFwFOnBoB36KjgCYmBDxoXZZRFIwdvOxrkf0QCwPPycWsL8ONVdvArgB+ljgh+6wRvX5QB/DE0PHi3lQCfWUcArz0BAH4kygh+mwKe+l06wAspN3hlAeDHJIBPjAd4SwA/Zo0bvMPRIP8jEgCel4dbNHjSXcVKgM8rgE+MB3hTW4AfszKDV5cAfkyKBb/RZKzWqsmpSx14ZzyZ3luJVWhCcAFimrFLqOMd3am0rNLr3h1P3FGDjheSDHjXcTkVvFsALySAT4yfNPitP/jmCA7wgysb+C0BnnWXtZjNBPhASYDfkuA3WgDAj0gAnxg/YfAmVWUU4EcrwXM8Dd56bgD84JL7BQ7Aq7OMXgCfGA/wOvitBt78KBzAD66s4FcAP1odF3yfu6zFbCbABwrgE+MBngW/AviRKRP4LcCPXFK3Xhnk21WAH62OBX4F8OOSLPg92y3Aj12i4Lca+BXAj1ji4NU76TVX1FMBwA+uLOC3LvA+7rIWs5kAHyh58OqfOwL40QrgE+MB3lSMu6zFbCbAB0rss3Mu7gA/XgF8YjzAA7w6y+iVC3ycu2wSmwnwgcoDPtZdNonNBPhASYPXf0MH8KMVB34d8m/EjV6Pc5dNYjMBPlAM+N1iWeyevze+2o2WWwCvzTJ60eDvvv3fZdX0H19fqY/llkUpK7wFn3FXIUmR4EvGuxL8RXH/7kp9bDdbbYKO12YZvUjw66qvl86Or2S7BfDqLKMXd3G3CznHA7w2y+jVA97/qt75d3AAfrSS+wOHAK/OMnoBfGI8wAO8OsvoJfi3bG3uAD9e5f3/8QA/WgF8YjzAS7jLJkVVAnhCAJ8YD/AS7rJJUZUAnhDAJ8YDvIS7bFJUJYAnBPCJ8QAv4S6bFFUJ4AkBfGI8wEu4yyZFVQJ4QgCfGA/wEu6ySVGVAJ4QwCfGA7yEu2xSVCWAJwTwifEAL+EumxRVCeAJAXxiPMBLuMsmRVUCeEIAnxgP8BLusklRlQCeEMAnxgO8hLtsU
lQlgCcE8InxAC/hLpsUVQngCQF8YvzUwG80Gau9Co1PyDzerunTjF3o+MT4qXV8HnfZpKhKAE8I4BPjAV7CXTYpqhLAEwL4xHiAl3CXTYqqBPCEAD4xHuAl3GWToioBPCGAT4wHeAl32aSoSgBPCOAT4wFewl02KaoSwBMC+MR4gJdwl02KqgTwhAA+MR7gJdxlk6IqATwhgE+M9wL/4dnsTF3/x1fKypuzgtb1k0hAfQL4xHgv8Lc/UkkXN4+U1Q8/eVXQuv38bSShHgF8YrwP+E8vZ4/+7Nls9tkPxfVs9qRaffWmXChuZrPH//T47c3jn84ev/1QR1w/+t3Z77ysQsuIs/JYcZ6K2C2AT4wP6Piy0cuvEmW1UIE+u5mdF9ef/VCu3z49byKuy22z83L1uoo4//SSOxEkCOAT473Blw1cYi07ueFbtfST6pj/pgRfM24jygNA/Two18qW//Qy00ke4BPjfcG/qRu7/Cp7uOv4Evz1HnwTcQDfPEeO1PEfvnx7eOSVx102KarSaMCXp/PfKg/n19V5/Pbp/hxfgr+pzvEV+CbiAL68EihDj3KOr64uatXPNV553GWToiqNAzynUVzV+/R6ozzusklRlcYPHq/j+5KiKp0A+GFkgL+pD/WPcY73jvcDP75PWBmHeu9LiTzusklRlQCeEM7xifEPAzx/oaEqj7tsUlQlgCdkHupxjg+MfxjgVV0uFhch/0Zcwl02KaoSwBMiwe9eFB9fX+2evze+2s153GWToiqNBPxWUwv+9mn12/g8XHvFHepL8Oul+ViOL0oNsKunLDf4H/9QfPpj7vd2GeXq+OvmWXi5WBbri+L+3ZX62MbkaSs2KarSqDu+Av+n7RvvzeP1Fx7XVyJygT+8qFsvnR1fKY+7bFJUpVGDf1ofXW+elL123jxm+w2tJRf4m+pZt1tW4HGO74tP7Pjizfl1/cZ7+3i0U77zHF//+q68qn+Bq/re+ISr+hb8TQ27eRwKvL/yuMsmRVUaNfjqUP+kPrvPzptHgKf0kMAPKgN8dduHz30YAM+Pm3aNHXxzh9c17sDxj38Y4HHP3UTBo+OnegcOzvHT7Hh/5XGXTYqqBPCEAD4x/oGArz/M9cjjDaM87rJJUZUAnpBxVV/f3O9zE38ed9mkqEoAT8i8qq8/s4mrev94P/BzTRV47S6M29834u1X1M1InTXrvRXa4wW5caiv36XxeUs4j7tsUlSl8YJX78KwwFMs6/d1RISLu8T4JPD7uzBuf+9Z+6cQnlR/JuX2x/9dYq7Xqjsz6gUdfP3Zy3r89ounh9RqmrM31Ucty+Aq5D/powPAJ8YnHer3d2GUV1Vl95d9Xz6WI9fnJblqy5vqzoxmWDnUP35b8b/+og4oF5ukavnz5nP2VXodMntV/BfA+yUcs+MPd2FUh/qy06+rv4zw4ct//vJtSe76vHqTvnqfth1WOr5+874JaFIPy5/+5G0Dvg4pnyfU+7wAnxgfe1Wv3YXR0jurz/rXf3RWdB1/VjTDGvjq282s6fI69Wa/fABfhzyhLwoAPjE+Grx6F0ZDrxx69Ifn9d9N+XA4x58VzbB6VX9WHQTac397sNgvH8DXIT/FOd47YZqv4/2Vx102KaoSwBMC+MR4gJdwl02KqgTwhAA+Mf6B3IjhrTzusklRlQCeEMAnxgO8hLtsUlQlgCcE8InxAC/hLpsUVQngCQF8YjzAS7jLJkVVAnhCseCN30eYv6DoUWh8Qubxdk2fZuxCxyfGT63j87jLJkVVAnhCAJ8YD/AS7rJJUZUAnhDAJ8YDvIS7bFJUJYAnBPCJ8QAv4S6bFFUJ4AkBfGI8wEu4yyZFVQJ4QgCfGA/wEu6ySVGVAJ4QwCfGA7yEu2xSVCWAJwTwifEAL+EumxRVCeAJAXxiPMBLuMsmRVUCeEIAnxgP8BLusklRlQCeEMAnxgO8hLtsUlQlgCcE8InxAC/hLpsUVQngCQF8YjzAS7jLJkVVA
nhCAJ8YD/AS7rJJUZUAnhDAJ8YDvIS7bFJUJYAnBPCJ8QAv4S6bFFUJ4AmR4O//ZbF4/h7/P74v/sGB370oSV/snr83vtrNedxlk6IqATwh9lC/vlgvi4+vr9THcnhR6ki7B+USB75s+vVFcf/uSn1st+VpKzYpqhI6nhAD/rI82Ds7vlIed9mkqEoAT4i5uLsoH3GO74t/cOAvqzP5Elf1ffEPDnyP8rjLJkVVAnhCAJ8YD/AS7rJJUZUAnhDAJ8YDvIS7bFJUJYAnBPCJ8QAv4S6bFFUJ4AkBfGI8wEu4yyZFVQJ4QgCfGA/wEu6ySVGVAJ4QwCfGA7yEu2xSVCWAJwTwifEAL+EumxRVCeAJAXxiPMBLuMsmRVUCeEIAnxgP8BLusklRlQCeEMAnxgO8hLtsUlQlgCcE8InxAC/hLpsUVQngCQ0Ifj6fByVFVQJ4QsOBn88J8gB/DAF8YjzAh7oL8IMK4BPjAT7UXYAfVACfGD818BtNxmqvqvga/DwwMbhS1K4JSBRSDg3W8Q13V8uj448hgE+MB/hAdwF+WAF8YjzAB7oL8MMK4BPjAT7QXYAfVgCfGA/wge7OSfIAfwwBPBFv7hrAN0p2d+TgrX0D+Eap7s5PAby6cwDfKNVdgB9YAO+Ot/YO4Bulunsi4Oe98QAf5i7ADyyAX+kv3Qzwc1d8f2FRRlk0NvDlSD7wRAcr+7FfmLPg+48EooyyaHDwOnniIOCo5PkyW901reDGGNbLNwvq3rifKADvVih48vBvVep5tdVtLbSxLm1jDq/UpwAD3usUIMooi0YFnhhzVZp3aKxK2iyFOdpuscBrxdsFxxFCf54cxo3dFmWURWMCP3cMms+Dwo6dG5X0SVwJCjCbuwVee2YpYc0jwPuDP7jmYOwe5MGrHK2nj3mVbiTY3FXwSnkiHuD9wZs9YzDWUZvPBDf4uSf4brMOUp8O4BnJg9977Aa/H3U38Nx9yjbBqxE6MAp8tzfUEQLgpcBr5C3A1u9XDPD75MMkfuBXyjcN/GEiLd4Cb3AHeH/wHW5v8A6OFHh1TuuFu3l96QA/B/hW8eAVija/OskCo8hxn4QNXp2vsOLt1+sceOOQAvDC4BXYFng16TjgtWEXYIU8wIuBJ8AoIeS4PaML/Mo5fzB4sxjA94Nfqf4ZzlHgrXOzhdF1+k4Hrx5S1ERi908f/MfXV0WGfyPO9OchSWusYPBaOAt+bowz4F2AiSfo6rTB7xZfXRW75++Nr3arDjMZvErABG+kBh4JrOnVdZPXfoMnePrIcRx4KSLB3//q/t1VsV5Wja8+lpsWpaILtn2orGgL+rgxrI/PrWEjYT6n5tkPWOPtiGt3uP10TDR2MYf6GvxF9U19bDfqXRzQ8Y1FXMfQHc8eCRwdOafmcR/Suw2u+V2dTc5/FHZJ6gPv6vhKOk0SvGWsA7zp3IY+FHPgg8CEgScP6dR+nj74pHP8fG4bq4PXACjgVf89wIeDIeYPBU8+gY7CLkk94BOu6ucH2cNO8PsxA7w9pz94EgwxP8D3SrfLAb7jLgieMnq/btWTBE/PA/CmVTb5uQu8HpUKnuDoOHRQ4L3rAvzKFAt+5QJ/CMkE3gtwFHhHAVFGWZQVvPrKTBvPAp4Gkw88taOijLJICLzTj7n6ykwzyg88CcDewIPhAK9WPhsAfi+HW46BjqG+wXXP8yFXB+8qFAoG4G0JgreP6CpDNdAFvsvtBR8MxgGY4s6BJ59x9rAooywSBW8e0Ztf2arb2iXq4y21NsQ4Z3QveIqvOU4VpsAT8aKMskgW/Fxb3/+u/mjgqQ0x4F3DAE+C1/naN50fAbwzweJLjhdMOMAb4OeqOpuMm9GVBRu8MtuGGGd8pl+2OZ8M2j4ZIt5v4uefNHj1XG6AP6zvv4uD7wFDkXSMAzwvl4lWY2cE7w3G+hWDsUumqHeYqfk3zmFRRlkkCf4A/OCEN3jNStVnyn9fLuytAq7xSPDmLKOXKHjrV
ZXz8y2rAPA2F2I8HDwxHgfemmX0Og548+WW9fdJLCSx4O0Ea0Sp4lAw+JWz6uiVA3xnkAu84S4BnrrqLtzD+cFviAqT/lMoOnjFH1/wDivjwDsQOMFEgG+L+M0jyiiLZMGbF9ep4B2OuoeJa6yVLHj/cVFGWSQBfu4JvtuQAN55SqWusVYAT0kMvLLYrTr+QInlrsWrBzzhP8CHaXjwlJWBHAnuAE9IGPxKDrxbNBeAD1Im8JYr+hMiB3ihBIDnpf6UPuD1l2cAP7gAPjEe4F0rJvg+d1mL2UyAD9TRwGsXXwA/uATAz/3A+7jLJrGZAB8oKfDaarS7bBKbCfCBigW/6VSD3wSp6A+RygxNiN81fZqxS77jU9qKTWIz0fGBknp3TsZdNonNBPhADfZPhWkB/DEE8InxAC/hLpsUVQngCQF8YjzAS7jLJkVVAnhCAJ8YD/AS7rJJUZUAnhDAJ8YDvIS7bFJUJYAnBPCJ8QAv4S6bFFUJ4AkBfGI8wEu4yyZFVQJ4QgCfGA/wEu6ySVGVAJ4QwCfGA7yEu2xSVCWAJwTwifEAL+EumxRVCeAJAXxiPMBLuMsmRVUCeEIAnxgP8BLusklRlQCeEMAnxgO8hLtsUlQlgCcE8InxAC/hLpsUVQngCQF8YjzAS7jLJkVVAnhCAJ8YPwXwYf8/PsZdNimqEsATCgC/e/6+/GpX8rjLJkVVAnhCAeDXy+Lj66tyYVEq2w5Bx1EI+Ivi/t1Vu5KnrdikqEroeEIxHV8pj7tsUlQlgCeEc3xi/ATA46refzwDKWHhdXxiPMBLuMsmRVUCeEIAnxgP8BLusklRlQCeEMAnxgO8hLtsUlQlgCcE8InxAC/hLpsUVQngCQF8YjzAS7jLJkVVAnhCAJ8YD/AS7rJJUZUAnhDAJ8YDvIS7bFJUJYAnBPCJ8VMDr4u6BY/6L1zsLXvsv+4KrSS2a1T8+P/RmFt5wUvFJ2Qeb9dOSwCfWuBEBfCpBU5UAJ9a4EQlAx46OQH8RAXwExXAT1TR4H/9vTnSft6i+5hVT3wTeblQPqZh6e7rxWKxbJd/7lHJnUCXuvu6Ct4Z4/yP8iAkB779hNVu8ZUn+Dry/juaeqGx65a5Ss4EptTdX/5NOcmvv9E29fwoD0Ip4MsOelHcffP1ov5EXfOZyvtfdZ+o5eObyI+/+LocJNWyu6y6+O6v/roJ5So5E5hSd9/++0Vx9w/ftXu3/qZi3fOjPAglgC89Lpvo7tv3TSvtP0VNgjfim8hdaXSZSak6cn91tXtRTX/3F1fF5UVPJWcCU+ru2//7efE///Zdu3fr5pnF/ygPQikdv67OmVWL1Yfx/aeo6Y7X47vI3ZKs0jTwuvpTDMu7X7ahXCVnAlPq7tvfvPuPf/3Nd+3erevtPT/Kg1Ac+MuLqjuWdQfvQe4/Re1yyxXftuGyYDu+5tjgKo8VTQNzlZwJTKkyZv33vzzsXQOeK/BQFAf+rjpblg8/+9uLDmR7KUwcgO34JvLycBHuLHQ4ZS8uqquDF0VfJWcCXaoEf/fn31dPyHrvGvBcgYcivI6fqAB+ogL4iQrgJyqAn6hOHfynl+ft0u3nb4fckVMTwE9UJw3+w7PZb//BeXH7dDY7L5cfv60fht6r09BJg39zVtyUyH/yqmr3quPLgesnQ+/VaeiUwX/48m3RHurLxRJ89RSoBqF+nTL4+qz+pgT/ZlYe4Svwz2az2aNXQ+/XSeiUwbcd/+HZeXuoR7f765TBt+f4qvFvf/SqPcfffPbD0Lt1Ejpp8J9e1lf117Pq26eX9VU9jvR+OmnwULwAfqIC+IkK4CcqgJ+oAH6iAviJCuAnKoCfqAB+ovp/es96ibG1B5cAAAAASUVORK5CYII=" alt="plot of chunk unnamed-chunk-2"/> </p>
<p>It turns out the pattern we observe isn't at all unique – many variables follow (predictable) patterns of variation through the week. This doesn't necessarily mean, though, that the correlation between beer and behavioral economics is entirely spurious!</p>
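<p>The code above plots the two series but never computes the correlation or looks at the weekly pattern directly. Here is a rough sketch of both steps in base R. It runs on a made-up data frame (<em>toy</em>) with the same three columns as <em>ts</em> (date, name, count); the values of <em>name</em> and all of the counts are assumptions rather than the real data, so the number it prints is not the 0.7 reported above.</p>
<pre><code class="r"># Made-up stand-in for the downloaded data: four weeks of daily counts,
# with both series dipping on weekends
set.seed(1)
dates = seq(as.Date("2013-01-07"), by = "day", length.out = 28)
weekend = as.POSIXlt(dates)$wday %in% c(0, 6)  # 0 = Sunday, 6 = Saturday
toy = rbind(
  data.frame(date = dates, name = "Beer",
             count = 500 - 100 * weekend + rnorm(28, sd = 10)),
  data.frame(date = dates, name = "Behavioral economics",
             count = 200 - 40 * weekend + rnorm(28, sd = 5)))
# Line the two series up by date, then correlate the daily counts
beer = toy[toy$name == "Beer", c("date", "count")]
behav = toy[toy$name == "Behavioral economics", c("date", "count")]
both = merge(beer, behav, by = "date", suffixes = c(".beer", ".behav"))
r = cor(both$count.beer, both$count.behav)
r
# Average count by day of the week, to see the weekly pattern
toy$day = weekdays(toy$date)
by.day = aggregate(count ~ name + day, data = toy, FUN = mean)
by.day
</code></pre>
<p>In the toy data the shared weekend dip alone produces a high correlation, which is the reason to be cautious about reading much into the raw number.</p>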
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-2425864657472261802013-03-13T12:11:00.002-07:002013-03-13T12:40:08.372-07:00Using maps and ggplot2 to visualize college hockey championships<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
<base target="_blank"/>
</head>
<body>
<p><strong>Short</strong>: <br/>
I plot the frequency of college hockey championships by state using the <strong>maps</strong> package and <strong>ggplot2</strong>.</p>
<p>Note: this example is based heavily on the example provided at<br/>
<a href="http://www.dataincolour.com/2011/07/maps-with-ggplot2/">http://www.dataincolour.com/2011/07/maps-with-ggplot2/</a></p>
<p>Data reference:<br/>
<a href="http://en.wikipedia.org/wiki/NCAA_Men%27s_Ice_Hockey_Championship">http://en.wikipedia.org/wiki/NCAA_Men%27s_Ice_Hockey_Championship</a></p>
<p><strong>Question of interest</strong><br/>
As a good Minnesotan, I've believed for quite some time that the colder, Northern states enjoy a competitive advantage when it comes to college hockey. Does this advantage exist? How strong is it? </p>
<p>I first downloaded data from Wikipedia on past winners of hockey championships, and saved the short list as a csv file. </p>
<p>After saving the file, here's how the data look in R:</p>
<pre><code class="r"># Visualizing College Hockey Champions by State
# Author: Mark T Patterson Date: March 13, 2013
# Libraries:
library(ggplot2)
library(maps)
# Clearing the workspace and changing the working directory:
rm(list = ls())
setwd("C:/Users/Mark/Desktop/Blog/Data")
# Loading Data:
# Loading state championships data:
dat.state = read.csv("HockeyChampsByState.csv", header = TRUE)
dat.state$state = tolower(dat.state$state)
head(dat.state)
</code></pre>
<pre><code class="no-highlight">## state titles
## 1 michigan 19
## 2 massachusetts 11
## 3 colorado 9
## 4 north dakota 7
## 5 minnesota 6
## 6 wisconsin 6
</code></pre>
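<p>If you don't have the CSV handy, here is a minimal sketch that types in the six rows printed above. head() only shows the first six rows, so the full file may hold more states, and the capitalization of the raw names is my assumption.</p>
<pre><code class="r"># The six rows shown by head(dat.state), entered by hand.
# Assumption: the raw file used capitalized state names.
dat.state = data.frame(
  state  = c("Michigan", "Massachusetts", "Colorado",
             "North Dakota", "Minnesota", "Wisconsin"),
  titles = c(19, 11, 9, 7, 6, 6),
  stringsAsFactors = FALSE
)
# The map data uses lowercase region names, so lowercase ours to match
dat.state$state = tolower(dat.state$state)
dat.state
</code></pre>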
<p>Now that we've loaded the information about hockey championships by state, we just need to load the mapping data. map_data("state") is a <strong>ggplot2</strong> function that turns the state outlines from the <strong>maps</strong> package into a data frame. Here, we'll use the <em>region</em> column, which lists state names, to match our state championship data.</p>
<pre><code class="r"># Creating mapping dataframe:
us.state = map_data("state")
head(us.state)
</code></pre>
<pre><code class="no-highlight">## long lat group order region subregion
## 1 -87.46 30.39 1 1 alabama <NA>
## 2 -87.48 30.37 1 2 alabama <NA>
## 3 -87.53 30.37 1 3 alabama <NA>
## 4 -87.53 30.33 1 4 alabama <NA>
## 5 -87.57 30.33 1 5 alabama <NA>
## 6 -87.59 30.33 1 6 alabama <NA>
</code></pre>
<pre><code class="r">
# Merging the two datasets:
dat.champs = merge(us.state, dat.state, by.x = "region", by.y = "state",
all = TRUE)
dat.champs <- dat.champs[order(dat.champs$order), ]
# mapping requires the same order of observations that appear in us.state
head(dat.champs)
</code></pre>
<pre><code class="no-highlight">## region long lat group order subregion titles
## 1 alabama -87.46 30.39 1 1 <NA> NA
## 2 alabama -87.48 30.37 1 2 <NA> NA
## 3 alabama -87.53 30.37 1 3 <NA> NA
## 4 alabama -87.53 30.33 1 4 <NA> NA
## 5 alabama -87.57 30.33 1 5 <NA> NA
## 6 alabama -87.59 30.33 1 6 <NA> NA
</code></pre>
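<p>To see why the re-sort matters, here is a small sketch on made-up data (the two toy data frames are purely illustrative, not part of the hockey data). merge() returns rows sorted by the key column, which scrambles the drawing order, and all = TRUE keeps any state that failed to match a map region as a row with missing coordinates.</p>
<pre><code class="r"># Toy stand-ins (made up for illustration): a 4-point "map" and a title table
outline = data.frame(region = c("b", "b", "a", "a"), order = 1:4,
                     stringsAsFactors = FALSE)
wins = data.frame(state = c("a", "c"), titles = c(3, 1),
                  stringsAsFactors = FALSE)

merged = merge(outline, wins, by.x = "region", by.y = "state", all = TRUE)
merged$order   # merge() sorted by region, so the drawing order is shuffled

merged = merged[order(merged$order), ]
merged$order   # drawing order restored; unmatched rows (NA) sort last

# States in the title table that never matched a map region
setdiff(wins$state, outline$region)
</code></pre>
<p>Running the same setdiff() check on dat.state$state and us.state$region is a quick way to catch a misspelled state name before plotting.</p>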
<p>With the dat.champs frame created, we're ready to plot:</p>
<pre><code class="r"># Plotting (opts() has been removed from ggplot2; the title now goes in labs() and the legend settings in theme())
ggplot(dat.champs, aes(x = long, y = lat, group = group, fill = titles)) +
  geom_polygon() + theme_bw() +
  labs(x = "", y = "", fill = "", title = "College Hockey Championships By State") +
  scale_fill_gradient(low = "#EEEEEE", high = "darkgreen") +
  theme(legend.position = "bottom", legend.direction = "horizontal")
</code></pre>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAAArlBMVEUAAAAAADoAAGYAOpAAZAAAZrY6AAA6ADo6AGY6OpA6ZrY6kNtHgTpmAABmADpmAGZmOjpmOpBmZmZmkJBmtrZmtv93oGt/f3+DqHePr4SQOgCQOjqQOmaQZpCQkGaQkLaQtpCQ2/+mv52yxqq2ZgC2Zjq2/7a2/9u2//++zrjK1sXMzMzbkDrbtmbb29vb2//b///h5uDl5eXt7e36+vr/tmb/25D//7b//9v///8r0YhxAAAACXBIWXMAAAsSAAALEgHS3X78AAAXq0lEQVR4nO3di2LbOHoFYMaxnZmqmaS7VdjOTLd1o911Pdo2vsg23v/FSoA3kARAgABxPWdnfZHIn9D/CSQlS0pFkCJThR4AEiaALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSALzSAD5D/lfzsM1HBv32uqurA/f7LXfOfcg129etP94tKx9WtqJaZlBcXOSqWYJuormajOjWX3RDy/ut4E/mf/SYm+NfbI+3ZzXDBLvDDVjTgFRtWrtxee/r4g7/wxNBvAD9P14TLh7vmx6pqutbBd7/RefThL8006n8nPHx/4estW7Pp/OvtTX8p7Tk530y38vb5z7dVszG6SlU1y3/6SzMjL3RWvv3yX7d0wnJbZxfRpc8V/davzIbYLtxdw4bF4C9X99yG26Ferv7+K91Cu8139jN3e/wlIvjXT/3kpvOCzpcWnlGdm98+H5p+Xd33v9MlB/hhlc9Hen3zjc7sftELu7scp1t5+9xc2q7SrHp1T+8oF2ry6Y5ddb66Z1vvC7dL01LN/4eVfxkW7q5pS7MZfyTchlmh9ocD6bdJf+Zvj79EBH8Zjols1930poVnlzfdu9DedA3uu9keTJv51q8ydP5fm7sJ6RdlO4D2eDBuhTW///VC4Y/sMrrhz8d+AMNYuqV52gu7a/QLj5X7YbXA/YGIXkhvwrB7v7Tw/O3xlyjhL11/Wvhz1XaR7TGbhfrf6ZLDjO9X6fcEn/vdcrtos8ttd7hCeErSwH9ie+4OvB/AMJZuabpfPk7hu4W7a4bS5NwMYdxwN6z+GN9uk/7M3x5/iQi+2wm//98cvtsLDvDcXlEOz3YQw6KvP/29m1LjVga7zxWbdnrwhJ6df7gTwHfXdJvvvo4bHm5ku6tvt8ngvR/faSKC73aBzR5XsKsn3X7gPOxsWQb4xa7+OOxF29p/7na441aGnTetO4OX7+q7GoJdPVd9hB833K7bD6zb5mSQXhMRfINxaB9ozU/umi41Z+HtTKHNan+nq4hO7uiugX5jq/SLnqub+Vb4o/br7Ye7Cfxw6sYXHpb+NJnx3Glftz/pdvXsnloN53TdHbfHpttsT+7G2+MvMcGPT+AsH87RvrCHc83F/e9E8nCOPg5jZ80ffwyLskfv060Mc/jUHG3/1jwM4OH/1A9gGAu3dHWcnty1C3fXDJton8DhNkyv7+8L3TbZz9zt8Zeo4DVy2XhAXD7Ho4rRczvqhc027C8JwbfP7Gw8+T0brecQ3mzD/pIQPHvcs62Nr7dmOwpn8KYb9hcT+BpJORbwi0te1ld6Wl9Eo4qjMolVcToYwKdTBfDJkQEe8BZVAJ8cGeABb1ElDPyJPY944p5MBLznKkHg339jrxT6+GN8khTwnqsEgX/7elvd0Ccc377QJ5vpHx3qJyTd6MLTP0Sdj+cjef+9/ysDZrznKsFO7i6HYcYLVnU8MB9lEqsSBP5CX/p5xDE+ZJVgZ/UHgrP6kFXwOD45MsAD3qIK4JMjAzzgLaoA3l+zv3+PZyyAd1YG8EMAPw3gBSkDfl0e8IDfdSwJwYd+BXGT/zbM90VW+wB4wMsCeO8xhV/Kr/YB8ICXpTT4COjt4b+TFXvAzwZG354V2t0F/
DDvJXcAwE8Hxq4O7e4M/jt7aCekB3zW8PIjPuDnA6vDuzuHF9ADvik5f9t1aPdd4Gf2gKclJ/J1+Dm/A/x80gOezfjZnM8PftzdfzfstTIZwNMMS4REp9kHviXvvgG+3dUPYQsEE++yEzwl738E/AyeHeBDy+8FP7kTrD6tWxh8ePYN8FvkAb/YtQeX9wKv2WtlUoef7etLgMcxfg6f5ON4U3ic1RMBfIJn9YbyJr1WJnn4kTqKv8vuC2/Wa2UA7za7whv2WpmU4F+EmfT9ReLhLXvCdzf4aXLzr8VdWc3T+iLaZR41FtxvxpPhS8jsCd/dan6SXTcRtMrzjH/UqLLjrr5O8ilbffhJr9tI3MPAP0r8d3shxtB0Et7dP7yi18q4hn+kUVTZFz6geJf94Ke9ppHNd9/wj0MUVQC/EX7aa5b44IXy+8PHkL3gZ72mkbv7hX8MBh+TvTf4a4W7V/jHYPARsW+C15Bvb+fkmbsWPoJdfTj40NaT7Af/nYe/vu5mfPDH8Y+Y8Sy7wNMbOU78p+tZZL1WJgf4mOR3mvD8Hn/uvoC/vvYH/wj4NrvAD3cASr9w5+T7CwDvPRvgjf4gL2Af5a8BHyx7w6vorwEfLqHg6UO72a/qbrqBfwwKHxqbzxZ4M3kZvPQAIE4O8DHJlwU/d/c94+N4azzL/vD69spu5gAfCzqNF3hde1U3Ae82vuAV9iQgvEge8I7h1Q/sVH+1pckBPjS2aXhzdsFG+AX+2JiVx3IkD/jk5OexgO/lpxPc1xM4OMbbxcmMJ/7htSY84KWxmfDjlOefqntZeSxHAB9B7NxbeTJxXzuxo3EAv3THMd4kjuD5+HkCRwQvkAe8JO7dx1dmyWMPL3QXyANelt3g930cr+kOeHmcw7+sP3lnDS+Z8JjxJrGVn/VF5/l6S/hHwLuIJTyZftDx8ilcQazg61rGDnijWMNz7ouX4ohjCS+Xl1QxgD8fCDlV1Ye7fmvSgYV2s44VPPu4U74tYeEX9Kbwl+pA3n+7Gy/I5o80y9jCTyY80XG3gqdD3g3+9ed/HMjb19vqhi3XpH6SJDSbfazcmw58H5tx/TR9p42saTp5GLMYsgJejKQJ//bl/nIgl6t7cj729zPZPTKAlOPYzXduxl9fO5zxDxvhhUq68Gc6xw/0p8uhuwjwEnjuTdTXy0gavA7/sBV+Jm98ckdnfIOOGR8G/mEzvFDJDJ6e1fcTHvBSd35P7wr+YSv84sPP8NIreWzgv8cGL1YCvCiW8END3ME/bIMXfdYh4OWxgOcnvDv4ubs2vCCAlyd5+O5DDoW1AC+PI3iRuyd42eeZEsCrEhv8wl1vxosDeHl2hJd1UwW/dF+BV7qV8XFn2+LG3eDpGxW8gB3wO8X/hAd8FNltwndvoRR0UwGPXb237AXfXyropuIYbzrjV9wAL8/e8AJ5V/DrboCXZ0d42aHeDXzIf34M8H38wwf9d+cA30cELz25B3wM2Qv+2hn8Qh7wLrIb/LU5vPDRnBheVWahNL0Q8PqxgV/Ky+HFmY8G8N6y5u4GXuIeIXwIgyBZhTfa149iLezkFz34aRlpAG+bPeCl0GvwNeC9BfB8yQAAgbK+q1fIL3pnD8+XUQXwlokMvubLqAJ4y+wCry2/GM60jCKAtwzg+ZIBAEIlKvhv36ZlFMGMt826vDf4b4D3mR3gdd0fvs0zKaMK4O2zFV7wb5FZw3/jy6iCJ3Dss/UYL+hmK2br7gf+RZTQFuHDNUPiLurbE/2i7S6A58roBLt65xn7YTDh2VTVd5cd4nGMD5ixHx7h+TIrAfw+4fph4E7FDNwjhC9enuuHgTt5qk3cw57ciUuGbnzwDO0wmfCG8Et3PJwLnqEdJvDNeg7gv+E1dyHTt8MAflqA0g73A/7Ch/mFXYYZ/w3wAdN1Q9/9WavsGjyb8tjVB4wh/POzO3gc40PGyP35WRe+jnzGQ74WdKFVb4xF7tbw/VH+G+CDRtCE5zE7w
RM250PCQ16QZ15+sO8v0ashhyd1Y8729YCPLM+TbIOv1fAsgI8sU/ZnDt0FPKMIfVYPeFGeldGtIoFv20539UFP7gAviBv4xdN5CwOc1UcWR/A1By9EAXxkcQVfD/BiFMBHFmfw7V9wJOyAjy7u4Kl83/oH9r/y4P/dNP6GNo9D+Lr/uIwH9rlIUwLA5wzPvTxvTgD4rOGl7oDPGF4x30v5e3yh8J25gB3wacEbVSLsrE7MDvi84RUp4500gF8EMz5beLU84POFV8oDHvCyAN5nUoY/Hwg5VR/uVuDxBI4oDt19w1+qA7l8/NH8twLvTx7wHuBff/7HgU76ty/3dLkm9ZMkezROmELhZX03ii58431p4I/k/ff77iLMeIN4OsS7n/FnOscPw4xXwnuTB/z+8E0uusd4wC/jDl6tuhO81lk94AVJGX4eObwv+TzgDSupUQCfGfzYUjUK4JOB11iZ7+cKCuBTgddZmXD2Kyhh4ffr3zTpw2utyxpK+p/VAXwS8FqrEu5r5PDY1S+yfTffd5N0v6kD+FzgazLOdcB3KQK+lxd2fBHARwZfi+z11uNm/DoK4OOHr8139esogI8Pvt4y3Uf4muuvPIHhPckDfpHQ8H7kE4Of0uuuwz1LD/guCcLXz+3/TWb8MOVTgPcinxx8F8BbJlX42sB9PMYnAu9DPln4GvBWSRdePySxkzviQz57eP4PsunA7y+fPzz7Mjx3lwj8/n3JHr7u/iLX/ZII/GQ3tUvyh6+7Nrc/JwLPxrprVwqA7/4E335PBJ4QwDtI20jA88kfvpvq3aE+GXgCeBcZmp0O/M5n9mXAj81OCX5X+hLg+7N6cX8XATzgZQF8IiHD66pTgt9Vvgj4ro3i/i7iBf5lPfv2pAB42sGmjRqtbvOkuyDO6uMOYQf5WtbfReLZ1e+Z/OEJ92AO8ENKgK+V/V0kFng8c2eZiTzguxQAT6Pq7zzRwONxvGVIqvDe3kiXa0iy8JC3jrq/fACfVdT95QP4rKLuLx/AZxV1f/nEBB+6axlE2d9JAJ9TEnzKli2HWAbwpUbZ30kAn0mGz7iT93cSwGcUkig85C0y+ezi1OAhvz3du6fies2dpKRgSWRrJi/BSQ4e8tuj1d8xgM8piT6OZ8siFiF1cu+dG5ZFLBLbGyokJYXLIhZJGB7ydlntb5/o4CFvE43+dgF8LuH/8akk4SG/JQTwhYYQ7oX1gC8puv1tA/hsotvfNhHCg35TDPpL4x7+VFVH9vXDXe8oKSlN6B6mFxIe/nJD3r7cv/92N14E+N1DJm+dC7Wrb+Dfvt5WN2y5JvWTYUK3McGwtpn2WQ9DG/5UHcjl6p6cj90FmPF+ot1fml1O7s4H+vVy6B0lJaUJ3cE0o99fGvfH+AOFZ18x4/1k8spqnf7S7HJWf8O+9hMe8PvHrL80eByfQwz7SxMlPOTNYtxfEi087A0iI1MmTnjSDix0R9OIlGy1vysBfNyRkq32dyXh4EGvETmZRn+VAXzUkZNp9FcZwEcdOZlGf5UJCQ/6tSjItPqrSFh4AntlhM0DfPYRNy8TeMiLI29eNvCgF0XevFzgMedFUTQvH3jIL6JqXkbwkJ9F2TzA5xtl83KCh/w0yuZlBQ95PurmAT7bqJsH+Gyjbl5e8JDnom4e4HPNSvMyg4d8n7XmAT7PrDYP8HlmtXmAzzLrzQN8jtHoeG7wkKfR6Hhu8LOb/lLkHUFDLDv4RZXQCAGi05ho4F825ElnodAK3qPbGEf9ZT22gJfcl5TRvEeGlvAcvcZEM+MlJZXRHVhoCq/RbEwR8CXR6zamEPiC6DUbUwx8KfK6jSkHvgx67caUBF+CvHZjioIvgF67MYXB5y6v35jS4NfkDRaNMfqNKQ5+xVN/yRhj0JgC4ZWgustFGv3GFAmvINVbKtYYNKZQeKmp3lKxxqAxpcLLUHWWiTcGjSkXXsyqs0y0MWlMyfA6ZUJbmsSoMYBfKRNa0yBGjQE84
Lc1hq+SI3yS8oB3AJ+OvFFjAA/4rY0ZqwA+aIwaA/j1MqFBdWPUGMBrlAktqhmjxgBep0wwSrOlTRoDeK0yO8GuUpotbXKLAB8xvOF2jW4R4PXK7ES7Jtl+a1E1Fje4RYCPCZ57d3/ryHlqrG1yiwAfFfzin3nnftZZXf8WAT6mz1doxsL7EcMXhpjcIsBHBL82Fo31tW9RGPhTVR3p1w93/SWA7+DZaZxsGNxJQJLwlxvy9uX+8vFH8193EeAHvHq+g+fCGycI36SBPx/oV7pck/op8viCZ1taHYZ4SJ5aIRiVNvypOpDzkbz/ft9dgBnfRznjJ6f9y1UNblGwk7vzYZjxglUdD8xBGZGPM2gZoCD89dIVY4W/HCh8Wsf4eZfJludx+1UIt7Zi4opS1y/cz4nB07P6m8TO6onYZgv8ZF1BEdVA6slNSg5+mVThjeRfJjtqMpTSd5/dJMBL4w6eiG2M4GvxQ/TN8LJ9BeBdwhOhjRG89KkZwM9LKhMDvJG8fAP9Ek+G/5iMeAOAdwpPxICu4Y3+3TDAS+IUngj99N0Vg2Fla/ZJ82vy0yqAF8ctPJm0l0wu1JjvqsHU7dP0T2RVHvAB4KfPnHEXWcOT9s9yxvDCx5mA3wV+ca0TeG4sJrt6wIvjHH7xqhjiHN7k5G6+9bHKWgBvX0aPfbebBHhBvMBryesNBvC5wWsOxha+1q0CeD/wuoPBjM8Fnv9LLGb8MmnDK+QJ9xc5wC+TLXxNAK9KvvDc437AL5Mr/OSG7XOTBPcywPuC1/s0bMAvA/jNYxFsEfBh4ec3C/DLpA6vfhOj/mAAD3jtsQBeEG/wOvJe4Gu9KoBPHX7xJiy9KoB3VWb5ruVNVczHUi+fwgF8GfCY8ct4hJe+k82syoaxLLYL+KDwYc7qAa9bxRn86tudAb9M/bIhT1tW2qtMU2UG4Gssc3j/jcGMx4ynAfymKsZjWZ5bAD4g/OYq5mMB/DJe4dc+2gLwywDeYiyAXyYg/OYqxmNZHOQBD3iDMtsHA/gw8Iun6wHvGX7l06sAvwzg7cYCePMqgLceDODVB3nAL5MJvHrKA34ZwNuNBfDmVbKAJ4A3ruIUXvIZ16ZVzMcCeOMqbuE5AqsqpmMBvHEV1/ByecAvkxM8Abx+soIXf8r1zjcJ8KZVdoCnAbxW8oMXyAN+mQzhvVcZTyoBXxQ8wYw3qhIDmZsqmPFGVWIgc1MF8EZVYiBzU2V4FAn4suAJ4E2qREHmpgrgTapEQeamCuBNqkRB5qgK4A2qxEHmpEpNAK9fJQoyN1X6vwzFCv/+a1V9/EFOVfXhrrsI8E6q1NpVgsBfbgg5Hd9/uxsvArznKsF29efj29fb6oYt16R+QtKNAXwz6S9X9w1/9ztmvOcqgWb86ab9fjl0FwDec5VAJ3d0olN0zPhQVYLAn+hR/UC/9RMe8L6r4HF8cmSAB7xFFcAnRwZ4wFtUAXxyZIAHvEUVwCdHBnjAW1QBfHJkgAe8RRXAJ0eWGfyWVJvW2qlMflUMymyH3xRHG3BTJr8qm8sAPu0qgC+0CuALrRIvPBJnAF9oAF9oAF9odoR/+3I/vhHrw936CvIy3VeLMtz6NoOht+dgV4LGfiA05/71sFvK7Ad/qa7uuzdiXT7+aP6zKNMV216GW99mMORsfXvYaOwH0uX9P7aW2Q3+/T/ff79nP52P50M3b7eWab9uL8OvbzEYQkuQ08GqRJPXn380M8K2Cs1f77aW2XFX38HTm3jsf9lchsFZlBnXtxtMs6v/08GqBM2FHv+sq7TvbN1YZh/4E32nZTsc+kasrfdtrozFjB+q2M74U/v+0bPdjG+qNAeM15/sdj3tYJoJH+mM796IZXM0G+FtyozrWw2mmWJvv9zZHp1bK/tj/Nu//Njc3J3hTzYnnmOZ7qtFGW59q5Pp5gYdrc/H6QHDvkpzrvBP7ZDiOqtHog7gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gC
w3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gCw3gC03u8P8zzx/DVf82zx+zdf95nvkCKQfwgM8ygJcE8IDPMoCXBPCAzzIi+FPFPidKDM996K0Qvrve9gOrgqdA+PffWjMhPP+huSJ4dn1fIeUUCP/29ZZ9RKUIfvKhuQL49vq+QsopEP5yxT7WVrKr5z40V7irZx+R2VVIOQXC01wOSnjpjCfDBwbTCimnQHhKpp7ximN8e31XIeUUCE/Pyel0lcOrzuq766vEJ3yR8F3E8FzwOD7hAF4SwAM+ywBektzh/1hE5yrNBVJO7vCIJIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNIAvNP8PKtTvj6Mu57QAAAAASUVORK5CYII=" alt="plot of chunk unnamed-chunk-3"/> </p>
<p>Having plotted the data, it's easy to see how titles cluster around the Great Lakes region. With the exception of Colorado, only colder, Northern states have won titles. </p>
<p><strong>Ways to improve this analysis</strong><br/>
While college champions cluster in the northern Midwest and the Northeast, several variables could explain the distribution. We might examine 1) state temperature (colder temperatures might lead to better performance, since teams in colder states get to practice more), 2) distance from the Great Lakes (this might be a proxy for the availability of ice), and 3) distance from Canadian hockey cities (it's possible that hockey culture follows from Canadian or European immigration).</p>
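<p>As a first pass at the temperature idea, here is a rough sketch that uses latitude as a crude stand-in for temperature. Both the proxy and the helper name state.latitude are my own assumptions, and the mean of a state's outline points is only an approximate latitude.</p>
<pre><code class="r"># Hypothetical helper (not part of the analysis above): average latitude of
# each state's outline points, joined to the title counts.
state.latitude = function(map.df, champs.df) {
  centers = aggregate(lat ~ region, data = map.df, FUN = mean)
  out = merge(centers, champs.df, by.x = "region", by.y = "state",
              all.x = TRUE)
  out$titles[is.na(out$titles)] = 0   # states without a title count as zero
  out
}

# Possible usage with the data frames built earlier:
# lat.dat = state.latitude(us.state, dat.state)
# cor(lat.dat$lat, lat.dat$titles, method = "spearman")
</code></pre>
<p>A positive rank correlation would be consistent with the Northern advantage, though it would not separate temperature from the other explanations listed above.</p>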
<p>Beyond examining these possible factors, it'd be interesting to try other color presentations: I've adopted the same color scheme presented at <a href="http://www.dataincolour.com/2011/07/maps-with-ggplot2/">http://www.dataincolour.com/2011/07/maps-with-ggplot2/</a>, but it would be good to have some familiarity with other schemes.</p>
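<p>One low-effort way to compare schemes is to preview a few gradients in base R before touching the plot. In this sketch only the first low/high pair comes from the plot above; the other two are arbitrary picks of mine.</p>
<pre><code class="r"># Candidate low/high pairs. Only "green" is the scheme used in this post;
# "blue" and "red" are arbitrary choices for illustration.
schemes = list(
  green = c("#EEEEEE", "darkgreen"),
  blue  = c("#EEEEEE", "navy"),
  red   = c("#EEEEEE", "firebrick")
)

# Five evenly spaced colors along each gradient
swatches = lapply(schemes, function(ends) colorRampPalette(ends)(5))
swatches$green
</code></pre>
<p>Any pair that looks promising can be dropped into scale_fill_gradient(low = ..., high = ...) in the plotting code.</p>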
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0tag:blogger.com,1999:blog-8973439534644845561.post-19346723522841290322013-03-07T15:34:00.002-08:002013-03-13T12:40:25.355-07:00ddply in action<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- saved from url=(0014)about:internet -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Top Batting Averages Over Time</title>
<base target="_blank"/>
<style type="text/css">
body, td {
font-family: sans-serif;
background-color: white;
font-size: 12px;
margin: 8px;
}
tt, code, pre {
font-family: Consolas, 'Lucida Console', 'DejaVu Sans Mono', 'Droid Sans Mono', Monaco, monospace;
}
h1 {
font-size:2.2em;
}
h2 {
font-size:1.8em;
}
h3 {
font-size:1.4em;
}
h4 {
font-size:1.0em;
}
h5 {
font-size:0.9em;
}
h6 {
font-size:0.8em;
}
a:visited {
color: rgb(50%, 0%, 50%);
}
pre {
margin-top: 0;
max-width: 95%;
border: 1px solid #ccc;
}
.r {
background-color: #F8F8F8;
}
blockquote {
color:#666666;
margin:0;
padding-left: 1em;
border-left: 0.5em #EEE solid;
}
hr {
height: 0px;
border-bottom: none;
border-top-width: thin;
border-top-style: dotted;
border-top-color: #999999;
}
/*
* highlight.styles.css
*
* RStudio style for highlight.js in HTML preview. Initial template based
* on highlight.js VS style by JasonDiamond, tweaked to look more like
* the default RStudio TextMate theme.
*
* Copyright (C) 2009-12 by RStudio, Inc.
* Copyright (C) Jason Diamond <jason@diamond.name>
*
* This program is licensed to you under the terms of version 3 of the
* GNU Affero General Public License. This program is distributed WITHOUT
* ANY EXPRESS OR IMPLIED WARRANTY, INCLUDING THOSE OF NON-INFRINGEMENT,
* MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Please refer to the
* AGPL (http://www.gnu.org/licenses/agpl-3.0.txt) for more details.
*
*/
pre code {
display: block; padding: 0.5em;
}
pre .operator,
pre .paren {
color: rgb(104, 118, 135)
}
pre .literal {
color: rgb(88, 72, 246)
}
pre .number {
color: rgb(0, 0, 205);
}
pre .comment,
pre .annotation,
pre .template_comment,
pre .diff .header,
pre .chunk,
pre .apache .cbracket {
color: rgb(76, 136, 107);
}
pre .keyword,
pre .id,
pre .title,
pre .built_in,
pre .aggregate,
pre .smalltalk .class,
pre .winutils,
pre .bash .variable,
pre .tex .command {
color: rgb(0, 0, 255);
}
pre .string,
pre .title,
pre .parent,
pre .tag .value,
pre .rules .value,
pre .rules .value .number,
pre .ruby .symbol,
pre .ruby .symbol .string,
pre .ruby .symbol .keyword,
pre .ruby .symbol .keymethods,
pre .instancevar,
pre .aggregate,
pre .template_tag,
pre .django .variable,
pre .addition,
pre .flow,
pre .stream,
pre .apache .tag,
pre .date,
pre .tex .formula {
color: rgb(3, 106, 7);
}
pre .ruby .string,
pre .decorator,
pre .filter .argument,
pre .localvars,
pre .array,
pre .attr_selector,
pre .pseudo,
pre .pi,
pre .doctype,
pre .deletion,
pre .envvar,
pre .shebang,
pre .preprocessor,
pre .userType,
pre .apache .sqbracket,
pre .nginx .built_in,
pre .tex .special,
pre .input_number {
color: rgb(43, 145, 175);
}
pre .phpdoc,
pre .javadoc,
pre .xmlDocTag {
color: rgb(128, 159, 191);
}
pre .vhdl .type { font-weight: bold; }
pre .vhdl .string { color: #666666; }
pre .vhdl .literal { color: rgb(163, 21, 21); }
@media print {
* {
background: transparent !important;
color: black !important;
filter:none !important;
-ms-filter: none !important;
}
body {
font-size:12pt;
max-width:100%;
}
a, a:visited {
text-decoration: underline;
}
hr {
visibility: hidden;
page-break-before: always;
}
pre, blockquote {
padding-right: 1em;
page-break-inside: avoid;
}
tr, img {
page-break-inside: avoid;
}
img {
max-width: 100% !important;
}
@page :left {
margin: 15mm 20mm 15mm 10mm;
}
@page :right {
margin: 15mm 10mm 15mm 20mm;
}
p, h2, h3 {
orphans: 3; widows: 3;
}
h2, h3 {
page-break-after: avoid;
}
}
</style>
<script type="text/javascript">
var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/</gm,"<")}function f(r,q,p){return RegExp(q,"m"+(r.cI?"i":"")+(p?"g":""))}function b(r){for(var p=0;p<r.childNodes.length;p++){var q=r.childNodes[p];if(q.nodeName=="CODE"){return q}if(!(q.nodeType==3&&q.nodeValue.match(/\s+/))){break}}}function h(t,s){var p="";for(var r=0;r<t.childNodes.length;r++){if(t.childNodes[r].nodeType==3){var q=t.childNodes[r].nodeValue;if(s){q=q.replace(/\n/g,"")}p+=q}else{if(t.childNodes[r].nodeName=="BR"){p+="\n"}else{p+=h(t.childNodes[r])}}}if(/MSIE [678]/.test(navigator.userAgent)){p=p.replace(/\r/g,"\n")}return p}function a(s){var r=s.className.split(/\s+/);r=r.concat(s.parentNode.className.split(/\s+/));for(var q=0;q<r.length;q++){var p=r[q].replace(/^language-/,"");if(e[p]||p=="no-highlight"){return p}}}function c(q){var p=[];(function(s,t){for(var r=0;r<s.childNodes.length;r++){if(s.childNodes[r].nodeType==3){t+=s.childNodes[r].nodeValue.length}else{if(s.childNodes[r].nodeName=="BR"){t+=1}else{if(s.childNodes[r].nodeType==1){p.push({event:"start",offset:t,node:s.childNodes[r]});t=arguments.callee(s.childNodes[r],t);p.push({event:"stop",offset:t,node:s.childNodes[r]})}}}}return t})(q,0);return p}function k(y,w,x){var q=0;var z="";var s=[];function u(){if(y.length&&w.length){if(y[0].offset!=w[0].offset){return(y[0].offset<w[0].offset)?y:w}else{return w[0].event=="start"?y:w}}else{return y.length?y:w}}function t(D){var A="<"+D.nodeName.toLowerCase();for(var B=0;B<D.attributes.length;B++){var C=D.attributes[B];A+=" "+C.nodeName.toLowerCase();if(C.value!==undefined&&C.value!==false&&C.value!==null){A+='="'+m(C.value)+'"'}}return A+">"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("</"+p.nodeName.toLowerCase()+">")}while(p!=v.node);s.splice(r,1);while(r<s.length){z+=t(s[r]);r++}}}}return z+m(x.substr(q))}function 
j(){function q(x,y,v){if(x.compiled){return}var u;var s=[];if(x.k){x.lR=f(y,x.l||hljs.IR,true);for(var w in x.k){if(!x.k.hasOwnProperty(w)){continue}if(x.k[w] instanceof Object){u=x.k[w]}else{u=x.k;w="keyword"}for(var r in u){if(!u.hasOwnProperty(r)){continue}x.k[r]=[w,u[r]];s.push(r)}}}if(!v){if(x.bWK){x.b="\\b("+s.join("|")+")\\s"}x.bR=f(y,x.b?x.b:"\\B|\\b");if(!x.e&&!x.eW){x.e="\\B|\\b"}if(x.e){x.eR=f(y,x.e)}}if(x.i){x.iR=f(y,x.i)}if(x.r===undefined){x.r=1}if(!x.c){x.c=[]}x.compiled=true;for(var t=0;t<x.c.length;t++){if(x.c[t]=="self"){x.c[t]=x}q(x.c[t],y,false)}if(x.starts){q(x.starts,y,false)}}for(var p in e){if(!e.hasOwnProperty(p)){continue}q(e[p].dM,e[p],true)}}function d(B,C){if(!j.called){j();j.called=true}function q(r,M){for(var L=0;L<M.c.length;L++){if((M.c[L].bR.exec(r)||[null])[0]==r){return M.c[L]}}}function v(L,r){if(D[L].e&&D[L].eR.test(r)){return 1}if(D[L].eW){var M=v(L-1,r);return M?M+1:0}return 0}function w(r,L){return L.i&&L.iR.test(r)}function K(N,O){var M=[];for(var L=0;L<N.c.length;L++){M.push(N.c[L].b)}var r=D.length-1;do{if(D[r].e){M.push(D[r].e)}r--}while(D[r+1].eW);if(N.i){M.push(N.i)}return f(O,M.join("|"),true)}function p(M,L){var N=D[D.length-1];if(!N.t){N.t=K(N,E)}N.t.lastIndex=L;var r=N.t.exec(M);return r?[M.substr(L,r.index-L),r[0],false]:[M.substr(L),"",true]}function z(N,r){var L=E.cI?r[0].toLowerCase():r[0];var M=N.k[L];if(M&&M instanceof Array){return M}return false}function F(L,P){L=m(L);if(!P.k){return L}var r="";var O=0;P.lR.lastIndex=0;var M=P.lR.exec(L);while(M){r+=L.substr(O,M.index-O);var N=z(P,M);if(N){x+=N[1];r+='<span class="'+N[0]+'">'+M[0]+"</span>"}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'<span class="'+M.cN+'">':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var 
R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"</span>":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L>1){O=D[D.length-2].cN?"</span>":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length>1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.r>r.keyword_count+r.r){r=s}if(s.keyword_count+s.r>p.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((<[^>]+>|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"<br>")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v=="no-highlight"){return}if(v){y=d(v,x)}else{y=g(x);v=y.language}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML="<pre><code>"+y.value+"</code></pre>";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function 
o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p<r.length;p++){var q=b(r[p]);if(q){n(q,hljs.tabReplace)}}}function l(){if(window.addEventListener){window.addEventListener("DOMContentLoaded",o,false);window.addEventListener("load",o,false)}else{if(window.attachEvent){window.attachEvent("onload",o)}else{window.onload=o}}}var e={};this.LANGUAGES=e;this.highlight=d;this.highlightAuto=g;this.fixMarkup=i;this.highlightBlock=n;this.initHighlighting=o;this.initHighlightingOnLoad=l;this.IR="[a-zA-Z][a-zA-Z0-9_]*";this.UIR="[a-zA-Z_][a-zA-Z0-9_]*";this.NR="\\b\\d+(\\.\\d+)?";this.CNR="\\b(0[xX][a-fA-F0-9]+|(\\d+(\\.\\d*)?|\\.\\d+)([eE][-+]?\\d+)?)";this.BNR="\\b(0b[01]+)";this.RSR="!|!=|!==|%|%=|&|&&|&=|\\*|\\*=|\\+|\\+=|,|\\.|-|-=|/|/=|:|;|<|<<|<<=|<=|=|==|===|>|>=|>>|>>=|>>>|>>>=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return 
p}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"<\\-(?!\\s*\\d)",e:hljs.IMMEDIATE_RE,r:2},{cN:"operator",b:"\\->|<\\-",e:hljs.IMMEDIATE_RE,r:1},{cN:"operator",b:"%%|~",e:hljs.IMMEDIATE_RE},{cN:"operator",b:">=|<=|==|!=|\\|\\||&&|=|\\+|\\-|\\*|/|\\^|>|<|!|&|\\||\\$|:",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"%",e:"%",i:"\\n",r:1},{cN:"identifier",b:"`",e:"`",r:0},{cN:"string",b:'"',e:'"',c:[hljs.BE],r:0},{cN:"string",b:"'",e:"'",c:[hljs.BE],r:0},{cN:"paren",b:"[[({\\])}]",e:hljs.IMMEDIATE_RE,r:0}]}};
hljs.initHighlightingOnLoad();
</script>
<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script type="text/javascript">MathJax.Hub.Config({tex2jax: {processEscapes: true, processEnvironments: false, inlineMath: [ ['$','$'] ], displayMath: [ ['$$','$$'] ] }, asciimath2jax: {delimiters: [ ['$','$'] ] }, "HTML-CSS": {minScaleAdjust: 125 } });</script>
</head>
<body>
<h2>Top Batting Averages Over Time</h2>
<p>reference:<br/>
<a href="http://www.baseball-databank.org/">http://www.baseball-databank.org/</a></p>
<p><strong>Short</strong><br/>
I'm going to use plyr and ggplot2 to look at how <em>top</em> batting averages have changed over time.</p>
<p>First load the data:</p>
<pre><code class="r">options(width = 100)
library(ggplot2)
</code></pre>
<pre><code class="no-highlight">## Warning message: package 'ggplot2' was built under R version 2.14.2
</code></pre>
<pre><code class="r">library(plyr)
data(baseball)
head(baseball)
</code></pre>
<pre><code class="no-highlight">## id year stint team lg g ab r h X2b X3b hr rbi sb cs bb so ibb hbp sh sf gidp
## 4 ansonca01 1871 1 RC1 25 120 29 39 11 3 0 16 6 2 2 1 NA NA NA NA NA
## 44 forceda01 1871 1 WS3 32 162 45 45 9 4 0 29 8 0 4 0 NA NA NA NA NA
## 68 mathebo01 1871 1 FW1 19 89 15 24 3 1 0 10 2 1 2 0 NA NA NA NA NA
## 99 startjo01 1871 1 NY2 33 161 35 58 5 1 1 34 4 2 3 0 NA NA NA NA NA
## 102 suttoez01 1871 1 CL1 29 128 35 45 3 7 3 23 3 1 1 0 NA NA NA NA NA
## 106 whitede01 1871 1 CL1 29 146 40 47 6 5 1 21 2 2 4 1 NA NA NA NA NA
</code></pre>
<p>It looks like we've loaded the data successfully.</p>
<p>Next, we'll add something that is <em>close to</em> batting average: total hits divided by total at-bats:</p>
<pre><code class="r">baseball$ba = baseball$h/baseball$ab
head(baseball)
</code></pre>
<pre><code class="no-highlight">## id year stint team lg g ab r h X2b X3b hr rbi sb cs bb so ibb hbp sh sf gidp ba
## 4 ansonca01 1871 1 RC1 25 120 29 39 11 3 0 16 6 2 2 1 NA NA NA NA NA 0.3250
## 44 forceda01 1871 1 WS3 32 162 45 45 9 4 0 29 8 0 4 0 NA NA NA NA NA 0.2778
## 68 mathebo01 1871 1 FW1 19 89 15 24 3 1 0 10 2 1 2 0 NA NA NA NA NA 0.2697
## 99 startjo01 1871 1 NY2 33 161 35 58 5 1 1 34 4 2 3 0 NA NA NA NA NA 0.3602
## 102 suttoez01 1871 1 CL1 29 128 35 45 3 7 3 23 3 1 1 0 NA NA NA NA NA 0.3516
## 106 whitede01 1871 1 CL1 29 146 40 47 6 5 1 21 2 2 4 1 NA NA NA NA NA 0.3219
</code></pre>
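<p>One reason this is only <em>close to</em> a season batting average: the data has one row per player, year and <em>stint</em>, so a player who changed teams mid-season appears more than once in a year. A minimal sketch of a per-season version follows; the helper name season.ba is hypothetical.</p>
<pre><code class="r"># Hypothetical helper: add up hits and at-bats across stints, then divide.
season.ba = function(df) {
  totals = aggregate(cbind(h, ab) ~ id + year, data = df, FUN = sum)
  totals$ba = totals$h / totals$ab   # NaN when a player had zero at-bats
  totals
}

# Possible usage with the plyr data loaded above:
# season = season.ba(baseball)
</code></pre>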
<p>Finally, we can use the <strong>plyr</strong> package to look at how batting averages have changed over time. We'll only consider players who have more than 100 at-bats in a season.</p>
<p>Note: ddply essentially splits the dataset into groups based on the year variable, and then applies the same calculation to each of the subsets (here, <strong>summarise</strong> computes <strong>topBA</strong>, the highest batting average among the qualifying players). With the calculation performed on each of the subsets, ddply then collects all of the output into a new data frame.</p>
<pre><code class="r">
BA.dat = ddply(baseball, .(year), summarise, topBA = max(ba[ab > 100], na.rm = TRUE))
head(BA.dat, 10)
</code></pre>
<pre><code class="no-highlight">## year topBA
## 1 1871 0.3602
## 2 1872 0.4147
## 3 1873 0.3976
## 4 1874 0.3359
## 5 1875 0.3666
## 6 1876 0.3560
## 7 1877 0.3872
## 8 1878 0.3580
## 9 1879 0.3570
## 10 1880 0.3602
</code></pre>
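<p>To make the split/apply/combine steps concrete, here is a rough base R sketch of the same calculation. The function name top.ba.by.year is my own, and unlike the ddply call it simply drops years with no qualifying player instead of returning -Inf for them.</p>
<pre><code class="r"># Hypothetical base R version of the ddply call above.
top.ba.by.year = function(df, min.ab = 100) {
  keep = df[which(df$ab > min.ab), ]          # which() also drops missing at-bats
  pieces = split(keep$ba, keep$year)          # split: one vector of averages per year
  top = sapply(pieces, max, na.rm = TRUE)     # apply: best average in each year
  data.frame(year = as.numeric(names(top)),   # combine: back into one data frame
             topBA = as.numeric(top))
}

# Possible usage, once the ba column has been added:
# head(top.ba.by.year(baseball), 10)
</code></pre>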
<p>Now, we're ready to use <strong><em>ggplot2</em></strong> to visually examine the data:</p>
<pre><code class="r">p = ggplot(BA.dat, aes(x = year, y = topBA)) + geom_point()
p
</code></pre>
<p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfgAAAH4CAMAAACR9g9NAAAAxlBMVEUAAAAAADoAAGYAOmYAOpAAZrY6AAA6ADo6AGY6OmY6OpA6ZrY6kNtmAABmADpmAGZmOgBmOpBmZmZmtv9/f39/f5V/f6t/lcF/q9aQOgCQOjqQOmaQ27aQ2/+Vf3+Vf6uVlcGVweurf5Wrf6urlZWrlcGrq6ur1v+2ZgC2Zjq2///BlX/BlZXBlavBwdbB6//Wq3/W///bkDrb///l5eXrwZXr1qvr///y8vL/tmb/1qv/25D/68H//7b//9b//9v//+v///+zr+xzAAAACXBIWXMAAAsSAAALEgHS3X78AAATgklEQVR4nO2dC3vTyBWGte3utt1Cd7eIJrSlRaWsKZTGAYyzDoTo//+paiQ7vklz0TmS5tN878MT7EQZfZ7XczS6OVlJkiSbOgCZBopPFIpPFIpPFIpPFIpPFIpPFIpPFIpPFIpPFIpPFIpPFIpPlF7iPzq4cS3gi3NNvjDRR4oXgZuI4kXgJqJ4EbiJKF4EbiKKF4GbiOJF4CaieBG4iSheBG4iiheBm4jiReAmongRuIkoXgRuIooXgZuI4kXgJqJ4EbiJKF4EbiJv8cv86dX24arYf6V4HaIVv7lYV/+ah3nx8HVA8Vl2HrZXQy1QvLf4anzfvbw2jz6/+FTsvpZlXtFZJSRkFYM0TNroFr8o798Z8ZX+TbH9usX15ur1bjbiz96lfRpqgyM+fMSvzBgvmq9Dimep78eA2/iyGesDj/jWsFoNMVHgrP7u1ZriD8FN5C3egmsd7GY3FG8Nq9UQE1G8DNxEFC8CNxHFi8BNRPEicBNRvAjcRBQvAjcRxYvATUTxInATUbwI3EQULwI3EcWLwE1E8SJwE1G8CNxEFC8CNxHFi8BNRPEicBNRvAjcRBQvAjcRxYvATUTxInATUbwI3EQULwI3EcWLwE1E8SJwE1G8CNxEFC8CNxHFi8BNRPEicBNRvAjcRBrib8ZivDX5ApyII14CbiKNEe9aB7vZDcVbw2o1xEQULwM3EcWLwE1E8SJwE1G8CNxEFC8CNxHFi8BNRPEicBOBiW/5ZOM+UDyY+LbPMu8DxVO8DIq3wVLvhuKtYbUaYiKKl4GbiOJF4CaieBG4iSheBG4iiheBmwhA/H4XTqWbTXMUDyD+4KCNRjfXzVE8xcugeBss9W4o3hpWqyEmongZuIkoXgRuIooXgZuI4kXgJqJ4EbiJKF4EbiKKF4GbiOJF4CbyFr/Mn15tH66K8v5tnl+sKR43ka/4zcV6szW9yYtyc1m9FRaRiXddibn/OcV7i69G+d3La/Po84tPRfMtIz6v6KwS42JOv0h+njTd4hfl/TsjvtK/Kcx3zKBvcL25Rno3uy66P/g5R3z4iF+ZMV6ZXz54j0U8S71hwG18uTGTu8X+Z651dL6o0NsjcLvZSbTim1n93at1I365Hfci8cE3ROF2s5N4xVtwrYPi3SQlnqV+T1riQ8HtZicUbw2r1RATUbwM3EQULwI3EcWLwE1E8SJwE1G8CNxEFC8CNxHFi8BNRPEicBMNJ54nQUOYj3he9hAExVvDajXERCz1MnATDSd+D7vZDcVbw2o1xEQULwM3EcWLwE1E8SJwE1G8CNxEFC8CNxHFi8BNRPEicBNRvAjcRBQvAjcRxYvATUTxInATUbwI3ESxi384uWse4HazE0jxN8NhLufYPxhyTf0AThTziM+yh+t46ge448sJ5Ih3raPvizrwzlLvy1zEH4f1+BWvlik+avFnGp1r8v1wFYqPW/xZWNcCFJ+oeJb6VMV7wkSJiD8tBNMnOoXirWF7/t7Zpn/yRGdQvDVsz9+j+JblUhDPUt+yXBLiT2Eiindj20WkeBvY4q0HhSjeBsW7oXhrWK2GWOopXgZuIooXgZuI4kXgJqJ4EbiJKF4EbiKKF4Gbi
OJF4CZKTHyzTx7616+6oXgb8YhvjsIF/727bijehvtFeYqgeDdQ4n1NsNS7oXgb8Wx8dqQpfrRSv4PiIxHvG1arISaieBm4iSheBG4iiheBm8hb/DJ/erV9uCqOnlK8AtGK31ysq3/Nw7w4fErxGkQrvhrldy+vzaPPLz4V+6d5RWeV6CDrtUEhQ9ItflHevzOmK9+bYv/U4Hpznbyb+x8xxR1fTuIf8Sszxot9AaB4FaIVf7hR3wi38b0PleN2s5NoxTfT+LtX60Y8Z/UNuIm8xVtwrSPOblY5RUfxrWz7NkrxOiflKb6NXd9SvBuKt4bVaoilPp1Sf+Q5ikRHzEv8lhi6+biyx5DoGIq3hu3/qxR/ulwi4lnqT5dLRfwRTETxMnATUbwI3EQULwI3EcWLwE1E8SJwE1G8CNxEFC9i4kQt5xso3sZMErWdYaR4GzNJRPGhzCURS30gTETxMnATUbwI3EQULwI3EcWLwE1E8SJwE1G8CNxEFC8CNxHFi8BNND/xPjdKUPz8xHvdGhX5jduSm3wo3kbcH9Uguq1vTPE3Y+G1piwLbtf0dPAv1Qzw2rOsf5ybgERzG/FeRFzqG+/9G0q21HsRcSLp7dsUbyPmRMLbtynexpCJ+olLcXfOv6fGEi9J1LNU34Sttxsc8QE9NZJ4USKBeJVPZ4lNfPdLmpl4Qamfo3jba5pXqe/JXEt9nJ8xJc8EMd1sX24c8RN8xpRtjTfbJcSp5iz+ffabN3LxKoR0s9UqxbvE/5pl3zzXGPEqKItnqe8Qf/t9Zf31E5f3OMV7lHqPBV3MUvyXH02JhxVvY/SToE5iEl9+eZxlT6IQv1VD8W60JnevY9jG79zgXu/iJDrxZfn12eSz+mHFS8BN5BbvxrWOiUt9y4imeNes/rtqf8414OOY3HUW7LZtOMXbxH999qR8X1n/9dsP8YvvnqJRfPty3eLN7tztDx+2u3Ww4lnq25ebifiwuTnFz0d8EExkFf+4udQX8ySNobMKUDzI7twubNji3ZeoU3wC4tvMU7xD/NdnVcd9hyq+e8hTvF3812ePqq/vXeZd69B6UX0OqrPUdy1nE9/M52OZ1etct1dD8Y5SXw/2WEb8BOKdK5ypeM8dOtc6piz17Xgmcr/VZireE9c6Yi6sVrMUX7HMn15tH+SLg6fY4h1qUy31ze5cfXJuc7Gu/pkHl+Xdy+uHp/MW72Sm4re7c8b8qjC+m29XDz6/WFfvgIq8orNKAJD12rjNj87dudWivH/XiF/mRTXw84cBP+6IV5nhRVyDpAw54qsnq8vy8z93T13r0HxROvt0MxVvuma4bXxRiz98H7jWQfFuNBLVfTPMrP7u1drM6i/L+7f13L63+H72WOqt6Ikf7JBtz3EbVTfXxJVIqdQ/HLfTv9iS4k+Ja3LnHOt9xQtKvQo3Wod/ZyreE9c6IhxfWid8KN7GMC/q0FyoRYrHFX+oLlgjS32y4qUMeeO2iJmLr/pdVOqlSfDv38UUL91EUzzF94SlfhLx0rlZhDuYWg3NXLwQJqJ4GbiJKL4dz20Jxdtof1E9NtPjdbPv7JHibbS+qD4Tc4p3Q/E2WOpZ6mXgJppO/Bknh2Hbwvo15IbiIxLf/TEGwS/KCcVTvAzcRPGIZ6lXAVC8R1ithpiI4mXEkmhfKlHEB+3UxdLNeyJJdDA5AhEfdhgnkm4+IJJEFO/J3MSz1HsyO/F7UMQHgdvNTnokah8z4OJlL8oJovjTLunYSmKLF74oJ4Diz7rE1kceG1AN8TfqmBfV8m3/NbX++gAM8No7OO+S9tdYti7btmCMI15a6l07C4AjPuQKAZ99JY0RL39RniQt3pMxS71rHRN0s+OVz128z3LzFO+AiWYp3l3oBrgYbKp7e07Xm7B4j6mN/uWfU93Nd7ZeireBJd7Wcot4zwvD5yd+bqXe/p46K/Web8E5incDlSjwDGY84vX+sESa4kPPYMZS6hX/lMyMxJ/2iVcin
46MZ3JH8S2cdYpPIq+ejEc8S30LSYifvpvPmD5RAqU+hm4+RZLoqPejSHQExdsQJDqutzEkOobibVA8xQfDUj8v8ZN/gr4SFG/jPJHG38zou99qfo/irWG1GhpEfN8jVfXvUbw1rFZDg5T6I/EBzSmJD/x0XYoX0VXqg0a/SqkP/Txtij8idNx3JfIRf7QExVvDajXUlSh4M92ZyMf74TIs9dawWg0NL96NtvgtFG9j8FLvg26p3+F7MRjFi4g4kaN6UbyIiBNRfAtJJGKpP2esRP5TBskF371OG1G8CHuigJ0EwS0e/U4UDyS+e8YquQKP4t1LqYtf5k+vtg/yRVmu8vyyW3z3Pqromls08XMo9ZuLdfXPPLgs715eV1/L5YLicRP5il8VxnfzuHqw+mU74vOKlsWzzoa6f0ImpFv8orx/14hf5kW5vKy/1eB6c3F8uYEY8dUTI31TULxWovFvOvEVv9/GF0b8hiO+RinRBLeZ+YpvZvV3r9ZmVn9Zz+p3A57ixcQs3oJrHdtulr+22YqPuNTLxSu8q+crPt7JHcW3gptoPPEs9R8tfTBn8R/F7icW35I+MFF31XMk8u+4KMVLq/2U4rOsLf1I4gM6juJthIrPdpz+YKRSjyr+ITVoqe/0PloizFKvdnhiWvFtP8GdbiKKlxeQ8FLfsUaKt3GjdVyqWdPD+6j/G2r0jwx2Mk/xwteyD1t/pXgbMxY/Qal/4LzUK1UzrT7yn/0Dipejl0hr/qKUyD8OxYugeBsDi+/R84qFlaW+m2HF9xlzo48vJ5zcWcO2fVNbfOin1+hA8dawrd/VLfXBn16jA8Vbw2o1pCX+LNH4O5inULwNpVJ/lmj8Q0pnULyNoRJR/BHpiGepPyIh8b2heGtYrYaAEg31OVwUL2LwRIN98h7Fi6B4G0N2c7/ZFI54lvo67Nl3eu4/AYkPheJtRCR+m5/ibSJRSn2P8zyniXofEBhT/I0ipiM6f6i6JhVaE1lfg9+yIU14JGpdMK4Rb63dIFvUPif4bk6/2/fobyqlvh/xlPod56W+p3lY8dawWg15JfLpd99EzrbOE1H8QVithnwSeXW8ZyJ3Wy2JEiv11rBaDfmLd/T+kOL7QfE2vEu9y1i/Ut/SJsVbw2o15J1IS7yzUYq3htVqyD+RTqk/bZPiw0DZnXPBUh/IXMS3QPHWsFoNMRHFy6gTadxWQfHWsFoNqSayz/w93xUUbw2r1dB44n0PvFK8NaxWQ+OV+qHEd7ZK8TZGTDRMqe9+P1G8DfhEFN+PwT6OqTcs9dawWg0N9QFs/Ul3cqd52YMTio9GvOZlD27wS30nFG9j9hsfj+UiEQ9Y6k3iuBIZdolc/RmNeB9i6ea6U+saFUuiPQ8bH9eVBBQfzO5qPIp3ME/x4aXepoKl3hpWqyGNUl8Tksg6CNOd3HmF1WpokkQq4p1zYN8bSylexNil3n2xv++t5BQvYuxEtcSgG0u7bsKjeBGjJ3Le5XGSqPPmS4oXMU2igFLf+SaheBHxJ+o8b+8rfpk/vdo+yBfm/1VB8cCJfMVvLtbVP/Pgsrx7eV39n1M8ciJf8dX4rn0bzIPPLz7V4vOKzipBYOgWvyjv3zXil3n9JtgUu5+53lwcX24gRnz1ZGVG+s68ax3sZjfRit9v44vtvI4j/qNqIt0/vOxezlN8M6u/e7U2s/rLkuIb9BJp/UUrnqSxEWEiireF1WooxkSxlnqKbwM3EcWLwE1E8SJwE1G8CNxEFC8CN9HQ4s1cld3sJiCRzkcuDiy+3juF7mY7EyRy7PBTvI0pE7Vrm5t4lvpTOrwNU+pt1+YNLN4w+qc9OIEWb+cwka04RChe/vkuTpBLvQOKt5HKxoel/oRUxNuWi1B8d1ithqJNJD9FR/E2Yk2kcFKe4m2MkijIIcVbw2o1NEaiMIks9dawWg3FIn6/BCd31rBaDUVS6g/eGxRvDavVUCSJKN43rFZDsSRiqfcMq9UQE1G8D
NxEFC8CNxHFi8BNRPEicBNRvAjcRBQvAjcRxYvATaQh/mYsxluTLwGJsmy4GAd4J+KIl+CfyHHOBnLEu9ZB8R8pXgSyeMfJOoq3htVqiIkoXgZuIooXgZuI4kXgJqJ4EbiJKF4EbiKKF4GbiOJF4CaieBG4iSheBG4iihfhlcjnviiKt4bVamjURF430VG8NaxWQxRP8TJY6m2kLt4HireG1WqIiSheBm4iiheBm4jiReAmongRuIkoXgRuIooXgZuI4kXgJqJ4EbiJKF4EbiKKF4GbiOJF4CaieBG4iSheBG4ib/HL/OnV9kG+KO/f5vnFmuJxE/mK31ysN7XpzWV59/K6+louFxSPm8hX/KowvpvH2wcrIz6v6KwSBIZu8VV5f9eIX+aF+c8Mej9u3Iv4ofYWY6JTvEa8eVIuvb3rEV9tmU0ij218YcTfv110LTkgs+nmAdEW38zq716tzaz+0nzJm4o/JrPp5gFRFx8Ds+nmAZmleDIYFJ8oFJ8oFJ8oMYo3xw/MuYFid8Lg4bRBFIlMmIkTbc+cHPZOaKIIxW/yv1yXq/rcQHMw4eGQQgSJ7v9runfqRJuz3glOFJ/4+/+ZQ8X1scKiOXx4dBBx4kR3//5HfllOnciwWhz2TnCi+MSX9TkCU8x+KZoTBgenDSZPtDFjfzF9IjPoD3snOFGs4g2rSEb8YSLzdVNMn8icOZnliDdXAfzrKpJt/GGiwhTZqRM1Z05mto3fjq/6up9YZvVHifZ7G5OxPXMys1k9GQOKTxSKTxSKTxSKTxSKTxSKTxSKL8vXT8rbHz58eZz95k1Z3n6fZdXzP/xknswYii/LXx+Zf68fle+/K7/86Xl5+/s3t98/mTrVwFB8WX75+cN/nhvjX36sR3n1X+V+6lQDQ/Fl+fWvf//ZVPos++Z5VfizquRTfBK8//Ojh9H++Eld6ik+BW5/Z0Z6taX/9oMxXj2l+CT4+rc3ZqzXlf59lv32pycUnwS3f5w6wfhQfDXIzUhPDYpPFIpPFIpPFIpPFIpPFIpPFIpPFIpPlP8D/fo5hRL0h4cAAAAASUVORK5CYII=" alt="plot of chunk unnamed-chunk-4"/> </p>
<p>While this is only a visual judgment at this point, the top batting average clearly trends downward over time.</p>
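<p>A simple way to put a number on that impression is to fit a straight line through the yearly maxima. This is just a sketch, with a hypothetical helper name, and a linear fit is a simplifying assumption rather than a claim about the true shape of the trend.</p>
<pre><code class="r"># Hypothetical helper: slope of a straight-line fit of topBA on year.
trend.slope = function(dat) {
  dat = dat[is.finite(dat$topBA), ]   # drop -Inf from years with no qualifying player
  fit = lm(topBA ~ year, data = dat)
  coef(fit)[["year"]]                 # estimated change in top average per year
}

# Possible usage with the data frame built above:
# trend.slope(BA.dat)
</code></pre>
<p>A negative slope would back up the downward trend seen in the plot.</p>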
</body>
</html>
Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com1tag:blogger.com,1999:blog-8973439534644845561.post-57159777885549735162012-11-14T10:10:00.003-08:002012-11-14T10:10:48.367-08:00WelcomeI hope this blog can be a useful resource for those who are relative newcomers to both Decision Science and R -- I plan to post brief commentary on academic papers I find interesting, and short tutorials introducing methods of data analysis in R.Mark T Pattersonhttp://www.blogger.com/profile/13617500268567880565noreply@blogger.com0