Posts Tagged ‘Rstats’

How Predictable is the English Premier League?

May 19, 2015

The reason why football is so exciting is uncertainty. The outcome of any match or league is unknown, and you get to watch the action unfold without knowing what’s going to happen. Watching matches where you know the score is never exciting. This weekend the English Premier League season will conclude with little fanfare. Bar […]

R style default plot for Pandas DataFrame

March 28, 2015

The default plot method for dataframes in R is to show each numeric variable in a pair-wise scatter plot. I find this to be a really useful first look at a dataset, both to see correlations and joint distributions between variables and to quickly diagnose potential strangeness like bands of repeating values or outliers. […]

Calling R from Scala sbt projects

January 24, 2015

Overview: In previous posts I’ve shown how the jvmr CRAN R package can be used to call Scala sbt projects from R and inline Scala Breeze code in R. In this post I will show how to call R from a Scala sbt project. This requires that R and the jvmr CRAN R package …

Inlining Scala Breeze code in R using jvmr and sbt

January 3, 2015

Introduction: In the previous post I showed how to call Scala code from R using sbt and jvmr. The approach described in that post is the one I would recommend for any non-trivial piece of Scala code – mixing up code from different languages in the same source code file is not a good strategy …

Calling Scala code from R using jvmr

January 2, 2015

Introduction: In previous posts I have explained why I think that Scala is a good language to use for statistical computing and data science. Despite this, R is very convenient for simple exploratory data analysis and visualisation – currently more convenient than Scala. I explained in my recent talk at the RSS what (relatively straightforward) …

One-way ANOVA with fixed and random effects from a Bayesian perspective

December 22, 2014

This blog post is derived from a computer practical session that I ran as part of my new course on Statistics for Big Data, previously discussed. This course covered a lot of material very quickly. In particular, I deferred introducing notions of hierarchical modelling until the Bayesian part of the course, where I feel it …

R resources

December 3, 2014

This is the third in my weekly series of posts pointing out resources on this site. This week’s topic is R: R language for programmers; Default arguments and lazy evaluation in R; Distributions in R; Moving data between R and Excel via the clipboard; Sweave: First steps toward reproducible analyses; Troubleshooting Sweave; Regular expressions in […]

Statistical computing languages at the RSS

November 22, 2014

On Friday the Royal Statistical Society hosted a meeting on Statistical computing languages, organised by my colleague Colin Gillespie. Four languages were presented at the meeting: Python, Scala, Matlab and Julia. I presented the talk on Scala. The slides I presented are available, in addition to the code examples and instructions on how to run …

Statistics for Big Data

November 22, 2014

Doctoral programme in cloud computing for big data: I’ve spent much of this year working to establish our new EPSRC Centre for Doctoral Training in Cloud Computing for Big Data, which partly explains the lack of posts on this blog in recent months. The CDT is now established, with 11 students in the first cohort, …

One datavis for you, ten for me

September 14, 2014

Over the years of my graduate studies I made a lot of plots. I mean tonnes. To get an extremely conservative estimate I grep’ed for every instance of “plot\(” in all of the many R scripts I wrote over the past five years. The actual number is very likely orders of magnitude larger as 1) many […]
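The counting step described in this excerpt can be sketched in Python rather than grep. This is only an illustration and not the author's actual command: the function name `count_plot_calls`, the directory argument, and the choice to count every occurrence (rather than matching lines) are all assumptions.

```python
import re
from pathlib import Path

# Literal "plot(" pattern, as in the excerpt. Like grep, it also matches
# inside longer names such as "ggplot(" or "boxplot(".
PLOT_CALL = re.compile(r"plot\(")


def count_plot_calls(root):
    """Count occurrences of "plot(" in every .R / .r file under root.

    Hypothetical helper: the name and signature are illustrative only.
    """
    total = 0
    for path in Path(root).rglob("*"):
        # Compare the suffix case-insensitively so both .R and .r count.
        if path.is_file() and path.suffix.lower() == ".r":
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += len(PLOT_CALL.findall(text))
    return total
```

Calling `count_plot_calls(".")` from the top of a scripts directory would give a single total. As the excerpt notes, a count like this is a conservative estimate, since it only sees plotting calls that appear literally in the saved scripts.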
