SAS, SPSS, Stata Users: Learn R from Home April 21

March 11, 2014
Has learning R been driving you a bit crazy? If so, it may be that you’re “lost in translation.” On April 21 and 23, I’ll be teaching a webinar, R for SAS, SPSS and Stata Users. With each R concept, …

Machine Learning Lesson of the Day – Introduction to Linear Basis Function Models

Given a supervised learning problem of using inputs to predict a continuous target, the simplest model to use would be linear regression. However, what if we know that the relationship between the inputs and the target is non-linear, but we are unsure of exactly what form this relationship has? One way to overcome […]

VB News – Statwing picks up funding from data science luminary Hammerbacher

March 10, 2014
From: http://venturebeat.com/2014/01/30/statwing-picks-up-funding-from-data-science-luminary-hammerbacher/ Above: A correlation as shown in Statwing's software. Image Credit: Statwing. January 30, 2014 3:01 PM, Jordan Novet. Big data projects are tr...

Man at work(-ish)

March 10, 2014
Perhaps one could argue that the obvious, manly activity to do at the weekend when you're home alone is to put away and organise stuff in the garage. Well, I was home alone last weekend and my very own version of this was to arxiv the first p...

Using old versions of R packages

March 10, 2014
I received this email yesterday: I have been using your ‘forecast’ package for more than a year now. I was on R version 2.15 until last week, but I am having issues with lubridate package, hence decided to update R version to R 3.0.1. In our organization even getting an open source application require us to go through a whole lot of approval processes. I asked for R 3.0.1, before…

Testing for Multivariate Normality

March 9, 2014
In a recent post I commented on the connection between the multivariate normal distribution and marginal distributions that are normal. Specifically, the latter do not necessarily imply the former. So, let's think about this in terms of testing for norm...

Loss‐Efficient Factor Selection

March 9, 2014
Alexi Onatski has an interesting recent paper, "Asymptotic Analysis of the Squared Estimation Error in Misspecified Factor Models." There's also an Appendix. Four interesting cases have emerged in the literature, corresponding to two types o...

Money(proper foot)ball?

March 9, 2014
This is an interesting (although a bit overused, of late) topic. In some quarters, we statisticians are all akin to "moneyballs" (by the way: I should say I haven't read the book or watched the movie, but that's by design, as I suspect I wouldn't re...

Andrew Gelman, the Early Years

March 9, 2014
Andrew Gelman recently reminisced about some early research (see here, here, and here). One of those earlier links mentioned a conference Gelman attended early in his career, which included Jaynes. I have the proceedings to that conference and was able to grab th...

Can a classifier that never says “yes” be useful?

March 8, 2014
Many data science projects and presentations are needlessly derailed because shared, business-relevant quantitative expectations were not set early on (for some advice see Setting expectations in data science projects). One of the most common issues is the layman's expectation of “perfect prediction” from classification projects. It is important to set expectations correctly so […]