Blog Archives

One place not to use the Sharpe ratio

March 23, 2015

Having worked in finance I am a public fan of the Sharpe ratio. I have written about this here and here. One thing I have often forgotten (driving some bad analyses) is: the Sharpe ratio isn’t appropriate for models of repeated events that already have linked mean and variance (such as Poisson or Binomial models) …
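To make the "linked mean and variance" point concrete, here is a minimal Python sketch (not taken from the post itself). It assumes a simplified Sharpe-style ratio, mean divided by standard deviation with a zero benchmark, and the function names poisson_ratio and binomial_ratio are hypothetical. For a Poisson count the variance equals the mean, so the ratio reduces to the square root of the rate and says nothing the mean did not already say.

```python
import math


def poisson_ratio(rate):
    """Mean / standard deviation for a Poisson count (assumed zero benchmark)."""
    mean = rate
    variance = rate  # Poisson: the variance is equal to the mean
    return mean / math.sqrt(variance)


def binomial_ratio(n, p):
    """Mean / standard deviation for a Binomial(n, p) count (assumed zero benchmark)."""
    mean = n * p
    variance = n * p * (1.0 - p)  # Binomial: variance is fixed by n and p
    return mean / math.sqrt(variance)


if __name__ == "__main__":
    # The ratio tracks sqrt(rate): it is a function of the mean alone.
    for rate in (1.0, 4.0, 100.0):
        print(rate, poisson_ratio(rate), math.sqrt(rate))
```

In both families the spread is determined by the same parameters as the mean, so dividing by it adds no independent measure of risk.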


The Win-Vector R data science value pack

March 11, 2015

Win-Vector LLC is proud to announce the R data science value pack: 50% off our video course Introduction to Data Science (available at Udemy) and 30% off Practical Data Science with R (from Manning). Pick any combination of video, e-book, and/or print-book you want. Instructions below. Please share and Tweet! For 50% off the video …

Announcing: Introduction to Data Science video course

February 25, 2015

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data science projects and become a data …

Check your return types when modeling in R

January 27, 2015

Just a warning: double check your return types in R, especially when using different modeling packages. We consider ourselves pretty familiar with R. We have years of experience, many other programming languages to compare R to, and we have taken Hadley Wickham’s Master R Developer Workshop (highly recommended). We already knew R’s predict function is …

R bracket is a bit irregular

January 17, 2015

While skimming Professor Hadley Wickham’s Advanced R I got to thinking about the nature of the square-bracket or extract operator in R. It turns out “[,]” is a bit more irregular than I remembered. The subsetting section of Advanced R has a very good discussion of the subsetting and selection operators found in R. In particular …

Is there a Kindle edition of Practical Data Science with R?

December 21, 2014

We have often been asked “why is there no Kindle edition of Practical Data Science with R on Amazon.com?” The short answer is: there is an edition you can read on your Kindle, but it is from the publisher Manning (not Amazon.com). The long answer is: when Amazon.com supplies a Kindle edition, readers have to …

A comment on preparing data for classifiers

December 4, 2014

I have been working through (with some honest appreciation) a recent article comparing many classifiers on many data sets: “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, Dinani Amorim; 15(Oct):3133−3181, 2014 (which we will call “the DWN paper” in this note). This paper applies 179 …

Can we try to make an adjustment?

November 14, 2014

In most of our data science teaching (including our book Practical Data Science with R) we emphasize the deliberately easy problem of “exchangeable prediction.” We define exchangeable prediction as: given a series of observations with two distinguished classes of variables/observations denoted “x”s (denoting control variables, independent variables, experimental variables, or predictor variables) and “y” (denoting …

Bias/variance tradeoff as gamesmanship

October 30, 2014

Continuing our series of reading out loud from a single page of a statistics book, we look at page 224 of the 1972 Dover edition of Leonard J. Savage’s “The Foundations of Statistics.” On this page we are treated to an example attributed to Leo A. Goodman in 1953 that illustrates how for normally distributed …

Factors are not first-class citizens in R

September 23, 2014

The primary user-facing data types in the R statistical computing environment behave as vectors. That is: one-dimensional arrays of scalar values that have a nice operational algebra. There are additional types (lists, data frames, matrices, environments, and so on) but the most common data types are vectors. In fact vectors are so common in R …

