Blog Archives

My favorite R bug

May 23, 2015

In this note I am going to recount “my favorite R bug.” It isn’t a bug in R. It is a bug in some code I wrote in R. I call it my favorite bug because it is easy to commit and (thanks to R’s overly helpful nature) takes longer than it should to find. The …

What is new in the vtreat library?

May 7, 2015

The Win-Vector LLC vtreat library is a library we supply (under a GPL license) for automating the simple, domain-independent part of variable cleaning and preparation. The idea is that you supply (in R) an example general data.frame to vtreat’s designTreatmentsC method (for single-class categorical targets) or designTreatmentsN method (for numeric targets) and vtreat returns a …

I still think you can manufacture an unfair coin

April 13, 2015

In Gelman and Nolan’s paper “You Can Load a Die, But You Can’t Bias a Coin” (The American Statistician, November 2002, Vol. 56, No. 4), it is argued that you can’t easily produce a coin that is biased when flipped (and caught). A number of variations that can be easily biased (such as spinning) are also …

What can be in an R data.frame column?

April 9, 2015

As an R programmer, have you ever wondered what can be in a data.frame column? The documentation is a bit vague; help(data.frame) returns some comforting text, including: “Value: A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on).” If you ask an R programmer …

New video course: Campaign Response Testing

April 8, 2015

I am proud to announce a new Win-Vector LLC statistics video course: Campaign Response Testing (John Mount, Win-Vector LLC). This course works through the very specific statistics problem of trying to estimate the unknown true response rates of one or more populations in responding to one or more sales/marketing campaigns or price-points. This is an old …

How and why to return functions in R

April 3, 2015

One of the advantages of functional languages (such as R) is the ability to create and return functions “on the fly.” We will discuss one good use of this capability and what to look out for when creating functions in R. Why wrap/return functions? One of my favorite uses of “on the fly functions” is …

One place not to use the Sharpe ratio

March 23, 2015

Having worked in finance, I am a public fan of the Sharpe ratio. I have written about this here and here. One thing I have often forgotten (driving some bad analyses) is that the Sharpe ratio isn’t appropriate for models of repeated events that already have linked mean and variance (such as Poisson or Binomial models) …

The Win-Vector R data science value pack

March 11, 2015

Win-Vector LLC is proud to announce the R data science value pack: 50% off our video course Introduction to Data Science (available at Udemy) and 30% off Practical Data Science with R (from Manning). Pick any combination of video, e-book, and/or print-book you want. Instructions below. Please share and Tweet! For 50% off the video …

Announcing: Introduction to Data Science video course

February 25, 2015

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data science projects and become a data …

Check your return types when modeling in R

January 27, 2015

Just a warning: double-check your return types in R, especially when using different modeling packages. We consider ourselves pretty familiar with R. We have years of experience, many other programming languages to compare R to, and we have taken Hadley Wickham’s Master R Developer Workshop (highly recommended). We already knew R’s predict function is …

