Blog Archives

Neglected optimization topic: set diversity

February 8, 2016

The mathematical concept of set diversity is a somewhat neglected topic in current applied decision sciences and optimization. We take this opportunity to discuss the issue. Consider the following problem: for a number of items U = {x_1, ..., x_n} pick a small set of them X = {x_i1, x_i2, ..., x_ik} such …

Free video course: applied Bayesian A/B testing in R

February 4, 2016

As a “thank you” to our blog, mailing list, and Twitter followers (@WinVectorLLC) we at Win-Vector LLC have decided to re-release our formerly fee-based A/B testing video course as a free (advertisement supported) video course here on YouTube. The course emphasizes how to design A/B tests using prior “guesstimates” of effect sizes (often you have …

Win-Vector data science mailing list (and a give-away!)

January 20, 2016

Win-Vector LLC is starting a data science mailing list that we would like you to sign up for. It is going to be a (deliberately infrequent) set of updates including Win-Vector LLC notices, upcoming speaking events, and data science products. To kick this off we will be awarding 5 free permanent subscriptions to our video …

Prepping Data for Analysis using R

January 20, 2016

Nina and I are proud to share our lecture: “Prepping Data for Analysis using R” from ODSC West 2015. It is about 90 minutes, and covers a lot of the theory behind the vtreat data preparation library. We also have a GitHub repository including all the lecture …

Nina Zumel and John Mount part of R Day at Strata + Hadoop World in San Jose 2016

January 17, 2016

Nina Zumel and I are honored to have been invited to be part of Strata + Hadoop World in San Jose 2016 R Day organized by RStudio and O’Reilly. We have written a lot on the topic of model validation in R and we are very excited to distill it down to an exciting tutorial. …

Using Excel versus using R

January 15, 2016

Here is a video I made showing how R should not be considered “scarier” than Excel to analysts. One of the takeaway points: it is easier to email R procedures than Excel procedures. Win-Vector’s John Mount shows a simple analysis both in Excel and in R. A save of the “email” linking to all code …

Practical Data Science with R examples

December 11, 2015

One of the big points of Practical Data Science with R is to supply a large number of fully worked examples. Our intent has always been for readers to read the book, and if they wanted to follow up on a data set or technique to find the matching worked examples in the project directory …

Sequential Analysis

December 11, 2015

We here at Win-Vector LLC have been working through an ad-hoc series about A/B testing combining elements of both operations research and statistical points of view:
A dynamic programming solution to A/B test design
Why does designing a simple A/B test seem so complicated?
A clear picture of power and significance in A/B tests
Bandit Formulations …

What was data science before it was called data science?

December 2, 2015

“Data Science” is obviously a trendy term making its way through the hype cycle. Either nobody is good enough to be a data scientist (unicorns) or everybody is too good to be a data scientist (or the truth is somewhere in the middle). Gartner hype cycle (Wikipedia). And there is a quarter that grumbles that …

Free gradient boosting lecture

November 21, 2015

We have always regretted that we didn’t get to cover gradient boosting in Practical Data Science with R (Manning 2014). To try to make up for that we are sharing (for free) our GBM lecture from our (paid) video course Introduction to Data Science. (link, all support material here). Please help us get the word out …

