Posts Tagged ‘statistics’

Estimating a Proportion on a Sensitive Question, by Survey

April 26, 2016

As Slate reported yesterday (in an article that, unfortunately, is not worth reading), a nice problem was set in the mathematics baccalauréat exam at the French lycée in Pondicherry. The problem is all the more interesting because it revisits a fairly classic method for asking people about sensitive matters (here illegal downloading, but the same question could be asked about fraud or sexual practices). As the…

On Nested Models

April 26, 2016

We have recently been working on, and presenting on, nested modeling issues. These are situations where the output of one trained machine learning model is part of the input of a later model or procedure. I am now of the opinion that correct treatment of nested models is one of the biggest opportunities for improvement …

Talk on Causality with Non-Gaussian Time Series, at Paris 7 Diderot

April 26, 2016

This Monday, I will be giving a talk at Paris 7, room 1016 of the Sophie Germain building, on causality with non-Gaussian time series. Slides are now online.

Sharp-R Time Series

April 22, 2016

This is the second part in a series of posts on the flexibility of our Excel add-in Sharp-R, which allows functions defined in R code to be run on data in any Excel worksheet. Part one looked at exploratory data analysis. This post deals with time serie...

Principal curves example (Elements of Statistical Learning)

April 21, 2016

The bit of R code below illustrates the principal curves method as described in The Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman (Ch. 14; the book is freely available from the authors' website). Specifically, the code generates some bivariate data that have a nonlinear association, initializes the principal curve using the first (linear) principal …

Improved vtreat documentation

April 17, 2016

Nina Zumel has donated some time to greatly improve the vtreat R package documentation (now available as pre-rendered HTML here). vtreat is an R data.frame processor/conditioner package that helps prepare real-world data for predictive modeling in a statistically sound manner. Even with modern machine learning techniques (random forests, support vector machines, neural nets, gradient boosted …

Sharp-R Data Analysis

April 14, 2016

As an illustration of how Sharp-R can be used for Excel statistics, this post is part 1 in a series that shows how various analysis methods defined in R code can be easily applied to Excel data. We have now posted the second article, which deals with time...

Sharp-R Update

April 11, 2016

Sharp-R, our Excel R interface, has just been updated to version 1.1 to address some of the minor issues that weren’t dealt with before the first release. This release improves the quality of the R plots returned to Excel. The previous version was cop...

Half off Win-Vector data science books and video training!

April 8, 2016

We are pleased to announce that our book Practical Data Science with R (Nina Zumel, John Mount, Manning 2014) is part of Manning’s “Deal of the Day” for April 9th, 2016. This one-day-only offer gets you half off the physical book (with free e-copy) or the paid e-copy (e-copy simultaneous pdf + ePub + kindle, …

I am a data scientist

April 8, 2016

Three years ago this week, I wrote a blog post, “Data science is statistics”. I was fiercely against the term at that time, as I felt that we already had a data science, and it was called Statistics. It was a short post, so I might as well quote the whole thing: When physicists do […]
