Category: machine learning

vtreat Cross Validation

Nina Zumel finished new documentation on how vtreat‘s cross validation works, which I want to share here. vtreat is a system that makes data preparation for machine learning a “one-liner” (available in R or available in Python). We have a set of starting off points here. These documents describe what vtreat does for you, you … Continue reading vtreat Cross Validation

What is vtreat?

vtreat is a DataFrame processor/conditioner that prepares real-world data for supervised machine learning or predictive modeling in a statistically sound manner. vtreat takes an input DataFrame that has a specified column called “the outcome variable” (or “y”) that is the quantity to be predicted (and must not have missing values). Other input columns are possible … Continue reading What is vtreat?

Speaking at BARUG

We will be speaking at the Tuesday, September 3, 2019 BARUG. If you are in the Bay Area, please come see us. Nina Zumel & John Mount Practical Data Science with R Practical Data Science with R (Zumel and Mount) was one of the first, and most widely-read books on the practice of doing Data … Continue reading Speaking at BARUG

conditional noise contrastive estimation

At ICML last year, Ciwan Ceylan and Michael Gutmann presented a new version of noise constrative estimation to deal with intractable constants. While noise contrastive estimation relies upon a second independent sample to contrast with the observed sample, this approach uses instead a perturbed or noisy version of the original sample, for instance a Normal […]

visualising bias and unbiasedness

A question on X validated led me to wonder at the point made by Christopher Bishop in his Pattern Recognition and Machine Learning book about the MLE of the Normal variance being biased. As it is illustrated by the above graph that opposes the true and green distribution of the data (made of two points) […]

tenure track position in Clermont, Auvergne

My friend Arnaud Guillin pointed out this opening of a tenure-track professor position at his University of Clermont Auvergne, in Central France. With specialty in statistics and machine-learning, especially deep learning. The deadline for applications is 12 May 2019. (Tenure-track positions are quite rare in French universities and this offer includes a limited teaching load […]

Stein’s method in machine learning [workshop]

There will be an ICML workshop on Stein’s method in machine learning & statistics, next July 14 or 15, located in Long Beach, CA. Organised by François-Xavier Briol (formerly Warwick), Lester Mckey, Chris Oates (formerly Warwick), Qiang Liu, and Larry Golstein. To quote from the webpage of the workshop Stein’s method is a technique from […]

Stein’s method in machine learning [workshop]

There will be an ICML workshop on Stein’s method in machine learning & statistics, next July 14 or 15, located in Long Beach, CA. Organised by François-Xavier Briol (formerly Warwick), Lester Mckey, Chris Oates (formerly Warwick), Qiang Liu, and Larry Golstein. To quote from the webpage of the workshop Stein’s method is a technique from […]

position in statistics and/or machine learning at ENSAE ParisTech‐CREST

ENSAE ParisTech and CREST are currently inviting applications for a position of Assistant or Associate Professor in Statistics or Machine Learning. The appointment starts in September, 2019, at the earliest. At the level of Assistant Professor, the position is for an initial three-year term renewable for another three years before the tenure evaluation. Salary is […]

statlearn 2019, Grenoble

In case you are near the French Alps next week, STATLEARN 2019 will take place in Grenoble, the week after next, 04 and 05 April. The program is quite exciting, registration is free of charge!, and still open, plus the mountains are next door!

call for sessions and labs at Bay2sC0mp²⁰

A call to all potential participants to the incoming BayesComp 2020 conference at the University of Florida in Gainesville, Florida, 7-10 January 2020, to submit proposals [to me] for contributed sessions on everything computational or training labs [to David Rossell] on a specific language or software. The deadline is April 1 and the sessions will […]

prepaid ABC

Merijn Mestdagha, Stijn Verdoncka, Kristof Meersa, Tim Loossensa, and Francis Tuerlinckx from the KU Leuven, some of whom I met during a visit to its Wallon counterpart Louvain-La-Neuve, proposed and arXived a new likelihood-free approach based on saving simulations on a large scale for future users. Future users interested in the same model. The very […]

Nature Outlook on AI

The 29 November 2018 issue of Nature had a series of papers on AIs (in its Outlook section). At the general public (awareness) level than in-depth machine-learning article. Including one on the forecasted consequences of ever-growing automation on jobs, quoting from a 2013 paper by Carl Frey and Michael Osborne [of probabilistic numerics fame!] that […]