Statistics Blogs

Some Neat New R Notations

August 22, 2017

The R package seplyr supplies a few neat new coding notations. The first notation is an operator called the “named map builder”. This is a cute notation that essentially does the job of stats::setNames(). It allows for code such as the following: library("seplyr") names <- c('a', 'b') …
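
For readers unfamiliar with stats::setNames(), the base R function the excerpt compares the operator to, here is a minimal sketch using base R only. The vectors keys and values are made-up illustration data, and the snippet does not use or reproduce seplyr's operator; it only shows the kind of named vector ("map") being built.

```r
# Base R only: stats::setNames() attaches names to an object and returns it.
# `keys` and `values` are hypothetical example data, not taken from the post.
keys <- c("a", "b")
values <- c("x", "y")

# Build a named character vector mapping each key to its value.
named_map <- stats::setNames(values, keys)

# Look up a value by key; named_map[["a"]] is "x".
print(named_map[["a"]])
```

The result behaves like a small lookup table: names(named_map) returns the keys, and indexing by name returns the matching value.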

Bayesian Random Projection (More on Terabytes of Economic Data)

August 20, 2017

Some additional thoughts related to Serena Ng's World Congress piece (earlier post here, with a link to her paper): The key newish dimensionality-reduction strategies that Serena emphasizes are random projection and leverage score sampling. I...

Is dplyr Easily Comprehensible?

August 20, 2017

dplyr is one of the most popular R packages. It is powerful and important. But is it in fact easily comprehensible? dplyr makes sense to those of us who use it a lot. And we can teach part-time R users a lot of the common good use patterns. But, is it an easy task …

Use the LENGTH statement to pre-set the lengths of character variables in SAS – with a comparison to R

I often create character variables (i.e. variables with strings of text as their values) in SAS, and they sometimes don’t render as expected. Here is an example involving the built-in data set SASHELP.CLASS. Here is the code: data c1; set sashelp.class; * define a new character variable to classify someone as tall or […]

Thank You For The Very Nice Comment

August 16, 2017

Somebody nice reached out and gave us this wonderful feedback on our new Supervised Learning in R: Regression (paid) video course: “Thanks for a wonderful course on DataCamp on XGBoost and Random forest. I was struggling with Xgboost earlier and Vtreat has made my life easy now :).” Supervised Learning in R: Regression covers a …

Gelman digested read

August 16, 2017

It's hard to keep up with Andrew Gelman, so let me point to some interesting recent posts from his blog. Readings on philosophy of statistics (link): Andrew has a bunch of links to (mostly his own) writings about deep statistical issues. Science is about understanding how the world works, which involves questions of cause and effect, and randomness and unexplained variability. Data that can be observed are almost never sufficient…

Performance or Probativeness? E.S. Pearson’s Statistical Philosophy

August 16, 2017

This is a belated birthday post for E.S. Pearson (11 August 1895 – 12 June 1980). It’s basically a post from 2012 which concerns an issue of interpretation (long-run performance vs probativeness) that’s badly confused these days. I’ll blog some E. Pearson items this week, including my latest reflection on a historical anecdote regarding Egon and the woman he wanted […]

Update on inference with Wasserstein distances

August 15, 2017

Hi again. As described in an earlier post, Espen Bernton, Mathieu Gerber, Christian P. Robert and I are exploring Wasserstein distances for parameter inference in generative models. Generally, ABC and indirect inference are fun to play with, as they make the user think about useful distances between data sets (i.i.d. or not), which is sort of implicit in classical […]

Did web scraping just receive a legal boost?

August 15, 2017

Kaiser Fung, founder of Principal Analytics Prep and author of Numbersense, discusses a recent legal ruling against LinkedIn's technologies that restrict web scraping.

A Stan case study, sort of: The probability my son will be stung by a bumblebee

August 14, 2017

The Stan project for statistical computation has a great collection of curated case studies which anybody can contribute to, maybe even me, I was thinking. But I don’t have time to worry about that right now because I’m on vacation, being on the ...
