## The joy of joining data.tables

June 10, 2014

The example I present here is a little silly, yet it illustrates how to join tables with data.table in R. Mapping old data to new data: categories in general are never fixed; they always change at some point. And then the trouble starts with the data. Fo...

## Mathematical and Applied Statistics Lesson of the Day – The Central Limit Theorem Can Apply to the Sum

The central limit theorem (CLT) is often stated in terms of the sample mean of independent and identically distributed random variables. An often unnoticed or forgotten aspect of the CLT is its applicability to the sample sum of those variables, too. Since $n$, the sample size, is just a constant, it can be multiplied by the sample mean $\bar{X}$ to obtain […]
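
As a sketch of why this holds (the symbols $X_i$, $\mu$, $\sigma^2$, $\bar{X}_n$ and $S_n$ are notation assumed here, not taken from the original post): for independent and identically distributed $X_1, \dots, X_n$ with mean $\mu$ and finite variance $\sigma^2 > 0$, the sum is $S_n = n\bar{X}_n$, so standardizing the sum gives exactly the same quantity as standardizing the mean:

$$
\frac{S_n - n\mu}{\sigma\sqrt{n}} = \frac{n(\bar{X}_n - \mu)}{\sigma\sqrt{n}} = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} N(0, 1).
$$

Informally, for large $n$ the sum is approximately $N(n\mu, n\sigma^2)$, just as the mean is approximately $N(\mu, \sigma^2/n)$.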

## Prévision de séries chronologiques (Time series forecasting)

June 9, 2014

In the second part of the forecasting models course, we will leave individual data behind (a little) to talk about time series data. This week's slides (and probably next week's) are online. In parallel, I have posted some lecture notes on time series, which may serve as a complement. As Doug Martin said, “Time series is the worst subject to teach. First,…

## “The medical press must become irrelevant to publication of clinical trials.”

June 9, 2014

“The medical press must become irrelevant to publication of clinical trials.” So said Stephen Senn at a recent meeting of the Medical Journalists’ Association with the title: “Is the current system of publishing clinical trials fit for purpose?” Senn has thrown a few stones in the direction of medical journals in guest posts on this […]

## At the Copa

June 9, 2014

This is the first post of a fairly regular series (at least I'll try to keep it this way!), dedicated to the impending FIFA World Cup (you may think I've gone all Barry Manilow, like Peter & co, but I can reassure you I haven't). Marta, Virgilio ...

## I hate polynomials

June 9, 2014

A recent discussion with Mark Palko [scroll down to the comments at this link] reminds me that I think that polynomials are way way overrated, and I think a lot of damage has arisen from the old-time approach of introducing polynomial functions as a canonical example of linear regressions (for example). There are very few […] The post I hate polynomials appeared first on Statistical Modeling, Causal Inference, and Social…

## On deck this week

June 9, 2014

Mon: I hate polynomials Tues: Spring forward, fall back, drop dead? Wed: Bayes in the research conversation Thurs: The health policy innovation center: how best to move from pilot studies to large-scale practice? Fri: Stroopy names Sat: He’s not so great in math but wants to do statistics and machine learning Sun: Comparing the full […] The post On deck this week appeared first on Statistical Modeling, Causal Inference, and…

## Piketty’s Empirics Are as Bad as His Theory

June 9, 2014

In my earlier Piketty post, I wrote, "If much of its 'reasoning' is little more than neo-Marxist drivel, much of its underlying measurement is nevertheless marvelous." The next day, recognizing the general possibility of a Reinhart-Rogoff error, b...

## A reader submits a Type DV analysis

June 9, 2014

Darin Myers at PGi was kind enough to send over an analysis of a chart using the Trifecta Checkup framework. I'm reproducing the critique in full, with a comment at the end. *** At first glance this looks like a...

## The Ways Probability Distributions Are Wrong.

June 9, 2014

Suppose there’s some aspect of our universe we’d like to know; perhaps it’s a physical measurement taken next week, or an unknown population average that exists today. Whatever it is, we use information to create a distribution which ...

## How to generate a grid of points in SAS

June 9, 2014

In many areas of statistics, it is convenient to be able to easily construct a uniform grid of points. You can use a grid of parameter values to visualize functions and to get a rough feel for how an objective function in an optimization problem depends on the parameters. And […]

## Tuning particle MCMC algorithms

June 8, 2014

Several papers have appeared recently discussing the issue of how to tune the number of particles used in the particle filter within a particle MCMC algorithm such as particle marginal Metropolis Hastings (PMMH). Three such papers are: Doucet, Arnaud, Michael Pitt, and Robert Kohn. Efficient implementation of Markov chain Monte Carlo when using an unbiased […]

## Enjoy the silence

June 8, 2014

I've been quite silent on the blog in the past few weeks, owing to a combination of exam-marking, conference-organisation and a few other (some more, some less interesting) things... As for Bayes Pharma, we're nearly there: the conference is this week Wednes...

## Regression and causality and variable ordering

June 8, 2014

Bill Harris wrote in with a question: David Hogg points out in one of his general articles on data modeling that regression assumptions require one to put the variable with the highest variance in the ‘y’ position and the variable you know best (lowest variance) in the ‘x’ position. As he points out, others speak […] The post Regression and causality and variable ordering appeared first on Statistical Modeling, Causal…

## New Award for David Hendry

June 7, 2014

It's difficult to imagine what our modern econometrics world would be like if it weren't for the numerous seminal contributions that Sir David Hendry has made over the course of his distinguished career. So, it was wonderful to see this announcement t...

## “Does researching casual marijuana use cause brain abnormalities?”

June 7, 2014

David Austin points me to a wonderfully-titled post by Lior Pachter criticizing a recent paper on the purported effects of cannabis use. Not the paper criticized here. Someone should send this all to David Brooks. I’ve heard he’s intereste...

## stone flakes

June 6, 2014

I browsed through the UC Irvine Machine Learning Repository the other day and noticed a nice data set regarding stone flakes produced by our ancestors, the prehistoric men. To quote the dataset owners: 'The data set concerns the earliest history ...

## Frequentist vs. Bayesian Analysis

June 6, 2014

"Statisticians should readily use both Bayesian and frequentist ideas."So begins a 2004 paper by Bayarri and Berger, "The Interplay of Bayesian and Frequentist Analysis", Statistical Science, 19(1), 58-80.Let's re-phrase that opening sentence: "Econome...

## Hurricanes vs. Himmicanes

June 6, 2014

The story’s on the sister blog and I quote liberally from Jeremy Freese, who wrote: The authors have issued a statement that argues against some criticisms of their study that others have offered. These are irrelevant to the above observations, as I [Freese] am taking everything about the measurement and model specification at their word–my […] The post Hurricanes vs. Himmicanes appeared first on Statistical Modeling, Causal Inference, and Social…

## R style tip: prefer functions that return data frames

June 6, 2014

While following up on Nina Zumel’s excellent Trimming the Fat from glm() Models in R I got to thinking about code style in R. And I realized: you can make your code much prettier by designing more of your functions to return data.frames. That may seem needlessly heavy-weight, but it has a lot of down-stream […] Related posts: Prefer = for assignment in R Your Data is Never the Right…

## Statistically savvy journalism

June 6, 2014

Roy Mendelssohn points me to this excellent bit of statistics reporting by Matt Novak. I have no comment, I just think it’s good to see this sort of high-quality Felix Salmon-style statistically savvy journalism. The post Statistically savvy jou...

## The Real Reason Reproducible Research is Important

June 6, 2014

Reproducible research has been on my mind a bit these days, partly because it has been in the news with the Piketty stuff, and also perhaps because I just published a book on it and I'm teaching a class on …