## Regression and causality and variable ordering

June 8, 2014

Bill Harris wrote in with a question: David Hogg points out in one of his general articles on data modeling that regression assumptions require one to put the variable with the highest variance in the ‘y’ position and the variable you know best (lowest variance) in the ‘x’ position. As he points out, others speak […] The post Regression and causality and variable ordering appeared first on Statistical Modeling, Causal…

## New Award for David Hendry

June 7, 2014

It's difficult to imagine what our modern econometrics world would be like if it weren't for the numerous seminal contributions that Sir David Hendry has made over the course of his distinguished career. So, it was wonderful to see this announcement t...

## “Does researching casual marijuana use cause brain abnormalities?”

June 7, 2014

David Austin points me to a wonderfully-titled post by Lior Pachter criticizing a recent paper on the purported effects of cannabis use. Not the paper criticized here. Someone should send this all to David Brooks. I’ve heard he’s intereste...

## stone flakes

June 6, 2014

I browsed through the UC Irvine Machine Learning Repository the other day and noticed a nice data set regarding stone flakes produced by our ancestors, the prehistoric men. To quote the dataset owners: 'The data set concerns the earliest history ...

## Frequentist vs. Bayesian Analysis

June 6, 2014

"Statisticians should readily use both Bayesian and frequentist ideas." So begins a 2004 paper by Bayarri and Berger, "The Interplay of Bayesian and Frequentist Analysis", Statistical Science, 19(1), 58-80. Let's re-phrase that opening sentence: "Econome...

## Hurricanes vs. Himmicanes

June 6, 2014

The story’s on the sister blog and I quote liberally from Jeremy Freese, who wrote: The authors have issued a statement that argues against some criticisms of their study that others have offered. These are irrelevant to the above observations, as I [Freese] am taking everything about the measurement and model specification at their word–my […] The post Hurricanes vs. Himmicanes appeared first on Statistical Modeling, Causal Inference, and Social…

## R style tip: prefer functions that return data frames

June 6, 2014

While following up on Nina Zumel’s excellent Trimming the Fat from glm() Models in R I got to thinking about code style in R. And I realized: you can make your code much prettier by designing more of your functions to return data.frames. That may seem needlessly heavy-weight, but it has a lot of down-stream […] Related posts: Prefer = for assignment in R Your Data is Never the Right…

## Statistically savvy journalism

June 6, 2014

Roy Mendelssohn points me to this excellent bit of statistics reporting by Matt Novak. I have no comment, I just think it’s good to see this sort of high-quality Felix Salmon-style statistically savvy journalism. The post Statistically savvy jou...

## The Real Reason Reproducible Research is Important

June 6, 2014

Reproducible research has been on my mind a bit these days, partly because it has been in the news with the Piketty stuff, and also perhaps because I just published a book on it and I'm teaching a class on …

## Applied Statistics Lesson of the Day – What “Linear” in Linear Regression Really Means

Linear regression is one of the most commonly used tools in statistics, yet one of its fundamental features is commonly misunderstood by many non-statisticians.  I have witnessed this misunderstanding on numerous occasions in my work experience in statistical consulting and statistical education, and it is important for all statisticians to be aware of this common […]

## The Deviance Information Criterion

June 5, 2014

A few years ago - twelve, to be specific - an interesting paper appeared in the Journal of the Royal Statistical Society. That paper, "Bayesian measures of model complexity and fit", by Spiegelhalter et al., stirred up a good deal of controve...

## Forecasting models: miscellany

June 5, 2014

In the last session of the forecasting models course, last week, we spent some time on outliers and influential points. Everything is explained in the slides (with the code), so I won't go back over it. I could just mention a few lines of code used to see the impact of an observation on the regression, by removing the observation from the data set and looking at what…

## Stephen Senn: Blood Simple? The complicated and controversial world of bioequivalence (guest post)

June 5, 2014

Blood Simple? The complicated and controversial world of bioequivalence by Stephen Senn* Those not familiar with drug development might suppose that showing that a new pharmaceutical formulation (say a generic drug) is equivalent to a formulation that has a licence (say a brand name drug) ought to be simple. However, it can often turn out to […]

## Identifying pathways for managing multiple disturbances to limit plant invasions

June 5, 2014

Andrew Tanentzap, William Lee, Adrian Monks, Kate Ladley, Peter Johnson, Geoffrey Rogers, Joy Comrie, Dean Clarke, and Ella Hayman write: We tested a multivariate hypothesis about the causal mechanisms underlying plant invasions in an ephemeral wetland in South Island, New Zealand to inform management of this biodiverse but globally imperilled habitat. . . . We […] The post Identifying pathways for managing multiple disturbances to limit plant invasions appeared first…

## Know your data 15: the false promise of data correction

June 5, 2014

It's a good thing that the FTC is making some noise about regulating the snooping done by online services. It's not a good thing that the measures described in the article ("tools to view, suppress and fix the information") do not solve the fundamental problem, and are likely counter-productive. What's the fundamental problem? Imagine a world in which you walk into your supermarket. When you check out, you are required…

## Approches Statistiques du Risque

June 5, 2014

It seems that the book Approches statistiques du risque, edited by Jean Jacques Droesbeke, Myriam Maumy-Bertrand, Gilbert Saporta and Christine Thomas-Agnan, has finally been published, according to Gilbert Saporta. Or it should appear, at worst, in the coming days (I don't see it yet on the Technip website). The 400 pages cover the mini-courses that we gave at the Journées d’Etudes Statistique in 2010, at the CIRM, with Patrice Bertail, Anne-Laure Fougères,…

## Box plot, Fisher’s style

June 5, 2014

In a recent issue of Significance, I discovered an interesting – and amusing – figure, about some box & beard plot, in Dr Fisher’s casebook: Beard the statistician in his den. In French, the box plot (introduced by John Tukey, not George Box, as discussed in a previous post) is popular under the name boîte à moustaches (box with a mustache, for a simple translation). > set.seed(2) > x=rnorm(500) > boxplot(x,horizontal=TRUE,axes=FALSE)…

## The Unavoidable Instability of Brand Image

June 5, 2014

"It may be that most consumers forget the attribute-based reasons why they chose or rejected the many brands they have considered and instead retain just a summary attitude sufficient to guide choice the next time." This is how Dolnicar and Rossiter con...

## Another Benefit of Publicly Version-Controlled Research

June 4, 2014

I've been thinking quite a bit lately about why and how political scientists should publicly version control their research projects. By research projects, I mean data, manuscript, and code. And by publicly version control, I mean use Git to version-control and post a public GitHub repository, from the beginning of the project, so that other […]

## Data!

June 4, 2014

The animated gif below (>>link) counts data transferred every second over the internet (sources?). Another (static) infographic by Cisco estimates that …

## Determining chemical concentration with standard addition: An application of linear regression in JMP – A Guest Blog Post for the JMP Blog

I am very excited to announce that I have been invited by JMP to be a guest blogger for its official blog!  My thanks to Arati Mejdal, Global Social Media Manager for the JMP Division of SAS, for welcoming me into the JMP blogging community with so much support and encouragement, and I am pleased to […]

## All the Assumptions That Are My Life

June 4, 2014

Statisticians take tours in other people’s data. All methods of statistical inference rest on statistical models. Experiments typically have problems with compliance, measurement error, generalizability to the real world, and representativeness of the sample. Surveys typically have problems of undercoverage, nonresponse, and measurement error. Real surveys are done to learn about the general population. But […] The post All the Assumptions That Are My Life appeared first on Statistical Modeling,…

## Yet another power-law tail, explained

June 4, 2014

At the next Boston Python user group meeting, participants will present their solutions to a series of puzzles, posted here. One of the puzzles lends itself to a solution that uses Python iterators, which is something I was planning to get more f...