## Midterm Exam (Introduction to Statistical Computing)

October 11, 2013

Midterm Exam: eight questions about thirteen lines of code. Introduction to Statistical Computing

## Self-syndication

October 11, 2013

This is a piece I've written for The SWITCH project website. SWITCH is a research project addressing issues related to the social market economy in Europe. The topics addressed in the project range from macro financial sustainability conditions to the ...

## Why do we still teach a semester of trigonometry? How about engineering instead?

October 11, 2013

Arthur Benjamin says we should teach statistics before calculus. He points out that most of what we do in high school math is preparing us for calculus. He makes the point that while physicists, engineers and economists need calculus, in the …

## Gladwell and Chabris, David and Goliath, and science writing as stone soup

October 11, 2013

The only thing is, I’m not sure who’s David here and who is Goliath. From the standpoint of book sales, Gladwell is Goliath for sure. On the other hand, Gladwell’s credibility has been weakened over the years by fights with bigshots such as Steven Pinker. Maybe the best analogy is a boxing match where Gladwell […]

## Two applications of the "runs test"

October 11, 2013

In my last blog post I described how to implement a "runs test" in the SAS/IML language. The runs test determines whether a sequence of two values (for example, heads and tails) is likely to have been generated by random chance. This article describes two applications of the runs test. [...]

## Random Sequence of Heads and Tails: For R Users

October 10, 2013

Rick Wicklin on the SAS blog made a post today on how to tell if a sequence of coin flips was random. I figured it was only fair to port the SAS/IML code over to R. As in Rick Wicklin's example, this is the Wald-Wolfowitz test for randomness. I tried to […]
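For readers who want to see the idea in a third language, here is a minimal Python sketch of the Wald-Wolfowitz runs test. It is not the SAS/IML or R code from either post; the function name `runs_test` and the use of the large-sample normal approximation are assumptions made for illustration.

```python
import math


def runs_test(seq):
    """Wald-Wolfowitz runs test for a two-valued sequence.

    Returns (number of runs, z statistic, two-sided p-value) using the
    large-sample normal approximation. Illustrative sketch only.
    """
    values = sorted(set(seq))
    if len(values) != 2:
        raise ValueError("sequence must contain exactly two distinct values")

    n1 = sum(1 for s in seq if s == values[0])  # count of the first value
    n2 = len(seq) - n1                          # count of the second value
    n = n1 + n2

    # A new run starts whenever two neighbouring elements differ.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

    # Mean and variance of the number of runs under randomness.
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))

    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return runs, z, p_value
```

A sequence such as `"HHHHHTTTTT"` has too few runs (negative `z`), while a strictly alternating sequence such as `"HTHTHTHTHT"` has too many (positive `z`); both are evidence against randomness. For short sequences the normal approximation is rough, and an exact test would be preferable.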

## Calculating AUC the hard way

October 10, 2013

The area under the receiver operating characteristic (ROC) curve, or AUC, is a commonly used metric of model performance in machine learning and many other binary classification/prediction problems. The idea is to generate a threshold-independent measure of how well a model is able to distinguish between two possible outcomes. Threshold-independent here just means that for any model […]
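One way to make the threshold-independent idea concrete is the pairwise interpretation of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it by brute force. The teaser is truncated, so this may not be the calculation the post itself walks through; the function name `auc` and the 0/1 label coding are assumptions.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = positive class, assumed coding).
    scores: model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop costs time proportional to the number of positive-negative pairs, which is fine for a small illustration; a rank-based formula gives the same value more efficiently on large data sets.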

## De Novo Transcriptome Assembly with Trinity: Protocol and Videos

October 10, 2013

One of the clearest advantages RNA-seq has over array-based technology for studying gene expression is not needing a reference genome or a pre-existing oligo array. De novo transcriptome assembly allows you to study non-model organisms, cancer cells, o...

## Cancelled NIH study sections: a subtle, yet disastrous, effect of the government shutdown

October 10, 2013

Editor's note: This post is contributed by Debashis Ghosh. Debashis is the chair of the Biostatistical Methods and Research Design (BMRD) study sections at the National Institutes of Health (NIH). BMRD's focus is statistical methodology. I write today to discuss effects of …

## Chris Chabris is irritated by Malcolm Gladwell

October 10, 2013

Christopher Chabris reviewed the new book by Malcolm Gladwell: One thing “David and Goliath” shows is that Mr. Gladwell has not changed his own strategy, despite serious criticism of his prior work. What he presents are mostly just intriguing possibilities and musings about human behavior, but what his publisher sells them as, and what his […]

## That’s Smooth

October 10, 2013

I had someone ask me the other day how to take a scatterplot and draw something other than a straight line through the graph using Excel.  Yes, it can be done in Excel and it’s really quite simple, but there are some limitations when using the stock Excel dialog screens. So it is probably in […]

## The Seven Year Itch

October 10, 2013

Eagereyes.org turned seven years old last week, on October 1st. Seven years is a long time on the web. In dog years, the site is almost fifty years old! Has it lost its edge? Have I gone soft? Where is the bite? Where is the fight? The Establishment This site has become a part of […]

## Bad statistics: crime or free speech (II)? Harkonen update: Phil Stat / Law / Stock

October 10, 2013

There’s an update (with overview) on the infamous Harkonen case in Nature with the dubious title “Uncertainty on Trial”, first discussed in my (11/13/12) post “Bad statistics: Crime or Free speech”, and continued here. The new Nature article quotes from Steven Goodman: “You don’t want to have on the books a conviction for a practice that many […]

## Diagrams for hierarchical models – we need your opinion

October 9, 2013

When trying to understand a hierarchical model, I find it helpful to make a diagram of the dependencies between variables. But I have found the traditional directed acyclic graphs (DAGs) to be incomplete at best and downright confusing at worst. Theref...

## Mister P: What’s its secret sauce?

October 9, 2013

This is a long and technical post on an important topic: the use of multilevel regression and poststratification (MRP) to estimate state-level public opinion. MRP as a research method, and state-level opinion (or, more generally, attitudes in demographic and geographic subpopulations) as a subject, have both become increasingly important in political science, and soon, I expect, […]

## The Care and Feeding of Your Scientist Collaborator

October 9, 2013

Editor’s Note: This post, written by Roger Peng, is part of a two-part series on Scientist-Statistician interactions. The first post was written by Elizabeth C. Matsui, an Associate Professor in the Division of Allergy and Immunology at the Johns Hopkins …

## Robert G Brown (1923-2013)

October 9, 2013

Robert Goodell Brown was the father of exponential smoothing. He died last week at the age of 90. While I never met him, I was indebted to him for exponential smoothing and his practical and insightful books. Today I received this email from King Harrison III advising of his death. Twenty years ago I attended the ISF 93 conference in Pittsburgh, which honored Bob Brown on his 70th birthday, and…

## How to tell whether a sequence of heads and tails is random

October 9, 2013

While walking in the woods, a statistician named Goldilocks wanders into a cottage and discovers three bears. The bears, being hungry, threaten to eat the young lady, but Goldilocks begs them to give her a chance to win her freedom. The bears agree. While Mama Bear and Papa Bear block [...]

## The (third) runway bride

October 9, 2013

I think I should disclaim the conflict of interest in this one (since Marta is one of the authors of the paper), but it was really, really cool to see her study on the impact on health of noise pollution close to airports in the newspapers today (for e...

## Introduction to GLMs

October 9, 2013

This week, we finish Poisson regression (for now) before presenting the theory of GLMs. The slides are online. We will need this to go further with overdispersed models, to model the frequency of ...

## Some heuristics about spline smoothing

October 9, 2013

Let us continue our discussion on smoothing techniques in regression. Assume that $\mathbb{E}(Y\vert X=x)=h(x)$, where $h$ is some unknown function, assumed to be sufficiently smooth. For instance, assume that $h$ is continuous, that $h'$ exists and is continuous, that $h''$ exists and is also continuous, etc. If $h$ is smooth enough, Taylor’s expansion can be used to split $h$ into two parts. The first part is simply a polynomial. The second…

## Some heuristics about local regression and kernel smoothing

October 9, 2013

In a standard linear model, we assume that $\mathbb{E}(Y\vert X=x)=\beta_0+\beta_1 x$. Alternatives can be considered when the linear assumption is too strong. Polynomial regression: a natural extension might be to assume some polynomial function, $\mathbb{E}(Y\vert X=x)=\beta_0+\beta_1 x+\beta_2 x^2+\cdots+\beta_k x^k$. Again, in the standard linear model approach (with a conditional normal distribution using the GLM terminology), parameters can be obtained using least squares, where a regression of $Y$ on $(1, X, X^2, \ldots, X^k)$ is considered. Even if this polynomial model is not the…
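The least-squares step for polynomial regression mentioned above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not code from the post: the function name `polyfit_least_squares` is hypothetical, and the fit solves the normal equations directly, which is adequate for low degrees but numerically fragile for high ones.

```python
def polyfit_least_squares(x, y, degree):
    """Least-squares coefficients [beta_0, ..., beta_degree] of a polynomial fit.

    Regresses y on (1, x, x^2, ..., x^degree) by solving the normal
    equations (X'X) beta = X'y with Gauss-Jordan elimination.
    """
    k = degree + 1
    # Each design-matrix row is (1, x_i, x_i^2, ..., x_i^degree).
    design = [[xi ** j for j in range(k)] for xi in x]

    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(row[a] * row[b] for row in design) for b in range(k)]
        + [sum(row[a] * yi for row, yi in zip(design, y))]
        for a in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if aug[col][col] == 0:
            raise ValueError("singular system: too few distinct x values")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]

    # The matrix is now diagonal, so each coefficient is a simple ratio.
    return [aug[i][k] / aug[i][i] for i in range(k)]
```

With `degree=1` this reduces to the standard linear model; in practice a QR-based solver or an orthogonal polynomial basis is the more stable choice.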

## Happy birthday

October 8, 2013

Sylvia Richardson (who's now the head of the MRC Biostatistics Unit in Cambridge, and part of our RDD project) asks me to advertise the MRC Biostatistics Unit's Centenary Conference, which will be held in Queens' College Cambridge on March 26t...