Midterm Exam: eight questions about thirteen lines of code. Introduction to Statistical Computing

Arthur Benjamin says we should teach statistics before calculus. He points out that most of what we do in high school math is preparing us for calculus. He makes the point that while physicists, engineers and economists need calculus, in the …

The only thing is, I’m not sure who’s David here and who is Goliath. From the standpoint of book sales, Gladwell is Goliath for sure. On the other hand, Gladwell’s credibility has been weakened over the years by fights with bigshots such as Steven Pinker. Maybe the best analogy is a boxing match where Gladwell […] The post Gladwell and Chabris, David and Goliath, and science writing as stone soup appeared…

The area under the receiver operating characteristic (ROC) curve, or AUC, is a commonly used metric of model performance in machine learning and many other binary classification/prediction problems. The idea is to generate a threshold-independent measure of how well a model is able to distinguish between two possible outcomes. Threshold-independent here just means that for any model […]
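As a minimal sketch of the threshold-independent idea, the AUC can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `auc` and the toy scores below are illustrative assumptions, not taken from the post.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Hypothetical helper: returns the probability that a randomly chosen
    positive case (label 1) scores higher than a randomly chosen
    negative case (label 0), counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): no classification threshold is chosen.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
result = auc(scores, labels)
print(result)
```

No cutoff on the score appears anywhere in the computation, which is the sense in which the measure is threshold independent. The double loop is quadratic in the number of cases; a rank-based version would be preferable for large data sets.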

Editor's note: This post is contributed by Debashis Ghosh. Debashis is the chair of the Biostatistical Methods and Research Design (BMRD) study section at the National Institutes of Health (NIH). BMRD's focus is statistical methodology. I write today to discuss effects of …

Christopher Chabris reviewed the new book by Malcolm Gladwell: One thing “David and Goliath” shows is that Mr. Gladwell has not changed his own strategy, despite serious criticism of his prior work. What he presents are mostly just intriguing possibilities and musings about human behavior, but what his publisher sells them as, and what his […] The post Chris Chabris is irritated by Malcolm Gladwell appeared first on Statistical Modeling, Causal…

There’s an update (with overview) on the infamous Harkonen case in Nature with the dubious title “Uncertainty on Trial”, first discussed in my (11/13/12) post “Bad statistics: Crime or Free speech”, and continued here. The new Nature article quotes from Steven Goodman: “You don’t want to have on the books a conviction for a practice that many […]

This is a long and technical post on an important topic: the use of multilevel regression and poststratification (MRP) to estimate state-level public opinion. MRP as a research method, and state-level opinion (or, more generally, attitudes in demographic and geographic subpopulations) as a subject, have both become increasingly important in political science, and soon, I expect, […] The post Mister P: What’s its secret sauce? appeared first on Statistical Modeling, Causal Inference,…

Editor’s Note: This post, written by Roger Peng, is part of a two-part series on Scientist-Statistician interactions. The first post was written by Elizabeth C. Matsui, an Associate Professor in the Division of Allergy and Immunology at the Johns Hopkins …

Robert Goodell Brown was the father of exponential smoothing. He died last week at the age of 90. While I never met him, I was indebted to him for exponential smoothing and his practical and insightful books. Today I received this email from King Harrison III advising of his death. Twenty years ago I attended the ISF 93 conference in Pittsburgh, which honored Bob Brown on his 70th birthday, and…

This week, we finish Poisson regression (for now) before presenting the theory of GLMs. The slides are online. We will need them to go further on models with overdispersion, to model the frequency of sin...

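Let us continue our discussion on smoothing techniques in regression. Assume that $\mathbb{E}(Y \mid X = x) = h(x)$, where $h$ is some unknown function, assumed to be sufficiently smooth. For instance, assume that $h$ is continuous, that $h'$ exists and is continuous, that $h''$ exists and is also continuous, etc. If $h$ is smooth enough, Taylor’s expansion can be used. Hence, for $x$ near some point $\alpha$, $h(x) = \sum_{k=0}^{d} \frac{h^{(k)}(\alpha)}{k!}(x-\alpha)^k + \frac{1}{d!}\int_{\alpha}^{x} (x-t)^d\, h^{(d+1)}(t)\, dt$, which can also be written as $h(x) = \sum_{k=0}^{d} a_k (x-\alpha)^k + \frac{1}{d!}\int_{\alpha}^{x} (x-t)^d\, h^{(d+1)}(t)\, dt$ for some $a_k$’s. The first part is simply a polynomial. The second…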

In a standard linear model, we assume that $\mathbb{E}(Y \mid X = x) = \beta_0 + \beta_1 x$. Alternatives can be considered when the linear assumption is too strong. Polynomial regression. A natural extension might be to assume some polynomial function, $\mathbb{E}(Y \mid X = x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$. Again, in the standard linear model approach (with a conditional normal distribution, using the GLM terminology), parameters can be obtained using least squares, where a regression of $Y$ on $(1, X, X^2, \ldots, X^k)$ is considered. Even if this polynomial model is not the…
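The least-squares fit of a polynomial model can be sketched as an ordinary regression on the powers of the covariate. The code below is a minimal illustration in plain Python that solves the normal equations by Gaussian elimination; the function name `fit_polynomial` and the noise-free toy data are assumptions made for this example, and in practice a numerically more stable routine (QR-based, or orthogonal polynomials) would usually be preferred.

```python
def fit_polynomial(x, y, degree):
    """Least-squares regression of y on (1, x, x^2, ..., x^degree).

    Hypothetical helper: builds the normal equations (X'X) beta = X'y
    and solves them by Gaussian elimination with partial pivoting.
    Returns [beta_0, beta_1, ..., beta_degree].
    """
    p = degree + 1
    # Entry (j, k) of X'X is sum_i x_i^(j+k); entry j of X'y is sum_i y_i x_i^j.
    A = [[sum(xi ** (j + k) for xi in x) for k in range(p)] for j in range(p)]
    b = [sum(yi * xi ** j for xi, yi in zip(x, y)) for j in range(p)]

    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    beta = [0.0] * p
    for r in reversed(range(p)):
        tail = sum(A[r][c] * beta[c] for c in range(r + 1, p))
        beta[r] = (b[r] - tail) / A[r][r]
    return beta


# Toy data generated exactly from 1 + 2x + 3x^2 (made up for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]
beta = fit_polynomial(x, y, degree=2)
print(beta)
```

Because the model is linear in the coefficients, nothing beyond ordinary least squares is needed; only the design matrix changes when the degree is raised.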