## Rasmus Bååth’s Bayesian first aid

January 23, 2014

Besides having coded a pretty cool MCMC app in Javascript, this guy Rasmus Bååth has started the Bayesian first aid project. The idea is that if there’s an R function called blabla.test performing test “blabla”, there should be a function bayes.blabla.test performing a similar test in a Bayesian framework, and showing the output in a […]

## Sampling with replacement: Now easier than ever in the SAS/IML language

January 23, 2014

With each release of SAS/IML software, the language provides simple ways to carry out tasks that previously required more effort. In 2010 I blogged about a SAS/IML module that appeared in my book Statistical Programming with SAS/IML Software, which was written using SAS/IML 9.2. The blog post showed [...]

## Peer Review, Part 4: Good Reasons for Bad Papers

January 23, 2014

As a reviewer, you might sometimes ask yourself why people write so many bad papers. And why they bother submitting them. I certainly do. But where do they come from? Who submits bad papers? And why? It may come as a surprise, but there are good reasons to submit bad papers for review. To Get […]

## Slides from my online forecasting course

January 23, 2014

Last year I taught an online course on forecasting using R. The slides and exercise sheets are now available at www.otexts.org/fpp/resources/

## What’s Warren Buffett’s \$1 Billion Basketball Bet Worth?

January 23, 2014

A friend of mine just alerted me to a story on NPR describing a prize on offer from Warren Buffett and Quicken Loans. The prize is a billion dollars (1B USD) for correctly predicting all 63 games in the men’s Division I college basketball tournament this March. The Facebook page announcing the contest puts the odds at 1:9,223,372,036,854,775,808, […]
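
As a quick check on that figure, assume (hypothetically) that each of the 63 games is an independent fair coin flip. The chance of a perfect bracket would then be

$$\left(\tfrac{1}{2}\right)^{63}=\frac{1}{2^{63}}=\frac{1}{9{,}223{,}372{,}036{,}854{,}775{,}808},$$

which matches the quoted odds. A forecaster who picks each game better than a coin flip would face shorter odds than this.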

## Likelihood Based Methods, for Extremes

January 23, 2014

This week, in the MAT8595 course, we will start the section on inference for extreme values. To start with something simple, we will use maximum likelihood techniques on a Generalized Pareto Distribution (on Monday we saw the Pickands-Balkema-de Haan theorem). Maximum Likelihood Estimation. In the context of parametric models, the standard technique is to consider the maximum of the likelihood (or the log-likelihood). Let $\boldsymbol{\theta}$ denote the parameter (with ). Given some – standard…
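
As a sketch of what the log-likelihood looks like for the Generalized Pareto Distribution, assume the usual parametrization with shape $\xi\neq 0$ and scale $\sigma>0$ (this notation is an assumption, not taken from the post), so that $\boldsymbol{\theta}=(\xi,\sigma)$ and the density is $f(y)=\sigma^{-1}(1+\xi y/\sigma)^{-1/\xi-1}$ for $y\geq 0$ with $1+\xi y/\sigma>0$. For an i.i.d. sample of exceedances $y_1,\dots,y_n$, the log-likelihood would then be

$$\log\mathcal{L}(\xi,\sigma)=-n\log\sigma-\left(1+\frac{1}{\xi}\right)\sum_{i=1}^n\log\left(1+\frac{\xi y_i}{\sigma}\right),$$

to be maximized numerically over $(\xi,\sigma)$ subject to $1+\xi y_i/\sigma>0$ for all $i$.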

## The performance of dplyr blows plyr out of the water

January 22, 2014

Together with many other packages written by Hadley Wickham, plyr is a package that I use a lot for data processing. The syntax is clean, and it works great for breaking down larger data.frames into smaller summaries. The greatest disadvantage…

## Coursera Specializations: Data Science, Systems Biology, Python Programming

January 22, 2014

I first mentioned Coursera about a year ago, when I hired a new analyst in my core. This new hire came in as a very competent Python programmer with a molecular biology and microbial ecology background, but with very little experience in statistics. I ...

## Example 2014.2: Block randomization

January 22, 2014

This week I had to block-randomize some units. This is ordinarily the sort of thing I would do in SAS, just because it would be faster for me. But I had already started work on the project in R, using knitr/LaTeX to make a PDF, so it made sense to conti...
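
For readers unfamiliar with the idea, here is a minimal sketch of permuted-block randomization in base R. It is not the code from the post: the function name `block_randomize`, its arguments, and the default block size are all hypothetical choices made for illustration.

```r
# Hypothetical helper (not from the original post): assign units to arms
# in randomly permuted blocks so that every full block is balanced.
block_randomize <- function(n_units, arms = c("A", "B"), block_size = 4) {
  # Each block must hold every arm the same number of times.
  stopifnot(block_size %% length(arms) == 0)
  n_blocks <- ceiling(n_units / block_size)
  # Build one balanced block at a time and shuffle the order within it.
  assignment <- unlist(lapply(seq_len(n_blocks), function(b) {
    sample(rep(arms, each = block_size / length(arms)))
  }))
  block_id <- rep(seq_len(n_blocks), each = block_size)
  data.frame(unit  = seq_len(n_units),
             block = block_id[seq_len(n_units)],
             arm   = assignment[seq_len(n_units)])
}

set.seed(2014)                      # fixed seed so the allocation is reproducible
alloc <- block_randomize(12)        # 12 units, 2 arms, blocks of 4
counts <- table(alloc$block, alloc$arm)
print(counts)
```

With 12 units and blocks of 4, every block contains each arm exactly twice. If the number of units is not a multiple of the block size, the last block is cut short and may be unbalanced.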

## Spell-checking example demonstrates key aspects of Bayesian data analysis

January 22, 2014

One of the new examples for the third edition of Bayesian Data Analysis is a spell-checking story. Here it is (just start at 2/3 down on the first page, with “Spelling correction”). I like this example—it demonstrates the Bayesian algebra, also gives a sense of the way that probability models (both “likelihood” and “prior”) are […]

## Numbersense on MailChimp’s Gmail study 1/2

January 22, 2014

MailChimp, a major vendor that companies use to send marketing emails to customers, published an analysis of the effect of Gmail marketing tabs (link). How should you read such a study? I'd begin by clarifying what problem the analyst is solving. In May, Google rolled out to all Gmail users a tabbed interface, in which the inbox is split into three parts: the regular inbox, a "promotional" email box, and…

## Peer Review, Part 3: A Taxonomy of Bad Papers

January 22, 2014

Reviewing is great when you get a good paper where you can make some suggestions to make it even better, and everybody’s happy. Bad papers are much less fun, but they are also much more common. Here are some examples I’ve seen and that I keep seeing. The completely insane. I once got a paper […]

## Phil6334: “Philosophy of Statistical Inference and Modeling” New Course: Spring 2014: Mayo and Spanos: (Virginia Tech) UPDATE: JAN 21

January 22, 2014

FURTHER UPDATED: New course for Spring 2014: Thurs 3:30-6:15 (Randolph 209). First installment: 6334 syllabus (first). Phil 6334: Philosophy of Statistical Inference and Modeling, D. Mayo and A. Spanos. Contact: error@vt.edu. This new course, to be jointly taught by Professors D. Mayo (Philosophy) and A. Spanos (Economics), will provide an in-depth introduction to graduate […]

## Looking for a new post-doc

January 22, 2014

We are looking for a new post-doctoral research fellow to work on the project “Macroeconomic Forecasting in a Big Data World”. Details are given at the link below: jobs.monash.edu.au/jobDetails.asp?sJobIDs=519824 This is a two-year positio...

## Six Word Peer Review

January 21, 2014

A "Six Word Peer Review" competition has been running on Twitter (#sixwordpeereview). Here are a few gems that might be a little too close to home for comfort:

- You didn't cite my paper: Reject!
- Taking my time. Love, your competitor.
- Bayes would turn in his...

## Statistical modeling and computation [book review]

January 21, 2014

Dirk Kroese (from UQ, Brisbane) and Joshua Chan (from ANU, Canberra) just published a book entitled Statistical Modeling and Computation, distributed by Springer-Verlag (I cannot tell which series it is part of from the cover or frontpages…) The book is intended mostly for an undergrad audience (or for graduate students with no probability or statistics […]

## Everything I need to know about Bayesian statistics, I learned in eight schools.

January 21, 2014

This post is by Phil. I’m aware that there are some people who use a Bayesian approach largely because it allows them to provide a highly informative prior distribution based on subjective judgment, but that is not the appeal of Bayesian methods for a lot of us practitioners. It’s disappointing and surprising, twenty years after my initial experiences, […]

## Causal Autoregressive Time Series

January 21, 2014

In the MAT8181 graduate course on Time Series, we will discuss (almost) only causal models. For instance, with $AR(1)$, $X_t=\phi X_{t-1}+\varepsilon_t$ with some white noise $(\varepsilon_t)$, those models are obtained when $|\phi|<1$. In that case, we’ve seen that $(\varepsilon_t)$ was actually the innovation process, and we can write $X_t=\sum_{h\geq 0}\phi^h\varepsilon_{t-h}$, which is actually a mean-square convergent series (using simple Analysis arguments on series). From that expression, we can easily see that $(X_t)$ is stationary, since (which does…
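
The series representation of a causal $AR(1)$ process can be recovered by iterating the recursion. The following is a sketch under assumed notation (the symbols $\phi$, $\varepsilon_t$ and $\sigma^2$ are assumptions): a stationary solution of $X_t=\phi X_{t-1}+\varepsilon_t$, with $(\varepsilon_t)$ a zero-mean white noise of variance $\sigma^2$ and $|\phi|<1$. Substituting the recursion into itself $k$ times gives

$$X_t=\phi^k X_{t-k}+\sum_{h=0}^{k-1}\phi^h\varepsilon_{t-h}\ \xrightarrow[k\to\infty]{L^2}\ \sum_{h=0}^{\infty}\phi^h\varepsilon_{t-h},$$

since $\phi^k X_{t-k}$ vanishes in mean square. From this representation,

$$\mathbb{E}(X_t)=0,\qquad \operatorname{Var}(X_t)=\sigma^2\sum_{h=0}^{\infty}\phi^{2h}=\frac{\sigma^2}{1-\phi^2},$$

and neither quantity depends on $t$.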

## Intuitively, it is clear that it is obvious that any idiot can see

January 21, 2014

In technical writing, three terms/phrases not to be used:

- Intuitively, ...
- It is clear that ...
- It is obvious that ...

Just as well you could write: Any idiot can plainly see ... These phrases may be true for you, the writer. However, the reader won't h...

## The Johns Hopkins Data Science Specialization on Coursera

January 21, 2014

We are very proud to announce the Johns Hopkins Data Science Specialization on Coursera. You can see the official announcement from the Coursera folks here. This is the main reason Simply Statistics has been a little quiet lately. The …

## Visualizing Autoregressive Time Series

January 21, 2014

In the MAT8181 graduate course on Time Series, we started discussing autoregressive models. Just to illustrate, here is some code to plot a causal $AR(1)$ process,

```r
# n (the series length) is not defined in this excerpt; it is assumed to be
# set globally, with n >= 6000 given the plot call below.
graphar1=function(phi){
  nf <- layout(matrix(c(1,1,1,1,2,3,4,5), 2, 4, byrow=TRUE), respect=TRUE)
  # simulate X[t] = phi * X[t-1] + e[t] with Gaussian noise
  e=rnorm(n)
  X=rep(0,n)
  for(t in 2:n) X[t]=phi*X[t-1]+e[t]
  # top panel: the path, its mean, and a band of +/- 2 standard deviations
  plot(X[1:6000],type="l",ylab="")
  abline(h=mean(X),lwd=2,col="red")
  abline(h=mean(X)+2*sd(X),lty=2,col="red")
  abline(h=mean(X)-2*sd(X),lty=2,col="red")
  # unit circle, with the point 1/phi marked in red
  u=seq(-1,1,by=.001)
  plot(0:1,0:1,col="white",xlab="",ylab="",axes=FALSE,ylim=c(-2,2),xlim=c(-2.5,2.5))
  polygon(c(u,rev(u)),c(sqrt(1-u^2),rev(-sqrt(1-u^2))),col="light yellow")
  abline(v=0,col="grey")
  abline(h=0,col="grey")
  points(1/phi,0,pch=19,col="red",cex=1.3)
  plot(0:1,0:1,col="white",xlab="",ylab="",axes=FALSE,ylim=c(-.2,.2),xlim=c(-1,1))
  axis(1)
  …
```

## The Commissar for Traffic presents the latest Five-Year Plan

January 21, 2014

What do Paul Samuelson and the U.S. Department of Transportation have in common? Phil Price points us to this news article by Clark Williams-Derry: As the State Smart Transportation Initiative at the University of Wisconsin points out, the US Department of Transportation has been making the virtually identical vehicle travel forecasts for well over a […]

## Where are the millionaires? Where’s the news?

January 21, 2014

The financial media, ranging from Wall Street Journal to Zero Hedge, blogged about the geographical distribution of U.S. millionaires. The stories came with a map, and in the case of the latter, two data tables ranked by ascending and descending...