November 17, 2013
In response to some big new push for testing schoolchildren, Mark Palko writes: The announcement of a new curriculum is invariably followed by a hearty round of self-congratulations and talk of "keeping standards high" as if adding a slide to a PowerPoint automatically made students better informed. It doesn't work that way. […]

## Dutch Rainwater Composition 1992-2005.

November 17, 2013
After reading Blog About Stats' Open Data Index blog post, I decided to browse the Open Data Index a bit. Choosing Netherlands and following Emission of Pollutants, I ended up on a page from the National Institute for Public Health.

## What should statistics do about massive open online courses?

November 17, 2013
Marie Davidian, the President of the American Statistical Association, writes about the JHU Biostatistics effort to deliver massive open online courses. She interviewed Jeff, Brian Caffo, and me and summarized our thoughts. All acknowledge that the future is unknown.

## Stein’s Method

November 16, 2013
I have mentioned Stein's method in passing a few times on this blog. Today I want to talk about Stein's method in a bit of detail. 1. What Is Stein's Method? Stein's method, due to Charles Stein, is actually quite old, going back to 1972. But there has been a great deal of interest in […]

## S. Stanley Young: More Trouble with ‘Trouble in the Lab’ (Guest post)

November 16, 2013
Stanley Young's guest post arose in connection with Kepler's Nov. 13, and my November 9 post, and associated comments. S. Stanley Young, PhD Assistant Director for Bioinformatics National Institute of Statistical Sciences Research Triangle Park, NC Much is made by some of the experimental biologists that their art is oh so sophisticated that mere mortals do not have […]

## Objects of the class “Objects of the class”

November 16, 2013
Objects of the class "Foghorn Leghorn": parodies that are more famous than the original. ("It would be as if everybody were familiar with Duchamp's Mona-Lisa-with-a-moustache while never having heard of Leonardo's version.") Objects of the class "Whoopi Goldberg": actors who are undeniably talented but are almost always in bad movies, or at least movies that […]

## BCEs0 version 1.1 on CRAN

November 16, 2013
As I was responding to the points raised by two referees and the editor on my paper on cost-effectiveness with structural zeros (the preliminary version was here, while I have presented it in a few talks and discussed it here, here, here and here...

## A Shiny App for Experimenting with Dynamic Programming

November 16, 2013
This post demonstrates the dynamics involved in a susceptible, infected, and recovering (SIR) model of dynamic programming. See the previous post for the model. The Shiny UI and server code can be found on GitHub. As a dynamic infection model, I find it p...

## Le Monde puzzle [#839]

November 15, 2013
A number theory Le Monde mathematical puzzle whose R coding is not really worth it (and which rings a bell of a similar puzzle in the past, puzzle I cannot trace…): The set Ξ is made of pairs of integers (x,y) such that (i) both x and y are written as a sum of two squared integers (i.e., are […]

## Laplace didn’t have a time machine

November 15, 2013
Dr. Mayo responded to criticism of the Severity Principle here. The main points are (A) if SEV differs from Bayes it doesn't mean SEV's bad (B) you shouldn't compare SEV and Bayes because they do different things (C) A prior can alway...

## What’s the future of inference?

November 15, 2013
Rob Gould reports on what appears to have been an interesting panel discussion on the future of statistics hosted by the UCLA Statistics Department. The panelists were Songchun Zhu (UCLA Statistics), Susan Paddock (RAND Corp.), and Jan de Leeuw (UCLA Statistics).

## “Are all significant p-values created equal?”

November 15, 2013
The answer is no, as explained in this classic article by Warren Browner and Thomas Newman from 1987. If I were to rewrite this article today, I would frame things slightly differently—referring to Type S and Type M errors rather than speaking of "the probability that the research hypothesis is true"—but overall they make good […]

## Evaluating Quandl Data Quality

November 15, 2013
Quandl has indexed millions of time-series datasets from over 400 sources. All of Quandl's datasets are open and free. This is great news but before performing any backtest using Quandl data, I want to compare it with a trusted source: Bloomberg for the purpose of this post. I will focus only on daily Futures data here […]

## Python: Venn Diagram

November 15, 2013
A Venn diagram is very useful for visualizing operations between events/sets. So in this post, we will learn how to draw one in Python. First, we need to install the module matplotlib-venn. Open the terminal or command prompt, and run the followin...

## BDA class 4 G+ hangout on air is on air

November 15, 2013
Here. And here's the backstory. P.S. The damn mike was muted most of the time. Something always goes wrong!

## A Little Closer to the Stars (***)

November 15, 2013
There has been a lot of buzz recently around Valen Johnson's paper published in PNAS. The article has been picked up almost everywhere (http://nature.com/news/, http://blogs.scientificamerican.com/absolutely-maybe/, http://arstechnica.com/science/, and http://passeurdesciences.blog.lemonde.fr/, which relayed the news in French). Several people have forwarded me links, asking for my opinion, by email or via Twitter. I am not going to revisit the study (for now), nor the misreadings of it, but rather the buzz…

## How Countries Fare, 2010

November 15, 2013
Originally posted on CoolStatsBlog: The Current Account Balance is a measure of a country's "profitability". It is the sum of profits (losses) made from trading with other countries, profits (losses) made from investments in other countries, and cash transfers, such as remittances from expatriates. World: Current Account Balance, 2010. As the infographic shows, there isn't…

## Daily/monthly/yearly tallies for your data

November 15, 2013
Say you have a dataset, where each row has a date or time, and something is recorded for that date and time. If each row is a unique date – great! If not, you may have rows with the same date, and you have to combine records for the same date to get a daily tally. […]
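The combining step described above can be sketched with the standard library alone. This is a minimal illustration, not the post's own code: the `tally` helper, the `(timestamp, value)` record layout, and the sample rows are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def tally(records, period="day"):
    """Sum values per day, month, or year.

    `records` is an iterable of (ISO timestamp string, number) pairs;
    this layout is an assumption made for illustration.
    """
    formats = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}
    key_format = formats[period]
    totals = defaultdict(float)
    for stamp, value in records:
        # Truncate the timestamp to the requested period and accumulate.
        key = datetime.fromisoformat(stamp).strftime(key_format)
        totals[key] += value
    return dict(sorted(totals.items()))


# Hypothetical rows: the first two share a date and must be combined.
records = [
    ("2013-11-14 09:30:00", 2),
    ("2013-11-14 17:05:00", 3),
    ("2013-11-15 08:00:00", 4),
    ("2013-12-01 12:00:00", 1),
]

daily = tally(records, "day")
monthly = tally(records, "month")
```

Grouping on a formatted date string keeps the same function usable for daily, monthly, and yearly tallies; a data-frame library would do the same job with a group-by on the truncated date.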

## BDA class G+ hangout another try

November 14, 2013
Tomorrow (Thurs) 8h30 (Paris time) I will be teaching my Bayesian Data Analysis class (class4a.pdf and class4b.pdf, you can follow the slides here). We had problems earlier with the regular G+ hangout, so this time we're trying the G+ On-Air Hangout which I think should work better. I'll post a blog entry tomorrow with a […]

## The Leek group guide to sharing data with a data analyst to speed collaboration

November 14, 2013
My group collaborates with many different scientists and the number one determinant of how fast we can turn around results is the status of the data we receive from our collaborators. If the data are well organized and all the …

## Calibration of p-value under variable selection: an example

November 14, 2013
Very often people report p-values for linear regression estimates after performing a variable selection step. Here is a simple simulation that shows that such a procedure might lead to miscalibrated tests. Consider a simple data generating pro...
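The post's own data generating process is cut off in this excerpt, so the following is only a sketch under assumed settings, not the author's simulation. It assumes a response that is independent of all `k` candidate predictors, selects the predictor most correlated with the response, and tests its slope in a simple regression. The sample size, number of predictors, and the normal approximation to the t statistic are all choices made for this illustration.

```python
import math
import random
from statistics import NormalDist


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value(r, n):
    """Two-sided p-value for a zero slope in simple regression.

    Uses the t statistic implied by the correlation, with a normal
    approximation to its distribution (an assumption that is reasonable
    for the sample size used here).
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 2 * (1 - NormalDist().cdf(abs(t)))


def rejection_rates(n_sims=300, n=100, k=10, alpha=0.05, seed=1):
    """Rejection rates under the null, without and with selection."""
    rng = random.Random(seed)
    fixed = selected = 0
    for _ in range(n_sims):
        # The response is pure noise: no predictor has a real effect.
        y = [rng.gauss(0, 1) for _ in range(n)]
        rs = [
            correlation([rng.gauss(0, 1) for _ in range(n)], y)
            for _ in range(k)
        ]
        # Predictor chosen before seeing the data: the first one.
        fixed += p_value(rs[0], n) < alpha
        # Predictor chosen after seeing the data: the best-looking one.
        selected += p_value(max(rs, key=abs), n) < alpha
    return fixed / n_sims, selected / n_sims


fixed_rate, selected_rate = rejection_rates()
```

With a pre-specified predictor the rejection rate should sit near the nominal 5%. With ten independent null predictors, the chance that at least one is significant at the 5% level is 1 − 0.95^10 ≈ 0.40, so the rate after selection is far above nominal.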

## Statistics is the least important part of data science

November 14, 2013
This came up already but I'm afraid the point got lost in the middle of our long discussion of Rachel and Cathy's book. So I'll say it again: There's so much that goes on with data that is about computing, not statistics. I do think it would be fair to consider statistics (which includes sampling, […]