## p-values are (possibly biased) estimates of the probability that the null hypothesis is true

April 1, 2013
Last week, I posted about statisticians’ constant battle against the belief that the p-value associated (for example) with a regression coefficient is equal to the probability that the null hypothesis is true, for a null hypothesis that beta is zero or negative. I argued that (despite our long pedagogical practice) there are, in fact, many […]

## How do Dew and Fog Form? Nature at Work with Temperature, Vapour Pressure, and Partial Pressure

In the early morning, especially here in Canada, I often see dew – water droplets formed by the condensation of water vapour on outside surfaces, like windows, car roofs, and leaves of trees.  I also sometimes see fog – water droplets or ice crystals that are suspended in air and often blocking visibility at great […]

## Context – if it isn’t fun…

March 31, 2013
The role of context in statistical analysis The wonderful advantage of teaching statistics is the real-life context within which any application must exist. This can also be one of the difficulties. Statistics without context is merely the mathematics of statistics,

## Checking for Normality with Quantile Ranges and the Standard Deviation

Introduction I was reading Michael Trosset’s “An Introduction to Statistical Inference and Its Applications with R”, and I learned a basic but interesting fact about the normal distribution’s interquartile range and standard deviation that I had not learned before.  This turns out to be a good way to check for normality in a data set. […]
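The excerpt does not say which fact is meant. A well-known candidate is that for a normal distribution the interquartile range is about 1.35 standard deviations, so a sample whose IQR/SD ratio is far from 1.35 is unlikely to be normal. The following is a minimal sketch under that assumption, not the post's own code. The function name `iqr_sd_ratio`, the seed, and the sample sizes are all illustrative choices.

```python
import random
import statistics

def iqr_sd_ratio(data):
    """Ratio of the sample interquartile range to the sample standard deviation."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    return (q3 - q1) / statistics.stdev(data)

# Theoretical ratio for any normal distribution: z(0.75) - z(0.25), about 1.349.
std_normal = statistics.NormalDist()
NORMAL_RATIO = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
uniform_sample = [rng.random() for _ in range(10_000)]

print(f"theoretical normal ratio: {NORMAL_RATIO:.3f}")
print(f"normal sample ratio:      {iqr_sd_ratio(normal_sample):.3f}")
print(f"uniform sample ratio:     {iqr_sd_ratio(uniform_sample):.3f}")
```

The uniform sample should land near 0.5 / sqrt(1/12), roughly 1.73, which is clearly away from the normal value. A ratio close to 1.35 is only a rough screen, since non-normal distributions can produce it too.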

## Easter

March 31, 2013
This morning, there was an interesting post entitled “why does Easter move around so much?” online on http://economist.com/blogs/economist-explains/… In my time series classes, I keep saying that sometimes, series can exhibit seasonality, but the seasonal effect can be quite irregular. It is the case for river levels, where snowmelt can have a huge impact, and it is irregular. Similarly, chocolate sales (even monthly, or quarterly) depend on Easter. Because it can be…

## He’s getting ready to write a book

March 31, 2013
Eric Novik does some open-source planning: My co-author, Jacki Buros, and I [Novik] have just signed a contract with Apress to write a book tentatively entitled “Predictive Analytics with R”, which will cover programming best practices, data munging, data exploration, and single and multi-level models with case studies in social media, healthcare, politics, marketing, and [...]

## Introduction to Approximate Bayesian Computation (ABC)

March 31, 2013
Many of the posts in this blog have been concerned with using MCMC based methods for Bayesian inference. These methods are typically “exact” in the sense that they have the exact posterior distribution of interest as their target equilibrium distribution, but are obviously “approximate”, in that for any finite amount of computing time, we can […]

## George E P Box (1919–2013)

March 31, 2013
Last Thursday (28 March 2013), George Box passed away at the age of 93. He was one of the great statisticians of the last 100 years, and leaves an astonishingly diverse legacy. When I teach forecasting to my second year commerce students, we cover Box-Cox transformations, Box-Pierce and Ljung-Box tests, and Box-Jenkins modelling, and my students wonder if it is the same Box in all cases. It is. And we…

## R: Importing Data

March 31, 2013
There are a number of ways of importing data into R, and several formats are available: From Excel to R, From SPSS to R, From Stata to R, and more here. In this post, I'm going to talk about importing common data formats that we often encounter, such as Excel, ...

## Topological Inference

March 31, 2013
We uploaded a paper called Statistical Inference For Persistent Homology on arXiv. (I posted about topological data analysis earlier here.) The paper is written with Siva Balakrishnan, Brittany Fasy, Fabrizio Lecci, Alessandro Rinaldo and Aarti Singh. The basic idea is this. We observe data where and is supported on a set . We want to […]

## “Statistical Modeling: A Fresh Approach”

March 30, 2013
Ben Hansen recommended to me this book and course by Daniel Kaplan. It looks pretty good. I’ve only looked at the website, not the book itself, and I’m sure I’d find lots of places to disagree with it on details, but the general flow seemed reasonable, also I liked that there’s lots of course materials [...]

## More ordinal data display

March 30, 2013
Over the past two weeks I posted about analyzing ordinal data with R and JAGS. The calculations in the second part made me realize I could actually get top two box intervals out of R. This is demonstrated here. For that I needed the inv...

## Presenting without slides

March 30, 2013
Tired of slides, I’ve been experimenting with different ways of presenting. At the recent Conference on Statistical Practice, I decided only to use slides for an outline and references. As it turns out, the most critical feedback I got had to do with...

## The Art of R Programming review – part 5

March 30, 2013
It's what you've all been waiting for! Let's continue on with our book review: In Chapter 8, the author discusses Math and Simulation functions in R. The topics in this chapter could fill a book given this is what R is primarily used for. However the a...

## R: Entering Data

March 30, 2013
To enter data into R, two common and easy-to-use R functions are utilized: the concatenate function, c; and the data.frame function. The concatenate function, c, is used for combining data points into a single numeric R object. The usage of this function ...

March 30, 2013
There are two ways to install R in Ubuntu. One is through the terminal, and the other is through the Ubuntu Software Center. Through Terminal: Press Ctrl+Alt+T to open Terminal. Then execute sudo apt-get update. After that, sudo apt-get install r-base T...

March 30, 2013
The R statistical package is available at the CRAN website. On this site, do the following: Click Download R for Windows. Then go to install R for the first time in the base subdirectory. And finally download the latest version of R for Windows...

## Householder matrices

March 29, 2013
Householder matrices are square matrices of the form $$P = I - \beta v v^T$$ where $\beta$ is a scalar and $v$ is a vector. They have the useful property that, for suitably chosen $v$ and $\beta$, the product $P x$ zeroes out all of the coordinat...
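As a concrete illustration, here is a minimal pure-Python sketch of one standard way to choose $v$ and $\beta$ so that $P x$ is a multiple of the first unit vector. It assumes the truncated sentence refers to zeroing every coordinate except the first. The function names `householder` and `apply_householder` are illustrative and do not come from the post.

```python
import math

def householder(x):
    """Return (v, beta) such that (I - beta * v v^T) x = alpha * e1.

    alpha = -sign(x[0]) * ||x||; the sign is chosen to avoid cancellation
    when forming v[0].
    """
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    if norm_x == 0.0:
        return list(x), 0.0  # nothing to reflect; P is the identity
    alpha = -math.copysign(norm_x, x[0])
    v = list(x)
    v[0] -= alpha  # v = x - alpha * e1
    beta = 2.0 / sum(vi * vi for vi in v)
    return v, beta

def apply_householder(v, beta, x):
    """Compute P x = x - beta * (v . x) * v without forming the matrix P."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - beta * dot * vi for vi, xi in zip(v, x)]

x = [3.0, 4.0, 0.0]
v, beta = householder(x)
y = apply_householder(v, beta, x)
print(y)  # first entry is -||x|| = -5, the rest are zero up to rounding
```

Applying $P$ through the inner product $v^T x$ costs O(n) operations, which is why implementations rarely build the full matrix.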

## Open Data Exchange 2013, April 6. Montreal

March 29, 2013
UPDATE: The day was great! There are many people doing really amazing things with open data and it was amazing to meet them. Here are my slides from the panel talk. Next Saturday, I’ll be sitting on a panel discussing future avenues for open data at ODX13. From the odx13 site: Odx13 is a mini-conference […]

## Another Feller theory

March 29, 2013
My paper with Christian Robert, “Not Only Defended But Also Applied”: The Perceived Absurdity of Bayesian Inference, was recently published in The American Statistician, along with discussions by Steve Fienberg, Steve Stigler, Deborah Mayo, and Wesley Johnson, and our rejoinder, The Anti-Bayesian Moment and Its Passing. These articles revolved around the question of why the [...]

## The mirage of large numbers

March 29, 2013
The first thing one (should) learn about statistics is "all that data is not information." That's the very first thing I tell my class each semester. This message is doubly resonant in this age of "Big Data". *** I was reading a post on Dell at Felix Salmon's blog, a post written by Ryan McCarthy or Ben Walsh. It cited BusinessWeek's Roben Farzad: "When it comes to putting a price…

## latent Gaussian model workshop in Reykjavik

March 28, 2013
An announcement for an Icelandic meeting next September, a meeting I would have loved to attend (darn!)… This meeting is sponsored by the BayesComp session, of course!!! We are pleased to announce that the University of Iceland will host the 3rd Workshop on Bayesian Inference for Latent Gaussian Models with Applications (LGM). The workshop will be [...]

## Generalized Pairs Plot: It’s about time!

March 28, 2013
JW Emerson, WA Green, B Schloerke, J Crowley, D Cook, H Hofmann, H Wickham (2013) The Generalized Pairs Plot. Journal of Computational and Graphical Statistics 22(1). Here's a free preprint version. Until this new paper and implementation by Emerson et al., there were no widely available pairs plots that accommodated both numerical and categorical fields. [...]