As regular readers of this blog are aware, a few months ago Val Johnson published an article, "Revised standards for statistical evidence," making a Bayesian argument that researchers and journals should use a p=0.005 publication threshold rather than the usual p=0.05. Christian Robert and I were unconvinced by Val's reasoning and wrote a response, "Revised […]

## Books & roses (& chatty taxi-drivers)

I knew that Wednesday was el dia de San Jordi but I hadn't realised the way in which Catalans celebrate it. My flight from Heathrow was slightly delayed, so by the time I arrived in Barcelona it was past 9pm. Still, on my way from the airport I ha...

## Shout out to "R Handles Big Data"

I just wanted to give a shout-out to Bob Muenchen and his excellent blog r4stats.com, and in particular to his exceptional post from a little over a year ago entitled "R Handles Big Data".

## Bandit Formulations for A/B Tests: Some Intuition

Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior. – Kohavi, Henne, Sommerfeld, "Practical Guide to Controlled Experiments on the Web" (2007)

A/B tests are one of the simplest ways of running controlled experiments to evaluate the efficacy of a proposed improvement (a new […]

## An open site for researchers to post and share papers

Alexander Grossman writes: We launched a beta version of ScienceOpen in December on the occasion of the MRS Fall meeting in Boston. The participants of that conference, most of whom were active researchers in physics, chemistry, and materials science, provided us with very positive feedback. In particular they emphasized that it appears to […]

## When to use the start-at-zero rule

A response to a tweet forwarded to me. The person tweeting complained that FiveThirtyEight uses charts that don't start the vertical axis at zero. The example given was this: In this post, I want to clear some confusion around the...

## Publishing an R package in the Journal of Statistical Software

I've been an editor of JSS for the last few years, and as a result I tend to get email from people asking me about publishing papers describing R packages in JSS. So for all those wondering, here are some general comments. JSS prefers to publish papers about packages where the package is on CRAN and has been there long enough to have matured (i.e., obvious bugs ironed out and…

## Machine Learning Lesson of the Day – Estimating Coefficients in Linear Gaussian Basis Function Models

Recently, I introduced linear Gaussian basis function models as a suitable modelling technique for supervised learning problems that involve non-linear relationships between the target and the predictors.  Recall that linear basis function models are generalizations of linear regression that regress the target on functions of the predictors, rather than the predictors themselves.  In linear regression, […]
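The excerpt is cut off before it says how the coefficients are estimated, so the following is only a minimal sketch, assuming ordinary least squares via the normal equations, which is the usual starting point for this kind of model. The function names, the choice of basis centers, and the shared `width` parameter are all hypothetical and are not taken from the original post.

```python
import math


def gaussian_basis(x, center, width):
    """Gaussian basis function exp(-(x - center)^2 / (2 * width^2))."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def features(x, centers, width):
    """Feature vector for one input: an intercept term plus one Gaussian per center."""
    return [1.0] + [gaussian_basis(x, c, width) for c in centers]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_least_squares(xs, ys, centers, width):
    """Least-squares weights from the normal equations (Phi^T Phi) w = Phi^T y."""
    phi = [features(x, centers, width) for x in xs]
    m = len(phi[0])
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(m)]
            for i in range(m)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(m)]
    return solve_linear_system(gram, rhs)


def predict(x, weights, centers, width):
    """Model output: weighted sum of the basis-function features."""
    return sum(w * f for w, f in zip(weights, features(x, centers, width)))


if __name__ == "__main__":
    centers, width = [0.0, 0.5, 1.0], 0.25
    xs = [i / 20 for i in range(21)]
    ys = [math.sin(2 * math.pi * x) for x in xs]  # a non-linear target
    weights = fit_least_squares(xs, ys, centers, width)
    print(weights)
```

The model stays linear in the weights even though it is non-linear in the predictor, which is why the same machinery as linear regression applies once the inputs have been passed through the basis functions.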

## Phil 6334 Visitor: S. Stanley Young, “Statistics and Scientific Integrity”

We are pleased to announce our guest speaker at Thursday's seminar (April 24, 2014): "Statistics and Scientific Integrity"

S. Stanley Young, PhD, Assistant Director for Bioinformatics, National Institute of Statistical Sciences, Research Triangle Park, NC. Author of Resampling-Based Multiple Testing, Westfall and Young (1993) Wiley.

The main readings for the discussion are: Young, S. & Karr, […]

## Distribution of a range

Suppose you're drawing random samples uniformly from some interval. How likely are you to see a new value outside the range of values you've already seen? The problem is more interesting when the interval is unknown. You may be trying to estimate the end points of the interval by taking the max and min of […]
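The opening question has a short answer by symmetry: among n + 1 independent draws from the same continuous distribution, the last one is equally likely to hold any rank, so it is the new minimum or the new maximum with probability 2/(n + 1), whatever the interval is. The simulation below is a rough sketch for checking that figure; the function names, the default interval, and the trial count are assumptions made for illustration, and the truncated post may go on to treat the problem differently.

```python
import random


def exact_prob_outside_range(n_seen):
    """Probability that the next draw falls outside the range of n_seen earlier draws."""
    return 2.0 / (n_seen + 1)


def simulated_prob_outside_range(n_seen, trials=20000, low=0.0, high=1.0, seed=0):
    """Monte Carlo estimate of the same probability for Uniform(low, high) draws."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(trials):
        seen = [rng.uniform(low, high) for _ in range(n_seen)]
        new_value = rng.uniform(low, high)
        if new_value < min(seen) or new_value > max(seen):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(n, exact_prob_outside_range(n), simulated_prob_outside_range(n))
```

Changing `low` and `high` should leave the estimate essentially unchanged, which matches the observation that the answer does not depend on knowing the interval.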

## Thinking of doing a list experiment? Here’s a list of reasons why you should think again

Someone wrote in: We are about to conduct a voting list experiment. We came across your comment recommending that each item be removed from the list. Would greatly appreciate it if you take a few minutes to spell out your recommendation in a little more detail. In particular: (a) Why are you "uneasy" about list […]

## A Weekend With Julia: An R User’s Reflections

First off, I am not going to talk much about Julia's speed. Everybody has seen the tables and graphs showing how, in this benchmark or another, Julia is ten times or a hundred times faster than R.  Most blog posts talking about Ju...

## The inverse of the Hilbert matrix

Just one last short article about properties of the Hilbert matrix. I've already blogged about how to construct a Hilbert matrix in the SAS/IML language and how to compute a formula for the determinant. One reason that the Hilbert matrix is a famous (some would say infamous!) example in numerical […]
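The original post works in SAS/IML and is truncated here, so the following is a Python sketch rather than the author's code. It relies on a standard closed form for the inverse of the n x n Hilbert matrix, whose entries are all integers, and checks the result with exact rational arithmetic so that the matrix's notorious ill-conditioning does not interfere. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def hilbert(n):
    """n x n Hilbert matrix with exact entries H[i][j] = 1 / (i + j - 1), 1-based."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def hilbert_inverse(n):
    """Exact inverse of the Hilbert matrix from the closed-form binomial formula."""
    inverse = []
    for i in range(1, n + 1):
        row = []
        for j in range(1, n + 1):
            sign = -1 if (i + j) % 2 else 1
            entry = (sign * (i + j - 1)
                     * comb(n + i - 1, n - j)
                     * comb(n + j - 1, n - i)
                     * comb(i + j - 2, i - 1) ** 2)
            row.append(entry)
        inverse.append(row)
    return inverse


def matmul(a, b):
    """Plain matrix product; exact when the entries are Fractions or ints."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(n):
    """n x n identity matrix with integer entries."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    n = 4
    print(hilbert_inverse(n))
    print(matmul(hilbert(n), hilbert_inverse(n)) == identity(n))
```

Repeating the check in floating point for larger n is a quick way to see the rounding problems that make the Hilbert matrix a common numerical test case.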

## A short questionnaire regarding the subjective assessment of evidence

E. J. Wagenmakers writes: Remember I briefly talked to you about the subjective assessment of evidence? Together with Richard Morey and myself, Annelies Bartlema created a short questionnaire that can be done online. There are five scenarios and it does not take more than 5 minutes to complete. So far we have collected responses from […]

## Applied Statistics Lesson of the Day – Notation for Fractional Factorial Designs

Fractional factorial designs use the $l^{k-p}$ notation; unfortunately, this notation is not clearly explained in most textbooks or web sites about experimental design.  I hope that my explanation below is useful. $l$ is the number of levels in each factor; note that the notation assumes that all factors have the same number of levels. If a factor has […]

## When less is more

Recently the Centers for Medicare & Medicaid Services (CMS) released provider utilization and payment data. This is part of the government's ongoing push for transparency into medical services and costs. You may remember that last year they released h...

## What is meant by regression modeling?

Linear regression is one of the most common statistical modeling techniques. It is very powerful, important, and (at first glance) easy to teach. However, because it is such a broad topic, it can be a minefield for teaching and discussion. It is common for angry experts to accuse writers […]

## A helpful structure for analysing graphs

Mathematicians teaching English

"I became a maths teacher so I wouldn't have to mark essays"

"I'm having trouble getting the students to write down their own ideas"

"When I give them templates I feel as if it's spoon-feeding them"

These …

## Drexel on Monday 4/28

Looks interesting if you're in the area.  I plan to be at the lunch.

From: Drexel University's LeBow College of Business <announce@lebow.drexel.edu>
Date: Thu, Apr 17, 2014 at 11:14 AM
Subject: School of Economics Presents: 2 Presentations by D...

## Picking a (bio)statistics thesis topic for real world impact and transferable skills

One of the things that was hardest for me in graduate school was starting to think about my own research projects and not just the ideas my advisor fed me. I remember that it was stressful because I didn't quite …

## Russ Altman’s Translational Bioinformatics Year in Review

A few weeks ago the 2014 AMIA Translational Bioinformatics Meeting (TBI) was held in beautiful San Francisco.  This meeting is full of great science that spans the divide between molecular and clinical research, but a true highlight of this meetin...