Blog Archives

Picking a (bio)statistics thesis topic for real world impact and transferable skills

April 22, 2014

One of the things that was hardest for me in graduate school was starting to think about my own research projects and not just the ideas my advisor fed me. I remember that it was stressful because I didn't quite …


The #rOpenSci hackathon #ropenhack

April 10, 2014

Editor's note: This is a guest post by Alyssa Frazee, a graduate student in the Biostatistics department at Johns Hopkins and a participant in the recent rOpenSci hackathon. Last week, I took a break from my normal PhD student schedule …


A non-comprehensive comparison of prominent data science programs on cost and frequency.

March 26, 2014

We did a really brief comparison of a few notable data science programs for a grant submission we were working on. I thought it was pretty fascinating, so I'm posting it here. A couple of notes about the table. 1. Our …


The 80/20 rule of statistical methods development

March 20, 2014

Developing statistical methods is hard and often frustrating work. One of the underappreciated rules in statistical methods development is what I call the 80/20 rule (maybe it could even be the 90/10 rule). The basic idea is that the first …


The time traveler’s challenge.

March 19, 2014

Editor's note: This has nothing to do with statistics. I do a lot of statistics for a living and would claim to know a relatively large amount about it. I also know a little bit about a bunch of other scientific …


Oh no, the Leekasso….

March 12, 2014

An astute reader (Niels Hansen, who is visiting our department today) caught a bug in my code on GitHub for the Leekasso. I had: lm1 = lm(y ~ leekX); predict.lm(lm1, as.data.frame(leekX2)). Unfortunately, this meant that I was getting predictions for the …


PLoS One, I have an idea for what to do with all your profits: buy hard drives

March 5, 2014

I've been closely following the fallout from PLoS One's new policy for data sharing. The policy says, basically, that if you publish a paper, all data and code to go with that paper should be made publicly available at the …


Repost: Ronald Fisher is one of the few scientists with a legit claim to most influential scientist ever

February 17, 2014

Editor's Note: This is a repost of the post "R.A. Fisher is the most influential scientist ever" with a picture of my pilgrimage to his gravesite in Adelaide, Australia. You can now see profiles of famous scientists on Google Scholar citations. …


On the scalability of statistical procedures: why the p-value bashers just don’t get it.

February 14, 2014

Executive Summary: The problem is not p-values; it is a fundamental shortage of data analytic skill. In general it makes sense to reduce researcher degrees of freedom for non-experts, but any choice of statistic, when used by many untrained people, …


Monday data/statistics link roundup (2/10/14)

February 10, 2014

I'm going to try Mondays for the links. Let me know what you think. The Guardian is reading our blog. A week after Rafa posts that everyone should learn to code for career preparedness, the Guardian gets on the bandwagon. …
