Blog Archives

Johns Hopkins Data Science Specialization Capstone 2 Top Performers

June 10, 2015

The second capstone session of the Johns Hopkins Data Science Specialization concluded recently. This time, we had 1,040 learners sign up to participate in the session, which again featured a project developed in collaboration with the amazingly innovative folks at SwiftKey.  We've identified the learners listed below as the top performers in this capstone session.

Read more »

I’m a data scientist – mind if I do surgery on your heart?

June 8, 2015

There has been a lot of recent interest from scientific journals and from other folks in creating checklists for data science and data analysis. The idea is that a checklist will help keep results that won't reproduce or replicate out of the literature. One analogy that I frequently hear is with checklists for surgeons that can

Read more »

Interview with Chris Wiggins, chief data scientist at the New York Times

June 1, 2015

Editor's note: We are trying something a little new here and doing an interview with Google Hangouts on Air. The interview will be live at 11:30am EST. I have some questions lined up for Chris, but if you have others you'd like to ask, you can tweet them @simplystats and I'll see if I can

Read more »

Science is a calling and a career, here is a career planning guide for students and postdocs

May 28, 2015

Editor’s note: This post was inspired by a really awesome career planning guide that Ben Langmead wrote up for his postdocs which you should go check out right now. You can also find the slightly adapted Leek group career planning guide here. The most common reason that people go into science is altruistic. They loved

Read more »

Residual expertise – or why scientists are amateurs at most of science

May 18, 2015

Editor's note: I have been unsuccessfully attempting to finish a book I started 3 years ago about how and why everyone should get pumped about reading and understanding scientific papers. I've adapted part of one of the chapters into this blogpost. It is pretty raw but hopefully gets the idea across.  An episode of The Daily Show with

Read more »

The tyranny of the idea in science

May 8, 2015

There are a lot of analogies between startups and academic science labs. One thing that is definitely very different is the relative value of ideas in the startup world and in the academic world. For example, Paul Graham has said: Actually, startup ideas are not million dollar ideas, and here's an experiment you can try

Read more »

Mendelian randomization inspires a randomized trial design for multiple drugs simultaneously

May 7, 2015

Joe Pickrell has an interesting new paper out about Mendelian randomization. He discusses some of the interesting issues that come up with these studies and performs a mini-review of previously published studies using the technique. The basic idea behind Mendelian Randomization is the following. In a simple, randomly mating population Mendel's laws tell us that at any

Read more »

Rafa’s citations above replacement in statistics journals is crazy high.

May 1, 2015

Editor's note:  I thought it would be fun to do some bibliometrics on a Friday. This is super hacky and the CAR/Y stat should not be taken seriously.  I downloaded data on the 400 most cited papers between 2000-2010 in some statistical journals from Web of Science. Here is a boxplot of the average number

Read more »

Data analysis subcultures

April 29, 2015

Roger and I responded to the controversy around the journal that banned p-values today in Nature. A piece like this requires a lot of information packed into very little space but I thought one idea that deserved to be talked about more was the idea of data analysis subcultures. From the paper: Data analysis is taught

Read more »

A blessing of dimensionality often observed in high-dimensional data sets

April 9, 2015

Tidy data sets have one observation per row and one variable per column.  Using this definition, big data sets can be either: Wide - a wide data set has a large number of measurements per observation, but fewer observations. This type of data set is typical in neuroimaging, genomics, and other biomedical applications. Tall - a

Read more »

