Frustration with published results that can’t be reproduced, and journals that don’t seem to care

December 4, 2016
By

Thomas Heister writes: Your recent post about Per Pettersson-Lidbom’s frustrations in reproducing study results reminded me of our own recent experience replicating a paper in PLOS ONE. We found numerous substantial errors but eventually gave up as, frustratingly, the time and effort didn’t seem to change anything and the journal’s editors quite […] The post Frustration with published results that can’t be reproduced, and journals that don’t…

Read more »

December Reading List

December 3, 2016
By

Goodness me! November went by really quickly! Bagnato, L., L. De Capitani, & A. Punzo, 2016. Testing for serial independence: Beyond the portmanteau approach. American Statistician, in press. Aastveit, K.A., C. Foroni, & F. Ravazzolo, 2016. Densi...

Read more »

So little information to evaluate effects of dietary choices

December 3, 2016
By

Paul Alper points to this excellent news article by Aaron Carroll, who tells us how little information is available in studies of diet and public health. Here’s Carroll: Just a few weeks ago, a study was published in the Journal of Nutrition that many reports in the news media said proved that honey was no […] The post So little…

Read more »

Be careful evaluating model predictions

December 3, 2016
By

One thing I teach is: when evaluating the performance of regression models you should not use correlation as your score. This is because correlation tells you if a re-scaling of your result is useful, but you want to know if the result in your hand is in fact useful. For example: the Mars Climate Orbiter … Continue reading Be careful…

Read more »
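
The point in the excerpt above, that correlation rewards any re-scaling of a prediction while an error measure scores the prediction actually in hand, can be sketched in a few lines. This is only an illustration under assumptions: the numbers are made up, and the helper names `pearson` and `rmse` are hypothetical, not taken from the original post.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rmse(pred, actual):
    """Root mean squared error of predictions against actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

# Made-up data: a reasonable prediction, and the same prediction in the wrong units.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
good_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
rescaled_pred = [10 * p + 100 for p in good_pred]

corr_good = pearson(good_pred, actual)
corr_rescaled = pearson(rescaled_pred, actual)
rmse_good = rmse(good_pred, actual)
rmse_rescaled = rmse(rescaled_pred, actual)

print(f"correlation: good={corr_good:.4f}  rescaled={corr_rescaled:.4f}")
print(f"RMSE:        good={rmse_good:.4f}  rescaled={rmse_rescaled:.4f}")
```

The two correlations agree because correlation is unchanged by a positive affine rescaling, while the RMSE of the rescaled prediction is far larger. Only the error measure notices that the second prediction is unusable as it stands.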

Some U.S. demographic data at zipcode level conveniently in R

December 2, 2016
By

Ari Lamstein writes: I chuckled when I read your recent “R Sucks” post. Some of the comments were a bit … heated … so I thought to send you an email instead. I agree with your point that some of the datasets in R are not particularly relevant. The way that I’ve addressed that is […] The post Some U.S.…

Read more »

Good stuff around

December 2, 2016
By

Lately, I've been publicising quite heavily our Summer school and new MSc, but of course, we're not the only ones to plan for interesting things worth mentioning - well, of course this is highly subjective... But then again, this blog is (mainly)...

Read more »

Survey weighting and that 2% swing

December 1, 2016
By

Nate Silver agrees with me that much of that shocking 2% swing can be explained by systematic differences between sample and population: survey respondents included too many Clinton supporters, even after corrections from existing survey adjustments. In Nate’s words, “Pollsters Probably Didn’t Talk To Enough White Voters Without College Degrees.” Last time we looked carefully […] The post Survey weighting…

Read more »
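
As a rough illustration of how a mismatch between sample and population can shift an estimate, here is a minimal post-stratification sketch. Everything in it is an assumption made for the example: the two education groups, the population shares, the sample counts, and the support rates are invented numbers, not figures from the post or from any poll.

```python
# Hypothetical population shares by education group (assumed for illustration).
population_share = {"college": 0.40, "no_college": 0.60}

# Hypothetical sample: group -> (number of respondents, share supporting candidate A).
sample = {"college": (600, 0.55), "no_college": (400, 0.40)}

n_total = sum(n for n, _ in sample.values())

# Unweighted estimate: every respondent counts equally.
raw_estimate = sum(n * support for n, support in sample.values()) / n_total

# Post-stratification weight: population share divided by sample share of the group.
weights = {g: population_share[g] / (n / n_total) for g, (n, _) in sample.items()}

# Weighted estimate: respondents from under-sampled groups count for more.
weighted_total = sum(weights[g] * n * support for g, (n, support) in sample.items())
weight_sum = sum(weights[g] * n for g, (n, _) in sample.items())
weighted_estimate = weighted_total / weight_sum

print(f"raw estimate:      {raw_estimate:.3f}")
print(f"weighted estimate: {weighted_estimate:.3f}")
```

With these invented numbers the over-sampled college group pulls the raw estimate up, and re-weighting to the assumed population shares lowers it by three points. If the adjustment variables miss a group that matters, a similar gap can survive the weighting.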

ratio-of-uniforms [#4]

December 1, 2016
By

Possibly the last post on random number generation by Kinderman and Monahan’s (1977) ratio-of-uniforms method. After fiddling with the Gamma(a,1) distribution when a<1 for a while, I indeed figured out a way to produce a bounded set with this method: considering an arbitrary cdf Φ with corresponding pdf φ, the uniform distribution on the set […]

Read more »
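
For readers who have not met the method, here is a minimal sketch of the basic ratio-of-uniforms idea: if (u, v) is uniform on the set where 0 < u ≤ √f(v/u), then v/u has density proportional to f. The sketch uses the standard normal target, where that set is bounded, rather than the Gamma(a,1) case with a<1 discussed in the post. The function name `rou_standard_normal` is hypothetical.

```python
import math
import random

def rou_standard_normal(rng):
    """Draw one standard normal variate by ratio-of-uniforms with rejection.

    Target (unnormalised): f(x) = exp(-x^2 / 2).
    Acceptance set: 0 < u <= sqrt(f(v/u)), which sits inside the box
    (0, 1] x [-sqrt(2/e), sqrt(2/e)].
    """
    v_bound = math.sqrt(2.0 / math.e)
    while True:
        u = 1.0 - rng.random()             # uniform on (0, 1], avoids log(0)
        v = rng.uniform(-v_bound, v_bound)
        # u^2 <= exp(-(v/u)^2 / 2) is equivalent to v^2 <= -4 u^2 log(u)
        if v * v <= -4.0 * u * u * math.log(u):
            return v / u

rng = random.Random(0)
draws = [rou_standard_normal(rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)

print(f"mean ~ {sample_mean:.3f}, variance ~ {sample_var:.3f}")
```

The bounding box comes from the suprema of √f(x) and |x|√f(x), which are 1 and √(2/e) for this target. When the acceptance set is unbounded, as for Gamma(a,1) with a<1, no such box exists, so plain rejection from a rectangle is not available.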

RStudio in the cloud with Amazon Lightsail and docker

December 1, 2016
By

About two years ago we published a quick and easy guide to setting up your own RStudio server in the cloud using the Docker service and Digital Ocean. The process is incredibly easy-- about the only cumbersome part is retyping a random password. Toda...

Read more »

Efficiently Saving and Sharing Data in R

December 1, 2016
By

I spent a day the other week struggling to make sense of a federal data set shared in an archaic format (an ASCII fixed-format dat file). It is essential for the effective distribution and sharing of data that it use the minimum amount of disk spac...

Read more »

How can you evaluate a research paper?

December 1, 2016
By

Shea Levy writes: You ended a post from last month [i.e., Feb.] with the injunction to not take the fact of a paper’s publication or citation status as meaning anything, and instead that we should “read each paper on its own.” Unfortunately, while I can usually follow e.g. the criticisms of a paper you might […] The post How can…

Read more »

Sorting out the data, and creating the head-shake manual

December 1, 2016
By

Yesterday's post attracted a few good comments. Several readers don't like the data used in the NAEP score chart. The authors labeled the metric "gain in NAEP scale scores" which I interpreted to be "gain scores," a popular way of...

Read more »

asymptotically exact inference in likelihood-free models [a reply from the authors]

November 30, 2016
By

[Following my post of last Tuesday, Matt Graham commented on the paper with force détails. Here are those comments. A nicer HTML version of the Markdown reply below is also available on Github.] Thanks for the comments on the paper! A few additional replies to augment what Amos wrote: This however sounds somewhat intense in that […]

Read more »

An exciting new entry in the “clueless graphs from clueless rich guys” competition

November 30, 2016
By

Jeff Lax points to this post from Matt Novak linking to a post by Matt Taibbi that shares the above graph from newspaper columnist / rich guy Thomas Friedman. I’m not one to spend precious blog space mocking bad graphs, so I’ll refer you to Novak and Taibbi for the details. One thing I do […] The post An exciting…

Read more »

Review: Jon Schwabish, Better Presentations

November 30, 2016
By

Presentations can be dreadful. Badly thought-out slides, boring structure, poorly delivered. I once told a colleague after a practice talk to please shoot me before she’d ever make me sit through such a talk again (to be fair, she had called the talk boring herself before she even began). Instead of suffering through more bad presentations, Jon […]

Read more »

Reading a Picture

November 30, 2016
By

Visual storytelling: Visualising data helps in understanding facts. Sometimes it’s very easy to understand a graph; sometimes it’s necessary to read it and to study it to discover unknown territory. Such graphs are little masterpieces. Here’s one of these and I am sure the authors had more than one iteration and discussion while creating it. The … Continue reading Reading a…

Read more »

Interesting epi paper using Stan

November 30, 2016
By

Jon Zelner writes: Just thought I’d send along this paper by Justin Lessler et al. Thought it was both clever & useful and a nice ad for using Stan for epidemiological work. Basically, what this paper is about is estimating the true prevalence and case fatality ratio of MERS-CoV [Middle East Respiratory Syndrome Coronavirus Infection] […] The post Interesting epi…

Read more »

Involuntary head-shaking is probably not an intended consequence of data visualization

November 30, 2016
By

This chart is in the Sept/Oct edition of Harvard Magazine: Pretty standard fare. It is even Tufte-esque in the sparing use of axes, labels, and other non-data-ink. Does it bug you how much work you need to do to understand...

Read more »

BERT: a newcomer in the R Excel connection

November 30, 2016
By

A few months ago a reader pointed out to me this new way of connecting R and Excel. I don’t know how long it has been around, but I had never come across it and I’ve never seen any blog post or article about it. So I decided to write a post as the tool is really […]

Read more »

Append data to add markers to SAS graphs

November 30, 2016
By

Do you want to create customized SAS graphs by using PROC SGPLOT and the other ODS graphics procedures? An essential skill that you need to learn is how to merge, join, append, and concatenate SAS data sets that come from different sources. The SAS statistical graphics procedures (SG procedures) enable […] The post Append data to add markers to SAS…

Read more »

vtreat data cleaning and preparation article now available on arXiv

November 30, 2016
By

Nina Zumel and I are happy to announce that a formal article discussing data preparation and cleaning using the vtreat methodology is now available from arXiv.org as citation arXiv:1611.09477 [stat.AP]. vtreat is an R data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. It prepares variables so that data has fewer … Continue reading vtreat data…

Read more »

Not So Standard Deviations Episode 27 – Special Guest Amelia McNamara

November 30, 2016
By

I had the pleasure of sitting down with Amelia McNamara, Visiting Assistant Professor of Statistical and Data Sciences at Smith College, to talk about data science, data journalism, visualization, the problems with R, and adult coloring books. If you ...

Read more »

“A bug in fMRI software could invalidate 15 years of brain research”

November 29, 2016
By

About 50 people pointed me to this press release or the underlying PPNAS research article, “Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates,” by Anders Eklund, Thomas Nichols, and Hans Knutsson, who write: Functional MRI (fMRI) is 25 years old, yet surprisingly its most common statistical methods have not been validated […] The post “A bug…

Read more »

