Politics and chance

September 25, 2016

After the New Hampshire primary Nadia Hassan wrote: Some have noted how minor differences in how the candidates come out in these primaries can make a huge difference in the media coverage. For example, only a few thousand voters separate third and fifth and it really impacts how pundits talk about a candidate’s performance. Chance […] The post Politics and chance appeared first on Statistical Modeling, Causal Inference, and Social…

Read more »

Windows 10 anniversary updates includes a whole Linux layer – this is good news for data scientists

September 24, 2016

If you are on Windows 10, no doubt you have heard that Microsoft included the bash shell in its 2016 Windows 10 anniversary update. What you may not know is that this is much, much more than just the bash shell. This is a whole Linux layer that enables...

Read more »

Cracks in the thin blue line

September 24, 2016

When people screw up or cheat in their research, what do their collaborators say? The simplest case is when coauthors admit their error, as Cexun Jeffrey Cai and I did when it turned out that we’d miscoded a key variable in an analysis, invalidating the empirical claims of our award-winning paper. On the other extreme, […] The post Cracks in…

Read more »

Making sense of the most detailed map of gay marriages

September 23, 2016

Last week, my Columbia students discussed this nice article in the New York Times called "The Most Detailed Map of Gay Marriages in America". The center of the article is a map. I asked the students to identify the problem that this dataviz is supposed to address. Someone responded that it tells us where gay married couples are found…

Read more »

Trump +1 in Florida; or, a quick comment on that “5 groups analyze the same poll” exercise

September 23, 2016

Nate Cohn at the New York Times arranged a comparative study on a recent Florida pre-election poll. He sent the raw data to four groups (Charles Franklin; Patrick Ruffini; Margie Omero, Robert Green, Adam Rosenblatt; and Sam Corbett-Davies, David Rothschild, and me) and asked each of us to analyze the data how we’d like to […] The post Trump +1…

Read more »

Andrew Gelman is not the plagiarism police because there is no such thing as the plagiarism police.

September 23, 2016

The title of this post is a line that Thomas Basbøll wrote a couple years ago. Before I go on, let me say that the fact that I have not investigated this case in detail is not meant to imply that it’s not important or that it’s not worth investigating. It’s just not something that […] The post Andrew Gelman…

Read more »

Multicollinearity causing risk and uncertainty

September 22, 2016

Alexia Gaudeul writes: Maybe you will find this interesting / amusing / frightening, but the Journal of Risk and Uncertainty recently published a paper with a rather obvious multicollinearity problem. The issue does not come up that often in the published literature, so I thought you might find it interesting for your blog. The paper […] The post Multicollinearity causing…

Read more »

an inverse permutation test

September 22, 2016

A straightforward but probabilistic riddle this week in the Riddler, which is to find the expected order of integer i when the sequence {1,2,…,n} is partitioned at random into two sets, A and B, each of which is then sorted before both sets are merged. For instance, if {1,2,3,4} is divided in A={1,4} and B={2,3}, […]

Read more »

Why is the scientific replication crisis centered on psychology?

September 22, 2016

The replication crisis is a big deal. But it’s a problem in lots of scientific fields. Why is so much of the discussion about psychology research? Why not economics, which is more controversial and gets more space in the news media? Or medicine, which has higher stakes and a regular flow of well-publicized scandals? Here […] The post Why is…

Read more »

Talk: Pie Charts – Unloved, Unstudied, and Misunderstood

September 22, 2016

I gave a talk at Information+ earlier this year that has now been posted. It's about pie charts! And it was a fun talk, too. The video is focused a bit too much on me at the beginning, so you'll miss a few of the early jokes. But for the most part, you'll see the […]

Read more »

“Crimes Against Data”: My talk at Ohio State University this Thurs; “Solving Statistics Problems Using Stan”: My talk at the University of Michigan this Fri

September 21, 2016

Crimes Against Data Statistics has been described as the science of uncertainty. But, paradoxically, statistical methods are often used to create a sense of certainty where none should exist. The social sciences have been rocked in recent years by highly publicized claims, published in top journals, that were reported as “statistically significant” but are implausible […] The post “Crimes Against…

Read more »

What has happened down here is the winds have changed

September 21, 2016

Someone sent me this article by psychology professor Susan Fiske, scheduled to appear in the APS Observer, a magazine of the Association for Psychological Science. The article made me a little bit sad, and I was inclined to just keep my response short and sweet, but then it seemed worth the trouble to give some […] The post What has…

Read more »

Simulate data from a generalized Gaussian distribution

September 21, 2016

Although statisticians often assume normally distributed errors, there are important processes for which the error distribution has a heavy tail. A well-known heavy-tailed distribution is the t distribution, but the t distribution is unsuitable for some applications because it does not have finite moments (means, variance,...) for small parameter values. […] The post Simulate data from a generalized Gaussian distribution…

Read more »
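
For the generalized Gaussian post above, here is a minimal Python sketch of one common way to simulate such data. It is not taken from the post itself. It assumes the density is proportional to exp(-(|x - mu| / alpha)^beta) and relies on the fact that a Gamma(1/beta, 1) variate raised to the power 1/beta, given a random sign, has that shape. The function names, the seed, and the parameter values are illustrative assumptions.

```python
import math
import random


def sample_generalized_gaussian(mu, alpha, beta, rng):
    """Draw one value with density proportional to exp(-(|x - mu| / alpha) ** beta)."""
    # If G ~ Gamma(shape=1/beta, scale=1), then G ** (1/beta) is distributed
    # like |x - mu| / alpha. A fair random sign makes the draw symmetric.
    g = rng.gammavariate(1.0 / beta, 1.0)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return mu + sign * alpha * g ** (1.0 / beta)


def theoretical_variance(alpha, beta):
    """Variance of the generalized Gaussian: alpha^2 * Gamma(3/beta) / Gamma(1/beta)."""
    return alpha ** 2 * math.gamma(3.0 / beta) / math.gamma(1.0 / beta)


# Hypothetical settings: beta = 1 gives the Laplace distribution (heavier
# tails than the normal); beta = 2 would give a normal distribution.
rng = random.Random(2016)
mu, alpha, beta = 0.0, 1.0, 1.0
draws = [sample_generalized_gaussian(mu, alpha, beta, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / (len(draws) - 1)

print("sample mean:", round(sample_mean, 3))
print("sample variance:", round(sample_var, 3))
print("theoretical variance:", theoretical_variance(alpha, beta))
```

Unlike the t distribution with few degrees of freedom, every moment of this distribution is finite for any beta > 0, so the sample variance can be checked against the closed-form value.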

A Fun Gastronomical Dataset: What’s on the Menu?

September 20, 2016

I just found a fun food themed dataset that I’d never heard about and that I thought I’d share. It’s from a project called What’s on the menu where the New York Public Library has crowdsourced a digitization of their collection of historical ...

Read more »

Uncertainty in a probability

September 20, 2016

Suppose you did a pilot study with 10 subjects and found a treatment was effective in 7 out of the 10 subjects. With no more information than this, what would you estimate the probability to be that the treatment is effective in the next subject? Easy: 0.7. Now what would you estimate the probability to be […]

Read more »
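
For the pilot-study example above (7 successes in 10 subjects), one way to attach uncertainty to the 0.7 estimate is a Beta posterior. The sketch below assumes a uniform Beta(1, 1) prior, which is an assumption here and not necessarily the post's approach. The 70-out-of-100 comparison is a hypothetical added for contrast.

```python
import math


def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Return (a, b, mean, sd) of the Beta posterior for a success probability.

    Assumes a Beta(prior_a, prior_b) prior and independent binary outcomes.
    """
    a = prior_a + successes
    b = prior_b + (trials - successes)
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return a, b, mean, sd


# Pilot study from the excerpt: 7 of 10 subjects responded.
small = beta_posterior(7, 10)
# Hypothetical larger study with the same observed rate.
large = beta_posterior(70, 100)

for label, (a, b, mean, sd) in (("7/10", small), ("70/100", large)):
    print(f"{label}: posterior Beta({a:g}, {b:g}), mean {mean:.3f}, sd {sd:.3f}")
```

Both samples have an observed rate of 0.7, but the posterior standard deviation is considerably smaller for the larger sample. Under the uniform prior the posterior mean for the pilot study is 8/12, slightly below the raw proportion.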

On "Shorter Papers"

September 20, 2016

Journals should not corral shorter papers into sections like "Shorter Papers". Doing so sends a subtle (actually unsubtle) message that shorter papers are basically second-class citizens, somehow less good, or less important, or less so...

Read more »

“Methodological terrorism”

September 20, 2016

Methodological terrorism is when you publish a paper in a peer-reviewed journal, its claim is supported by a statistically significant t statistic of 5.03, and someone looks at your numbers, figures out that the correct value is 1.8, and then posts that correction on social media. Terrorism is when somebody blows shit up and tries […] The post “Methodological terrorism”…

Read more »

Acupuncture paradox update

September 20, 2016

The acupuncture paradox, as we discussed earlier, is: The scientific consensus appears to be that, to the extent that acupuncture makes people feel better, it is through relaxing the patient, also the acupuncturist might help in other ways, encouraging the patient to focus on his or her lifestyle. But whenever I discuss the topic with […] The post Acupuncture paradox…

Read more »

Short Course on R and Data Mining, University of Canberra, Fri 7 Oct 2016

September 20, 2016

Short Course on R and Data Mining Information Technology and Engineering, University of Canberra Fees: There are no fees for the short course but seats are limited to 60 – so register early through http://www.meetup.com/CanberraDataSci/events/234168862/ Presenters: Dr Yanchang Zhao (Adjunct … Continue reading →

Read more »

Tapestry 2017: St. Augustine, FL on March 1st

September 20, 2016

We just announced next year's Tapestry Conference – the fifth episode (chapter? act?)! It will take place on March 1st, 2017, in St. Augustine, FL. We have three exciting keynotes, and we're looking for your talk proposals, posters, and demos! Tapestry is a conference about storytelling with data. The goal is to bring together people from different backgrounds and […]

Read more »

Relative error distributions, without the heavy tail theatrics

September 20, 2016

Nina Zumel prepared an excellent article on the consequences of working with relative error distributed quantities (such as wealth, income, sales, and many more) called “Living in A Lognormal World.” The article emphasizes that if you are dealing with such quantities you are already seeing effects of relative error distributions (so it isn’t an exotic … Continue reading Relative error…

Read more »

StanCon is coming! Sat, 1/21/2017

September 19, 2016

Save the date! The first Stan conference is going to be in NYC in January. Registration will open at the end of September.
When: Saturday, January 21, 2017, 9 am – 5 pm
Where: Davis Auditorium, Columbia University, 530 West 120th Street, 4th floor (campus level), room 412, New York, NY 10027
[…] The post StanCon is…

Read more »

Primers in computational biology

September 19, 2016

I recently stumbled across this collection of computational biology primers in Nature Biotechnology. Many of these are old, but they're still great resources to get a fundamental understanding of the topic. Here they are in no particular order....How d...

Read more »

