A follow-up to Crowdsourcing Research

February 11, 2016

Last month I published some thoughts on crowdsourcing research, inspired by Anthony Goldbloom’s talk at Statistical Programming DC on the Kaggle experience. Today, I found a rather similar discussion (in the online version of the magazine Good) of crowdsourcing research as a potential way to increase the accuracy of scientific research and reduce bias. I think more consideration needs to […]

Data handcuffs

February 10, 2016

A few years ago, if you had asked me which skills I got asked about most for students going into industry, I'd definitely have said things like data cleaning, data transformation, database pulls, and other non-traditional statistical tasks. But as companies have progressed from the point of storing data to actually wanting to do something with

Star Wars: distinguishing the signal from the noise

February 10, 2016

The great difficulty in modeling and in building predictive models is managing to distinguish the signal from the noise (to borrow the title of Nate Silver's classic). The statistical answer is the notion of significance, and the search for 'stars' in regression output. With the explosion in the amount of data, it has become crucial to…

Scientific explanation of Panther defeat!

February 10, 2016

Roy’s comment on our recent post inspires me to reveal the true explanation underlying the Carolina team’s shocking Super Bowl loss. The Panthers were primed during the previous week with elderly-themed words such as “bingo” and “Manning.” As well-established research has demonstrated, this caused Cam and the gang to move more slowly, hence all the […]

More on Policy Uncertainty

February 10, 2016

Speaking of policy uncertainty (earlier blog post here), read here about the exciting ongoing project at the Becker-Friedman Institute of the University of Chicago.

Who do you think I am?

February 10, 2016

Today I received an email inviting me to submit a paper to a scientific journal (it doesn't really matter what journal it is, or whether or not my own work is actually relevant for them). I looove the way they've addressed me, though. I think tha...

Sample with replacement and unequal probability in SAS

February 10, 2016

How do you sample with replacement in SAS when the probability of choosing each observation varies? I was asked this question recently. The programmer thought he could use PROC SURVEYSELECT to generate the samples, but he wasn't sure which sampling technique he should use to sample with unequal probability. This […]

Books to Read While the Algae Grow in Your Fur, January 2016

February 10, 2016

Attention conservation notice: I have no taste. Mark Thompson, The White War: Life and Death on the Italian Front, 1915--1919 A well-told narrative history of the war, mostly from the Italian side. He covers all aspects, from the back-and-forth of ...

"Robust Bayesian inference via coarsening" (Next Week at the Statistics Seminar)

February 10, 2016

Attention conservation notice: Only of interest if you (1) care about allocating precise fractions of a whole belief over a set of mathematical models when you know none of them is actually believable, and (2) will be in Pittsburgh on Monday. As someone wh...

Worst Practices Conference

February 10, 2016

This ad just arrived in the email. What a title. Presumably the conference is about improving worst-case outcomes in order to improve expected minimax loss. But still, that title... 2016 Foresight Practitioner Conference: Worst Practice...

Finding the K in K-means by Parametric Bootstrap

February 9, 2016

One of the trickier tasks in clustering is determining the appropriate number of clusters. Domain-specific knowledge is always best, when you have it, but there are a number of heuristics for getting at the likely number of clusters in your data. We cover a few of them in Chapter 8 (available as a free sample …

Posterior Update of Bayes@Lund 2016

February 9, 2016

For the third year running, Ullrika Sahlin and I arranged Bayes@Lund, a mini-conference bringing together researchers interested in or working with Bayesian methods in and around Sweden. This year we were thrilled to have over 70 attendees, both from ne...

Analysis: Clinton backed by Big Money: Sanders by Small

February 9, 2016

This article examines FEC data in depth and finds what most people already know. Hillary Clinton's presidential bid is financed largely by a relatively small number of big donors, while Bernie Sanders' presidential bid is funded by numerous small donors. In order to do our analysis, we look at four hundred thousand individualized contributions reported to the FEC at the end…

Leek group guide to reading scientific papers

February 9, 2016

The other day on Twitter, Amelia requested a guide for reading papers: "I love @jtleek’s github guides to reviewing papers, writing R packages, giving talks, etc. Would love one on reading papers, for students." (Amelia McNamara, @AmeliaMN, February 5, 2016) So I came up with a guide which you can find here: Leek

The Replication Network

February 9, 2016

This is a "shout out" for The Replication Network. The full name is The Replication Network: Furthering the Practice of Replication in Economics. I was alerted to TRN some time ago by co-organiser Bob Reed, and I'm pleased to be a member. What's T...

Stan’s Super Bowl prediction: Broncos 24, Panthers 13

February 9, 2016

We ran the data through our model, not just the data from the past season but from the past 17 seasons (that’s what we could easily access) with a Gaussian process model to allow team abilities to vary over time. Because we’re modeling individual game outcomes, our model automatically controls for imbalances such as Carolina’s […]

PhD positions in Probabilistic Machine Learning at #AaltoPML group Finland

February 9, 2016

There are PhD positions in our Probabilistic Machine Learning group at Aalto, Finland, and altogether 15 positions in the Helsinki ICT network. Apply here. The most interesting topic in the call is supervised by Prof. Samuel Kaski at AaltoPML (and you may collaborate with me too :) We are looking for PhD candidates interested in probabilistic […]

Primed to lose

February 9, 2016

David Hogg points me to a recent paper, “A Social Priming Data Set With Troubling Oddities” by Hal Pashler, Doug Rohrer, Ian Abramson, Tanya Wolfson, and Christine Harris, which begins: Chatterjee, Rose, and Sinha (2013) presented results from three experiments investigating social priming—specifically, priming effects induced by incidental exposure to concepts relating to cash or […]

New Judea Pearl Causal Inference "Primer"

February 9, 2016

Should be a fun and informative read. Check out the contents and various chapters here. ("Causal Inference in Statistics - A Primer" by J. Pearl, M. Glymour and N. Jewell. Available now on Kindle; available in print Feb. 26, 2016.)

Using SVG graphics in blog posts

February 9, 2016

My traditional workflow for embedding R graphics into a blog post has been via PNG files that I upload online. However, when I created a 'simple' graphic with only basic curves and triangles for a recent post, I noticed that the PNG output didn't lo...

The State of Information Visualization, 2016

February 8, 2016

Oh hello, new year! I almost didn’t see you there! Lots of interesting things happened last year: Dear Data, deceptive visualization, storytelling research, new tools and ideas, etc. And this year is already shaping up to be quite strong, too. Dear Data. Perhaps the most exciting project of 2015 was Dear Data by Giorgia Lupi and Stefanie Posavec. They …

Neglected optimization topic: set diversity

February 8, 2016

The mathematical concept of set diversity is a somewhat neglected topic in current applied decision sciences and optimization. We take this opportunity to discuss the issue. The problem. Consider the following problem: for a number of items U = {x_1, …, x_n} pick a small set of them X = {x_i1, x_i2, ..., x_ik} such …

Forking paths vs. six quick regression tips

February 8, 2016

Bill Harris writes: I know you’re on a blog delay, but I’d like to vote to raise the odds that my question in a comment to http://andrewgelman.com/2015/09/15/even-though-its-published-in-a-top-psychology-journal-she-still-doesnt-believe-it/ gets discussed, in case it’s not in your queue. It’s likely just my simple misunderstanding, but I’ve sensed two bits of contradictory advice in your writing: fit one complete model all at […]
