A linguist has a question about sampling when the goal is causal inference from observational data

July 28, 2014

Nate Delaney-Busch writes: I’m a PhD student of cognitive neuroscience at Tufts, and a question came up recently with my colleagues about the difficulty of random sampling in cases of highly controlled stimulus sets, and I thought I would drop a line to see if you had any reading suggestions for us. Let’s say I wanted […] The post A linguist has a question about sampling when the goal is causal…

Stan NYC Meetup – Thurs, July 31

July 28, 2014

The next Stan NYC meetup is happening on Thursday, July 31, at 7 pm. If you’re interested, registration is required and closes on Wednesday night: http://www.meetup.com/Stan-Users-NYC/events/193685802/   The third session will focus on using the Stan language. If you’re bringing a laptop, please come with RStan, PyStan, or CmdStan already installed.   We’re going to […] The post Stan NYC Meetup – Thurs, July 31 appeared first on Statistical Modeling,…

On deck this week

July 28, 2014

Mon: A linguist has a question about sampling when the goal is causal inference from observational data Tues: The Ben Geen case: Did a naive interpretation of a cluster of cases send an innocent nurse to prison until 2035? Wed: Statistics and data science, again Thurs: The health policy innovation center: how best to move […] The post On deck this week appeared first on Statistical Modeling, Causal Inference, and…

The unkind fate of data graphics in the media

July 28, 2014

Journalism suffers from an archiving challenge in the digital age, which I wrote about here. Even worse is the fate of data graphics. This has always been an issue, as digital archives of newspapers do not save any of the graphics. (Try going to the New York Times archive to see for yourself). The new wave of graphing technology is making this problem worse! The new technology embeds charting instructions…

A Second NBER Econometrics Group?

July 28, 2014

The NBER is a massive consumer of econometrics, so it needs at least a group or two devoted to producing econometrics. Hence I'm thrilled that the "Forecasting and Empirical Methods in Macroeconomics and Finance" group, now led by A...

Lexicographic combinations in SAS

July 28, 2014

In a previous blog post, I described how to generate combinations in SAS by using the ALLCOMB function in SAS/IML software. The ALLCOMB function in Base SAS is the equivalent function for DATA step programmers. Recall that a combination is a unique arrangement of k elements chosen from a set […]
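
The idea of listing combinations in lexicographic order is not specific to SAS. As a minimal sketch in Python (not a reproduction of the SAS functions the post discusses), the standard library's `itertools.combinations` emits tuples in lexicographic order of the input positions, so sorting the input first gives lexicographically ordered output. The helper name `lex_combinations` is hypothetical and used only for illustration.

```python
from itertools import combinations


def lex_combinations(items, k):
    """Return all k-element combinations of items in lexicographic order."""
    # itertools.combinations yields tuples ordered by input position,
    # so sorting the input first makes the output lexicographic by value.
    return list(combinations(sorted(items), k))


if __name__ == "__main__":
    # Choose 2 elements from a 4-element set; the input order does not matter.
    for combo in lex_combinations(["c", "a", "d", "b"], 2):
        print(combo)
```

Sorting up front keeps the function independent of the order in which the caller supplies the set.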

Cigarette and life expectancy

July 28, 2014

Yesterday evening, I uploaded a graph showing labor productivity as a function of coffee consumption. Of course, it was for fun! With this kind of regression, based on aggregated data, we can say almost anything, since most such variables are correlated because of some (hidden) common factor, such as the wealth of the country. For instance, with a similar approach, we can see that there is an increasing…

Coffee and Productivity

July 27, 2014

On Twitter, I was asked if there were serious research papers published on coffee consumption and labour productivity. There are some papers on coffee breaks and productivity, e.g. Productivity Through Coffee Breaks, but I could not find anything on coffee consumption. Since I could not find any dataset with personal consumption (maybe I should start keeping track of my own consumption to run a study), I tried to find data for national…

Stan 2.4, New and Improved

July 27, 2014

We’re happy to announce that all three interfaces (CmdStan, PyStan, and RStan) are up and ready to go for Stan 2.4. As usual, you can find full instructions for installation on the Stan Home Page. Here are the release notes with a list of what’s new and improved: New Features ------------ * L-BFGS optimization (now […] The post Stan 2.4, New and Improved appeared first on Statistical Modeling, Causal Inference,…

Stan found using directed search

July 27, 2014

X and I did some “Sampling Through Adaptive Neighborhoods” ourselves the other day and checked out the nearby grave of Stanislaw Ulam, who is buried with his wife, Françoise Aron, and others of her family. The above image of Stanislaw and Françoise Ulam comes from this charming mini-biography from Roland Brasseur, which I found here. […] The post Stan found using directed search appeared first on Statistical Modeling, Causal Inference,…

NYC workshop 22 Aug on open source machine learning systems

July 26, 2014

The workshop is organized by John Langford (Microsoft Research NYC), along with Alekh Agarwal and Alina Beygelzimer, and it features Liblinear, Vowpal Wabbit, Torch, Theano, and . . . you guessed it . . . Stan! Here’s the current program: 8:55am: Introduction 9:00am: Liblinear by CJ Lin. 9:30am: Vowpal Wabbit and Learning to Search (John […] The post NYC workshop 22 Aug on open source machine learning systems appeared first…

Statistics, and the Goldilocks Principle

July 26, 2014
$\hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) \quad = \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)$

At the end of May, in Toronto, we had that great talk at the SSC by Jeff Rosenthal, on Monte Carlo techniques, and Jeff mentioned “the Goldilocks principle” (it was in the context of MCMC, and I did mention it in my talk in London on MCMC, when I discussed the value of the rejection rate of the Metropolis-Hastings algorithm, which should be not too large,…
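
The formula shown with this entry is the kernel density estimator, where the bandwidth $h$ should be neither too small nor too large. A minimal sketch of that estimator follows, assuming a Gaussian kernel for $K$ (the excerpt does not say which kernel is used); the function names `gaussian_kernel` and `kde` are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, h):
    """Evaluate f_hat_h(x) = 1/(n h) * sum_i K((x - x_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(data)
    if n == 0:
        raise ValueError("data must be non-empty")
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)


if __name__ == "__main__":
    sample = [-1.2, -0.4, 0.1, 0.3, 1.5]  # illustrative values, not real data
    # A very small h gives a spiky estimate, a very large h an oversmoothed one.
    for h in (0.05, 0.5, 5.0):
        print(h, kde(0.0, sample, h))
```

Evaluating the estimate on a grid for several values of `h` is a simple way to see the trade-off between a spiky estimate and an oversmoothed one.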

S. Senn: “Responder despondency: myths of personalized medicine” (Guest Post)

July 26, 2014

Stephen Senn Head, Methodology and Statistics Group Competence Center for Methodology and Statistics (CCMS) Luxembourg Responder despondency: myths of personalized medicine The road to drug development destruction is paved with good intentions. The 2013 FDA report, Paving the Way for Personalized Medicine  has an encouraging and enthusiastic foreword from Commissioner Hamburg and plenty of extremely […]

LOD Cloud Growing

July 26, 2014

The Linked Open Data Cloud is growing. The new diagram as of April 2014 shows this development, compared to 2011 (diagram below). …Continue reading →

Guns are Cool – States

July 26, 2014

Last week I looked at time effects in the shootingtracker database. This week I will look at the states. Some (smaller) states never made it into the database. Other states appear far too frequently. The worst of these is California. After correcting for populati...

Student forecasting awards from the IIF

July 26, 2014

At the IIF annual board meeting last month in Rotterdam, I suggested that we provide awards to the top students studying forecasting at university level around the world, to the tune of \$100 plus IIF membership for a year. I’m delighted that the idea met with enthusiasm, and that the awards are now available. Even […]

A Few Notes on UseR! 2014

July 26, 2014

It has been a month since the UseR! 2014 conference, and I'm probably the last one to write about it. UseR! is my favorite conference because it is technical and not too big. I have completely lost interest in big and broad conferences like JSM (...

library() vs require() in R

July 26, 2014

While I was sitting in a conference room at UseR! 2014, I started counting the number of times that require() was used in the presentations, and would rant about it after I counted to ten. With drums rolling, David won this little award (sorry, I did n...

Academic statisticians: there is no shame in developing statistical solutions that solve just one problem

July 25, 2014

I think that the main distinction between academic statisticians and those calling themselves data scientists is that the latter are very much willing to invest most of their time and energy into solving specific problems by analyzing specific data sets. … Continue reading →

“An Experience with a Registered Replication Project”

July 25, 2014

Anne Pier Salverda writes: I came across this blog entry, “An Experience with a Registered Replication Project,” and thought that you would find this interesting. It’s written by Simone Schnall, a social psychologist who is the first author of an oft-cited Psych Science(!) paper (“Cleanliness reduces the severity of moral judgments”) that a group of […] The post “An Experience with a Registered Replication Project” appeared first on Statistical Modeling,…

The top dog among jealous dogs

July 25, 2014

Is data visualization worth paying for? In some quarters, this may be a controversial question. If you are having doubts, just look at some examples of great visualization. This week, the NYT team brings us a wonderful example. The story...

Pat pat

July 25, 2014

This is probably akin to an exercise in self-pleasing, but I'll indulge in this anyway to celebrate the fact that our paper on the Bias in the Eurovision song contest voting (the last in a relatively long series of posts on this is here) has now over 4...

Interactive visualization of non-linear logistic regression decision boundaries with Shiny

July 24, 2014

(skip to the shiny app) Model building is very often an iterative process that involves multiple steps of choosing an algorithm and hyperparameters, evaluating that model / cross validation, and optimizing the hyperparameters. I find a great aid in this… Continue reading →