## Data science for executives and managers

October 22, 2016

Nina Zumel recently announced upcoming speaking appearances. I want to promote the upcoming sessions at ODSC West 2016 (11:15am-1:00pm on Friday November 4th, or 3:00pm-4:30pm on Saturday November 5th) and invite executives, managers, and other data science consumers to attend. We assume most of the Win-Vector blog audience is made up of practitioners (who we hope …

## Posterior predictive distribution for multiple linear regression

October 22, 2016

Suppose you've done a (robust) Bayesian multiple linear regression, and now you want the posterior distribution on the predicted value of $$y$$ for some probe value of $$\langle x_1,x_2,x_3, ... \rangle$$. That is, not the posterior distribution on t...
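As a rough illustration of the idea (not the post's own code, which is not shown in this excerpt), the posterior predictive distribution at a probe point can be simulated by pushing each posterior draw of the parameters through the regression equation and then adding one draw of noise. The sketch below assumes normal noise for simplicity rather than a robust likelihood, and the `fake_draws` list is a made-up stand-in for real MCMC output; the function name `posterior_predictive` is hypothetical.

```python
import random


def posterior_predictive(draws, x_probe, rng):
    """Simulate one y value per posterior draw at the probe point.

    Each draw is a tuple (intercept, coefficients, sigma).
    Assumption: normally distributed noise, not a robust (t) likelihood.
    """
    samples = []
    for intercept, coefs, sigma in draws:
        # Mean of y for this draw of the parameters
        mu = intercept + sum(b * x for b, x in zip(coefs, x_probe))
        # Adding noise turns a draw of the mean into a draw of a predicted y
        samples.append(rng.gauss(mu, sigma))
    return samples


rng = random.Random(0)

# Made-up stand-in for MCMC output: 4000 draws of (intercept, [b1, b2], sigma)
fake_draws = [
    (
        rng.gauss(1.0, 0.1),
        [rng.gauss(2.0, 0.1), rng.gauss(-0.5, 0.1)],
        abs(rng.gauss(1.0, 0.05)),
    )
    for _ in range(4000)
]

y_sim = sorted(posterior_predictive(fake_draws, [1.0, 2.0], rng))

# Equal-tailed 95% interval read off the sorted simulated values
lo = y_sim[int(0.025 * len(y_sim))]
hi = y_sim[int(0.975 * len(y_sim)) - 1]
print(f"95% posterior predictive interval: ({lo:.2f}, {hi:.2f})")
```

With real output, `fake_draws` would be replaced by the sampler's joint draws, keeping each row's parameters together so that their posterior correlations are preserved.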

## LOD MOOC

October 22, 2016

Massive Open Online Courses (MOOCs) are available worldwide and cover a huge range of topics, including Linked Open Data (LOD). They are an easy way into the semantic web. Two examples: HPI: The Hasso Plattner Institute, Potsdam, has for some years offered a course in Linked Data Engineering with a certificate. I did it some years ago and …

## Another failed replication of power pose

October 22, 2016

Someone sent me this recent article, “Embodying Power: A Preregistered Replication and Extension of the Power Pose Effect,” by Katie Garrison, David Tang, and Brandon Schmeichel. Unsurprisingly (given that the experiment was preregistered), the authors found no evidence for any effect of power pose. The Garrison et al. paper is reasonable enough, but for my […] The post Another failed replication of power pose appeared first on Statistical Modeling, Causal…

## Tourism forecasting competition data as an R package

October 22, 2016

The data used in the tourism forecasting competition, discussed in Athanasopoulos et al (2011), have been made available in the Tcomp package for R. The objects are of the same format as for Mcomp package containing data from the M1 and M3 competitions. Thanks to Peter Ellis for putting the package together. He has also […]

## Why is my cat orange?

October 21, 2016

One of the students in my Bayesian statistics class, Mafalda Borges, came up with an excellent new Bayes theorem problem. Here's my paraphrase: About 3/4 of orange cats are male. If my cat is orange, what is the probability that his mother w...

## Practical Bayesian model evaluation in Stan and rstanarm using leave-one-out cross-validation

October 21, 2016

Our (Aki, Andrew and Jonah) paper Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC was recently published in Statistics and Computing. In the paper we show: why it’s better to use LOO instead of WAIC for model evaluation; how to compute LOO quickly and reliably using the full posterior sample; how Pareto smoothing importance […] The post Practical Bayesian model evaluation in Stan and rstanarm using leave-one-out cross-validation appeared…

## Authors of AJPS paper find that the signs on their coefficients were reversed. But they don’t care: in their words, “None of our papers actually give a damn about whether it’s plus or minus.” All right, then!

October 21, 2016

Avi Adler writes: I hit you up on twitter, and you probably saw this already, but you may enjoy this. I’m not actually on twitter but I do read email, so I followed the link and read this post by Steven Hayward: EPIC CORRECTION OF THE DECADE Hoo-wee, the New York Times will really have […] The post Authors of AJPS paper find that the signs on their coefficients were…

## Reader’s guide to the power pose controversy 2

October 21, 2016

Yesterday, I started a series of posts covering the "power pose" research controversy. The plan is as follows: Key Idea 1: Peer Review, Manuscripts, Pop Science and TED Talks; Key Idea 2: P < 0.05, P-hacking, Replication Studies, Pre-registration; Key Idea 3: Negative Studies, and the File Drawer (Today); Key Idea 4: Degrees of Freedom, and the Garden of Forking Paths; Key Idea 5: Sample Size. Here is a quick…

## Notes from the Kölner R meeting, 14 October 2016

October 21, 2016

Last Friday the Cologne R user group came together for two talks and a quiz at Eye/o, the company behind Adblock Plus, in Köln-Ehrenfeld. Eye/o were a great host, offering nibbles and drinks to warm up the event and pizza at the end. Cologne R user mee...

## Avoiding model selection in Bayesian social research

October 21, 2016

One of my favorites, from 1995. Don Rubin and I argue with Adrian Raftery. Here’s how we begin: Raftery’s paper addresses two important problems in the statistical analysis of social science data: (1) choosing an appropriate model when so much data are available that standard P-values reject all parsimonious models; and (2) making estimates and […] The post Avoiding model selection in Bayesian social research appeared first on Statistical Modeling,…

## Polls

October 20, 2016

A journalist sent me a bunch of questions regarding problems with polls. Here was my reply: In answer to your question, no, the polls in Brexit did not fail. They were pretty good. See here and here. The polls also successfully estimated Donald Trump’s success in the Republican primary election. I think that poll responses […] The post Polls appeared first on Statistical Modeling, Causal Inference, and Social Science.

## A Treemap Chart Pie

October 20, 2016

After his recent early chart pie attempts, Ben Shneiderman has now achieved the ultimate in chart pie baking: a treemap chart pie.  In case you're wondering about the significance of this momentous achievement: treemaps were Ben's idea. Just as before, I'll just let Ben explain this one. I can't top his puns, anyway. HI Robert, I […]

## We have a ways to go in communicating the replication crisis

October 20, 2016

I happened to come across this old post today with this amazing, amazing quote from a Harvard University public relations writer: The replication rate in psychology is quite high—indeed, it is statistically indistinguishable from 100%. This came up in the context of a paper by Daniel Gilbert et al. defending the reputation of social psychology, […] The post We have a ways to go in communicating the replication crisis appeared…

## Reader’s guide to the power pose controversy 1

October 20, 2016

I recently covered the power pose research controversy, ignited by an inflammatory letter by Susan Fiske (link). Dana Carney, one of the coauthors of the original power pose study, courageously came forward to disown the research, and explained the reasons why she no longer trusts the result. Here is her mea culpa. Her co-author, Amy Cuddy, then went to New York Magazine to publish her own corrective, claiming that the…

## Annotated Facets with ggplot2

October 20, 2016

I was recently asked to do a panel of grouped boxplots of a continuous variable, with each panel representing a categorical grouping variable. This seems easy enough with ggplot2 and the facet_wrap function, but then my collaborator wanted p-values on the graphs! This post is my approach to the problem. First of all, one caveat. I’m a […]

## Distributed Masochism as a Pedagogical Model

October 20, 2016

Editor’s note: This is a guest post by Sean Kross. Sean is a software developer in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. Sean has contributed to several of our specializations including Data Science...

## a grim knight [cont’d]

October 19, 2016

As discussed in the previous entry, there are two interpretations of this question from The Riddler, depending on what constitutes a path: “…how long is the longest path a knight can travel on a standard 8-by-8 board without letting the path intersect itself?” As a (terrible) chess player, I would opt for the version on […]

## Mathematica, now with Stan

October 19, 2016

Vincent Picaud developed a Mathematica interface to Stan: MathematicaStan. You can find everything you need to get started by following the link above. If you have questions, comments, or suggestions, please let us know through the Stan user’s group or the GitHub issue tracker. MathematicaStan interfaces to Stan through a CmdStan process. Stan programs are […] The post Mathematica, now with Stan appeared first on Statistical Modeling, Causal Inference, and…

## Interim analysis, futility monitoring, and predictive probability

October 19, 2016

An interim analysis of a clinical trial is an unusual analysis. At the end of the trial you want to estimate how well some treatment X works. For example, you want to know how likely it is that treatment X works better than the control treatment Y. But in the middle of the trial you want to know something more subtle. It’s […]
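To make the notion of predictive probability concrete, here is a minimal sketch under strong simplifying assumptions that are mine, not the post's: a single-arm trial with a binary outcome, a Beta prior on the response rate, and a trial that counts as a success if the final number of responders reaches a fixed threshold. The function names and the example numbers are hypothetical. The predictive probability is then the chance, given the data so far, that the remaining patients produce enough responders, which follows a beta-binomial distribution.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, m, a, b):
    """P(k successes in m future patients) when the rate is Beta(a, b)."""
    log_choose = lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)
    return exp(log_choose + log_beta(a + k, b + m - k) - log_beta(a, b))


def predictive_probability(successes, n_so_far, n_total, needed,
                           prior_a=1.0, prior_b=1.0):
    """Probability that the final success count reaches `needed`.

    Assumption: single-arm binary outcome with a Beta(prior_a, prior_b) prior.
    """
    # Posterior for the response rate after the interim data
    a = prior_a + successes
    b = prior_b + n_so_far - successes
    remaining = n_total - n_so_far
    # Sum over every future outcome that would meet the success threshold
    return sum(
        beta_binomial_pmf(k, remaining, a, b)
        for k in range(remaining + 1)
        if successes + k >= needed
    )


# Hypothetical interim look: 5 responders in 10 patients, 20 planned,
# success declared if at least 12 of the 20 respond.
pp = predictive_probability(successes=5, n_so_far=10, n_total=20, needed=12)
print(f"Predictive probability of success: {pp:.3f}")
```

A futility rule in this setting would stop the trial when `pp` falls below some pre-specified cutoff; the choice of cutoff is a design decision and is not taken from the post.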

## The Psychological Science stereotype paradox

October 19, 2016

Lee Jussim, Jarret Crawford, and Rachel Rubinstein just published a paper in Psychological Science that begins, Are stereotypes accurate or inaccurate? We summarize evidence that stereotype accuracy is one of the largest and most replicable findings in social psychology. We address controversies in this literature, including the long-standing and continuing but unjustified emphasis on stereotype […] The post The Psychological Science stereotype paradox appeared first on Statistical Modeling, Causal Inference,…

## Loess regression in SAS/IML

October 19, 2016

A previous post discusses how the loess regression algorithm is implemented in SAS. The LOESS procedure in SAS/STAT software provides the data analyst with options to control the loess algorithm and fit nonparametric smoothing curves through points in a scatter plot. Although PROC LOESS satisfies 99.99% of SAS users who […] The post Loess regression in SAS/IML appeared first on The DO Loop.

## Deep learning in the cloud with MXNet

October 19, 2016

Last Friday, together with Przemysław Szufel and Wit Jakuczun, we gave a live demo introducing deep learning at the Digital Champions conference. The objective of the workshop was to show how to build a simple predictive model using MXNet libra...