Return to the Comedy Hour: P-values vs posterior probabilities (1)

November 29, 2015

Some recent criticisms of statistical tests of significance have breathed brand new life into some very old howlers, many of which have been discussed on this blog. One variant, which I think returns to the scene every decade (for 50+ years?), takes a “disagreement on numbers” to show a problem with significance tests even from a “frequentist” perspective.  Since it’s […]

We got mooks

November 28, 2015

Columbia University’s Data Science Institute is releasing some mooks, and I’m part of it. I’ll first give the official announcement and then share some of my thoughts. The official announcement: The Data Science Institute at Columbia University is excited to announce the launch of its first online-education series, Data Science and Analytics in Context, on […] The post We got mooks appeared first on Statistical Modeling, Causal Inference, and Social…

You’ll never believe what this girl wrote in her diary (NSFW)

November 27, 2015

Arber Tasimi heard about our statistics diaries and decided to try it out in the psychology class he was teaching. The students liked his class but a couple of them pushed back against the diaries, describing the assignment as pointless or unhelpful in their learning. This made me think that it may be that a […] The post You’ll never believe what this girl wrote in her diary (NSFW) appeared…

“iPredict to close after Govt refuses anti-money laundering law exemption”

November 26, 2015

Richard Barker points us to an update on iPredict, the New Zealand political prediction market. From the news article by Hamish Rutherford: The site, run by Victoria University of Wellington’s commercialisation arm, VicLink, issued a statement to its website and on Twitter on Thursday. According to the iPredict statement, Associate Justice Minister Simon Bridges refused […] The post “iPredict to close after Govt refuses anti-money laundering law exemption” appeared first…

Boston Stan meetup 1 Dec

November 25, 2015

Here’s the announcement: Using Stan for variational inference, plus a couple lightning talks. Dustin Tran will give a talk on using Stan for variational inference, then we’ll have a couple lightning (5 minute-ish) talks on projects. David Sparks will talk, I will talk about some of my work and we’re looking for 1-2 more volunteers. […] The post Boston Stan meetup 1 Dec appeared first on Statistical Modeling, Causal Inference,…

A thanksgiving dplyr Rubik’s cube puzzle for you

November 25, 2015

Nick Carchedi is back visiting from DataCamp and for fun we came up with a dplyr Rubik's cube puzzle. Here is how it works. To solve the puzzle you have to make a 4 x 3 data frame that spells Thanksgiving like this: View the code on Gist. To solve the puzzle you need to pipe this

Gary McClelland agrees with me that dichotomizing continuous variables is a bad idea. He also thinks my suggestion of dividing a variable into 3 parts is also a mistake.

November 25, 2015

In response to some of the discussion that inspired yesterday’s post, Gary McClelland writes: I remain convinced that discretizing a continuous variable, especially for multiple regression, is the road to perdition. Here I explain my concerns. First, I don’t buy the motivation that discretized analyses are easier to explain to lay citizens and the press. […] The post Gary McClelland agrees with me that dichotomizing continuous variables is a bad…

3 YEARS AGO (NOVEMBER 2012): MEMORY LANE

November 25, 2015

MONTHLY MEMORY LANE: 3 years ago: November 2012. I mark in red three posts that seem most apt for general background on key issues in this blog.[1] Please check out others that didn’t make the “bright red cut”. If you’re interested in the Likelihood Principle, check “Blogging Birnbaum” and “Likelihood Links”. If you think P-values are hard to explain, see how […]

Even the tiniest error messages can indicate an invalid statistical analysis

November 25, 2015

The other day, I was reading in a data set in R, and the function indicated that there was a warning about a parsing error on one line. I went ahead with the analysis anyway, but that small parsing error kept bothering me. I thought it was just one lin...

Extracting elements from a matrix: rows, columns, submatrices, and indices

November 25, 2015

A matrix is a convenient way to store an array of numbers. However, often you need to extract certain elements from a matrix. The SAS/IML language supports two ways to extract elements: by using subscripts or by using indices. Use subscripts when you are extracting a rectangular portion of a […] The post Extracting elements from a matrix: rows, columns, submatrices, and indices appeared first on The DO Loop.

a programming bug with weird consequences

November 24, 2015

One student of mine coded by mistake an independent Metropolis-Hastings algorithm with too small a variance in the proposal when compared with the target variance. Here is the R code of this implementation: It produces outputs of the following shape which is quite amazing because of the small variance. The reason for the lengthy freezes […]
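
The R code from the post is not reproduced in this excerpt. As a stand-in, the following Python sketch (an illustration under assumptions, not the student's implementation) runs an independent Metropolis-Hastings sampler with a standard normal target and a zero-mean normal proposal whose standard deviation is much smaller than the target's. The target, the proposal scale of 0.3, and the helper names `independent_mh` and `longest_freeze` are all assumptions made for this example.

```python
import math
import random


def log_normal_pdf(x, sd):
    """Log density of a zero-mean normal with standard deviation sd."""
    return -0.5 * (x / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)


def independent_mh(n_iter, proposal_sd, target_sd=1.0, seed=0):
    """Independent Metropolis-Hastings: proposals ignore the current state.

    Target and proposal are both zero-mean normals (an assumption made
    for this sketch).
    """
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(n_iter):
        y = rng.gauss(0.0, proposal_sd)
        # Log importance weights target/proposal at proposed and current points.
        log_w_y = log_normal_pdf(y, target_sd) - log_normal_pdf(y, proposal_sd)
        log_w_x = log_normal_pdf(x, target_sd) - log_normal_pdf(x, proposal_sd)
        accept_prob = math.exp(min(0.0, log_w_y - log_w_x))
        if rng.random() < accept_prob:
            x = y
        chain.append(x)
    return chain


def longest_freeze(chain):
    """Length of the longest run of identical consecutive values."""
    longest = run = 1
    for prev, cur in zip(chain, chain[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


narrow = independent_mh(20000, proposal_sd=0.3)
print("longest freeze with narrow proposal:", longest_freeze(narrow))
```

Because the proposal ignores the current state, the acceptance probability is the ratio of the importance weights (target over proposal) at the proposed and current points. With a narrow proposal, a current value in the tail carries a very large weight, so nearly every proposal is rejected until another tail value is drawn. Comparing `longest_freeze` for `proposal_sd=0.3` and `proposal_sd=1.0` makes the contrast visible: when the proposal equals the target, every proposal is accepted.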

Internet use and religion, part four

November 24, 2015

[If you are jumping into the middle of this series, you might want to start with this article, which explains the methodological approach I am taking.] In the previous article, I presented preliminary results from a study of relationships between I...

Statistical Models That Support Design Thinking: Driver Analysis vs. Partial Correlation Networks

November 24, 2015

We have been talking about design thinking in marketing since Tim Brown's Harvard Business Review article in 2008. It might be easy for the data scientist to dismiss the approach as merely a type of brainstorming for new products or services. Yet, desi...

Fitting linear mixed models for QTL mapping

November 24, 2015

Linear mixed models (LMMs) have become widely used for dealing with population structure in human GWAS, and they’re becoming increasingly important for QTL mapping in model organisms, particularly for the analysis of advanced intercross lines (AIL), which often exhibit variation in the relationships among individuals. In my efforts on R/qtl2, a reimplementation of R/qtl to better […]

20 years of Data Science: from Music to Genomics

November 24, 2015

I finally got around to reading David Donoho's 50 Years of Data Science paper.  I highly recommend it. The following quote seems to summarize the sentiment that motivated the paper, as well as why it has resonated among academic statisticians: The statistics profession is caught at a confusing moment: the activities which preoccupied it over centuries are now

Beyond the median split: Splitting a predictor into 3 parts

November 24, 2015

Carol Nickerson pointed me to a series of papers in the journal Consumer Psychology, first one by Dawn Iacobucci et al. arguing in favor of the “median split” (replacing a continuous variable by a 0/1 variable split at the median) “to facilitate analytic ease and communication clarity,” then a response by Gary McClelland et al. […] The post Beyond the median split: Splitting a predictor into 3 parts appeared first…

Estimating the exponent of discrete power law data

November 24, 2015

Suppose you have data from a discrete power law with exponent α. That is, the probability of an outcome n is proportional to n^(-α). How can you recover α? A naive approach would be to gloss over the fact that you have discrete data and use the MLE (maximum likelihood estimator) for continuous data. That […]
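
As a rough illustration of an alternative to the continuous shortcut, here is a minimal Python sketch (not taken from the post) that maximizes the discrete likelihood directly. It assumes outcomes n = 1, 2, 3, …, so that the normalizing constant is the Riemann zeta function ζ(α). The function names, the truncation point in `zeta`, the search interval, and the sample data are all choices made for this example.

```python
import math


def zeta(s, cutoff=50):
    """Approximate the Riemann zeta function for s > 1.

    Sums the first cutoff - 1 terms exactly and adds an Euler-Maclaurin
    tail correction. The cutoff is an arbitrary choice for this sketch.
    """
    head = sum(n ** -s for n in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1)
            + 0.5 * cutoff ** -s
            + s * cutoff ** (-s - 1) / 12)
    return head + tail


def log_likelihood(alpha, data):
    """Log-likelihood of alpha when P(n) = n**(-alpha) / zeta(alpha)."""
    sum_logs = sum(math.log(n) for n in data)
    return -alpha * sum_logs - len(data) * math.log(zeta(alpha))


def fit_alpha(data, lo=1.01, hi=10.0, iterations=100):
    """Maximize the log-likelihood over [lo, hi] by ternary search.

    The log-likelihood is concave in alpha, so the search narrows in
    on the maximizer within the interval.
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, data) < log_likelihood(m2, data):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


observations = [1, 1, 1, 2, 1, 3, 1, 1, 2, 5]  # made-up data for illustration
alpha_hat = fit_alpha(observations)
print(f"estimated alpha: {alpha_hat:.3f}")
```

The search interval starts just above 1 because ζ(α) diverges at α = 1. If every observation equals 1, the likelihood keeps increasing with α and the estimate simply runs to the upper bound of the interval.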

Statbusters: please back up an extreme claim with numbers

November 23, 2015

In this week's Statbusters, my column with Andrew Gelman in the Daily Beast, we take note of Slate's recent rant about "wasteful" anti-smoking advertising, and demonstrate how to think about cost-benefit analysis. The key point is: if you are going to make an extreme claim, you better have some numbers to back it up. These numbers can be approximate, and based on (potentially dubious) Googled data. Not every analysis needs…

I already know who will be president in 2016 but I’m not telling

November 23, 2015

Nadia Hassan writes: One debate in political science right now concerns how the economy influences voters. Larry Bartels argues that Q14 and Q15 impact election outcomes the most. Doug Hibbs argues that all 4 years matter, with later growth being more important. Chris Wlezien claims that the first two years don’t influence elections but the […] The post I already know who will be president in 2016 but I’m not…

Efficiency in space usage leads to efficiency in comprehension

November 23, 2015

Consider the following two charts that illustrate the same data. (I deliberately took out the header text to make a point. The original chart came from the Wall Street Journal.) To me, the line chart gets to the point more...

On Bayesian DSGE Modeling with Hard and Soft Restrictions

November 23, 2015

A theory is essentially a restriction on a reduced form. It can be imposed directly (hard restrictions) or used as a prior mean in a more flexible Bayesian analysis (soft restrictions). The soft restriction approach -- "theory as a shrinkage directi...

Determine whether a SAS product is licensed

November 23, 2015

Sometimes you are writing a program that needs to find out whether a particular SAS product (like SAS/ETS, SAS/QC, or SAS/OR) is licensed. I was reminded of this fact when I wrote last week's blog post about how to create a map with PROC SGPLOT. Although the SGPLOT procedure is […] The post Determine whether a SAS product is licensed appeared first on The DO Loop.

Paper: The Connected Scatterplot for Presenting Paired Time Series

November 23, 2015

I’m very happy to finally be able to announce our paper on the connected scatterplot technique. It describes the technique, provides some historical perspective, and most of all looks into how easy to understand and engaging the technique actually is. The connected scatterplot isn’t really known in visualization, but has gotten some interest in journalism. …
