“Bayes factor”: where the term came from, and some references to why I generally hate it

July 21, 2017

Someone asked: Do you know when this term was coined or by whom? Kass and Raftery’s use of the term as the title of their 1995 paper suggests that it was still novel then, but I have not noticed in the paper any information about where it started. I replied: According to Etz and Wagenmakers […]

Read more »

Surprising result when exploring Rcpp gallery

July 21, 2017

I’m starting to incorporate more Rcpp in my R work, and so decided to spend some time exploring the Rcpp Gallery. One example by John Merrill caught my eye. He provides a C++ solution to transforming a list of lists into a data frame, and shows impressive speed savings compared to as.data.frame. This got me thinking about how […]

Read more »

Quirks about running Rcpp on Windows through RStudio

July 20, 2017

This is a quick note about some tribulations I had running Rcpp (v. 0.12.12) code through RStudio (v. 1.0.143) on a Windows 7 box running R (v. 3.3.2). I also have RTools v. 3.4 installed. I fully admit that this may very well be specific to my […]

Read more »

How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?

July 20, 2017

Someone who wishes to remain anonymous writes in with a story: Linking to a new paper by Jorge Luis García, James J. Heckman, and Anna L. Ziff, the economist Sue Dynarski makes this “joke” on Facebook—or maybe it’s not a joke: How does one adjust standard errors to account for the fact that N of […]

Read more »

Dragon Trainer rich mathematical task

July 20, 2017

I love rich mathematical tasks. Here is one for all levels of schooling. What do you think? Background to rich tasks: A rich task is an open-ended task that students can engage with at multiple levels. I use the following …

Read more »

Make Your Plans for Stans (-s + Con)

July 19, 2017

This post is by Mike. A friendly reminder that registration is open for StanCon 2018, which will take place over three days, from Wednesday January 10, 2018 to Friday January 12, 2018, at the beautiful Asilomar Conference Grounds in Pacific Grove, California. Detailed information about registration and accommodation at Asilomar, including fees and instructions, can be found on […]

Read more »

Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

July 19, 2017

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version […]

Read more »

His concern is that the authors don’t control for the position of games within a season.

July 19, 2017

Chris Glynn wrote last year: I read your blog post about middle brow literature and PPNAS the other day. Today, a friend forwarded me this article in The Atlantic that (in my opinion) is another example of what you’ve recently been talking about. The research in question is focused on Major League Baseball and the […]

Read more »

The imprecision of data, subway edition

July 19, 2017

Kaiser Fung, founder of Principal Analytics Prep, draws practical lessons from a CUNY advertisement in the subway train

Read more »

A quantile definition for skewness

July 19, 2017

Skewness is a measure of the asymmetry of a univariate distribution. I have previously shown how to compute the skewness for data distributions in SAS. The previous article computes Pearson's definition of skewness, which is based on the standardized third central moment of the data. Moment-based statistics are sensitive to [...]

Read more »

seplyr update

July 19, 2017

The development version of my new R package seplyr is performing in practical applications with dplyr 0.7.* much better than even I (the seplyr package author) expected. I think I have hit a very good set of trade-offs, and I have now spent significant time creating documentation and examples. I wish there had been such …

Read more »

My unfunded HHMI teaching professors proposal

July 19, 2017

A little over a year ago I saw a request from the Howard Hughes Medical Institute for proposals focused on undergraduate teaching. I decided to apply for this grant since it combines all of the things I’m interested in: teaching, education resear...

Read more »

Animating a spinner using ggplot2 and ImageMagick

July 18, 2017

It’s Sunday, and I [Bob] am just sitting on the couch peacefully ggplotting to illustrate basic sample spaces using spinners (a trick I’m borrowing from Jim Albert’s book Curve Ball). There’s an underlying continuous outcome (i.e., where the spinner lands) and a quantization into a number of regions to produce a discrete outcome (e.g., “success” […]

Read more »

“The ‘Will & Grace’ Conjecture That Won’t Die” and other stories from the blogroll

July 18, 2017

From sociologist Jay Livingston: The “Will & Grace” Conjecture That Won’t Die
From sociologist David Weakliem: Why does Trump try to implement the unpopular ideas he’s proposed, and not the popular ideas?
History professor who wrote award-winning book about 1970-era crime, is misinformed about the history of 1970s-era crime
“West Virginia, which was a lock […]

Read more »

This one takes time to make, takes even more time to read

July 18, 2017

Kaiser Fung, creator of Junk Charts and Principal Analytics Prep, explains why this Wired chart about Netflix viewing behavior is so hard to read, and offers an alternative focusing on a particular insight about the data

Read more »

How to run a course (if you’re me)

July 17, 2017

Last summer, I and my trusty henchpeople from the Department of Politics ran an intensive six-week summer course for incoming freshmen on data science (‘POL245’, for locals). This post sketches out how I think course infrastructure should work, and provides some practical details of how we arranged things. Most of our structures worked pretty …

Read more »

How to design future studies of systemic exercise intolerance disease (chronic fatigue syndrome)?

July 17, 2017

Someone named Ramsey writes on behalf of a self-managed support community of 100+ systemic exercise intolerance disease (SEID) patients. He read my recent article on the topic and had a question regarding the following excerpt: For conditions like S.E.I.D., then, the better approach may be to gather data from people suffering “in the wild,” combining […]

Read more »

Should we continue not to trust the Turk? Another reminder of the importance of measurement

July 17, 2017

From 2013: Don’t trust the Turk. From 2017 (link from Kevin Lewis), from Jesse Chandler and Gabriele Paolacci: The Internet has enabled recruitment of large samples with specific characteristics. However, when researchers rely on participant self-report to determine eligibility, data quality depends on participant honesty. Across four studies on Amazon Mechanical Turk, we show that […]

Read more »

3 ways to visualize prediction regions for classification problems

July 17, 2017

An important problem in machine learning is the "classification problem." In this supervised learning problem, you build a statistical model that predicts a set of categorical outcomes (responses) based on a set of input features (explanatory variables). You do this by training the model on data for which the outcomes [...]

Read more »

They want help designing a crowdsourcing data analysis project

July 16, 2017

Michael Feldman writes: My collaborators and myself are doing research where we try to understand the reasons for the variability in data analysis (“the garden of forking paths”). Our goal is to understand the reasons why scientists make different decisions regarding their analyses and in doing so reach different results. In a project called “Crowdsourcing […]

Read more »

Graphs as comparisons: A case study

July 16, 2017

Above is a pair of graphs from a 2015 paper by Alison Gopnik, Thomas Griffiths, and Christopher Lucas. It takes up half a page in the journal, Current Directions in Psychological Science. I think we can do better. First, what’s wrong with the above graphs? We could start with the details: As a reader, I […]

Read more »

dplyr 0.7 Made Simpler

July 15, 2017

I have been writing a lot (too much) on the R topics dplyr/rlang/tidyeval lately. The reason is: major changes were recently announced. If you are going to use dplyr well and correctly going forward you may need to understand some of the new issues (if you don’t use dplyr you can safely skip all of …

Read more »

Hey—here are some tools in R and Stan to designing more effective clinical trials! How cool is that?

July 15, 2017

In statistical work, design and data analysis are often considered separately. Sometimes we do all sorts of modeling and planning in the design stage, only to analyze data using simple comparisons. Other times, we design our studies casually, even thoughtlessly, and then try to salvage what we can using elaborate data analyses. It would be […]

Read more »

