Adan Becerra writes to Tyler VanderWeele: I have a question about your paper “Mediation analysis for a survival outcome with time-varying exposures, mediators, and confounders” that I was hoping that you could help my colleague (Julia Ward) and me with. We are currently using Medicare claims data to evaluate the following general mediation among dialysis […]

## Replicating a Linear Model

For a few of my commercial projects I have been in the seemingly strange place of being asked to port a linear model from one data science system to another. Now I try to emphasize that it is better going forward to port procedures and build new models with training data. But sometimes that is not …

## O’Bayes 19/4

Last talks of the conference! With Rui Paulo (along with Gonzalo Garcia-Donato) considering the special case of factors when doing variable selection. Which is an interesting question that I had never considered, as at best I would remove all levels or keep them all. Except that there may be misspecification in the factors as for […]

## Journalistic stunt with Emacs

Emacs has been called a text editor with ambitions of being an operating system, and some people semi-seriously refer to it as their operating system. Emacs does not want to be an operating system per se, but it is certainly ambitious. It can be a shell, a web browser, an email client, a calculator, a […]

## How to read (in quantitative social science). And by implication, how to write.

## O’Bayes 19/3.5

Among the posters at the second poster session yesterday night, one by Judith ter Schure visually standing out by following the #betterposter design suggested by Mike Morrison a few months ago. Design on which I have ambivalent feelings. On the one hand, reducing the material on a poster is generally a good idea as […]

## This is a great example for a statistics class, or a class on survey sampling, or a political science class

Under the heading, “Latino approval of Donald Trump,” Tyler Cowen writes: From a recent NPR/PBS poll: African-American approval: 11%; White approval: 40%; Latino approval: 50%. He gets 136 comments, many of which reveal a stunning ignorance of polling. For example, several commenters seem to think that a poll sponsored by National Public Radio is a […]

## Miscreant’s Way

We went to Peter Luger then took the train back . . . Walking through Williamsburg, everyone looked like a Daniel Clowes character.

## O’Bayes 19/3

Nancy Reid gave the first talk of the [Canada] day, in an impressive comparison of all approaches in statistics that involve a distribution of sorts on the parameter, connected with the presentation she gave at BFF4 in Harvard two years ago, including safe Bayes options this time. This was related to several (most?) of the […]

## Notes on computing hash functions

A secure hash function maps a file to a string of bits in a way that is hard to reverse. Ideally such a function has three properties: pre-image resistance, collision resistance, and second pre-image resistance. Pre-image resistance means that starting from the hash value, it is very difficult to infer what led to that output; it […]
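As a rough illustration of the mapping described above, here is a minimal Python sketch using the standard library's `hashlib`. The choice of SHA-256 and the helper name `sha256_hex` are assumptions made for this example, not something the excerpt specifies.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string (hypothetical helper)."""
    return hashlib.sha256(data).hexdigest()


# Two inputs that differ in a single character
a = sha256_hex(b"hello world")
b = sha256_hex(b"hello worle")

print(a)
print(b)
print(len(a) * 4)  # digest length in bits (each hex character encodes 4 bits)
```

The two digests have the same fixed length but share no obvious structure, which gives some intuition for why working backwards from a digest to its input is hard.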

## On deck through the end of 2019

## O’Bayes 19/2

One talk on Day 2 of O’Bayes 2019 was by Ryan Martin on data dependent priors (or “priors”). Which I have already discussed in this blog. Including the notion of a Gibbs posterior about quantities that “are not always defined through a model” [which is debatable if one sees it like part of a semi-parametric […]

## What if that regression-discontinuity paper had only reported local linear model results, and with no graph?

We had an interesting discussion the other day regarding a regression discontinuity disaster. In my post I shone a light on this fitted model: Most of the commenters seemed to understand the concern with these graphs, that the upward slopes in the curves directly contribute to the estimated negative value at the discontinuity leading to […]

## O’Bayes 19/1 [snapshots]

Although the tutorials of O’Bayes 2019 of yesterday were poorly attended, despite being great entries into objective Bayesian model choice, recent advances in MCMC methodology, and the multiple layers of BART, for which I have to blame myself for sticking the beginning of O’Bayes too closely to the end of BNP as only the […]

## My Favorite data.table Feature

My favorite R data.table feature is the “by” grouping notation when combined with the := notation.

Let’s take a look at this powerful notation.

First, let’s build an example data.frame.

d <- wrapr::build_frame(

"gr…

## It’s a lot of pressure to write a book!

Regression and Other Stories is almost done, and I was spending a couple hours going through it starting from page 1, cleaning up imprecise phrasings and confusing points. . . . One thing that’s hard about writing a book is that there are so many places you can go wrong. A 500-page book contains something […]

## All I need is time, a moment that is mine, while I’m in between

You’re an ordinary boy and that’s the way I like it – Magic Dirt Look. I’ll say something now, so it’s off my chest. I hate order statistics. I loathe them. I detest them. I wish them nothing but ill and strife. They are just awful. And I’ve spent the last god only knows how long […]

## running after my plane

A bit of a hectic trip to Abidjan last Sunday, starting from Caen in the early morning where I was supporting my daughter, wife, mother, and mother-in-law for the annual Rochambelle women-only 5k race on the previous evening! With my daughter managing a fantastic 52nd position and ending up first of her category! As I […]

## Racism is a framework, not a theory

A while ago we had a discussion about racism, in the context of a review of a recent book by science reporter Nicholas Wade that attributed all sorts of social changes and differences between societies to genetics. There is no point in repeating all this, but I did want to bring up here an issue that […]