Gary Gorton has a fascinating new paper, "History and Economics of Safe Assets", which contains the quote of the week: "...almost all of human history can be written as the search for and the production of different forms of safe assets". No...

Paul Alper pointed me to this news article about an economist who got BUSTED for doing algebra on the plane. This dude was profiled by the lady sitting next to him who got suspicious of his incomprehensible formulas. I feel that way about a lot of econ research too, so I can see where she

Jona Sassenhagen writes: Here is a paper ***, in case you, errrrr, have run out of other things to blog about … I took a look and replied: Wow—what a horrible paper. Really ignorant. Probably best for me to just ignore it! The post Drive-...

Just a "heads-up." I've been editing a three-part series Nina Zumel is writing on some of the pitfalls of improperly applied principal components analysis/regression and how to avoid them (we are using the plural spelling, following Everitt's The Cambridge Dictionary of Statistics). The series is looking absolutely fantastic and I think

Someone sent me this question: As a social and political science expert, you analyze data related to everything from public health and clinical research to college football. Considering how adaptable analytics expertise is, what kinds of careers are available to one with this skillset? In which industries are data scientists and analysts in particular demand? What

There's nothing wrong with Meehl. He's great. The puzzle of Paul Meehl is that everything we're saying now, all this stuff about the problems with Psychological Science and PPNAS and Ted talks and all that, Paul Meehl was saying 50 years ago. And it was no secret. So how is it that all this was

A couple days ago I received an email: I'm a reporter for *** [newspaper], currently looking into a fun article about a recent study, and my old professor *** recommended I get in touch with you to see if you would give me a comment on the statistics in the study. It's a bit of

OK, here's the story. A couple days ago, regarding the now-notorious PPNAS article, "Physical and situational inequality on airplanes predicts air rage," I wrote: NPR will love this paper. It directly targets their demographic of people who are rich enough to fly a lot but not rich enough to fly first class, and who think

The riddle from The Riddler this week is about finding an undirected graph with N nodes and no isolated node such that the number of nodes with more connections than the average of their neighbours is maximal. A representation of a connected graph is through a matrix X of zeros and ones, on which one […]
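As a rough illustration of the quantity the riddle asks to maximise, here is a minimal Python sketch that takes a symmetric 0/1 adjacency matrix X and counts the nodes with more connections than the average of their neighbours. The function name and the star-graph example are assumptions made for illustration; they are not taken from the original post.

```python
import numpy as np


def count_above_neighbour_average(X):
    """Count nodes whose degree exceeds the mean degree of their neighbours.

    X is assumed to be a symmetric 0/1 adjacency matrix with a zero diagonal
    (an undirected graph without self-loops) and no isolated node.
    """
    X = np.asarray(X, dtype=int)
    if X.ndim != 2 or X.shape[0] != X.shape[1]:
        raise ValueError("X must be a square matrix")
    if not np.array_equal(X, X.T) or np.diag(X).any():
        raise ValueError("X must be symmetric with a zero diagonal")

    degree = X.sum(axis=1)
    if (degree == 0).any():
        raise ValueError("the graph must not contain an isolated node")

    # Sum of the neighbours' degrees for every node.
    neighbour_degree_sum = X @ degree

    # degree > neighbour_degree_sum / degree, rewritten without division
    # so that the comparison stays in exact integer arithmetic.
    return int(np.sum(degree * degree > neighbour_degree_sum))


# Hypothetical example: a star graph on 4 nodes (node 0 is the centre).
star = np.zeros((4, 4), dtype=int)
star[0, 1:] = 1
star[1:, 0] = 1
print(count_above_neighbour_average(star))  # 1 (only the centre qualifies)
```

A search over graphs with N nodes could call this function as its objective, though the sketch says nothing about how that search should be organised.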

The worker IDs Amazon's Mechanical Turk gives you may look pretty random and anonymous, but they can reveal personally-identifiable information. They need to be removed from datasets, especially when they are shared or published. Like many things, I learned this the hard way. Or I would have, had Steve Haroz not caught it in the data

vtreat cross frames John Mount, Nina Zumel 2016-05-05 As a follow on to "On Nested Models" we work R examples demonstrating "cross validated training frames" (or "cross frames") in vtreat. Consider the following data frame. The outcome only depends on the "good" variables, not on the (high degree of freedom) "bad" variables. Modeling such a

In an otherwise pointless comment thread the other day, Dan Lakeland contributed the following gem: A p-value is the probability of seeing data as extreme or more extreme than the result, under the assumption that the result was produced by a specific random number generator (called the null hypothesis). I could care less about p-values
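To make the "random number generator" reading of the definition concrete, here is a minimal Python sketch that estimates a p-value by simulation. The coin-flip example, the function name, and the choice of "distance from 5 heads" as the measure of extremeness are all assumptions made for illustration; they are not part of the quoted comment.

```python
import numpy as np


def simulated_p_value(observed, simulate_null, extremeness,
                      n_sims=200_000, seed=0):
    """Estimate P(data as extreme or more extreme than observed | null RNG)."""
    rng = np.random.default_rng(seed)
    draws = simulate_null(rng, n_sims)  # data produced by the null generator
    # Fraction of simulated data sets at least as extreme as the observed one.
    return float(np.mean(extremeness(draws) >= extremeness(observed)))


# Hypothetical example: 9 heads in 10 flips, null hypothesis = fair coin.
p = simulated_p_value(
    observed=9,
    simulate_null=lambda rng, n: rng.binomial(10, 0.5, size=n),
    extremeness=lambda heads: np.abs(heads - 5),
)
print(p)  # close to the exact two-sided value 22/1024
```

Swapping in a different simulate_null changes the "specific random number generator" being assumed, which is the point of the definition: the p-value is a statement about that generator, not about the hypothesis of interest.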

Attention conservation notice: An academic promoting his own talk. Even if you can get past that, only of interest if you (1) care about statistical methods for comparing network data sets, and (2) will be in Seattle on Friday. Since the coin came ...

Here's to the NBER's ongoing Conference on Research in Income and Wealth (CRIW), unsung hero, home of down-and-dirty measurement mavens since 1935. Yes, since 1935! Check out Chuck Holten's fascinating CRIW description in the NBER ...

What DataUsa is doing could be – I guess – the next step in the evolution of Open Government Data websites. It's the step from offering file downloads to presenting data (and not files) interactively. And it's a kind of presentation many official statistical websites would surely be proud of. César A. Hidalgo from MIT discusses

Yesterday, in the context of a post about news media puffery of the latest three-headed monstrosity to come out of PPNAS, I promised you a solution. I wrote: OK, fine, you might say. But what's a reporter to do? They can't always call Andrew Gelman at Columbia University for a quote, and they typically won't

In a previous post I showed how to download, install, and use packages in SAS/IML 14.1. SAS/IML packages incorporate source files, documentation, data sets, and sample programs into a ZIP file. The PACKAGE statement enables you to install, uninstall, and manage packages. You can load functions and data into your

Attention conservation notice: I have no taste. Guido W. Imbens and Donald B. Rubin, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction While I found less to disagree with about the over-all approach than I anticipated...