## LOST CAUSES IN STATISTICS I: Finite Additivity

June 30, 2013

I decided that I'll write an occasional post about lost causes in statistics. (The title is motivated by Streater (2007).) Today's post is about finitely additive probability (FAP). Recall how we usually define probability. We start with a sample space and a σ-algebra of events. A real-valued …

## R sucks

June 29, 2013

I was trying to make some new graphs using 5-year-old R code and I got all these problems because I was reading in files with variable names such as "co.fipsid" and now R is automatically changing them to "co_fipsid". Or maybe the names had underbars all along, and the old R had changed them into

## Going negative

June 29, 2013

Troels Ring writes: I have measured total phosphorus, TP, on a number of dialysis patients, and also measured conventional phosphate, Pi. Now P is exchanged with the environment as Pi, so in principle a correlation between TP and Pi could perhaps be expected. I'm really most interested in the fraction of TP which is not

## Descending Text in Righthand Margin of R Graphics à la mtext

June 29, 2013

There was an R-help thread in January regarding text in the righthand margin of an R graphic, where the text should be rendered in reading order from top to bottom. The base R function mtext is used to plot text in the margin. But mtext is only able to render text from left to right […]

## Econ coauthorship update

June 28, 2013

The other day I posted some remarks on Stan Liebowitz's analysis of coauthorship in economics. Liebowitz followed up with some more thoughts: I [Liebowitz] am not arguing for an increase or decrease in coauthorship, per se. I would prefer an efficient amount of coauthorship, whatever that is, and certainly it will vary by paper and

## Testing function arguments in GNU R

June 28, 2013

I recently read a nice post on ensuring that proper arguments are passed to a function using the GNU R class system. However, I often need a more lightweight solution to repetitive function argument testing. The alternative idea is to test function arg...

## The weirdest thing about the AJPH story

June 27, 2013

Earlier today I posted a weird email that began with "You are receiving this notice because you have published a paper with the American Journal of Public Health within the last few years" and continued with a sleazy attempt to squeeze \$1000 out of me so that an article that I sent them for free

## What is the Best Way to Analyze Data?

June 27, 2013

One topic I've been thinking about recently is the extent to which data analysis is an art versus a science. In my thinking about art and science, I rely on Don Knuth's distinction, from his 1974 lecture "Computer Programming as an …

## Huh?

June 27, 2013

I received the following bizarre email: Apr 26, 2013 Dear Andrew Gelman You are receiving this notice because you have published a paper with the American Journal of Public Health within the last few years. Currently, content on the Journal is closed access for the first 2 years after publication, and then freely accessible thereafter.

## When a chart does nothing for the story

June 27, 2013

There is some banter on Twitter (@thewhyaxis, @jschwabish, @tealtan) about a chart that appeared in The Atlantic on "Pixar's Sad Decline--in One Chart". It's a bit horrible but not the worst chart ever. The most offensive...

## R Package Versioning

June 27, 2013

This should be what it feels like to bump the major version of your software: For me, the main reason for package versioning is to indicate the (slight or significant) differences among different versions of the same package, otherwise we can keep o...

## Why I am not a “dualist” in the sense of Sander Greenland

June 27, 2013

This post picks up, and continues, an exchange that began with comments on my June 14 blogpost (between Sander Greenland, Nicole Jinn, and me). My new response is at the end. The concern is how to expose and ideally avoid some of the well-known flaws and foibles in statistical inference, thanks to gaps between […]

## MCMSki IV, Jan. 6-8, 2014, Chamonix (news #6)

June 26, 2013

More news about MCMSki IV: First, the 9 invited and the 16 contributed sessions are about to be set into the program by the scientific committee. It should appear any time now: stay tuned. We are just debating where to put the roundtable (in Camelot, of course!) Second, after looking a wee bit around for […]

June 26, 2013

UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1m1whzU. PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. Link to the code for the analysis...

## Don’t buy Bayesian Data Analysis . . .

June 26, 2013

. . . yet! BDA3 is coming soon, and it's bigger and badder than ever. It's got Gaussian processes and weakly informative priors and HMC and VB and EP and Stan. It's got all-new material on WAIC and several new chapters on nonparametrics. It's got the birthday data! I'll have some future posts with more

## Art from Data

June 26, 2013

There's a nice piece by Mark Hansen about data-driven aesthetics in the New York Times special section on big data. From a speedometer to a weather map to a stock chart, we routinely interpret and act on data displayed visually.

## Doing Bayesian Data Analysis at Jacob Bernoulli’s grave

June 26, 2013

The book visited the grave of Jacob Bernoulli (1655-1705) in Basel, Switzerland, as shown in these photos taken by Dr. Benjamin Scheibehenne. Jacob Bernoulli pre-dated Bayes (1701-1761), of course, but Bernoulli established foundational concepts and th...

## How to color clusters in a dendrogram

June 26, 2013

The CLUSTER procedure in SAS/STAT software creates a dendrogram automatically. The black-and-white dendrogram is nice, but plain. A SAS customer wanted to know whether it is possible to add color to the dendrogram to emphasize certain clusters. For example, the plot at the left emphasizes a four-cluster scenario for clustering [...]

## Future ISFs

June 26, 2013

The next few locations for the International Symposium on Forecasting have been announced: 2014: Rotterdam, The Netherlands; 2015: Riverside, California, USA; 2016: Santander, Spain; 2017: Cairns, Australia. The ISF is easily the best forecasting confere...

## Natural language processing tutorial

June 26, 2013

This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. ...

## My talk at Boston Python

June 26, 2013

I just gave a talk at Boston Python about natural language processing in general, and edX ease and discern in particular. You can find the presentation source here, and the web version of it here. There is a video of it here. Nelle Varoquaux and Micha...