## The Gambling Machine Puzzle

March 9, 2013
This puzzle came up in the New York Times Number Play blog. It goes like this: An entrepreneur has devised a gambling machine that chooses two independent random variables x and y that are uniformly and independently distributed between 0 and 100. He plans to tell any customer the value of x and to ask him […]

## Plaig

March 9, 2013
This, from Jeremy Duns (previously encountered here), resonates with me: When I asked Thayer why he hadn’t cited Zeigler, he told me very forcefully that he had cited everything, and accused me of libelling him: this means, presumably, that he accused me of libel without checking his article and seeing the ‘citations’ weren’t there. And [...]

## Multiple (smoothed) regression and portfolio exposure

March 9, 2013
On Wednesday, in class, we saw how to visualize a multiple regression model (with two continuous explanatory variables). Here, the goal is to predict the average cost of an insurance claim, using some covariates, e.g. the age of the driver and the age of the car (recall that losses here are liability losses). The prediction is obtained from a (standard) generalized linear model with a log link: > reg1=glm(cout~ageconducteur+agevehicule,data=base,family=Gamma(link="log")) The code to visualize…
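For readers less used to R's formula syntax, here is a minimal sketch of what the `glm` call in this excerpt specifies. The variable names `cout`, `ageconducteur` and `agevehicule` come from the excerpt; the coefficient symbols $\beta_0, \beta_1, \beta_2$ are notation assumed here for illustration and do not appear in the original post.

```latex
% Gamma GLM with log link: the linear predictor models the log of the mean cost
\log \mathbb{E}[\text{cout}]
  = \beta_0 + \beta_1\,\text{ageconducteur} + \beta_2\,\text{agevehicule}
\quad\Longleftrightarrow\quad
\mathbb{E}[\text{cout}]
  = e^{\beta_0}\, e^{\beta_1\,\text{ageconducteur}}\, e^{\beta_2\,\text{agevehicule}}
```

With a log link the covariates act multiplicatively on the predicted average cost, which is why the fitted surface over the two ages is not a plane on the original scale.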

## A bit more on sample size

March 8, 2013
In our article What is a large enough random sample? we pointed out that if you wanted to measure a proportion to an accuracy “a” with a chance of being wrong of “d” then one idea was to guarantee you had a sample size of at least: This is the central question in designing opinion polls […] Related posts: What is a large enough random sample? Level fit summaries can be…
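The formula itself did not survive in this excerpt. As a hedged illustration only, one standard bound that fits the description (accuracy $a$, failure chance $d$) is Hoeffding's inequality; that it matches the article's exact expression is an assumption. Here $p$ is the true proportion, $\hat{p}$ the sample proportion and $n$ the sample size, all symbols introduced for this sketch.

```latex
% Hoeffding's inequality for the mean of n independent 0/1 observations
\Pr\bigl(|\hat{p} - p| \ge a\bigr) \;\le\; 2\,e^{-2 n a^{2}}

% Requiring the right-hand side to be at most d and solving for n
2\,e^{-2 n a^{2}} \le d
\quad\Longleftrightarrow\quad
n \;\ge\; \frac{\ln(2/d)}{2 a^{2}}

% Worked example: a = 0.05, d = 0.05
n \;\ge\; \frac{\ln 40}{2 \cdot 0.0025} \approx \frac{3.689}{0.005} \approx 737.8
\quad\Rightarrow\quad n = 738
```

The bound does not depend on the unknown proportion $p$, so it is conservative; it grows with $1/a^{2}$ but only logarithmically with $1/d$.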

## PSMR (short course at UCL)

March 8, 2013
As every year, come April our group hold a short course on Practical Statistics in Medical Research. I think this has run for several years now and it's reasonably established. The course is aimed at health care professionals (ie non statisti...

## Fun day

March 8, 2013
Today I spent the morning reading a PhD thesis that I need to examine (scheduled for next month) and the afternoon marking in-course assessment (ICA) papers for my course (Social Statistics). Neither activity is the most amusing in the world (alth...

## Comparing quantiles for two samples

March 8, 2013
Recently, for a research paper, I had some samples, and I wanted to compare them. Not to compare their means (by construction, all of them were centered) but their dispersion. And not their variance, but rather their quantiles. Consider the following boxplot-type function, where everything is quantile related (which is not the case for the standard boxplot, see http://freakonometrics.hypotheses.org/4138, in French) > boxplotqbased=function(x){ + q=quantile(x[is.na(x)==FALSE],c(.05,.25,.5,.75,.95)) + plot(1,1,col="white",axes=FALSE,xlab="",ylab="", + xlim=range(X),ylim=c(1-.6,1+.6)) +…

## Send me student/postdoc blogs in statistics and computational biology

March 8, 2013
I’ve been writing a blog for a few years now, but it started after I was already comfortably settled in a tenure track job. There have been some huge benefits of writing a scientific blog. It has certainly raised my …

## Cool GSS training video! And cumulative file 1972-2012!

March 8, 2013
Felipe Osorio made the above video to help people use the General Social Survey and R to answer research questions in social science. Go for it! Meanwhile, Tom Smith reports: The initial release of the General Social Survey (GSS), cumulative file for 1972-2012 is now on our website. Codebooks and copies of questionnaires will be [...]

## From OpenOffice noob to control freak: A love story with R, LaTeX and knitr

March 8, 2013
Lately I had to write a seminar paper for a class and I decided to overdo it. But let's start at the very beginning. Here is my evolution of how I used to write stuff and how I got from this: to that: School: OpenOffice - I guess everyone has some...

## The Gap

March 8, 2013
No, not the store. I am referring to the gap between the invention of a method and the theoretical justification for that method. It is hard to quantify the gap because, for any method, there is always dispute about who invented (discovered?) it, and who nailed the theory. For example, maximum likelihood is usually credited […]

## NBC Announces that Obama bin Laden Spokesman is captured in Jordan???

March 7, 2013
Ooops. I don’t think Obama bin Laden was their intention. Thankfully, they corrected it just a few minutes later.

## Stephen Senn: Casting Stones

March 7, 2013
Casting Stones, by Stephen Senn* At the end of last year I received a strange email from the editor of the British Medical Journal (BMJ) appealing for ‘evidence’ to persuade the UK parliament of the necessity of making sure that data for clinical trials conducted by the pharmaceutical industry are made readily available to all and […]

## Which software is responsible for this?

March 7, 2013
@guitarzan wants us to see this chart from north of the border, and read the comments. Please hold your nose first. Here's one insightful comment: "I think it's insane to debate the ages 18 or 19. Why not cap it...

## Placing Big Data in Official Statistics: A Big Challenge?

March 7, 2013
From: http://www.cros-portal.eu/content/ntts-2013-programme Slides: http://www.cros-portal.eu/sites/default/files//15A02_214_0.pdf Monica Scannapieco (1), Antonino Virgillito (2), Diego Zardetto (3). (1) Istat - Italian National Institute of Statistics, e-ma...

## Data Storytelling in Video

March 7, 2013
I’m not a fan of video. I don’t spend time randomly surfing YouTube, and when given the choice between reading an article and watching a video, I’ll read. The reason is that videos often don’t work well for me: they’re too fast or too slow, they take a long time to get to the point, they don’t let me skip around and browse easily. I’d rather be in control than…

## Stan 1.2.0 and RStan 1.2.0

March 7, 2013
Stan 1.2.0 and RStan 1.2.0 are now available for download. See: http://mc-stan.org/ Here are the highlights. Full Mass Matrix Estimation during Warmup Yuanjun Gao, a first-year grad student here at Columbia (!), built a regularized mass-matrix estimator. This helps for posteriors with high correlation among parameters and varying scales. We’re still testing this ourselves, so [...]

## Let’s Do Some Hierarchical Bayes Choice Modeling in R!

March 7, 2013
It can be difficult to work your way through hierarchical Bayes choice modeling. There is just too much new to learn. If nothing else, one gets lost in all the ways that choice data can be collected and analyzed. Then there is all this ou...

## Janet Mertz’s response to “The Myth of American Meritocracy”

March 6, 2013
The following is source material regarding our recent discussion of Jewish admission to Ivy League colleges. I’m posting it for the same reason that I earlier posted a message from Ron Unz, out of a goal to allow the data and arguments to be made as clearly as possible. Janet Mertz writes: I became involved [...]

## Barcelona by Night – Seen through the Lens of Social Media

March 6, 2013
atNight - Designing the City at Night [atnight.ws] is a research project at Universitat Politècnica de Catalunya led by Mar Santamaria (researcher) in collaboration with Pablo Martínez (lighting designer) and Jordi Bari (engineer). It in...

## Comparing the Perception of Wealth Inequality in the U.S. to Reality

March 6, 2013
How is the roughly 54 trillion dollars of wealth available in the U.S. divided over its population? The narrated and infographic movie "Wealth Inequality in America" focuses on the difference between our perception of wealth distribution in the U.S., ...

## Interactive eQTL plot with d3.js

March 6, 2013
I just finished an interactive eQTL plot using D3, in preparation for my talk on interactive graphics at the ENAR meeting next week. The code (in CoffeeScript) is available at github. But beware: it’s pretty awful. The hardest part was setting up the data files. Well, that plus the fact that I just barely know […]