## Using Checkin Times to Infer Effective Opening Hours

July 3, 2013

It's an idea that everyone has had, and I'm sure I'm not reporting something new. Foursquare has a feature that shows its best guess at a business's opening hours, based on when people check in there.

## Disappointing title

July 3, 2013

I caught a glimpse of a book in a library this morning and thought the title was “Statistics for People Who Think.” Sounds like a great book! But the title was actually “Statistics for People Who (Think They) Hate Statistics” wh...

## Kuhn, 1/f noise, and the fractal nature of scientific revolutions

July 3, 2013

Bill Harris writes: I was re-reading your and Shalizi’s “Philosophy and the practice of Bayesian statistics” [see also the rejoinder] and noticed a statement near the end of section 6 about paradigm shifts coming in different magnitudes over different time spans. That reminded me of the almost-mystical ideas surrounding 1/f (f being “frequency”) noise in [...]

## Antihubrisines

July 3, 2013

From John Tukey’s Sunset Salvo: Our suffering sinuses are now frequently relieved by antihistamines. Our suffering philosophy — whether implicit or explicit — of data analysis, or of statistics, or of science and technology needs to be far more frequently relieved by antihubrisines. To the Greeks hubris meant the kind of pride that would be […]

## Duplicate values in a stream of random numbers

July 3, 2013

As I wrote in my previous post, a SAS customer noticed that he was getting some duplicate values when he used the RAND function to generate a large number of random uniform values on the interval [0,1]. He wanted to know if this result indicates a bug in the RAND [...]
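
A rough sense of how likely such duplicates are comes from the standard birthday-problem approximation. The sketch below is in Python rather than SAS. The function name `prob_duplicate` and the count of 2**32 distinct values are illustrative assumptions, not facts about how RAND is implemented.

```python
import math

def prob_duplicate(n_draws: int, n_values: int) -> float:
    """Approximate P(at least one repeated value) among n_draws
    independent uniform draws from n_values equally likely values.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2N)).
    """
    exponent = -n_draws * (n_draws - 1) / (2 * n_values)
    # -expm1(x) equals 1 - exp(x) and stays accurate when x is near 0.
    return -math.expm1(exponent)

# Hypothetical generator that can return 2**32 distinct values.
ASSUMED_DISTINCT_VALUES = 2 ** 32

for n in (1_000, 100_000, 1_000_000):
    print(n, prob_duplicate(n, ASSUMED_DISTINCT_VALUES))
```

Under that assumed count, duplicates become likely well before the number of draws approaches the number of possible values, so on their own they need not indicate a bug.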

July 3, 2013

Get your fresh copy of the R-Journal from here.

## The Mechanics of Data Visualization2

July 3, 2013

I recently presented about the mechanics of data visualization at the CLaRI Literacy Conference to a group of researchers, teachers and school administrators. The presentation is based on the work of Few (2012; 2009). While the presentation itself is … Continue reading →

## The Mechanics of Data Visualization

July 2, 2013

I recently presented about the mechanics of data visualization at the CLaRI Literacy Conference to a group of researchers, teachers and school administrators. The presentation is based on the work of Few (2012; 2009). While the presentation itself is not about … Continue reading →

## Le Monde puzzle [#827]

July 2, 2013

Back to R (!) for the current Le Monde puzzle: Given an unknown permutation of the set {1,…,6}, written on the faces of a cube, there exists a sequence of vertices such that increasing by one unit the three numbers on the faces sharing each successive vertex in the sequence leads to identical values over […]

## Phototrails: the Visual Structure of Millions of User-Generated Photos

July 2, 2013

The Digital Humanities project Phototrails [phototrails.net], a collaboration by the Department of History of Art and Architecture (University of Pittsburgh), the Software Studies Initiative (California Institute for Telecommunication and Information)...

## They want me to send them free material and pay for the privilege

July 2, 2013

Since we’re on the topic of publishers asking me for money . . . The other day I received the following email: Mimi Liljeholm has sent you a message. Please click ‘Reply’ to send a direct response. Dear Prof Gelman, In collaboration with Frontiers in Psychology, we are organizing a Research Topic titled “Causal discovery [...]

July 2, 2013

Like your .bashrc, .vimrc, or many other dotfiles you may have in your home directory, your .Rprofile is sourced every time you start an R session. On Mac and Linux, this file is usually located in ~/.Rprofile. On Windows it's buried somewhere in the R...

## Fundraiser glass half empty or half full

July 2, 2013

Reader Max sent this photo of a poster in the Shake Shack in Brooklyn, NY: Are they way below their goal? You wouldn't know unless you read all the text and numbers.

## There is definitely R in July

July 2, 2013

The useR!2013 conference in Albacete, Spain, will commence next Wednesday, 10 July, and on the day before Diego and I will give a googleVis tutorial. The following Monday, 15 July, the first R in Insurance event will take place at Cass Business School ...

## Some Common Approaches for Analyzing Likert Scales and Other Categorical Data

July 2, 2013

Analyzing Likert scale responses really comes down to what you want to accomplish (e.g., are you trying to provide a formal report with probabilities, or are you simply trying to understand the data better?). Sometimes a couple of graphs are sufficient and a formal statistical test isn’t even necessary. However, with how easy it is […]

## integral priors for binomial regression

July 1, 2013

Diego Salmerón and Juan Antonio Cano from Murcia, Spain (check the movie linked to the above photograph!), kindly included me in their recent integral prior paper, even though I mainly provided (constructive) criticism. The paper has just been arXived. A few years ago (2008 to be precise), we wrote together an integral prior paper, published […]

## Measuring the importance of data privacy: embarrassment and cost

July 1, 2013

We live in an era when it is inexpensive and easy to collect data about ourselves or about other people. These data can take the form of health information - like medical records, or they could be financial data - … Continue reading →

## Using R to Produce Scalable Vector Graphics for the Web

July 1, 2013

Statistical software is normally used during the analysis stage of a project, and a cleaned-up static graphic is created for the presentation. If the presentation is in web format, some additional considerations apply. The trick is to find ways to implement those graphs in that web format so the graph […]

## Going meta on Niall Ferguson

July 1, 2013

Ashok Rao shreds the latest book from Niall Ferguson, who we’ve encountered most recently as the source of homophobic slurs but who used to be a serious scholar. Or maybe still is. Remember Linda, that character from the Kahneman and Tversky vignette who was deemed likely to be “a bank teller who is active in [...] The post Going meta on Niall Ferguson appeared first on Statistical Modeling, Causal Inference, and…

July 1, 2013

TechCrunch has a great piece on how Facebook tracks you even if you don't give them data. (link; be careful, opening this link slows my browser to a crawl.) Here's my take on the issue: I have always been disturbed by the complicity in invading other people's privacy that is forced upon us when we use a service like Facebook (or Google or you name it). For those of you who allow…

## Duplicate values in random numbers: Tossing dice and sharing birthdays

July 1, 2013

Tossing dice is a simple and familiar process, yet it can illustrate deep and counterintuitive aspects of random numbers. For example, if you toss four identical six-sided dice, what is the probability that the faces are all distinct, as shown to the left? Many people would guess that the probability [...]
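
The exact answer follows from a counting argument: of the 6^4 equally likely ordered outcomes, 6·5·4·3 show four distinct faces. Here is a minimal Python sketch of that calculation. The helper name `prob_all_distinct` is made up for illustration and does not come from the original post.

```python
from fractions import Fraction
from math import perm

def prob_all_distinct(n_dice: int, n_faces: int = 6) -> Fraction:
    """Exact probability that n_dice fair dice all show different faces."""
    # perm(n_faces, n_dice) counts ordered outcomes with no repeated face;
    # it is 0 when there are more dice than faces.
    favorable = perm(n_faces, n_dice)
    total = n_faces ** n_dice
    return Fraction(favorable, total)

p = prob_all_distinct(4)
print(p, float(p))  # 5/18, roughly 0.278
```

The same function covers the birthday problem by setting `n_faces=365` and letting `n_dice` be the number of people in the room.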

## Exploratory Data Analysis – Kernel Density Estimation and Rug Plots in R on Ozone Data in New York and Ozonopolis

Update on July 15, 2013: Thanks to Harlan Nelson for noting on AnalyticBridge that the ozone concentrations for both New York and Ozonopolis are non-negative quantities, so their kernel density plot should have non-negative support sets. This has been corrected in this post by

- defining new variables called max.ozone and max.ozone2
- using the […]

## Intractable likelihoods, unbiased estimators and sign problem

July 1, 2013

Hey all, We’re in the Big Data era blablabla, but the advanced computational methods usually don’t scale well enough to match the increasing sizes of datasets. For instance, even in a simple case of i.i.d. data and an associated likelihood function, the cost of evaluating the likelihood function at any parameter value is typically growing […]