## Risks in Big Data Attract Big Law Firms

July 21, 2012
Risks in Big Data Attract Big Law Firms: Holland & Knight just announced that it was launching a new data privacy and security unit, led by partners Christopher Cwalina and Steven Roosa, who left Reed Smith to take on the new task. Its unit will b...

## “Always the last place you look!”

July 21, 2012
This gets to a distinction I have tried to articulate, between explaining a known effect (like looking for a known object), and searching for an unknown effect (that may well not exist). In the latter, possible effects of “selection” or searching need to be taken into account. Of course, searching for the Higgs is akin [...]

## Optimizing software in C++

July 21, 2012
Matt3 pointed us to this helpful document by Agner Fog, “Optimizing software in C++ An optimization guide for Windows, Linux and Mac platforms.” More here. Enjoy!

## Emulating dynamic scoping in GNU R

July 21, 2012
By design GNU R uses lexical scoping. Fortunately it allows for at least two ways to simulate dynamic scoping. Let us start with the example code and then analyze it: x <- "global" f1 <- function() cat("f1:", x, "\n") f2 <- function() cat("f2:", e...

## The Amazing Mean Shift Algorithm

July 21, 2012
The mean shift algorithm is a mode-based clustering method due to Fukunaga and Hostetler (1975) that is commonly used in computer vision but seems less well known in statistics. The steps are: (1) estimate the density, (2) find the modes of the density, (3) associate each data point with one mode. 1. The Algorithm We [...]
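
The three steps above can be sketched in a few lines of Python. This is only an illustration, not the post's own code: it assumes one-dimensional data, a Gaussian kernel, a hand-picked `bandwidth`, and a hypothetical `merge_tol` for deciding when two converged points count as the same mode.

```python
import math


def mean_shift_1d(points, bandwidth, iters=200, tol=1e-8):
    """Move a copy of each point uphill on a Gaussian kernel density estimate.

    The density (step 1) is never built explicitly: each update replaces x
    by the kernel-weighted mean of the data, which climbs toward a mode (step 2).
    """
    converged = []
    for x in points:
        for _ in range(iters):
            # Gaussian kernel weight of every data point relative to x
            weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
            new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
            if abs(new_x - x) < tol:
                break
            x = new_x
        converged.append(x)
    return converged


def label_by_mode(converged, merge_tol=1e-3):
    """Step 3: points whose iterations end at (nearly) the same place share a label."""
    centers, labels = [], []
    for m in converged:
        for i, c in enumerate(centers):
            if abs(m - c) < merge_tol:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels


# Toy data with two well-separated groups (made up for this sketch)
data = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.1, 5.3]
centers, labels = label_by_mode(mean_shift_1d(data, bandwidth=0.5))
print(centers, labels)
```

The number of clusters is not specified in advance; it falls out of the bandwidth, which is the main tuning choice in this sketch.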

## Le Monde puzzle [#783]

July 20, 2012
In a political party, there are as many cells as there are members and each member belongs to at least one cell. Each cell has five members and an arbitrary pair of cells only shares one member. How many members are there in this political party? Back to the mathematical puzzles of Le Monde (science [...]

## Interview with Lauren Talbot – Quantitative analyst for the NYC Financial Crime Task Force

July 20, 2012
Lauren Talbot is a quantitative analyst for the New York City Financial Crime Task Force. Before working for NYC she was an analyst at Acumen LLC and got her degree in economics from Stanford University. She is a key player turning spa...

## Likelihood thresholds and decisions

July 20, 2012
David Hogg points me to this discussion: Martin Strasbourg and I [Hogg] discussed his project to detect new satellites of M31 in the PAndAS survey. He can construct a likelihood ratio (possibly even a marginalized likelihood ratio) at every position in the M31 imaging, between the best-fit satellite-plus-background model and the best nothing-plus-background model. He [...]

## Do to other as you would have them do to you

July 20, 2012
The New York Times wrote about how the "Big Data" industry is trying to transform education (link). This is amusing and creepy by turns. All of these may be well-intentioned, but what strikes me is how unscientific the arguments given in favor of these data-driven methods are. You'd expect the same data-driven approach to be used to justify their new solutions but you find almost none of that. *** For…

## Course at Monash (#2)

July 19, 2012
Here are the slides for the second day of my course at Monash University, Melbourne, in the Special Lectures in Econometrics, with a strong similarity with the slides of my course in Roma this Spring. (Ah, sunny Roma…) The first day lecture was very well attended and I hope this remains true for the [...]

## Health Care Costs – Part 3, "Why You Are Paying More"

July 19, 2012
Malpractice - A Booming Industry? Perhaps authors Frank Sloan, Randall Bovbjerg and Penny Githens capture it best in their book Insuring Medical Malpractice: "If aging Doctor Kildare were to return to medical practice today, having been...

## Alexa, Maricel, and Marty: Three cellular automata who got on my nerves

July 19, 2012
I received the following two emails within fifteen minutes of each other. First, from “Alexa Russell,” subject line “An idea for a blog post: The Role, Importance, and Power of Words”: Hi Andrew, I’m a researcher/writer for a resource covering the importance of English proficiency in today’s workplace. I came across your blog andrewgelman.com as [...]

## Help me find the good JSM talks

July 19, 2012
I’m about to head out for JSM in a couple of weeks. The sheer magnitude of the conference means it is pretty hard to figure out what talks I should attend. One approach I’ve used in the past is to identify people who I know give good talks ...

## Big data is worth nothing without big science

July 19, 2012
Big data is worth nothing without big science: As with gold or oil, data has no intrinsic value, writes Webtrends CEO Alex Yoder. Big science, which bridges the gap between knowledge and insight, is where the real value is. Read this blog post by Alex ...

## New Kvetch Posted 7/18/12

July 19, 2012
New Kvetch

## Course at Monash (#1)

July 18, 2012
Here are the slides for the first day of my course at Monash University, Melbourne, in the Special Lectures in Econometrics, with a strong similarity with the slides of my course in Wharton, two years ago. (Be sure to check slide 67! If the update on slideshare works from my flat in Melbourne…) Filed under: [...]

## Sampling Distributions of t When Stopping Intention is Threshold Duration

July 18, 2012
Consider two groups of data on a metric scale, for which we want to conduct a t test. To compute the p value of t, we need to determine its sampling distribution, which is the relative probability of all possible values of t that would be obtained from...

## Gamification Quantification

July 18, 2012
Surveys become engaging when they become games, or at least, take on some of the characteristics of games.  This is the argument made by those advocating the gamification of marketing research [http://researchaccess.com/2011/12/market-researc...

## Top Universities Test the Online Appeal of Free

July 18, 2012
Top Universities Test the Online Appeal of Free: Online courses have been around for years, but now big-name colleges and competing software platforms have entered the field, which is evolving with astonishing speed.

## A closer look at data suggests Johns Hopkins is still the #1 US hospital

July 18, 2012
The US News best hospital 2012-2013 rankings are out. The big news is that Johns Hopkins has lost its throne. For 21 consecutive years Hopkins was ranked #1, but this year Mass General Hospital (MGH) took the top spot, displacing Hopkins to #2. Howeve...

## The R packages in a data scientist’s toolbox

July 18, 2012
The following from Revolutions: John Myles White, self-described “statistics hacker” and co-author of “Machine Learning for Hackers” was interviewed recently by The Setup. In the interview, he describes some of his go-to R packages for data science: Most of my work involves programming, so programming languages and their libraries are the bulk of the [...]

## Statistical Simulation

July 18, 2012
The basics of statistical simulation

A statistical simulation often consists of the following steps:

1. Simulate a random sample of size N from a statistical model.
2. Compute a statistic for the sample.
3. Repeat 1 and 2 many times and accumulate the results.
4. Examine the union of the statistics, which approximates the sampling distribution of the statistic [...]
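
As a rough illustration of these steps, here is a minimal Python sketch. The choices are assumptions made for the example and do not come from the post: the model is a standard normal, the statistic is the sample mean, and the function name, sample size, repetition count, and seed are arbitrary.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Approximate the sampling distribution of the mean of n N(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    results = []
    for _ in range(reps):
        # Step 1: simulate a random sample of size n from the model
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Step 2: compute the statistic for this sample
        results.append(statistics.fmean(sample))
    # Step 3: the accumulated statistics from all repetitions
    return results


# Step 4: examine the collection of statistics
means = simulate_sample_means(n=25, reps=5000)
center = statistics.fmean(means)
spread = statistics.stdev(means)
print(center, spread)
```

For this particular setup, theory says the sample mean has standard deviation 1/sqrt(25) = 0.2, so the simulated spread should land close to that value.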

## Johns Hopkins Coursera Statistics Courses

July 18, 2012
Computing for Data Analysis; Data Analysis; Mathematical Biostatistics Bootcamp