By design, GNU R uses lexical scoping. Fortunately, it allows for at least two ways to simulate dynamic scoping. Let us start with the example code and then analyze it:
x <- "global"
f1 <- function() cat("f1:", x, "\n")
f2 <- function() cat("f2:", e...
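As the excerpt's own code is cut off, here is a separate minimal sketch of the general idea, not the post's actual example. It assumes one common way to simulate dynamic scoping: looking a variable up from the caller's frame with `get()` and `parent.frame()`. The names `f_lex`, `f_dyn`, and `caller` are hypothetical.

```r
x <- "global"

# Lexical scoping: the free variable x is resolved where f_lex was defined,
# which is the global environment.
f_lex <- function() x

# Simulated dynamic scoping (an assumption, not necessarily the post's method):
# start the lookup of x in the frame of whoever called f_dyn.
f_dyn <- function() get("x", envir = parent.frame())

caller <- function() {
  x <- "local to caller"
  c(lexical = f_lex(), dynamic = f_dyn())
}

result <- caller()
print(result)
# lexical is "global"; dynamic is "local to caller"
```

The only difference between the two functions is where the lookup of `x` begins, which is what separates the two scoping rules.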

The mean shift algorithm is a mode-based clustering method due to Fukunaga and Hostetler (1975) that is commonly used in computer vision but seems less well known in statistics. The steps are: (1) estimate the density, (2) find the modes of the density, (3) associate each data point with one mode. 1. The Algorithm We [...]
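The three steps can be sketched in R for one-dimensional data. This is a rough illustration under stated assumptions, not the post's implementation: it uses a Gaussian kernel with a hand-picked bandwidth `h`, the density estimate is implicit in the kernel weights, and the function name `mean_shift` and the simulated data are hypothetical.

```r
# Gaussian-kernel mean shift for one-dimensional data (illustrative sketch).
# Each point is moved uphill on the implicit kernel density estimate until it
# stops moving; the resting place approximates the mode it belongs to.
mean_shift <- function(x, h, tol = 1e-8, max_iter = 500) {
  modes <- numeric(length(x))
  for (i in seq_along(x)) {
    m <- x[i]
    for (iter in seq_len(max_iter)) {
      w <- dnorm((x - m) / h)        # kernel weights centred at the current position
      m_new <- sum(w * x) / sum(w)   # shift to the weighted mean of the data
      converged <- abs(m_new - m) < tol
      m <- m_new
      if (converged) break
    }
    modes[i] <- m
  }
  modes
}

set.seed(1)
# Hypothetical data: two well-separated groups
x <- c(rnorm(50, mean = 0, sd = 0.5), rnorm(50, mean = 5, sd = 0.5))
modes <- mean_shift(x, h = 0.5)

# Points that converge to (nearly) the same mode share a cluster label
labels <- as.integer(factor(round(modes, 2)))
table(labels)
```

The bandwidth `h` controls how many modes the implicit density estimate has, and therefore how many clusters come out.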

In a political party, there are as many cells as there are members, and each member belongs to at least one cell. Each cell has five members, and any pair of cells shares only one member. How many members are there in this political party? Back to the mathematical puzzles of Le Monde (science [...]

David Hogg points me to this discussion: Martin Strasbourg and I [Hogg] discussed his project to detect new satellites of M31 in the PAndAS survey. He can construct a likelihood ratio (possibly even a marginalized likelihood ratio) at every position in the M31 imaging, between the best-fit satellite-plus-background model and the best nothing-plus-background model. He [...]

The New York Times wrote about how the "Big Data" industry is trying to transform education (link). This is amusing and creepy by turns. All of these efforts may be well-intentioned, but what strikes me is how unscientific the arguments given in favor of these data-driven methods are. You'd expect the same data-driven approach to be used to justify the new solutions, but you find almost none of that. *** For…

Here are the slides for the second day of my course at Monash University, Melbourne, in the Special Lectures in Econometrics, with a strong similarity to the slides of my course in Roma this Spring. (Ah, sunny Roma…) The first day lecture was very well attended and I hope this remains true for the [...]

I received the following two emails within fifteen minutes of each other. First, from “Alexa Russell,” subject line “An idea for a blog post: The Role, Importance, and Power of Words”: Hi Andrew, I’m a researcher/writer for a resource covering the importance of English proficiency in today’s workplace. I came across your blog andrewgelman.com as [...]

I’m about to head out for JSM in a couple of weeks. The sheer magnitude of the conference means it is pretty hard to figure out what talks I should attend. One approach I’ve used in the past is to identify people who I know give good talks ...

Big data is worth nothing without big science: As with gold or oil, data has no intrinsic value, writes Webtrends CEO Alex Yoder. Big science, which bridges the gap between knowledge and insight, is where the real value is. Read this blog post by Alex ...

Here are the slides for the first day of my course at Monash University, Melbourne, in the Special Lectures in Econometrics, with a strong similarity to the slides of my course at Wharton, two years ago. (Be sure to check slide 67! If the update on slideshare works from my flat in Melbourne…) Filed under: [...]

Surveys become engaging when they become games, or at least, take on some of the characteristics of games. This is the argument made by those advocating the gamification of marketing research [http://researchaccess.com/2011/12/market-researc...

Top Universities Test the Online Appeal of Free: Online courses have been around for years, but now big-name colleges and competing software platforms have entered the field, which is evolving with astonishing speed.

The following is from Revolutions: John Myles White, self-described “statistics hacker” and co-author of “Machine Learning for Hackers”, was interviewed recently by The Setup. In the interview, he describes some of his go-to R packages for data science: Most of my work involves programming, so programming languages and their libraries are the bulk of the [...]

The basics of statistical simulation

A statistical simulation often consists of the following steps:
1. Simulate a random sample of size N from a statistical model.
2. Compute a statistic for the sample.
3. Repeat 1 and 2 many times and accumulate the results.
4. Examine the union of the statistics, which approximates the sampling distribution of the statistic [...]
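The steps above can be sketched in a few lines of R. This is a minimal illustration under assumed choices that do not come from the post: a standard normal model, the sample mean as the statistic, and the values of `N` and `n_rep`.

```r
set.seed(123)

N <- 30        # sample size (assumed for illustration)
n_rep <- 1000  # number of simulated samples (assumed for illustration)

# Steps 1-3: draw a sample of size N, compute its mean, repeat n_rep times
sample_means <- replicate(n_rep, mean(rnorm(N)))

# Step 4: examine the accumulated statistics, which approximate the
# sampling distribution of the mean
summary(sample_means)
sd(sample_means)   # compare with the theoretical standard error below
1 / sqrt(N)
```

For this model the theoretical standard error of the mean is `1 / sqrt(N)`, so the simulated standard deviation gives a quick sanity check on the simulation.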

Computing for Data Analysis; Data Analysis; Mathematical Biostatistics Bootcamp

Gur Huberman points to an article on the financial crisis by Bethany McLean, who writes: Although our understanding of what instigated the 2008 global financial crisis remains at best incomplete, there are a few widely agreed upon contributing factors. One of them is a 2004 rule change by the U.S. Securities and Exchange Commission that [...]

I ran across a nice quote from Phil Schrodt on the virtue of explanation over prediction. It starts, "This is utterly, totally and completely self-serving bullshit...", and there is more. I encourage you to share this with others and contribute to the conversation at Explanation or Prediction? An Amazing Quote from Phil Schrodt, which first appeared at carlislerainey.com. For more of my thoughts and ideas, subscribe to my blog (via RSS…