Internet privacy as seen from 1975

Science fiction authors set stories in the future, but they don’t necessarily try to predict the future, and so it’s a little odd to talk about what they “got right.” Getting something right implies they were making a prediction rather than imagining a setting for a story. However, sometimes SF authors do indeed try to predict […]

Hey, people are doing the multiverse!

Elio Campitelli writes: I’ve just seen this image in a paper discussing the weight of evidence for a “hiatus” in the global warming signal and immediately thought of the garden of forking paths. From the paper: Tree representation of choices to represent and test pause-periods. The ‘pause’ is defined as either no-trend or a slow-trend. […]

Data quality is a thing.

I just happened to come across this story, where a journalist took some garbled data and spun a false tale which then got spread without question. It’s a problem. First, it’s a problem that people will repeat unjustified claims; it’s also a problem that, when data are attached, you can get complete credulity, even for claims […]

Impossible to misunderstand

“The goal is not to be possible to understand, but impossible to misunderstand.” I saw this quote at the beginning of a math book when I was a student and it stuck with me. I would think of it when grading exams. Students often assume it is enough to be possible to understand, possible for […]

You are what you vote

I’ve tried my hand at writing for the wider public with an article for The Conversation based on my paper with Di Cook and Jeremy Forbes on “Spatial modelling of the two-party preferred vote in Australian federal elections: 2001-2016”. With the next Au…

“In 1997 Latanya Sweeney dramatically demonstrated that supposedly anonymized data was not anonymous,” but “Over 20 journals turned down her paper . . . and nobody wanted to fund privacy research that might reach uncomfortable conclusions.”

Tom Daula writes: I think this story from John Cook is a different perspective on replication and how scientists respond to errors. In particular the final paragraph: There’s a perennial debate over whether it is best to make security and privacy flaws public or to suppress them. The consensus, as much as there is a […]

robust Bayesian synthetic likelihood

David Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is […]

Comparing Truncation to Differential Privacy

Traditional methods of data de-identification obscure data values. For example, you might truncate a date to just the year. Differential privacy obscures query values by injecting enough noise to keep from revealing information on an individual. Let’s compare two approaches for de-identifying a person’s age: truncation and differential privacy. Truncation First consider truncating birth date […]
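The contrast in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the method from the post: the excerpt only says that noise is injected, so the use of Laplace noise here is an assumption (it is one common choice), and the names `truncate_birth_date`, `laplace_noise`, and `noisy_query`, along with the `sensitivity` and `epsilon` parameters, are hypothetical.

```python
import random
from datetime import date


def truncate_birth_date(birth_date: date) -> int:
    """Truncation: keep only the year, discarding month and day."""
    return birth_date.year


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent Exp(1) draws follows a standard
    Laplace distribution, so scaling it gives Laplace(0, scale).
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_query(value: float, sensitivity: float, epsilon: float,
                rng: random.Random) -> float:
    """Assumed Laplace-style mechanism: perturb the query result.

    `sensitivity` (how much one individual can change the result) and
    `epsilon` (the privacy parameter) are illustrative inputs; choosing
    them properly depends on the query and is outside this sketch.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(truncate_birth_date(date(1980, 7, 15)))
    print(noisy_query(39.0, sensitivity=1.0, epsilon=0.5, rng=rng))
```

Truncation returns the same coarsened value every time, whereas the noisy query returns a different value on each call, with smaller `epsilon` giving a wider spread.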

Timing Working With a Row or a Column from a data.frame

In this note we share a quick study timing how long it takes to perform some simple data manipulation tasks with R data.frames. We are interested in the time needed to select a column, alter a column, or select a row. Knowing what is fast and what is slow is critical in planning code, so …

“MRP is the Carmelo Anthony of election forecasting methods”? So we’re doing trash talking now??

What’s the deal with Nate Silver calling MRP “the Carmelo Anthony of forecasting methods”? Someone sent this to me: and I was like, wtf? I don’t say wtf very often—at least, not on the blog—but this just seemed weird. For one thing, Nate and I did a project together once using MRP: this was our […]

an attempt at code golf

Having discovered codegolf on Stack Exchange a few weeks ago, I have spotted a few interesting puzzles since then but only got the opportunity to try one over a quiet and rainy weekend (and Robin being on vacation)! The challenge was to write R code for deciding whether or not a given integer n is […]