Category: Privacy

Why are dates of service on HIPAA’s Safe Harbor list?

The HIPAA Privacy Rule offers two ways to say that data has been de-identified: Safe Harbor and expert determination. This post is about the former. I help companies with the latter. The Safe Harbor provision lists 18 categories of data that would cause a data set to not be considered de-identified unless […]

Monte Carlo fusion

Hongsheng Dai, Murray Pollock (University of Warwick), and Gareth Roberts (University of Warwick) just arXived a paper we discussed together while I was at Warwick. Here, fusion means bringing together the different parts of the target distribution f(x)∝f¹(x)f²(x)… once simulation from each part has been done. In the same spirit as in Scott et al. (2016) […]

Can I have the last four digits of your social?

Imagine this conversation. “Could you tell me your social security number?” “Absolutely not! That’s private.” “OK, how about just the last four digits?” “Oh, OK. That’s fine.” When I was in college, professors would post grades by the last four digits of student social security numbers. Now that seems incredibly naive, but no one objected […]

Revealing information by trying to suppress it

FAS posted an article yesterday explaining how blurring military installations out of satellite photos draws attention to them, showing exactly where they are and how big they are. The Russian mapping service Yandex Maps blurred out sensitive locations in Israel and Turkey. As the article says, this is an example of the Streisand effect, […]

Simulating identification by zip code, sex, and birthdate

As mentioned in the previous post, Latanya Sweeney estimated that 87% of Americans can be identified by the combination of zip code, sex, and birth date. We’ll do a quick-and-dirty estimate and a simulation to show that this result is plausible. There’s no point being too realistic with a simulation because the actual data that […]
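
The excerpt above is cut off before the estimate itself, so here is a minimal sketch of what such a quick-and-dirty estimate and simulation could look like. It is not the post's own code. Everything numeric is an assumption made for illustration: a single zip code with 8,000 residents, birth dates spread uniformly over 78 years of 365 days, and two sexes that are equally likely. The function names `expected_unique_fraction` and `simulate_unique_fraction` are hypothetical.

```python
import random
from collections import Counter

# Illustrative assumptions, not figures from the post:
ZIP_POPULATION = 8_000        # assumed residents in one zip code
NUM_BINS = 2 * 365 * 78       # assumed (sex, birth date) combinations


def expected_unique_fraction(population: int, num_bins: int) -> float:
    """Chance that a given person shares a (sex, birth date) bin with nobody
    else in the zip code, assuming everyone lands in a bin uniformly at random."""
    return (1 - 1 / num_bins) ** (population - 1)


def simulate_unique_fraction(population: int, num_bins: int, seed: int = 0) -> float:
    """Assign each person a random bin and return the fraction of people
    who are alone in their bin."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_bins) for _ in range(population))
    # A bin holding exactly one person identifies that person uniquely.
    unique_people = sum(1 for c in counts.values() if c == 1)
    return unique_people / population


estimate = expected_unique_fraction(ZIP_POPULATION, NUM_BINS)
simulated = simulate_unique_fraction(ZIP_POPULATION, NUM_BINS)
print(f"analytic estimate: {estimate:.3f}")
print(f"simulated:         {simulated:.3f}")
```

Under these assumed values the analytic estimate comes out near 0.87, but that depends heavily on the assumed zip code population. Real zip codes vary widely in size, and birth dates and ages are not uniformly distributed, so this only shows that a figure of that order is plausible.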

No funding for uncomfortable results

In 1997 Latanya Sweeney dramatically demonstrated that supposedly anonymized data was not anonymous. The state of Massachusetts had released data on 135,000 state employees and their families with obvious identifiers removed. However, the data contained zip code, birth date, and sex for each individual. Sweeney was able to cross reference this data with publicly available […]

Visualizing data breaches

The image below is a static screenshot of an interactive visualization of the world’s biggest data breaches. The site lets you filter the data by industry and type of breach. See the site for credits and the raw data.

Poetic description of privacy-preserving analysis

Erlingsson et al. give a poetic description of privacy-preserving analysis in their RAPPOR paper [1]. They say that the goal is to “… allow the forest of client data to be studied, without permitting the possibility of looking at individual trees.” [1] Úlfar Erlingsson, Vasyl Pihur, and […]