Posts Tagged ‘data’

Is data privacy a fundamental right?

July 4, 2015

This piece is part of the StatBusters column written jointly with Andrew Gelman. Hope they fix the labeling soon. In it, we talk about two recent studies on data privacy that lead to contradictory conclusions. How should the media report such surveys? Is the brand name of the organization enough? In addition, we debunk the notion that consumers will definitely get something valuable out of sharing their data.

Mathematical Statistics Lesson of the Day – Ancillary Statistics

The set-up for today’s post mirrors my earlier Statistics Lessons of the Day on sufficient statistics and complete statistics. Suppose that you collected data $X_1, X_2, \ldots, X_n$ in order to estimate a parameter $\theta$. Let $f_\theta(x)$ be the probability density function (PDF) or probability mass function (PMF) for $X_1, X_2, \ldots, X_n$. Let $T = T(X_1, X_2, \ldots, X_n)$ be a statistic based on $X_1, X_2, \ldots, X_n$. If the distribution of $T$ does NOT […]

The Day After the Half Day in the Life of a Data Scientist

June 16, 2015

In the last installment, I embarked on a project (perhaps only a task) to assemble a membership list for an organization. It sounded simple: how hard could it be to merge two lists of people? Of course, I couldn’t just stitch one list on top of the other, because some members both subscribed to the newsletter and joined the Facebook group. These duplicate rows must be merged so that…
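
To make the duplicate problem concrete, here is a minimal sketch in Python of merging two member lists. It is not the method used in the post: the record layout, the `email` and `name` fields, the choice of email as the matching key, and the helper names `normalize_email` and `merge_member_lists` are all assumptions made for illustration.

```python
def normalize_email(email):
    """Lower-case and trim an email so trivially different spellings match."""
    return email.strip().lower()


def merge_member_lists(newsletter, facebook):
    """Merge two lists of member records into one master list.

    Assumption: each record is a dict with an "email" key and an optional
    "name" key. Records that share a normalized email collapse into one row
    that remembers every list the person appeared on.
    """
    master = {}
    for source, records in (("newsletter", newsletter), ("facebook", facebook)):
        for record in records:
            key = normalize_email(record["email"])
            row = master.setdefault(key, {"email": key, "name": None, "sources": []})
            # Keep the first non-empty name seen for this person.
            if row["name"] is None and record.get("name"):
                row["name"] = record["name"]
            if source not in row["sources"]:
                row["sources"].append(source)
    return list(master.values())


# Hypothetical sample data: Ann appears on both lists with different spellings.
newsletter_list = [
    {"email": "ann@example.com", "name": "Ann"},
    {"email": "bob@example.com", "name": "Bob"},
]
facebook_list = [
    {"email": "Ann@Example.com ", "name": "Ann Lee"},
    {"email": "cat@example.com", "name": "Cat"},
]

master_list = merge_member_lists(newsletter_list, facebook_list)
for member in master_list:
    print(member["email"], member["sources"])
```

Simply stacking the two sample lists would give four rows; merging on the normalized key gives three, with Ann flagged as belonging to both sources. Real lists rarely share such a clean key, so the matching rule is usually where the work is.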

A Half-day in the Life of a Data Scientist

June 11, 2015

An organization wanted to understand its base of members, so the first order of business was constructing a database of all people who could be considered members. We decided to define membership broadly. Members included those who joined the Facebook group and those who subscribed to the newsletter. The organization kept two separate lists, which I would merge to create a master list. For simplicity, I’ll call them the FB…

What I said about data science at Princeton Reunions

June 3, 2015

Here is how I spent last weekend: at college reunions in beautiful Princeton on a glorious sunny day. I also spoke about data science at a Faculty-Alumni panel titled "Science Under Attack!". Here is what I said: In the past five to 10 years, there has been an explosion of interest in using data in business decision-making. What happens when business executives learn that the data do not support their…

Some statistics about nutrition statistics

May 26, 2015

I only read nutrition studies in the service of this blog but otherwise, I don't trust them or care. Nevertheless, the health beat of most media outlets is obsessed with printing the latest research on coffee or eggs or fats or alcohol or what have you. Now, the estimable John Ioannidis has published an editorial in BMJ titled "Implausible Results in Human Nutrition Research". John previously told us about the…

Should I tell students that the maximum score in the class is 137?

May 22, 2015

This op-ed by Richard Thaler caught my attention because I have had a similar experience. In my statistics classes, I have noticed a pattern: if the mid-term exam is hard, with a lower average score (say 75-80%), the students look crestfallen and feel that they did not learn; eventually, when it comes to evaluating the instructor, I receive lower grades, with comments indicating that I have not taught them properly to…

Story time, known unknowns and the endowment effect in an HBR article on customer data

May 6, 2015

Harvard Business Review devotes a long article to customer data privacy in the May issue. The article raises important issues, such as how little people know about what data are being collected and traded, the value people place on their data privacy, and so on. In a separate post, I will discuss why I don't think the recommendations issued by the authors will resolve the issues they raised.…

Painting the full picture of the employment situation

May 5, 2015

It's very frustrating to read the mainstream articles about the recent unemployment report. For example, the New York Times said "U.S. Jobless Claims Hit 15-year Low." At this point, everyone should be aware of how employment statistics, in particular,...

Wakefield: Random Data Set (Part II)

April 30, 2015

This post is part II of a series detailing the GitHub package, wakefield, for generating random data sets. The first post (part I) was a test run to gauge user interest. I received positive feedback and some ideas for improvements, …

