Posts Tagged ‘Bias’

Reading Everything is Obvious by Duncan Watts

February 15, 2017

In his book, Everything is Obvious (Once You Know the Answer): Why Common Sense Fails, Duncan Watts, a professor of sociology at Columbia, imparts urgent lessons that are as relevant to his students as to self-proclaimed data scientists. It takes only nominal effort to generate narrative structures that retrace the past, Watts contends, but developing lasting theory that produces valid predictions requires much more effort than common sense. Watts’s is…

Deep thinking about your data

February 3, 2017

In the ongoing series of posts about the IMDB dataset from Kaggle, I have so far looked at several of the scraped variables, including the number of faces on movie posters (1, 2), plot keywords (3), and movie rating by title year (4). In this post, I tackle the variables resulting from a data merge between IMDB and Facebook. These columns have names like "Director Facebook Likes", "Actor 1 Facebook…

Pre-processing data is not just about correcting errors

January 30, 2017

Exploration of IMDB rating data, by Kaiser Fung, founder of Principal Analytics Prep

Apparently Hollywood does not recycle action-movie plots. The data said so, so it must be right

January 25, 2017

Today I continue to explore the movie dataset, found on Kaggle. To catch up with previous work, see blog posts 1 and 2. One of the students came up with an interesting problem. Within the genre of action movies, are there particular plot elements that are correlated with box office? This problem is solvable because the dataset contains a variable called "plot keywords" lifted from IMDB. Plot keywords are…

ASA President meets OCCAM data

December 27, 2016

Just leaving this quote from ASA President Jessica Utts here (Source: Amstat News Dec 2016): A few days ago, I was in Vietnam and took a four-hour bus ride from Ha Long Bay to Hanoi. When I arrived, my fitness tracker had given me credit for taking 9,124 steps and climbing 81 flights of stairs during those four hours, even though I only left my seat once during a short…

This election forecasting business

November 15, 2016

If you live in the States, and particularly a blue state, in the last year or two, it has been drilled into your head that Hillary Clinton was the overwhelming favorite to win the Presidential election. On the day before the election, when all the major media outlets finalized their "election forecasting models," they unanimously pronounced Clinton the clear winner, with a probability of winning of 70% to 99%. One…

The idol worship of objective data is damaging our discipline

October 28, 2016

In class last week, I discussed this New York Times article with the students. One of the claims in the article is that the U.S. News ranking of colleges is under threat from newcomers whose rankings are more relevant because they more directly measure outcomes such as earnings of graduates. This specific claim in the article makes my head hurt: "If nothing else, earnings are objective and, as the database…

Reader’s guide to the power pose controversy 2

October 21, 2016

Yesterday, I started a series of posts covering the "power pose" research controversy. The plan is as follows:

Key Idea 1: Peer Review, Manuscripts, Pop Science and TED Talks
Key Idea 2: P < 0.05, P-hacking, Replication Studies, Pre-registration
Key Idea 3: Negative Studies, and the File Drawer (Today)
Key Idea 4: Degrees of Freedom, and the Garden of Forking Paths
Key Idea 5: Sample Size

Here is a quick…

Reader’s guide to the power pose controversy 1

October 20, 2016

I recently covered the power pose research controversy, ignited by an inflammatory letter by Susan Fiske (link). Dana Carney, one of the coauthors of the original power pose study, courageously came forward to disown the research, and explained the reasons why she no longer trusts the result. Here is her mea culpa. Her co-author, Amy Cuddy, then went to New York Magazine to publish her own corrective, claiming that the…

The plural of anecdote is not …

October 12, 2016

One of my favorite statistics-related wisecracks is: the plural of anecdote is not data. In today's world, the saying should really be: the plural of anecdote is not BIG DATA. In class this week, we discussed a recent Letter to the Editor of a top journal, the New England Journal of Medicine, featuring a short analysis of weight data coming from a digital scale that, you guessed it, makes users consent to…
