Posts Tagged ‘Bias’

Why Tesla owners pay more for insurance

June 6, 2017

Zerohedge has a nice article about the recent insurance rate hike for Tesla owners (link). Of course, Tesla is crying foul. The gist of the story is that Tesla owners tend to get into more crashes and that their repairs cost more than those of other brands. The former is a matter of statistics, but also of self-selection: in my limited encounters, the first things Tesla owners tell me about their cars…

Book review: Everybody Lies by Seth Stephens-Davidowitz

May 15, 2017

Kaiser Fung, founder of Principal Analytics Prep, discusses Seth Stephens-Davidowitz's new book, Everybody Lies

The Times agrees on privacy and kind of on fake news business

May 11, 2017

The New York Times Magazine has been publishing some pieces that directly relate to a couple of my blog posts. In this article, Amanda Hess noticed that "privacy became a commodity for the rich and powerful." This echoes my blog post on "Data is the next frontier of equal rights." Hess discussed the asymmetry and hypocrisy of the situation whereby the same businesses and business executives that are wantonly stripping…

Dispute over analysis of school quality and home prices shows social science is hard

April 24, 2017

Most of my friends with families fret over school quality when deciding where to buy their homes. It's well known that good school districts are also associated with expensive houses. A feedback cycle is at work here: home prices surge where there are good schools; only richer people can afford to buy such homes; wealth brings other advantages, and so the schools tend to have better students, which leads to…

My pre-existing United boycott, and some musing on randomness and fairness

April 12, 2017

You probably already saw the video; if not, do yourself a favor and search for "man forcibly removed from overbooked United flight." Apart from the video evidence, which is damning, we don't have many facts, only assertions made by various parties and repeated endlessly on social media and in the mainstream media. Some facts, such as the United CEO's claim that the passenger was "belligerent," are an assault on the meaning of…

The get-rich-quick scheme of the English

March 31, 2017

The World Economic Forum published this chart: The "EF EPI Score" is a measure of English proficiency. So the evidence is clear as day: "Better English and Income Go Hand in Hand," as their headline blares. Last time I was in the New York subway, the panhandler spoke good English. What's a blogger to do? I pulled out the EPI scores from the EPI report, and downloaded the Gross National…

Reading Everything is Obvious by Duncan Watts

February 15, 2017

In his book, Everything is Obvious (Once You Know the Answer): Why Common Sense Fails, Duncan Watts, a professor of sociology at Columbia, imparts urgent lessons that are as relevant to his students as to self-proclaimed data scientists. It takes only nominal effort to generate narrative structures that retrace the past, Watts contends, but developing lasting theory that produces valid predictions requires much more effort than common sense. Watts’s is…

Deep thinking about your data

February 3, 2017

In the ongoing series of posts about the IMDB dataset from Kaggle, I have so far looked at several of the scraped variables, including the number of faces on movie posters (1, 2), plot keywords (3), and movie rating by title year (4). In this post, I tackle the variables resulting from a data merge between IMDB and Facebook. These columns have names like "Director Facebook Likes", "Actor 1 Facebook…

Pre-processing data is not just about correcting errors

January 30, 2017

Exploration of IMDB rating data, by Kaiser Fung, founder of Principal Analytics Prep

Apparently Hollywood does not recycle action-movie plots. The data said so, so it must be right

January 25, 2017

Today I continue to explore the movie dataset found on Kaggle. To catch up with previous work, see blog posts 1 and 2. One of the students came up with an interesting problem. Within the action-movie genre, are there particular plot elements that are correlated with box office? This problem is solvable because the dataset contains a variable called "plot keywords" lifted from IMDB. Plot keywords are…
