Posts Tagged ‘Models’

A call for less automation, more transparency in digital advertising

August 8, 2017
Kaiser Fung, founder of Principal Analytics Prep and founding director of Applied Analytics at Columbia, calls for reform in the digital advertising industry to address concerns about measurability and accountability.

If you are using Facebook Ads split testing (A/B testing), stop fooling yourself

July 26, 2017
Kaiser Fung, founder of Principal Analytics Prep and former director of Applied Analytics at Columbia University, explains why you can't run proper A/B tests on Facebook.

An email from LinkedIn

July 5, 2017

Last week, I got served a dose of predictive analytics. I got an email solicitation from LinkedIn, presenting a list of jobs that they think I might be interested in. This email is algorithmically generated, and LinkedIn tells me that I received it because I clicked on a job posting for a senior data analyst position at Ogilvy & Mather, a top advertising agency based in Manhattan. Yes, I did…

The get-rich-quick scheme of the English

March 31, 2017

The World Economic Forum published this chart: The "EF EPI Score" is a measure of English proficiency. So the evidence is clear as day: "Better English and Income Go Hand in Hand," as their headline blares. Last time I was in the New York subway, the panhandler spoke good English. What's a blogger to do? I pulled out the EPI scores from the EPI report, and downloaded the Gross National…

Confused by machines, or spooked by the machine-makers

March 29, 2017
This New York Times article draws attention to real trends in the investment industry but gets completely lost in the smoke around those pushing "machines" and "data". The trend that most concerns the industry is the sustained, large-scale outflow of money from "actively-managed" funds, of which mutual funds are the biggest category. The industry makes loads of money from management fees by promoting the idea that investors are…

Reading Everything is Obvious by Duncan Watts

February 15, 2017

In his book, Everything is Obvious (Once You Know the Answer): Why Common Sense Fails, Duncan Watts, a professor of sociology at Columbia, imparts urgent lessons that are as relevant to his students as to self-proclaimed data scientists. It takes only nominal effort to generate narrative structures that retrace the past, Watts contends, but developing lasting theory that produces valid predictions requires much more effort than common sense. Watts’s is…

Deep thinking about your data

February 3, 2017

In the ongoing series of posts about the IMDB dataset from Kaggle, I have so far looked at several of the scraped variables, including the number of faces on movie posters (1, 2), plot keywords (3), and movie rating by title year (4). In this post, I tackle the variables resulting from a data merge between IMDB and Facebook. These columns have names like "Director Facebook Likes", "Actor 1 Facebook…

Pre-processing data is not just about correcting errors

January 30, 2017

Exploration of IMDB rating data, by Kaiser Fung, founder of Principal Analytics Prep

Apparently Hollywood does not recycle action-movie plots. The data said so, so it must be right

January 25, 2017

Today I continue to explore the movie dataset found on Kaggle. To catch up with previous work, see blog posts 1 and 2. One of the students came up with an interesting problem. Within the genre of action movies, are there particular plot elements that correlate with box office? This problem is solvable because the dataset contains a variable called "plot keywords" lifted from IMDB. Plot keywords are…

Counting is hard, especially when you don’t have theories

January 19, 2017

Exploring the data about movies, uncovering data issues

