Blog Archives

Reading Everything is Obvious by Duncan Watts

February 15, 2017

In his book, Everything is Obvious (Once You Know the Answer): Why Common Sense Fails, Duncan Watts, a professor of sociology at Columbia, imparts urgent lessons that are as relevant to his students as to self-proclaimed data scientists. It takes only nominal effort to generate narrative structures that retrace the past, Watts contends, but developing lasting theory that produces valid predictions requires much more effort than common sense. Watts’s is…

Butcher: which part of the leg do you want? Me: All of it, in five pieces please

February 14, 2017

This ABC News chart seems to have taken over the top of my Twitter feed, so I had better comment on it. Someone at ABC News tried really hard to dress up the numbers. The viz is obviously rigged - Obama...

Layered donuts have excess fats and oils

February 8, 2017

Via Twitter, Nicholas S. sent this chart: It's a layered donut. There isn't much context here except that the chart comes from USDA. Judging from the design, I surmise that the key message is the change in proportion by food...

Deep thinking about your data

February 3, 2017

In the ongoing series of posts about the IMDB dataset from Kaggle, I have so far looked at several of the scraped variables, including the number of faces on movie posters (1, 2), plot keywords (3), and movie rating by title year (4). In this post, I tackle the variables resulting from a data merge between IMDB and Facebook. These columns have names like "Director Facebook Likes", "Actor 1 Facebook…

February talks, and exploratory data analysis using visuals

January 30, 2017

News: In February, I am bringing my dataviz lecture to various cities: Atlanta (Feb 7), Austin (Feb 15), and Copenhagen (Feb 28). Click on the links for free registration. I hope to meet some of you there. *** On the...

Pre-processing data is not just about correcting errors

January 30, 2017

Exploration of IMDB rating data, by Kaiser Fung, founder of Principal Analytics Prep

Apparently Hollywood does not recycle action-movie plots. The data said so, so it must be right

January 25, 2017

Today I continue to explore the movie dataset found on Kaggle. To catch up with previous work, see blog posts 1 and 2. One of the students came up with an interesting problem. Within the genre of action movies, are there particular plot elements that are correlated with box office? This problem is solvable because the dataset contains a variable called "plot keywords" lifted from IMDB. Plot keywords are…

Numbersense and government accountability in the new political reality

January 24, 2017

You've often heard me say that numbersense is the most important quality for good data analysts; little did I know that numbersense would become the new requirement for a healthy American democracy. From his first day in office, the new President has been at war with numbers (over attendance figures at his inauguration). But I believe that getting to the bottom of data-driven claims is a bipartisan issue: while it is obvious that…

Lines that delight, lines that blight

January 23, 2017

This WSJ graphic caught my eye. The accompanying article is here. The article (judging from the sub-header) makes two separate points, one about the total amount of money raised in IPOs in a year, and the change in market value...

Counting is hard, especially when you don’t have theories

January 19, 2017

Exploring the data about movies, uncovering data issues

