Posts Tagged ‘data’

Statistics as inverse probability

April 22, 2013

Statistics is sometimes described as inverse probability. In a typical probability problem, one starts by positing that a certain quantity has some given probability distribution (say, the number of people entering a bank branch follows a Poisson distribution), and then goes on to compute probabilities such as the chance that more than 100 people (max capacity) require service at the same time. In a typical statistical problem, one observes the…
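As a rough illustration of the "forward" probability calculation described in this excerpt, the sketch below computes the chance that a Poisson count exceeds a capacity of 100. The average rate of 80 is a made-up value (the excerpt gives none), the function name `poisson_tail` is hypothetical, and treating the number of people needing service at once as a single Poisson count is a simplifying assumption.

```python
import math


def poisson_tail(rate: float, threshold: int) -> float:
    """Return P(X > threshold) for X ~ Poisson(rate)."""
    term = math.exp(-rate)  # P(X = 0)
    cdf = term
    for k in range(1, threshold + 1):
        term *= rate / k  # P(X = k) obtained from P(X = k - 1)
        cdf += term
    # Guard against tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - cdf)


# Assumed average of 80 people (hypothetical); capacity of 100 is from the text.
print(poisson_tail(rate=80.0, threshold=100))
```

The statistical ("inverse") problem runs the other way: the counts are observed and the rate is what has to be estimated.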

Read more »

FDA endorses masking placebos as proven drugs as a get-rich-quick scheme

April 19, 2013

That's not really what the FDA said, but based on the shameful actions unearthed by ProPublica and reported by Scientific American here, one would think one can get away with this scam. Three years ago, the FDA busted a Houston lab (based on a whistleblower report) that fabricated loads of research studies that were used by the FDA to approve about 100 drugs. Eighty percent of those drugs were generic drugs…

Read more »

Occupational hazards in data science

April 17, 2013

An interesting episode is developing in econometrics over the very high-profile Reinhart-Rogoff paper that was heavily cited as a source to "prove" that high levels of national debt impede growth. It appears that the result was based on a combination of spreadsheet errors and bad assumptions. 1. Andrew Gelman has a great discussion here. His main concern is the ethics of data analysts. This is a very important point -…

Read more »

Doing legwork, doing justice

April 15, 2013

The New York Times brought attention to the Bronx courtrooms this weekend. (link) The following small-multiples chart effectively illustrates how the Bronx system is uniquely unproductive, compared to the other boroughs: The above chart shows the outcomes. The next chart...

Read more »

Generalized Pairs Plot: It’s about time!

March 28, 2013

JW Emerson, WA Green, B Schloerke, J Crowley, D Cook, H Hofmann, H Wickham (2013) The Generalized Pairs Plot. Journal of Computational and Graphical Statistics 22(1). Here's a free preprint version. Until this new paper and implementation by Emerson et al., there were no widely available pairs plots that accommodated both numerical and categorical fields. [...]

Read more »

The state of charting software

March 28, 2013

Andrew Wheeler took the time to write code (in SPSS) to create the "Scariest Chart ever" (link). I previously wrote about my own attempt to remake the famous chart in grayscale. I complained that this is a chart that is...

Read more »

Mix percent metaphors, add average confusion, and serve

March 27, 2013

Sometimes, a chart just strains your mind. Such is the case with the following, a tip from Augustine F. (@acfou). There are so many percentages on the chart that it's really hard to figure out which is which. Under the...

Read more »

Cat and dog food, for thought

March 21, 2013

My friend Rhonda (@RKDrake) points me to this pair of charts (in BusinessWeek). They are fun to look at and to ponder. Here's the first chart: Should the countries be colored according to the distance from the Equator? Is this...

Read more »

In search of the honest credit repair shop

March 19, 2013

The FTC made headlines recently by complaining about inaccurate consumer credit reports. The Wall Street Journal (link) has a typical report on this research. Here's their summary: In the FTC study, 262 of the 1,001 people who reviewed their credit reports spotted at least one potential "material" mistake, such as a credit-card account that wasn't theirs or a late payment that they didn't believe was late. That sounds like a worrisome…

Read more »

Simulated Power/Precision Analysis

February 22, 2013

I cringe when I see research proposals that describe a sophisticated statistical approach, yet do not evaluate this approach in their power/precision/sample size planning. It's often the case that a simplified version of the proposed statistical approach is used instead. Presumably, this is due to the limited availability of power/precision/sample size planning software for sophisticated [...]
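To make the idea of simulation-based power planning concrete, here is a minimal sketch: generate data under an assumed effect, run the planned analysis on each simulated data set, and report the share of runs that reject the null. The two-group comparison with a normal-approximation test, the function name `simulated_power`, and all parameter values are stand-in assumptions for illustration, not the approach from the full post; in practice the analysis step would be replaced by the sophisticated method actually proposed.

```python
import random
import statistics


def simulated_power(effect, n_per_group, n_sims=500, seed=1):
    """Estimate power as the fraction of simulated studies that reject H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Assumed data-generating model: two normal groups with unit SD.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        # Standard error of the difference in means (unequal-variance form).
        se = (statistics.variance(a) / n_per_group
              + statistics.variance(b) / n_per_group) ** 0.5
        z = (statistics.mean(b) - statistics.mean(a)) / se
        # Two-sided test at roughly the 5% level (normal approximation).
        if abs(z) > 1.96:
            rejections += 1
    return rejections / n_sims


# Hypothetical planning question: power to detect a 0.5 SD difference.
for n in (20, 50, 100):
    print(n, simulated_power(effect=0.5, n_per_group=n))
```

Running the same loop with `effect=0.0` gives a quick check that the rejection rate stays near the nominal 5%.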

Read more »
