Occupational hazards in data science

April 17, 2013

An interesting episode is developing in econometrics over the very high-profile Reinhart-Rogoff paper, which was heavily cited as a source to "prove" that high levels of national debt impede growth. It appears that the result was based on a combination of spreadsheet errors and bad assumptions. 1. Andrew Gelman has a great discussion here. His main concern is the ethics of data analysts. This is a very important point -…


Quantile regression: Better than connecting the sample quantiles of binned data

April 17, 2013

I often see variations of the following question posted on statistical discussion forums: I want to bin the X variable into a small number of values. For each bin, I want to draw the quartiles of the Y variable for that bin. Then I want to connect the corresponding quartile [...]


Interview with a forced convert from Matlab to R

April 17, 2013

Here is an interview with Ron Hochreiter, Assistant Professor at WU Vienna University of Economics and Business. In 25 words or less tell us what you do (using German words is cheating). I consider myself a data scientist (teaching and research) with roots in Mathematical Programming, i.e. Optimization under Uncertainty (Stochastic Programming). You were an […]


Reinhart & Rogoff: Everyone makes coding mistakes, we need to make it easy to find them + Graphing uncertainty

April 17, 2013

You may have already seen a lot written on the replication of Reinhart & Rogoff’s (R&R) much-cited 2010 paper done by Herndon, Ash, and Pollin. If you haven’t, here is a roundup of some of what has been written: Konczal, Y...


Memo to Reinhart and Rogoff: I think it’s best to admit your errors and go on from there

April 17, 2013

Jeff Ratto points me to this news article by Dean Baker reporting the work of three economists, Thomas Herndon, Michael Ash, and Robert Pollin, who found errors in a much-cited article by Carmen Reinhart and Kenneth Rogoff analyzing historical statistics of economic growth and public debt. Mike Konczal provides a clear summary; that’s where I [...]


I wish economists made better plots

April 16, 2013

I'm seeing lots of traffic on a big-time economics article that failed to reproduce, and here are my quick thoughts. You can read a pretty good summary here by Mike Konczal. Quick background: Carmen Reinhart and Kenneth Rogoff wrote …


My talk in Chicago this Thurs 6:30pm

April 16, 2013

Choices in Visualizing Data. This time, it’s not at the university, it’s at a data science meetup. Here are the slides. I actually prefer the term “statistical graphics” or “visualizing quantitative information” rather than “visualizing data.” I spend a lot of time graphing inferences and fitted models, understanding my fits and doing exploratory model analysis. [...]


Flotsam 11: mostly on books

April 16, 2013

‘No estaba muerto, andaba de parranda’† as the song says. Although rather than partying it has mostly been reading, taking pictures and trying to learn how to record sounds. Here are some things I’ve come across lately. I can’t remember if I’ve recommended Matloff’s The Art of R Programming before; if I haven’t, go […]


Test Driven Analysis?

April 16, 2013

At the last LondonR meeting, Francine Bennett from Mastodon C shared some of her experience and findings from an analysis of a large prescriptions data set from the UK's National Health Service (NHS). However, it was her last slide that I found the most...


RStudio is reminding me of the older Macs

April 16, 2013

The only thing missing is the cryptic ID number. Well, the only bad thing is that I am trying to run a probabilistic graphical model on some real data, and having a crash like this will definitely slow things down.


Four-day course in doing Bayesian data analysis, June 10-13

April 16, 2013

There will be a four-day introductory course in doing Bayesian data analysis, June 10-13 (2013), at the University of St. Gallen, Switzerland. The course is offered through the Summer School in Empirical Research Methods. Complete info is at this link:...


MCMSki IV, Jan. 6-8, 2014, Chamonix (news #5)

April 15, 2013

More exciting news about MCMSki IV! First things first, the 16 contributed sessions are now all set, having gotten the stamp of approval from the scientific committee! Thanks to everyone who submitted a session proposal. (There were so many proposals that we alas had to reject some, as well as every single talk proposal… Sorry people: […]


Isotonic Regression

April 15, 2013

My latest contribution for scikit-learn is an implementation of the isotonic regression model that I coded with Nelle Varoquaux and Alexandre Gramfort. This model finds the best least squares fit to a set of points, given the constraint that the f...
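
To make the idea concrete, here is a minimal sketch of a least-squares fit under a monotonicity constraint. It assumes the truncated constraint above is the usual one for isotonic regression, namely that the fitted values are non-decreasing. The function name `isotonic_fit` and the sample data are hypothetical, and this pure-Python pool-adjacent-violators loop is for illustration only; it is not the scikit-learn implementation.

```python
def isotonic_fit(y):
    """Non-decreasing least-squares fit to y (pool adjacent violators)."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Every point in a block is fitted by that block's mean.
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit


# Hypothetical data: two out-of-order pairs get pooled to their means.
fitted = isotonic_fit([1, 3, 2, 4, 3, 5])
print(fitted)
```

Each time a new point breaks the ordering, it is pooled with its neighbours and replaced by their average, which is what keeps the result both monotone and optimal in the least-squares sense.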


Data science only poses a threat to (bio)statistics if we don’t adapt

April 15, 2013

We have previously mentioned on this blog how statistics needs better marketing. Recently, Karl B. has suggested that “Data science is statistics” and Larry W. has wondered if “Data science is the end of statistics?” I think there are a …


How effective are football coaches?

April 15, 2013

Dave Berri writes: A recent study published in the Social Science Quarterly suggests that these moves may not lead to the happiness the fans envision (HT: the Sports Economist). E. Scott Adler, Michael J. Berry, and David Doherty looked at coaching changes from 1997 to 2010. What they found should give pause to people who [...]


Doing legwork, doing justice

April 15, 2013

The New York Times brought attention to the Bronx courtrooms this weekend. (link) The following small-multiples chart effectively illustrates how the Bronx system is uniquely unproductive, compared to the other boroughs: The above chart shows the outcomes. The next chart...


The role of statistics in the top public health achievements of the 20th century

April 15, 2013

In this International Year of Statistics, I'd like to describe the major role of statistics in public health advances. In our modern society, it is sometimes difficult to recall the huge advances in health and medicine in the 20th century. To name a few: penicillin was discovered in 1928, risk [...]


Stock-picking opportunity and the ratio of variabilities

April 15, 2013

How good is the current opportunity to pick stocks relative to the past? Idea: The more stocks act differently from each other relative to how volatile they are, the more opportunity there is to benefit by selecting stocks. This post looks at a particular way of investigating that idea. Data: Daily (log) returns of 442 …


Simulating the Gambler’s Ruin

April 15, 2013

The gambler’s ruin problem is one where a player has a probability p of winning and probability q of losing. For example, let’s take a skill game where player x can beat player y with probability 0.6 by getting closer to a target. The game play begins with player x being allotted 5 points and player y [...]
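
As a rough illustration of the setup described above, the following sketch simulates the game in Python and compares the result with the standard closed-form ruin probability. Only p = 0.6 and player x's 5 starting points come from the excerpt. Player y's starting points (10 here), the rule that the round winner takes one point from the loser, and all function names are assumptions made for the example.

```python
import random


def simulate_game(p, x_points, y_points, rng):
    """Play until one player holds every point; return True if x wins."""
    total = x_points + y_points
    while 0 < x_points < total:
        # Assumed rule: the round winner takes one point from the loser.
        x_points += 1 if rng.random() < p else -1
    return x_points == total


def x_win_probability(p, x_points, y_points):
    """Closed-form probability that x wins everything (valid for p != 0.5)."""
    r = (1 - p) / p
    return (1 - r ** x_points) / (1 - r ** (x_points + y_points))


rng = random.Random(2013)            # fixed seed so the run is repeatable
p, x_start, y_start = 0.6, 5, 10     # y_start = 10 is an assumed value
n_games = 20000

wins = sum(simulate_game(p, x_start, y_start, rng) for _ in range(n_games))
estimate = wins / n_games
exact = x_win_probability(p, x_start, y_start)
print(f"simulated: {estimate:.3f}, closed form: {exact:.3f}")
```

With enough simulated games the estimate should settle near the closed-form value, which is a handy sanity check on the simulation loop itself.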


Checking the Goodness of Fit of the Poisson Distribution in R for Alpha Decay by Americium-241


Introduction: Today, I will discuss the alpha decay of americium-241 and use R to model the number of emissions from a real data set with the Poisson distribution. I was especially intrigued to learn about the use of Am-241 in smoke detectors, and I will elaborate on this clever application. I will then use the Pearson chi-squared […]


Good, Bad and Wrong: Videos about Confidence Intervals

April 14, 2013

Videos are useful teaching and learning resources. There is much talk about “flipped classrooms” and the wonders of Khan Academy, and YouTube abounds with videos about everything…really! Even television news reports show YouTube clips. Teachers and instructors use videos in their …


BayesComp homepage

April 14, 2013

Today, the BayesComp section of ISBA launched its website. It is organised as a wiki and members of the section are strongly encouraged to take part in the construction of the website. To quote from Peter Green’s introduction: This new Wikidot site aims to be a community-edited resource on all aspects of Bayesian computation, available […]


Detecting predictability in complex ecosystems

April 14, 2013

A couple of people pointed me to a recent article, “Detecting Causality in Complex Ecosystems,” by fisheries researchers George Sugihara, Robert May, Hao Ye, Chih-hao Hsieh, Ethan Deyle, Michael Fogarty, and Stephan Munch. I don’t know anything about ecology research but I could imagine this method being useful in that field. I can’t see the approach [...]


