This chart, from Internet Retailer (March 2012, p. 26), is okay; at least they didn't use pie charts. But it could have been much more effective. To make it better, we have to break all the rules: Use lines instead...

I am working on a project that requires the generation of Bernoulli outcomes. Typically, I would go about this using the built-in sample() function like so: This works great and is fast, even for large n. Problem is, I want to generate each sample with its own unique probability. Seems straightforward enough, I
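
As a rough illustration of the problem described in the excerpt above (one Bernoulli draw per observation, each with its own success probability), here is a minimal sketch. It is written in Python with only the standard library, not with the R sample() function the post mentions, and the name `bernoulli_draws` is hypothetical; it compares one uniform draw against each probability, which is one common approach and not necessarily the author's.

```python
import random


def bernoulli_draws(probs, seed=None):
    """Return one 0/1 outcome per entry of `probs`.

    Hypothetical helper for illustration: entry i is 1 with
    probability probs[i] and 0 otherwise.
    """
    rng = random.Random(seed)  # private generator so a seed gives repeatable draws
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p!r}")
    # rng.random() is uniform on [0, 1), so P(u < p) = p for each element.
    return [1 if rng.random() < p else 0 for p in probs]


if __name__ == "__main__":
    # Each observation gets its own success probability.
    print(bernoulli_draws([0.1, 0.5, 0.9, 0.99], seed=42))
```

Because every element needs only one uniform draw and one comparison, the cost stays linear in the number of observations.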

False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [I]t is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis. The culprit is a construct we refer to as researcher degrees of freedom. In the course of collecting and analyzing data, researchers have many decisions to make: Should [...]

Note to readers: There will be few updates to the blog in the next few days until southern Manhattan gets its power back. Note to AT&T: If you listen to your customers, you won't need to "investigate". Cell phone service has been unavailable since Monday. *** Lots of numbers are thrown around in the media all the time. Do you know how they are derived? This is a really important…

Regime Detection comes in handy when you are trying to decide which strategy to deploy. For example, there are periods (regimes) when Trend Following strategies work better and there are periods when Mean Reversion strategies work better. Today I want to show you one way to detect market Regimes. To detect market Regimes, I will fit [...]

Bayesian inference, conditional on the model and data, conforms to the likelihood principle. But there is more to Bayesian methods than Bayesian inference. See chapters 6 and 7 of Bayesian Data Analysis for much discussion of this point. It saddens me ...

U-Phil: I would like to open up this post, together with Gandenberger’s (Oct. 30, 2012), to reader U-Phils, from December 6-19 (< 1000 words) for posting on this blog (please see # at bottom of post). Where Gandenberger claims, “Birnbaum’s proof is valid and his premises are intuitively compelling,” I have shown that if Birnbaum’s [...]

Francesco and Andrea have asked me to join them in doing a short course before the conference of the Italian Health Economics Association (AIES). The course is about Bayesian statistics and health economics and will be in Rome on November 14th. I ...

I’m sorry I don’t have any new zombie papers in time for Halloween. Instead I’d like to be a little monster by reproducing a mini-rant from this article on experimental reasoning in social science: I will restrict my discussion to social science examples. Social scientists are often tempted to illustrate their ideas with examples from [...]

The Lance Armstrong story continues to regale. The biggest myth that is busted is that a series of negative tests prove anything. (This shouldn't be controversial to anyone who follows doping. Marion Jones, Tyler Hamilton, etc. etc. all passed hundreds of tests before getting caught.) Another myth in the doping circle is the "victimless crime". Of course there are losers. Apart from Armstrong, who now stands to lose both reputationally…

The determinant of a matrix arises in many statistical computations, such as in estimating parameters that fit a distribution to multivariate data. For example, if you are using a log-likelihood function to fit a multivariate normal distribution, the formula for the log-likelihood involves the expression log(det(Σ)), where Σ is the [...]
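
To make the log(det(Σ)) term in the excerpt above concrete, here is a minimal sketch of one standard way to evaluate it for a symmetric positive definite Σ: factor Σ = LLᵀ (Cholesky) and use log det Σ = 2 Σᵢ log Lᵢᵢ, which avoids forming the determinant itself. This is an illustration in plain Python under that assumption, not the method of the post, and the function names are hypothetical.

```python
import math


def cholesky_lower(a):
    """Lower-triangular L with a = L * L^T, for symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def log_det_spd(a):
    """log(det(a)) as twice the sum of the logs of the Cholesky diagonal."""
    low = cholesky_lower(a)
    return 2.0 * sum(math.log(low[i][i]) for i in range(len(a)))


if __name__ == "__main__":
    sigma = [[4.0, 2.0], [2.0, 3.0]]  # det = 4*3 - 2*2 = 8
    print(log_det_spd(sigma), math.log(8.0))
```

Working with the logs of the diagonal entries keeps the computation stable when det(Σ) itself would overflow or underflow, which can happen for large covariance matrices.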

Updated: 21 November 2012 Make is a marvellous tool used by programmers to build software, but it can be used for much more than that. I use make whenever I have a large project involving R files and LaTeX files, which means I use it for almost all of the papers I write, and almost all of the consulting reports I produce. If you are using a Mac or Linux, you…

The Future of Machine Learning (and the End of the World?) On Thursday (Oct 25) we had an event called the ML Futuristic Panel Discussion. The panelists were Ziv Bar-Joseph, Steve Fienberg, Tom Mitchell, Aarti Singh and Alex Smola. Ziv is an expert on machine learning and systems biology. Steve is a colleague of mine [...]

Microsoft Seeks an Edge in Analyzing Big Data: Microsoft is incorporating advanced computing technologies into many of its products, allowing users to comb huge amounts of data and get suggestions based on their habits.

A short visit to ISU, and therefore a busy and profitable day! About ten appointments in Snedecor Hall after a nice morning run, a highly attended Zyskind Lecture, and many interesting discussions throughout the day: e.g., I had a great time discussing using null recurrent Markov chains for integral approximations with Krishna [...]