## Open and Closed Intervals: A Problem for ML Inference But Not Bayes

February 21, 2014

Does maximum likelihood inference have a support problem?  Maximum likelihood (ML) has a problem with parameters that take values in open sets (Is that all of them? Almost!). Bayesian inference doesn't obviously have this problem.  Briefly, using max...

## Here’s why the scientific publishing system can never be "fixed"

February 21, 2014

There's been much discussion recently about how the scientific publishing system is "broken". Just the latest one that I saw was a tweet from Princeton biophysicist Josh Shaevitz: Editor at a ‘fancy’ journal to my postdoc “This is amazing work …

## One day discount on Practical Data Science with R

February 21, 2014

Please forward and share this discount offer for our upcoming book. Manning Deal of the Day February 22: Half off Practical Data Science with R. Use code dotd022214au at www.manning.com/zumel/.

## Canvas Shows All the Lint Generated by a Clothes Dryer during One Year

February 21, 2014

The concept is original, yet simple. Assistant Professor of Arts Technology Rick Valentin and his partner created a life-size physical visualization of all the lint that they collected from their clothes dryer during the last year. The work thus cons...

## The world’s most popular languages that the Mac documentation hasn’t been translated into

February 21, 2014

I was updating my Mac and noticed the following: Lots of obscure European languages there. That got me wondering: what’s the least obscure language not on the above list? Igbo? Swahili? Or maybe Tagalog? I did a quick google and found this list of languages by number of native speakers. Once you see the list, […]

## Pets may need shelter from this terrible chart

February 21, 2014

Josh tweeted quite a shocking attack ad to me last week. He told me it came from the DC Metro. The ad is taken out by a group called HumaneWatch.Org, which apparently is a watchdog checking up on charity organizations....

## Forecasting within limits

February 21, 2014

It is common to want forecasts to be positive, or to require them to be within some specified range. Both of these situations are relatively easy to handle using transformations. Positive forecasts: to impose a positivity constraint, simply work on the log scale. With the forecast package in R, this can be handled by setting the Box-Cox parameter to 0, which corresponds to the log transform. For example, consider the real price of a dozen eggs…
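The positivity idea in this excerpt can be sketched in a few lines: model the logged series, then exponentiate the forecasts, which can never be negative. This is only an illustration in Python, not the forecast package's method. The function name `log_scale_trend_forecast`, the plain linear trend standing in for a real forecasting model, and the sample prices are all assumptions made for the example.

```python
import math


def log_scale_trend_forecast(y, h):
    """Forecast h steps ahead from a linear trend fitted to log(y).

    Hypothetical helper: the linear trend is a stand-in for whatever
    model would really be used; the point is the log/exp round trip.
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")
    if any(v <= 0 for v in y):
        raise ValueError("the log transform needs strictly positive data")

    z = [math.log(v) for v in y]          # work on the log scale
    n = len(z)
    t_mean = (n - 1) / 2
    z_mean = sum(z) / n

    # Ordinary least squares for z_t = intercept + slope * t
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxz = sum((t - t_mean) * (zt - z_mean) for t, zt in enumerate(z))
    slope = sxz / sxx
    intercept = z_mean - slope * t_mean

    # Back-transform: exp() of any real number is positive
    return [math.exp(intercept + slope * (n + k)) for k in range(h)]


# Made-up, steadily falling prices: a trend fitted on the raw scale would
# eventually cross zero, while the log-scale forecasts stay positive.
prices = [10.0, 8.0, 6.5, 5.0, 4.1, 3.2]
forecasts = log_scale_trend_forecast(prices, 12)
```

Because the back-transform is `exp`, every value in `forecasts` is strictly positive regardless of how steep the fitted decline is.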

## STEPHEN SENN: Fisher’s alternative to the alternative

February 21, 2014

Reblogging 2 years ago: By: Stephen Senn This year [2012] marks the 50th anniversary of RA Fisher’s death. It is a good excuse, I think, to draw attention to an aspect of his philosophy of significance testing. In his extremely interesting essay on Fisher, Jimmie Savage drew attention to a problem in Fisher’s approach to testing. […]

## Applied Statistics Lesson of the Day – The Matched Pairs Experimental Design

The matched pairs design is a special type of the randomized blocked design in experimental design.  It has only 2 treatment levels (i.e. there is 1 factor, and this factor is binary), and a blocking variable divides the experimental units into pairs.  Within each pair (i.e. each block), the experimental units are randomly assigned to the […]
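As a rough illustration of the randomization step described above, the sketch below assigns the two treatment levels at random within each pair. It is written in Python purely for illustration. The function name `assign_matched_pairs`, the treatment labels, and the subject IDs are hypothetical and do not come from the post.

```python
import random


def assign_matched_pairs(pairs, treatments=("control", "treatment"), seed=None):
    """Randomly assign the two treatment levels within each matched pair.

    `pairs` is a list of 2-tuples of unit identifiers; each pair is one block.
    """
    rng = random.Random(seed)  # local generator so a seed gives reproducible output
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)     # coin flip: which unit gets which level
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment


# Hypothetical units, already paired on the blocking variable
pairs = [("s1", "s2"), ("s3", "s4"), ("s5", "s6")]
assignment = assign_matched_pairs(pairs, seed=2014)
```

Every pair ends up with exactly one unit at each treatment level, so the comparison is always made within a block.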

## The gap between data mining and predictive models

February 21, 2014

The Facebook data science blog shared some fun data explorations this Valentine’s Day in Carlos Greg Diuk’s “The Formation of Love”. They are rightly receiving positive interest in and positive reviews of their work (for example Robinson Meyer’s Atlantic article). The finding is also a great opportunity to discuss the gap between cool data mining […]

## How to fake a sophisticated knowledge of wine with Markov Chains

February 20, 2014

To the untrained (like me), wine criticism may seem like an exercise in pretentiousness. It may seem like anybody following a set of basic rules and knowing the proper descriptors can feign sophistication (at least when it comes to wine).…

## The Kernel of Truth in Frequentism: How Frequentists Screw it up

February 20, 2014

The last two posts (here and here) showed how Frequentists conflate the most likely frequency histogram with true probability distributions. The usual Bayesian complaint is that this limits statistics to repeated trials with stable frequency distributi...

## Correlation is evidence of causation

February 20, 2014

In class last week, I was talking about correlation and linear regression, and I made the outrageous claim that correlation is evidence of causation.  One of my esteemed colleagues, who is helping out with the class, was sitting in the back of the...

## Data Analysis for Genomics MOOC

February 20, 2014

Last month I told you about Coursera's specializations in data science, systems biology, and computing. Today I was reading Jeff Leek's blog post defending p-values and found a link to HarvardX's Data Analysis for Genomics course, taught by Rafael Iriz...

## Do differences between biology and statistics explain some of our diverging attitudes regarding criticism and replication of scientific claims?

February 20, 2014

Last month we discussed an opinion piece by Mina Bissell, a nationally-recognized leader in cancer biology. Bissell argued that there was too much of a push to replicate scientific findings. I disagreed, arguing that scientists should want others to be able to replicate their research, that it’s in everyone’s interest if replication can be done […]

## R.A. Fisher: ‘Two New Properties of Mathematical Likelihood’

February 20, 2014

Exactly 1 year ago: I find this to be an intriguing discussion–before some of the conflicts with N and P erupted.  Fisher links his tests and sufficiency, to the Neyman and Pearson lemma in terms of power.  It’s as if we may see them as ending up in a similar place while starting from different […]

## More on Product Terms and Interaction in Logistic Regression Models

February 20, 2014

I noticed that Bill Berry, Justin Esarey, and Jackie DeMeritt's (BDE) long-time R&R'ed paper at AJPS is finally forthcoming. I really like seeing highly applied, but rigorous, work like this being published at top journals. You should definitely have a look at their paper if you use logit or probit models to argue for interaction. […]

## R in Insurance 2014 Conference Poster

February 20, 2014

Here is the poster for the 2nd R in Insurance conference on Monday 14 July 2014 at Cass Business School in London. Important deadlines to keep in mind: Abstract submissions: 28 March 2014. Early b...

## Backcasting in R

February 20, 2014

Sometimes it is useful to “backcast” a time series, that is, to forecast in reverse time. Although there are no built-in R functions to do this, it is very easy to implement. Suppose x is our time series and we want to backcast for a given number of periods. Here is some code that should work for most univariate time series. The example is non-seasonal, but the code will also work with seasonal data.…
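The post's own R code is not reproduced in this excerpt. As a hedged sketch of the general idea only, backcasting can be done by reversing the series, forecasting the reversed series forward, and reversing the result back into chronological order. The Python below assumes a horizon named `h` and uses a simple drift forecaster (`drift_forecast`) as a hypothetical stand-in for a real forecasting model.

```python
def drift_forecast(x, h):
    """Stand-in forecaster (an assumption): extend the series by its average step."""
    step = (x[-1] - x[0]) / (len(x) - 1)
    return [x[-1] + step * (k + 1) for k in range(h)]


def backcast(x, h, forecaster=drift_forecast):
    """Estimate the h values preceding x by forecasting the time-reversed series."""
    reversed_series = list(reversed(x))
    future_of_reversed = forecaster(reversed_series, h)
    # The first forecast of the reversed series is the value just before x[0],
    # so reverse again to return the backcasts in chronological order.
    return list(reversed(future_of_reversed))


x = [10, 12, 14, 16]
print(backcast(x, 3))  # prints [4.0, 6.0, 8.0]
```

Any function with the signature `forecaster(series, h)` can be swapped in, so the reversal logic stays separate from the choice of model.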

## Identification of ARMA processes

February 20, 2014

Last week (in the MAT8181 course) in order to identify the orders of an ARMA process, we’ve seen the eacf method, and I mentioned the scan method, introduced in Tsay and Tiao (1985). The code below – to produce the output of the scan proce...

## evaluating stochastic algorithms

February 19, 2014

Reinaldo sent me this email a long while ago: “Could you recommend me a nice reference about measures to evaluate stochastic algorithms (in particular focus in approximating posterior distributions).” And I hope he is still reading the ‘Og, despite my lack of prompt reply! I procrastinated and procrastinated in answering this question as I did not […]

## Selfie City: a Visualization-Centric Analysis of Online Self-Portraits

February 19, 2014

Selfie City [selfiecity.net], developed by Lev Manovich, Moritz Stefaner, Mehrdad Yazdani, Dominikus Baur and Alise Tifentale, investigates the socio-popular phenomenon of self-portraits (or selfies) by using a mix of theoretic, artistic and quantitat...