## Bayes 2014 coming up nicely!

January 8, 2014
Today I had a very useful teleconference with the other members of the organising committee for the Bayes Pharma 2014 conference. The new website is already up and running and we've included some information. We've nearly finalised all the details and t...

## New year, new cost-effectiveness thresholds?

January 8, 2014
Karl Claxton and colleagues at the University of York have recently published a working paper on Methods for the Estimation of the NICE Cost Effectiveness Threshold. Since a guideline was issued in 2004, NICE has used standard values of £20-30,000 per...

## Bayesian variable selection in multiple regression: Sensitivity to prior

January 8, 2014
[Notice updates appended January 10, 2014.] In multiple regression, analysts are often concerned with variable selection: Of the many predictor variables, which ones should be selected for inclusion in the model? Bayesian model comparison seems to be a...

## Avinash’s magnificent zoo, and the unfulfilled promise of Big Data

January 8, 2014
Avinash (Web Analytics 2.0) is fond of animal metaphors. I think he's the one who coined HIPPOs (Highest Paid Person's Opinion). Now he's come up with Reporting Squirrels and Analysis Ninjas. See his recent post here. In short, he is shouting about "return on analytics", a really, really important thing. What is being lost in the hype of Big Data is that all the investment in analytics has to generate…

## Belief aggregation

January 8, 2014
Johannes Castner writes: Suppose there are k scientists, each with her own model (Bayesian Net) over m random variables. Then, because the space of Bayesian Nets over these m variables, with the square-root of the Jensen-Shannon Divergence as a distance metric, is a closed and bounded space, there exists one unique Bayes Net that is […] The post Belief aggregation appeared first on Statistical Modeling, Causal Inference, and Social Science.
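
The distance mentioned in the excerpt can be illustrated with a minimal sketch. It computes the square root of the Jensen-Shannon divergence for two plain discrete distributions, not for full Bayesian networks as in the post. The function names and example distributions are assumptions made for illustration.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def js_distance(p, q):
    """Square root of the JS divergence, which satisfies the metric axioms."""
    return math.sqrt(max(js_divergence(p, q), 0.0))

# Hypothetical beliefs of three scientists over the same three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
r = [0.3, 0.4, 0.3]

print("d(p, q) =", js_distance(p, q))
print("d(p, r) + d(r, q) =", js_distance(p, r) + js_distance(r, q))
```

With base-2 logarithms the distance lies between 0 (identical distributions) and 1 (distributions with disjoint support), which is one way to see that the space is bounded.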

## The top 10 predictor takes on the debiased Lasso – still the champ!

January 8, 2014
After reposting on the comparison between the lasso and the always top 10 predictor (leekasso) I got some feedback that the problem could be that I wasn't debiasing the Lasso (thanks Tim T. on Twitter!). The idea behind debiasing (as I …

## Elements of Statistical Learning: A Stunningly Good Job of LaTeX to pdf to Web

January 8, 2014
A very Happy New Year to all! Here's a little thing to start us off. I happened to be thinking about principal-component regression vs. ridge regression yesterday, so as usual I consulted the Hastie-Tibshirani-Friedman (HTF) classic, El...

## How to display multinomial logit results graphically?

January 8, 2014
Adriana Lins de Albuquerque writes: Do you have any suggestions for the best way to represent multinomial logit results graphically? I am using Stata. My reply: I don’t know from Stata, but here are my suggestions: 1. If the categories are unordered, break them up into a series of binary choices in a tree structure […] The post How to display multinomial logit results graphically? appeared first on Statistical Modeling, Causal…

## Losing the big picture

January 8, 2014
One of the dangers of "Big Data" is the temptation to get lost in the details. You become so absorbed in the peeling of the onion that you don't realize your tear glands have dried up. Hans Rosling linked to...

## Applied Statistics Lesson of the Day – Choosing the Number of Levels for Factors in Experimental Design

The experimenter needs to decide the number of levels for each factor in an experiment. For a qualitative (categorical) factor, the number of levels may simply be the number of categories for that factor.  However, because of cost constraints, an experimenter may choose to drop a certain category.  Based on the experimenter’s prior knowledge or […]

## Machine Learning Lesson of the Day – Using Validation to Assess Predictive Accuracy in Supervised Learning

Supervised learning puts a lot of emphasis on building a model that has high predictive accuracy.  Validation is a good method for assessing a model’s predictive accuracy. Validation is the use of one part of your data set to build your model and another part of your data set to assess the model’s predictive accuracy. […]
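
The hold-out idea described above can be sketched in plain Python. The synthetic data, the 70/30 split, and the helper names `fit_line` and `mse` are assumptions made for illustration and are not taken from the original post.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def mse(a, b, xs, ys):
    """Mean squared prediction error of the line a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (assumed): y = 2 + 3x plus unit-variance Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

# Shuffle the row indices, then hold out 30% of the rows for validation.
indices = list(range(len(xs)))
rng.shuffle(indices)
cut = int(0.7 * len(indices))
train_idx, valid_idx = indices[:cut], indices[cut:]

train_x = [xs[i] for i in train_idx]
train_y = [ys[i] for i in train_idx]
valid_x = [xs[i] for i in valid_idx]
valid_y = [ys[i] for i in valid_idx]

# Build the model on the training part only.
a, b = fit_line(train_x, train_y)

# Assess predictive accuracy on data the model has never seen.
train_mse = mse(a, b, train_x, train_y)
valid_mse = mse(a, b, valid_x, valid_y)
print("training MSE:", train_mse)
print("validation MSE:", valid_mse)
```

The validation error is the number to report as an estimate of predictive accuracy, because the training error is computed on the same rows that were used to choose `a` and `b`.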

## Connecting TOAD For MySQL, MySQL Workbench, and R to Amazon AWS EC2 Using SSH Tunneling

January 8, 2014
I often use Amazon EC2 to store and retrieve data when I need either additional storage or higher computing capacity.  In this tutorial I’ll share how to connect to a MySQL database so that one can retrieve the data and do the analysis.  I tend to use either TOAD for MySQL or MySQL Workbench to run […]

## “Philosophy of Statistical Inference and Modeling” New Course: Spring 2014: Mayo and Spanos: (Virginia Tech)

January 8, 2014
New course for Spring 2014: Thursday 3:30-6:15. Phil 6334: Philosophy of Statistical Inference and Modeling. D. Mayo and A. Spanos. Contact: error@vt.edu. This new course, to be jointly taught by Professors D. Mayo (Philosophy) and A. Spanos (Economics), will provide an in-depth introduction to graduate level research in philosophy of inductive-statistical inference and probabilistic […]

## Significant news

January 7, 2014
Good news on the second day back to work after the Christmas break: I've been invited to join the Editorial Board of Significance magazine; of course I have happily accepted the invitation! I have always been a big fan of the magazine (in fact I...

## 13 popular articles from 2013

January 7, 2014
In 2013 I published 110 blog posts. Some of these articles were more popular than others, often because they were linked to from a SAS newsletter such as the SAS Statistics and Operations Research News. In no particular order, here are some of my most popular posts from 2013, organized [...]

## Preparing for tenure track job interviews

January 7, 2014
Editor's note: This is a slightly modified version of a previous post. If you are in the job market you will soon be receiving (or have already received) an invitation for an interview. So how should you prepare? You have two goals. The …

## My recent debugging experience

January 7, 2014
OK, so this sort of thing happens sometimes. I was working on a new idea (still working on it; if it ultimately works out, or if it doesn’t, I’ll let you know) and as part of it I was fitting little models in Stan, in a loop. I thought it would make sense to start with linear […] The post My recent debugging experience appeared first on Statistical Modeling, Causal Inference, and Social…

## Text Mining: The Next Data Frontier – Scientific Computing

January 7, 2014
From: http://www.scientificcomputing.com/articles/2014/01/text-mining-next-data-frontier#.UswIHNLuLTo (Mon, 01/06/2014 - 2:04pm, Mark Anawis). By some estimates, 80 percent of available information occurs as free-form text. Text Mining: The Next Data Front...

## MCMSki IV [day 1.5]

January 7, 2014
The afternoon sessions I attended were “Computational and Methodological Challenges in evidence synthesis and multi-step” organised by Nicky Best and Sylvia Richardson and “Approximate inference” put together by Dan Simpson. Since both Nicky and Sylvia were alas unable to attend MCMSki, I chaired their session, which I found most interesting as connected to a recurrent […]

## From spreadsheet thinking to R thinking

January 7, 2014
Towards the basic R mindset. Previously: The post “A first step towards R from spreadsheets” provides an introduction to switching from spreadsheets to R. It also includes a list of additional posts (like this one) on the transition. Add two columns: Figure 1 shows some numbers in two columns and the start of adding those […] The post From spreadsheet thinking to R thinking appeared first on Burns Statistics.

## Whale charts – Visualising customer profitability

January 7, 2014
The Christmas and New Year's break is over, yet there is still time to return unwanted presents. Return to Santa was the title of an article in the Economist that highlighted the impact on online retailers, as return rates can be alarmingly high. ...

## Machine Learning Lesson of the Day: Clustering, Density Estimation and Dimensionality Reduction

I struggle to categorize unsupervised learning.  It is not an easily defined field, and it is also hard to find generalizations of techniques that are exhaustive and mutually exclusive. Nonetheless, here are some categories of unsupervised learning that cover many of its commonly used techniques.  I learned this categorization from Mathematical Monk, who posted a […]