The top 10 predictor takes on the debiased Lasso – still the champ!

January 8, 2014

After reposting on the comparison between the lasso and the always-top-10 predictor (leekasso), I got some feedback that the problem could be that I wasn't debiasing the lasso (thanks Tim T. on Twitter!). The idea behind debiasing (as I …

Read more »

Elements of Statistical Learning: A Stunningly Good Job of LaTeX to pdf to Web

January 8, 2014

A very Happy New Year to all! Here's a little thing to start us off. I happened to be thinking about principal-component regression vs. ridge regression yesterday, so as usual I consulted the Hastie-Tibshirani-Friedman (HTF) classic, El...

Read more »

How to display multinomial logit results graphically?

January 8, 2014

Adriana Lins de Albuquerque writes: Do you have any suggestions for the best way to represent multinomial logit results graphically? I am using Stata. My reply: I don’t know from Stata, but here are my suggestions: 1. If the categories are unordered, break them up into a series of binary choices in a tree structure […] The post How to display multinomial logit results graphically? appeared first on Statistical Modeling, Causal…

Read more »

Losing the big picture

January 8, 2014

One of the dangers of "Big Data" is the temptation to get lost in the details. You become so absorbed in the peeling of the onion that you don't realize your tear glands have dried up. Hans Rosling linked to...

Read more »

Applied Statistics Lesson of the Day – Choosing the Number of Levels for Factors in Experimental Design

The experimenter needs to decide the number of levels for each factor in an experiment. For a qualitative (categorical) factor, the number of levels may simply be the number of categories for that factor.  However, because of cost constraints, an experimenter may choose to drop a certain category.  Based on the experimenter’s prior knowledge or […]

Read more »

Machine Learning Lesson of the Day – Using Validation to Assess Predictive Accuracy in Supervised Learning

Supervised learning puts a lot of emphasis on building a model that has high predictive accuracy.  Validation is a good method for assessing a model’s predictive accuracy. Validation is the use of one part of your data set to build your model and another part of your data set to assess the model’s predictive accuracy. […]
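
The hold-out idea in the excerpt above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are simulated, the model is a simple least-squares line, and the helper names (`train_test_split`, `fit_line`, `mean_squared_error`) are hypothetical, not taken from the original post.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into a training part and a validation part."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(rows):
    """Least-squares fit of y = a + b * x on a list of (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def mean_squared_error(model, rows):
    """Average squared prediction error of the fitted line on the given rows."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in rows) / len(rows)

# Simulated data (an assumption for illustration): y = 2 + 3x plus Gaussian noise.
noise = random.Random(42)
data = [(float(x), 2.0 + 3.0 * x + noise.gauss(0.0, 1.0)) for x in range(100)]

# Build the model on one part of the data, assess it on the other part.
train, valid = train_test_split(data)
model = fit_line(train)
validation_mse = mean_squared_error(model, valid)
print(f"intercept={model[0]:.2f}, slope={model[1]:.2f}, validation MSE={validation_mse:.2f}")
```

The error reported on `valid` is computed from rows the model never saw during fitting, which is what makes it a fairer estimate of predictive accuracy than the training error.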

Read more »

Connecting TOAD For MySQL, MySQL Workbench, and R to Amazon AWS EC2 Using SSH Tunneling

January 8, 2014

I often use Amazon EC2 to store and retrieve data when I need either additional storage or higher computing capacity.  In this tutorial I’ll share how to connect to a MySQL database so that one can retrieve the data and do the analysis.  I tend to use either TOAD for MySQL or MySQL Workbench to run […]

Read more »

“Philosophy of Statistical Inference and Modeling” New Course: Spring 2014: Mayo and Spanos: (Virginia Tech)

January 8, 2014

New course for Spring 2014: Phil 6334, Philosophy of Statistical Inference and Modeling (Thursday 3:30-6:15), D. Mayo and A. Spanos. Contact: error@vt.edu. This new course, to be jointly taught by Professors D. Mayo (Philosophy) and A. Spanos (Economics), will provide an in-depth introduction to graduate-level research in philosophy of inductive-statistical inference and probabilistic […]

Read more »

Significant news

January 7, 2014

Good news on the second day back to work after the Christmas break: I've been invited to join the Editorial Board of Significance magazine, and of course I have happily agreed to the invitation! I have always been a big fan of the magazine (in fact I...

Read more »

13 popular articles from 2013

January 7, 2014

In 2013 I published 110 blog posts. Some of these articles were more popular than others, often because they were linked to from a SAS newsletter such as the SAS Statistics and Operations Research News. In no particular order, here are some of my most popular posts from 2013, organized [...]

Read more »

Preparing for tenure track job interviews

January 7, 2014

Editor's note: This is a slightly modified version of a previous post. If you are in the job market you will soon be receiving (or have already received) an invitation for an interview. So how should you prepare? You have two goals. The …

Read more »

My recent debugging experience

January 7, 2014

OK, so this sort of thing happens sometimes. I was working on a new idea (still working on it; if it ultimately works out, or if it doesn’t, I’ll let you know) and as part of it I was fitting little models in Stan, in a loop. I thought it would make sense to start with linear […] The post My recent debugging experience appeared first on Statistical Modeling, Causal Inference, and Social…

Read more »

Text Mining: The Next Data Frontier – Scientific Computing

January 7, 2014

From: http://www.scientificcomputing.com/articles/2014/01/text-mining-next-data-frontier#.UswIHNLuLTo (Mon, 01/06/2014 - 2:04pm, Mark Anawis). By some estimates, 80 percent of available information occurs as free-form text. Text Mining: The Next Data Front...

Read more »

MCMSki IV [day 1.5]

January 7, 2014

The afternoon sessions I attended were “Computational and Methodological Challenges in evidence synthesis and multi-step” organised by Nicky Best and Sylvia Richardson and “Approximate inference” put together by Dan Simpson. Since both Nicky and Sylvia were alas unable to attend MCMSki, I chaired their session, which I found most interesting as connected to a recurrent […]

Read more »

From spreadsheet thinking to R thinking

January 7, 2014

Towards the basic R mindset. Previously: the post “A first step towards R from spreadsheets” provides an introduction to switching from spreadsheets to R. It also includes a list of additional posts (like this one) on the transition. Add two columns: Figure 1 shows some numbers in two columns and the start of adding those […] The post From spreadsheet thinking to R thinking appeared first on Burns Statistics.

Read more »

Whale charts – Visualising customer profitability

January 7, 2014

The Christmas and New Year's break is over, yet there is still time to return unwanted presents. Return to Santa was the title of an article in the Economist that highlighted the impact on online retailers, as return rates can be alarmingly high. ...

Read more »

Machine Learning Lesson of the Day: Clustering, Density Estimation and Dimensionality Reduction

I struggle to categorize unsupervised learning.  It is not an easily defined field, and it is also hard to find generalizations of techniques that are exhaustive and mutually exclusive. Nonetheless, here are some categories of unsupervised learning that cover many of its commonly used techniques.  I learned this categorization from Mathematical Monk, who posted a […]

Read more »

Applied Statistics Lesson of the Day: Sample Size and Replication in Experimental Design

The goal of an experiment is to determine whether or not there is a cause-and-effect relationship between the factor and the response, and the strength of the causal relationship, should such a relationship exist. To answer these questions, the response variable is measured in both the control group and the experimental group. If there is a […]

Read more »

Reinforcement Learning in R: Markov Decision Process (MDP) and Value Iteration

January 7, 2014

How can we find the best long-term plan? In the last post, we looked at the idea of dynamic programming,...
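
For readers who want a feel for value iteration before following the link, here is a minimal sketch. The original post works in R; this version is in Python, and everything in it is an assumption for illustration: the two-state MDP, its rewards, the discount factor, and the function names `q_value` and `value_iteration` are all made up rather than taken from the post.

```python
def q_value(s, a, values, transition, reward, gamma):
    """Immediate reward of action a in state s plus the discounted expected next-state value."""
    return reward[s][a] + gamma * sum(p * values[s2] for p, s2 in transition[s][a])

def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Repeat the Bellman optimality update until the values stop changing.

    transition[s][a] is a list of (probability, next_state) pairs;
    reward[s][a] is the immediate reward for taking action a in state s.
    """
    values = {s: 0.0 for s in states}
    while True:
        new_values = {
            s: max(q_value(s, a, values, transition, reward, gamma) for a in actions)
            for s in states
        }
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break
    # The greedy policy picks, in each state, the action with the highest Q-value.
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, values, transition, reward, gamma))
        for s in states
    }
    return values, policy

# Hypothetical toy MDP: working costs 1 now but moves "low" to the rewarding "high" state.
states = ["low", "high"]
actions = ["wait", "work"]
transition = {
    "low": {"wait": [(1.0, "low")], "work": [(1.0, "high")]},
    "high": {"wait": [(1.0, "high")], "work": [(1.0, "high")]},
}
reward = {
    "low": {"wait": 0.0, "work": -1.0},
    "high": {"wait": 2.0, "work": 1.0},
}

values, policy = value_iteration(states, actions, transition, reward)
print(values, policy)
```

In this toy problem the best long-term plan is to accept the short-term cost of "work" in the "low" state: with a discount factor of 0.9 the values converge to about 17 for "low" and 20 for "high".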

Read more »

An Introduction to Statistical Learning with Applications in R

January 7, 2014

Statistical learning theory offers an opportunity for those of us trained as social science methodologists to look at everything we have learned from a different perspective. For example, missing value imputation can be seen as matrix completion and re...

Read more »

You Are What You Write

January 7, 2014

To my wonderful students: These paragraphs are a revision of advice recently given to a student writer.  Writing is a craft we all must master. And we all will. You are young: enthusiasm and energy come through in your writing: keep that and add to it...

Read more »

Spam names

January 6, 2014

There was this thing going around a while ago, the “porn star name,” which you create by taking the name of your childhood pet, followed by the name of the street where you grew up (for example, Blitz Clifton). But recently I’ve been thinking about spam names. Just in the last two days, I’ve received emails […] The post Spam names appeared first on Statistical Modeling, Causal Inference, and Social Science.

Read more »

