## Top papers in the International Journal of Forecasting

February 4, 2014
Every year or so, Elsevier asks me to nominate five International Journal of Forecasting papers from the last two years to highlight in their marketing materials as “Editor’s Choice”. I try to select papers across a broad range of subjects, and I take into account citations and downloads as well as my own impression of the paper. That tends to bias my selection a little towards older papers as they have…

## Bar des Sciences: Debate on Big Data

February 3, 2014
Le Cœur des Sciences, at the Université du Québec à Montréal, is organizing a public debate on Big Data on February 13 at 6 p.m., as part of a "bar des sciences" event, in which I am due to take part, along with Vincent Gautrais (a.k.a. @gautrais), Yves-Alexandre de Montjoye (a.k.a. @yvesalexandre) and Jean-Hugues Roy (a.k.a. @jeanhuguesroy). Wagging tongues will say that I could not have turned down an invitation to speak in a bar (and they are probably not wrong). That said,…

## Research Credibility, Bayes, and "Searching for Asterisks"

February 3, 2014
Is there really a "credibility crisis" in the sciences that use statistics, as some seem to fear these days? I think not; generally I'm on board with Deming's "In God we trust, all others bring data." Of course there are issues, but they're hardly new...

## The three tables for genomics collaborations

February 3, 2014
Collaborations between biologists and statisticians are very common in genomics. For the data analysis to be fruitful, the statistician needs to understand what samples are being analyzed. For the analysis report to make sense to the biologist, it needs to …

## NLSdata: an R package for National Longitudinal Surveys

February 3, 2014
Introduction: Alongside interstate highways, national defense, and social security, your tax dollars are used to collect data. Sometimes it’s high profile and relevant, like the census or NSA’s controversial PRISM surveillance program. Other times it’s just high profile, like the three-billion-dollar brain dataset that nobody has figured out how to use. Then there are the lower profile data sets, studied by researchers, available to the public, but with enough…

## One-way street fallacy again! in reporting of research on brothers and sisters

February 3, 2014
There’s something satisfying about seeing the same error being made by commentators on the left and the right. In this case, we’re talking about the one-way street fallacy, which is the implicit assumption of unidirectionality in a setting that actually has underlying symmetry. 1. A month or so ago we reported on an op-ed by […]

## Oldie but goodie

February 3, 2014
Back in 2007, the New York Times graphics team produced a fabulous chart explaining the rise in prices at the pump (link). Let's start with the tab labeled "Regional Price" which contains a well-executed map of the average gas prices...

## Sample without replacement in SAS

February 3, 2014
Last week I showed three ways to sample with replacement in SAS. You can use the SAMPLE function in SAS/IML 12.1 to sample from a finite set or you can use the DATA step or PROC SURVEYSELECT to extract a random sample from a SAS data set. Sampling without replacement [...]

## Computational Actuarial Science with R

February 3, 2014
I recently co-authored a chapter on “Prospective Life Tables” for this book, edited by Arthur Charpentier. R code to reproduce the figures and to complete the exercises for our chapter is now available on GitHub. Code for the other chapters should...

## Three Yards and a Cloud of Dust: The Evolution of Passing in the NFL

February 2, 2014
Introduction: "Three yards and a cloud of dust" (1) - that's how Woody Hayes described his "crunching, frontal assault of muscle against muscle", the offense that defined the Ohio State Buckeyes in the 50s and 60s. He went on to say that, in regard to the ...

## Microfoundations of macroeconomics

February 2, 2014
I received the following email the other day: Given your past criticisms of this issue in your posts, I do not think you will like my co-authored paper, “Microfoundations of the Business Cycle and Monetary Shocks” . . . Given this lead-in, of course I had to take a look. The paper is by James […] The post Microfoundations of macroeconomics appeared first on Statistical Modeling, Causal Inference, and Social Science.

## Bayesian analysis of sensory profiling data

February 2, 2014
I looked at Bayesian analysis of sensory profiling data in May and June 2012. I remember not being totally happy with the result, and the computations took a bit more time than I wanted. But now it is 2014, I can use Stan, and I have been thinking about...

## Monash Econometrics in the top 10

February 2, 2014
Dave Giles pointed out on his blog yesterday that my department is currently ranked in the top 10 in the world for econometrics, according to IDEAS. We are also ranked 13th in the world in forecasting. Since IDEAS only covers the economics literature,...

## Econometrics at Monash University

February 1, 2014
My first academic position was in the (then) Department of Econometrics and Operations Research at Monash University (in Melbourne, Australia). I was there for nine wonderful years from the mid-1970s to the mid-1980s. Now renamed the Department of E...

## Bad Bayes: an example of why you need hold-out testing

February 1, 2014
We demonstrate a dataset that causes many good machine learning algorithms to horribly overfit. The example is designed to imitate a common situation found in predictive analytic natural language processing. In this type of application you are often building a model using many rare text features. The rare text features are often nearly unique k-grams […]

## Recently in the sister blog

February 1, 2014
- Are we becoming more tolerant of nepotism?
- Republicans have a 54 percent chance of taking the Senate
- The denominator fallacy rears its ugly head
- How better educated whites are driving political polarization
- Controversial claims about marriage promotion...

## Phil 6334: Day #2 Slides

February 1, 2014
Day #2, Part 1: D. Mayo: Class, Part 2: A. Spanos: Probability/Statistics Lecture Notes 1: Introduction to Probability and Statistical Inference. Day #1 slides are here.

## Stick Figure Function Fun – R

January 31, 2014
I have created a stick figure generating function for adding a human figure to some of my graphs as a demonstration of scale, and potentially for adding emoticons to my shiny/concerto applications. You can change basic graphing parameters li...

## Into the thicket of variation: More on the political orientations of parents of sons and daughters, and a return to the tradeoff between internal and external validity in design and interpretation of research studies

January 31, 2014
We recently considered a pair of studies that came out a while ago involving children and political orientation: Andrew Oswald and Nattavudh Powdthavee found that, in Great Britain, parents of girls were more likely to support left-wing parties, compared to parents of boys. And, in the other direction, Dalton Conley and Emily Rauscher found with survey […]

## Automatic time series forecasting in Granada

January 31, 2014
In two weeks I am presenting a workshop at the University of Granada (Spain) on Automatic Time Series Forecasting. Unlike most of my talks, this is not intended to be primarily about my own research. Rather it is to provide a state-of-the-art overview of the topic (at a level suitable for Masters students in Computer Science). I thought I’d provide some historical perspective on the development of automatic time series forecasting,…

## Python and R: Is Python really faster than R?

January 31, 2014
A friend of mine asked me to code the following in R: generate samples of size 10 from a Normal distribution with $\mu$ = 3 and $\sigma^2$ = 5; compute the $\bar{x}$ and $\bar{x}\mp z_{\alpha/2}\displaystyle\frac{\sigma}{\sqrt{n}}$ using the 95% confidence...
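The two steps that survive in this excerpt (draw a sample of size 10, then compute $\bar{x}$ and the known-variance interval $\bar{x}\mp z_{\alpha/2}\,\sigma/\sqrt{n}$) might be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code from the post: the function name `z_interval`, the seed, and the use of the standard library rather than NumPy are all choices made here, and whatever the truncated remainder of the task asks for is not covered.

```python
import math
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, level=0.95):
    """Return (xbar, lower, upper) for a known-sigma z confidence interval.

    Hypothetical helper for illustration; it is not taken from the post.
    """
    n = len(sample)
    xbar = fmean(sample)
    # z_{alpha/2}: upper alpha/2 quantile of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar, xbar - half_width, xbar + half_width


# Values stated in the excerpt: mu = 3, sigma^2 = 5, n = 10
mu, sigma = 3.0, math.sqrt(5.0)
rng = random.Random(2014)  # arbitrary seed, assumed here for reproducibility
sample = [rng.gauss(mu, sigma) for _ in range(10)]

xbar, lower, upper = z_interval(sample, sigma)
print(f"xbar = {xbar:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because $\sigma$ is treated as known, the interval uses the normal quantile rather than a $t$ quantile, which matches the formula in the excerpt.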

## LaTeX can be arsey, but boy is it good?!

January 30, 2014
I have been using LaTeX since I wrote my BSc thesis (that was way back in the last century, although I'm saying this just for dramatic effect; I'm not THAT old!) and have loved it since. Of course, I do use WYSIWYG typesetting software now and t...

## Input data interactively into R

January 30, 2014
To input data interactively into R, use the function readline: