## Presentation on Statistical Genetics at Vancouver SAS User Group – Wednesday, May 28, 2014

I am excited and delighted to be invited to present at the Vancouver SAS User Group’s next meeting. I will provide an introduction to statistical genetics; specifically, I will define basic terminology in genetics, explain the Hardy-Weinberg equilibrium in detail, illustrate how Pearson’s chi-squared goodness-of-fit test can be used in PROC FREQ in SAS to check the Hardy-Weinberg equilibrium, illustrate […]
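As a rough illustration of the test named in this excerpt, here is a minimal Python sketch (not the SAS PROC FREQ code from the talk) of a Pearson chi-squared goodness-of-fit check for Hardy-Weinberg equilibrium at a locus with two alleles. The function name, argument names, and genotype counts are hypothetical, and the sketch assumes both alleles are observed so that no expected count is zero.

```python
import math

def hardy_weinberg_chi2(n_AA, n_Aa, n_aa):
    """Chi-squared goodness-of-fit test of Hardy-Weinberg proportions.

    Arguments are observed genotype counts for a biallelic locus
    (hypothetical names, chosen for illustration).
    """
    n = n_AA + n_Aa + n_aa
    # Estimate the frequency of allele A from the genotype counts.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    # Expected counts under Hardy-Weinberg: p^2, 2pq, q^2.
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # 3 genotype classes - 1 - 1 estimated allele frequency = 1 degree of
    # freedom; for 1 df the chi-squared tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Made-up counts, for illustration only.
chi2, p_value = hardy_weinberg_chi2(30, 40, 30)
```

A small p-value would suggest that the observed genotype counts depart from Hardy-Weinberg proportions.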

## Machine Learning and Applied Statistics Lesson of the Day – Sensitivity and Specificity

To evaluate the predictive accuracy of a binary classifier, two useful (but imperfect) criteria are sensitivity and specificity. Sensitivity is the proportion of truly positive cases that were classified as positive; thus, it is a measure of how well your classifier identifies positive cases. It is also known as the true positive rate. Formally, $\text{sensitivity} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$. Specificity is the proportion of truly […]
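The definition of sensitivity above can be sketched in a few lines of Python. The excerpt cuts off before it defines specificity, so the sketch assumes the standard definition (the proportion of truly negative cases that were classified as negative). The function name, the 0/1 label coding, and the example data are all hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for labels coded 1 = positive, 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate (assumed standard definition)
    return sensitivity, specificity

# Made-up labels: 4 truly positive cases, 6 truly negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)  # sens is 3/4, spec is 4/6
```

The sketch assumes at least one truly positive and one truly negative case; otherwise one of the denominators is zero.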

## Unit Root Testing: Sample Size vs. Sample Span

May 26, 2014

The more the merrier when it comes to the number of observations we have for our economic time-series data - right? Well, not necessarily. There are several reasons to be cautious, not the least of which is the possibility of structural break...

## WAIC and cross-validation in Stan!

May 26, 2014

Aki and I write: The Watanabe-Akaike information criterion (WAIC) and cross-validation are methods for estimating pointwise out-of-sample prediction accuracy from a fitted Bayesian model. WAIC is based on the series expansion of leave-one-out cross-validation (LOO), and asymptotically they are equal. With finite data, WAIC and cross-validation address different predictive questions and thus it is useful […] The post WAIC and cross-validation in Stan! appeared first on Statistical Modeling, Causal Inference,…
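To make the WAIC quantity mentioned in this excerpt concrete, here is a minimal pure-Python sketch. It is not the Stan code from the post. It uses one common form of the estimator: the log pointwise predictive density minus the summed posterior variance of the pointwise log-likelihood. The function names and the `log_lik[s][i]` layout (posterior draw `s`, observation `i`) are assumptions made for illustration.

```python
import math

def _log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def _sample_variance(values):
    """Unbiased sample variance (needs at least two values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def waic(log_lik):
    """Compute WAIC from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s (layout assumed for this sketch).
    """
    n_obs = len(log_lik[0])
    # Regroup so that each inner list holds all draws for one observation.
    columns = [[draw[i] for draw in log_lik] for i in range(n_obs)]
    # Log pointwise predictive density.
    lppd = sum(_log_mean_exp(col) for col in columns)
    # Effective number of parameters: summed variance across draws.
    p_waic = sum(_sample_variance(col) for col in columns)
    elpd_waic = lppd - p_waic
    return {"lppd": lppd, "p_waic": p_waic,
            "elpd_waic": elpd_waic, "waic": -2 * elpd_waic}
```

In practice the pointwise log-likelihood draws would come from a fitted model's posterior output; here they are simply passed in as nested lists.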

## On deck this week

May 26, 2014

Mon: WAIC and cross-validation in Stan! Tues: A whole fleet of gremlins: Looking more carefully at Richard Tol’s twice-corrected paper, “The Economic Effects of Climate Change” Wed: Just wondering Thurs: When you believe in things that you don’t understand Fri: I posted this as a comment on a sociology blog Sat: “Building on theories used […] The post On deck this week appeared first on Statistical Modeling, Causal Inference, and…

## A Sequence of 9 Courses on Data Science Starts on Coursera on 2 June and 7 July 2014

May 26, 2014

A sequence of 9 courses on Data Science will start on Coursera on 2 June and 7 July 2014, taught by (Associate/Assistant) Professors at Johns Hopkins University. The courses are designed for students to learn to become Data Scientists …

## Data science market places

May 26, 2014

Some new websites are being established offering “market places” for data science. Two I’ve come across recently are Experfy and SnapAnalytx. Experfy provides a way for companies to find statisticians and other data scientists, either for short-term consultancies, or to fill full-time positions. They describe their “providers” as “Data Engineers, Data Scientists, Data Mining Experts, Data Analyst/Modelers, Big Data Solutions Architects, Visualization Designers, Statisticians, Applied Physicists, Mathematicians, Econometricians and Bioinformaticians.”…

## Why I decided not to be a physicist

May 25, 2014

As I’ve written before, I was a math and physics major in college but I switched to statistics because math seemed pointless if you weren’t the best (and I knew there were people better than me), and I just didn’t feel like I had a good physical understanding. My lack of physical understanding comes up […] The post Why I decided not to be a physicist appeared first on Statistical…

## Significant birthdays in the weekend

May 25, 2014

I listen to the BBC's podcast More or Less. In the program, Tim Harford looks at data with both humour and a determination to find out what the numbers mean. Last week he handled a listener question: does everybody get a significant birthday (20, 30 years ...

## An interesting mosaic of a data programming course

May 24, 2014

Rajit Dasgupta writes: I have been working on a website, SlideRule, that in its present state is a catalog of online courses aggregated from over 35 providers. One of the products we are building on top of this is something called Learning Paths, which are essentially sequences of online courses designed to help learners […] The post An interesting mosaic of a data programming course appeared first on Statistical…

## Buzzfeed, Porn, Kansas…That Can’t Be Good

May 24, 2014

This post is by David K. Park and courtesy of Alex Palen Ellis… Thought you might find this funny: Buzzfeed set out to study porn consumption versus the red/blue political spectrum. And they failed miserably. An article from opennews.org outlines six major fallacies Buzzfeed committed, the best of which resulted in the Kansas effect: “Pornhub’s writeup omitted […] The post Buzzfeed, Porn, Kansas…That Can’t Be Good appeared first on Statistical Modeling, Causal…

## On the difficulty of making forecasts (when you have little data)

May 24, 2014

$p=\frac{159}{1047}\approx 15.2\%$

For several months now, there has been a (probably legitimate) craze for big data. While much can be done to exploit the enormous volumes of data available to insurers, it is worth bearing in mind that in many cases data are scarce, and technology should not be able to change much about that. The lack of (reliable) data creates substantial variability. Law of large numbers, approximations…
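To give a feel for the variability that scarce data creates, here is a small Python sketch of a normal-approximation (Wald) interval for a proportion. It assumes, purely for illustration, that the figure p = 159/1047 shown with this post is an observed proportion of 159 events in 1047 trials; the excerpt does not say what it counts. The function name and the tenfold-smaller comparison sample are hypothetical.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    p_hat = successes / n
    # Standard error of the estimated proportion.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Assumed reading of the figure: 159 events out of 1047 trials.
low, high = wald_interval(159, 1047)

# A hypothetical sample with roughly the same proportion but ten times
# fewer trials gives a much wider interval.
low_small, high_small = wald_interval(16, 105)
```

The interval width shrinks roughly like one over the square root of the sample size, which is why a small portfolio leaves so much uncertainty around an estimated rate.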

## The gremlins did it? Iffy statistics drive strong policy recommendations

May 23, 2014

Recently in the sister blog. Yet another chapter in the continuing saga, Don’t Trust Polynomials. P.S. More here. The post The gremlins did it? Iffy statistics drive strong policy recommendations appeared first on Statistical Modeling, Causal ...

## Structural breaks

May 23, 2014

I’m tired of reading about tests for structural breaks and here’s why. A structural break occurs when we see a sudden change in a time series or a relationship between two time series. Econometricians love papers on structural breaks, and apparently believe in them. Personally, I tend to take a different view of the world. I think a more realistic view is that most things change slowly over time, and…

## Did Neyman really say of Fisher’s work, “It’s easy to get the right answer if you never define what the question is,” and did Fisher really describe Neyman as “a theorem-proving poseur who wouldn’t recognize real data if it bit him in the ass”?

May 23, 2014

To answer the question in the title of this post: Of course not. Fisher is English. They say arse, not ass. But here’s a quote that is floating around. Joseph Wilson quotes science reporter Regina Nuzzo: Neyman called some of Fisher’s work mathematically “worse than useless”; Fisher called Neyman’s approach “childish” and “horrifying [for] intellectual […] The post Did Neyman really say of Fisher’s work, “It’s easy to get the…

## Rationality and Bayesian Objectivity

May 23, 2014

Frequentism and Subjective Bayes are both special cases of Objective Bayes (to the extent they’re true at all). I’ve detailed (here, here, here, and here) in exactly what sense Frequentism is a special case of Bayes. Here I will do the same...

## Introducing Probability

May 23, 2014

I have a guilty secret. I really love probability problems. I am so happy to be making videos about probability just now, and conditional probability and distributions and all that fun stuff. I am a little disappointed that we won’t be …

## A. L. Nagar

May 23, 2014

Earlier this year I had a post in memory of the eminent statistician and econometrician, Anirudh Nagar. His passing was a great loss to our profession. Today, I was pleased to learn about this site that honours A. L. Nagar's life and contributions...

## 10 things statistics taught us about big data analysis

May 22, 2014

In my previous post I pointed out that a major problem with big data is that applied statistics has been left out. But many cool ideas in applied statistics are really relevant for big data analysis. So I thought I'd try to answer the …

## The need for documenting functions

May 22, 2014

My current work usually requires me to work on a project until we can submit a research paper, and then move on to a new project. However, 3-6 months down the road, when the reviews for the paper return, it is quite common to have to do some new analyses or re-analyses of the data. […]

## Big Data needs Big Model

May 22, 2014

Gary Marcus and Ernest Davis wrote this useful news article on the promise and limitations of “big data.” And let me add this related point: Big data are typically not random samples, hence the need for “big model” to map from sample to population. Here’s an example (with Wei Wang, David Rothschild, and Sharad Goel): […] The post Big Data needs Big Model appeared first on Statistical Modeling, Causal Inference,…