## Normal approximation details

May 29, 2014

The normal distribution can approximate many other distributions, though the details, such as quantitative error estimates and the factors that improve or degrade the approximation, are harder to find. Here are some notes on normal approximations to several common probability distributions: beta, binomial, gamma, Poisson, and Student-t.
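
As a rough illustration of what a quantitative error estimate can look like, here is a minimal Python sketch for the binomial case only. It measures the largest gap between the exact binomial CDF and a normal CDF with the same mean and variance. The function names and the use of a continuity correction are assumptions made for this example; they are not taken from the linked notes.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x, mean, sd):
    """CDF of a Normal(mean, sd^2) random variable evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def max_cdf_error(n, p):
    """Largest absolute CDF gap over k = 0..n, using a continuity correction."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mean, sd))
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # Compare a symmetric case with a skewed one at several sample sizes.
    for n in (10, 50, 100):
        for p in (0.5, 0.05):
            print(n, p, max_cdf_error(n, p))
```

Running the loop for a few values of `n` and `p` gives a feel for how sample size and skewness affect the fit.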

## Permute elements within each row of a matrix

May 29, 2014

Bootstrap methods and permutation tests are popular and powerful nonparametric methods for testing hypotheses and approximating the sampling distribution of a statistic. I have described a SAS/IML implementation of a bootstrap permutation test for matched pairs of data (an alternative to a matched-pair t test) in my paper "Modern Data […]
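
The operation named in the title can be sketched in a few lines. The following Python example is not the SAS/IML implementation the post describes; it is a minimal illustration that shuffles the elements of each row independently, with a hypothetical function name and a list-of-lists matrix assumed for simplicity.

```python
import random


def permute_within_rows(matrix, rng=None):
    """Return a new matrix whose rows are independent random permutations
    of the corresponding rows of `matrix` (a list of lists)."""
    rng = rng or random.Random()
    # rng.sample(row, len(row)) returns a shuffled copy and leaves `row` intact.
    return [rng.sample(row, len(row)) for row in matrix]


if __name__ == "__main__":
    pairs = [[1.2, 1.5], [0.8, 0.7], [2.1, 2.6]]  # made-up matched pairs
    print(permute_within_rows(pairs, random.Random(0)))
```

For matched pairs, each row holds one pair, so permuting within rows randomly swaps or keeps the two measurements of each subject while leaving the pairing itself unchanged.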

## Early bird registration for R in Insurance closes tomorrow

May 29, 2014

The early bird registration offer for the 2nd R in Insurance conference, 14 July 2014, at Cass Business School closes tomorrow. This one-day conference will focus once more on applications in insurance and actuarial science that use R, the lingua franca...

## June Reading List

May 29, 2014

Put away that novel! Here's some really fun June reading: Berger, J., 2003. Could Fisher, Jeffreys and Neyman have agreed on testing? Statistical Science, 18, 1-32. Canal, L. and R. Micciolo, 2014. The chi-square controversy. What if Pearson had R?...

## Stream Processing with Messaging Systems

May 29, 2014

Stream processing involves taking a stream of messages and running a computation on each message. How can we manage high...

## Using Volcano Plots in R to Visualize Microarray and RNA-seq Results

May 28, 2014

I've been asked a few times how to make a so-called volcano plot from gene expression results. A volcano plot typically plots some measure of effect on the x-axis (typically the fold change) and the statistical significance on the y-axis (typically the...

## It’s high time to demystify

May 28, 2014

Data, Big Data, Data Scientist, Data Mining … Statistics. And next: Linked Open Data? Look at this semantically rich clearing …

## The Big in Big Data relates to importance not size

May 28, 2014

In the past couple of years several non-statisticians have asked me "What is Big Data exactly?" or "How big is Big Data?". My answer has been "I think Big Data is much more about 'data' than 'big'." I explain below. …

## Bayesian nonparametric weighted sampling inference

May 28, 2014

Yajuan Si, Natesh Pillai, and I write: It has historically been a challenge to perform Bayesian inference in a design-based survey context. The present paper develops a Bayesian model for sampling inference using inverse-probability weights. We use a hierarchical approach in which we model the distribution of the weights of the nonsampled units in the […] The post Bayesian nonparametric weighted sampling inference appeared first on Statistical Modeling, Causal Inference,…

## It’s Not A Blog …….

May 27, 2014

This gem from @AcademicSay on Twitter today: "It's not a blog. It's an independent open-access journal with a dedicated submission agreement." © 2014, David E. Giles

## Which came first, the preference or the choice?

May 27, 2014

Obviously, preference precedes choice because choices are made to maximize preference. That is certainly the way we conduct our marketing research. We generate factorial designs and write descriptions full of information about products and services. Ou...

## Questions About Granger Causality Testing – The Fine Print

May 27, 2014

Judging by the number of hits, comments, and questions that I've had in relation to my various posts on testing for Granger (Non-) Causality, this seems to be a topic that a lot of followers find interesting. For instance, see the posts here, here, ...

## A whole fleet of gremlins: Looking more carefully at Richard Tol’s twice-corrected paper, “The Economic Effects of Climate Change”

May 27, 2014

We had a discussion the other day of a paper, “The Economic Effects of Climate Change,” by economist Richard Tol. The paper came to my attention after I saw a notice from Adam Marcus that it was recently revised because of data errors. But after looking at the paper more carefully, I see a bunch […] The post A whole fleet of gremlins: Looking more carefully at Richard Tol’s twice-corrected…

## Reference page for Trifecta Checkup

May 27, 2014

It's here! Many readers have requested a reference to the Junk Charts Trifecta Checkup. I finally found time to write this up. Here is the introduction: The Junk Charts Trifecta Checkup is a general framework for data visualization criticism. It...

## Absent No More

May 27, 2014

Hello my friends. I'm back. It's been a crazy couple of weeks, with end-of-year travel, crew regattas, graduations, etc. A highlight was lecturing at European University Institute (EUI) in Florence. I tortured a pan-European audience of forty or so...

## Allan Birnbaum, Philosophical Error Statistician: 27 May 1923 – 1 July 1976

May 27, 2014

Today is Allan Birnbaum’s birthday. Birnbaum’s (1962) classic “On the Foundations of Statistical Inference” is in Breakthroughs in Statistics (volume I, 1993). I’ve a hunch that Birnbaum would have liked my rejoinder to discussants of my forthcoming paper (Statistical Science): Bjornstad, Dawid, Evans, Fraser, Hannig, and Martin and Liu. I hadn’t realized until recently that all of this is up […]

## An easy way to generate a vector of letters

May 27, 2014

A little-known but useful feature of SAS/IML 12.3 (which was released with SAS 9.4) is the ability to generate a vector of lowercase or uppercase letters by using the colon operator (:). Many SAS/IML programmers use the colon operator to generate a vector of sequential integers: proc iml; x = […]

## Notes from the Kölner R meeting, 23 May 2014

May 27, 2014

The 10th Kölner R user meeting took place last Friday at the Institute of Sociology and to celebrate the anniversary we invited Andrie de Vries to join us from Revolution Analytics. Andrie is well known in the R community; he is the co-author of the R...

## Presentation on Statistical Genetics at Vancouver SAS User Group – Wednesday, May 28, 2014

I am excited and delighted to be invited to present at the Vancouver SAS User Group’s next meeting. I will provide an introduction to statistical genetics; specifically, I will

- define basic terminology in genetics
- explain the Hardy-Weinberg equilibrium in detail
- illustrate how Pearson’s chi-squared goodness-of-fit test can be used in PROC FREQ in SAS to check the Hardy-Weinberg equilibrium
- illustrate […]
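
To make the goodness-of-fit idea concrete, here is a minimal Python sketch of a Pearson chi-squared check of Hardy-Weinberg equilibrium for one biallelic locus. It does not reproduce the PROC FREQ approach from the talk. The function name and the genotype counts in the usage lines are made up for illustration, and the test is assumed to have one degree of freedom (three genotype classes, minus one, minus one estimated allele frequency).

```python
import math


def hardy_weinberg_chisq(n_AA, n_Aa, n_aa):
    """Pearson chi-squared statistic and p-value (1 df) for observed genotype
    counts against Hardy-Weinberg expectations. Assumes both alleles occur."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chisq / 2))
    return chisq, p_value


if __name__ == "__main__":
    chisq, p_value = hardy_weinberg_chisq(298, 489, 213)  # hypothetical counts
    print(f"chi-squared = {chisq:.4f}, p-value = {p_value:.4f}")
```

A large p-value means the observed genotype counts give no evidence against equilibrium.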

## Machine Learning and Applied Statistics Lesson of the Day – Sensitivity and Specificity

To evaluate the predictive accuracy of a binary classifier, two useful (but imperfect) criteria are sensitivity and specificity. Sensitivity is the proportion of truly positive cases that were classified as positive; thus, it is a measure of how well your classifier identifies positive cases. It is also known as the true positive rate. Formally, $\text{sensitivity} = \text{TP} / (\text{TP} + \text{FN})$, where TP is the number of true positives and FN is the number of false negatives. Specificity is the proportion of truly […]
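
The two criteria can be computed directly from true and predicted labels. The sketch below is a minimal Python illustration, assuming labels coded as 1 (positive) and 0 (negative) and using the standard definition of specificity as the true negative rate. The function name and the example labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Each rate is undefined (NaN) when its class never occurs in y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # made-up ground truth
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]  # made-up classifier output
    print(sensitivity_specificity(y_true, y_pred))
```

Reporting both numbers matters because a classifier that labels every case positive has perfect sensitivity and zero specificity.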

## Unit Root Testing: Sample Size vs. Sample Span

May 26, 2014

The more the merrier when it comes to the number of observations we have for our economic time-series data - right? Well, not necessarily. There are several reasons to be cautious, not the least of which include the possibility of structural break...

## WAIC and cross-validation in Stan!

May 26, 2014

Aki and I write: The Watanabe-Akaike information criterion (WAIC) and cross-validation are methods for estimating pointwise out-of-sample prediction accuracy from a fitted Bayesian model. WAIC is based on the series expansion of leave-one-out cross-validation (LOO), and asymptotically they are equal. With finite data, WAIC and cross-validation address different predictive questions and thus it is useful […] The post WAIC and cross-validation in Stan! appeared first on Statistical Modeling, Causal Inference,…
