## Top 10 tips to get started with R

April 2, 2013

Be motivated. R has a steep learning curve. Find a problem you can't solve otherwise. E.g. plotting multivariate data, a statistical analysis for which an R function exists already. Download and install R. Get to know the R console. Learn how to inst...

## Le Monde puzzle [#814]

April 1, 2013

The #814 Le Monde math puzzle was to find 100 digits (between 1 and 10) such that their sum is equal to their product. Given the ten possible values of those digits, this is equivalent to finding integers $a_1,\dots,a_{10}$ such that $a_1+\dots+a_{10}=100$ and $a_1+2a_2+\dots+10a_{10}=2^{a_2}\times\dots\times 10^{a_{10}}$, which reduces the number of unknowns from 100 to 10 (or [...]
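The reduction to counts suggests a small brute-force search: ones add to the sum without changing the product, so only the digits larger than 1 need to be enumerated. The sketch below is not the post's own code. `find_solutions` is a hypothetical helper, and the cap of 7 digits larger than 1 is an assumption derived from a simple bound (with 8 such digits the product is at least 2^8 = 256, while the sum is at most 92 + 80 = 172).

```r
# Search for multisets of digits in 2..10 that, padded with ones up to n digits,
# have a sum equal to their product. find_solutions is a hypothetical helper.
find_solutions <- function(n = 100, digits = 2:10, max_k = 7) {
  solutions <- list()
  extend <- function(chosen) {
    k <- length(chosen)
    # the (n - k) ones contribute n - k to the sum and a factor 1 to the product
    if (k > 0 && (n - k) + sum(chosen) == prod(chosen)) {
      solutions[[length(solutions) + 1]] <<- chosen
    }
    if (k == max_k) return(invisible(NULL))
    # keep the digits non-decreasing so each multiset is visited only once
    lowest <- if (k == 0) min(digits) else chosen[k]
    for (d in digits[digits >= lowest]) extend(c(chosen, d))
  }
  extend(integer(0))
  solutions
}

sols <- find_solutions()
for (s in sols) {
  cat(100 - length(s), "ones plus", paste(s, collapse = ", "), "\n")
}
```

One solution is easy to check by hand: 97 ones, two 4s and one 7 give a sum of 97 + 15 = 112 and a product of 4 × 4 × 7 = 112.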

## Flawed Science and Stapel: Priming for a Backlash?

April 1, 2013

Diederik Stapel is back in the news, given the availability of the English translation of the Tilburg (Levelt and Noort Committees) Report as well as his book, Ontsporing (Dutch for “Off the Rails”), where he tries to explain his fraud. An earlier post on him is here. While the disgraced social psychologist was shown to […]

April 1, 2013

To install a package in R, the function to use is install.packages. Let's say we want to install the ggplot2 package; we simply code this with [...] To install more than one package, we do this by [...] Note that in executing the above code, a dialogue box...
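The code from the original post did not survive in this excerpt, so the following is only a sketch of what such calls typically look like. `install.packages()` is base R (from the utils package); the second package name, plyr, and the helper `install_if_missing` are assumptions added for illustration. The direct calls are left commented out because they download from CRAN.

```r
# Direct calls (they need a network connection, hence commented out here):
# install.packages("ggplot2")               # a single package
# install.packages(c("ggplot2", "plyr"))    # several packages at once

# Hypothetical helper: install only the packages that are not available yet.
install_if_missing <- function(pkgs, installer = utils::install.packages) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!available]
  if (length(missing_pkgs) > 0) installer(missing_pkgs)
  invisible(missing_pkgs)
}

# "stats" ships with R, so nothing is installed by this call.
install_if_missing("stats")
```

Passing the installer as an argument keeps the helper easy to try out without touching the network.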

## Wolfram on Mandelbrot

April 1, 2013

The most perfect pairing of author and subject since Nicholson Baker and John Updike. Here’s Wolfram on the great researcher of fractals: In his way, Mandelbrot paid me some great compliments. When I was in my 20s, and he in his 60s, he would ask about my scientific work: “How can so many people take [...]

## R Tackles Big Garbage

April 1, 2013

April 1, 2013 – Although the capabilities of the R system for data analytics have been expanding with impressive speed, it has heretofore been missing important fundamental methods. A new function works with the popular plyr package to provide these missing … Continue reading →

## A pictorial history of US large cap correlation

April 1, 2013

How has the distribution of correlations changed over the last several years? Previously: posts about correlation; boxplots explained. Data: daily returns of 443 large cap US stocks from 2004 through 2012 were used. The sample correlations (almost 98,000 of them) during each year were created. If we were actually using the correlations, then … Continue reading →
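The figure of almost 98,000 follows from the number of distinct pairs among 443 stocks. A minimal sketch, using simulated returns as a stand-in for the real data (the 250 trading days and the return volatility are assumptions, not values from the post):

```r
n_stocks <- 443
n_pairs <- choose(n_stocks, 2)   # 443 * 442 / 2 distinct pairs

# Simulated daily returns for one year; placeholder for the actual stock data.
set.seed(1)
n_days <- 250
returns <- matrix(rnorm(n_days * n_stocks, sd = 0.01), nrow = n_days)

# Each pair appears once in the upper triangle of the correlation matrix.
cor_mat <- cor(returns)
pairwise <- cor_mat[upper.tri(cor_mat)]

n_pairs            # 97903
summary(pairwise)  # distribution of the pairwise correlations for this "year"
```

Repeating this per calendar year and drawing one boxplot per year would give the kind of picture the post describes.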

## The Art of R Programming review – part 6

April 1, 2013

These posts are coming in rapid fire. I understand the previous one was dense, so I'll try to make this lighter without skimping on the cool stuff. Let's go! Chapter 10 deals with I/O - input/output. This is a really important topic that I don't think ...

## Changing the ODS style might change color ramp (and what to do about it)

April 1, 2013

Did you know that your ODS style might result in changing the color ramp for contour plots and heat maps? For example, the default style in SAS 9.3 is HTMLBlue. Let's create a contour plot in the HTML destination by running an example adapted from the documentation for the RSREG [...]

## p-values are (possibly biased) estimates of the probability that the null hypothesis is true

April 1, 2013

Last week, I posted about statisticians’ constant battle against the belief that the p-value associated (for example) with a regression coefficient is equal to the probability that the null hypothesis is true, for a null hypothesis that beta is zero or negative. I argued that (despite our long pedagogical practice) there are, in fact, many […]

## How do Dew and Fog Form? Nature at Work with Temperature, Vapour Pressure, and Partial Pressure

In the early morning, especially here in Canada, I often see dew – water droplets formed by the condensation of water vapour on outside surfaces, like windows, car roofs, and leaves of trees.  I also sometimes see fog – water droplets or ice crystals that are suspended in air and often blocking visibility at great […]

## Context – if it isn’t fun…

March 31, 2013

The role of context in statistical analysis: The wonderful advantage of teaching statistics is the real-life context within which any application must exist. This can also be one of the difficulties. Statistics without context is merely the mathematics of statistics, … Continue reading →

## Checking for Normality with Quantile Ranges and the Standard Deviation

Introduction: I was reading Michael Trosset’s “An Introduction to Statistical Inference and Its Applications with R”, and I learned a basic but interesting fact about the normal distribution’s interquartile range and standard deviation that I had not learned before. This turns out to be a good way to check for normality in a data set. […]
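The excerpt does not state the fact itself. Presumably it is the standard result that, for a normal distribution, the interquartile range is about 1.35 standard deviations (exactly `2 * qnorm(0.75)`). A rough sketch of the check under that assumption, with simulated data and a hypothetical helper `iqr_sd_ratio`:

```r
# Theoretical IQR / SD ratio for any normal distribution (about 1.349)
normal_ratio <- 2 * qnorm(0.75)

# Hypothetical helper: the sample version of the same ratio
iqr_sd_ratio <- function(x) IQR(x) / sd(x)

set.seed(42)
x_normal <- rnorm(10000)   # should give a ratio close to normal_ratio
x_skewed <- rexp(10000)    # exponential data: theoretical ratio is log(3)

iqr_sd_ratio(x_normal)
iqr_sd_ratio(x_skewed)
```

A ratio far from 1.35 suggests non-normality, but a ratio near 1.35 does not prove normality; it is a quick screen, not a formal test.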

## Easter

March 31, 2013

This morning, there was an interesting post entitled “why does Easter move around so much?” online on http://economist.com/blogs/economist-explains/… In my time series classes, I keep saying that sometimes, series can exhibit seasonality, but the seasonal effect can be quite irregular. It is the case for river levels, where snowmelt can have a huge impact, and it is irregular. Similarly, chocolate sales (even monthly, or quarterly) depend on Easter. Because it can be…
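To see how irregular this seasonal effect is, the date of Easter Sunday can be computed directly. The sketch below uses the well-known anonymous Gregorian algorithm (also called the Meeus/Jones/Butcher algorithm); it is not taken from the post, and `easter_date` is a hypothetical function name.

```r
# Date of (Western) Easter Sunday for Gregorian years, vectorised over `year`.
easter_date <- function(year) {
  a <- year %% 19
  b <- year %/% 100
  cc <- year %% 100
  d <- b %/% 4
  e <- b %% 4
  f <- (b + 8) %/% 25
  g <- (b - f + 1) %/% 3
  h <- (19 * a + b - d - g + 15) %% 30
  i <- cc %/% 4
  k <- cc %% 4
  l <- (32 + 2 * e + 2 * i - h - k) %% 7
  m <- (a + 11 * h + 22 * l) %/% 451
  month <- (h + l - 7 * m + 114) %/% 31
  day <- ((h + l - 7 * m + 114) %% 31) + 1
  as.Date(sprintf("%d-%02d-%02d", year, month, day))
}

easter_date(2013)        # "2013-03-31"
easter_date(2010:2016)   # the date moves between late March and late April
```

In a monthly sales series, the month returned here can serve as a dummy regressor for the Easter effect.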

## He’s getting ready to write a book

March 31, 2013

Eric Novik does some open-source planning: My co-author, Jacki Buros, and I [Novik] have just signed a contract with Apress to write a book tentatively entitled “Predictive Analytics with R”, which will cover programming best practices, data munging, data exploration, and single and multi-level models with case studies in social media, healthcare, politics, marketing, and [...]

## Introduction to Approximate Bayesian Computation (ABC)

March 31, 2013

Many of the posts in this blog have been concerned with using MCMC based methods for Bayesian inference. These methods are typically “exact” in the sense that they have the exact posterior distribution of interest as their target equilibrium distribution, but are obviously “approximate”, in that for any finite amount of computing time, we can […]
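For contrast with MCMC, the simplest ABC scheme is a rejection sampler: draw a parameter from the prior, simulate data, and keep the draw only if a summary of the simulated data lands close to the observed summary. The toy model below (normal data with known standard deviation, a uniform prior, the sample mean as summary statistic, tolerance 0.1) is entirely an assumption for illustration and is not the example from the post.

```r
set.seed(123)

# "Observed" data: 50 draws from a normal with unknown mean and known sd = 1
n <- 50
y_obs <- rnorm(n, mean = 3, sd = 1)
s_obs <- mean(y_obs)               # summary statistic of the observed data

n_draws <- 20000
eps <- 0.1                         # tolerance on the summary statistic

# 1. Draw candidate parameters from the prior, here Uniform(-10, 10)
mu_prior <- runif(n_draws, min = -10, max = 10)

# 2. Simulate a data set for each candidate and compute its summary
s_sim <- vapply(mu_prior,
                function(mu) mean(rnorm(n, mean = mu, sd = 1)),
                numeric(1))

# 3. Keep the candidates whose simulated summary is within eps of the observed
accepted <- mu_prior[abs(s_sim - s_obs) < eps]

length(accepted)                   # number of accepted draws
mean(accepted)                     # approximate posterior mean of mu
```

Shrinking `eps` makes the approximation better but lowers the acceptance rate, which is the basic trade-off in ABC.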

## George E P Box (1919–2013)

March 31, 2013

Last Thursday (28 March 2013), George Box passed away at the age of 93. He was one of the great statisticians of the last 100 years, and leaves an astonishingly diverse legacy. When I teach forecasting to my second year commerce students, we cover Box-Cox transformations, Box-Pierce and Ljung-Box tests, and Box-Jenkins modelling, and my students wonder if it is the same Box in all cases. It is. And we…

## R: Importing Data

March 31, 2013

There are a number of ways of importing data into R, and several formats are available: from Excel to R, from SPSS to R, from Stata to R, and more. In this post, I'm going to talk about importing common data formats that we often encounter, such as Excel, ...
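As a rough illustration of the most common case, the sketch below writes a small CSV file (for example, a sheet exported from Excel) to a temporary location and reads it back with `read.csv()`. The column names and values are made up. The commented lines show the `foreign` package readers for SPSS and Stata files; the file names there are placeholders, and `read.dta()` only supports older Stata formats.

```r
# Create a small CSV file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,group,score",
             "1,a,3.5",
             "2,b,4.0",
             "3,a,2.5"),
           csv_path)

# Read it into a data frame; the header row becomes the column names
dat <- read.csv(csv_path, stringsAsFactors = FALSE)
str(dat)

# Other formats (placeholder file names, not run here):
# spss_dat  <- foreign::read.spss("survey.sav", to.data.frame = TRUE)
# stata_dat <- foreign::read.dta("survey.dta")
```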

## Topological Inference

March 31, 2013

We uploaded a paper called Statistical Inference For Persistent Homology on arXiv. (I posted about topological data analysis earlier here.) The paper is written with Siva Balakrishnan, Brittany Fasy, Fabrizio Lecci, Alessandro Rinaldo and Aarti Singh. The basic idea is this. We observe data from a distribution that is supported on a set. We want to […]

## “Statistical Modeling: A Fresh Approach”

March 30, 2013

Ben Hansen recommended to me this book and course by Daniel Kaplan. It looks pretty good. I’ve only looked at the website, not the book itself, and I’m sure I’d find lots of places to disagree with it on details, but the general flow seemed reasonable, also I liked that there’s lots of course materials [...]

## More ordinal data display

March 30, 2013

Over the past two weeks I posted about analyzing ordinal data with R and JAGS. The calculations in the second part made me realize I could actually get top two box intervals out of R. This is demonstrated here. For that I needed the inv...

## Presenting without slides

March 30, 2013

Tired of slides, I’ve been experimenting with different ways of presenting. At the recent Conference on Statistical Practice, I decided only to use slides for an outline and references. As it turns out, the most critical feedback I got had to do with...

## The Art of R Programming review – part 5

March 30, 2013

It's what you've all been waiting for! Let's continue on with our book review: In Chapter 8, the author discusses Math and Simulation functions in R. The topics in this chapter could fill a book given this is what R is primarily used for. However the a...