## Halloween 2011 count

November 1, 2011
We don’t get many kids seeking candy at our house. I’m not sure if there just aren’t many kids in the neighborhood, or if it’s our location (next to the pond, with a big gap before the next house). I decided to keep track. As usual, we bought a huge bag of candy, and we […]

## Example 9.12: simpler ways to carry out permutation tests

October 31, 2011
In a previous entry, as well as in section 2.4.3 of the book, we describe how to carry out a two-group permutation test in SAS as well as with the coin package in R. We demonstrate by comparing the ages of the female and male subjects in the HELP study. I...
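As a rough illustration of the idea behind a two-group permutation test, here is a minimal Monte Carlo sketch in Python. It is not the SAS or coin code from the post; the function name `permutation_test`, its parameters, and the ages below are made up for illustration and are not HELP study data.

```python
import random


def permutation_test(x, y, n_perm=10000, seed=0):
    """Monte Carlo two-sided permutation test for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / (len(pooled) - n_x)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Count the observed labelling as one permutation so p is never exactly 0.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Made-up ages, for illustration only (not HELP study data).
group_a = [31, 35, 38, 40, 42, 45]
group_b = [29, 33, 34, 36, 37, 41]
observed, p_value = permutation_test(group_a, group_b)
print(f"difference in means = {observed:.2f}, p = {p_value:.3f}")
```

The p-value is the share of random relabellings whose mean difference is at least as large in absolute value as the observed one. An exact test would enumerate every relabelling instead of sampling them.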

## R 2.14.0 is released!

October 31, 2011
The new R 2.14.0 is out! Get the source code from here. Take a look at these posts for some miscellaneous advice to make the upgrade easier. Also, this thread on stackoverflow and this post contributed by Tal Galili can...

## Proc tabulate for simple statistics (corrected)

October 30, 2011
Ken Beath, of Macquarie University, commented on an earlier entry that the best way to generate summary statistics is using proc tabulate. While the best tools might differ, depending on the purpose, we wanted to share Ken's code demonstrating how to ...

## All your Bayes are belong to us!

October 27, 2011
This week's post contains solutions to My Favorite Bayes's Theorem Problems, and one new problem. If you missed last week's post, go back and read the problems before you read the solutions! If you don't understand the title of this post, brush...

## PAWL package on CRAN

October 26, 2011
The PAWL package (which I talked about there, and which implements the parallel adaptive Wang-Landau algorithm and adaptive Metropolis-Hastings for comparison) is now on CRAN! http://cran.r-project.org/web/packages/PAWL/index.html which means that within R you can easily install it by typing install.packages("PAWL") Isn’t that amazing? It’s just amazing. Kudos to the CRAN team for their quickness and their […]

## Example 9.11: Employment plot

October 25, 2011
A Facebook friend posted the picture reproduced above. It makes the case that President Obama has been a successful creator of jobs, and also paints GW Bush as a president who lost jobs. Another friend pointed out that to be fair, all of Bush's presi...

## Named in Best Colleges top 50 statistics blogs of 2011!

October 25, 2011
Realizations in Biostatistics has been named in Best Colleges top 50 statistics blogs of 2011! The wide variety of content in this blog has been noted, and, yes, I do try to write about a lot of different aspects of statistics for technical and no...

## Parameter vs. Observation Dimension?

October 24, 2011
*** Updated 10/27/11: Original text appended in strike. *** Bill Bolstad’s response to Xi’an’s review of his book Understanding Computational Bayesian Statistics included the following comment, which I found interesting: Frequentist p-values are constructed in the parameter dimension using a probability distribution defined only in the observation dimension. Bayesian credible intervals are constructed in the [...]

## Support Vector Machine with GPU, Part II

October 22, 2011
In our last tutorial on SVM training with GPU, we mentioned the necessary steps of pre-scaling the data with rpusvm-scale and reversing the scaling of the prediction outcome. This cumbersome procedure is now simplified with the latest RPUSVM.

## My favorite Bayes’s Theorem problems

October 20, 2011
This week: some of my favorite problems involving Bayes's Theorem. Next week: solutions. 1) The first one is a warm-up problem. I got it from Wikipedia (but it's no longer there): Suppose there are two full bowls of cookies. Bowl #1 has 10...

## The Wang-Landau algorithm reaches the flat histogram in finite time.

October 20, 2011
Cross-posted from my personal blog. MCMC practitioners may be familiar with the Wang-Landau algorithm, which is widely used in Physics. This algorithm divides the sample space into “boxes”. Given a target distribution, the algorithm then samples proportionally to the target in each box, while aiming at spending a pre-defined proportion of the sample in each [...]

## Parachutes

October 20, 2011
I went to a great talk today by David Goldstein, which I might write about further later since he said many things of considerable interest. But I had to quickly point to an interesting paper he mentioned: Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled [...]

October 17, 2011
I read Jason Rosenhouse's book about The Monty Hall Problem recently, and I use the problem as an example in my statistics class. Last semester I wrote a variation of the problem that turns out to be challenging, and a motivating problem for Baye...

## Random art on the web

October 15, 2011
Since we explored some statistics of an abstract painting with Pierre (we even have an article in the last issue of Variances!), I have become more sensitive to art linked to randomness. Here are some pointers to related websites I have dug up. Random.org, mentioned here by Pierre, is, as it reads, a true random number service that […]

## qr_multiply function in scipy.linalg

October 14, 2011
In scipy's development version there's a new function closely related to the QR-decomposition of a matrix and to the least-squares solution of a linear system. What this function does is to compute the QR-decomposition of a matrix and then multiply the...
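A minimal sketch of how such a function can feed a least-squares solve, assuming the `scipy.linalg.qr_multiply(a, c, mode=...)` interface found in released SciPy versions. The helper name `lstsq_via_qr_multiply` and the toy data are made up for illustration, and the sketch assumes a tall matrix with full column rank.

```python
import numpy as np
from scipy.linalg import qr_multiply, solve_triangular


def lstsq_via_qr_multiply(A, b):
    """Least-squares solution of A x ~ b for a tall, full-column-rank A."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # mode="right" returns c @ Q; with c = b as a 1 x M row this is (Q^T b)^T,
    # so Q itself is never formed explicitly.
    qtb, R = qr_multiply(A, b[np.newaxis, :], mode="right")
    # R is upper triangular (N x N when M >= N), so back-substitution suffices.
    return solve_triangular(R, qtb.ravel())


# Toy data lying exactly on the line y = 1 + 2 t (illustrative only).
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 3.0, 5.0, 7.0])
x = lstsq_via_qr_multiply(A, b)
print(x)
```

The point of the design is that only the product of Q with the right-hand side is needed, so Q never has to be built or stored.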

## Multiply Imputing an Outcome Variable

October 12, 2011
Some scholars suggest that multiply imputing an outcome variable is incorrect. I use intuition and simulation to argue that multiply imputing outcomes can drastically improve estimates, even in the case of non-ignorable missingness.

## Artist view of crimes in London

October 10, 2011
At first sight, one could think this picture is a scale model of some narrow mountains, like Bryce Canyon… Actually it represents crimes in East London: a cardboard artwork by the London artist Abigail Reynolds, called Mount Fear. Here is what can be read on the artist’s webpage: The terrain of Mount Fear is generated […]

## Using Sweave

October 8, 2011
If you use R and haven’t discovered Sweave then go and find out about it. It enables R code and plots to be incorporated into a document so the analysis and report can be combined in a single document. …

## Kernel Methods and Support Vector Machines de-Mystified

October 8, 2011
We give a simple explanation of the interrelated machine learning techniques called kernel methods and support vector machines. We hope to characterize and de-mystify some of the properties of these methods. To do this we work some examples and draw a few analogies. The familiar, no matter how wonderful, is not perceived as mystical. Goals [...]

## Bayesian Computation (3)

October 6, 2011
In Chapter 3 of "Bayesian Computation with R", Jim Albert talked about how to conduct two fundamental tasks of statistics, namely estimation and hypothesis testing, in a single-parameter framework. The structure of this chapter is organized as the followin...

## Obtain Trace of the Projection Matrix in a Linear Regression

October 6, 2011
Recently, I have been coding a set of regularized regressions in SAS and need to compute the trace of the projection matrix: $$S=X(X'X + \lambda I)^{-1}X'$$. Wikipedia has a well-written introduction to the trace here. To obtain the inverse of matrix ...
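To make the quantity concrete, here is a small NumPy sketch (not the SAS code from the post) that computes the trace of $$S$$ in two ways: directly from the definition, and from the singular values $$d_i$$ of $$X$$ using the identity $$\mathrm{tr}(S)=\sum_i d_i^2/(d_i^2+\lambda)$$. The function names and the random test matrix are made up for illustration.

```python
import numpy as np


def ridge_trace_direct(X, lam):
    """Trace of S = X (X'X + lam I)^{-1} X', computed from the definition."""
    p = X.shape[1]
    # Solve (X'X + lam I) B = X' rather than forming the inverse explicitly.
    B = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    return float(np.trace(X @ B))


def ridge_trace_svd(X, lam):
    """Same trace from the singular values d_i of X: sum d_i^2 / (d_i^2 + lam)."""
    d = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(d**2 / (d**2 + lam)))


# Random design matrix, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
print(ridge_trace_direct(X, 2.0), ridge_trace_svd(X, 2.0))
```

The singular-value form avoids building the n-by-n matrix $$S$$ and reuses one decomposition across many values of $$\lambda$$. With $$\lambda=0$$ and full column rank it reduces to the number of columns of $$X$$.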

## Calling Google Maps API from R

October 5, 2011
Hi, related to Julyan’s previous post, I want to share an easy way to access the Google Maps API through R. And then we’ll stop talking about Google, otherwise it’ll look like we’re just looking for jobs. My problem was the following: I have a database (from priceofweed.com), with locations written as “city, region, country”. What I [...]