Update to Data on Github Post: Solution to an RCurl problem

June 15, 2012

A reader of my most recent post tried the R code I had written to download the data set of electoral disproportionality from the GitHub repository. However, it didn’t work for them. After entering disproportionality.data <- getURL(url) they go...

Read more »

Cool-ass signal processing using Gaussian processes (birthdays again)

June 14, 2012

Aki writes: Here’s my version of the birthday frequency graph. I used Gaussian process with two slowly varying components and periodic component with decay, so that periodic form can change in time. I used Student’s t-distribution as observation model to allow exceptional dates to be outliers. I guess that periodic component due to week effect [...]

Read more »

Simple rendering of complex data

June 14, 2012

Andrew Gelman likes this line chart showing the day-by-day trend in childbirth: Andrew makes a number of good points about this chart. Make sure you read the whole post. One of his points concerns making the line smoother by removing...

Read more »

You are your shoes. An incoherent study claims.

June 14, 2012

According to this link, a study proved that "90 percent of a person's traits can be judged with their shoes." Without needing to look at the study, a reader can reason that this claim makes no sense. What does it mean by 90 percent of a person's traits? How many traits are there in a person? If the researcher defines 1000 traits, then the shoe predictor will need to predict…

Read more »

Freedman’s Neglected Theorem

June 14, 2012

—Larry Wasserman In this post I want to review an interesting result by David Freedman (Annals of Mathematical Statistics, Volume 36, Number 2 (1965), 454-456) available at projecteuclid.org. The result gets very little attention. Most researchers in statistics and machine learning seem to be unaware of the result. The result says that, “almost all” Bayesian [...]

Read more »

Body Weight in the United States – Part 2, "Non Factors"

June 13, 2012

Sometimes the story isn't what is a trend, but rather what is not a trend. In this second installment about body weight in the U.S., listing what doesn't seem to be contributing factors will help narrow down what might actually be the problem...

Read more »

Economists . . .

June 13, 2012

Catherine Rampell writes: On Monday the Nobel Foundation, which bestows the world’s most prestigious academic, literary and humanitarian prizes, said it was reducing the cash awarded with Nobel Prizes by about 20 percent. . . . Peter A. Diamond, a professor emeritus at the Massachusetts Institute of Technology who also received the Nobel in economic [...]

Read more »

A question about AIC

June 13, 2012

Jacob Oaknin asks: Akaike's selection criterion is often justified on the basis of the empirical risk of an ML estimate being a biased estimate of the true generalization error of a parametric family, say the family, S_m, of linear regressors on an m-dimensional variable x=(x_1,..,x_m) with Gaussian noise independent of x (for instance in “Unifying [...]

Read more »

Why R is Hard to Learn

June 13, 2012

The open source R software for analytics has a reputation for being hard to learn. It certainly can be, especially for people who are already familiar with similar packages such as SAS, SPSS or Stata. Training and documentation that leverages …

Read more »

Data on GitHub: The easy way to make your data available

June 13, 2012

Update (15 June 2012): See this post for instructions on how to download GitHub based data into R if you are getting the error about an SSL certificate problem. GitHub is designed for collaborating on coding projects. Nonetheless, it is also a pote...

Read more »

Convergence or divergence? A simple iteration with a random component

June 13, 2012

A colleague who works with time series sent me the following code snippet. He said that the calculation was overflowing and wanted to know if this was a bug in SAS: data A(drop=m); call streaminit(12345); m = 2; x = 0; do i = 1 to 5000; x = m*x [...]

Read more »

Teaching STT 200

June 12, 2012

This summer, I am teaching an undergraduate stats class, a first course in stats that covers three units: descriptive statistics, probability, and statistical inference. The course webpage is here. The following paragraph is from the thesis of Michael Phillip Lesnick. It explains the relationship among the three units: Recall first that in statistics, [...]

Read more »

Statistics Versus Machine Learning

June 12, 2012

—Larry Wasserman Welcome to my blog, which will discuss topics in Statistics and Machine Learning. Some posts will be technical and others will be non-technical. Since this blog is about topics in both Statistics and Machine Learning, perhaps I should address the question: What is the difference between these two fields? The short answer is: [...]

Read more »

NBA Predictions — Finals

June 12, 2012

Now we are on to the finals! The algorithm enters the finals with a 6-4 record so far. Here is what we have for tonight: So, let’s see if OKC wins this one.

Read more »

Next R meeting in Paris INSEE: ggplot2 and parallel computing

June 12, 2012

Hi, our group of R users from INSEE, aka FLR, meets monthly in Paris. Next meeting is on Wed 13 (tomorrow), 1-2 pm, room 539 (an ID is needed to come in, map to access INSEE R), about ggplot2 and parallel computing. Since the first meeting in February, presentations have included hot topics like web scraping, C in R, RStudio, SQLite […]

Read more »

Finding word use patterns in Wikileaks cables

June 12, 2012

6/18: A follow-up to this post is now available here. Recent Discoveries When I was a diplomat, I was always interested in the Wikileaks cables and what could be done with them. Unfortunately, I never got a chance to look at the site in depth, due to ...

Read more »

