Pasting Excel data into R on a Mac

June 2, 2012

When starting out with R, getting data in and out can be a bit of a pain. It shouldn't take long to work out a convenient method, depending on what OS you use and what other packages you work with. In my case I prefer to work with Excel spreadsheets (which are versatile and […]

Question 23 of my final exam for Design and Analysis of Sample Surveys

June 2, 2012

23. Suppose you are conducting a survey in which people are asked about their health behaviors (how often they wash their hands, how often they go to the doctor, etc.). There is a concern that different interviewers will get different sorts of responses—that is, there may be important interviewer effects. Describe (in two sentences) how [...]

Useful for referring–6-2-2012

June 2, 2012

Note: the following 4-7 are from Simply Statistics.
1. A Personal Perspective on Machine Learning
2. The differing perspectives of statistics and machine learning
3. Kernel Methods and Support Vector Machines de-Mystified
4. I love this article in the WSJ about the crisis at JP Morgan. The key point it highlights is that looking only at the high-level analysis and summaries can [...]

Helpful on happiness

June 2, 2012

Following on our recent discussion of contradictory findings on happiness, David Austin writes: A pellucid discussion of happiness and happiness research is Fred Feldman, What is This Thing Called Happiness? (Oxford University Press, 2010). And here...

Visualizing car brand choices in ggplot2

June 2, 2012

I always like to read new posts at chartsnthings, as they inspire me with new ideas for data visualization. Yesterday I read an article in Gazeta.pl on the car brands chosen by members of parliament in Poland. It contains a simple ...

Distribution of Oft-Used Bash Commands

June 1, 2012

Browsing commandlinefu.com today, I came across this little one-liner to display which commands I use most often. Here’s what I got: Yep, seems legit. I navigate and look at files a whole bunch (ls, cd, cat), and I do a butt tonne of editing (vim). I sudo like a boss, hop onto various servers (ssh),

Question 22 of my final exam for Design and Analysis of Sample Surveys

June 1, 2012

22. A supermarket chain has 100 equally-sized stores. It is desired to estimate the proportion of vegetables that spoil before being sold. Three stores are selected at random and are checked: the percentages of spoiled vegetables are 3%, 5%, and 10% in the three stores. Give an estimate and standard error for the percentage of [...]
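
A quick numerical sketch of one way to approach the visible part of this question; it is not the author's official answer. It assumes the truncated question asks for the chain-wide percentage of spoiled vegetables, treats the three stores as a simple random sample drawn without replacement from the 100 equally sized stores, and applies a finite population correction. All variable names are illustrative.

```python
import math
import statistics

# Observed spoilage percentages in the three sampled stores (from the question).
spoilage_pct = [3.0, 5.0, 10.0]
n_sampled = len(spoilage_pct)   # 3 stores checked
n_stores = 100                  # stores in the chain

# With equally sized stores, the sample mean of the store-level
# percentages estimates the chain-wide percentage.
estimate = statistics.mean(spoilage_pct)

# Sample standard deviation across stores (n - 1 denominator).
sd_between_stores = statistics.stdev(spoilage_pct)

# Assumption: sampling without replacement, so apply the
# finite population correction sqrt(1 - n/N).
fpc = math.sqrt(1.0 - n_sampled / n_stores)
standard_error = sd_between_stores / math.sqrt(n_sampled) * fpc

print(f"estimate = {estimate:.2f}%, standard error = {standard_error:.2f}%")
```

Because only 3 of the 100 stores are sampled, the correction factor is close to 1 and changes the standard error very little.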

Gibbs sampling a Gaussian Markov random field (GMRF) using Java

June 1, 2012

Introduction: As I’ve explained previously, I’m gradually coming around to the idea of using Java for the development of MCMC codes, and I’m starting to build up a collection of simple examples for getting started. One of the advantages of Java is that it includes a standard cross-platform GUI library. This might not seem like …

Beta distribution parameterized by mode instead of mean

June 1, 2012

In this post, I describe how it is easier to intuit the beta distribution in terms of its mode than its mean. This is especially handy when specifying a prior beta distribution. (In a previous post, I explained how it is easier to intuit the gamma...
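
To make the idea concrete, here is a minimal sketch of one common way to specify a beta distribution by its mode. It assumes a second parameter, here called `concentration` (equal to a + b and required to exceed 2), which is an assumption of this sketch and may differ from the parameterization used in the post. The function names are hypothetical.

```python
def beta_shapes_from_mode(mode, concentration):
    """Convert a mode and a concentration (a + b) into beta shape parameters.

    Assumes 0 < mode < 1 and concentration > 2, so that both shape
    parameters exceed 1 and the distribution has a single interior mode.
    """
    if not 0.0 < mode < 1.0:
        raise ValueError("mode must lie strictly between 0 and 1")
    if concentration <= 2.0:
        raise ValueError("concentration must be greater than 2")
    a = mode * (concentration - 2.0) + 1.0
    b = (1.0 - mode) * (concentration - 2.0) + 1.0
    return a, b


def beta_mode(a, b):
    """Mode of Beta(a, b), valid when a > 1 and b > 1."""
    return (a - 1.0) / (a + b - 2.0)


def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)


# Example: a prior whose most credible value is 0.8.
a, b = beta_shapes_from_mode(mode=0.8, concentration=12.0)
print(f"a = {a:.2f}, b = {b:.2f}")
print(f"mode = {beta_mode(a, b):.3f}, mean = {beta_mean(a, b):.3f}")
```

For a skewed prior like this one the mean sits below the mode, which is one reason the peak of the curve can be the easier quantity to reason about.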

Why use Odds Ratios in Logistic Regression

June 1, 2012

What that means is there is no way to express in one number how X affects Y in terms of probability. The effect of X on the probability of Y has different values depending on the value of X.
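
The point can be illustrated with a small sketch. The intercept and slope below are made-up values for a hypothetical one-predictor logistic model, not estimates from the post. The code compares the change in probability for a one-unit increase in X at several starting values with the corresponding odds ratio.

```python
import math

# Hypothetical coefficients for illustration only.
INTERCEPT = -2.0
SLOPE = 0.8


def probability(x):
    """P(Y = 1 | X = x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * x)))


def odds(x):
    """Odds of Y = 1 at X = x."""
    p = probability(x)
    return p / (1.0 - p)


def probability_change(x):
    """Change in probability when X goes from x to x + 1."""
    return probability(x + 1) - probability(x)


def odds_ratio(x):
    """Ratio of the odds at x + 1 to the odds at x."""
    return odds(x + 1) / odds(x)


for x in (0, 2, 5):
    print(
        f"x = {x}: change in probability = {probability_change(x):.3f}, "
        f"odds ratio = {odds_ratio(x):.3f}"
    )
```

The change in probability differs at each starting value of X, while the odds ratio is the same everywhere and equals exp(SLOPE), which is why a single odds ratio can summarize the effect.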

Selection in R

June 1, 2012

Related posts: R examine objects tutorial; Survive R; My Favorite Graphs

Interview with Amanda Cox – Graphics Editor at the New York Times

June 1, 2012

Amanda Cox received her M.S. in statistics from the University of Washington in 2005. She then moved to the New York Times, where she is a graphics editor. She, and the graphics team at the New York Times, are responsible for many of th...

Halloween/Valentine’s update

June 1, 2012

A few months ago we reported on a claim that more babies are born on Valentine’s Day and fewer on Halloween. At the time, I wrote that I’d like to see a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and [...]

Popular posts 2012 May

June 1, 2012

Most popular posts in 2012 May: Portfolio Diversity; Random portfolios: 6 steps to a better fund management industry; Cross-sectional skewness and kurtosis: stocks and portfolios; A tale of two returns (posted in 2010); Asset correlations with minimum variance portfolios; The top 7 portfolio optimization problems; The quality of variance matrix estimation; Correlations and positive-definiteness; Exponential …

Predicting NBA Playoff Games – Results and Update 1

June 1, 2012

Game Results: I recently made a post about developing an algorithm to predict the NBA playoffs, and I concluded with 2 predictions. Although Miami beat the Celtics to make my algorithm 1-0 in terms of predictions, it fell to 1-1 when the Thunder beat the Spurs. So, we are now at .500. Considering that the algorithm was about 61.5% accurate over the whole season, this is to be expected. I made…

Question 21 of my final exam for Design and Analysis of Sample Surveys

May 31, 2012

21. A country is divided into three regions with populations of 2 million, 2 million, and 0.5 million, respectively. A survey is done asking about foreign policy opinions. Somebody proposes taking a sample of 50 people from each region. Give a reason why this non-proportional sample would not usually be done, and also a reason [...]

Poll Shows Open Source Almost Even with Commercial Analytics Software

May 31, 2012

The 2012 results of the annual KDnuggets poll are in. They show R in first place, with 30.7% of users reporting having used it for a real project. Excel is almost as popular. It seems out of place among so …

Lindley’s paradox

May 31, 2012

Sam Seaver writes: I [Seaver] happened to be reading an ironic article by Karl Friston when I learned something new about frequentist vs bayesian, namely Lindley’s paradox, on page 12. The text is as follows: So why are we worried about trivial effects? They are important because the probability that the true effect size is [...]

