Row names in data frames: beware of 1:nrow

March 21, 2012

I spent some time puzzling over row names in data frames in R this morning. It seems that if you make the row names for a data frame, x, as 1:nrow(x), R will act as if you’d not assigned row names, and the names might get changed when you do rbind. Here’s an illustration: As […]
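
A minimal sketch of the behaviour described above. The data frames x and y and their values are assumptions made up for illustration, not the post's own example. Base R's .row_names_info() returns a negative row count when a data frame's row names are "automatic":

```r
# Hypothetical data frames, invented for illustration
x <- data.frame(a = c(10, 20, 30))
y <- data.frame(a = c(40, 50))

# Assigning 1:nrow(x) looks like an explicit choice of row names...
rownames(x) <- 1:nrow(x)
rownames(y) <- 1:nrow(y)

# ...but R stores them as "automatic" row names:
# .row_names_info() returns a negative count in that case
info_x <- .row_names_info(x)
print(info_x)

# rbind() therefore renumbers the rows that came from y
# instead of keeping their labels "1" and "2"
z <- rbind(x, y)
print(rownames(z))
```

If the labels matter, assigning character names that are not simply the sequence 1:nrow(x) (letters, for instance) should make R keep them as given.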

Nordic Countries Dominate the World in Internet Penetration

March 21, 2012

Something about that cold weather... The number of internet users in the Nordic countries has greatly outpaced the rest of the world. Denmark, Iceland, Norway, Sweden, Finland - all in the elite echelon. These countries share a common ancestry -...

The sun will probably come out tomorrow

March 21, 2012

I am always looking for good examples of Bayesian analysis, so I was interested in this paragraph from The Economist (September 2000): "The canonical example is to imagine that a precocious newborn observes his first sunset, and wonders whether the sun...

Using R for a salary negotiation–an extension of decision tree models

March 21, 2012

Let’s say you are in the middle of a salary negotiation, and you want to know whether you should be aggressive in your offering or conservative. One way to help with the decision is to make a decision tree. We’ll work with the following assumptions...

Copy and paste small data sets into R

March 21, 2012

How can I embed a small data set into my R code? That was the question I came across today, when I prepared my talk about Dynamical Systems in R with simecol for the forthcoming Cologne R user group meeting. I wanted to add all the R code of the talk ...
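
One common way to do this in base R, not necessarily the approach the post goes on to describe, is to paste the data into a string and hand it to read.table() through its text argument. A minimal sketch, with column names and values invented for illustration:

```r
# Hypothetical data, invented for illustration only
small <- read.table(header = TRUE, text = "
time value
0    1.0
1    1.5
2    2.3
")

# Inspect the structure of the resulting data frame
str(small)
```

This keeps the data next to the code that uses it, which is convenient for slides and short scripts, although it becomes unwieldy for anything beyond a few dozen rows.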

Geocode and reverse geocode your data using R, JSON and Google Maps’ Geocoding API

March 20, 2012

To geocode and reverse geocode my data, I use Google's Geocoding service, which returns the geocoded data as JSON. I recommend that you register with Google Maps A...

Statistical Misconception Removal

March 20, 2012

Our central city is being “deconstructed”. That’s the modern word for demolition. We live in Christchurch, New Zealand where many of our buildings were badly damaged by a string of serious earthquakes over the last 18 months, beginning on 4 …

Easiest and hardest classes to teach

March 20, 2012

I’ve taught a variety of math classes, and statistics has been the hardest to teach. The thing I find most challenging is coming up with homework problems. Most exercises are either blatantly artificial or extremely tedious. It’s hard to find moderately realistic problems that don’t take too long to work out. The course I’ve found easiest [...]

Visualize This: The FlowingData Guide to Design, Visualization, and Statistics

March 19, 2012

From: http://book.flowingdata.com/ A book by Nathan Yau, who writes for FlowingData, Visualize This is a practical guide to visualization and how to approach real-world data. The book is published by Wiley and is available...

Open data: Jimmy Wales and the Man from Sweden – Web Exclusive Article – Significance Magazine

March 19, 2012

Author: Julian Champkin. Jimmy Wales at the Gottlieb Duttweiler Awards Show, 2011. Image by Thomas Entzeroth (photographer) on behalf of Gottlieb Duttweiler I...

Graphing between-subject confidence intervals for ANOVA

March 19, 2012

This is a quick follow up to my earlier post that discussed how to graph CIs for within-subjects (repeated measures) ANOVA designs. My forthcoming book Serious stats describes how to do this for between-subjects designs (a much simpler proble...

Backtesting Asset Allocation portfolios

March 19, 2012

In the last post, Portfolio Optimization: Specify constraints with GNU MathProg language, Paolo and MC raised a question: “How would you construct an equal risk contribution portfolio?” Unfortunately, this problem cannot be expressed as a Linear or Quadratic Programming problem. The outline for this post: I will show how Equal Risk Contribution portfolio can be [...]

Independent measures (between-subjects) ANOVA and displaying confidence intervals for differences in means

March 18, 2012

In Chapter 2 (Confidence Intervals) of Serious stats I consider the problem of displaying confidence intervals (CIs) of a set of means (which I illustrate with the simple case of two independent means). Later, in Chapter 16 (Repeated Measures ANOVA), I consider the trickier problem of displaying of two or more means from paired or […]

Logistic map: Feigenbaum diagram in R

March 17, 2012

The other day I found some old BASIC code I had written about 15 years ago on a Mac Classic II to plot the Feigenbaum diagram for the logistic map. I remember it took the little computer the whole night to produce the bifurcation chart. With today's c...
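
For readers unfamiliar with the diagram, here is a rough sketch of how it could be produced in R. This is not the post's code: the function name logistic_orbit, the parameter range, and the iteration counts are all assumptions chosen for illustration. The logistic map is x[n+1] = r * x[n] * (1 - x[n]); the diagram plots the long-run values of x against r.

```r
# Iterate the logistic map for one value of r, discard the transient,
# and return the values visited afterwards (hypothetical helper)
logistic_orbit <- function(r, x0 = 0.5, n_burn = 200, n_keep = 100) {
  x <- x0
  for (i in seq_len(n_burn)) x <- r * x * (1 - x)   # burn-in iterations
  out <- numeric(n_keep)
  for (i in seq_len(n_keep)) {
    x <- r * x * (1 - x)
    out[i] <- x
  }
  out
}

# Assumed parameter grid: 600 values of r between 2.5 and 4
r_values <- seq(2.5, 4, length.out = 600)

# One column of long-run values per r
orbits <- sapply(r_values, logistic_orbit)

# Plot every retained value against the r that produced it
plot(rep(r_values, each = nrow(orbits)), as.vector(orbits),
     pch = ".", xlab = "r", ylab = "x")
```

Increasing n_keep or the number of r values sharpens the picture at the cost of run time.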

simulated annealing for Sudokus [2]

March 16, 2012

On Tuesday, Eric Chi and Kenneth Lange arXived a paper on a comparison of numerical techniques for solving sudokus. (The very Kenneth Lange who wrote this fantastic book on numerical analysis.) One of these techniques is the simulated annealing approach I had played with a long while ago. They seem to use the same penalisation [...]

Standards for statistical data dissemination: a wish list

March 16, 2012

Presentation from Xavier Badosa. The digitization of information exchange processes has led many industries to define standards to be used in the B2B side of the value chain for the c...

p curves revisited

March 15, 2012

I finally found some time to take a closer look at p curves. I haven't had a chance to follow-up my simulations (and probably won't for a few weeks if not months), but I have had time to think through the ideas the p curve approach raises based on some...

Seductive Causation

March 15, 2012

Causation is a seductive notion. We want to make meaning out of our world. I love playing “the beeping nose” with little children. I press their nose and it beeps. I press my nose and it whirrs. It fascinates them. …

Call for chapters: Data Mining Applications with R

March 15, 2012

A book to be published by Elsevier: http://www.RDataMining.com/books/book2 Proposal submission deadline: April 30, 2012. Introduction: R is one of the most widely used data mining tools in scientific and business applications, among dozens of commercial …

Ideas on A Really Fast Statistics Journal

March 15, 2012

I was writing comments on the blog post A proposal for a really fast statistics journal, and I realized the comment box was too small to write down my ideas. I like the proposal a lot, and I feel really bad about the current model of submitting and rev...

Bayesian statistics made simple

March 15, 2012

At PyCon last week I taught a tutorial on Bayesian statistics. It is based on Chapters 5 and 8 of Think Stats. Here is the web page I created for the tutorial. And here, courtesy of PyCon and pyvideo.org, is the video. It's three ho...

R code for p curves

March 14, 2012

I have finally got around to posting the R code for my p curve simulation. Those familiar with R will realize how crude it is (I've been caught up with other urgent stuff and had no time to explore further). You are welcome to play with (and improve!) t...
