The standard sample function behaves differently when it is given a single-element integer vector than when it is given a longer vector. This can lead to unexpected bugs in R code. Several times I had a problem with code similar to the one given here: for (i in 1:4) {...
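A minimal sketch of the behaviour described in this excerpt, not the truncated loop from the original post. As documented in R, `sample(x)` with a single number `x >= 1` samples from `1:x` rather than from the one-element vector. The wrapper name `safe_sample` is hypothetical; it follows the pattern suggested in the `sample` help page.

```r
# With a longer vector, sample() permutes the elements of that vector.
set.seed(1)
perm <- sample(c(3, 7, 9))

# With a single number x >= 1, sample() draws from 1:x instead,
# so these values come from 1:10, not from the vector c(10).
draws <- sample(10, 5)

# Hypothetical wrapper: always sample from the elements of x,
# whatever the length of x, by sampling indices instead of values.
safe_sample <- function(x, ...) x[sample.int(length(x), ...)]

one <- safe_sample(10, 1)        # the only element of c(10)
many <- safe_sample(c(3, 7, 9))  # a permutation of 3, 7, 9
```

Sampling indices with `sample.int(length(x))` sidesteps the special case, which matters when the length of `x` changes inside a loop.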

Non-Linear Iterative Partial Least Squares (NIPALS) is an algorithm for calculating principal components. Unlike the SVD method, it lets you calculate only the required number of components, which is useful for large data sets as typically only...
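The idea can be sketched in a few lines of R. This is an illustrative version of NIPALS under stated assumptions, not the code from the original post: the function name `nipals`, its arguments, the starting vector, and the convergence rule are all choices made here, and the built-in `iris` data is used only as example input.

```r
# Illustrative NIPALS sketch: extracts `ncomp` components one at a time.
nipals <- function(X, ncomp = 2, tol = 1e-12, max_iter = 500) {
  X <- scale(as.matrix(X), center = TRUE, scale = FALSE)  # centre columns
  scores <- matrix(0, nrow(X), ncomp)
  loadings <- matrix(0, ncol(X), ncomp)
  for (k in seq_len(ncomp)) {
    # Assumed start: the column with the largest variance.
    t_k <- X[, which.max(apply(X, 2, var))]
    for (iter in seq_len(max_iter)) {
      p_k <- crossprod(X, t_k) / sum(t_k^2)  # project X onto the scores
      p_k <- p_k / sqrt(sum(p_k^2))          # normalise loadings to length 1
      t_new <- X %*% p_k                     # updated scores
      converged <- sum((t_new - t_k)^2) < tol * sum(t_new^2)
      t_k <- t_new
      if (converged) break
    }
    scores[, k] <- t_k
    loadings[, k] <- p_k
    X <- X - tcrossprod(t_k, p_k)            # deflate: remove this component
  }
  list(scores = scores, loadings = loadings)
}

# Example input (an assumption): the four numeric columns of iris.
fit <- nipals(iris[, 1:4], ncomp = 2)
```

Because each component is found and removed in turn, the loop can stop after `ncomp` components instead of decomposing the whole matrix. The loadings should agree with `prcomp` up to sign.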

Following Pierre’s post on psycho dice, I want to see here by what average margin repeated plays could be said to be influenced by willpower. The rules are the following (excerpt from the novel Midnight in the Garden of Good and Evil, by John Berendt): You take four dice and call out four numbers between one […]

Baptiste Coulmont explains on his blog how to use the R package maptools. It is based on shapefiles, for example the ones offered by the French geography agency IGN (at the départements and communes levels). Some additional material, like roads and railways, is provided by the OpenStreetMap project, here. For the above map, you need […]

Principal Component Analysis (PCA) is widely used in many data analysis methods as it can reduce the complexity of large interrelated data sets. The easiest way to calculate PCA is by a matrix decomposition, namely singular value decomposition (SVD)...
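As a rough illustration of PCA via SVD in R, the sketch below centres a data matrix and reads the components off `svd()`. The built-in `iris` data and the variable names are assumptions made for the example; they do not come from the original post.

```r
# Example input (an assumption): the four numeric columns of iris.
X <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)  # centre each column

s <- svd(Xc)                          # Xc = U D V'
loadings <- s$v                       # principal axes, one per column
scores <- Xc %*% s$v                  # data projected onto the axes
variances <- s$d^2 / (nrow(Xc) - 1)   # variance explained by each component
explained <- variances / sum(variances)
```

The same quantities are available from `prcomp(X)`, which also uses SVD: `rotation` corresponds to `loadings` up to sign and `sdev^2` to `variances`.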

I may want to add a subtitle "Why R-Forge Must Die" (thinking of Barry Rowlingson's talk earlier this year). I have been a GitHub user for two years, and I was mainly influenced by Hadley. Now I even feel a little bit addicted to GitHub (its slogan is ...

It is a subtle point that statistical modeling is different from model-based science. However, empirical scientists seem to go out of their way to conflate the two before the public (as statistical modeling is easier to perform and model-based science is more highly rewarded). It is often claimed that model-based science is [...]