It used to be that the one of the first decisions to make when learning to program was between compiled (e.g. C or FORTRAN) and interpreted (e.g. Python) languages. In my opinion these days one would have to be a … Continue reading →
Another R tip: beware of as.character applied to a list. Really, beware of grep with a list: You might have thought that the result would be just 1, but grep expects a vector of character strings. If the input is not that, it uses as.character(). Since the result of that starts with "c(", grep finds […]
It’s widely understood that, in R programming, one should avoid for loops and always try to use apply-type functions. But this isn’t entirely true. It may have been true for Splus, back in the day: As I recall, that had to do with the entire environment from each iteration being retained in memory. Here’s a […]
Many of the posts in this blog have been concerned with using MCMC based methods for Bayesian inference. These methods are typically “exact” in the sense that they have the exact posterior distribution of interest as their target equilibrium distribution, but are obviously “approximate”, in that for any finite amount of computing time, we can […]
Rainer pointed out, in response to my post, Row names in data frames: Beware of 1:nrow, that if I’d used rownames(x) <- as.character(1:3) rather than rownames(x) <- 1:3, I wouldn’t have had the problem I’d seen. If you type rownames(x) you see the same result as rownames(z), and is.character(rownames(x)) and is.character(rownames(z)) both return TRUE, but […]
One of the coolest R packages I heard about at the useR! Conference: Toby Dylan Hocking‘s directlabels package for putting labels directly next to the relevant curves or point clouds in a figure. I think I first learned about this idea from Andrew Gelman: that a separate legend requires a lot of back-and-forth glances, so [...]
Let’s start this blog off right, with the stupidest R mistake I’ve ever made (I think). In the R package that I write, R/qtl, one of the main file formats is a comma-delimited file, where the blank cells in the second row are important, as they distinguish the initial phenotype columns from the genetic marker [...]