Random art on the web

October 15, 2011

Since Pierre and I explored some statistics of an abstract painting (we even have an article in the latest issue of Variances!), I have become more sensitive to art linked to randomness. Here are some pointers to related websites I have dug up. Random.org, mentioned here by Pierre, is, as it reads, a true random number service that […]

Read more »

qr_multiply function in scipy.linalg

October 14, 2011

SciPy's development version has a new function, qr_multiply, closely related to the QR decomposition of a matrix and to the least-squares solution of a linear system. This function computes the QR decomposition of a matrix and then multiplies the...
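To illustrate the least-squares connection, here is a minimal Python sketch. It assumes a SciPy release that ships `scipy.linalg.qr_multiply`; the helper name `lstsq_via_qr_multiply` and the data `A` and `b` are made up for this example and do not come from the post. The idea is that solving a least-squares problem through QR only needs the product of Q' with the right-hand side, so Q never has to be formed explicitly.

```python
import numpy as np
from scipy.linalg import qr_multiply, solve_triangular


def lstsq_via_qr_multiply(a, b):
    """Least-squares solution of a @ x ~= b for a tall, full-rank real matrix a."""
    # mode="right" returns b @ Q; for a real 1-D b these are the entries of Q'b.
    # The second return value is the upper-triangular factor R.
    qtb, r = qr_multiply(a, b, mode="right")
    # Back-substitution on the triangular system R x = Q'b.
    return solve_triangular(r, qtb)


# Hypothetical data: four points that lie on a straight line.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 3.0, 5.0, 7.0])
x = lstsq_via_qr_multiply(A, b)
print(x)
```

The result can be checked against `numpy.linalg.lstsq(A, b, rcond=None)[0]`, which should agree up to rounding.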

Read more »

Multiply Imputing an Outcome Variable

October 12, 2011

Some scholars suggest that multiply imputing an outcome variable is incorrect. I use intuition and simulation to argue that multiply imputing outcomes can drastically improve estimates, even in the case of non-ignorable missingness.

Read more »

Artist view of crimes in London

October 10, 2011

At first sight, one could think this picture shows a scale model of some narrow mountains, like Bryce Canyon… Actually, it represents crimes in East London: it is a cardboard artwork called Mount Fear, by the London artist Abigail Reynolds. Here is what can be read on the artist’s webpage: The terrain of Mount Fear is generated […]

Read more »

Using Sweave

October 8, 2011

If you use R and haven’t discovered Sweave, then go and find out about it. It lets you incorporate R code and plots into a document, so the analysis and the report are combined in a single file. …

Read more »

Kernel Methods and Support Vector Machines de-Mystified

October 8, 2011

We give a simple explanation of the interrelated machine learning techniques called kernel methods and support vector machines. We hope to characterize and de-mystify some of the properties of these methods. To do this we work some examples and draw a few analogies. The familiar, no matter how wonderful, is not perceived as mystical. Goals [...]

Read more »

Bayesian Computation (3)

October 6, 2011

In Chapter 3 of "Bayesian Computation with R", Jim Albert discusses how to carry out two fundamental tasks of statistics, namely estimation and hypothesis testing, in a single-parameter framework. The structure of this chapter is organized as the followin...

Read more »

Obtain Trace of the Projection Matrix in a Linear Regression

October 6, 2011

Recently, I have been coding a set of regularized regressions in SAS and need to compute the trace of the projection matrix: $$ S=X(X'X + \lambda I)^{-1}X' $$. Wikipedia has a well-written introduction to the trace here. To obtain the inverse of matrix ...
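As an aside, the trace of S can also be obtained without forming S or an explicit inverse, using the singular values of X. This is a standard identity, shown here as a sketch and not necessarily the route taken in the post. It assumes X is n × p with n ≥ p and \lambda > 0; the thin SVD factors U, D, V and the singular values d_j are notation introduced for this note only.

$$ X = U D V', \qquad X'X + \lambda I = V (D^2 + \lambda I) V' $$

$$ S = U D (D^2 + \lambda I)^{-1} D U', \qquad \operatorname{tr}(S) = \sum_{j=1}^{p} \frac{d_j^2}{d_j^2 + \lambda} $$

As a sanity check, when \lambda approaches 0 and X has full column rank, every term tends to 1 and the trace tends to p, the trace of the ordinary least-squares projection matrix.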

Read more »

Calling Google Maps API from R

October 5, 2011

Hi, following up on Julyan’s previous post, I want to share an easy way to access the Google Maps API through R. After that we’ll stop talking about Google, otherwise it’ll look like we’re just looking for jobs. My problem was the following: I have a database (from priceofweed.com) with locations written as “city, region, country”. What I [...]

Read more »

Calculating and graphing within-subject confidence intervals for ANOVA

October 4, 2011

Psychologists are gradually coming round to the view that it is a good idea to present interval estimates alongside point estimates of statistics. The most common statistic reported in psychology research is almost certainly the me...

Read more »

Drawing maps using shape files and R

October 4, 2011

Sometimes, the only thing we want is a chart that speaks for itself rather than boring regression tables in our research paper. Graphs are efficient at showing the broad picture of an issue. In fact, graphs in research papers seem to be gaining a momen...

Read more »

Showing Explained Variance in Multilevel Models

October 3, 2011

In this post I will show one way to display explained variance using a line chart. To the best of my knowledge, there is no default plot for displaying the effect of a factor on the deviance of multilevel models; so this is going to be a tentative ...

Read more »

Showing explained variance from multilevel models

October 3, 2011

In this post I will show one way to display explained variance using a line chart. To the best of my knowledge, there is no default plot for displaying the effect of a factor on the deviance of multilevel models; so this is going to be a tentativ...

Read more »

Linear Regression Diagnostics

October 2, 2011

In a previous post I detailed a method for calculating linear regression models using matrices. Here I expand the matrix calculations and show how to calculate studentized residuals and Cook's distance, including some C# code. The following shows h...
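The quantities named above can be sketched in a few lines of Python for the simplest case, a straight-line fit with an intercept. This is not the post's C# code: the function name and the data are made up, and the residuals are assumed to be internally studentized (scaled by the overall residual variance), which may differ from the variant used in the post. For one predictor the diagonal of the hat matrix H = X(X'X)^{-1}X' has a closed form, so no matrix library is needed.

```python
import math


def simple_regression_diagnostics(x, y):
    """Leverage, internally studentized residuals and Cook's distance
    for the model y = b0 + b1 * x fitted by ordinary least squares."""
    n, p = len(x), 2  # p counts the fitted coefficients (intercept and slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Diagonal of the hat matrix for one predictor plus an intercept.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate

    studentized = [e / math.sqrt(s2 * (1 - h)) for e, h in zip(residuals, leverage)]
    cooks = [r * r / p * h / (1 - h) for r, h in zip(studentized, leverage)]
    return leverage, studentized, cooks


# Made-up data; the last point sits well away from the trend of the others.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
leverage, studentized, cooks = simple_regression_diagnostics(x, y)
most_influential = max(range(len(x)), key=lambda i: cooks[i])
print(most_influential, cooks[most_influential])
```

A useful check on any implementation is that the leverages sum to the number of fitted coefficients, here 2.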

Read more »

scikit-learn 0.9

October 2, 2011

Last week we released a new version of scikit-learn. The changelog is particularly impressive, but for me personally this release is important for other reasons. This will probably be my last release as a paid engineer. I'm starting a PhD next month, and alth...

Read more »

The Average Hotel Does Not Get The Average Rating

October 1, 2011

The millions of travelers who review hotels, restaurants, and other attractions on TripAdvisor also supply a numeric rating by clicking one of five circles ranging from 1 for "terrible" to 5 for "excellent." On the whole, travelers are pretty kind. The ...

Read more »

Converting images in Python

September 29, 2011

I had a recent request to convert an entire folder of JPEG images into EPS or similar vector graphics formats. The client was on a Mac and didn’t have ImageMagick. I found the Python Imaging Library enormously useful for this: it allowed me to implement the conversion in around 10 lines of Python code! […]
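A conversion along these lines might look like the sketch below. It is not the post's code: the function names are made up, and it assumes the Pillow fork of the Python Imaging Library is installed (imported inside the function so the helper can be used without it). Note that saving to EPS this way wraps the raster image in an EPS container; it does not trace the picture into vector shapes.

```python
from pathlib import Path


def eps_path_for(jpeg_path):
    """Return the output path: same folder and stem, with an .eps suffix."""
    return Path(jpeg_path).with_suffix(".eps")


def convert_folder(folder):
    """Convert every .jpg/.jpeg file in `folder` to EPS and return the new paths."""
    from PIL import Image  # lazy import; requires Pillow to be installed

    written = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        with Image.open(path) as img:
            # The EPS writer does not accept every image mode, so normalise to RGB.
            img.convert("RGB").save(eps_path_for(path), "EPS")
        written.append(eps_path_for(path))
    return written
```

Calling `convert_folder("photos")`, where `photos` is a hypothetical folder name, would write one `.eps` file next to each JPEG.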

Read more »

Googling Bayes’ pictures

September 29, 2011

I am writing way too many posts in a row on Google tools. I promise I will think about something else soon. I find it amusing that you can launch a search in Google Images by just dragging a picture into the search box, instead of typing text. I remember that Pierre told me about it [...]

Read more »

A Bayesian view of Amazon Resellers

September 28, 2011

I was buying a used book through Amazon this evening. Three resellers offered the book at essentially the same price. Here were their ratings:

94% positive out of 85,193 reviews
98% positive out of 20,785 reviews
99% positive out of 840 reviews

Which reseller is likely to give the best service? Before you assume it’s the seller with the [...]
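One way to make the question concrete is to treat each seller's true satisfaction rate as unknown and compare posterior distributions. The sketch below is an illustration under stated assumptions, not the post's own calculation: it assumes a uniform Beta(1, 1) prior, independent reviews, and positive counts recovered by rounding the quoted percentages. The labels A, B, C and the function names are made up.

```python
import random

# (label, fraction positive, number of reviews) as quoted in the excerpt
SELLERS = [("A", 0.94, 85193), ("B", 0.98, 20785), ("C", 0.99, 840)]


def posterior_params(fraction_positive, n_reviews):
    """Beta(alpha, beta) posterior under a uniform Beta(1, 1) prior."""
    # Assumption: the number of positive reviews is recovered by rounding.
    positives = round(fraction_positive * n_reviews)
    return 1 + positives, 1 + n_reviews - positives


def prob_best(sellers, draws=20000, seed=0):
    """Monte Carlo estimate of the probability that each seller has the highest rate."""
    rng = random.Random(seed)
    params = [posterior_params(f, n) for _, f, n in sellers]
    wins = [0] * len(sellers)
    for _ in range(draws):
        sample = [rng.betavariate(a, b) for a, b in params]
        wins[sample.index(max(sample))] += 1
    return {name: w / draws for (name, _, _), w in zip(sellers, wins)}


print(prob_best(SELLERS))
```

The estimates depend on the prior and on the independence assumption, so they are a starting point for the comparison and not a verdict.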

Read more »

Ghastly R code

September 27, 2011

My R package, R/qtl, contains about 33k lines of R code (and 21k lines of C code). Some of it is quite good; some of it is terrible. Here’s another example of the terrible. I’ve long needed to revise the function scantwo, for performing a two-dimensional genome scan for pairs of loci. I was looking [...]

Read more »

Gamified

September 26, 2011

Barry Rowlingson gave an interesting talk at UseR 2011, “Why R-help must die!” He suggested the Q-and-A type sites Stack Overflow (on programming) and Cross Validated (on statistics), both part of Stack Exchange. An interesting feature of these sites is that, in addition to voting up and down on the questions and answers, one accrues [...]

Read more »

ZedGraph Box Plot

September 24, 2011

It is possible to get a rudimentary box plot in ZedGraph by combining a HiLowBarItem and an ErrorBarItem. That looks something like this. The code below assumes that you have a form with a ZedGraph control on it. Here is the Boxplot …

Read more »

The equivalence of logistic regression and maximum entropy models

September 23, 2011

Nina Zumel recently gave a very clear explanation of logistic regression (The Simpler Derivation of Logistic Regression). In particular, she called out the central role of log-odds ratios and demonstrated how the “deviance” (that mysterious quantity reported by fitting packages) is both a term in “the pseudo-R^2” (so directly measures goodness of fit) [...]
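For reference, the quantities named above can be written out as follows. The notation is standard textbook notation and is not taken from either post: p_i is the fitted probability for example i, x_i its feature vector, \beta the coefficient vector, and y_i \in \{0, 1\} the observed outcome. The pseudo-R^2 shown is the deviance-based form, which is an assumption about the variant meant here.

$$ \log\frac{p_i}{1 - p_i} = \beta \cdot x_i $$

$$ \text{deviance} = -2 \sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right] $$

$$ \text{pseudo-}R^2 = 1 - \frac{\text{deviance}}{\text{null deviance}} $$

Here the null deviance is the deviance of the model that contains only an intercept, so the ratio measures how much of it the fitted model removes.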

Read more »

