## Review: Kölner R Meeting 13 December 2013

December 17, 2013
Last week's Cologne R user group meeting was the best attended so far. Well, we had a great line up indeed. Matt Dowle came over from London to give an introduction to the data.table package. He was joined by his collaborator Arun Srinivasan, who is ba...

## Excel, fanaticism and R

December 17, 2013
This week I’ve been feeling tired of excessive fanaticism (or zealotry) of open source software (OSS) and R in general. I do use a fair amount of OSS and pushed for the adoption of R in our courses; in fact, I do think OSS is a Good Thing™. I do not like, however, constant yabbering […]

## On Wigner’s law (and the semi-circle)

December 17, 2013

There is something that I love about mathematics: sometimes, you discover – by chance – a law. It has always been there, it might have been well known by some people (specialized in some given field), but you did not know it. And then, you discover it, and you start wondering how come you never heard about it before… I experienced that feeling this evening, while working on the syllabus for…

## THE END

December 17, 2013
In addition to being the best comedy TV show ever, Seinfeld was a great source of wisdom. In one episode, Jerry counsels George: “When you hit that high note, you say goodnight and walk off.” Later, George has a good line at a meeting at Kruger Industrial Smoothing. Then he says: “Alright! That’s it for […]

## Christmas came early (or who’s the geekiest in the family?)

December 16, 2013
By pure accident (honest! I didn't do it on purpose!), last week I opened a package that had come in the post for Marta. Too bad that, of all the packages I could have opened, this was the Christmas gift she had bought (in fact I should say manufacture...

## A summary of the evidence that most published research is false

December 16, 2013
One of the hottest topics in science has two main conclusions: (1) most published research is false, and (2) there is a reproducibility crisis in science. The first claim is often stated in a slightly different way: that most results of scientific experiments …

## Whither the “bet on sparsity principle” in a nonsparse world?

December 16, 2013
Rob Tibshirani writes: Hastie et al. (2001) coined the informal “Bet on Sparsity” principle. The l1 methods assume that the truth is sparse, in some basis. If the assumption holds true, then the parameters can be efficiently estimated using l1 penalties. If the assumption does not hold—so that the truth is dense—then no method will […]

## The exception to the rule against dual axes

December 16, 2013
Dual axes are almost always a bad idea. But there is one situation in which I'd use them. *** Last week, Alberto Cairo engaged in a Twitter/blogging debate about a chart that first appeared in Reuters concerning the state...

## FRB St. Louis is Far Ahead of the Data Pack

December 16, 2013
The email below arrived recently from the Federal Reserve Bank of St. Louis. It reminds me of something that's hardly a secret, but that nevertheless merits applause, namely that FRBSL's Research Department is a wonderful source of economic and fi...

## Death of a statistician from Surbiton

December 16, 2013
This morning I heard from Christian the news that Dennis Lindley has passed away, last Saturday. He was 90. I think I'd already linked to this video in a different post, but I'm re-linking it right now.

## How to coerce SAS/IML vectors to matrices

December 16, 2013
Recently a SAS/IML programmer asked a question regarding how to perform matrix arithmetic when some of the data are in vectors and others are in matrices. The programmer wanted to add the following matrices: The problem was that the numbers in the first two matrices were stored in vectors. The [...]

## Scaling An Axis to Make A Point

December 16, 2013
A clever chart redesign last week got a lot of people talking about which one is “right.” What is more interesting to me is not which one is (supposedly) the better representation of the truth, but which purpose each one serves. The original chart is the following, which shows the number of female CEOs in […]

## Sunday data/statistics link roundup (12/15/13)

December 16, 2013
Rafa (in Spanish) clarifying some of the problems with the anti-GMO crowd. Joe Blitzstein, most recently of #futureofstats fame, talks up data science in the Harvard Crimson (via Rafa). As has been pointed out by Rebecca Nugent when she stopped …

## Deterministic and Probabilistic models and thinking

December 16, 2013
The way we understand and make sense of variation in the world affects decisions we make. Part of understanding variation is understanding the difference between deterministic and probabilistic (stochastic) models. The NZ curriculum specifies the following learning outcome: “Selects and …

## The Complexities of Customer Segmentation: Removing Response Intensity to Reveal Response Pattern

December 15, 2013
At the end of the last post, the reader was left assuming respondent homogeneity without any means for discovering if all of our customers adopted the same feature prioritization. To review, nine features were presented one at a time, and each time res...

## Surprising Facts about Surprising Facts

December 15, 2013
A paper of mine on “double-counting” and novel evidence just came out: “Some surprising facts about (the problem of) surprising facts” in Studies in History and Philosophy of Science (2013), http://dx.doi.org/10.1016/j.shpsa.2013.10.005 ABSTRACT: A common intuition about evidence is that if data x have been used to construct a hypothesis H, then x should not be used again in […]

## The UN Plot to Force Bayesianism on Unsuspecting Americans (penalized B-Spline edition)

December 15, 2013
Mike Spagat sent me an email with the above heading, referring to this paper by Leontine Alkema and Jin Rou New, which begins: National estimates of the under-5 mortality rate (U5MR) are used to track progress in reducing child mortality and to evaluate countries’ performance related to United Nations Millennium Development Goal 4, which calls […]

## plotting y and log(y) in one figure

December 15, 2013
Sometimes I want to plot the same data on both the linear and the log scale. Simply drawing two separate figures is not my solution, because I want to save space and reuse the x-axis, legend, and title. This post examines possibilities to do so with standard plot tools, lattice and g...

## Trapezoidal Integration – Conceptual Foundations and a Statistical Application in R

Today, I will begin a series of posts on numerical integration, which has a wide range of applications in many fields, including statistics. I will introduce trapezoidal integration by discussing its conceptual foundations, write my own R function to implement trapezoidal integration, and use it to check that the Beta(2, 5) probability density function […]
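
As a rough illustration of the idea in that excerpt, here is a minimal Python sketch of the composite trapezoidal rule. It is not the R function the post goes on to write: the names `trapezoid` and `beta_2_5_pdf`, the equal-width grid, and the default of 1000 subintervals are all assumptions made for this example. It relies on the fact that the Beta(2, 5) density is $30x(1-x)^4$ on $[0, 1]$, since $B(2, 5) = 1/30$, and checks that the approximate area under it is close to 1.

```python
def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using n equal-width trapezoids.

    Hypothetical helper for illustration; not taken from the original post.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n                      # width of each subinterval
    total = 0.5 * (f(a) + f(b))          # the two endpoints get half weight
    for i in range(1, n):
        total += f(a + i * h)            # interior points get full weight
    return total * h


def beta_2_5_pdf(x):
    """Beta(2, 5) density on [0, 1]: x * (1 - x)**4 / B(2, 5), with B(2, 5) = 1/30."""
    return 30.0 * x * (1.0 - x) ** 4


# A valid probability density should integrate to (approximately) 1.
area = trapezoid(beta_2_5_pdf, 0.0, 1.0)
print(f"Approximate area under the Beta(2, 5) density: {area:.6f}")
```

Increasing `n` tightens the approximation; for a smooth integrand like this one, the error of the trapezoidal rule shrinks roughly in proportion to $1/n^2$.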

## Visualization of 2012 Crime Rates of Different States in the US using rCharts

December 15, 2013
UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1pi6mGo. PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. In this post, I look at crime rates (per 100,000 people) in 2012...