## Slides of keynote speeches, tutorials and panelist presentations at IEEE Big Data 2014

November 23, 2014
Slides of keynote speeches, tutorials and panelist presentations at the 2014 IEEE International Conference on Big Data are available on the conference website at the links below. (1) Keynote speech http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm – Never-Ending Language Learning, Tom Mitchell – E. Fredkin …

## When should I change to snow tires in the Netherlands

November 23, 2014
The Royal Netherlands Meteorological Institute has weather information by day for a number of Dutch stations. In this post I want to use those data for a practical problem: when should I switch to winter tires? (or is that snow tires? In any case nails...

## Msc Kvetch: “You are a Medical Statistic”, or “How Medical Care Is Being Corrupted”

November 22, 2014
A NYT op-ed the other day, “How Medical Care Is Being Corrupted” (by Pamela Hartzband and Jerome Groopman, physicians on the faculty of Harvard Medical School), gives a good sum-up of what I fear is becoming the new normal, even under so-called “personalized medicine”. “It is obsolete for the doctor to approach each patient strictly as an individual; medical decisions should […]

## Statistical computing languages at the RSS

November 22, 2014
On Friday the Royal Statistical Society hosted a meeting on Statistical computing languages, organised by my colleague Colin Gillespie. Four languages were presented at the meeting: Python, Scala, Matlab and Julia. I presented the talk on Scala. The slides I presented are available, in addition to the code examples and instructions on how to run …

## Statistics for Big Data

November 22, 2014
Doctoral programme in cloud computing for big data. I’ve spent much of this year working to establish our new EPSRC Centre for Doctoral Training in Cloud Computing for Big Data, which partly explains the lack of posts on this blog in recent months. The CDT is now established, with 11 students in the first cohort, …

## Blogs > Twitter

November 22, 2014
Tweeting has its virtues, I’m sure. But over and over I’m seeing these blog vs. twitter battles where the blogger wins. It goes like this: blogger gives tons and tons of evidence, tweeter responds with a content-free dismissal. The most recent example (as of this posting; remember we’re on an approx 2-month delay here; yes, […] The post Blogs > Twitter appeared first on Statistical Modeling, Causal Inference, and Social…

## Factor Analysis vs Principal Component Analysis

November 22, 2014
Several papers recently discussed in our journal club focus on integrative clustering of multiple omics data sets. I found that they all originate from factor analysis and exploit its advantages over principal component analysis. Let’s recall the model for factor analysis […]
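
For reference, the standard textbook form of the factor analysis model is sketched below. The notation ($x$ for the $p$-dimensional observation, $\mu$ for its mean, $\Lambda$ for the $p \times k$ loading matrix, $z$ for the latent factors, $\Psi$ for the diagonal noise covariance) is an assumption and may not match the symbols used in the original post.

$$
\begin{aligned}
x &= \mu + \Lambda z + \varepsilon, \\
z &\sim \mathcal{N}(0, I_k), \qquad \varepsilon \sim \mathcal{N}(0, \Psi), \qquad \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p), \\
\operatorname{Cov}(x) &= \Lambda \Lambda^{\top} + \Psi .
\end{aligned}
$$

In this formulation, probabilistic PCA is the special case $\Psi = \sigma^2 I_p$, so factor analysis differs by allowing each observed variable its own noise variance.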

## 50 shades of gray goes pie-chart

November 22, 2014
Rogier Kievit sends in this under the heading, “Worst graph of the year . . . horribly unclear . . . Even the report doesn’t have a legend!”: My reply: It’s horrible but I still think the black-and-white Stroop test remains the worst visual display of all time: What’s particularly amusing about the Stroop image […] The post 50 shades of gray goes pie-chart appeared first on Statistical Modeling, Causal…

## Flowers/Fractals

November 22, 2014
Last week, I attended a "Flower Fest" where I had the opportunity to admire several of the most beautiful, award-winning flowers, orchids, and decorative plants. Surprisingly, though, I had never thought of flowers as fractals the way I did this time. Fractals attract lots of interest, especially from mathematicians who actually spend some time […]

## Ordinal probit regression: Transforming polr() parameter values to make them more intuitive

November 21, 2014
In R, the polr function in the MASS package does ordinal probit regression (and ordinal logistic regression, but I focus here on probit). The polr function yields parameter estimates that are difficult to interpret intuitively because they assume a bas...
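
As a rough illustration of what the ordinal probit parameters mean, the sketch below turns a linear predictor and a set of thresholds into category probabilities using the relation P(y ≤ k) = Φ(ζ_k − η), which is the parameterization documented for `polr`. It is written in Python rather than R, and the function name `ordinal_probit_probs` and the numeric values are hypothetical, chosen only for the example.

```python
from statistics import NormalDist


def ordinal_probit_probs(eta, thresholds):
    """Return P(y = k) for each ordered category under an ordinal probit model.

    eta        -- linear predictor for one observation (hypothetical value)
    thresholds -- increasing cut points zeta_1 < ... < zeta_{K-1}
    Uses P(y <= k) = Phi(zeta_k - eta) with a standard normal latent error.
    """
    phi = NormalDist().cdf  # standard normal CDF
    cumulative = [phi(zeta - eta) for zeta in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs


# Hypothetical example: four ordered categories, three thresholds.
probs = ordinal_probit_probs(eta=0.5, thresholds=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

Shifting `eta` and every threshold by the same constant leaves the probabilities unchanged, which is one reason the raw estimates are only meaningful relative to the location and scale the model fixes.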

## “If you’re not using a proper, informative prior, you’re leaving money on the table.”

November 21, 2014
Well put, Rob Weiss. This is not to say that one must always use an informative prior; oftentimes it can make sense to throw away some information for reasons of convenience. But it’s good to remember that, if you do use a noninformative prior, ...

## Three good charts

November 21, 2014
Alberto Cairo, Stephen McDaniel and I were asked about our "favorite" data visualization at the Qlik Conference this week. Stephen wrote up our answers here.

## Free Stanford online course on Statistical Learning (with R) starting on 19 Jan 2015

November 21, 2014
This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and …

## Gelman explains why massive sample sizes to chase after tiny effects is silly

November 21, 2014
What a lucky day: I found time to catch up on some Gelman. He posted about the Facebook research ethics controversy, and I'm glad to see that he and I have pretty much the same attitude (my earlier post is here). It's a storm in a teacup. Gelman makes two other points about the Facebook study, unrelated to the ethics, which are very important. First, he said: if we happen to see…

## Resampling and permutation tests in SAS

November 21, 2014
My colleagues at the SAS & R blog recently posted an example of how to program a permutation test in SAS and R. Their SAS implementation used Base SAS and was "relatively cumbersome" (their words) when compared with the R code. In today's post I implement the permutation test in […]
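
For readers unfamiliar with the technique, a generic two-sample permutation test for a difference in means can be sketched as follows. This is a Python illustration under stated assumptions, not the SAS or R code from either post; the function name `permutation_test`, the data, and the add-one p-value correction are choices made for this example.

```python
import random
from statistics import mean


def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in group means.

    Returns the observed difference mean(x) - mean(y) and an estimated p-value.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = mean(x) - mean(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = mean(pooled[:n_x]) - mean(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical data, invented purely for illustration.
group_a = [10, 11, 12, 13, 14]
group_b = [0, 1, 2, 3, 4]
observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

Under the null hypothesis the group labels are exchangeable, so the p-value is the share of random relabellings whose mean difference is at least as extreme as the observed one.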

## Visualization of probabilistic forecasts

November 21, 2014
This week my research group discussed Adrian Raftery’s recent paper on “Use and Communication of Probabilistic Forecasts” which provides a fascinating but brief survey of some of his work on modelling and communicating uncertain futures. Coincidentally, today I was also sent a copy of David Spiegelhalter’s paper on “Visualizing Uncertainty About the Future”. Both are […]

## A short taxonomy of Bayes factors

November 21, 2014
[Update Oct 2014: Due to some changes to the Bayes factor calculator webpage, and as I understand BFs much better now, this post has been updated …] I started to familiarize myself with Bayesian statistics. In this post I’ll show some insig...

## Erich Lehmann: Statistician and Poet

November 21, 2014
Memory Lane 1 Year (with update): Today is Erich Lehmann’s birthday. The last time I saw him was at the Second Lehmann conference in 2004, at which I organized a session on philosophical foundations of statistics (including David Freedman and D.R. Cox). I got to know Lehmann, Neyman’s first student, in 1997.  One day, I […]

