What good is an old weather forecast?

February 6, 2014
Why would anyone care about what the weather was predicted to be once you know what the weather actually was? Because people make decisions based in part on weather predictions, not just weather. Eric Floehr of ForecastWatch told me that people are starting to realize this and are increasingly interested in his historical prediction data. […]

Two new reviews of Numbersense

February 6, 2014
I've learned one thing about book readers. There is a lag between buying a book and reading it. In fact, I imagine an author has two battles to win: one is at the bookstore (or Amazon) getting you to purchase the book; the next is to get you to pull out the book from your shelves and start reading it. I mean, what am I to say? I own shelves…

Online R and Plotly Graphs: Canadian and U.S. Maps, Old Faithful with Multiple Axes, & Overlaid Histograms

February 6, 2014
Guest post by Matt Sundquist of plot.ly. Plotly is a social graphing and analytics platform. Plotly’s R library lets you make and share publication-quality graphs online. Your work belongs to you, you control privacy and sharing, and public use is free (like GitHub). We are in beta, and would love your feedback, thoughts, and advice. […]

Using MongoHQ to build a Shiny Hit Counter

February 6, 2014
In several previous posts I have posted shiny applications which temporarily store data on shiny servers, such as hit counters or the survey tool which I created. These do not work in the long term since shiny will restart its servers without war...

Small multiples with simple axes

February 6, 2014
Jens M., a long-time reader, submits a good graphic! This small-multiples chart (via Quartz) compares the consumption of liquor from selected countries around the world, showing both the level of consumption and the change over time. What they did right:...

Just a thought on peer reviewing – I can’t help myself.

February 6, 2014
Today I was thinking about reviewing, probably because I was handling a couple of papers as AE and doing tasks associated with reviewing several other papers. I know that this is idle thinking, but suppose peer review was just a …

Risk Measures with Extreme Value Models

February 5, 2014
On Monday, in the MAT8595 course, we saw how to use the Generalized Pareto Distribution to estimate some downside risk measures, given a sample (assumed to be i.i.d.; I will not discuss here the properties of extremes for stochastic processes) with distribution $F$. The cumulative distribution function of the Pareto distribution is here For some threshold , and , we can write From the Pickands–Balkema–de Haan theorem, if is large enough, then Given our…

Prior distribution for a predicted probability

February 5, 2014
I received the following email: I have an interesting thought on a prior for a logistic regression, and would love your input on how to make it “work.” Some of my research, two published papers, are on mathematical models of **. Along those lines, I’m interested in developing more models for **. . . . […] The post Prior distribution for a predicted probability appeared first on Statistical Modeling, Causal Inference,…

February 5, 2014
We are currently selecting the cover design for OTexts books. The first one to go into print will be Forecasting: principles and practice. We have narrowed the choice to the two designs below, although changes are still possible. I thought it would be useful to get some feedback on these designs from readers of this blog (and from people who subscribe to my twitter feed). If you have any comments…

A simple way to find the root of a function of one variable

February 5, 2014
Finding the root (or zero) of a function is an important computational task because it enables you to solve nonlinear equations. I have previously blogged about using Newton's method to find a root for a function of several variables. I have also blogged about how to use the bisection method [...]
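
The bisection method mentioned in this excerpt can be sketched briefly. The following is a minimal Python illustration, not the code from the original post: the function name `bisect_root`, the tolerance, the iteration cap, and the test function are all assumptions chosen for the example. It relies only on the standard idea that a continuous function with opposite signs at the two ends of an interval has a root inside it.

```python
from typing import Callable


def bisect_root(
    f: Callable[[float], float],
    lo: float,
    hi: float,
    tol: float = 1e-10,      # illustrative tolerance on the interval half-width
    max_iter: int = 200,     # illustrative cap on the number of halvings
) -> float:
    """Approximate a root of f in [lo, hi] by repeated interval halving."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")

    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        # Stop when the midpoint is an exact root or the bracket is small enough.
        if f_mid == 0.0 or (hi - lo) / 2.0 < tol:
            return mid
        # Keep the half-interval on which the sign change occurs.
        if f_lo * f_mid < 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


# Hypothetical example: solve the nonlinear equation x^3 - x - 2 = 0 on [1, 2].
root = bisect_root(lambda x: x**3 - x - 2.0, 1.0, 2.0)
```

Bisection is slow compared with Newton's method, but it needs no derivative and cannot leave the starting bracket, which makes it a reasonable default for a function of one variable.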

“Probabilism as an Obstacle to Statistical Fraud-Busting” (draft iii)

February 5, 2014
Update: Feb. 21, 2014 (slides at end): Ever find when you begin to “type” a paper to which you gave an off-the-cuff title months and months ago that you scarcely know just what you meant or feel up to writing a paper with that (provocative) title? But then, pecking away at the outline of a possible […]

Interview for the Capital of Statistics

February 5, 2014
Earo Wang recently interviewed me for the Chinese website Capital of Statistics. The English transcript of the interview is on Earo’s personal website. This is the third interview I’ve done in the last 18 months. The others were for: Data Mining R...

Bayesian First Aid: One Sample and Paired Samples t-test

February 4, 2014
Student’s t-test is a staple of statistical analysis. A quick search on Google Scholar for “t-test” results in 170,000 hits in 2013 alone. In comparison, “Bayesian” gives 130,000 hits while “box plot” results in only 12,500 hits. To be ...

BMHE @ University of Alberta (reds vs blues)

February 4, 2014
When I was a kid, we used to play Subbuteo all the time (in fact, my brother and I had this exact box, featuring Sampdoria on the cover; I thought I would just mention this, since last night we won the Genova derby...). You may think this is totally irrel...

Special discount on Stan! \$999 cheaper than Revolution R!

February 4, 2014
And we’ll throw in RStan and PyStan for free! Details here. The post Special discount on Stan! \$999 cheaper than Revolution R! appeared first on Statistical Modeling, Causal Inference, and Social Science.

February 4, 2014
As always - there's lots of interesting reading out there. Here are my suggestions for this month: Advani, A. and Tymon Słoczyński, 2013. Mostly harmless simulations? On the internal validity of empirical Monte Carlo studies. Discussion Paper...

MailChimp Gmail study as an example of Big Data studies 2/2

February 4, 2014
Previously, I analyzed the data analysis by MailChimp on the impact of Gmail's new tabbing feature, and noted a potential data issue (link). In Part 2, I will look at the MailChimp study as a typical example of "Big Data" studies. The Gmail study has several features that are hallmarks of Big Data. First and foremost, the analyst boasts of a staggering amount of data ("29 billion emails, 4.9 billion…

Widening the goalposts in medical trials

February 4, 2014
Paul Alper writes: I do not believe your blog has ever dealt with the following phenomenon which might be called “(widening) moving the goalposts.” Drug companies and the medical world at large often create powerful drugs and procedures for people who are far (many standard deviations) from the norm (mean) and via randomized clinical trials, […] The post Widening the goalposts in medical trials appeared first on Statistical Modeling, Causal Inference,…

February 4, 2014
Since the end of last week, Julien Tomas has been visiting UQAM for a few weeks as a postdoctoral fellow. He will give a seminar this afternoon on adaptive local likelihood and its application to long-term care insurance (Vraisemblance locale adaptative et application à l’assurance dépendance). We are interested in constructing the survival distribution of dependent individuals who share the same level of severity (heavy dependency). In practice, actuaries often use methods that rely heavily on expert opinion.…

Within Group Index in R

February 4, 2014
There are many occasions in my research when I want to create a within group index for a data frame. For example, with demographic data for siblings one might want to create a birth order index. The below illustrates a simple example of how one can create such an index in R.
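
The R code from the original post is not part of this excerpt. As a rough, language-neutral sketch of the idea, the following Python snippet numbers rows within each group after ordering them by a second field. The function name `within_group_index`, the record layout, and the sibling data are all hypothetical and serve only as illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def within_group_index(
    rows: List[Dict[str, Any]], group_key: str, order_key: str
) -> List[int]:
    """Return a 1-based position of each row within its group.

    Positions follow ascending order_key; the result is aligned with the
    original order of rows. Ties keep their original relative order.
    """
    # Visit row positions sorted by group, then by the ordering field.
    ordered = sorted(
        range(len(rows)),
        key=lambda i: (rows[i][group_key], rows[i][order_key]),
    )
    counters: Dict[Any, int] = defaultdict(int)
    index = [0] * len(rows)
    for i in ordered:
        group = rows[i][group_key]
        counters[group] += 1          # next position within this group
        index[i] = counters[group]
    return index


# Hypothetical sibling data: birth order within each family.
siblings = [
    {"family": "A", "name": "Ann", "birth_year": 1994},
    {"family": "B", "name": "Bo", "birth_year": 1988},
    {"family": "A", "name": "Al", "birth_year": 1990},
    {"family": "B", "name": "Bea", "birth_year": 1991},
    {"family": "A", "name": "Abe", "birth_year": 1999},
]
birth_order = within_group_index(siblings, "family", "birth_year")
# birth_order is [2, 1, 1, 2, 3]
```

The same pattern applies to any grouped data: sort within the group, then count upward from one as each group's rows are visited.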

My Online Course Development Workflow

February 4, 2014
One of the nice things about developing 9 new courses for the JHU Data Science Specialization in a short period of time is that you get to learn all kinds of cool and interesting tools. One of the ways that we …

Peabody here.

February 4, 2014
I saw the trailer for the new Mr. Peabody movie and it looked terrible. They used that weird animation where everything looks round, also the voice had none of the intonations of the “real” Peabody (for some reason, the trailer had the original English voices, maybe they didn’t get their act together to make a […] The post Peabody here. appeared first on Statistical Modeling, Causal Inference, and Social Science.