Another 180 on Piketty’s Measurement

June 14, 2014

My first Piketty Post unabashedly praised Piketty's measurement (if not his theory): "Piketty's book truly shines on the data side. ... Its tables and figures...provide a rich and jaw-dropping image, like a new high-resolution photo of a previously...

RGolf: NGSL Scrabble

June 14, 2014

This is the last part of RGolf before summer. As R excels in visualization, today's task is to generate a plot. We will work with NGSL data - a list of 2801 important vocabulary words for students of English as a second ...

European talks. June-July 2014

June 14, 2014

For the next month I am travelling in Europe and will be giving the following talks. 17 June. Challenges in forecasting peak electricity demand. Energy Forum, Sierre, Valais/Wallis, Switzerland. 20 June. Common functional principal component models for...

Identifying Pathways in the Consumer Decision Journey: Nonnegative Matrix Factorization

June 13, 2014

The Internet has freed us from the shackles of the yellow page directory, the trip to the nearby store to learn what is available, and the forced choice among a limited set of alternatives. The consumer is in control of their purchase journey and can t...

Health Care Costs Gone Wild

June 13, 2014

Yet again I am amazed by the unscrupulous hubris of the US medical industry. On May 12th I went into a local dermatologist's office in MD to have a wart removed. The procedure was exceedingly simple and took no more than 20 minutes. It involved ...

The Oracle (2)

June 13, 2014

The World Cup is now under way, after an arguably fairly lacklustre performance by the host against a tough (if possibly a bit naive) Croatian team, still resulting in a 3-1 win for Brazil. I'll try and comment on our predictions for the first few...

An Annotated Online Bioinformatics / Computational Biology Curriculum

June 13, 2014

Two years ago David Searls published an article in PLoS Comp Bio describing a series of online courses in bioinformatics. Yesterday, the same author published an updated version, "A New Online Computational Biology Curriculum," (PLoS Comput Biol 10(6):...

World Cup pseudo-science

June 13, 2014

Lee Sechrest pointed me to this news article by Vitomir Miles Raguz, “Brazil Won’t Win the World Cup. A European team will win again thanks to training and statistical analysis.” Hmmm . . . “statistical analysis.” This Raguz character better coordinate stories with Nate; it seems that the statistical experts are disagreeing . . . […] The post World Cup pseudo-science appeared first on Statistical Modeling, Causal Inference, and Social…

What I do when I get a new data set as told through tweets

June 13, 2014

Hilary Mason asked a really interesting question yesterday: "Data people: What is the very first thing you do when you get your hands on a new data set?" (Hilary Mason, @hmason, June 12, 2014). You should really consider reading …

June 13, 2014

If history can tell us anything about the World Cup, it’s that the host nation has an advantage over all other teams. Evidence of this was presented last night as the referee in the Brazil-Croatia match unjustly ruled in Brazil’s favour on several occasions. But what is the statistical evidence of a host advantage? […]

Geometry, sensitivity, and parameters of the lognormal distribution

June 13, 2014

Today is my 500th blog post for The DO Loop. I decided to celebrate by doing what I always do: discuss a statistical problem and show how to solve it by writing a program in SAS. Two ways to parameterize the lognormal distribution: I recently blogged about the relationship between […]

trying to speed up Metropolis… and failing!

June 12, 2014

A while ago (but still after Iceland since I used the thorn rune as a math symbol!), I wrote the following post draft as a memo. Now that Marco Banterle, Clara Grazian and I have completed our delayed acceptance paper, it may be of interest to some readers to see how a first attempt proved […]

Stan is Turing Complete. So what?

June 12, 2014

This post is by Bob Carpenter. Stan is Turing complete! There seems to be a persistent misconception that Stan isn’t Turing complete. My guess is that it stems from Stan’s (not coincidental) superficial similarity to BUGS and JAGS, which provide directed graphical model specification languages. Stan’s Turing completeness follows from its support of array data […] The post Stan is Turing Complete. So what? appeared first on Statistical Modeling, Causal…

The Syrian p-value that I didn’t bother to calculate

June 12, 2014

I posted something on the sister blog about the fake vote totals from the Syrian election. We know the numbers are fake from the official report, which reads: Speaker of the People’s Assembly, Mohammad Jihad al-Laham announced Wednesday that Dr. Bashar Hafez al-Assad won the post of the Syrian Arab Republic’s President for a new […] The post The Syrian p-value that I didn’t bother to calculate appeared first on…

Example 2014.6: Comparing medians and the Wilcoxon rank-sum test

June 12, 2014

A colleague recently contacted us with the following question: "My outcome is skewed-- how can I compare medians across multiple categories?" What they were asking for was a generalization of the Wilcoxon rank-sum test (also known as the Mann-Whitney...
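
The usual generalization of the rank-sum test to more than two groups is the Kruskal-Wallis test. The sketch below is not taken from the post: it is a minimal, standard-library illustration of the H statistic, assuming no tie correction and leaving out the p-value. The function name `kruskal_wallis_h` is made up for this example. Note that the test compares rank distributions and is a test of medians only under extra assumptions about the shape of the groups.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples.

    Hypothetical helper for illustration; `groups` is a list of lists of numbers.
    """
    # Pool all observations, remembering which group each came from.
    pooled = sorted(
        (value, index) for index, values in enumerate(groups) for value in values
    )
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)

    # Assign ranks 1..N, giving tied values the average of their ranks.
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += average_rank
        i = j

    # H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1)
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


# Three fully separated groups give a large H; interleaved groups give a small one.
h_separated = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_interleaved = kruskal_wallis_h([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
print(h_separated, h_interleaved)
```

Under the null hypothesis H is approximately chi-squared with (number of groups - 1) degrees of freedom, which is how a p-value would normally be obtained.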

The First Nuclear Bomb, Bayesian Statistics, and the Progress of Science

June 12, 2014

The yield from the first atomic bomb was classified, and might have remained so if some knucklehead hadn’t published a series of photos complete with scales and time stamps in Life magazine. This was enough to allow the physicist G. I. Taylor to ...

The Poisson Transform for Unnormalised Statistical Models

June 12, 2014

Nicolas Chopin has just arxived our manuscript on inference for unnormalised statistical models. An unnormalised statistical model is one whose likelihood function can be written $L(\theta) = f(y; \theta) / Z(\theta)$, where $f(y; \theta)$ is easy to compute but the normalisation constant $Z(\theta)$ is hard. A lot of common models fall into that category, for example Ising models or restricted Boltzmann machines. Not having the normalisation […]

Mathematics and Applied Statistics Lesson of the Day – The Harmonic Mean

The harmonic mean, H, for $n$ positive real numbers $x_1, \dots, x_n$ is defined as $H = n / \sum_{i=1}^{n} (1/x_i)$. This type of mean is useful for measuring the average of rates. For example, consider a car travelling for 240 kilometres at 2 different speeds: 60 km/hr for 120 km, then 40 km/hr for another 120 km. Then its average speed for this trip […]
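
The car example can be checked numerically. This is a minimal sketch rather than code from the original post; the function name `harmonic_mean` and the variable names are chosen for illustration. It computes the average speed two ways: as total distance over total time, and as the harmonic mean of the two speeds (the number of values divided by the sum of their reciprocals).

```python
def harmonic_mean(values):
    """Harmonic mean of positive numbers: n divided by the sum of reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs at least one value, all positive")
    return len(values) / sum(1.0 / v for v in values)


speeds_kmh = [60.0, 40.0]  # speed on each leg
leg_km = 120.0             # both legs cover the same distance

# Direct computation: total distance divided by total time.
total_time_h = sum(leg_km / speed for speed in speeds_kmh)  # 2 h + 3 h
average_speed = len(speeds_kmh) * leg_km / total_time_h

print(average_speed)              # average speed from distance and time
print(harmonic_mean(speeds_kmh))  # same value via the harmonic mean
```

Both computations give 48 km/hr, not the arithmetic mean of 50 km/hr, because the car spends more time (3 hours versus 2) at the slower speed. The shortcut works here because the two legs have equal length.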

The Most Comprehensive Review of Comic Books Teaching Statistics

June 12, 2014

As I’m more or less an autodidact when it comes to statistics, I have a weak spot for books that try to introduce statistics in an accessible and pedagogical way. I have therefore collected what I believe are all books that introduce statistics us...

If you have a 45% chance of winning, is it “yours to lose”?

June 11, 2014

Nate Silver gives Brazil a 45% chance of winning the World Cup, with only Argentina and Germany having more than a 10% chance. My gut feeling is that that’s a bit high, but I’m no expert. What I find striking, though, is that the headline says it’s “Brazil’s to lose.” Huh? If we take Silver’s […] The post If you have a 45% chance of winning, is it “yours to…

Superfast Metrop using data partitioning, from Marco Banterle, Clara Grazian, and Christian Robert

June 11, 2014

Superfast not because of faster convergence but because they use a clever acceptance/rejection trick so that most of the time they don’t have to evaluate the entire target density. It’s written in terms of single-step Metropolis but I think it should be possible to do it in HMC or Nuts, in which case we could […] The post Superfast Metrop using data partitioning, from Marco Banterle, Clara Grazian, and Christian…
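
The acceptance/rejection trick can be sketched for a single random-walk Metropolis step. This is a toy illustration, not the authors' implementation: it assumes the log target splits into a sum of factors, and the names `delayed_acceptance_step`, `run_chain` and `toy_factors` are made up for this example. Each factor gets its own accept/reject test, so a proposal rejected on an early (ideally cheap) factor never triggers evaluation of the remaining ones.

```python
import math
import random


def delayed_acceptance_step(x, log_factors, step_size, rng):
    """One random-walk Metropolis step with factor-by-factor acceptance.

    `log_factors` are functions whose sum is the log target density up to a
    constant. A proposal is accepted only if it passes every factor's test.
    """
    proposal = x + rng.gauss(0.0, step_size)  # symmetric proposal
    for log_factor in log_factors:
        log_ratio = log_factor(proposal) - log_factor(x)
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
        if math.log(u) > log_ratio:
            return x  # early rejection: later factors are never evaluated
    return proposal


def run_chain(n_steps, log_factors, step_size=1.0, seed=0):
    """Run the chain from x = 0 and return all visited states."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        x = delayed_acceptance_step(x, log_factors, step_size, rng)
        samples.append(x)
    return samples


# Toy target: standard normal, log density -x^2 / 2, split into two equal factors.
toy_factors = [lambda x: -0.25 * x * x, lambda x: -0.25 * x * x]

samples = run_chain(20000, toy_factors)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)  # should be close to 0 and 1
```

In a realistic setting the first factor would be cheap (for example a prior or a small subset of the data) and the later ones expensive. The overall acceptance rate is lower than in plain Metropolis, which is the price paid for skipping most of the expensive evaluations.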

June 11, 2014

The majority of the blog-related comments and requests for help that I receive come from one person, called "Anonymous". (S)he seems to have very broad interests. Here's a very recent request for help relating to ARDL models - something that I...

Do You Use P-Values and Confidence Intervals?

June 11, 2014

Unless your econometrics training has been true-blue Bayesian in nature, you'll have reported a lot of p-values, and constructed heaps of confidence intervals in your time. Both of these concepts have been the centre of widespread controversy in the sta...