Index or indicator variables

April 19, 2014

Someone who doesn’t want his name shared (for the perhaps reasonable reason that he’ll “one day not be confused, and would rather my confusion not live on online forever”) writes: I’m exploring HLMs and stan, using your book with Jennifer Hill as my field guide to this new territory. I think I have a generally […] The post Index or indicator variables appeared first on Statistical Modeling, Causal Inference, and Social…

Read more »

Old tails: a crude power law fit on ebook sales

April 18, 2014

We use R to take a very brief look at the distribution of e-book sales on Amazon.com. Recently Hugh Howey shared some eBook sales data spidered from Amazon.com: The 50k Report. The data is largely a single scrape of statistics about various anonymized books. Howey’s analysis tries to break sales down by declared category and […]

Read more »

Welcome to Econometrics Students in China

April 18, 2014

One of my students mentioned to me yesterday that there was quite a bit of action on Weibo (the Chinese equivalent to Twitter) relating to posts on this blog - especially those posts relating to MCMC methods in econometrics. That's just great - thanks ...

Read more »

My talks @ Universitat de Girona

April 18, 2014

Just after Easter, I'll go for a very quick trip to lovely Girona, where Marc Saez has invited me to give two talks. The first one will be a re-run of the short course on INLA that I did at Bayes Pharma last year. It's scheduled (and prepared) as a 3-ho...

Read more »

Date formatting in R

April 18, 2014

As I often manipulate time series from different sources, I rarely come across the same date format twice. Having to reformat the dates every time is a real waste of time because I never remember the syntax of the as.Date function. I put below a few examples that turn strings into standard R date format. […]
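As a rough illustration of the kind of conversion the excerpt describes, here is a minimal sketch. It is written in Python rather than R, so it is not the post's own code; the sample strings and the helper name `parse_date` are assumptions made for illustration. The format codes (`%d`, `%m`, `%Y`, `%b`, `%B`, `%y`) are the C-style codes that R's `as.Date` also accepts, so a call such as `as.Date("19/04/2014", format = "%d/%m/%Y")` follows the same pattern.

```python
from datetime import date, datetime


def parse_date(text: str, fmt: str) -> date:
    """Turn a date string into a standard date object using a strptime format."""
    return datetime.strptime(text, fmt).date()


# Hypothetical sample strings, each paired with the format that describes it.
# Month names (%b, %B) are matched in the current locale (English by default).
examples = [
    ("19/04/2014", "%d/%m/%Y"),        # day/month/four-digit year
    ("2014-04-19", "%Y-%m-%d"),        # ISO 8601
    ("April 19, 2014", "%B %d, %Y"),   # full month name
    ("19Apr14", "%d%b%y"),             # abbreviated month, two-digit year
]

for text, fmt in examples:
    print(f"{text!r} with {fmt!r} -> {parse_date(text, fmt).isoformat()}")
```

Keeping the string and its format side by side in one table makes it easy to add a new source format without having to recall the syntax each time.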

Read more »

One-tailed or two-tailed?

April 18, 2014

Someone writes: Suppose I have two groups of people, A and B, which differ on some characteristic of interest to me; and for each person I measure a single real-valued quantity X. I have a theory that group A has a higher mean value of X than group B. I test this theory by using […] The post One-tailed or two-tailed?…

Read more »

More from xkcd

April 18, 2014

Here's another from xkcd.com, on our "good graphics" theme.

Read more »

Les Arbres de Classification

April 18, 2014

I will lead a training session on Monday the 28th, from 14:00 to 16:00 in room N-6320 at UQAM, on the topic of an introduction to classification trees. This session is organized as part of the seminars on quantitative and qualitative analysis methods, which have been held regularly for a little over a month and are run by the Collectif pour le développement et les applications en mesure et évaluation (Cdame). …

Read more »

An overused chart, why it fails, and how to fix it

April 17, 2014

Reader and tipster Chris P. found this "death spiral" chart dizzying (link). It's one of those charts that has conceptual appeal but does not do the data justice. As the name implies, the designer has a strong message, that the...

Read more »

Correlation does not imply causation (parental involvement edition)

April 17, 2014

The New York Times recently published an article on education titled "Parental Involvement Is Overrated". Most research in this area supports the opposite view, but the authors claim that "evidence from our research suggests otherwise". Before you stop helping your children …

Read more »

If you get to the point of asking, just do it. But some difficulties do arise . . .

April 17, 2014

Nelson Villoria writes: I find the multilevel approach very useful for a problem I am dealing with, and I was wondering whether you could point me to some references about poolability tests for multilevel models. I am working with time series of cross sectional data and I want to test whether the data supports cross […] The post If you get…

Read more »

How Valuable is a #1 Ranking for Analytics Software? Not as Much as You Might Think!

April 17, 2014

In my never-ending quest to study the Popularity of Data Analysis Software, I recently read the 2013 Edition of the Wisdom of Crowds Business Intelligence Market Study by Dresner Advisory Services, LLC. In it, I found the table below which …

Read more »

Data Stories Episode About Data Storytelling

April 17, 2014

How is it possible that it has taken a podcast called Data Stories 35 episodes to get to the topic of data storytelling? Alberto Cairo and I helped get the topic straightened out, and I think we even convinced Moritz that stories are not the enemy of exploration. It was a fun episode to record, and it touches on many…

Read more »

How Fast the Fastest Human Would Run 100m?

April 17, 2014

Ethan Siegel wrote a post entitled The Math of the Fastest Human Alive five years ago, using regressions. An alternative is to use extreme value models (a few years ago I wrote a post on the maximum length of a tennis match using extreme value theory). In 2009, John Einmahl and Sander Smeets wrote a great article entitled…

Read more »

Bitsanity

April 16, 2014

The awesome folks at Quandl (an amazing data collection and distribution service) have been so kind as to allow me to write for their blog. In my first post for them I demonstrate (with detailed R code) how a user of their free data services co...

Read more »

The horrible confusion between different entropies explained in a way that answers: Where do likelihoods and priors come from?

April 16, 2014

Here I derive a simple formula for probability distributions general enough for Statistical Mechanics and Classical Statistics in which the roles, meanings, and interpretations between the Information Entropy and Boltzmann’s Entropy are as clear ...

Read more »

An Exercise With the SURE Model

April 16, 2014

Here's an exercise that I sometimes set for students if we're studying the Seemingly Unrelated Regression equations (SURE) model. In fact, I used it as part of a question in the final examination that my grad. students sat last week. Suppose that we hav...

Read more »

Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials

April 16, 2014

Prakash Nayak writes: I work as a musculoskeletal oncologist (surgeon) in Mumbai, India and am keen on sarcoma research. Sarcomas are rare disorders, and conventional frequentist analysis falls short of providing meaningful results for clinical application. I am thus keen on applying Bayesian analysis to a lot of trials performed with small numbers in this […] The post Looking for Bayesian…

Read more »

The reality is most A/B tests fail, and Facebook is here to help

April 16, 2014

Two years ago, Wired breathlessly extolled the virtues of A/B testing (link). A lot of Web companies are at the forefront of running hundreds or thousands of tests daily. The reality is that most A/B tests fail. A/B tests fail for many reasons. Typically, business leaders consider a test to have failed when the analysis fails to support their hypothesis.…

Read more »

Errors on percentage errors

April 16, 2014

The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as $\text{MAPE} = 100\,\text{mean}(|y_t - \hat{y}_t| / |y_t|)$, where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t$. Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE saying that “it has a bias favoring estimates that…
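To make the definition concrete, here is a minimal Python sketch of the MAPE as 100 times the mean of the absolute error divided by the absolute observation. The function name `mape` and the sample numbers are assumptions chosen for illustration, not taken from the post. The two calls show one simple way an asymmetry can arise: when forecasts cannot be negative, an under-forecast of a positive observation can cost at most 100%, while an over-forecast has no upper limit.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes equal-length sequences and no zero observations.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an observation is zero")
    ratios = [abs(y - f) / abs(y) for y, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical single observation of 100.
worst_under = mape([100], [0])    # the lowest possible non-negative forecast
large_over = mape([100], [300])   # an over-forecast can be arbitrarily large

print(f"forecast 0   -> MAPE {worst_under:.1f}%")
print(f"forecast 300 -> MAPE {large_over:.1f}%")
```

The explicit check for zero observations is a design choice for this sketch: the ratio is undefined there, so failing loudly seems safer than returning infinity.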

Read more »

The Granville incident

April 16, 2014

Earlier this morning, there was some commotion on the allstat mailing list (if you don't know what it is, that's a UK-based discussion list specifically focussed on statistics; it's been active for quite some time and usually you get useful information...

Read more »

On the determinant of the Hilbert matrix

April 16, 2014

Last week I described the Hilbert matrix of size n, which is a famous square matrix in numerical linear algebra. It is famous partially because its inverse and its determinant have explicit formulas (that is, we know them exactly), but mainly because the matrix is ill-conditioned for moderate values of […]
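The excerpt does not show the formulas, so the following Python sketch relies on the standard textbook statements rather than the post's own code: the Hilbert matrix has entries 1/(i + j - 1) for 1-based i and j, and its determinant equals c_n^4 / c_{2n}, where c_n is the product of i! for i = 1, ..., n - 1. The function names are assumptions. Exact rational arithmetic is used so that the elimination result can be compared with the closed form without rounding error.

```python
from fractions import Fraction
from math import factorial


def hilbert(n):
    """n x n Hilbert matrix with exact rational entries (0-based indices)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def det(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as the pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def hilbert_det_formula(n):
    """Closed form c_n**4 / c_(2n), with c_k = 1! * 2! * ... * (k-1)!."""
    def c(k):
        product = 1
        for i in range(1, k):
            product *= factorial(i)
        return product
    return Fraction(c(n) ** 4, c(2 * n))


for n in range(1, 7):
    exact = det(hilbert(n))
    assert exact == hilbert_det_formula(n)
    print(n, exact)
```

The determinants shrink very quickly as n grows, which is one symptom of the ill-conditioning the excerpt mentions; repeating the elimination in floating point instead of `Fraction` would be a natural follow-up experiment.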

Read more »

A. Spanos: Jerzy Neyman and his Enduring Legacy

April 16, 2014

A Statistical Model as a Chance Mechanism, by Aris Spanos. Jerzy Neyman (April 16, 1894 – August 5, 1981) was a Polish/American statistician who spent most of his professional career at the University of California, Berkeley. Neyman is best known in statistics for his pioneering contributions in framing the Neyman-Pearson (N-P) optimal theory of hypothesis testing […]

Read more »

