Suppose you’re drawing random samples uniformly from some interval. How likely are you to see a new value outside the range of values you’ve already seen? The problem is more interesting when the interval is unknown. You may be trying…
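The question in this excerpt can be checked with a quick simulation. The sketch below is an illustration, not the original post's method; the function name `prob_outside_range` and its parameters are hypothetical. It relies on a standard exchangeability argument: for i.i.d. draws from any continuous distribution, the next draw is equally likely to hold any rank among the n + 1 values, so it falls outside the range of the first n with probability 2/(n + 1), whatever the interval's endpoints are.

```python
import random


def prob_outside_range(n, trials=50_000, seed=0):
    """Estimate P(new draw lies outside the min/max of n earlier draws).

    Draws come from Uniform(0, 1); by the rank argument the answer
    does not depend on the interval's endpoints.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = [rng.random() for _ in range(n)]
        new = rng.random()
        if new < min(seen) or new > max(seen):
            hits += 1
    return hits / trials


# Compare the simulated frequency with the closed form 2 / (n + 1).
for n in (2, 5, 10):
    print(n, prob_outside_range(n), 2 / (n + 1))
```

The simulated frequencies should sit close to 2/(n + 1), for example about 1/3 after five samples.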

Fractional factorial designs use the notation; unfortunately, this notation is not clearly explained in most textbooks or web sites about experimental design. I hope that my explanation below is useful. is the number of levels in each factor; note that the notation assumes that all factors have the same number of levels. If a factor has […]

What is meant by regression modeling? Linear Regression is one of the most common statistical modeling techniques. It is very powerful, important, and (at first glance) easy to teach. However, because it is such a broad topic it can be a minefield for teaching and discussion. It is common for angry experts to accuse writers […]

There are several other blogs on forecasting that readers might be interested in. Here are seven worth following: No Hesitations by Francis Diebold (Professor of Economics, University of Pennsylvania). Diebold needs no introduction to forecasters. He primarily covers forecasting in economics and finance, but also xkcd cartoons, graphics, research issues, etc. Econometrics Beat by Dave Giles. Dave is a professor of economics at the University of Victoria (Canada), formerly from my own…

Four years ago, many of us were glued to the “spill cam” showing, in real time, the gushing oil from the April 20, 2010 explosion sinking the Deepwater Horizon oil rig in the Gulf of Mexico, killing 11, and spewing oil until July 15 (see video clip that was added below). Remember junk shots, top kill, blowout preventers? [1] The EPA has […]

Here the monotonicity of the EM algorithm is established. $$ f_{o}(Y_{o}|\theta)=f_{o,m}(Y_{o},Y_{m}|\theta)/f_{m|o}(Y_{m}|Y_{o},\theta)$$ $$ \log L_{o}(\theta)=\log L_{o,m}(\theta)-\log f_{m|o}(Y_{m}|Y_{o},\theta) \label{eq:loglikelihood} $$ where \( L_{o}(\theta)\) is the likelihood under the observed data and \(L_{o,m}(\theta)\) is the likelihood under the complete data. Taking the expectation of the second line with respect to the conditional distribution of \(Y_{m}\) given \(Y_{o}\) and […] The post Monotonicity of EM Algorithm Proof appeared first on Lindons Log.
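The excerpt stops partway through the argument. What follows is the standard way the proof is usually completed, offered as a sketch and not necessarily the exact steps of the original post. The symbols \(Q\), \(H\), and the iterate \(\theta^{(t)}\) are notation assumed here, not taken from the excerpt. Taking the expectation of the log-likelihood identity with respect to \(f_{m|o}(Y_{m}|Y_{o},\theta^{(t)})\) gives

$$ \log L_{o}(\theta)=Q(\theta|\theta^{(t)})-H(\theta|\theta^{(t)}) $$

where

$$ Q(\theta|\theta^{(t)})=E\left[\log L_{o,m}(\theta)\,|\,Y_{o},\theta^{(t)}\right], \qquad H(\theta|\theta^{(t)})=E\left[\log f_{m|o}(Y_{m}|Y_{o},\theta)\,|\,Y_{o},\theta^{(t)}\right] $$

The left-hand side is unchanged because it does not depend on \(Y_{m}\). Evaluating at \(\theta^{(t+1)}\) and \(\theta^{(t)}\) and subtracting:

$$ \log L_{o}(\theta^{(t+1)})-\log L_{o}(\theta^{(t)})=\left[Q(\theta^{(t+1)}|\theta^{(t)})-Q(\theta^{(t)}|\theta^{(t)})\right]+\left[H(\theta^{(t)}|\theta^{(t)})-H(\theta^{(t+1)}|\theta^{(t)})\right] $$

The first bracket is nonnegative because the M-step chooses \(\theta^{(t+1)}\) to maximize \(Q(\cdot|\theta^{(t)})\). The second bracket is nonnegative by Jensen's inequality, since it is a Kullback-Leibler divergence between the conditional densities at \(\theta^{(t)}\) and \(\theta^{(t+1)}\). Hence \(\log L_{o}(\theta^{(t+1)})\geq\log L_{o}(\theta^{(t)})\).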

The “sampling from an infinite population” metaphor beloved by statisticians of all types is a disaster for reproducible science. To explain why, I’ll show what sampling from a finite population has going for it that’s not there ...

We use R to take a very brief look at the distribution of e-book sales on Amazon.com. Recently Hugh Howey shared some eBook sales data spidered from Amazon.com: The 50k Report. The data is largely a single scrape of statistics about various anonymized books. Howey’s analysis tries to break sales down by declared category and […]