Blog Archives

Chances of being picked twice for drug testing

June 18, 2014

Suppose in a company of N employees, m are chosen randomly for drug screening [1]. In two independent screenings, what is the probability that someone will be picked both times? It may be unlikely that any given individual will be picked twice, while being very likely that someone will be picked twice. Imagine m employees […]
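Here is a minimal sketch of the calculation the question calls for. It assumes each screening picks m distinct employees uniformly at random, without replacement, and that the two screenings are independent. The function names are made up for illustration, and this is not necessarily how the full post works the problem.

```python
from math import comb


def prob_given_person_twice(N: int, m: int) -> float:
    """Chance that one particular employee is picked in both screenings."""
    # Each screening includes a given employee with probability m/N,
    # and the screenings are assumed independent.
    return (m / N) ** 2


def prob_someone_twice(N: int, m: int) -> float:
    """Chance that at least one employee is picked in both screenings."""
    if not 0 < m <= N:
        raise ValueError("need 0 < m <= N")
    # The second screening avoids everyone from the first only if all m
    # of its picks come from the N - m employees not chosen the first time.
    # comb returns 0 when m > N - m, so an overlap is then certain.
    p_no_overlap = comb(N - m, m) / comb(N, m)
    return 1 - p_no_overlap


if __name__ == "__main__":
    N, m = 100, 10  # hypothetical company size and screening size
    print(f"given person twice: {prob_given_person_twice(N, m):.4f}")
    print(f"someone twice:      {prob_someone_twice(N, m):.4f}")
```

With the hypothetical values N = 100 and m = 10, a given person has a 1% chance of being picked twice, while the chance that someone is picked twice is about 67%.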

Normal approximation details

May 29, 2014

The normal distribution can approximate many other distributions, though the details, such as quantitative error estimates and which factors improve or degrade the approximation, are harder to find. Here are some notes on normal approximations to several ...

Robust in one sense, sensitive in another

May 14, 2014

When you sort data and look at which sample falls in a particular position, that’s called order statistics. For example, you might want to know the smallest, largest, or middle value. Order statistics are robust in a sense. The median of a sample, for example, is a very robust measure of central tendency. If Bill […]

Distribution of a range

April 23, 2014

Suppose you’re drawing random samples uniformly from some interval. How likely are you to see a new value outside the range of values you’ve already seen? The problem is more interesting when the interval is unknown. You may be trying to estimate the end points of the interval by taking the max and min of […]
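As a rough illustration of the first question: if n values have been drawn independently and uniformly from an interval, the next draw is equally likely to land in any of the n + 1 rank positions, so it falls outside the observed range with probability 2/(n + 1), whatever the end points are. The sketch below checks that by simulation. The function names and default parameters are assumptions for this example, not taken from the full post.

```python
import random


def prob_new_value_outside(n: int) -> float:
    """Exact chance that draw n + 1 falls below the min or above the max of n draws."""
    return 2 / (n + 1)


def simulate_outside(n: int, a: float = 0.0, b: float = 1.0,
                     trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability on the interval [a, b]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = [rng.uniform(a, b) for _ in range(n)]
        x = rng.uniform(a, b)
        # Count the trial if the new value extends the observed range.
        if x < min(seen) or x > max(seen):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n = 9  # hypothetical number of values already seen
    print(prob_new_value_outside(n), simulate_outside(n))
```

Changing a and b should leave the estimate essentially unchanged, which is why the question still has a clean answer when the interval is unknown.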

Timid medical research

April 15, 2014

Cancer research is sometimes criticized for being timid. Drug companies run enormous trials looking for small improvements. Critics say they should run smaller trials and more of them. Which side is correct depends on what’s out there waiting to be discovered, which of course we don’t know. We can only guess. Timid research is rational […]

The mean of the mean is the mean

April 9, 2014

There’s a theorem in statistics that says E(X̄) = μ. You could read this aloud as “the mean of the mean is the mean.” More explicitly, it says that the expected value of the average of some number of samples from some distribution is equal to the expected value of the distribution itself. The shorter reading is confusing […]
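A quick simulation can make the explicit reading concrete. The sketch below draws many samples of size n, averages each one, and then averages those averages. The exponential distribution with mean 2 is an arbitrary choice for illustration, and the function name is hypothetical.

```python
import random
from statistics import fmean


def average_of_sample_means(n: int, reps: int = 20_000,
                            mean: float = 2.0, seed: int = 0) -> float:
    """Average the sample mean over many samples of size n."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(reps):
        # expovariate takes the rate, which is 1 / mean.
        sample = [rng.expovariate(1 / mean) for _ in range(n)]
        sample_means.append(fmean(sample))
    return fmean(sample_means)


if __name__ == "__main__":
    # Should land close to the distribution's own mean, 2.0.
    print(average_of_sample_means(n=5))
```

The result should stay near 2 for any sample size n. Only the spread of the individual sample means shrinks as n grows.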

On replacing calculus with statistics

March 7, 2014

Russ Roberts had this to say about the proposal to replace the calculus requirement with statistics for students: Statistics is in many ways much more useful for most students than calculus. The problem is, to teach it well is extraordinarily difficult. It’s very easy to teach a horrible statistics class where you spit back the […]

Nomenclatural abomination

March 4, 2014

David Hogg calls conventional statistical notation a “nomenclatural abomination”: The terminology used throughout this document enormously overloads the symbol p(). That is, we are using, in each line of this discussion, the function p() to mean something different; its meaning is set by the letters used in its arguments. That is a nomenclatural abomination. I […]

What good is an old weather forecast?

February 6, 2014

Why would anyone care about what the weather was predicted to be once you know what the weather actually was? Because people make decisions based in part on weather predictions, not just weather. Eric Floehr of ForecastWatch told me that people are starting to realize this and are increasingly interested in his historical prediction data. […]

Heterogeneous data

January 9, 2014

I have a quibble with the following paragraph from Introducing Windows Azure for IT Professionals: The problem with big data is that it’s difficult to analyze it when the data is stored in many different ways. How do you analyze data that is distributed across relational database management systems (RDBMS), XML flat-file databases, text-based log […]

