# Posts Tagged ‘mathematics’

## Principal Components Regression, Pt. 3: Picking the Number of Components

May 30, 2016

In our previous note we demonstrated Y-Aware PCA and other y-aware approaches to dimensionality reduction in a predictive modeling context, specifically Principal Components Regression (PCR). For our examples, we selected the appropriate number of principal components by eye. In this note, we will look at ways to select the appropriate number of principal components in …

## A bit on the F1 score floor

April 2, 2016

At the Strata+Hadoop World “R Day” tutorial (Tuesday, March 29, 2016, San Jose, California) we spent some time on classifier measures derived from the so-called “confusion matrix.” We repeated our usual admonition not to use “accuracy itself” as a project quality goal (business people tend to ask for it as it is the word they are …

## Finding the K in K-means by Parametric Bootstrap

February 9, 2016

One of the trickier tasks in clustering is determining the appropriate number of clusters. Domain-specific knowledge is always best, when you have it, but there are a number of heuristics for getting at the likely number of clusters in your data. We cover a few of them in Chapter 8 (available as a free sample …

## Sequential Analysis

December 11, 2015

We here at Win-Vector LLC have been working through an ad-hoc series about A/B testing, combining elements of both the operations research and statistical points of view: A dynamic programming solution to A/B test design; Why does designing a simple A/B test seem so complicated?; A clear picture of power and significance in A/B tests; Bandit Formulations …

## Odds and Probability: Commonly Misused Terms in Statistics – An Illustrative Example in Baseball

Yesterday, all 15 home teams in Major League Baseball won on the same day – the first such occurrence in history.  CTV News published an article written by Mike Fitzpatrick from The Associated Press that reported on this event.  The article states, “Viewing every game as a 50-50 proposition independent of all others, STATS figured the […]
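To make the distinction in the title concrete, here is a minimal sketch that takes only the assumption quoted above (every game is an independent 50-50 proposition) and computes both the probability of all 15 home teams winning and the corresponding odds against it. The variable names are illustrative, not taken from the article.

```python
from fractions import Fraction

n_games = 15                  # number of home teams, from the excerpt
p_home_win = Fraction(1, 2)   # assumption quoted in the article: each game is 50-50

# Probability that all 15 independent games go to the home team.
p_all = p_home_win ** n_games

# Odds against the event: P(not event) / P(event).
odds_against = (1 - p_all) / p_all

print(p_all)         # 1/32768
print(odds_against)  # 32767
```

Under that assumption the probability is 1 in 32,768, while the odds against are 32,767 to 1. The two numbers are close here but they are not the same quantity, and the gap between them grows as the event becomes more likely.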

## Mathematical Statistics Lesson of the Day – Basu’s Theorem

Today’s Statistics Lesson of the Day will discuss Basu’s theorem, which connects the previously discussed concepts of minimally sufficient statistics, complete statistics and ancillary statistics. As before, I will begin with the following set-up. Suppose that you collected data in order to estimate a parameter $\theta$. Let $f_\theta(x)$ be the probability density function (PDF) or probability […]

## A dynamic programming solution to A/B test design

July 6, 2015

Our last article on A/B testing described the scope of the realistic circumstances of A/B testing in practice and gave links to different standard solutions. In this article we will take an idealized specific situation, allowing us to show a particularly beautiful solution to one very special type of A/B test. For this article …

## Mathematical Statistics Lesson of the Day – An Example of An Ancillary Statistic

Consider 2 independent random variables, $X_1$ and $X_2$, from the normal distribution $\text{Normal}(\mu, \sigma^2)$, where $\mu$ is unknown. Then the statistic $D = X_1 - X_2$ has the distribution $\text{Normal}(0, 2\sigma^2)$. The distribution of $D$ does not depend on $\mu$, so $D$ is an ancillary statistic for $\mu$. Note that, if $\sigma^2$ is unknown, then $D$ is not ancillary for $\sigma^2$.
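The claim can be checked numerically. The following is a rough simulation sketch, not part of the original lesson: it assumes the two variables are independent draws, labels them `x1` and `x2`, takes their difference as the statistic, and uses a hypothetical helper `simulate_difference`. If the difference is ancillary for the mean, its sample mean should stay near 0 and its sample variance near twice the variance of a single draw, whatever mean is used.

```python
import random
import statistics

def simulate_difference(mu, sigma, n_draws=20000, seed=0):
    """Draw n_draws independent pairs from Normal(mu, sigma^2) and return x1 - x2."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) - rng.gauss(mu, sigma) for _ in range(n_draws)]

sigma = 1.0
for mu in (-5.0, 0.0, 100.0):
    d = simulate_difference(mu, sigma)
    # The mean should be close to 0 and the variance close to 2 * sigma**2
    # for every value of mu.
    print(mu, round(statistics.fmean(d), 3), round(statistics.variance(d), 3))
```

Changing `sigma` instead of `mu` does change the spread of the differences, which is the numerical counterpart of the final remark: the same statistic is not ancillary for the variance.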