# Posts Tagged ‘Tutorials’

October 30, 2014

Continuing our series of reading out loud from a single page of a statistics book, we look at page 224 of the 1972 Dover edition of Leonard J. Savage’s “The Foundations of Statistics.” On this page we are treated to an example attributed to Leo A. Goodman in 1953 that illustrates how for normally distributed […] Related posts: Automatic bias correction doesn’t fix omitted variable bias; Reading the Gauss-Markov theorem…

## Calculating the sum or mean of a numeric (continuous) variable by a group (categorical) variable in SAS

Introduction: A common task in data analysis and statistics is to calculate the sum or mean of a continuous variable. If that variable can be categorized into 2 or more classes, you may want to get the sum or mean for each class. This sounds like a simple task, yet I took a surprisingly long time […]
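As a rough illustration of the task described in this excerpt, here is a minimal Python sketch (the post itself works in SAS, so this is not the author's method). The function name `sum_and_mean_by_group` and the sample data are hypothetical, chosen only for the example.

```python
from collections import defaultdict


def sum_and_mean_by_group(rows):
    """Return {group: (sum, mean)} for an iterable of (group, value) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in rows:
        totals[group] += value
        counts[group] += 1
    # Every group in totals has at least one observation, so no division by zero.
    return {group: (totals[group], totals[group] / counts[group]) for group in totals}


# Hypothetical data: (class label, continuous measurement)
data = [("a", 1.0), ("a", 3.0), ("b", 10.0)]
result = sum_and_mean_by_group(data)
```

Here `result` maps each class to a `(sum, mean)` tuple, one entry per distinct class label.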

August 26, 2014

What is the Gauss-Markov theorem? From “The Cambridge Dictionary of Statistics,” B. S. Everitt, 2nd Edition: A theorem that proves that if the error terms in a multiple regression have the same variance and are uncorrelated, then the estimators of the parameters in the model produced by least squares estimation are better (in the sense […] Related posts: What is meant by regression modeling?; Skimming statistics papers for the ideas…

## The Chi-Squared Test of Independence – An Example in Both R and SAS

Introduction: The chi-squared test of independence is one of the most basic and common hypothesis tests in the statistical analysis of categorical data. Given 2 categorical random variables, the chi-squared test of independence determines whether or not there exists a statistical dependence between them. Formally, it is a hypothesis test with the following null and […]
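To make the test concrete, here is a minimal Python sketch of the standard Pearson chi-squared statistic for a contingency table (the post itself uses R and SAS). The function name and the example table are hypothetical, and the sketch assumes every row and column total is positive so that no expected count is zero.

```python
def chi_squared_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical 2x2 table of observed counts
observed = [[10, 20], [20, 10]]
stat, dof = chi_squared_statistic(observed)
```

The statistic would then be compared against a chi-squared distribution with `dof` degrees of freedom to obtain a p-value, a step this sketch leaves out.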

## Do your "data janitor work" like a boss with dplyr

August 20, 2014

Data “janitor-work”: The New York Times recently ran a piece on wrangling and cleaning data: “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights.” Whether you call it “janitor-work,” wrangling/munging, cleaning/cleansing/scru...

## Video Tutorial – Calculating Expected Counts in a Contingency Table Using Joint Probabilities

In an earlier video, I showed how to calculate expected counts in a contingency table using marginal proportions and totals. (Recall that expected counts are needed to conduct hypothesis tests of independence between categorical random variables.) Today, I want to share a second video on calculating expected counts, this time using joint probabilities. This method uses […]
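The excerpt is cut off before the method is spelled out, so the following Python sketch rests on an assumption: that "using joint probabilities" means multiplying the grand total by the joint probability each cell would have under independence, which is the product of its row and column proportions. The function name and the example table are hypothetical.

```python
def expected_counts_from_joint_probabilities(table):
    """Return expected counts under independence for a table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_props = [sum(row) / n for row in table]        # marginal proportion of each row
    col_props = [sum(col) / n for col in zip(*table)]  # marginal proportion of each column
    # Under independence, P(row i and column j) = P(row i) * P(column j),
    # so the expected count is n times that joint probability.
    return [[n * rp * cp for cp in col_props] for rp in row_props]


# Hypothetical 2x2 table of observed counts
observed = [[30, 10], [20, 40]]
expected = expected_counts_from_joint_probabilities(observed)
```

Each row of `expected` sums to the corresponding observed row total, which is a quick sanity check on the calculation.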

## Video Tutorial – Allelic Frequencies Remain Constant From Generation to Generation Under the Hardy-Weinberg Equilibrium

The Hardy-Weinberg law is a fundamental principle in statistical genetics. If its 7 assumptions are fulfilled, then it predicts that the allelic frequency of a genetic trait will remain constant from generation to generation. In this new video tutorial on my YouTube channel, I explain the math behind the Hardy-Weinberg theorem. In particular, I clarify […]
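The constancy claim can be checked numerically with a minimal Python sketch. It assumes a single locus with two alleles, labeled `A` and `a` here for illustration, and that the law's assumptions hold so that genotype frequencies are p², 2pq and q². The function name is hypothetical.

```python
def next_generation_allele_frequency(p):
    """Return the frequency of allele A in the next generation, given frequency p now."""
    q = 1.0 - p            # frequency of the other allele, a
    freq_AA = p * p        # homozygous AA
    freq_Aa = 2.0 * p * q  # heterozygous Aa
    # Each AA individual passes on A with certainty; each Aa individual does so half the time.
    return freq_AA + 0.5 * freq_Aa


# p^2 + pq = p(p + q) = p, so the frequency should not change.
frequencies = [next_generation_allele_frequency(p) for p in (0.1, 0.3, 0.7)]
```

Up to floating-point rounding, each value in `frequencies` equals the `p` it started from.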

## Automatic bias correction doesn’t fix omitted variable bias

July 8, 2014

Page 94 of Gelman, Carlin, Stern, Dunson, Vehtari, Rubin, “Bayesian Data Analysis,” 3rd Edition (which we will call BDA3) provides a great example of what happens when common broad frequentist bias criticisms are over-applied to predictions from ordinary linear regression: the predictions appear to fall apart. BDA3 goes on to exhibit what might be considered […] Related posts: Frequentist inference only seems easy; Six Fundamental Methods to Generate a Random…

## Video Tutorial – Calculating Expected Counts in Contingency Tables Using Marginal Proportions and Marginal Totals

A common task in statistics and biostatistics is performing hypothesis tests of independence between 2 categorical random variables. The data for such tests are best organized in contingency tables, which allow expected counts to be calculated easily. In this video tutorial on my YouTube channel, I demonstrate how to calculate expected counts using marginal proportions […]

## Introduction to R for Life Scientists: Course Materials

July 7, 2014

Last week I taught a three-hour introduction to R workshop for life scientists at UVA's Health Sciences Library. I broke the workshop into three sections: In the first half hour or so I presented slides giving an overview of R and why R is so awesome. Du...