# Posts Tagged ‘Statistical Programming’

## Using simulation to compute a power curve

June 5, 2013

Last week I showed how to use simulation to estimate the power of a statistical test. I used the two-sample t test to illustrate the technique. In my example, the difference between the means of two groups was 1.2, and the simulation estimated a probability of 0.72 that the t [...]

## Using simulation to estimate the power of a statistical test

May 30, 2013

The power of a statistical test measures the test's ability to detect a specific alternative hypothesis. For example, educational researchers might want to compare the mean scores of boys and girls on a standardized test. They plan to use the well-known two-sample t test. The null hypothesis is that the [...]
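
As a rough illustration of the idea, here is a minimal sketch in Python rather than SAS. It is not the code from the post. The group size (10 per group), the common standard deviation (1), the 5% significance level, and the hard-coded critical value (about 2.101 for 18 degrees of freedom) are all assumptions made for this example; only the mean difference of 1.2 comes from the excerpts on this page. The function names are hypothetical.

```python
import math
import random


def pooled_t_statistic(x, y):
    """Two-sample t statistic with a pooled variance estimate."""
    n1, n2 = len(x), len(y)
    m1 = sum(x) / n1
    m2 = sum(y) / n2
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def estimate_power(delta, n=10, sigma=1.0, t_crit=2.101, num_sims=5000, seed=1):
    """Estimate power as the fraction of simulated samples that reject H0.

    Assumptions (not from the post): n observations per group, normal data
    with common standard deviation sigma, and a two-sided test at the 5%
    level. t_crit is the approximate critical value for 2*n - 2 = 18 degrees
    of freedom, so it is only valid for n = 10.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(num_sims):
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        y = [rng.gauss(delta, sigma) for _ in range(n)]
        if abs(pooled_t_statistic(x, y)) > t_crit:
            rejections += 1
    return rejections / num_sims


if __name__ == "__main__":
    print("Estimated power at delta = 1.2:", estimate_power(1.2))
    print("Rejection rate under the null:", estimate_power(0.0))
```

Calling `estimate_power(0.0)` is a quick sanity check: when the two means are equal, the rejection rate should sit close to the significance level.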

## Understanding local and global variables in the SAS/IML language

April 29, 2013

The TV show Cheers was set in a bar "where everybody knows your name." Global knowledge of a name is appealing for a neighborhood pub, but not for a programming language. Most programming languages enable you to define functions that have local variables: variables whose names are known only inside [...]

## How to overlay a custom density curve on a histogram in SAS

April 24, 2013

I've previously described how to overlay two or more density curves on a single plot. I've also written about how to use PROC SGPLOT to overlay custom curves on a graph. This article describes how to overlay a density curve on a histogram. For common distributions, you can overlay a [...]

## Point/Counterpoint: Where should you put ODS SELECT and ODS OUTPUT statements?

April 8, 2013

ODS statements are global SAS statements. As such, you can put them anywhere in your SAS program. For maximum readability, many SAS programmers agree that most ODS statements should appear outside procedures in "open" SAS code. For example, most programmers agree that the following statements should appear outside of procedures: [...]

## How to compute the distance between observations in SAS

March 27, 2013

In statistics, distances between observations are used to form clusters, to identify outliers, and to estimate distributions. Distances are used in spatial statistics and in other application areas. There are many ways to define the distance between observations. I have previously written an article that explains Mahalanobis distance, which is [...]
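
To make "distance between observations" concrete, the following is a minimal sketch in Python, not SAS, and not the method from the post. It assumes the ordinary Euclidean distance, which is only one of the many possible definitions the excerpt alludes to. The function name and the sample points are hypothetical.

```python
import math


def euclidean_distance_matrix(rows):
    """Return the symmetric matrix of pairwise Euclidean distances.

    Each element of `rows` is one observation (a sequence of coordinates).
    """
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(rows[i], rows[j])  # requires Python 3.8+
            dist[i][j] = d
            dist[j][i] = d  # distance is symmetric
    return dist


# Three made-up observations in two dimensions.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = euclidean_distance_matrix(points)

if __name__ == "__main__":
    for row in D:
        print(row)
```

Each entry `D[i][j]` holds the distance between observation `i` and observation `j`, so the diagonal is zero and the matrix is symmetric.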

## Got Matrix? Reach for the SAS/IML language

March 20, 2013

Someone recently asked a question on the SAS Support Communities about estimating parameters in ridge regression. I answered the question by pointing to a matrix formula in the SAS documentation. One of the advantages of the SAS/IML language is that you can implement matrix formulas in a natural way. The [...]

## The case of spilled coffee and the regression intercept

March 13, 2013

Argh! I've just spilled coffee on output that shows the least squares coefficients for a regression model that I was investigating. Now the parameter estimate for the intercept is completely obscured, although I can still see the parameter estimates for the coefficients of the continuous explanatory variable. What can I [...]

## Construct normal data from summary statistics

March 11, 2013

Last week there was an interesting question posted to the "Stat-Math Statistics" group on LinkedIn. The original question was a little confusing, so I'll state it in a more general form: A population is normally distributed with a known mean and standard deviation. A sample of size N is drawn [...]

## Find variables common to multiple data sets

February 6, 2013

Last week the SAS Training Post blog posted a short article on an easy way to find variables in common to two data sets. The article used PROC CONTENTS (with the SHORT option) to print out the names of variables in SAS data sets so that you can visually determine [...]