12 Tips for SAS Statistical Programmers

January 3, 2013

(This article was originally published at The DO Loop, and syndicated at StatsBlogs.)

It's the start of a new year. Have you made a resolution to be a better data analyst? A better SAS statistical programmer? To learn more about multivariate statistics? What better way to start the New Year than to read (or re-read!) the top 12 articles for statistical programmers from my blog in 2012? Each contains tips and techniques to make you a better programmer. (Sorry, but I can't promise that they will help you to lose weight!)

I've organized the 12 tips into four categories: multivariate statistics, simulation, matrix computations, and data analysis. Each of these categories is an essential area of knowledge for statistical programmers. The articles that made the Top 12 list were among my most popular blog posts of 2012.

Multivariate statistics

Multivariate data are often correlated. Therefore multivariate analysis, simulation, and outlier detection must account for correlation. These articles describe techniques for understanding and analyzing correlated data:
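To make the point about correlation concrete, here is a minimal SAS/IML sketch of one standard technique, the Mahalanobis distance, which measures how far each observation lies from the center of the data while accounting for the covariance between variables. The data matrix is made up for illustration, and the sketch is not taken from any of the articles.

```sas
proc iml;
/* Hypothetical data: two positively correlated variables, plus one
   observation (the last row) that runs against the correlation */
x = {1 2,
     2 3,
     3 5,
     4 4,
     5 7,
     9 1};

mu = mean(x);                 /* row vector of column means */
S  = cov(x);                  /* sample covariance matrix   */
d  = x - mu;                  /* center each column         */

/* Squared distance for row i is d[i,] * inv(S) * d[i,]` ;
   compute all rows at once with an elementwise product and row sums */
md = sqrt( ((d * inv(S)) # d)[ ,+] );
print x md[label="MahalDist"];
quit;
```

Unlike the Euclidean distance from the mean, this distance depends on the direction of the offset relative to the correlation in the data, which is why it is a common starting point for multivariate outlier detection.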


Simulation

I've written many articles on simulation, but these two articles describe how to implement efficient simulation algorithms in SAS:
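As a rough illustration of what "efficient" can mean here, one commonly recommended pattern is to generate all samples in a single DATA step and then analyze them with BY-group processing, rather than calling a procedure once per sample inside a macro loop. The sketch below assumes that pattern; the sample size, number of samples, seed, and data set names are arbitrary choices for illustration.

```sas
%let N = 10;                 /* size of each sample (assumed)   */
%let NumSamples = 1000;      /* number of samples (assumed)     */

/* Step 1: generate every sample in one DATA step */
data Sim;
   call streaminit(12345);   /* set the seed for reproducibility */
   do SampleID = 1 to &NumSamples;
      do i = 1 to &N;
         x = rand("Normal"); /* standard normal variate */
         output;
      end;
   end;
run;

/* Step 2: compute the statistic for all samples in one procedure call */
proc means data=Sim noprint;
   by SampleID;
   var x;
   output out=OutStats mean=SampleMean;
run;
```

The OutStats data set then contains one row per sample, so the sampling distribution of the mean can be summarized or plotted with one more procedure call.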

Matrix computations

The SAS/IML language makes it easy to compute with matrices and vectors, and to compute quantities such as eigenvalues:
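For instance, a minimal sketch of an eigenvalue computation in SAS/IML might look like the following. The matrix is a small symmetric example invented for illustration.

```sas
proc iml;
/* Hypothetical symmetric matrix */
A = {4 1 0,
     1 3 1,
     0 1 2};

/* evals receives the eigenvalues; the columns of evecs are eigenvectors */
call eigen(evals, evecs, A);

/* Sanity check: A*V should equal V*diag(lambda) up to rounding error */
resid = A*evecs - evecs*diag(evals);
maxResid = max(abs(resid));
print evals, evecs, maxResid;
quit;
```

The residual check is optional, but it is a cheap way to confirm that the returned vectors and values belong together.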

Data analysis

Data analysis requires knowledge of statistics, software, programming, and a lot of common sense. No wonder the "data scientist" is the current hot job!

  • Fitting a Poisson distribution to data in SAS: Some people ask why the UNIVARIATE procedure doesn't support fitting a Poisson distribution. It's because the Poisson distribution is discrete, whereas the UNIVARIATE procedure fits continuous distributions. To fit Poisson data, use PROC GENMOD.
  • Compute a running mean and variance: In matrix-vector languages, it is important to vectorize computations to maximize the efficiency of your program. This article describes a vectorized algorithm for computing the running mean and the running variance.
  • For each observation, find the variable that contains the minimum value: In SAS software, there are usually many ways to compute a quantity. This article describes how to carry out a common task by using PROC IML, PROC SQL, and the DATA step.
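The three tasks in the list above can each be sketched in a few lines. These are minimal illustrations under stated assumptions, not the code from the articles: the data are simulated or made up, the data set and variable names are hypothetical, the running variance uses the textbook sum-of-squares formula (which may differ from the article's algorithm), and only the DATA step approach is shown for the row-minimum task.

```sas
/* 1. Fit a Poisson distribution with an intercept-only Poisson model */
data PoissonData;
   call streaminit(1);
   do i = 1 to 100;
      y = rand("Poisson", 3.5);   /* lambda = 3.5 is an assumed value */
      output;
   end;
run;

proc genmod data=PoissonData;
   model y = / dist=poisson;      /* default log link, so the estimate */
run;                              /* of lambda is exp(Intercept)       */

/* 2. Vectorized running mean and running variance in SAS/IML */
proc iml;
x  = {3, 1, 4, 1, 5, 9, 2, 6};    /* hypothetical data */
n  = T(1:nrow(x));                /* 1, 2, ..., number of observations */
runMean = cusum(x) / n;           /* mean of the first k values        */
ss = cusum(x##2);                 /* cumulative sum of squares         */
runVar = j(nrow(x), 1, .);        /* variance is undefined for k = 1   */
idx = 2:nrow(x);
runVar[idx] = (ss[idx] - n[idx]#runMean[idx]##2) / (n[idx] - 1);
print x runMean runVar;
quit;

/* 3. For each observation, find the variable with the minimum value */
data Wide;
   input x1 x2 x3;
   datalines;
5 2 8
1 7 3
9 4 0
;

data MinVar;
   set Wide;
   array v[*] x1-x3;
   length MinName $32;
   MinValue = min(of v[*]);               /* smallest nonmissing value */
   MinIdx   = whichn(MinValue, of v[*]);  /* first position where it occurs */
   MinName  = vname(v[MinIdx]);           /* name of that variable */
run;
```

One caution about the second sketch: the sum-of-squares formula is simple to vectorize but can lose precision when the mean is large relative to the standard deviation, so an updating formula is safer for such data.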

What articles will be popular in 2013? I don't know, but I am committed to bringing you efficient tips and techniques for statistical programming, statistical graphics, and data analysis in SAS. Subscribe to this blog so that you don't miss a single article!

tags: Data Analysis, Getting Started, Statistical Programming

