Posts Tagged ‘Data Analysis’

Fisher’s transformation of the correlation coefficient

September 20, 2017

Pearson's correlation measures the linear association between two variables. Because the correlation is confined to the interval [-1, 1], the sampling distribution of the sample correlation is highly skewed when the variables are highly correlated. Even for bivariate normal data, the skewness makes it challenging to estimate confidence intervals for the correlation, to run one-sample hypothesis tests ("Is [...] The post Fisher's transformation of the correlation coefficient appeared first on The DO Loop.
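To make the skewness point concrete, here is a minimal Python sketch (not the SAS code from the post) of the standard Fisher z approach to an approximate confidence interval. It assumes bivariate normal data and a sample size above 3. The function name `fisher_ci`, the sample values r = 0.9 and n = 20, and the default 95% critical value are illustrative assumptions.

```python
import math

def fisher_ci(r, n, z_crit=1.959963984540054):
    """Approximate confidence interval for a Pearson correlation.

    Assumes bivariate normal data. z = atanh(r) is treated as roughly
    normal with standard error 1/sqrt(n - 3); the interval is built on
    the z scale and mapped back with tanh.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                # Fisher's transformation
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Hypothetical sample: r = 0.9 observed from n = 20 pairs.
low, high = fisher_ci(0.9, 20)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For these hypothetical inputs the interval runs from roughly 0.76 to 0.96, so it extends much further below 0.9 than above it. That asymmetry is the skewness the excerpt describes.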

Store multiple strings of text as a macro variable in SAS with PROC SQL and the INTO statement

I often need to work with many variables at a time in SAS, but I don’t like to type all of their names manually – not only is it messy to read, it also invites transcription errors, even when copying and pasting. I recently learned of an elegant and efficient way to store multiple […]

Construct polynomial effects in SAS regression models

September 7, 2017

If you use SAS regression procedures, you are probably familiar with the "stars and bars" notation, which enables you to construct interaction effects in regression models. Although you can construct many regression models by using that classical notation, a friend recently reminded me that the EFFECT statement in SAS provides [...] The post Construct polynomial effects in SAS regression models appeared first on The DO Loop.

7 ways to view correlation

September 5, 2017

Correlation is a fundamental statistical concept that measures the linear association between two variables. There are multiple ways to think about correlation: geometrically, algebraically, with matrices, with vectors, with regression, and more. To paraphrase the great songwriter Paul Simon, there must be 50 ways to view your correlation! But don't [...] The post 7 ways to view correlation appeared first on The DO Loop.

The singular value decomposition and low-rank approximations

August 30, 2017

A previous article discussed the mathematical properties of the singular value decomposition (SVD) and showed how to use the SVD subroutine in SAS/IML software. This article uses the SVD to construct a low-rank approximation to an image. Applications include image compression and denoising an image. Construct a grayscale image The [...] The post The singular value decomposition and low-rank approximations appeared first on The DO Loop.

Use the LENGTH statement to pre-set the lengths of character variables in SAS – with a comparison to R

I often create character variables (i.e. variables with strings of text as their values) in SAS, and they sometimes don’t render as expected. Here is an example involving the built-in data set SASHELP.CLASS. Here is the code: data c1; set sashelp.class; * define a new character variable to classify someone as tall or […]

Use a bar chart to visualize pairwise correlations

August 16, 2017

Visualizing the correlations between variables often provides insight into the relationships between variables. I've previously written about how to use a heat map to visualize a correlation matrix in SAS/IML, and Chris Hemedinger showed how to use Base SAS to visualize correlations between variables. Recently a SAS programmer asked how [...] The post Use a bar chart to visualize pairwise correlations appeared first on The DO Loop.

What is rank correlation?

August 14, 2017

When someone refers to the correlation between two variables, they are probably referring to the Pearson correlation, which is the standard statistic that is taught in elementary statistics courses. Elementary courses do not usually mention that there are other measures of correlation. Why would anyone want a different estimate of [...] The post What is rank correlation? appeared first on The DO Loop.
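One common alternative measure is Spearman's rank correlation, which is the Pearson correlation computed on the ranks of the data. The Python sketch below (not the SAS code from the post) is an illustration only: the helper names `average_ranks`, `pearson`, and `spearman` and the sample data are assumptions made for this example.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y is a monotone but strongly nonlinear function of x.
x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]
print("Pearson: ", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

Because y increases whenever x does, the rank correlation is 1 even though the Pearson correlation is noticeably lower. This is one reason someone might prefer a rank-based estimate.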

Robust principal component analysis in SAS

August 9, 2017

Recently, I was asked whether SAS can perform a principal component analysis (PCA) that is robust to the presence of outliers in the data. A PCA requires a data matrix, an estimate for the center of the data, and an estimate for the variance/covariance of the variables. Classically, these estimates [...] The post Robust principal component analysis in SAS appeared first on The DO Loop.

Dimension reduction: Guidelines for retaining principal components

August 2, 2017

Last week I blogged about the broken-stick problem in probability, which reminded me that the broken-stick model is one of the many techniques that have been proposed for choosing the number of principal components to retain during a principal component analysis. Recall that for a principal component analysis (PCA) of [...] The post Dimension reduction: Guidelines for retaining principal components appeared first on The DO Loop.
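As a rough illustration of the broken-stick idea, the Python sketch below (not the SAS code from the post) computes the expected proportions when a unit stick is broken at random into p pieces, then keeps leading components whose share of variance exceeds the matching expected proportion. The function names and the eigenvalues are hypothetical, and the stopping rule shown is one common formulation rather than necessarily the one used in the post.

```python
def broken_stick(p):
    """Expected proportions of a unit stick broken at random into p pieces,
    ordered from largest piece to smallest: E_k = (1/p) * sum_{i=k}^{p} 1/i."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def components_to_retain(eigenvalues):
    """Count leading components whose proportion of variance exceeds the
    broken-stick expectation; stop at the first one that does not."""
    total = sum(eigenvalues)
    expected = broken_stick(len(eigenvalues))
    kept = 0
    for ev, exp_prop in zip(sorted(eigenvalues, reverse=True), expected):
        if ev / total <= exp_prop:
            break
        kept += 1
    return kept

# Hypothetical eigenvalues from a PCA of four variables.
eigenvalues = [3.0, 1.2, 0.5, 0.3]
print([round(v, 4) for v in broken_stick(4)])
print(components_to_retain(eigenvalues))
```

With these made-up eigenvalues, only the first component explains more variance (60%) than the broken-stick expectation (about 52%), so the rule retains one component.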
