# Posts Tagged ‘Data Analysis’

## The 80-20 rule for blogs

April 23, 2018

You've probably heard about the "80-20 Rule," which describes many natural and manmade phenomena. This rule is sometimes called the "Pareto Principle" because it was discovered by Vilfredo Pareto (1848–1923) who used it to describe the unequal distribution of wealth. Specifically, in his study, 80% of the wealth was held [...] The post The 80-20 rule for blogs appeared first on The DO Loop.

## Distance correlation

April 4, 2018

Correlation is a statistic that measures how closely two variables are related to each other. The most popular definition of correlation is the Pearson product-moment correlation, which is a measurement of the linear relationship between two variables. Many textbooks stress the linear nature of the Pearson correlation and emphasize that [...] The post Distance correlation appeared first on The DO Loop.
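
To illustrate the point about linearity, here is a minimal Python sketch (not code from the original post, which uses SAS). It computes the Pearson correlation for a perfectly linear relationship and for a perfectly quadratic one. The function name `pearson` and the sample data are hypothetical, chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: x is symmetric about zero
x = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * v + 1 for v in x]      # y depends linearly on x
quadratic = [v * v for v in x]       # y depends on x, but not linearly

print(pearson(x, linear))     # close to 1: perfect linear relationship
print(pearson(x, quadratic))  # close to 0, even though y is a function of x
```

In the quadratic case the variables are strongly related, yet the Pearson statistic does not detect it, because it only measures linear association.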

## Visualize repetition in song lyrics

March 14, 2018

One of my favorite magazines, Significance, printed an intriguing image of a symmetric matrix that shows repetition in a song's lyrics. The image was created by Colin Morris, who has created many similar images. When I saw these images, I knew that I wanted to duplicate the analysis in SAS! [...] The post Visualize repetition in song lyrics appeared first on The DO Loop.
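
As a rough sketch of how such a symmetric repetition matrix could be built, the Python snippet below marks cell (i, j) whenever the i-th and j-th words of the lyrics match. This is an assumption about the construction, not the code from the post or from Colin Morris. The function name `repetition_matrix` and the sample lyric are hypothetical.

```python
def repetition_matrix(lyrics):
    """Return an n x n 0/1 matrix where cell (i, j) is 1 if word i equals word j."""
    words = lyrics.lower().split()
    return [[1 if a == b else 0 for b in words] for a in words]

# Hypothetical sample lyric
lyrics = "row row row your boat"
matrix = repetition_matrix(lyrics)

# Print a crude text rendering: '#' marks a repeated word, '.' marks no match
for row in matrix:
    print(" ".join("#" if cell else "." for cell in row))
```

The matrix is symmetric because word equality is symmetric, and the main diagonal is always filled because every word matches itself. Repeated phrases show up as off-diagonal blocks or stripes.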

## The difference between CLASS statements and BY statements in SAS

February 14, 2018

When I first learned to program in SAS, I remember being confused about the difference between CLASS statements and BY statements. A novice SAS programmer recently asked when to use one instead of the other, so this article explains the difference between the CLASS statement and BY variables in SAS [...] The post The difference between CLASS statements and BY statements in SAS appeared first on The DO Loop.

## 10 posts from 2017 that deserve a second look

January 10, 2018

Last week I wrote about the 10 most popular articles from The DO Loop in 2017. My most popular articles tend to be about elementary statistics or SAS programming tips. Less popular are the articles about advanced statistical and programming techniques. However, these technical articles fill an important niche. Not [...] The post 10 posts from 2017 that deserve a second look appeared first on The DO Loop.

## Label multiple regression lines in SAS

January 8, 2018

A SAS programmer asked how to label multiple regression lines that are overlaid on a single scatter plot. Specifically, he asked to label the curves that are produced by using the REG statement with the GROUP= option in PROC SGPLOT. He wanted the labels to be the slope and intercept [...] The post Label multiple regression lines in SAS appeared first on The DO Loop.

## The top 10 posts from The DO Loop in 2017

January 3, 2018

I wrote more than 100 posts for The DO Loop blog in 2017. The most popular articles were about SAS programming tips, statistical data analysis, and simulation and bootstrap methods. Here are the most popular articles from 2017 in each category. General SAS programming techniques INTCK and INTNX: Do you [...] The post The top 10 posts from The DO Loop in 2017 appeared first on The DO Loop.

## How to create a sliced fit plot in SAS

December 20, 2017

I previously showed an easy way to visualize a regression model that has several continuous explanatory variables: use the SLICEFIT option in the EFFECTPLOT statement in SAS to create a sliced fit plot. The EFFECTPLOT statement is directly supported by the syntax of the GENMOD, LOGISTIC, and ORTHOREG procedures in [...] The post How to create a sliced fit plot in SAS appeared first on The DO Loop.

## Visualize multivariate regression models by slicing continuous variables

December 18, 2017

Slice, slice, baby! You've got to slice, slice, baby! When you fit a regression model that has multiple explanatory variables, it is a challenge to effectively visualize the predicted values. This article describes how to visualize the regression model by slicing the explanatory variables. In SAS, you can use the [...] The post Visualize multivariate regression models by slicing continuous variables appeared first on The DO Loop.

## 3 problems with mean imputation

December 6, 2017

In a previous article, I showed how to use SAS to perform mean imputation. However, there are three problems with using mean-imputed variables in statistical analyses: Mean imputation reduces the variance of the imputed variables. Mean imputation shrinks standard errors, which invalidates most hypothesis tests and the calculation of confidence [...] The post 3 problems with mean imputation appeared first on The DO Loop.
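
The first problem, reduced variance, can be checked with a small Python sketch (the original post uses SAS; this is only an illustration). The function name `mean_impute` and the data are hypothetical, and missing values are represented here by `None`.

```python
import statistics

def mean_impute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical data with two missing values
data = [2.0, 4.0, None, 6.0, None, 8.0]
observed = [v for v in data if v is not None]
imputed = mean_impute(data)

var_observed = statistics.variance(observed)  # sample variance, observed values only
var_imputed = statistics.variance(imputed)    # sample variance after mean imputation

print(var_observed)
print(var_imputed)  # smaller than var_observed
```

The imputed values sit exactly at the mean, so they add nothing to the sum of squared deviations while increasing the sample size. The variance estimate can therefore only shrink.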