# Posts Tagged ‘Data Analysis’

## Sorting correlation coefficients by their magnitudes in a SAS macro

Theoretical Background: Many statisticians and data scientists use the correlation coefficient to study the relationship between 2 variables. For 2 random variables, $X$ and $Y$, the correlation coefficient between them is defined as their covariance divided by the product of their standard deviations. Algebraically, this can be expressed as $\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$. In real life, you can never […]
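As a rough illustration of the definition in this excerpt (covariance divided by the product of the standard deviations), here is a minimal Python sketch that computes sample correlation coefficients against one target variable and sorts them by absolute value. It is not the SAS macro from the post. The function names `pearson` and `sort_by_magnitude`, the variable names, and the data are all made up for the example.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations; the common
    # 1/(n-1) factor cancels between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def sort_by_magnitude(target, predictors):
    """Return (name, r) pairs ordered from largest |r| to smallest."""
    pairs = [(name, pearson(values, target)) for name, values in predictors.items()]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical data, chosen only to make the ordering easy to check by hand.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
predictors = {
    "x1": [1.0, 3.0, 2.0, 3.0, 1.0],   # uncorrelated with y
    "x2": [5.0, 3.0, 4.0, 1.0, 2.0],   # negatively correlated with y
    "x3": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with y
}

ranked = sort_by_magnitude(y, predictors)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```

Sorting on `abs(r)` rather than `r` keeps strong negative correlations near the top of the list, which is the point of ranking by magnitude.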

## Find your birthday in the digits of pi

March 13, 2017

It is time for Pi Day, 2017! Every year on March 14th (written 3/14 in the US), geeky mathematicians and their friends celebrate "all things pi-related" because 3.14 is the three-decimal approximation to pi. This year I use SAS software to show an amazing fact: you can find your birthday [...] The post Find your birthday in the digits of pi appeared first on The DO Loop.

## Quantile estimates and the difference of medians in SAS

February 22, 2017

Sometimes SAS programmers ask about how to analyze quantiles with SAS. Common questions include: How can I compute 95% confidence intervals for a median in SAS? How can I test whether the medians of two independent samples are significantly different? How can I repeat the previous analyses with other percentiles, [...] The post Quantile estimates and the difference of medians in SAS appeared first on The DO Loop.

## The distribution of colors for plain M&M candies

February 20, 2017

Many introductory courses in probability and statistics encourage students to collect and analyze real data. A popular experiment in categorical data analysis is to give students a bag of M&M® candies and ask them to estimate the proportion of colors in the population from the sample data. In some classes, [...] The post The distribution of colors for plain M&M candies appeared first on The DO Loop.

## An easy way to run thousands of regressions in SAS

February 13, 2017

A common question on SAS discussion forums is how to repeat an analysis multiple times. Most programmers know that the most efficient way to analyze one model across many subsets of the data (perhaps each country or each state) is to sort the data and use a BY statement to [...] The post An easy way to run thousands of regressions in SAS appeared first on The DO Loop.

## Counting is hard, especially when you don’t have theories

January 19, 2017

Exploring the data about movies, uncovering data issues

## Ten posts from 2016 that deserve a second look

January 11, 2017

Last week I wrote about the 10 most popular articles from The DO Loop in 2016. The popular articles tend to be about elementary topics that appeal to a wide range of SAS programmers. Today I present an "editor's choice" list of technical articles that describe more advanced statistical methods […] The post Ten posts from 2016 that deserve a second look appeared first on The DO Loop.

## Is "La Quinta" Spanish for "Next to Denny’s"?

January 6, 2017

“La Quinta” is Spanish for “next to Denny’s.” -- Mitch Hedberg, comedian

Mitch Hedberg's joke resonates with travelers who drive on the US interstate system because many highway exits feature both a La Quinta Inn™ and a Denny's® restaurant within a short distance of each other. But does a […] The post Is "La Quinta" Spanish for "Next to Denny's"? appeared first on The DO Loop.

## The top 10 posts from The DO Loop in 2016

January 4, 2017

I wrote 105 posts for The DO Loop blog in 2016. My most popular articles were about data analysis, SAS programming tips, and elementary statistics. Without further ado, here are the most popular articles from 2016. Data Analysis and Visualization: Start with a juicy set of data and an interesting […] The post The top 10 posts from The DO Loop in 2016 appeared first on The DO Loop.

## Data Preparation, Long Form and tl;dr Form

December 26, 2016

Data preparation and cleaning are some of the most important steps of predictive analytics and data science tasks. They are laborious, they are where most of the errors are made, they are your last line of defense against wild data, and they hold the biggest opportunities for outcome improvement. No matter how much time you spend on them, they … Continue reading Data Preparation, Long Form and tl;dr Form