Posts Tagged ‘Data Analysis’

Weighted percentiles

August 29, 2016

Many univariate descriptive statistics are intuitive. However, weighted statistics are less intuitive. A weight variable changes the computation of a statistic by giving more weight to some observations than to others. This article shows how to compute and visualize weighted percentiles, also known as weighted quantiles, as computed by […] The post Weighted percentiles appeared first on The DO Loop.
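As a rough illustration of how a weight variable changes a percentile, here is a minimal Python sketch. It assumes one common definition: the weighted percentile is the smallest value whose cumulative share of the total weight reaches p. The excerpt does not say which definition the full article uses, so the function name `weighted_percentile` and this rule are assumptions, not the article's method.

```python
def weighted_percentile(values, weights, p):
    """Return the smallest value whose cumulative weight share reaches p.

    Assumed definition (inverse of the weighted empirical CDF, no
    interpolation). p is a fraction with 0 < p <= 1.
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be nonnegative with a positive sum")

    # Sort observations by value, carrying each weight along.
    pairs = sorted(zip(values, weights))
    target = p * sum(w for _, w in pairs)

    running = 0.0
    for x, w in pairs:
        running += w
        if running >= target:
            return x
    return pairs[-1][0]  # guard against floating-point shortfall


# Equal weights reproduce an ordinary (unweighted) percentile.
equal = weighted_percentile([1, 2, 3, 4], [1, 1, 1, 1], 0.5)
# A heavy weight on the largest value pulls the median toward it.
heavy = weighted_percentile([1, 2, 3, 4], [1, 1, 1, 5], 0.5)
```

With equal weights the median of 1, 2, 3, 4 under this rule is 2; giving the value 4 a weight of 5 moves the weighted median to 4, because that single observation then carries more than half of the total weight.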

Read more »

Formats for p-values and odds ratios in SAS

August 15, 2016

Last week I showed some features of SAS formats, including the fact that you can use formats to bin a continuous variable without creating a new variable in the DATA step. During the discussion I mentioned that it can be confusing to look at the output of a formatted variable […] The post Formats for p-values and odds ratios in SAS appeared first on The DO Loop.

Read more »

Compute highest density regions in SAS

August 1, 2016

In a scatter plot, the regions where observations are packed tightly are areas of high density. A contour plot or heat map of a bivariate kernel density estimate (KDE) is one way to visualize regions of high density. A SAS customer asked whether it is possible to use SAS to […] The post Compute highest density regions in SAS appeared first on The DO Loop.

Read more »

What happened when I was forced to wait 30 minutes for the subway

July 18, 2016

Pondering how easy it is for data analysts to get fooled by bad data.

Read more »

How much do New Yorkers tip taxi drivers?

May 2, 2016

When I read Robert Allison's article about the cost of a taxi ride in New York City, I was struck by the scatter plot that plots the tip amount against the total bill for 12 million taxi rides. The graph clearly reveals diagonal and […] The post How much do New Yorkers tip taxi drivers? appeared first on The DO Loop.

Read more »

Visualize missing data in SAS

April 20, 2016

You can visualize missing data. It sounds like an oxymoron, but it is true. How can you draw graphs of something that is missing? In a previous article, I showed how you can use PROC MI in SAS/STAT software to create a table that shows patterns of missing data in […] The post Visualize missing data in SAS appeared first on The DO Loop.

Read more »

Examine patterns of missing data in SAS

April 18, 2016

Missing data can be informative. Sometimes missing values in one variable are related to missing values in another variable. Other times missing values in one variable are independent of missing values in other variables. As part of the exploratory phase of data analysis, you should investigate whether there are patterns […] The post Examine patterns of missing data in SAS appeared first on The DO Loop.
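To make the idea of a missing-data pattern concrete, the following Python sketch tabulates how often each combination of missing and nonmissing variables occurs. It is only an illustration of the concept, not the SAS approach the article describes. The function name `missing_patterns`, the use of `None` for a missing value, and the "X"/"." pattern notation are all assumptions made for this example.

```python
from collections import Counter


def missing_patterns(rows, columns):
    """Count rows by missingness pattern.

    Each pattern is a string with one character per column:
    'X' if the value is present, '.' if it is missing (None).
    Returns (pattern, count) pairs, most frequent first.
    """
    counts = Counter(
        "".join("." if row.get(col) is None else "X" for col in columns)
        for row in rows
    )
    return counts.most_common()


# Hypothetical data: 'b' is often missing, and 'a' is missing only when 'b' is.
rows = [
    {"a": 1, "b": None},
    {"a": 2, "b": 3},
    {"a": None, "b": None},
    {"a": 5, "b": None},
]
patterns = missing_patterns(rows, ["a", "b"])
```

In this made-up data the pattern "X." (a present, b missing) occurs twice, while "XX" and ".." occur once each. A pattern such as ".X" never appearing is the kind of relationship between variables that an exploratory analysis would want to notice.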

Read more »

The WHERE clause in SAS/IML

April 4, 2016

In SAS procedures, the WHERE clause is a useful way to filter observations so that the procedure receives only a subset of the data to analyze. The IML procedure supports the WHERE clause in two separate statements. On the USE statement, the WHERE clause acts as a global filter. The […] The post The WHERE clause in SAS/IML appeared first on The DO Loop.

Read more »

Save descriptive statistics for multiple variables in a SAS data set

March 28, 2016

Descriptive univariate statistics are the foundation of data analysis. Before you create a statistical model for new data, you should examine descriptive univariate statistics such as the mean, standard deviation, quantiles, and the number of nonmissing observations. In SAS, there is an easy way to create a data set that […] The post Save descriptive statistics for multiple variables in a SAS data set appeared first on The DO Loop.

Read more »

High school rankings of top NCAA wrestlers

March 25, 2016

Last weekend was the 2016 NCAA Division I wrestling tournament. In collegiate wrestling there are ten weight classes. The top eight wrestlers in each weight class are awarded the title "All-American" to acknowledge that they are the best wrestlers in the country. I saw a blog post on the InterMat […] The post High school rankings of top NCAA wrestlers appeared first on The DO Loop.

Read more »

