SAS

Blogs on the SAS software

Sample quantiles: A comparison of 9 definitions

May 24, 2017

According to Hyndman and Fan ("Sample Quantiles in Statistical Packages," TAS, 1996), there are nine definitions of sample quantiles that commonly appear in statistical software packages. Hyndman and Fan identify three definitions that are based on rounding and six methods that are based on linear interpolation. This blog post shows [...] The post Sample quantiles: A comparison of 9 definitions appeared first on The DO Loop.
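To see why the choice of definition matters on a small sample, here is a minimal Python sketch (not SAS code, and not taken from the post) that compares one rounding-style definition with one interpolation definition. The function names are made up for illustration, and the "Type 1" and "Type 7" labels follow my understanding of the Hyndman and Fan numbering, so treat them as assumptions.

```python
import math

def quantile_type1(data, p):
    """Rounding-style definition: inverse of the empirical CDF (assumed 'Type 1')."""
    x = sorted(data)
    n = len(x)
    k = max(1, math.ceil(n * p))   # smallest rank k with k/n >= p
    return x[k - 1]

def quantile_type7(data, p):
    """Interpolation definition: linear between order statistics (assumed 'Type 7')."""
    x = sorted(data)
    n = len(x)
    h = (n - 1) * p                # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, n - 1)        # guard the upper end when p == 1
    return x[lo] + (h - lo) * (x[hi] - x[lo])

# Hypothetical small sample with one large value
sample = [1, 2, 3, 4, 5, 6, 20]
q1 = quantile_type1(sample, 0.9)   # picks an observed value
q7 = quantile_type7(sample, 0.9)   # interpolates between the two largest values
print(q1, q7)
```

With only seven observations, the two definitions give noticeably different 90th percentiles, which is the kind of discrepancy the comparison is about.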

Read more »

Quantile definitions in SAS

May 22, 2017

In last week's article about the Flint water crisis, I computed the 90th percentile of a small data set. Although I didn't mention it, the value that I reported is different from the 90th percentile that is reported in Significance magazine. That is not unusual. The data only had [...] The post Quantile definitions in SAS appeared first on The DO Loop.

Read more »

Quantiles and the Flint water crisis

May 17, 2017

The April 2017 issue of Significance magazine features a cover story by Robert Langkjaer-Bain about the Flint (Michigan) water crisis. For those who don't know, the Flint water crisis started in 2014 when the impoverished city began using the Flint River as a source of city water. The water was [...] The post Quantiles and the Flint water crisis appeared first on The DO Loop.

Read more »

Dueling Data Science Surveys: KDnuggets & Rexer Go Live

May 16, 2017

What tools do we use most for data science, machine learning, or analytics? Python, R, SAS, KNIME, RapidMiner,…? How do we use them? We are about to find out as the two most popular surveys on data science tools have … Continue reading →

Read more »

INTCK and INTNX: Two essential functions for computing intervals between dates in SAS

May 15, 2017

Last week I showed a timeline of living US presidents. The number of living presidents is constant during the time interval between inaugurations and deaths of presidents. The data was taken from a Wikipedia table (shown below) that shows the number of years and days between events. This article shows [...] The post INTCK and INTNX: Two essential functions for computing intervals between dates in SAS appeared first on The…

Read more »

Simulate lognormal data in SAS

May 10, 2017

A SAS customer asked how to simulate data from a three-parameter lognormal distribution as specified in the PROC UNIVARIATE documentation. In particular, he wanted to incorporate a threshold parameter into the simulation. Simulating lognormal data is easy if you remember an important fact: if X is lognormally distributed, then Y=log(X) [...] The post Simulate lognormal data in SAS appeared first on The DO Loop.
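As a rough illustration of the idea (in Python rather than SAS, and not the code from the post), the standard relationship is that the log of a lognormal variable is normal, so one can draw a normal variate, exponentiate it, and shift by the threshold. The function name and the parameter names `theta` (threshold), `zeta` (log-scale mean), and `sigma` (log-scale standard deviation) are assumptions chosen for this sketch.

```python
import math
import random

def simulate_lognormal3(n, theta, zeta, sigma, seed=None):
    """Draw n values X = theta + exp(Y), where Y ~ Normal(zeta, sigma).

    theta shifts the distribution, so every simulated value exceeds theta.
    """
    rng = random.Random(seed)      # local generator keeps runs reproducible
    return [theta + math.exp(rng.gauss(zeta, sigma)) for _ in range(n)]

# Hypothetical parameter values, for illustration only
x = simulate_lognormal3(10000, theta=10.0, zeta=2.0, sigma=0.5, seed=1)

# Sanity check: log(X - theta) should look like Normal(zeta, sigma)
logs = [math.log(v - 10.0) for v in x]
mean_log = sum(logs) / len(logs)
print(min(x) > 10.0, round(mean_log, 2))
```

Subtracting the threshold and taking logs recovers the underlying normal sample, which gives a quick check on the simulation.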

Read more »

Timeline of living US presidents

May 8, 2017

Quick! What is the next term in the numerical sequence 1, 2, 1, 2, 3, 4, 5, 4, 3, 4, ...? If you said '3', then you must be an American history expert, because that sequence represents the number of living US presidents beginning with Washington's inauguration on 30APR1789 and [...] The post Timeline of living US presidents appeared first on The DO Loop.

Read more »

Perceptions of probability

May 3, 2017

If a financial analyst says it is "likely" that a company will be profitable next year, what probability would you ascribe to that statement? If an intelligence report claims that there is "little chance" of a terrorist attack against an embassy, should the ambassador interpret this as a one-in-a-hundred chance, [...] The post Perceptions of probability appeared first on The DO Loop.

Read more »

Split data into groups that have the same mean and variance

May 1, 2017

A frequently asked question on SAS discussion forums concerns randomly assigning units (often patients in a study) to various experimental groups so that each group has approximately the same number of units. This basic problem is easily solved in SAS by using PROC SURVEYSELECT or a DATA step program. A [...] The post Split data into groups that have the same mean and variance appeared first on The DO Loop.
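For the basic problem only (equal-sized random groups, not the mean-and-variance balancing in the title), a minimal Python sketch might look like the following. It is not the PROC SURVEYSELECT or DATA step approach mentioned above; the function name `assign_groups` and the unit IDs are hypothetical.

```python
import random

def assign_groups(units, k, seed=None):
    """Randomly split units into k groups whose sizes differ by at most one."""
    rng = random.Random(seed)      # local generator keeps runs reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)          # random order removes any systematic pattern
    groups = [[] for _ in range(k)]
    for i, unit in enumerate(shuffled):
        groups[i % k].append(unit) # deal units out round-robin
    return groups

# Ten hypothetical patient IDs assigned to three groups
groups = assign_groups(range(1, 11), 3, seed=2017)
for g in groups:
    print(len(g), sorted(g))
```

Dealing the shuffled units out in turn guarantees that group sizes are as equal as possible, whatever the total count.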

Read more »

Visualize a design matrix

April 26, 2017

Most SAS regression procedures support a CLASS statement, which internally generates dummy variables for categorical variables. I have previously described what dummy variables are and how they are used. I have also written about how to create design matrices that contain dummy variables in SAS, and in particular how to [...] The post Visualize a design matrix appeared first on The DO Loop.

Read more »

