Blog Archives

Position items in a grid

May 21, 2018

In a recent blog post, Chris Hemedinger used a scatter plot to show the result of 100 coin tosses. Chris arranged the 100 results in a 10 x 10 grid, where the first 10 results were shown on the first row, the second 10 were shown on the second row, and so [...] The post Position items in a grid appeared first on The DO Loop.
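
The row-by-row layout described in the excerpt comes down to integer division and remainder. Here is a minimal sketch in Python rather than SAS, and not the code from the post. The function name `grid_position` and the 1-based numbering are assumptions made for illustration.

```python
def grid_position(i, ncols=10):
    """Return the 1-based (row, col) of the i-th item (1-based)
    when items fill a grid row by row, ncols items per row."""
    if i < 1:
        raise ValueError("item index must be >= 1")
    # Shift to 0-based, split into quotient (row) and remainder (column),
    # then shift both back to 1-based.
    row, col = divmod(i - 1, ncols)
    return row + 1, col + 1

# Positions for 100 results arranged in a 10 x 10 grid
positions = [grid_position(i) for i in range(1, 101)]
```

With these conventions, items 1 through 10 land on row 1, items 11 through 20 on row 2, and item 100 ends up at row 10, column 10.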

Decile calibration plots in SAS

May 16, 2018

In my article about how to construct calibration plots for logistic regression models in SAS, I mentioned that there are several popular variations of the calibration plot. The previous article showed how to construct a loess-based calibration curve. Austin and Steyerberg (2013) recommend the loess-based curve on the basis of [...] The post Decile calibration plots in SAS appeared first on The DO Loop.

Calibration plots in SAS

May 14, 2018

A logistic regression model is a way to predict the probability of a binary response based on values of explanatory variables. It is important to be able to assess the accuracy of a predictive model. This article shows how to construct a calibration plot in SAS. A calibration plot is [...] The post Calibration plots in SAS appeared first on The DO Loop.

Independent streams of random numbers in SAS

May 9, 2018

In a previous blog post, I discussed ways to produce statistically independent samples from a random number generator (RNG). The best way is to generate all samples from one stream. However, if your program uses two or more SAS DATA steps to simulate the data, you cannot use the same [...] The post Independent streams of random numbers in SAS appeared first on The DO Loop.

Independence and overlap in streams of random numbers

May 7, 2018

Simulation studies require both randomness and reproducibility, two qualities that are sometimes at odds with each other. A Monte Carlo simulation might need to generate millions of random samples, where each sample contains dozens of continuous variables and many thousands of observations. In simulation studies, the researcher wants each sample [...] The post Independence and overlap in streams of random numbers appeared first on The DO Loop.

Order variables in a heat map or scatter plot matrix

May 2, 2018

Order matters. When you create a graph that has a categorical axis (such as a bar chart), it is important to consider the order in which the categories appear. Most software defaults to alphabetical order, which typically gives no insight into how the categories relate to each other. Alphabetical order [...] The post Order variables in a heat map or scatter plot matrix appeared first on The DO Loop.

Assign colors in heat maps: A study of married couples and college majors

April 30, 2018

Some say that opposites attract. Others say that birds of a feather flock together. Which is it? Philip N. Cohen, a professor of sociology at the University of Maryland, recently posted an interesting visualization that indicates that married couples who are college graduates tend to be birds of a feather. [...] The post Assign colors in heat maps: A study of married couples and college majors appeared first on The…

An easier way to run thousands of regressions

April 25, 2018

SAS programmers on SAS discussion forums sometimes ask how to run thousands of regressions of the form Y = B0 + B1*X_i, where i=1,2,.... A similar question asks how to solve thousands of regressions of the form Y_i = B0 + B1*X for thousands of response variables. I have previously [...] The post An easier way to run thousands of regressions appeared first on The DO Loop.
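
To make the first form concrete: each model Y = B0 + B1*X_i is a simple linear regression, so its two coefficients have closed-form least-squares estimates. The sketch below is in Python rather than SAS and is not the approach from the post. The function name `simple_regressions` and the toy data are assumptions made for illustration.

```python
def simple_regressions(y, x_columns):
    """Fit Y = B0 + B1*X_i by ordinary least squares for each column X_i.

    Returns a list of (b0, b1) pairs, one per column. Assumes every
    column has the same length as y and is not constant (a constant
    column makes sxx zero and raises ZeroDivisionError).
    """
    n = len(y)
    y_mean = sum(y) / n
    estimates = []
    for x in x_columns:
        x_mean = sum(x) / n
        # Centered sums of squares and cross products
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
        b1 = sxy / sxx             # slope
        b0 = y_mean - b1 * x_mean  # intercept
        estimates.append((b0, b1))
    return estimates

# Toy data (hypothetical): one response, two explanatory variables
y = [1.0, 2.0, 2.0]
x_columns = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
estimates = simple_regressions(y, x_columns)
```

The response mean is computed once and reused for every column, which is the main saving when the same Y is regressed on thousands of different X_i.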

The 80-20 rule for blogs

April 23, 2018

You've probably heard about the "80-20 Rule," which describes many natural and manmade phenomena. This rule is sometimes called the "Pareto Principle" because it was discovered by Vilfredo Pareto (1848–1923), who used it to describe the unequal distribution of wealth. Specifically, in his study, 80% of the wealth was held [...] The post The 80-20 rule for blogs appeared first on The DO Loop.

The sweep operator: A fundamental operation in regression

April 18, 2018

The sweep operator performs elementary row operations on a system of linear equations. The sweep operator enables you to build regression models by "sweeping in" or "sweeping out" particular rows of the X`X matrix. As you do so, the estimates for the regression coefficients, the error sum of squares, and [...] The post The sweep operator: A fundamental operation in regression appeared first on The DO Loop.
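
As a rough illustration of "sweeping in," here is a sketch of one common textbook form of the sweep operator, applied to the augmented cross-product matrix [X`X X`y; y`X y`y]. It is written in Python rather than SAS and is not the code from the post. The function name `sweep`, the sign convention, and the three-observation data set are assumptions made for illustration.

```python
def sweep(a, k):
    """Return a copy of the square matrix `a` (list of lists) swept on pivot k.

    Sign conventions for the sweep operator vary between references;
    this is one common variant, not necessarily the one used in SAS.
    """
    n = len(a)
    d = a[k][k]
    if d == 0:
        raise ZeroDivisionError("pivot is zero; cannot sweep")
    out = [row[:] for row in a]
    out[k] = [v / d for v in a[k]]   # scale the pivot row
    for i in range(n):
        if i == k:
            continue
        b = a[i][k]
        # Eliminate column k from row i, then store -b/d in its place
        out[i] = [a[i][j] - b * out[k][j] for j in range(n)]
        out[i][k] = -b / d
    out[k][k] = 1.0 / d
    return out

# Toy data (hypothetical): X has an intercept column and x = 1, 2, 3; y = 1, 2, 2
a = [[3.0, 6.0, 5.0],     # X`X (2 x 2) bordered by X`y
     [6.0, 14.0, 11.0],
     [5.0, 11.0, 9.0]]    # y`X and y`y
swept = sweep(sweep(a, 0), 1)    # sweep in the intercept and the slope
beta = [swept[0][2], swept[1][2]]
sse = swept[2][2]
```

After both pivots are swept, the last column holds the coefficient estimates (about 0.667 and 0.5 for this toy data) and the lower-right entry holds the error sum of squares (about 0.167). In this example, sweeping a second time on the same pivot undoes the first sweep, which corresponds to "sweeping out" that row.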
