Posts Tagged ‘Statistical Programming’

Simultaneous confidence intervals for multinomial proportions

February 15, 2017

A categorical response variable can take on k different values. If you have a random sample from a multinomial response, the sample proportions estimate the proportion of each category in the population. This article describes how to construct simultaneous confidence intervals for the proportions as described in the 1997 paper [...] The post Simultaneous confidence intervals for multinomial proportions appeared first on The DO Loop.

An easy way to run thousands of regressions in SAS

February 13, 2017

A common question on SAS discussion forums is how to repeat an analysis multiple times. Most programmers know that the most efficient way to analyze one model across many subsets of the data (perhaps each country or each state) is to sort the data and use a BY statement to [...] The post An easy way to run thousands of regressions in SAS appeared first on The DO Loop.

Winsorization: The good, the bad, and the ugly

February 8, 2017

On discussion forums, I often see questions that ask how to Winsorize variables in SAS. For example, here are some typical questions from the SAS Support Community: I want an efficient way of replacing (upper) extreme values with (95th) percentile. I have a data set with around 600 variables and [...] The post Winsorization: The good, the bad, and the ugly appeared first on The DO Loop.

Simulate many samples from a linear regression model

February 1, 2017

In a previous article, I showed how to simulate data for a linear regression model with an arbitrary number of continuous explanatory variables. To keep the discussion simple, I simulated a single sample with N observations and p variables. However, to use Monte Carlo methods to approximate the sampling distribution [...] The post Simulate many samples from a linear regression model appeared first on The DO Loop.

Automate the creation of a discrete attribute map

January 30, 2017

If you are a SAS programmer and use the GROUP= option in PROC SGPLOT, you might have encountered a thorny issue: if you use a WHERE clause to omit certain observations, then the marker colors for groups might change from one plot to another. This happens because the marker colors [...] The post Automate the creation of a discrete attribute map appeared first on The DO Loop.

Simultaneous confidence intervals for a multivariate mean

December 7, 2016

Many SAS procedures compute statistics and also compute confidence intervals for the associated parameters. For example, PROC MEANS can compute the estimate of a univariate mean, and you can use the CLM option to get a confidence interval for the population mean. Many parametric regression procedures (such as PROC GLM) […] The post Simultaneous confidence intervals for a multivariate mean appeared first on The DO Loop.

Loess regression in SAS/IML

October 19, 2016

A previous post discusses how the loess regression algorithm is implemented in SAS. The LOESS procedure in SAS/STAT software provides the data analyst with options to control the loess algorithm and fit nonparametric smoothing curves through points in a scatter plot. Although PROC LOESS satisfies 99.99% of SAS users who […] The post Loess regression in SAS/IML appeared first on The DO Loop.

Compute nearest neighbors in SAS

September 14, 2016

Finding nearest neighbors is an important step in many statistical computations such as local regression, clustering, and the analysis of spatial point patterns. Several SAS procedures find nearest neighbors as part of an analysis, including PROC LOESS, PROC CLUSTER, PROC MODECLUS, and PROC SPP. This article shows how to find […] The post Compute nearest neighbors in SAS appeared first on The DO Loop.

How to visualize a kernel density estimate

July 27, 2016

A kernel density estimate (KDE) is a nonparametric estimate for the density of a data sample. A KDE can help an analyst determine how to model the data: Does the KDE look like a normal curve? Like a mixture of normals? Is there evidence of outliers in the data? In […] The post How to visualize a kernel density estimate appeared first on The DO Loop.

Use the EFFECTPLOT statement to visualize regression models in SAS

June 22, 2016

Graphs enable you to visualize how the predicted values for a regression model depend on the model effects. You can gain an intuitive understanding of a model by using the EFFECTPLOT statement in SAS to create graphs like the one shown at the top of this article. Many SAS regression […] The post Use the EFFECTPLOT statement to visualize regression models in SAS appeared first on The DO Loop.

