SAS

Blogs on the SAS software

Example 2014.11: Contrasts the basic way for R

September 30, 2014
By
Example 2014.11: Contrasts the basic way for R

As we discuss in section 6.1.4 of the second edition, R and SAS handle categorical variables and their parameterization in models quite differently. SAS treats them on a procedure-by-procedure basis, which leads to some odd differences in capabilities...

Read more »

Create discrete heat maps in SAS/IML

September 29, 2014
By
Create discrete heat maps in SAS/IML

In a previous article I introduced the HEATMAPCONT subroutine in SAS/IML 13.1, which makes it easy to visualize matrices by using heat maps with continuous color ramps. This article introduces a companion subroutine. The HEATMAPDISC subroutine, which also requires SAS/IML 13.1, is designed to visualize matrices that have a small […]

Read more »

The frequency of bigrams in an English corpus

September 26, 2014
By
The frequency of bigrams in an English corpus

In last week's article about the distribution of letters in an English corpus, I presented research results by Peter Norvig who used Google's digitized library and tabulated the frequency of each letter. Norvig also tabulated the frequency of bigrams, which are pairs of letters that appear consecutively within a word. […]

Read more »

Designing a quantile bin plot

September 24, 2014
By
Designing a quantile bin plot

While at JSM 2014 in Boston, a statistician asked me whether it was possible to create a "customized bin plot" in SAS. When I asked for more information, she told me that she has a large data set. She wants to visualize the data, but a scatter plot is not […]

Read more »

Skew this

September 22, 2014
By
Skew this

The skewness of a distribution indicates whether a distribution is symmetric or not. A distribution that is symmetric about its mean has zero skewness. In contrast, if the right tail of a unimodal distribution has more mass than the left tail, then the distribution is said to be "right skewed" […]

Read more »

The frequency of letters in an English corpus

September 19, 2014
By
The frequency of letters in an English corpus

It's time for another blog post about ciphers. As I indicated in my previous blog post about substitution ciphers, the classical substitution cipher is no longer used to encrypt ultra-secret messages because the enciphered text is prone to a type of statistical attack known as frequency analysis. At the root […]

Read more »

Read from one data set and write to another with SAS/IML

September 17, 2014
By
Read from one data set and write to another with SAS/IML

Many people know that the SAS/IML language enables you to read data from and write results to multiple SAS data sets. When you open a new data set, it is a good programming practice to close the previous data set. But did you know that you can have two data […]

Read more »

Handling run-time errors in user-defined modules

September 15, 2014
By
Handling run-time errors in user-defined modules

I received the following email from a SAS/IML programmer: I am getting an error in a PROC IML module that I wrote. The SAS Log says NOTE: Paused in module NAME When I submit other commands, PROC IML doesn't seem to understand them. How can I continue the program? The […]

Read more »

An exploratory technique for visualizing the distributions of 100 variables

September 10, 2014
By
An exploratory technique for visualizing the distributions of 100 variables

In a previous blog post I showed how to order a set of variables by a statistic. After reshaping data, you can create a graph that contains box plots for many variables. Ordering the variables by some statistic (mean, median, variance,...) helps to differentiate and distinguish the variables. You can […]

Read more »

Order variables by values of a statistic

September 8, 2014
By
Order variables by values of a statistic

When I create a graph of data that contains a categorical variable, I rarely want to display the categories in alphabetical order. For example, the box plot to the left is a plot of 10 standardized variables where the variables are ordered by their median value. The ordering makes it […]

Read more »


Subscribe

Email:

  Subscribe