Initializing vectors by using repetition factors

February 25, 2013

(This article was originally published at The DO Loop, and syndicated at StatsBlogs.)

The SAS/IML language has a curious syntax that enables you to specify a "repetition factor" when you initialize a vector of literal values. Essentially, the language enables you to specify the frequency of an element. For example, suppose you want to define the following vector:

proc iml;
x = {1 2 2 2 2 3 3 3 3 3 4 4 5 5 5 5 5 5};

The vector has one 1, followed by four 2s, followed by five 3s, two 4s, and six 5s. An alternative syntax is to specify the "repetition factor" for each element by using a positive integer enclosed in brackets, like so:

x = {1 [4]2 [5]3 [2]4 [6]5};

You can think of the repetition factor as the frequency or number of occurrences of the value that follows the closing bracket.
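As a quick sanity check, the two definitions can be compared element by element. The following is a minimal sketch; the names x1, x2, n, and same are chosen here for illustration and are not part of the original example:

```sas
proc iml;
/* x1: every element typed out; x2: the same vector with repetition factors */
x1 = {1 2 2 2 2 3 3 3 3 3 4 4 5 5 5 5 5 5};
x2 = {1 [4]2 [5]3 [2]4 [6]5};

n    = ncol(x2);       /* number of elements: 1 + 4 + 5 + 2 + 6 = 18 */
same = all(x1 = x2);   /* 1 if every pair of elements matches, 0 otherwise */
print n same;
quit;
```

If the two literals define the same row vector, the PRINT statement should show n = 18 and same = 1.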

Admittedly, this simple example does not save much typing. But when a value is repeated tens or hundreds of times, the syntax saves typing, is less prone to error, and is easier to read. For example, repetition factors make it easy to specify the genders of 100 subjects:

gender = {[42]"Female" [58]"Male"};
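Because a long literal is hard to check by eye, it can help to count the elements after defining the vector. The sketch below assumes that the comparison operator pads the shorter string with blanks, so that "Male" still matches after it is stored at the length of "Female". The names nSubjects, nFemale, and nMale are illustrative:

```sas
proc iml;
gender = {[42]"Female" [58]"Male"};

nSubjects = ncol(gender);            /* total number of elements in the row vector */
nFemale   = sum(gender = "Female");  /* count of elements equal to "Female" */
nMale     = sum(gender = "Male");    /* count of elements equal to "Male" */
print nSubjects nFemale nMale;
quit;
```

The three counts should be 100, 42, and 58, which matches the repetition factors in the literal.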

I find this syntax interesting because I am not aware of many other languages that support repetition factors like this. FORTRAN has repetition factors for the FORMAT and DATA statements. The SAS DATA step supports a similar syntax, which uses an asterisk instead of brackets, and which obviously preceded and inspired the SAS/IML syntax:

data A;
array x(18) (1 4*2 5*3 2*4 6*5);
run;

Furthermore, the SAS/SCL language had repetition factors for defining arrays. Does anyone know of other languages that support a similar syntax for initializing arrays?

tags: Getting Started, SAS Programming
