Data with repeated measures often come to us in the "wide" format, as shown below for the HELP data set we use in our book. Here we show just an ID, the CESD depression measure from four follow-up assessments, plus the baseline CESD. Obs ID CESD...

As we discuss in section 6.1.4 of the second edition, R and SAS handle categorical variables and their parameterization in models quite differently. SAS treats them on a procedure-by-procedure basis, which leads to some odd differences in capabilities...

In Example 8.40, side-by-side histograms, we showed how to generate histograms of a continuous variable for each level of a categorical variable in a data set. An anonymous reader asked how we would do this if both variables were continuous. ...

As of today, the second edition of "SAS and R: Data Management, Statistical Analysis, and Graphics" is shipping from CRC Press, Amazon, and other booksellers. There are lots of additional examples from this blog, new organization, and other features ...

In our last entry, we demonstrated how to simulate data from a logistic regression with an interaction between a dichotomous and a continuous covariate. In this entry we show how to use the simulation to estimate the power to detect that interaction. ...

Reader Annisa Mike asked in a comment on an early post about power calculation for logistic regression with an interaction. This is a topic that has come up with increasing frequency in grant proposals and article submissions. We'll begin by showing ...

A colleague recently contacted us with the following question: "My outcome is skewed-- how can I compare medians across multiple categories?" What they were asking for was a generalization of the Wilcoxon rank-sum test (also known as the Mann-Whitney...

We're both users of multiple imputation for missing data. We believe it is the most practical principled method for incorporating the most information into data analysis. In fact, one of our more successful collaborations is a review of software for ...

Rick Wicklin showed how to make a Hilbert matrix in SAS/IML. Rick has a nice discussion of these matrices and why they might be interesting; the value of H_{r,c} is 1/(r+c-1). We show how to make this matrix in the data step and in R. We also show t...
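As a small illustration of the formula above, here is a minimal sketch in Python; it is not the SAS data step or R code from the post. The function name `hilbert` is hypothetical, and exact fractions are used only to keep the entries readable.

```python
from fractions import Fraction


def hilbert(n):
    """Build the n x n Hilbert matrix as a list of rows.

    Entry (r, c), with 1-based indices, is 1 / (r + c - 1).
    """
    return [
        [Fraction(1, r + c - 1) for c in range(1, n + 1)]
        for r in range(1, n + 1)
    ]


# Print the 3 x 3 case, one row per line.
for row in hilbert(3):
    print("  ".join(str(x) for x in row))
```

Because the entry depends only on r + c, the matrix is symmetric, so swapping the loop order gives the same result.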

One common violation of the assumptions needed for linear regression is heteroscedasticity by group membership. Both SAS and R can easily accommodate this setting. Our data today come from a real example of vitamin D supplementation of milk. Four sup...