Computing the nearest correlation matrix

November 28, 2012

(This article was originally published at The DO Loop, and syndicated at StatsBlogs.)

Frequently someone will post a question to the SAS Support Community that says something like this:

I am trying to do [statistical task] and SAS issues an error and reports that my correlation matrix is not positive definite. What is going on and how can I complete [the task]?

The statistical task varies, but one place where this problem occurs is in simulating multivariate normal data. I have previously written about why an estimated matrix of pairwise correlations is not always a valid correlation matrix. This article discusses what to do about it. The material in this article is taken from my forthcoming book, Simulating Data with SAS

Two techniques are frequently used when an estimated correlation matrix is not positive definite. One is known as the "shrinkage method" (see Ledoit and Wolf (2004) or Schafer and Strimmer (2005)) and the other is known as the "projection method" (see Higham (2002)). This article describes Higham's projection technique for correlation matrices. (This method also applies to a covariance matrix, since you can convert a covariance matrix to a correlation matrix, and vice versa.)

The basic idea is this: You start with ANY symmetric matrix. You then iteratively project it onto (1) the space of positive semidefinite matrices, and (2) the space of matrices with ones on the diagonal. Higham (2002) shows that this iteration converges to the positive semidefinite correlation matrix that is closest to the original matrix (in a matrix norm).

You don't need to understand the details of this algorithm in order to use it. Just define the following SAS/IML functions, and then pass your estimated correlation matrix to the NearestCorr function. This is a good algorithm to use when you do not require any structure in the final correlation matrix.

Step 1: Define SAS/IML functions that project a matrix onto the nearest positive definite matrix

The following SAS/IML functions implement Higham's algorithm for computing the nearest correlation matrix to a given symmetric matrix.

proc iml;
/* Project symmetric X onto S={positive semidefinite matrices}.
   Replace any negative eigenvalues of X with zero */
start ProjS(X);
   call eigen(D, Q, X); /* note X = Q*D*Q` */
   V = choose(D>0, D, 0);
   W = Q#sqrt(V`);      /* form Q*diag(V)*Q` */
   return( W*W` );      /* W*W` = Q*diag(V)*Q` */
/* project square X onto U={matrices with unit diagonal}.
   Return X with the diagonal elements replaced by ones. */
start ProjU(X);
   n = nrow(X);
   Y = X;
   diagIdx = do(1, n*n, n+1);
   Y[diagIdx] = 1;      /* set diagonal elements to 1 */
   return ( Y );
/* Helper function: the infinity norm is the max abs value of the row sums */
start MatInfNorm(A);
   return( max(abs(A[,+])) );
/* Given a symmetric correlation matrix, A, 
   project A onto the space of positive semidefinite matrices.
   The function uses the algorithm of Higham (2002) to return
   the matrix X that is closest to A in the Frobenius norm. */
start NearestCorr(A);
   maxIter = 100; tol = 1e-8;  /* can change these */
   iter = 1;      maxd   = 1;  /* initial values */ 
   Yold = A;  Xold = A;  dS = 0;
   do while( (iter <= maxIter) & (maxd > tol) );
     R = Yold - dS; /* dS is Dykstra's correction */
     X = ProjS(R);  /* project onto S={positive semidefinite} */
     dS = X - R;
     Y = ProjU(X);  /* project onto U={matrices with unit diagonal} */
     /* How much has X changed? (Eqn 4.1) */
     dx = MatInfNorm(X-Xold) / MatInfNorm(X);
     dy = MatInfNorm(Y-Yold) / MatInfNorm(Y);
     dxy = MatInfNorm(Y - X) / MatInfNorm(Y);
     maxd = max(dx,dy,dxy);
     iter = iter + 1;
     Xold = X; Yold = Y; /* update matrices */
   return( X ); /* X is positive semidefinite */

Step 2: Compute the nearest correlation matrix

The following matrix, A, is not positive definite, as you can show by using the EIGVAL function. The matrix is passed to the NearestCorr function, which returns a matrix, B, which is a valid correlation matrix:

/* symmetric matrix, but not positive definite */
A = {1.0  0.99 0.35, 
     0.99 1.0  0.80, 
     0.35 0.80 1.0} ;
B = NearestCorr(A);
print B;

You can see that several off-diagonal elements of A were too large. The (1,2) and (2,3) elements of B are smaller than the corresponding elements of A.

Step 3: Use the positive definite matrix in your algorithm

If B is an acceptable alternative to A, you can use the B matrix instead of A. For example, if you are trying to simulate random multivariate normal data, you must use a positive definite matrix. To simulate 1,000 random trivariate observations, you can use the following function:

mean = {-2,4,0};
x = RandNormal(1000, mean, B);

What if there are still problems?

For large matrices (say, more than 100 rows and columns), you might discover that the the numerical computation of the eigenvalues of B is subject to numerical rounding errors. In other words, if you compute the numerical eigenvalues of B, there might be some very small negative eigenvalues, such as -1E-14. If this interferes with your statistical method and SAS still complains that the matrix is not positive definite, then you can bump up the eigenvalues by adding a small multiple of the identity to B, such as B2 = ε I + B, where ε is a small positive value chosen so that all eigenvalues of B2 are positive. Of course, B2 is not a correlation matrix, since it does not have ones on the diagonal, so you need to convert it to a correlation matrix. It turns out that this combined operation is equivalent to dividing the off-diagonal elements of B by 1+ε, as follows:

/* for large matrices, might need to correct for numerical rounding errors */
epsilon = 1e-10;
B2 = ProjU( B/(1+epsilon) );  /* divide off-diagonal elements by 1+epsilon */

How fast is Higham's algorithm?

Because each iteration of Higham's algorithm requires an eigenvector decomposition, Higham's algorithm will not be blazingly fast for huge matrices. Still, the performance is not too bad. On my desktop machine from 2010, the algorithm can project a 200 x 200 matrix in less than two seconds. You can use the TIME function to time a SAS/IML computation:

c = ranuni(j(200,200));   /* random matrix */
c = (c+c`)/2;             /* symmetric matrix */
t0 = time();              /* begin timing */
   R = NearestCorr(c);
ElapsedTime = time()-t0;  /* elapsed time, in seconds */
print ElapsedTime;
tags: Matrix Computations, Statistical Programming

Please comment on the article here: The DO Loop

Tags: , ,