Blog Archives

Fast SQL moving average calculation without windowing functions

May 11, 2015

In this post, I show a very fast trick for calculating moving averages; it can be extended to other operations that require windowing functions. SAS analysts often need to calculate moving averages, and there are several options by the or...

sklearn DecisionTree plot example needs pydotplus

April 26, 2015

In Python, the scikit-learn (sklearn) DecisionTree example uses pydot to plot the generated tree: @here. But under Python 3, pydot has some issues with the string from dot_data.getvalue(); for example, it will report "TypeError: startswith first arg mus...

Migrating code pieces to GitHub

February 5, 2015

One of the original reasons for this blog was to keep track of my SAS code along with its relevant context. That was my mindset when I was a SAS analyst, but now, working at a professional software company, using the right tool for versioning, col...

%SVD macro with BY-Processing

December 18, 2014

For the Regularized Discriminant Analysis Cross Validation, we need to compute the SVD for each pair \((\lambda, \gamma)\), and the factorization result will be fed to the downdating algorithm to obtain the leave-one-out variance-covariance matrix \(\hat{\...

Experiment downdating algorithm for Leave-One-Out CV in RDA

December 15, 2014

In this post, I want to demonstrate a piece of experimental code for the downdating algorithm for Leave-One-Out (LOO) Cross Validation in Regularized Discriminant Analysis [1]. In LOO CV, the program needs to calculate the inverse of \(\hat{\Sigma}_{k\v}(\la...

Control Excel via SAS DDE & Python win32com

December 15, 2014

Excel is probably the most widely used interface between humans and data. Whenever you are dealing with business people, Excel is the de facto tool for all things data processing. I used to use only SAS and Python for number crunching, but in one of my r...

%HPGLIMMIX SAS macro is available online at JSS website

July 1, 2014

My paper "%HPGLIMMIX: A High-Performance SAS Macro for GLMM Estimation" is now available on the Journal of Statistical Software website @here. The SAS macro and code can also be found there. If you use it, please kindly send me an email so that I know my work i...

Market trend in advanced analytics for SAS, R and Python

December 6, 2013

Disclaimer: This study looks at market trends in the demand for, and adoption of, advanced analytics software from a job market perspective, and should not be read as a conclusive statement on everything that is happening there. The findings should...

I don’t always do regression, but when I do, I do it in SAS 9.4

July 19, 2013

There are several exciting add-ins in the SAS Analytics products running on v9.4, especially the SAS/STAT high performance procedures, where "high performance" refers to running in either single-machine multi-threaded mode or fully distributed mode. HPGENSE...

Finding the closest pair in data using PROC MODECLUS

May 9, 2013

UPDATE: Rick Wicklin kindly shared his visualization of the output, which gives a more intuitive sense of the results. Thanks. Here is the code, to be run after my code below. Note that it is designed for K=2. proc iml; use out; ...
