Blog Archives

Functional Principal Component Analysis

December 20, 2014

In mathematics, a general principle is to move from studying an object itself to studying the relationships between objects. In functional data analysis, the most important tool for studying the object itself, i.e. one functional data set, is functional principal component analysis (FPCA). And for the study […]

Degrees of Freedom and Information Criteria

December 4, 2014

Degrees of freedom and information criteria are two fundamental concepts in statistical modeling, and both are taught in introductory statistics courses. But what are their exact abstract definitions, from which specific formulas can be derived in different situations? I often use fit criteria like AIC and BIC to choose between models. […]

Useful for referring—12-04-2014

December 4, 2014

Tutorial: How to detect spurious correlations, and how to find the …
Practical illustration of Map-Reduce (Hadoop-style), on real data
Jackknife logistic and linear regression for clustering and predict…
From the trenches: 360-degrees data science
A synthetic variance designed for Hadoop and big data
Fast Combinatorial Feature Selection with New Definition of Predict…
A […]

Factor Analysis vs Principal Component Analysis

November 22, 2014

Recently, several papers discussed in our journal club have focused on integrative clustering of multiple omics data sets. I found that they all originate from factor analysis and exploit its advantages over principal component analysis. Let’s recall the model for factor analysis: where () and , with mean and […]

EM algorithm revisited

November 20, 2014

This Tuesday, Professor Xuming He presented their recent work on subgroup analysis, which is very interesting and useful in practice. Consider the following very practical problem (practical because the drug is expensive or has some side effects): if you are given the drug response, some baseline covariates which have nothing to […]

Empirical Likelihood meets Bayesian Analysis

November 18, 2014

The core idea of Empirical Likelihood (EL) is to use a maximum-entropy discrete distribution supported on the data points and constrained by estimating equations related to the parameters of interest. As such, it is a non-parametric approach in the sense that the distribution of the data does not need to be specified, only some of […]

Multiple Linear Regression Revisited

November 10, 2014

Last night, I had a discussion with a friend about integrative data analysis (closely related to the discussion of the AOAS 2014 paper from Dr. Xihong Lin’s group and the JASA 2014 paper from Dr. Hongzhe Li’s group). If a biologist gave you genetic variant (e.g. SNP) data and phenotype (e.g. some trait) data, […]

p-value vs Bayes

September 30, 2014

The p-value and Bayes are the two hottest terms in statistics. I still do not understand why the debate between frequentist and Bayesian statistics has lasted so long. What are the essential arguments behind it? (Can anyone help me with this?) In my view, they are just two ways of solving […]

It’s time for job application now!

September 29, 2014

I collected the following series on applying for faculty positions in 2011, when I was in the second year of my PhD. Now it’s my turn to apply for jobs, so I am sharing these useful materials with everyone who wants to apply this year. Applying for Jobs: Application Materials Applying for Jobs : […]

Useful for referring—9-11-2014

September 12, 2014

Some R Resources for GLMs
Statistical data analysis in search and rescue after lost contact
The gap between data mining and predictive models
Data Mining, machine learning and statistics.
useR! 2014 is underway with 16 tutorials
What is Scalable Machine Learning?
rlist: processing non-relational data in R based on lists
The perfect candidate
The Leek group guide to giving talks
38 Seminal Articles Every Data Scientist Should Read
Deep Learning – important […]
