A few days ago, I was asked whether we should spend a lot of time choosing the distribution we use, in GLMs, for (actuarial) ratemaking. On that topic, I usually claim that the family is not the most important parameter in the regression model. Consider the following dataset
> db <- data.frame(x=c(1,2,3,4,5),y=c(1,2,4,2,6))
> plot(db,xlim=c(0,6),ylim=c(-1,8),pch=19)
To visualize a regression model, use the following code
> nd=data.frame(x=seq(0,6,by=.1))
> add_predict = function(reg){…

I just read this charming article by Lee Wilkinson’s brother on a mathematician named Yitang Zhang. Zhang recently gained some fame after proving a difficult theorem, and he seems to be quite an unusual, but likable, guy. What I liked about Wilkinson’s article is how it captured Zhang’s eccentricities with affection but without condescension. […] The post Eccentric mathematician appeared first on Statistical Modeling, Causal Inference, and Social Science.

Sometimes different communities use the same name for different objects. To a soldier, "boots" are rugged, heavy, high-top foot coverings. To a soccer (football) player, "boots" are lightweight cleats. So it is with the term "waterfall plot." To researchers in the medical field, a "waterfall plot" is a sorted bar […] The post Create a cascade chart in SAS appeared first on The DO Loop.

Last week, I had the pleasure of attending the CHI 2015 conference in Seoul, South Korea. CHI technically stands for Computer-Human Interaction, but it has become a name rather than an acronym in recent years. CHI’s scope is also very broad: it covers many areas that are not strictly part of HCI (Human-Computer Interaction – … Continue reading Conference Report: CHI 2015

Dylan Small writes: The conference will take place May 20-21 (with a short course on May 19th) and the web site for the conference is here. The deadline for submitting a poster title for the poster session is this Friday. Junior researchers (graduate students, postdoctoral fellows, and assistant professors) whose poster demonstrates exceptional research will […] The post This year’s Atlantic Causal Inference Conference: 20-21 May appeared first on Statistical…

In Python, sklearn (scikit-learn)'s DecisionTree example uses pydot for plotting the generated tree: @here. But for Python 3, pydot has some issues with the string from dot_data.getvalue(); for example, it will report "TypeError: startswith first arg mus...
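The kind of TypeError quoted in this excerpt typically appears in Python 3 when bytes and str are mixed in a call to startswith. The sketch below reproduces that general class of error using only the standard library. It is not pydot's actual code: the DOT text, the variable names, and the helper starts_with_digraph are all hypothetical, and the assumption that the pydot problem comes from a bytes/str mismatch is a guess based on the truncated message.

```python
import io

def starts_with_digraph(data):
    """Return True if `data` (str or bytes) begins with 'digraph'.

    Hypothetical helper: it normalizes bytes to str before comparing,
    which avoids the bytes/str mismatch shown below.
    """
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return data.startswith("digraph")

# Stand-in for the DOT source a tree exporter would write to a buffer
# (assumed content, not real scikit-learn output).
dot_data = io.StringIO()
dot_data.write("digraph Tree { }")

as_str = dot_data.getvalue()       # str in Python 3
as_bytes = as_str.encode("utf-8")  # bytes, as some code paths may produce

# Mixing the two types raises TypeError in Python 3.
try:
    as_bytes.startswith("digraph")
    mixed_ok = True
except TypeError as err:
    mixed_ok = False
    error_message = str(err)

print(mixed_ok)
print(starts_with_digraph(as_str), starts_with_digraph(as_bytes))
```

Under that assumption, decoding bytes to str (or encoding str to bytes) before the comparison is the usual remedy; whether it applies to a given pydot version would need to be checked against that version's source.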

We've just arxived our paper on efficient computation for the Expected Value of Partial Perfect Information (EVPPI) based on SPDE-INLA. The EVPPI is a decision-theoretic measure of the impact of uncertainty in some of the parameters in a mode...

This is an oldie but a goodie. Donna Towns writes: I am wondering if you could help me solve an ongoing debate? My colleagues and I are discussing (disagreeing) on the ability of a researcher to analyze information on a population. My colleagues are sure that a researcher is unable to perform statistical analysis on […] The post Statistical analysis on a dataset that consists of a population appeared first…

Another powerful SAS procedure, and my favorite one, that I would like to share is PROC IML (Interactive Matrix Language). This procedure treats all objects as matrices and is very useful for scientific computations involving vectors and matrices. To get started, we are going to demonstrate and discuss the following: Creating and Shaping Matrices; Matrix Query; Subscripts; Descriptive Statistics; Set Operations; Probability Functions and Subroutine; Linear Algebra; Reading and Creating Data; Above outline is based…