I'll be speaking at the NYU Bookstore on Oct 8 (next Tuesday), 6-7:30 pm. See here. On Oct 9 (Wed), I'll be speaking at the Princeton Tech Meetup. The meeting starts at 7; my talk starts at 8. Details here.

First, let me start off by saying I'm a classical statistics and p-value apologist--I think it's the cat's pajamas. It was my (and many others') first introduction to statistics. So, in spite of my being a card-carrying member of The Bayesian Conspiracy, there will always be a place in my heart (grinch-sized though it is) […]

The Division of Biostatistics and Epidemiology at Weill Medical College of Cornell University invites applications for a post-doctoral fellow position in Biostatistics. We are seeking a highly motivated individual to develop novel statistical methods for missing data imputation and applications in health disparities research using national administrative data, with funding from Agency of Healthcare Research […] The post Bayes alert! Cool postdoc position here on missing data imputation and applications in…

Massive open online courses (MOOCs) are all the rage today. Some people see free online courses as a convenient way to introduce statistical concepts to tens of thousands of students who would not otherwise have an opportunity to learn about data analysis. Whereas 2013 is the International Year of Statistics, [...]

There are many kinds of intervals in statistics. To name a few of the common ones: confidence intervals, prediction intervals, credible intervals, and tolerance intervals. Each is useful and serves its own purpose. I’ve recently been working on a couple of projects that involve making predictions from a regression model and I’ve been doing some […]

The hype surrounding "Big Data" has escalated to borderline nauseating. Is it just a sham? Yes, I know, I have earlier gushed about the wonders of Big Data. But that was then, and now is now, and I hear my inner contrarian alarm sounding. One thing ...

Milan Valasek writes: Psychology students (and probably students in other disciplines) are often taught that in order to perform ‘parametric’ tests, e.g. independent t-test, the data for each group need to be normally distributed. However, in the literature (and various university lecture notes and slides accessible online), I have come across at least 4 different interpretations […] The post I’ll say it again appeared first on Statistical Modeling, Causal Inference, and Social…

Introduction: In the previous post I showed that it is possible to couple parallel tempered MCMC chains in order to improve mixing. Such methods can be used when the target of interest is a Bayesian posterior distribution that is difficult to sample. There are (at least) a couple of obvious ways that one can temper […]

On page 8 of Numbersense (link), I wrote: Web logs are a messy, messy world. If two vendors are deployed to analyze traffic on the same website, it is guaranteed that their statistics would not reconcile, and the gap can be as high as 20 or 30 percent. Insiders will nod their heads; for those who aren’t familiar with Web data, take a look at this recent post on The…

When I was a kid I took a writing class, and one of the assignments was to write a 1-to-2 page story. I can’t remember what I wrote, but I do remember the following story from one of the other kids. In its entirety: I snuck into this pay toilet and I can’t get out! […] The post Using the aggregate of the outcome variable as a group-level predictor in a…

Originally posted on The Political Methodologist: In a prior post on my personal blog, I argued that it is misleading to label matching procedures as causal inference procedures (in the Neyman-Rubin sense of the term). My basic argument was that the causal quality of these inferences depends on untested (and in some cases untestable) assumptions…

To get back to a question asked after the last course (still on non-life insurance), I will spend some time discussing ROC curve construction and interpretation. Consider the dataset we used last week,
> db = read.table("http://freakonometrics.free.fr/db.txt",header=TRUE,sep=";")
> attach(db)
The first step is to get a model. For instance, a logistic regression, where some factors were merged together,
> X3bis=rep(NA,length(X3))
> X3bis[X3%in%c("A","C","D")]="ACD"
> X3bis[X3%in%c("B","E")]="BE"
> db$X3bis=as.factor(X3bis)
> reg=glm(Y~X1+X2+X3bis,family=binomial,data=db)…
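
Since the excerpt stops before the curve itself is built, here is a rough sketch of how an ROC curve can be constructed from a model's predicted scores. It is written in Python rather than the post's R, it is not the author's code, and the function names (roc_points, auc) and the toy labels and scores are made up for illustration. The idea is to lower the classification threshold one score at a time and record the false positive rate and true positive rate at each step.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), from (0, 0) to (1, 1).

    labels: 0/1 outcomes; scores: predicted probabilities (higher = more
    likely to be 1). Both names are hypothetical, chosen for this sketch.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Visit observations from the highest score to the lowest, which is
    # equivalent to lowering the threshold step by step.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all tied scores have been counted.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (made up): three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_points(labels, scores)
area = auc(points)
```

With a fitted logistic regression, the scores would be the fitted probabilities and the labels the observed outcome; the area under the curve equals the share of (positive, negative) pairs in which the positive case receives the higher score.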

Someone sent me the following email: I am an environmental journalist writing an Environmental Science 101 textbook and I’m currently working on the section on hypothesis testing and statistical significance. I am searching for a story to make the importance of thinking statistically come alive for the students, ideally one from the environmental sciences. I’m […] The post Query from a textbook author – looking for stories to tell to undergrads…