Tesla is hiring a data scientist. That is all. I'm not sure I buy the idea that Python is taking over for R among people who actually do regular data science. I think it is still context dependent. A huge … Continue reading →

This weekend, Martin Grandjean posted an interesting piece on his blog about the use of statistics (for propaganda purposes). The exercise is not new, but Martin raises questions that are, unfortunately, important and complex. In a paragraph titled “making the numbers talk… any which way” (which I have borrowed as my title; I admit I hesitated over “with great power comes great responsibility”), there is a (quick) analysis of a chart, shown below.…

Andrew Anthony tells the excellent story of how Nick Brown, Alan Sokal, and Harris Friedman shot down some particularly silly work in psychology. (“According to the graph, it all came down to a specific ratio of positive emotions to negative emotions. If your ratio was greater than 2.9013 positive emotions to 1 negative emotion you […] The post “The British amateur who debunked the mathematics of happiness” appeared first on Statistical…

Nassim Nicholas Taleb recently wrote an article arguing that standard deviation should be abandoned in favor of mean absolute deviation. Mean absolute deviation is indeed an interesting and useful measure, but there is a reason that standard deviation is important even if you do not like it: it prefers models that […] Related posts: Don’t use correlation to track prediction performance What does a generalized linear…
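
To make the two measures named in the teaser above concrete, here is a minimal sketch that computes both for the same sample. The helper name `mean_absolute_deviation` and the sample values are illustrative assumptions, not taken from the post, and the sketch does not reproduce the post's (truncated) argument about model preference.

```python
import statistics

def mean_absolute_deviation(xs):
    """Average absolute distance of each value from the sample mean."""
    center = statistics.fmean(xs)
    return statistics.fmean([abs(x - center) for x in xs])

# Illustrative sample (an assumption, chosen so the arithmetic is easy to check).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

sd = statistics.pstdev(sample)          # population standard deviation
mad = mean_absolute_deviation(sample)   # mean absolute deviation

print(f"standard deviation:      {sd}")
print(f"mean absolute deviation: {mad}")
```

Because squaring weights large deviations more heavily, the standard deviation is never smaller than the mean absolute deviation for the same data.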

Some facts and some speculation. Definition: volatility is the annualized standard deviation of returns; it is often expressed in percent. A volatility of 20 means that there is about a one-third probability that an asset’s price a year from now will have fallen or risen by more than 20% from its present value. In … Continue reading →
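
The definition above can be sketched in a few lines. This is an illustration under stated assumptions rather than the post's own code: it assumes daily returns, 252 trading days per year, square-root-of-time scaling (which presumes roughly independent returns), and a normal distribution for the "about one-third" figure. The names `annualized_volatility` and `prob_outside_one_sd` are hypothetical.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # assumption: a common convention, not from the post

def annualized_volatility(daily_returns, periods_per_year=TRADING_DAYS_PER_YEAR):
    """Sample standard deviation of returns, scaled to one year, in percent."""
    daily_sd = statistics.stdev(daily_returns)
    return 100.0 * daily_sd * math.sqrt(periods_per_year)

def prob_outside_one_sd():
    """P(|Z| > 1) for a standard normal Z: the chance of a move beyond one SD."""
    return 1.0 - math.erf(1.0 / math.sqrt(2.0))

# Hypothetical daily returns, used only to exercise the function.
returns = [0.01, -0.01, 0.01, -0.01]
print(f"annualized volatility: {annualized_volatility(returns):.2f}%")
print(f"P(move beyond one SD): {prob_outside_one_sd():.3f}")
```

Under the normality assumption the second number is about 0.317, which is where the "about a one-third probability" in the definition comes from.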

Steve Peterson writes: I recently submitted a proposal on applying a Bayesian analysis to gender comparisons on motivational constructs. I had an idea on how to improve the model I used and was hoping you could give me some feedback. The data come from a survey based on 5-point Likert scales. Different constructs are measured […] The post Transformations for non-normal data appeared first on Statistical Modeling, Causal Inference, and Social…

A colleague asked if I had any material for a course in sample surveys. And indeed I do. See here. It’s all the slides for a 14-week course, also the syllabus (“surveyscourse.pdf”), the final exam (“final2012.pdf”) and various misc files. Also more discussion of final exam questions here (keep scrolling thru the “previous entries” until […] The post A course in sample surveys for political science appeared first on Statistical Modeling,…

UPDATE: The blog/site has moved to GitHub. The new link for the blog/site is patilv.github.io and the link to this post is http://bit.ly/1jccIBN. Please update any bookmarks you may have. This post uses animated choropleths to visualize violent crime r...

Consider a standard linear regression setting with \(K\) regressors and sample size \(N\). We will say that an estimator \(\hat{\beta}\) is consistent for a treatment effect ("T-consistent") if \(\operatorname{plim} \hat{\beta}_k = {\partial E(y|x) }/{\partial x_k}\...

There’s a paradigm in applied statistics that goes something like this: 1. There is a scientific or policy question of some theoretical or practical importance. 2. Researchers gather data on relevant outcomes and perform a statistical analysis, ideally leading to a clear conclusion (p less than 0.05, or a strong posterior distribution, or good predictive […] The post How to think about the statistical evidence when the statistical evidence can’t be…

The simplest experimental design is the completely randomized design with 1 factor. In this design, each experimental unit is randomly assigned to one of the factor levels. This design is most useful for a homogeneous population (one that does not have major differences between any sub-populations). It is appealing because of its simplicity and flexibility – it can […]
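
The random assignment described above might look like the following minimal sketch. It assumes a balanced design (group sizes as equal as possible), which the excerpt does not specify; the function name `completely_randomized_design`, the unit labels, and the level names are all hypothetical.

```python
import random

def completely_randomized_design(units, levels, seed=None):
    """Randomly assign each unit to exactly one level of a single factor.

    Units are shuffled, then dealt to the levels in turn, so group sizes
    differ by at most one (a balance assumption, not from the post).
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {level: [] for level in levels}
    for i, unit in enumerate(shuffled):
        assignment[levels[i % len(levels)]].append(unit)
    return assignment

# Hypothetical example: 12 experimental units, one factor with 3 levels.
units = [f"unit_{i}" for i in range(1, 13)]
levels = ["A", "B", "C"]
design = completely_randomized_design(units, levels, seed=1)
for level, members in design.items():
    print(level, members)
```

Passing a `seed` makes the assignment reproducible, which helps when the randomization has to be documented.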