Statistical modeling: two ways to see the world.

January 20, 2014

This is a machine-learning-vs-traditional-statistics kind of blog post inspired by Leo Breiman's "Statistical Modeling: The Two Cultures". If you're like: "I had enough of this machine learning vs. statistics discussion, BUT I would love to see beau...

Rectangular Integration (a.k.a. The Midpoint Rule) – Conceptual Foundations and a Statistical Application in R

Introduction: Continuing the recently started series on numerical integration, this post introduces rectangular integration. I will describe the concept behind rectangular integration, show an R function that implements it, and use it to check that the distribution actually integrates to 1 over its support set. This post follows from my […]
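
The idea in the excerpt above can be sketched in a few lines. This is not the post's R function, which is not shown here; it is a minimal Python sketch of the midpoint rule. The density used for the check, a Beta(2, 3) density on [0, 1], is an assumed stand-in, since the excerpt does not say which distribution the post checks. The names midpoint_rule and beta_2_3_pdf are hypothetical.

```python
def midpoint_rule(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    # Each rectangle's height is f evaluated at the midpoint of its base.
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


def beta_2_3_pdf(x):
    """Assumed example density: Beta(2, 3), i.e. 12 * x * (1 - x)^2 on [0, 1]."""
    return 12.0 * x * (1.0 - x) ** 2


# Integrate the density over its whole support; the result should be close to 1.
total = midpoint_rule(beta_2_3_pdf, 0.0, 1.0, n=1000)
print(total)
```

Increasing n shrinks the rectangles and tightens the approximation; for a smooth integrand the error of the midpoint rule falls roughly with the square of the rectangle width.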

The AAA Tranche of Subprime Science

January 20, 2014

In our new ethics column for Chance, Eric Loken and I write about our current favorite topic: One of our ongoing themes when discussing scientific ethics is the central role of statistics in recognizing and communicating uncertainty. Unfortunately, statistics, and the scientific process more generally, often seems to be used more as a way of laundering […] The post The AAA Tranche of Subprime Science appeared first on Statistical Modeling, Causal Inference,…

Statistics meets rhetoric: A text analysis of "I Have a Dream" in R

January 20, 2014

Today, we celebrate the would-be 85th birthday of Martin Luther King, Jr., a man remembered for pioneering the civil rights movement through his courage, moral leadership, and oratory prowess. This post focuses on his most famous speech, I Have a Drea...

Peer Review, Part 1: Quilt Plots

January 20, 2014

What is peer review? How does it work? And is it really as flawed as people claim it is? In this little series, I will talk about all that, and then end up arguing that peer review does, in fact, work – at least in visualization. But first an example where it didn’t. A paper […]

Sunday data/statistics link roundup (1/19/2014)

January 20, 2014

Tesla is hiring a data scientist. That is all. I'm not sure I buy the idea that Python is taking over for R among people who actually do regular data science. I think it is still context dependent. A huge …

Faire parler les chiffres… n’importe comment (Making the numbers talk… any which way)

January 19, 2014

This weekend, Martin Grandjean posted an interesting piece on his blog about the use of statistics (for propaganda purposes). The exercise is not new, but Martin raises questions that are, unfortunately, important and complex. In a paragraph titled “faire parler les chiffres… n’importe comment” (“making the numbers talk… any which way”, which I borrowed as my title; I admit I hesitated over “with great power comes great responsibility”), there is a (quick) analysis of a chart, shown below.…

“The British amateur who debunked the mathematics of happiness”

January 19, 2014

Andrew Anthony tells the excellent story of how Nick Brown, Alan Sokal, and Harris Friedman shot down some particularly silly work in psychology. (“According to the graph, it all came down to a specific ratio of positive emotions to negative emotions. If your ratio was greater than 2.9013 positive emotions to 1 negative emotion you […] The post “The British amateur who debunked the mathematics of happiness” appeared first on Statistical…

Use standard deviation (not mad about MAD)

January 19, 2014

Nassim Nicholas Taleb recently wrote an article advocating abandoning standard deviation in favor of mean absolute deviation. Mean absolute deviation is indeed an interesting and useful measure, but there is a reason that standard deviation is important even if you do not like it: it prefers models that […]

The Myth of Random Sampling

January 19, 2014

I feel a slight quiver of trepidation as I begin this post – a little like the boy who pointed out that the emperor has no clothes. Random sampling is a myth. Practical researchers know this and deal with it. …

Hopper – new in the travel space

January 19, 2014

Briefly - Hopper is something new in the travel / local space. In their own words: What if you could plan an amazing trip based on a vague idea — like “spring surfing in California” or “Mediterranean cruise”? What if...

What is volatility?

January 19, 2014

Some facts and some speculation. Definition: Volatility is the annualized standard deviation of returns; it is often expressed in percent. A volatility of 20 means that there is about a one-third probability that an asset’s price a year from now will have fallen or risen by more than 20% from its present value. In …
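
The definition above can be made concrete with a small sketch. It rests on assumptions the excerpt does not state: 252 trading periods per year, square-root-of-time scaling of the per-period standard deviation, and normally distributed returns for the "about one-third" figure. The function names and the sample returns are hypothetical and purely illustrative.

```python
import math
import statistics


def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of per-period returns, scaled to one year.

    Assumes independent returns, so the standard deviation grows with the
    square root of the number of periods.
    """
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def prob_beyond_one_sd():
    """Two-sided probability that a normal variable lands more than one
    standard deviation from its mean."""
    return 1.0 - math.erf(1.0 / math.sqrt(2.0))


# Made-up daily returns, only to exercise the function.
sample_returns = [0.01, -0.01, 0.01, -0.01]
vol = annualized_volatility(sample_returns)
tail = prob_beyond_one_sd()
print(f"annualized volatility: {vol:.1%}")
print(f"P(move larger than one SD): {tail:.3f}")
```

Under the normality assumption the tail probability comes out a little under one third, which is where the rule of thumb in the excerpt comes from.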

Transformations for non-normal data

January 19, 2014

Steve Peterson writes: I recently submitted a proposal on applying a Bayesian analysis to gender comparisons on motivational constructs. I had an idea on how to improve the model I used and was hoping you could give me some feedback. The data come from a survey based on 5-point Likert scales. Different constructs are measured […] The post Transformations for non-normal data appeared first on Statistical Modeling, Causal Inference, and Social…

Sir Harold Jeffreys’ (tail area) one-liner: Sat night comedy [draft ii]

January 19, 2014

You might not have thought there could be new material for 2014, but there is, and if you look a bit more closely, you’ll see that it’s actually not Jay Leno who is standing up there at the mike …. It’s Sir Harold Jeffreys himself! And his (very famous) joke, I admit, is funny. So, since […]

Le Monde puzzle [#849]

January 18, 2014

A straightforward Le Monde mathematical puzzle: Find a pair (a,b) of integers such that a has an odd number d of digits, with d larger than 2, and ab is written as \(10^{d+1}+10a+1\). Find the smallest possible values of a and of b. I ran the following R code which produced a=137 (and b=83) as the unique […]
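
The post's R code is not reproduced in this excerpt. As a hedged stand-in, the brute-force search below is a Python sketch that reads the condition as ab = 10^(d+1) + 10a + 1, that is, the product is a written with a digit 1 placed before and after it. That reading is an assumption, though it agrees with the quoted answer, since 137 × 83 = 11371. The function name solve is hypothetical.

```python
def solve(d):
    """Return all pairs (a, b) where a has d digits and
    a * b == 10**(d + 1) + 10 * a + 1."""
    solutions = []
    for a in range(10 ** (d - 1), 10 ** d):
        target = 10 ** (d + 1) + 10 * a + 1
        # b must be an integer, so a has to divide the target exactly.
        if target % a == 0:
            solutions.append((a, target // a))
    return solutions


# The smallest admissible digit count is d = 3 (odd and larger than 2).
three_digit = solve(3)
print(three_digit)  # [(137, 83)]
```

Because a always divides 10a, the divisibility test reduces to a dividing 10^(d+1) + 1; for d = 3 that number is 10001 = 73 × 137, which is why 137 is the only three-digit candidate.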

Measurement and Measurement Error, Weight, Success and Failure

January 18, 2014

This blog currently weighs 200 pounds. It's inscribed in my database, so it must be true. 200 is the latest in a series of daily morning readings wearing the same clothing, at the same time of my day. But how is that 200 measured? And is 200 good or ...

Converting plots to data

January 18, 2014

It is a problem which occurs every so often in applied work: you have a plot, but you want the data. There are at least two programs which can help you there: PlotDigitizer and Engauge Digitizer. I got both on my openSUSE machine. Both are available for...

A course in sample surveys for political science

January 18, 2014

A colleague asked if I had any material for a course in sample surveys. And indeed I do. See here. It’s all the slides for a 14-week course, also the syllabus (“surveyscourse.pdf”), the final exam (“final2012.pdf”) and various misc files. Also more discussion of final exam questions here (keep scrolling thru the “previous entries” until […] The post A course in sample surveys for political science appeared first on Statistical Modeling,…

Machine Learning Lesson of the Day – Cross-Validation

Validation is a good way to assess the predictive accuracy of a supervised learning algorithm, and the rule of thumb of using 70% of the data for training and 30% of the data for validation generally works well. However, what if the data set is not very large, and the small amount of data for […]
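
The excerpt is cut off before it describes the procedure, so the sketch below is an assumption about what follows: k-fold cross-validation, the most common variant, in which every observation is used for validation exactly once. It only produces index splits and fits no model. The function name k_fold_indices and its parameters are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation.

    The n indices are shuffled once and dealt into k folds; each fold
    serves as the validation set in turn while the rest form the training set.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation


# Ten observations split into five folds of two observations each.
splits = list(k_fold_indices(10, 5))
for training, validation in splits:
    print(sorted(validation), "held out;", len(training), "used for training")
```

Averaging the validation error over the k splits uses all of the data for both fitting and assessment, which is the appeal when a single 70/30 split would leave too few observations on either side.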

Metaphors Matter: Factor Structure vs. Correlation Network Maps

January 17, 2014

The psych R package includes a data set called "bfi" with self-report ratings on 25 personality items along a 6-point agreement scale. All the details are provided in the documentation accompanying the package. My focus is how to represent the correlat...

Animated choropleths using animation, ggplot2, rCharts, googleVis and Shiny to visualize violent crime rates in different US States across 5 decades

January 17, 2014

UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1jccIBN. PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. This post uses animated choropleths to visualize violent crime r...

Causality and T-Consistency vs. Correlation and P-Consistency

January 17, 2014

Consider a standard linear regression setting with \(K\) regressors and sample size \(N\). We will say that an estimator \(\hat{\beta}\) is consistent for a treatment effect ("T-consistent") if \(plim \hat{\beta}_k = {\partial E(y|x) }/{\partial x_k}\...
