Bertrand Russell goes to the IRB

February 28, 2015

Jonathan Falk points me to this genius idea from Eric Crampton: Here’s a fun one for those of you still based at a university. All of you put together a Human Ethics Review proposal for a field experiment on Human Ethics Review proposals. Here is the proposal within my proposal. Each of you would propose […]

Career NBA: The Road Least Traveled

February 27, 2015

The bell rings - time to go to practice. Jarnell Stokes heads over to the gym, changes, and starts warming up with his teammates. It's his junior year in high school. The Memphis, Tennessee native has a lot on his mind; soon he'll have to mak...

William Shakespeare (1) vs. Karl Marx

February 27, 2015

For yesterday’s winner, I’ll follow the reasoning of Manuel in comments: Popper. We would learn more from falsifying the hypothesis that Popper’s talk is boring than we would learn from falsifying the hypothesis that Richard Pryor’s talk is uninteresting. And today we have the consensus choice for greatest writer vs. the notorious political philosopher. […]

Does Balancing Classes Improve Classifier Performance?

February 27, 2015

It’s a folk theorem I sometimes hear from colleagues and clients: that you must balance the class prevalence before training a classifier. Certainly, I believe that classification tends to be easier when the classes are nearly balanced, especially when the class you are actually interested in is the rarer one. But I have always been …

Big Data Is The New Phrenology?

February 27, 2015

Originally posted on mathbabe: Have you ever heard of phrenology? It was, once upon a time, the “science” of measuring someone’s skull to understand their intellectual capabilities. This sounds totally idiotic but was a huge fucking deal in the mid-1800s, and really didn’t stop getting some credit until much later. I know that because I…

“The harm done by tests of significance” (article from 1994 in the journal, “Accident Analysis and Prevention”)

February 27, 2015

Ezra Hauer writes: In your January 2013 Commentary (Epidemiology) you say that “…misunderstanding persists even in high-stakes settings.” Attached is an older paper illustrating some such. “It is like trying to sink a battleship by...

Plotting multiple time series in SAS/IML (Wide to Long, Part 2)

February 27, 2015

I recently wrote about how to overlay multiple curves on a single graph by reshaping wide data (with many variables) into long data (with a grouping variable). The implementation used PROC TRANSPOSE, which is a procedure in Base SAS. When you program in the SAS/IML language, you might encounter data […]
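As a rough illustration of the wide-to-long idea described in this excerpt, here is a minimal sketch in plain Python rather than PROC TRANSPOSE or SAS/IML. The function name `wide_to_long`, the column names `t`, `y1`, `y2`, and the sample values are all hypothetical and chosen only for the example.

```python
def wide_to_long(rows, id_var, value_vars):
    """Reshape wide records (one column per series) into long records.

    Each output record holds the id value, the name of the series
    (the grouping variable), and the single value for that series.
    """
    long_rows = []
    for row in rows:
        for name in value_vars:
            long_rows.append(
                {id_var: row[id_var], "series": name, "value": row[name]}
            )
    return long_rows


# Hypothetical wide data: one time column and two series columns.
wide = [
    {"t": 1, "y1": 10.0, "y2": 20.0},
    {"t": 2, "y1": 11.0, "y2": 19.0},
]

long_data = wide_to_long(wide, "t", ["y1", "y2"])
for record in long_data:
    print(record)
```

In the long form, a plotting routine can draw one curve per distinct value of `series`, which is what makes overlaying several curves on one graph straightforward.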

Using and Abusing Data Visualization: Anscombe’s Quartet and Cheating Bonferroni

February 26, 2015

Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were constructed in 1973 by the statistician Francis Ansco...
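The "nearly identical simple statistical properties" can be checked directly. The sketch below is not taken from the post: the data values are typed in from the commonly reproduced table of Anscombe's 1973 quartet (an assumption worth verifying against a trusted copy), and the helper `pearson` is written here for the example.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet as commonly tabulated (assumed, not from the post).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


# Mean and sample variance of x and y, plus correlation, per dataset.
summary = {
    name: (mean(xs), variance(xs), mean(ys), variance(ys), pearson(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, vx, my, vy, r) in summary.items():
    print(f"{name:>3}: mean_x={mx:.2f} var_x={vx:.2f} "
          f"mean_y={my:.2f} var_y={vy:.2f} r={r:.3f}")
```

The four rows of output agree to about two decimal places, which is the point of the quartet: the summaries alone cannot distinguish datasets that look very different when plotted.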

Data are from the Past

February 26, 2015

There’s a lot of discussion and also big hope about what is called Big Data and the role of Data …

Richard Pryor (1) vs. Karl Popper

February 26, 2015

The top-seeded comedian vs. an unseeded philosopher. Pryor would be much more entertaining, that’s for sure (“Arizona State Penitentiary population: 80 percent black people. But there are no black people in Arizona!”). But Karl Popper laid out the philosophy that is the foundation for modern science. His talk, even if it is dry, might ultimately […]

R: How to Layout and Design an Infographic

February 26, 2015

As promised in my recent article, here's my tutorial on how to lay out and design an infographic in R. This article will serve as a template for more infographic designs that I plan to share in future posts. Hence, we will go through the following sect...

Psych journal bans significance tests; stat blogger inundated with emails

February 26, 2015

OK, it’s been a busy email day. From Brandon Nakawaki: I know your blog is perpetually backlogged by a few months, but I thought I’d forward this to you in case it hadn’t hit your inbox yet. A journal called Basic and Applied Social Psychology is banning null hypothesis significance testing in favor of descriptive […]

"Is the call to abandon p-values the red herring of the replicability crisis?"

February 26, 2015

In an opinion article [here] titled "Is the call to abandon p-values the red herring of the replicability crisis?", Victoria Savalei and Elizabeth Dunn concluded, "at present we lack empirical evidence that encouraging researchers to abandon p-values ...

Announcing: Introduction to Data Science video course

February 25, 2015

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data science projects and become a data …

3 YEARS AGO: (FEBRUARY 2012) MEMORY LANE

February 25, 2015

MONTHLY MEMORY LANE: 3 years ago: February 2012. I am to mark in red three posts (or units) that seem most apt for general background on key issues in this blog. Given our Fisher reblogs, we’ve already seen many this month. So, I’m marking in red (1) The Triad, and (2) the Unit on Spanos’ misspecification tests. Please see those posts for […]

Abraham (4) vs. Jane Austen

February 25, 2015

Yesterday’s is a super-tough call. I’d much rather hear Stewart Lee than Aristotle. I read one of Lee’s books, and he’s a fascinating explicator of performance. Lee gives off a charming David Owen vibe (Phil, you know what I’m saying here): he’s an everyman, nothing special, he’s just been thinking really hard lately and wants to share […]

Link: The Graphic Continuum

February 25, 2015

The Graphic Continuum is a poster created by Jon Schwabish and Severino Ribecca (the man behind the Data Visualisation Catalogue). It lists almost 90 different chart types and organizes them into six large groups: distribution, time, comparing categories, geospatial, part-to-whole, and relationships. Some of them are connected across groups where there are further similarities. The poster is printed very nicely and …

The axes are labeled but I don’t know what the dots represent.

February 25, 2015

John Sukup writes: I came across a chart recently posted by Boston Consulting Group on LinkedIn and wondered what your take on it was. To me, it seems to fall into the “suspicious” category but thought you may have a different opinion. I replied that this one baffles me cos I don’t know what the […]

Composite ranking and numbersense

February 25, 2015

Chapter 1 of Numbersense (link) uses the example of U.S. News ranking of law schools to explore the national pastime of ranking almost anything. Since there is no objective standard for the "correct" ranking, it is pointless to complain about "arbitrary" weighting and so on. Every replacement has its own assumptions. A more productive path forward is to understand how the composite ranking is created, and shine a light on the…
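To make the point about weighting concrete, here is a small sketch of a composite ranking built as a weighted sum of component scores. Everything in it is hypothetical: the school names, the two metrics, the scores, and the weights are invented for illustration and have nothing to do with the actual U.S. News formula.

```python
def composite_rank(scores, weights):
    """Rank items by a weighted sum of their component scores (best first)."""
    totals = {
        name: sum(weights[metric] * value for metric, value in metrics.items())
        for name, metrics in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical component scores on a 0-1 scale.
scores = {
    "School A": {"reputation": 0.9, "placement": 0.5},
    "School B": {"reputation": 0.6, "placement": 0.9},
    "School C": {"reputation": 0.7, "placement": 0.7},
}

# Two equally defensible weightings of the same components.
rank_reputation_heavy = composite_rank(
    scores, {"reputation": 0.8, "placement": 0.2}
)
rank_placement_heavy = composite_rank(
    scores, {"reputation": 0.2, "placement": 0.8}
)

print(rank_reputation_heavy)
print(rank_placement_heavy)
```

With these made-up numbers, the first and last places swap when the weights change, even though the underlying scores are identical. Neither ordering is "correct"; each reflects the assumptions baked into its weights.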

Plotting multiple series: Transforming data from wide to long

February 25, 2015

Data. To a statistician, data are the observed values. To a SAS programmer, analyzing data requires knowledge of the values and how the data are arranged in a data set. Sometimes the data are in a "wide form" in which there are many variables. However, to perform a certain analysis […]

Upcoming talk on survival analysis in Python

February 24, 2015

On March 2, 2015 I am presenting a short talk for the Python Data Science meetup. Here is the announcement for the meetup. And here are my slides: The code for the talk is in an IPython notebook you can view on nbviewer. It is still a work in...

Visualizing Clusters

February 24, 2015

Consider the following dataset, with (only) ten points:

    x=c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
    y=c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
    plot(x,y,pch=19,cex=2)

We want to get – say – two clusters. Or more specifically, two sets of observations, each of them sharing some similarities. Since the number of observations is rather small, it is actually possible to get an exhaustive list of all partitions, and to minimize some criterion, such as the within-cluster variance. Given a vector with clusters, we compute…
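The exhaustive search described in this excerpt can be sketched as follows. This is a Python illustration, not the post's R code: the function names are invented here, and the criterion is assumed to be the total within-cluster sum of squared distances to each cluster centroid, which may differ in detail from what the full post computes.

```python
from itertools import product

# The ten points from the excerpt.
x = [.4, .55, .65, .9, .1, .35, .5, .15, .2, .85]
y = [.85, .95, .8, .87, .5, .55, .5, .2, .1, .3]
points = list(zip(x, y))


def within_ss(cluster):
    """Sum of squared distances from each point to its cluster centroid."""
    n = len(cluster)
    cx = sum(p[0] for p in cluster) / n
    cy = sum(p[1] for p in cluster) / n
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in cluster)


def all_two_partitions(n):
    """Yield a label tuple for every split of n items into two non-empty groups.

    Item 0 is pinned to group 0 so that each partition appears exactly once.
    """
    for rest in product((0, 1), repeat=n - 1):
        if any(rest):  # skip the case where group 1 would be empty
            yield (0,) + rest


def best_partition(pts):
    """Return the labels and cost of the split with the smallest within SS."""
    best_labels, best_cost = None, float("inf")
    for labels in all_two_partitions(len(pts)):
        cost = sum(
            within_ss([p for p, g in zip(pts, labels) if g == k])
            for k in (0, 1)
        )
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


labels, cost = best_partition(points)
print(labels, round(cost, 4))
```

With ten points there are only 2^9 - 1 = 511 distinct two-cluster partitions, so brute force is instant. The count roughly doubles with every added point, which is why this approach only works for very small datasets.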

Finding the best dose

February 24, 2015

In a dose-finding clinical trial, you have a small number of doses to test, and you hope to find the one with the best response. Here “best” may mean most effective, least toxic, closest to a target toxicity, some combination of criteria, etc. Since your goal is to find the best dose, it seems natural to compare dose-finding […]

