## Exploratory Data Analysis: 2 Ways of Plotting Empirical Cumulative Distribution Functions in R


Introduction: Continuing my recent series on exploratory data analysis (EDA), and following up on the last post on the conceptual foundations of empirical cumulative distribution functions (CDFs), this post shows how to plot them in R. (Previous posts in this series on EDA include descriptive statistics, box plots, kernel density estimation, and violin plots.) I […]

## Three Ways to Run Bayesian Models in R

June 25, 2013

There are different ways of specifying and running Bayesian models from within R. Here I will compare three different methods, two that rely on an external program and one that relies only on R. I won’t go into much detail about the differences i...

## Is there too much coauthorship in economics (and science more generally)? Or too little?

June 25, 2013

Economist Stan Liebowitz has a longstanding interest in the difficulties of flagging published research errors. Recently he wrote on the related topic of dishonest authorship: While not about direct research fraud, I thought you might be interested in this paper. It discusses the manner in which credit is given for economics articles, and I suspect [...]

## Doing Statistical Research

June 25, 2013

There's a wonderful article over at the STATtr@k web site by Terry Speed on How to Do Statistical Research. There is a lot of good advice there, but the column is most notable because it's pretty much the exact opposite … Continue reading →

## A short statistics course

June 25, 2013

For years, I have wanted to see a statistics course that is not a math class. So I made one myself. The title of the course is "How to do statistics without really doing statistics?". It's on a new online learning platform called Three Nights and Done. There are three hours' worth of materials, divided into three or four chunks per hour. Here is the link. I'd love to hear…

## Predicting spatial locations using point processes

June 25, 2013

I’ve uploaded a draft tutorial on some aspects of prediction using point processes. I wrote it using R-Markdown, so there are bits of R code for readers to play with. It’s hosted on Rpubs, which turns out to be a great deal more conveni...

## Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions

Introduction: Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin plots.) To give you […]
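As a rough illustration of the concept this excerpt refers to, here is a minimal Python sketch of the textbook definition of an empirical CDF: the proportion of observations less than or equal to a given value. The function name `ecdf` and the sample values are assumptions made for this example; they are not taken from the original post, which works in R.

```python
def ecdf(sample):
    """Return a function F where F(x) is the proportion of observations <= x."""
    xs = list(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must contain at least one observation")

    def F(x):
        # Count the observations at or below x and divide by the sample size.
        return sum(1 for v in xs if v <= x) / n

    return F


# Hypothetical sample, for illustration only.
F = ecdf([3, 1, 2, 2])
print(F(0), F(2), F(3))
```

The returned function is a step function that jumps at each observed value, which is the shape an empirical CDF plot displays.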

## Talking data: Building interactive relationships with data and colleagues

June 25, 2013

Last week I had the honour to give the opening keynote talk at the Talking Data South West conference, organised by the Exeter Initiative for Statistics and its Applications. The event was chaired by Steve Brooks and brought together over 100 people to...

## Opel Corsa Diesel Usage

June 24, 2013

I wanted to extend my car weight distribution calculation of June 16 from the year 2000 alone to the years 2000 to 2013. Unfortunately, come Sunday afternoon the code seemed too slow and there was not even the beginning of a post. So, I went on to another calculation I w...

## Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

June 24, 2013

I’m reposting this classic from 2011 . . . Peter Bergman pointed me to this discussion from Cyrus of a presentation by Guido Imbens on design of randomized experiments. Cyrus writes: The standard analysis that Imbens proposes includes (1) a Fisher-type permutation test of the sharp null hypothesis–what Imbens referred to as “testing”–along with a [...]

## Does fraud depend on my philosophy?

June 24, 2013

Ever since my last post on replication and fraud I've been doing some more thinking about why people consider some things "scientific fraud". (First of all, let me just say that I was a bit surprised by the discussion in … Continue reading →

## Bayesian quality control?

June 24, 2013

Gabriel Murray writes: I saw this post and response from about 5 years ago, regarding a fellow analyzing levels of white blood cells. He was asking about Bayesian approaches to quality control and couldn’t find a canonical resource on that topic. Five years on and I similarly don’t see many good resources on the topic, [...] The post Bayesian quality control? appeared first on Statistical Modeling, Causal Inference, and Social Science.

June 24, 2013

Predictive Analytics by Eric Siegel (link) was published earlier this year. Siegel is a consultant and organizer of a series of popular industry conferences, which I attend with some regularity. I recommend this book for readers who want to understand the current state of “data science” at a deeper level than the New York Times’s coverage, but still nonmathematically. If you want to measure against my own writing, then Siegel spends…

## Count the number of unique rows in a matrix

June 24, 2013

How do you count the number of unique rows in a matrix? The simplest algorithm is to sort the data and then iterate down the rows, comparing each row with the previous row. However, this algorithm has two shortcomings: it physically sorts the data (which means that the original locations [...]
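The sort-and-compare approach described in this excerpt can be sketched in a few lines of Python. This is an illustrative sketch only, not the original post's code: the function name `count_unique_rows` and the example matrix are assumptions, and the sketch sorts a copy of the rows rather than the data in place.

```python
def count_unique_rows(matrix):
    """Count distinct rows by sorting a copy and comparing each row with the previous one."""
    # Tuples compare element by element, so sorting places identical rows next to each other.
    rows = sorted(tuple(row) for row in matrix)
    if not rows:
        return 0

    count = 1  # the first row in sorted order is always new
    for prev, curr in zip(rows, rows[1:]):
        if curr != prev:
            count += 1
    return count


# Hypothetical matrix, for illustration only.
example = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]
print(count_unique_rows(example))
```

Because the rows are sorted, the original row positions are not tracked in this version, which matches the first shortcoming the excerpt mentions.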

## Sunday data/statistics link roundup (6/23/13)

June 24, 2013

An interesting study describing scenarios where using significance testing may be beneficial, and one where the file drawer effect may even be beneficial. Granted this is all simulation so you have to take it with a … Continue reading →

## Revisualizing the best cities in the US in 2012- Shiny + googleVis = Incredibly powerful

June 24, 2013

UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1hguDHM. PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. This is the last time I will talk...

## Difficult concepts in statistics

June 23, 2013

Recently someone asked: “I don’t suppose you’d like to blog a little on the pedagogical knowledge relevant to statistics teaching, would you? A ‘top five statistics student misconceptions (and what to do about them)’ would be kind of a nice … Continue reading →

## AI Stats conference on Stan etc.

June 23, 2013

Jaakko Peltonen writes: The Seventeenth International Conference on Artificial Intelligence and Statistics (http://www.aistats.org) will be next April in Reykjavik, Iceland. AISTATS is an interdisciplinary conference at the intersection of computer science, artificial intelligence, machine learning, statistics, and related areas. AISTATS 2014 Call for Papers Seventeenth International Conference on Artificial Intelligence and Statistics April 22 [...]

## What do these share in common: m&ms, limbo stick, ovulation, Dale Carnegie? Sat night potpourri

June 23, 2013

I had said I would label as pseudoscience or questionable science any enterprise that regularly permits the kind of ‘verification biases’ in the laundry list of my June 1 post.  How regularly? (I’ve been asked) Well, surely if it’s as regular as, say, social psychology, it goes over the line. But it’s not mere regularity, it’s […]

## What is “Practical Data Science with R”?

June 23, 2013

A bit about our upcoming book “Practical Data Science with R”. Nina and I share our current draft of the front matter from the book, a description that will help you decide if this is the book for you (we hope that it is). Or this could be the book that helps explain […]

## Struggles over the criticism of the “cannabis users and IQ change” paper

June 22, 2013

Ole Rogeberg points me to a discussion of a discussion of a paper: Did pre-release of my [Rogeberg's] PNAS paper on methodological problems with Meier et al’s 2012 paper on cannabis and IQ reduce the chances that it will have its intended effect? In my case, serious methodological issues related to causal inference from non-random [...]

## Great Firewall of China

June 21, 2013

After I spent the last few months in China, unable to see or post at my own blog, this site seems dead. For a long while the famous Great Firewall of China has been blocking access to all wordpress.com traffic. Computers in China have a hard time gaining access to webpages with domain names, […]

## The pretender

June 21, 2013

Tonight I'll pretend to still be very, very young and go catch a Ryanair flight from Stansted, so that I can be all revenge-y at my brother's stag do, tomorrow morning. I have pointedly avoided Ryanair for quite a while now, on the grounds that as...