Future ISFs

June 26, 2013

The next few locations for the International Symposium on Forecasting have been announced: 2014: Rotterdam, The Netherlands; 2015: Riverside, California, USA; 2016: Santander, Spain; 2017: Cairns, Australia. The ISF is easily the best forecasting confere...

Natural language processing tutorial

June 26, 2013

Introduction: This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. ...

My talk at Boston Python

June 26, 2013

I just gave a talk at Boston Python about natural language processing in general, and edX ease and discern in particular. You can find the presentation source here, and the web version of it here. There is a video of it here. Nelle Varoquaux and Micha...

Natural Language Processing Tutorial

June 26, 2013

Introduction: This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. The ...

My Talk at Boston Python

June 25, 2013

I just gave a talk at Boston Python about natural language processing in general, and edX ease and discern in particular. You can find the presentation source here, and the web version of it here. There is a video of it here. Nelle Varoquaux and Michael ...

Hot Shot Charts: Data-Based Insights of Past NBA Basketball Games

June 25, 2013

Hot Shot Charts [hotshotcharts.com], developed by a small team of data scientists, analysts and visualization researchers at consulting firm Accenture, provides a wide range of data-based insights of the NBA basketball competition matches that were pl...

Stadtbilder: Mapping the Digital Hotspots of a City

June 25, 2013

Stadtbilder [stadt-bilder.com], designed by Moritz Stefaner, provides an artistic overview of the typical digital "hotspots" in a city, such as its local restaurants, hotels or clubs. Based on data retrieved from different social media providers suc...

Exploratory Data Analysis: 2 Ways of Plotting Empirical Cumulative Distribution Functions in R

Introduction: Continuing my recent series on exploratory data analysis (EDA), and following up on the last post on the conceptual foundations of empirical cumulative distribution functions (CDFs), this post shows how to plot them in R. (Previous posts in this series on EDA include descriptive statistics, box plots, kernel density estimation, and violin plots.) I […]

Three Ways to Run Bayesian Models in R

June 25, 2013

There are different ways of specifying and running Bayesian models from within R. Here I will compare three different methods, two that rely on an external program and one that relies only on R. I won’t go into much detail about the differences i...

Is there too much coauthorship in economics (and science more generally)? Or too little?

June 25, 2013

Economist Stan Liebowitz has a longstanding interest in the difficulties of flagging published research errors. Recently he wrote on the related topic of dishonest authorship: While not about direct research fraud, I thought you might be interested in this paper. It discusses the manner in which credit is given for economics articles, and I suspect [...]

Doing Statistical Research

June 25, 2013

There's a wonderful article over at the STATtr@k web site by Terry Speed on How to Do Statistical Research. There is a lot of good advice there, but the column is most notable because it's pretty much the exact opposite …

A short statistics course

June 25, 2013

For years, I have wanted to see a statistics course that is not a math class. So I made one myself. The title of the course is "How to do statistics without really doing statistics?". It's on a new online learning platform called Three Nights and Done. There are three hours' worth of materials, divided into three or four chunks per hour. Here is the link. I'd love to hear…

Predicting spatial locations using point processes

June 25, 2013

I’ve uploaded a draft tutorial on some aspects of prediction using point processes. I wrote it using R-Markdown, so there are bits of R code for readers to play with. It’s hosted on Rpubs, which turns out to be a great deal more conveni...

Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions

Introduction: Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin plots.) To give you […]

Talking data: Building interactive relationships with data and colleagues

June 25, 2013

Last week I had the honour to give the opening keynote talk at the Talking Data South West conference, organised by the Exeter Initiative for Statistics and its Applications. The event was chaired by Steve Brooks and brought together over 100 people to...

Opel Corsa Diesel Usage

June 24, 2013

I wanted to extend my car weight distribution calculation of June 16 from only the year 2000 to the years 2000 to 2013. Unfortunately, come Sunday afternoon the code still seemed too slow and I did not have even the beginning of a post. So, I went on to another calculation I w...

Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

June 24, 2013

I’m reposting this classic from 2011 . . . Peter Bergman pointed me to this discussion from Cyrus of a presentation by Guido Imbens on design of randomized experiments. Cyrus writes: The standard analysis that Imbens proposes includes (1) a Fisher-type permutation test of the sharp null hypothesis–what Imbens referred to as “testing”–along with a [...]

Does fraud depend on my philosophy?

June 24, 2013

Ever since my last post on replication and fraud I've been doing some more thinking about why people consider some things "scientific fraud". (First of all, let me just say that I was a bit surprised by the discussion in …

Bayesian quality control?

June 24, 2013

Gabriel Murray writes: I saw this post and response from about 5 years ago, regarding a fellow analyzing levels of white blood cells. He was asking about Bayesian approaches to quality control and couldn’t find a canonical resource on that topic. Five years on and I similarly don’t see many good resources on the topic, [...] The post Bayesian quality control? appeared first on Statistical Modeling, Causal Inference, and Social Science.

Reading Predictive Analytics

June 24, 2013

Predictive Analytics by Eric Siegel (link) was published earlier this year. Siegel is a consultant and organizer of a series of popular industry conferences, which I attend with some regularity. I recommend this book for readers who want to understand the current state of “data science” at a deeper level than the New York Times’s but still nonmathematical. If you want to measure against my own writing, then Siegel spends…

Count the number of unique rows in a matrix

June 24, 2013

How do you count the number of unique rows in a matrix? The simplest algorithm is to sort the data and then iterate down the rows, comparing each row with the previous row. However, this algorithm has two shortcomings: it physically sorts the data (which means that the original locations [...]
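The sort-and-compare approach described in this excerpt can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the linked post; the function name `count_unique_rows` and the list-of-lists matrix representation are assumptions made for the example. It sorts a copy of the rows, so the original row order is left untouched.

```python
def count_unique_rows(matrix):
    """Count distinct rows by sorting a copy and comparing neighbours.

    `matrix` is assumed to be a list of equal-length rows (a hypothetical
    representation chosen for this sketch).
    """
    if not matrix:
        return 0
    # Sort a copy of the rows; identical rows end up next to each other.
    rows = sorted(tuple(row) for row in matrix)
    count = 1  # the first sorted row is always new
    # Walk down the sorted rows, comparing each row with the previous one.
    for prev, curr in zip(rows, rows[1:]):
        if curr != prev:
            count += 1
    return count


example = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]
print(count_unique_rows(example))  # 3
```

Sorting costs O(n log n) row comparisons, after which a single pass is enough because duplicates are adjacent.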

Sunday data/statistics link roundup (6/23/13)

June 24, 2013

An interesting study describing how significance testing may be beneficial, and a scenario where the file drawer effect may even be beneficial. Granted this is all simulation so you have to take it with a …

Revisualizing the best cities in the US in 2012- Shiny + googleVis = Incredibly powerful

June 24, 2013

This is the last time I will talk about visualizing the best cities of 2012 based on Bloomberg Businessweek's rankings. In an earlier post on this topic, interactive applications to plot bar graphs and histograms for different characteristics...

