## Treading a New Path for Reproducible Research: Part 1

August 21, 2013

Discussions about reproducibility in scientific research have been on the rise lately, including on this blog. There are many underlying trends that have produced this increased interest in reproducibility: larger and larger studies being harder to replicate independently, cheaper data …

Read more »

## Champions league

August 21, 2013

Before I even start any real writing here, I should remark that I am most definitely not an Inter Milan fan (but they are my brother's and dad's team, so I thought I'd use this picture anyway); nor is this post really about football (well: it is in...

Read more »

## Utility script for launching bare JAR files

August 21, 2013

Torsten Seemann compiled a list of minimum standards for bioinformatics command line tools: things like printing help when no commands are specified, including version info, avoiding hardcoded paths, etc. These should be obvious to any seasoned software e...

Read more »

## BDA3 table of contents (also a new paper on visualization)

August 21, 2013

In response to our recent posting of Amazon’s offer of Bayesian Data Analysis 3rd edition at 40% off, some people asked what was in this new edition, with more information beyond the beautiful cover image and the brief paragraph I’d posted earlier. Here’s the table of contents. The following sections have all-new material: 1.4 New […]

Read more »

## The case method and statistics

August 21, 2013

Readers of my books will not be surprised to hear that I am a fan of the case method in teaching. The case method was pioneered by Harvard Law School in 1870, and I was exposed to it at Harvard Business School, where it is used to teach everything from marketing to leadership, and from economics to operations, and even accounting. Since the 1920s, the case method replaced the "lecture…

Read more »

## Comparing two groups? Two tips that make a difference

August 21, 2013

A common visualization is to compare characteristics of two groups. This article emphasizes two tips that will help make the comparison clear. First, consider graphing the differences between the groups. Second, in any plot that has a categorical axis, sort the categories by a meaningful quantity. This article is motivated [...]

Read more »

## Las Vegas and financial institutions

August 21, 2013

Exactly one month ago, I entered the Bellagio casino to play roulette. It was a request from my daughter’s godfather (who happens to be a probabilist). In a comment on a previous post, he suggested the following deal: In the Bellagio you put 10\$ for me on the 33 and 10\$ for you as well. If 33 shows up, you bring me to a French “3…

Read more »

## Job opening at an organization that promotes reproducible research!

August 21, 2013

I was told about an organization called Reproducibility Initiative. They tell me they are trying to make what was described in our “50 shades of gray” post standard across all of science, particularly areas like cancer research. I don’t know anything else about them, but that sounds like a good start! Here’s the ad: Data […]

Read more »

## BKLYNR: Mapping the Age of each Building in Brooklyn

August 20, 2013

BKLYNR [bklynr.com] by web designer Thomas Rhiel is a highly detailed map that reveals the age of each of the more than 320,000 buildings currently present in Brooklyn. The interactive map reveals how the historical urban development has rippled acro...

Read more »

## Time-series forecasting: Bike Accidents

August 20, 2013

About a year ago I posted this video visualization of all the reported accidents involving bicycles in Montreal between 2006 and 2010. In the process I also calculated and plotted the accident rate using a monthly moving average. The results followed a pattern that was for the most part to be expected. The rate shoots up […]

Read more »

## “[” and “[[” with the apply() functions

August 20, 2013

Did you know you can use "[" and "[[" as function names for subsetting with calls to the apply-type functions? For example, suppose you have a bunch of identifier strings like "ZYY-43S-CWA3" and you want to pull off the bit before the first hyphen ("ZYY" in this case). (For code to create random IDs like […]

Read more »
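
The idea in the teaser above can be sketched in a few lines of R. This is a minimal illustration, not the post's own code: the extra IDs are made up to match the shape of "ZYY-43S-CWA3", and the variable names are hypothetical.

```r
# Hypothetical IDs in the same shape as the example in the teaser.
ids <- c("ZYY-43S-CWA3", "ABC-12D-EFG4", "QRS-99T-UVW5")

# strsplit() returns a list holding one character vector of pieces per ID.
pieces <- strsplit(ids, "-", fixed = TRUE)

# "[" is an ordinary function, so sapply() can call it on each list element;
# the trailing 1 is passed along as the index to extract.
prefixes <- sapply(pieces, "[", 1)

# "[[" works the same way when exactly one element is wanted per piece.
prefixes2 <- sapply(pieces, "[[", 1)

print(prefixes)
```

Passing the operator by name avoids writing an anonymous function such as `function(p) p[1]`; both forms should give the same result here.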

## When did statistics jump the shark?

August 20, 2013

Statistics jumped the shark the moment they adopted the following definition, (Gelman & Hill, page 13): A probability distribution corresponds to an urn with a potentially infinite number of balls inside. When a ball is drawn at random, the “...

Read more »

## A couple of requests for the @Statistics2013 future of statistics workshop

August 20, 2013

Statistics 2013 is hosting a workshop on the future of statistics. Given the timing and the increasing popularity of our discipline, I think it's a great idea to showcase the future of our field. I just have two requests: Please …

Read more »

## Correcting for multiple comparisons in a Bayesian regression model

August 20, 2013

Joe Northrup writes: I have a question about correcting for multiple comparisons in a Bayesian regression model. I believe I understand the argument in your 2012 paper in Journal of Research on Educational Effectiveness that when you have a hierarchical model there is shrinkage of estimates towards the group-level mean and thus there is no […]

Read more »

## Light entertainment: Hidden time, and shifted label

August 20, 2013

Rick (via Twitter) tells me he is baffled by this chart that showed up in Financial Review: I'm baffled as well. What might the designer have in mind? Based on the cues such as length of the curves, one would...

Read more »

## Electronic lab notebook

August 20, 2013

I was interested to read C. Titus Brown’s recent post, “Is version control an electronic lab notebook?” I think version control is really important, and I think all computational scientists should have something equivalent to a lab notebook. But I think of version control as serving needs orthogonal to those served by a lab notebook. […]

Read more »

## Step by step to build my first R Hadoop System

August 20, 2013

By Yanchang Zhao, RDataMining.com. After reading documents and tutorials on MapReduce and Hadoop and playing with RHadoop for about two weeks, I have finally built my first R Hadoop system and successfully run some R examples on it. My experience …

Read more »

## ChainLadder 0.1.6 released with chain-ladder factor models

August 20, 2013

Version 0.1.6 of the ChainLadder package has been released and is already available from CRAN. The new version adds the function CLFMdelta. CLFMdelta finds consistent weighting parameters delta for a vector of selected age-to-age chain-ladder factors fo...

Read more »

## Exploratory Data Analysis: Useful R Functions for Exploring a Data Frame

Introduction: Data in R are often stored in data frames, because they can store multiple types of data. (In R, data frames are more general than matrices, because matrices can only store one type of data.) Today’s post highlights some common functions in R that I like to use to explore a data frame before […]

Read more »
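
The excerpt above does not say which functions the post covers, so the following is only a sketch of the kind of first look one might take at a data frame, using base R functions and a small made-up data frame (`df` and its columns are hypothetical).

```r
# A small made-up data frame with mixed column types, which a matrix
# could not hold without coercing everything to a single type.
df <- data.frame(
  id = 1:4,
  group = c("a", "b", "a", "b"),
  score = c(2.5, 3.1, NA, 4.0),
  stringsAsFactors = FALSE
)

dim(df)                        # number of rows and columns
str(df)                        # column types with a preview of the values
head(df, 2)                    # first two rows
summary(df)                    # per-column summaries, including NA counts
col_classes <- sapply(df, class)     # class of each column
na_counts <- colSums(is.na(df))      # missing values per column
print(col_classes)
print(na_counts)
```

Checking `str()` and the missing-value counts first tends to surface type or data-entry problems before any modelling starts.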

## MovieGalaxies: the Social Graph of Popular Movies

August 19, 2013

Movie Galaxies [moviegalaxies.com], developed by Jermain Kaminski and Michael Schober, provides an alternative, data-driven experience to the story lines of popular movies. Based on each movie script, all the interactions of the main characters are ...

Read more »

## Statistics and Dr. Strangelove

August 19, 2013

One of the biggest embarrassments in statistics is that we don’t really have confidence bands for nonparametric function estimation. This is a fact that we tend to sweep under the rug. Consider, for example, estimating a density $p(x)$ from a sample $X_1, \ldots, X_n$. The kernel estimator with kernel $K$ and bandwidth $h$ is $\hat{p}_h(x) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h} K\left(\frac{x - X_i}{h}\right)$. Let’s start with getting a confidence […]

Read more »

## Mean Values

August 19, 2013

Statistical parameters are used to describe a population and are often based on a large number of observations in public …

Read more »

## The Bayesian Counterpart of Pearson’s Correlation Test

August 19, 2013

Except for maybe the t test, a contender for the title “most used and abused statistical test” is Pearson’s correlation test. Whenever someone wants to check if two variables relate somehow, it is a safe bet (at least in psychology) that the fir...

Read more »
