Boot

June 4, 2013

Joshua Hartshorne writes: I ran several large-N experiments (separate participants) and looked at performance against age. What we want to do is compare age-of-peak-performance across the different tasks (again, different participants). We bootstrapped age-of-peak-performance. On each iteration, we sampled (with replacement) the X scores at each age, where X = number of participants at that age, and [...] The post Boot appeared first on Statistical Modeling, Causal Inference, and Social Science.
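The resampling step described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the author's code: the excerpt is cut off before the procedure is fully described, so summarizing each age by its mean score and taking the age with the highest resampled mean as the "peak" are assumptions. The names `bootstrap_peak_age`, `percentile_interval`, and `scores_by_age` are hypothetical.

```python
import random
from statistics import mean


def bootstrap_peak_age(scores_by_age, n_iter=1000, seed=0):
    """Return one bootstrap draw of the peak age per iteration.

    scores_by_age maps age -> list of scores (hypothetical input layout).
    """
    rng = random.Random(seed)
    peaks = []
    for _ in range(n_iter):
        resampled_means = {}
        for age, scores in scores_by_age.items():
            # Sample X scores with replacement, X = participants at that age.
            draw = rng.choices(scores, k=len(scores))
            # Assumption: each age is summarized by its mean score.
            resampled_means[age] = mean(draw)
        # Assumption: the peak is the age with the highest resampled mean.
        peaks.append(max(resampled_means, key=resampled_means.get))
    return peaks


def percentile_interval(draws, lower=0.025, upper=0.975):
    """Crude percentile interval taken from the sorted bootstrap draws."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]
```

Running `bootstrap_peak_age` separately for each task would give one set of peak-age draws per task, which could then be compared, for instance through their percentile intervals.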

Read more »

Understanding U.S. hospital billing practices – part 2 – chest pain

June 3, 2013

I have chest pain, but not because of what you might think. Although if it were that serious, I might go to the hospital and get it checked out. And the cost would vary wildly - depending upon where the hospital is located. And that's how you segu...

Read more »

random sudokus

June 3, 2013

In a paper arXived on Friday, Roberto Fontana relates the generation of Sudoku grids to that of Latin squares (which is unsurprising) and to maximum cliques of a graph (more surprising). The generation of a random Latin square proceeds in three steps: generate a random Latin square L with identity permutation matrix on symbol […]

Read more »

Is Quandl the easiest way to find and use numerical data on the internet?

June 3, 2013

From: http://www.quandl.com/about Quandl has indexed over 5 million time-series datasets from over 400 sources. All of Quandl's datasets are open and free. You can download any Quandl dataset in any format that you want. You can also visualize, save, sha...

Read more »

The statistical properties of smart chains (and referral chains more generally)

June 3, 2013

Louis Mittel writes: The premise of the column this guy is starting is interesting: Noah Davis interviews a smart person and then interviews the smartest person that smart person knows and so on. It reminded me of you mentioning survey design strategy of asking people about other people, like “How many people do you know [...] The post The statistical properties of smart chains (and referral chains more generally) appeared first…

Read more »

Maxima and minima

June 3, 2013

Andrew Sullivan (link) highlights the insanity of the law with this "Chart of the Day", except that the chart fails to bring out the message: For this data set, a Bumps-style chart works very well: *** The bar chart uses the...

Read more »

Passing values from PROC IML into SAS procedures

June 3, 2013

A SAS user told me that he computed a vector of values in the SAS/IML language and wanted to use those values on a statement in a SAS procedure. The particular application involved wanting to use the values on the ESTIMATE and CONTRAST statements in a SAS regression procedure, but [...]

Read more »

A Few Tips for Writing an R Book

June 3, 2013

I just finished fixing (hopefully all) the problems in the knitr book returned from the copy editor. David Smith has kindly announced this book before I do. I do not have much to say about this book: almost everything in the book can be found in the on...

Read more »

Aspect Ratio and Banking to 45 Degrees

June 3, 2013

The same data can look very different in a line chart depending on its aspect ratio. But what is the perfect shape for a chart? A square? A rectangle? Which rectangle? It depends on the data. Aspect Ratios: The ratio between the width and the height of a rectangle is called its aspect ratio. It is typically expressed as a fraction with two numbers, the width divided by the height.…

Read more »

Sunday data/statistics link roundup (6/2/13)

June 3, 2013

Awesome, a GUI for d3 graphs. Via John M. Tom L. on why statistics matter, especially at the Census! I've been spending the last several weeks house hunting like crazy, so the idea of data on schools is high on … Continue reading →

Read more »

Don’t Take Good Data for Granted: A Caution for Statisticians

Background: Yesterday, I had the pleasure of attending my first Spring Alumni Reunion at the University of Toronto. (I graduated from its Master of Science program in statistics in 2012.) There were various events for the alumni: attend interesting lectures, find out about our school’s newest initiatives, and meet other alumni in smaller gatherings tailored […]

Read more »

A dearth of raw data

June 2, 2013

The desired outcome of this post is to be proved wrong. Here is my assertion: It is really difficult to find appropriate sets of data to use for teaching and assessing statistical analysis. This is a problem; one of the … Continue reading →

Read more »

First (new) post (in a while)!

June 2, 2013

Hello, everyone. I am in the process of making a personal blog. This will entail migrating everything over from my old blog. I am really looking forward to starting to post again, so stay tuned for some interesting entries.

Read more »

Flame bait

June 2, 2013

Mark Palko asks what I think of this article by Francisco Louca, who writes about “‘hybridization’, a synthesis between Fisherian and Neyman-Pearsonian precepts, defined as a number of practical proceedings for statistical testing and inference that were developed notwithstanding the original authors, as an eventual convergence between what they considered to be radically irreconcilable.” To [...] The post Flame bait appeared first on Statistical Modeling, Causal Inference, and Social Science.

Read more »

Cars in Netherlands

June 2, 2013

I am looking for a new car. So when I saw there was an update on vehicles at Statistics Netherlands, I just had to go and look at the data. So, I learned that brown is getting more popular, often the number of cars from a certain construction year is lar...

Read more »

Some statistical dirty laundry

June 1, 2013

I finally had a chance to fully read the 2012 Tilburg Report* on “Flawed Science” last night. Here are some stray thoughts… 1. Slipping into pseudoscience. The authors of the Report say they never anticipated giving a laundry list of “undesirable conduct” by which researchers can flout pretty obvious requirements for the responsible practice of science. It […]

Read more »

Loading Historical Stock Data

June 1, 2013

Historical stock data is critical for testing your investment strategies. I illustrated all my back-test examples with the getSymbols function from the quantmod package. For example, following is a back-test comparison for a few portfolio allocation methods: The getSymbols function, from the quantmod package, downloads historical stock prices from Yahoo Finance. I often get questions about alternative ways […]

Read more »

Benford’s law and addresses

June 1, 2013

One example we give to illustrate Benford’s law is the first digits of addresses. Javier Marquez Pena had a survey and, just for laffs, he looked at the distribution of first digits: Cool, it really works! P.S. The y-axis shouldn’t go below zero, and I’d much prefer an L-type graphics box (par(bty="l")) rather than the square, but [...] The post Benford’s law and addresses appeared first on Statistical Modeling, Causal Inference, and Social…
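For readers who want to try the comparison themselves, here is a small sketch of the check described in this excerpt. Benford's law gives the expected share of leading digit d as log10(1 + 1/d). The function names and the sample address numbers used in the usage note are made up for illustration and are not the survey data mentioned in the post.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_shares(numbers):
    """Observed share of each leading digit among positive integers."""
    leading = [int(str(n)[0]) for n in numbers if n > 0]
    if not leading:
        raise ValueError("need at least one positive integer")
    counts = Counter(leading)
    return {d: counts.get(d, 0) / len(leading) for d in range(1, 10)}
```

Calling `first_digit_shares` on a list of street numbers, say `[12, 150, 2, 34]`, and setting the result next to `benford_expected()` gives the kind of observed-versus-expected comparison the post plots.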

Read more »

Tweetanalytics – Interactively analyzing tweets from accounts of 5 universities

June 1, 2013

UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1nzKbdq. PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. This is an attempt at learning an...

Read more »

Flotsam 12: early June linkathon

June 1, 2013

A list of interesting R/Stats quickies to keep the mind distracted: A long draft Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi, in which he uses R to drive home the message. Not your average elementary point of view. Good notes by Frank Davenport on getting started with R using data from […]

Read more »

Regression regularization example

May 31, 2013

Recently I needed a simple example showing when application of regularization in regression is worthwhile. Here is the code I came up with (along with basic application of parallelization of code execution). Assume you have 60 observations and 50 expla...

Read more »

How to fix the tabloids? Toward replicable social science research

May 31, 2013

This seems to be the topic of the week. Yesterday I posted on the sister blog some further thoughts on those “Psychological Science” papers on menstrual cycles, biceps size, and political attitudes, tied to a horrible press release from the journal Psychological Science hyping the biceps and politics study. Then I was pointed to these [...] The post How to fix the tabloids? Toward replicable social science research appeared first on…

Read more »

accurate ABC: comments by Oliver Ratman [guest post]

May 31, 2013

Here are comments by Olli following my post: I think we found a general means to obtain accurate ABC in the sense of matching the posterior mean or MAP exactly, and then minimising the KL distance between the true posterior and its ABC approximation subject to this condition. The construction works on an auxiliary probability […]

Read more »

