What is it with Americans in Olympic ski teams from tropical countries?

March 2, 2014

Every time I hear this sort of story: Morrone—listed at 48 years old, which would have made her the oldest Olympic cross-country skier of all time by seven years—didn’t even show up for the 10K women’s classic on Feb. 13, claiming injury. (She was the only one of the race’s 76 entrants who didn’t start.) […]

Read more »

BusinessNewsDaily Reference: What is Statistical Analysis?

March 2, 2014

From: http://www.businessnewsdaily.com/6000-statistical-analysis.html By Chad Brooks, BusinessNewsDaily Contributor | February 28, 2014 12:07am ET. In an effort to organize their data and predict future trends based on the info...

Read more »

Simple Pharmacokinetics with Jags

March 2, 2014

In this post I want to analyze a first-order pharmacokinetics problem: the data of study problem 9, chapter 3 of Rowland and Tozer (Clinical pharmacokinetics and pharmacodynamics, 4th edition) with Jags. It is a surprisingly simple set of data, but still ...

Read more »

The Statistics behind “Verification by Multiplicity”

March 2, 2014

There’s a new post up at the ninazumel.com blog that looks at the statistics of “verification by multiplicity” — the statistical technique that is behind NASA’s announcement of 715 new planets that have been validated in the data from the Kepler Space Telescope. We normally don’t write about science here at Win-Vector, but we do […]

Read more »

Short Review: the War of Art by Steven Pressfield

March 2, 2014

The War of Art: Winning the Inner Creative Battle, by Steven Pressfield. Pressfield is the author of several bestsellers. The War of Art is a 12-step self-help support group for procrastinators, a biological and psychological dissection of procrastination...

Read more »

March Madness in the Reading Department

March 2, 2014

It's time for the monthly round-up of recommended reading material. Gan, L. and J. Jiang, 1999. A test for global maximum. Journal of the American Statistical Association, 94, 847-854. Nowak-Lehmann, F., D. Herzer, S. Vollmer, and I. Martinez-Zarzosa, 20...

Read more »

Oldies but Goldies: Statistical Graphics Books

March 2, 2014

I just wanted to put in a plug for three classic books on statistical graphics that I really enjoyed reading. The books are old (that is, older than me) but still relevant, and together they give a sense of the development of exploratory graphics in general ...

Read more »

Short Review: Writing Tools: 50 Essential Strategies for Every Writer

March 1, 2014

This is the first of perhaps three short book reviews. Certain basics of writing I go over with almost every student: organization, content, paragraphs and sentences. Roy Peter Clark's Writing Tools: 50 Essential Strategies for Every Writer covers m...

Read more »

C++11 versus R Standalone Random Number Generation Performance Comparison

If you are writing some C++ code with the intent of calling it from R, or even developing it into a package, you might wonder whether it is better to use the pseudo-random number library native to C++11 or the R standalone library. On the one hand, users of your package might have an […]

Read more »

Lines and Circles and Logistic Regression

March 1, 2014

Euclidean geometry, formalized in Euclid's Elements about 2,300 years ago, is in many ways a study of lines and circles. One might think that after more than two millennia, we have moved beyond such basic shapes, particularly in a realm such as da...

Read more »

Cosma Shalizi gets tenure (at last!) (metastat announcement)

March 1, 2014

News Flash! Congratulations to Cosma Shalizi who announced yesterday that he’d been granted tenure (Statistics, Carnegie Mellon). Cosma is a leading error statistician, a creative polymath and long-time blogger (at Three-Toed Sloth). Shalizi wrote an early book review of EGEK (Mayo 1996)* that people still send me from time to time, in case I hadn’t […]

Read more »

“We are moving from an era of private data and public analyses to one of public data and private analyses. Just as we have learned to be cautious about data that are missing, we may have to be cautious about missing analyses also.”

March 1, 2014

Stephen Senn writes: For many years now I [Senn] have been making the point that obtaining a license to market a drug should carry with it the obligation to share the results with interested parties. . . . Amongst those misunderstanding the issues, are many who work in the pharmaceutical industry. A common assumption is […]

Read more »

Fitting models to long time series

March 1, 2014

I received this email today: I recall you made this very insightful remark somewhere that fitting a standard arima model with too much data, i.e., a very long time series, is a bad idea. Can you elaborate why? I can see the issue with noise, which compounds the ML estimation as the series gets too long. But is there anything else? I’m not sure where I made a comment about…

Read more »

On Getting Tenure

March 1, 2014

Attention conservation notice: Navel-gazing by a middle-aged academic. I got tenure a few weeks ago. (Technically it takes effect in July.) The feedback from the department and university which accompanied the decision was gratifyingly positive, a...

Read more »

Machine Learning Lesson of the Day – K-Nearest Neighbours Regression

I recently introduced the K-nearest neighbours classifier. Some slight adjustments to the same algorithm can make it into a regression technique. Given a training set and a new input, we can predict the target of the new input by identifying the K data points (the K “neighbours”) in the training set that are closest to it by Euclidean […]
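
As a rough illustration of the idea in the excerpt above, here is a minimal Python sketch. The excerpt is cut off before it says how the K neighbours' targets are combined, so averaging them is an assumption here (it is the usual choice, but not confirmed by the source). The function name `knn_regress` and the toy data are hypothetical.

```python
import math


def knn_regress(train_x, train_y, query, k):
    """Predict the target for `query` from its k nearest training points.

    Assumption: the prediction is the plain (unweighted) mean of the
    neighbours' targets; the source excerpt does not state this.
    """
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training target with its Euclidean distance to the query.
    by_distance = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    # Keep the k closest points and average their targets.
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Toy one-dimensional data (hypothetical): y = x squared.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 4.0, 9.0]

# The two nearest points to 1.2 are x=1 and x=2, so the prediction is 2.5.
print(knn_regress(train_x, train_y, (1.2,), k=2))
```

With K = 1 the prediction simply copies the target of the single closest point; larger K smooths the prediction at the cost of local detail.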

Read more »

The Normality of Joint and Marginal Distributions

February 28, 2014

I'm often surprised how many people are confused when it comes to joint and marginal normal distributions. Most students of econometrics are taught that the marginal and conditional distributions associated with a multivariate normal random vector are t...

Read more »

Combining two of my interests

February 28, 2014

Paul Alper writes: Hi Andrew (or Andy or even Gelman [17 of them]): Go to this link and have some fun with (useless? powerful?) data mining. As the authors say, it is addictive. Paul (no other way to spell it) Alper [215 of us] I’m reminded of this discussion from 2012, “Michael’s a Republican, Susan’s […] The post Combining two of my interests appeared first on Statistical Modeling, Causal Inference, and…

Read more »

Using CART for Stock Market Forecasting

February 28, 2014

There is an enormous body of literature, both academic and empirical, about market forecasting. Most of the time it mixes two market features: magnitude and direction. In this article I want to focus on identifying the market direction only. The goal I set myself is to identify market conditions when the odds are significantly biased […]

Read more »

God/leaf/tree

February 28, 2014

Govind Manian writes: I wanted to pass along a fragment from Lichtenberg’s Waste Books — which I am finding to be great stone soup — that reminded me of God is in Every Leaf: To the wise man nothing is great and nothing small… I believe he could write treatises on keyholes that sounded as weighty […] The post God/leaf/tree appeared first on Statistical Modeling, Causal Inference, and Social Science.

Read more »

r4stats.com 2013 in review

February 28, 2014

The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog. Here’s an excerpt: The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 150,000 times in 2013. If it were an exhibit at …

Read more »

Useful Functions in R for Manipulating Text Data

Introduction. In my current job, I study HIV at the genetic and biochemical levels. Thus, I often work with data involving the sequences of nucleotides or amino acids of various patient samples of HIV, and this type of work involves a lot of manipulating text. (Strictly speaking, I analyze sequences of nucleotides from DNA that are reverse-transcribed from […]

Read more »

