Personalized medicine is primarily a population-health intervention

June 12, 2013

There has been a lot of discussion of personalized medicine, individualized health, and precision medicine in the news and in the medical research community. Despite this recent attention, it is clear that healthcare has always been personalized to some extent. For …

An Introduction to Importance Sampling

Importance sampling is a Monte Carlo integration technique for getting (very accurate) approximations to integrals. Consider the integral $\int_{-\infty}^{\infty} e^{-x^2/2}\,dx$ and suppose we wish to approximate this without doing any calculus. Statistically speaking, we want to compute the normalizing constant for a standard normal, which we know to be $\sqrt{2\pi}$. We can rewrite the above integral as … because […] The post An Introduction to Importance Sampling appeared first on Lindons Log.
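
A minimal Python sketch of the idea, assuming the integral in question is the integral of exp(-x^2/2) over the real line (whose exact value is sqrt(2*pi)). The standard Cauchy proposal and the function name importance_sampling_estimate are illustrative assumptions, not taken from the post.

```python
import math
import random


def importance_sampling_estimate(n_samples, seed=0):
    """Estimate the integral of exp(-x^2/2) over the real line.

    Assumption: samples are drawn from a standard Cauchy proposal g,
    and the integrand is reweighted by 1/g(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Inverse-CDF draw from the standard Cauchy distribution.
        u = rng.random()
        x = math.tan(math.pi * (u - 0.5))
        # Proposal density g(x) = 1 / (pi * (1 + x^2)).
        g = 1.0 / (math.pi * (1.0 + x * x))
        # Importance weight: integrand divided by proposal density.
        total += math.exp(-0.5 * x * x) / g
    # The average weight estimates the integral.
    return total / n_samples


if __name__ == "__main__":
    print("estimate:", importance_sampling_estimate(200_000))
    print("exact:   ", math.sqrt(2.0 * math.pi))
```

The Cauchy proposal is chosen here because its tails are heavier than the integrand's, which keeps the importance weights bounded; other proposals would work as long as they cover the whole real line.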

How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

June 12, 2013

Jonathan Robinson writes: I’m a survey researcher who mostly does political work, but I also have a strong interest in economics. I have a question about this graph you commonly see in the economics literature. It is of a concept called the Beveridge Curve [recently in the newspaper here]. It is one of the more [...]

De-noising data

June 12, 2013

One of the most important steps in analyzing data is to remove noise. First, we have to identify where the noise is, then we find ways to reduce the noise, which has the effect of surfacing the signal. The labor...

Happy Birthday Normal Deviate

June 12, 2013

Today is the one-year anniversary of this blog. First of all, thanks to all the readers. And special thanks to commenters and guest posters. This seems like a good time to assess whether I have achieved my goals for the blog and to get suggestions on how I might proceed in year two. GOALS. …

How to interpret a residual-fit spread plot

June 12, 2013

In a previous blog post, I described how to use a spread plot to compare the distributions of several variables. Each spread plot is a graph of centered data values plotted against the estimated cumulative probability. Thus, spread plots are similar to a (rotated) plot of the empirical cumulative distribution [...]

Mayo: comment on the repressed memory research

June 12, 2013

Here are some reflections on the repressed memory articles from Richard Gill’s post, focusing on Geraerts et al. (2008). 1. Richard Gill reported that “Everyone does it this way, in fact, if you don’t, you’d never get anything published: …People are not deliberately cheating: they honestly believe in their theories and believe the data is supporting them and […]

Visualizing densities of spatial processes

June 11, 2013

We recently uploaded to http://hal.archives-ouvertes.fr/hal-00725090 a revised version of our work, with Ewen Gallic (a.k.a. @3wen), on “Visualizing spatial processes using Ripley’s correction: an application to bodily-injury car accident location”. In this paper, we investigate (and extend) Ripley’s circumference method to correct bias of density estimation of edges (or frontiers) of regions. The idea of the method was theoretical and difficult to implement. We provide a simple technique – based…

Why not have a "future of the field" session at a conference with only young speakers?

June 11, 2013

I'm in the process of trying to get together a couple of sessions to submit to ENAR 2014. I'm pretty psyched about the topics and am looking forward to hosting the conference in Baltimore. It is pretty awesome to have …

Folic acid and autism

June 11, 2013

Aurelian Muntean writes: I have read an article on NPR and the journal article that spun this news. What drew my attention was the discussion in terms of causation implied by one of the authors of the article interviewed in the NPR news, and also by the conclusions of the article itself claiming large effects. [...] The post Folic acid and autism appeared first on Statistical Modeling, Causal Inference, and Social…

Willing the data to fit your model

June 11, 2013

It strikes me that in medicine, we are stuck with simplistic models - models that use one variable only, and are linear in the response. In short, we are told X results in Y, and the more X, the more Y. Real life often does not cooperate, but many people in medical research hold on to their models for dear life. Exhibit 1 is the disappearing of unhelpful data used…

Computing skewness and kurtosis in one pass

June 11, 2013

If you compute the standard deviation of a data set by directly implementing the definition, you’ll need to pass through the data twice: once to find the mean, then a second time to accumulate the squared differences from the mean. …
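
The two-pass computation described above can be sketched in Python next to a one-pass alternative. The one-pass version uses the well-known running-moment update formulas (Welford's method extended to the third and fourth central moments); whether the post uses exactly these formulas is an assumption, and the names RunningMoments and two_pass_std are hypothetical.

```python
import math


class RunningMoments:
    """Accumulate mean, variance, skewness and kurtosis in a single pass."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.m3 = 0.0  # running sum of cubed deviations
        self.m4 = 0.0  # running sum of fourth-power deviations

    def push(self, x):
        n1 = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.mean += delta_n
        # Update the highest moment first: each update needs the
        # previous values of the lower moments.
        self.m4 += (term1 * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2 - 4 * delta_n * self.m3)
        self.m3 += term1 * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term1

    def std(self):
        # Sample standard deviation (divides by n - 1).
        return math.sqrt(self.m2 / (self.n - 1))

    def skewness(self):
        return math.sqrt(self.n) * self.m3 / self.m2 ** 1.5

    def kurtosis(self):
        # Excess kurtosis (0 for a normal distribution).
        return self.n * self.m4 / (self.m2 * self.m2) - 3.0


def two_pass_std(data):
    """Direct implementation of the definition: two passes over the data."""
    mean = sum(data) / len(data)                    # first pass
    ss = sum((x - mean) ** 2 for x in data)         # second pass
    return math.sqrt(ss / (len(data) - 1))


if __name__ == "__main__":
    data = [1.0, 2.0, 4.0, 8.0, 16.0]
    rm = RunningMoments()
    for x in data:
        rm.push(x)
    print(rm.std(), two_pass_std(data))
    print(rm.skewness(), rm.kurtosis())
```

The one-pass form is useful when the data arrive as a stream and cannot be stored, since each value is seen once and only five numbers are kept.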

R package development

June 11, 2013

Building R packages is not particularly hard, but it can be a bit of a daunting endeavour at the beginning, particularly if you are more of a statistician than a computer scientist or programmer. Some concepts may appear foreign or like red tape, yet man...

R: Measures of Skewness and Kurtosis

June 11, 2013

Skewness and kurtosis in R are available in the moments package (to install an R package, click here), and these are: Skewness - skewness; Kurtosis - kurtosis. Example 1. Mirra is interested in the elapsed time (in minutes) she spends riding a tricycle fr...

Running time

June 10, 2013

Marta and I are doing some re-analysis of our Eurovision contest (some context here and here). We have slightly modified our original model (mostly, I have navigated the mess in Marta's notation $-$ it's OK: I'm not at risk of her mighty wrath, as I've...

Le Monde puzzle [#822]

June 10, 2013

For once, the Le Monde math puzzle is much more easily solved on a piece of paper than in R, even in a plane from Roma: Given a partition of the set {1,…,N} into k groups, one considers the collection of all subsets of the set {1,…,N} containing at least one element from each group. Show […]

A statistical problem with “nothing to hide”

June 10, 2013

One problem with the nothing-to-hide argument is that it assumes innocent people will be exonerated certainly and effortlessly. That is, it assumes that there are no errors, or if there are, they are resolved quickly and easily. Suppose the probability…

I don’t think we get much out of framing politics as the Tragic Vision vs. the Utopian Vision

June 10, 2013

Ole Rogeberg writes: Recently read your blogpost on Pinker’s views regarding red and blue states. This might help you see where he’s coming from: The “conflict of visions” thing that Pinker repeats likely refers to Thomas Sowell’s work in the books “Conflict of Visions” and “Visions of the anointed.” The “Conflict of visions” book is [...]

R: Measure of Relative Variability

June 10, 2013

The measure of relative variability is the coefficient of variation (CV). Unlike measures of absolute variability, the CV is unitless, so it can be used to compare the dispersions of two distributions with different units of measurement. In R, CV ...

Once more, superimposing time series creates silly theories

June 10, 2013

After I wrote the post about superimposing two time series to generate fake correlations, there was a lively discussion in the comments about whether a scatter plot would have done better. Here is the promised follow-up post. The contentious issue...

Introduction to stable distributions for finance

June 10, 2013

A few basics about the stable distribution. Previously: “The distribution of financial returns made simple” satirized ideas about the statistical distribution of returns, including the stable distribution. Origin: As “A tale of two returns” points out, the log return of a long period of time is the sum of the log returns of the shorter …

Visually comparing different data distributions: The spread plot

June 10, 2013

Suppose that you have several data distributions that you want to compare. Questions you might ask include "Which variable has the largest spread?" and "Which variables exhibit skewness?" More generally, you might be interested in visualizing how the distribution of one variable differs from the distribution of other variables. The [...]

R: Measures of Absolute Variability

June 10, 2013

Measures of absolute variability deal with the dispersion of the data points. These include the following: Range - range; Interquartile Range - IQR; Quartile Deviation; Average Deviation; Standard Deviation - sd. These measures of variability restrict to uniform u...
