## Latent Class Modeling Election Data

June 14, 2013
Latent class analysis is a useful tool for identifying groups within multivariate categorical data. Responses on a Likert scale are one example of such data. In categorical language these groups are known as latent classes. As a simple comparison, it is similar to k-means multivariate cluster analysis. There are several key differences between the […]

## R: Interval Estimation of the Population Mean

June 14, 2013
Interval estimation of the population mean can be computed with functions from the following R packages:

- stats - contains the t.test
- TeachingDemos - contains the z.test
- BSDA - contains the zsum.test and tsum.test

The t.test of the stats package is a stud...

## Stephen Ziliak Rejects Significance Testing

June 14, 2013
In an opinion piece in the Financial Post, Stephen Ziliak goes into the land of hyperbole, declaring that all significance testing is junk science. It starts like this: I want to believe as much as the next person that particle physicists have discovered a Higgs boson, the so-called “God particle,” one with a mass of …

## Big in Japan

June 13, 2013
Inspired by this post on R-bloggers, I decided to check how BCEA was doing. Unfortunately, it does not feature in the top 100 most downloaded R packages. However, I think it's doing well, considering the book (which is the main medium of advertising...

## Chicago, Baseball and Paul Erdös

June 13, 2013
Thursday afternoon, before the 2013 CAE Faculty Conference, Stuart Klugman should invite us to go and watch the Cubs play in Chicago. That should be fun. First baseball game, ever. I will be back in Montréal (and on the blog) next week! That will be an opportunity to talk with mathematicians and baseball fans. Actually, a colleague told me that there was a nice anecdote about baseball and mathematics.…

## Ages 10-12 Toy Exoplanet Detection

A major objection to the previous simulated light curves is that the baseline is rarely constant. Instead, from what I have learned, it is a horrible mess of discontinuities and curves due to the telescope rotating and instruments heating up. I spoke to someone who said that there is some periodicity in the curve. It […] The post Ages 10-12 Toy Exoplanet Detection appeared first on Lindons Log.

## Against the myth of the heroic visualization

June 13, 2013
Alberto Cairo tells a fascinating story about John Snow, H. W. Acland, and the Mythmaking Problem: Every human community—nations, ethnic and cultural groups, professional guilds—inevitably raises a few of its members to the status of heroes and weaves myths around them. . . . The visual display of information is no stranger to heroes and [...]The post Against the myth of the heroic visualization appeared first on Statistical Modeling, Causal…

## False discovery rate regression (cc NSA’s PRISM)

June 13, 2013
There is an idea I have been thinking about for a while now. It re-emerged at the top of my list after seeing this really awesome post on using metadata to identify "conspirators" in the American revolution. My first thought was: …

## When’s that next gamma-ray blast gonna come, already?

June 13, 2013
Phil Plait writes: Earth May Have Been Hit by a Cosmic Blast 1200 Years Ago . . . this is nothing to panic about. If it happened at all, it was a long time ago, and unlikely to happen again for hundreds of thousands of years. This left me confused. If it really did happen [...] The post When’s that next gamma-ray blast gonna come, already? appeared first on Statistical Modeling,…

## Le Monde puzzle [#824]

June 13, 2013
A rather dull puzzle this week: Show that, for any integer y, $(\sqrt{3}-1)^{2y}+(\sqrt{3}+1)^{2y}$ is an integer multiple of a power of two. I just have to apply Newton’s binomial theorem to obtain the result. What’s the point?! Filed under: Books, Kids...
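The puzzle's claim can be checked numerically without floating point. The sketch below is only an illustration, not the post's own argument: it stores a number of the form a + b√3 as an integer pair (a, b), raises √3 - 1 and √3 + 1 to the power 2y for a nonnegative integer y, and confirms that the √3 parts cancel and the remaining integer is divisible by 2^(y+1). The helper names `mul`, `power`, and `puzzle_sum` are hypothetical.

```python
def mul(p, q):
    """Multiply two numbers a + b*sqrt(3), each stored as an integer pair (a, b)."""
    a, b = p
    c, d = q
    # (a + b*sqrt(3)) * (c + d*sqrt(3)) = (ac + 3bd) + (ad + bc)*sqrt(3)
    return (a * c + 3 * b * d, a * d + b * c)


def power(p, n):
    """Raise a + b*sqrt(3) to a nonnegative integer power n by repeated multiplication."""
    result = (1, 0)  # the number 1
    for _ in range(n):
        result = mul(result, p)
    return result


def puzzle_sum(y):
    """Return (sqrt(3) - 1)^(2y) + (sqrt(3) + 1)^(2y) as an integer pair (a, b)."""
    minus = power((-1, 1), 2 * y)  # (sqrt(3) - 1)^(2y)
    plus = power((1, 1), 2 * y)    # (sqrt(3) + 1)^(2y)
    return (minus[0] + plus[0], minus[1] + plus[1])


for y in range(6):
    integer_part, sqrt3_part = puzzle_sum(y)
    assert sqrt3_part == 0                    # the sum is an integer
    assert integer_part % 2 ** (y + 1) == 0   # and a multiple of 2^(y+1)
    print(y, integer_part, integer_part // 2 ** (y + 1))
```

The divisibility follows from (√3 ± 1)² = 2(2 ± √3): the sum equals 2^y times (2 - √3)^y + (2 + √3)^y, and the binomial expansion of that bracket keeps only even powers of √3, each appearing twice.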

## Review: Chabris, Simons, The Invisible Gorilla

June 13, 2013
Inattentional and change blindness are two fascinating phenomena that more people should be aware of. The Invisible Gorilla describes them as well as some other interesting and surprising psychological research. This book has been out for over three years, and I’ve been meaning to write a review forever. What brought it back to my attention is a recent news story on the safety implications of voice-controlled systems in cars. Just…

## Twitter Twitter on the Web, Who is the Most Popular of All? Interactively Determining Popularity of Two Entities on Twitter

June 12, 2013
UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1jKxfDu . PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. Code updated based on feedback (s...

## The Reorderable Data Matrix and the Promise of Pattern Discovery

June 12, 2013
We typically start with the data matrix, a rectangular array of rows and columns.  If we type its name on the R command line, it will show itself.  But the data matrix is hard to read, even when there are not many rows or columns.  The h...

## Toy Exoplanet Change Point Light Curve Transit Detection

I’ve been attending an exoplanet data conference this week, a gathering of astrophysicists and statisticians. One way to look for exoplanets is by the “Transit” method. Basically, a dip in the flux from a star is observed as an orbiting planet passes across the line of sight between the observer and the star. There was […] The post Toy Exoplanet Change Point Light Curve Transit Detection appeared first on Lindons Log.

## Peter Thiel is writing another book!

June 12, 2013
Tyler Cowen links: “I’m writing this book because we need to think about the future for more than just 140 characters or 15 minutes at a time if we want to make real long-term progress,” Mr. Thiel said in a statement. “’Zero to One’ is about learning from Silicon Valley how to solve hard problems [...] The post Peter Thiel is writing another book! appeared first on Statistical Modeling, Causal Inference,…

## Personalized medicine is primarily a population-health intervention

June 12, 2013
There has been a lot of discussion of personalized medicine, individualized health, and precision medicine in the news and in the medical research community. Despite this recent attention, it is clear that healthcare has always been personalized to some extent. For …

## An Introduction to Importance Sampling

Importance Sampling is a Monte Carlo integration technique for getting (very accurate) approximations to integrals. Consider the integral $\int_{-\infty}^{\infty} e^{-x^2/2}\,dx$ and suppose we wish to approximate this without doing any calculus. Statistically speaking, we want to compute the normalizing constant for a standard normal, which we know to be $\sqrt{2\pi}$. We can rewrite the above integral as […] because […] The post An Introduction to Importance Sampling appeared first on Lindons Log.
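As a rough illustration of the idea in this excerpt, the sketch below estimates the normalizing constant of a standard normal, that is, the integral of exp(-x²/2) over the real line, whose exact value is √(2π). The excerpt is truncated before it names a proposal distribution, so the standard Cauchy proposal, the function name `importance_sampling_estimate`, and the sample size are all assumptions made for this example rather than the original post's choices.

```python
import math
import random


def importance_sampling_estimate(n_samples, seed=0):
    """Estimate the integral of exp(-x^2 / 2) over the real line.

    Assumed proposal: standard Cauchy, q(x) = 1 / (pi * (1 + x^2)).
    The estimate is the average of target(x) / q(x) over draws x ~ q.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    total = 0.0
    for _ in range(n_samples):
        u = rng.random()
        x = math.tan(math.pi * (u - 0.5))           # Cauchy draw via inverse CDF
        target = math.exp(-0.5 * x * x)             # unnormalized normal density
        proposal = 1.0 / (math.pi * (1.0 + x * x))  # Cauchy density at x
        total += target / proposal                  # importance weight
    return total / n_samples


estimate = importance_sampling_estimate(200_000)
print("estimate:", estimate)
print("exact:   ", math.sqrt(2.0 * math.pi))
```

A heavy-tailed proposal such as the Cauchy keeps the importance weights bounded here, which is one common reason to prefer it over a proposal with lighter tails than the target.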

## How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

June 12, 2013
Jonathan Robinson writes: I’m a survey researcher who mostly does political work, but I also have a strong interest in economics. I have a question about this graph you commonly see in the economics literature. It is of a concept called the Beveridge Curve [recently in the newspaper here]. It is one of the more [...] The post How to best graph the Beveridge curve, relating the vacancy rate in jobs…

## De-noising data

June 12, 2013
One of the most important steps in analyzing data is to remove noise. First, we have to identify where the noise is, then we find ways to reduce the noise, which has the effect of surfacing the signal. The labor...

## Happy Birthday Normal Deviate

June 12, 2013
Today is the one-year anniversary of this blog. First of all, thanks to all the readers. And special thanks to commenters and guest posters. This seems like a good time to assess whether I have achieved my goals for the blog and to get suggestions on how I might proceed in year two. GOALS. …

## How to interpret a residual-fit spread plot

June 12, 2013
In a previous blog post, I described how to use a spread plot to compare the distributions of several variables. Each spread plot is a graph of centered data values plotted against the estimated cumulative probability. Thus, spread plots are similar to a (rotated) plot of the empirical cumulative distribution [...]