## Sample size and power for rare events

December 3, 2013

We have written a bit on sample size for common events, we have written about rare events, and we have written about frequentist significance testing. We would like to specialize our sample size analysis to rare events (which allows us to derive a somewhat tighter estimate). In web marketing and a lot of other applications […]

## Maximizing Return from Every Item in the Marketing Research Questionnaire

December 3, 2013

Consumers will not complete long questionnaires, so marketing research must get the most it can from every item.  In this post, we look into the toolbox of R packages and search for statistical models that enable us to learn a great deal about eac...

## On the future of the textbook

December 3, 2013

The latest issue of Technological Innovations in Statistics Education is focused on the future of the textbook. Editor Rob Gould has put together an interesting list of contributions as well as discussions from the leaders in the field of statistics …

## Guest Post: Risk, Insurance and the Actuary

December 3, 2013

Risk is an inherent part of our daily life. As a result, most of us take out insurance policies as a means of protection against scenarios which, were they to occur, may cause hardship whether …

## Objects of the class “Lawrence Summers”: Arne Duncan edition

December 3, 2013

We have a new “Objects of the class,” and it’s a good one! Here’s what happened. I came across a thoughtful discussion by Mark Palko of how it was that Secretary of Education Arne Duncan ticked off so many people with his recent remarks about “white suburban moms”: To understand why Duncan hit such a […]

## Preview of book Data Mining Applications with R

December 3, 2013

An edited book titled Data Mining Applications with R will be on the market soon, featuring 15 real-world applications of data mining with R. A preview of the book is available on Google Books. R code, data and color figures …

## R in Insurance Conference, London, 14 July 2014

December 3, 2013

Following the very positive feedback that Andreas and I have received from delegates of the first R in Insurance conference in July of this year, we are planning to repeat the event next year. We have already reserved a bigger auditorium. The second co...

## Immersion Reveals How People are Connected via Email

December 2, 2013

Immersion [mit.edu] is a quite revealing visualization tool of which the NSA - or your own national security agency - can only be jealous... Developed by MIT students Daniel Smilkov, Deepak Jagdish and César Hidalgo, Immersion generates a time-va...

## Does a professor’s intervention in online discussions have the effect of prolonging discussion or cutting it off?

December 2, 2013

Usually I don’t post answers to questions right away, but Mark Liberman was kind enough to answer my question yesterday so I think I should reciprocate. Mark asks: I’ve been playing around with data from Coursera transaction logs, for an economics course and a modern poetry course so far. For the Modern Poetry course, where […]

## Newest release of BCEA

December 2, 2013

Very shortly, I'll upload the newest release of BCEA, my R package to post-process the output of a (Bayesian) health economic model and produce systematic summaries (such as graphs and tables) for a full economic evaluation and probabilistic sensitivit...

## Shaping up Laplace Approximation using Importance Sampling

December 2, 2013

In the last post I showed how to use Laplace approximation to quickly (but dirtily) approximate the posterior distribution of a Bayesian model coded in R. This is just a short follow-up where I show how to use importance sampling as an easy method to...

## Speeding up model bootstrapping in GNU R

December 2, 2013

After my last post I have repeatedly received two questions: (a) is it worthwhile to analyze GNU R speed in simulations and (b) how would simulation speed compare between GNU R and Python. In this post I want to address the former question and next ti...

## Solving Big Data’s big skills shortage – The Conversation

December 2, 2013

From: http://theconversation.com/solving-big-datas-big-skills-shortage-20352

The skills required to tap Big Data include statistics, mathematics, computer science and engineering. According to analyst firm Gartner, Big Data is at the por...

## Academics should not feel guilty for maximizing their potential by leaving their homeland

December 2, 2013

In a New York Times op-ed titled “Migration Hurts the Homeland,” Paul Collier tells us that what’s good for migrants from poor places is not always good for the countries they’re leaving behind. He makes the argument that those that …

## Should personal genetic testing be regulated? Battle of the blogroll

December 2, 2013

On the side of less regulation is Alex Tabarrok in “Our DNA, Our Selves”: At the same time that the NSA is secretly and illegally obtaining information about Americans the FDA is making it illegal for Americans to obtain information about themselves. In a warning letter the FDA has told Anne Wojcicki, The Most Daring […]

## Beyond the obvious

December 2, 2013

Flowing Data has been doing some fine work on the baby names data. The names voyager is a successful project by Martin Wattenberg that has received praise from many corners. It's one of these projects that have taken on a...

## The e-Writing Jungle Part 3: Web-Based e-books Using Python / Sphinx

December 2, 2013

In the previous Parts 1 and 2, I essentially dealt with two extremes: (1) LaTeX to pdf to web, and (2) raw HTML (however arrived at) with math rendered by MathJax. Now let's look at something of a middle ground: the Python package, Sphinx, for producin...

## Write a matrix in the "long form"

December 2, 2013

If you write an n x p matrix from PROC IML to a SAS data set, you'll get a data set with n rows and p columns. For some applications, it is more convenient to write the matrix in a "long format" with np observations and three columns. The first [...]

## Probabilities and P-Values

December 2, 2013

P-values seem to be the bane of a statistician’s existence. I’ve seen situations where entire narratives are written without p-values and provide only the effects. A p-value can also be used as a data reduction tool, but ultimately it reduces the world to a binary system: yes/no, accept/reject. Not only that, but the binary threshold is […]

## Evaluating Quandl Data Quality – part II

December 2, 2013

This post is a more in-depth analysis of Quandl futures data vs. Bloomberg data. Since my last post, Quandl has updated its futures database to 200+ contracts from 68 contracts originally. For practical reasons, I limit myself here to the initial list of 60+ contracts. I’m still comparing the “Front Month” contract between the […]

## The Border of Search

December 2, 2013

The original proposition of a web search engine was to help you find the answer to your information need in a page or site on the web: if someone has already solved your problem, let us help you find their...