On deck this week

April 25, 2016

Mon: I owe it all to my Neanderthal genes
Tues: If Yogi Berra could see this one, he’d spin in his grave: Regression modeling using a convenience sample
Wed: 64 Shades of Gray: The subtle effect of chessboard images on foreign policy polarization
Thurs: Integrating graphs into your workflow
Fri: Gary Venter’s age-period-cohort decomposition of […]

The post On deck this week appeared first on Statistical Modeling, Causal Inference, and…

Read more »

Matrix computations at SAS Global Forum 2016

April 25, 2016

Last week I attended SAS Global Forum 2016 in Las Vegas. I and more than 5,000 other attendees discussed and shared tips about data analysis and statistics. Naturally, I attended many presentations that featured using SAS/IML software to implement advanced analytical algorithms. Several speakers showed impressive mastery of SAS/IML programming […] The post Matrix computations at SAS Global Forum 2016 appeared first on The DO Loop.

Read more »

Spreadsheet Thinking vs. Database Thinking

April 25, 2016

The shape of a dataset is hugely important to how well it can be handled by different software. The shape defines how it is laid out: wide as in a spreadsheet, or long as in a database table. Each has its use, but it’s important to understand their differences and when each is the right choice. Wide … Continue reading Spreadsheet Thinking vs. Database Thinking
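
The wide-versus-long distinction can be sketched in a few lines of Python. This is only an illustration, not code from the post: the helper `wide_to_long`, the column names, and the city figures are all hypothetical. It takes a wide table (one row per entity, one column per year) and produces the long form (one row per observation).

```python
def wide_to_long(rows, id_key):
    """Turn wide records (one row per entity) into long records
    (one row per entity/variable pair)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue  # the identifier is repeated on every long row instead
            long_rows.append({id_key: row[id_key], "variable": column, "value": value})
    return long_rows


# Hypothetical wide table: one row per city, one column per year.
wide = [
    {"city": "Oslo", "2014": 10, "2015": 12},
    {"city": "Lima", "2014": 7, "2015": 9},
]

long = wide_to_long(wide, id_key="city")
for record in long:
    print(record)
```

Two wide rows with two year columns each become four long rows. Adding a new year to the wide table means adding a column, while in the long table it only means adding rows.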

Read more »

The Distribution of Global Economic Activity…

April 24, 2016

... as proxied by the global distribution of nighttime lights (from a fascinating new paper by Henderson et al.). Like many good graphics, this one repays careful study. You'll see lots of places where the lights match your prior, but you'l...

Read more »

Risk aversion is a two-way street

April 24, 2016

“Risk aversion” comes up a lot in microeconomics, but I think that it’s too broad a concept to do much for us. In many cases, it seems to me that, when there is a decision option, either behavior X or behavior not-X can be thought of as risk averse, depending on the framing. Thus, when […] The post Risk aversion is a two-way street appeared first on Statistical Modeling, Causal…

Read more »

What is the “true prior distribution”? A hard-nosed answer.

April 23, 2016

The traditional answer is that the prior distribution represents your state of knowledge, that there is no “true” prior. Or, conversely, that the true prior is an expression of your beliefs, so that different statisticians can have different true priors. Or even that any prior is true by definition, in representing a subjective state of […] The post What is the “true prior distribution”? A hard-nosed answer. appeared first on…

Read more »

Stochastic natural-gradient EP

April 22, 2016

Yee Whye Teh sends along this paper with Leonard Hasenclever, Thibaut Lienart, Sebastian Vollmer, Stefan Webb, Balaji Lakshminarayanan, and Charles Blundell. I haven’t read it in detail but they note similarities to our “expectation propaga...

Read more »

Sharp-R Time Series

April 22, 2016

This is the second part in a series of posts on the flexibility of our Excel add-in Sharp-R, which allows functions defined in R code to be run on data in any Excel worksheet. Part one looked at exploratory data analysis. This post deals with time serie...

Read more »

The causal inference competition you’ve all been waiting for!

April 21, 2016

Jennifer Hill announces “the first-ever ACIC causal inference data analysis competition”: Is your SATT where it’s at? Participate by submitting treatment effect estimates across a range of datasets OR by submitting a function (in any of a variety of programming languages) that will take input (covariate, treatment assignment, and response) and generate a treatment effect […] The post The causal inference competition you’ve all been waiting for! appeared first on…

Read more »

Introducing fidlr: FInancial Data LoadeR

April 21, 2016

fidlr is an RStudio addin designed to simplify the financial data downloading process from various providers. This initial version is a wrapper around the getSymbols function in the quantmod package and only Yahoo, Google, FRED and Oanda are supported. I will probably add functionalities over time. As usual with those things just a kind reminder: “THE SOFTWARE […]

Read more »

Principal curves example (Elements of Statistical Learning)

April 21, 2016

The bit of R code below illustrates the principal curves methods as described in The Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman (Ch. 14; the book is freely available from the authors' website). Specifically, the code generates some bivariate data that have a nonlinear association, initializes the principal curve using the first (linear) principal … Continue reading Principal curves example (Elements of Statistical Learning) →

Read more »

A new idea for a science core course based entirely on computer simulation

April 21, 2016

I happened to come across this post from 2011 that I like so much, I thought I’d say it again: Columbia College has for many years had a Core Curriculum, in which students read classics such as Plato (in translation) etc. A few years ago they created a Science core course. There was always some […] The post A new idea for a science core course based entirely on computer…

Read more »

Workshop on Infectious Disease Modelling in Public Health Policy: Current status and challenges (again)

April 21, 2016

We've had a fantastic response to the workshop. In just a few days of public advertisement (I first posted about it and then advertised on allstat and HEALTHECON-ALL) we got 65 registrations, as of today. We have provisionally set out 100 "t...

Read more »

Write papers like a modern scientist (use Overleaf or Google Docs + Paperpile)

April 21, 2016

Editor’s note - This is a chapter from my book How to be a modern scientist where I talk about some of the tools and techniques that scientists have available to them now that they didn’t before. Writing - what should I do and why? Write using co...

Read more »

an integer programming riddle

April 20, 2016

A puzzle on The Riddler this week ends up as a standard integer programming problem. Removing the little story around the question, it boils down to optimising 200a+100b+50c+25d under the constraints 400a+400b+150c+50d≤1000, b≤a, a≤1, c≤8, d≤4, and (a,b,c,d) all non-negative integers. My first attempt was a brute force R code since there are only […]
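
The search space is small enough to enumerate in full. The sketch below is a Python illustration rather than the R code mentioned in the post, and it assumes that "optimise" means maximise (the excerpt does not say). The function name `solve` is hypothetical.

```python
from itertools import product


def solve():
    """Enumerate every feasible (a, b, c, d) and keep the best objective value."""
    best_value, best_point = None, None
    # Bounds from the constraints: a <= 1, b <= a <= 1, c <= 8, d <= 4.
    for a, b, c, d in product(range(2), range(2), range(9), range(5)):
        if b > a:
            continue  # constraint b <= a
        if 400 * a + 400 * b + 150 * c + 50 * d > 1000:
            continue  # budget-style constraint
        value = 200 * a + 100 * b + 50 * c + 25 * d
        if best_value is None or value > best_value:
            best_value, best_point = value, (a, b, c, d)
    return best_value, best_point


best_value, best_point = solve()
print(best_value, best_point)
```

Under the maximisation assumption, only 2 × 2 × 9 × 5 = 180 candidate points are checked, and the search returns 425 at (a, b, c, d) = (1, 0, 3, 3).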

Read more »

Course Announcements: Statistical Network Models, Fall 2016

April 20, 2016

Attention conservation notice: Self-promotion, and irrelevant unless you (1) will be a student at Carnegie Mellon in the fall, or (2) have a morbid curiosity about a field in which the realities of social life are first caricatured into an impoverishe...

Read more »

In memoriam Prita Shireen Kumarappa Shalizi

April 20, 2016

4 September 1918 -- 4 April 2016

Read more »

“Cancer Research Is Broken”

April 20, 2016

Michael Oakes pointed me to this excellent news article by Daniel Engber, subtitled, “There’s a replication crisis in biomedicine—and no one even knows how deep it runs.” Engber suggests that the replication problem in biomedical research is worse than the much-publicized replication problem in psychology. One reason, which I didn’t see Engber discussing, is financial […] The post “Cancer Research Is Broken” appeared first on Statistical Modeling, Causal Inference, and…

Read more »

Visualize missing data in SAS

April 20, 2016

You can visualize missing data. It sounds like an oxymoron, but it is true. How can you draw graphs of something that is missing? In a previous article, I showed how you can use PROC MI in SAS/STAT software to create a table that shows patterns of missing data in […] The post Visualize missing data in SAS appeared first on The DO Loop.

Read more »

As a data analyst the best data repositories are the ones with the least features

April 20, 2016

Lately, for a range of projects I have been working on I have needed to obtain data from previous publications. There is a growing list of data repositories where data is made available. General purpose data sharing sites include: The open science ...

Read more »

Le Monde puzzle [#959]

April 19, 2016

Another of those arithmetic Le Monde mathematical puzzles: Find an integer A such that A is the sum of the squares of its four smallest divisors (including 1) and an integer B such that B is the sum of the third powers of its four smallest factors. Are there such integers for higher powers? This begs […]
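
The first part of the puzzle lends itself to a direct search. The following is a minimal Python sketch, not the code from the post: the function names and the search limit of 1000 are assumptions chosen for illustration, and the exponent is left as a parameter so that the same search can be tried for higher powers.

```python
def four_smallest_divisors(n):
    """Return up to four of the smallest positive divisors of n, starting with 1."""
    divisors = []
    d = 1
    while d <= n and len(divisors) < 4:
        if n % d == 0:
            divisors.append(d)
        d += 1
    return divisors


def matches(limit, power):
    """Integers up to `limit` equal to the sum of the `power`-th powers
    of their four smallest divisors."""
    found = []
    for n in range(1, limit + 1):
        divs = four_smallest_divisors(n)
        if len(divs) == 4 and sum(x ** power for x in divs) == n:
            found.append(n)
    return found


squares = matches(1000, 2)
print(squares)
```

For squares and this limit the search finds 130, whose four smallest divisors are 1, 2, 5 and 10: 1 + 4 + 25 + 100 = 130. Calling `matches` with `power=3` or higher runs the same check for the other parts of the question.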

Read more »

Notes from 2nd Bayesian Mixer Meetup

April 19, 2016

Last Friday the 2nd Bayesian Mixer Meetup (@BayesianMixer) took place at Cass Business School, thanks to Pietro Millossovich and Andreas Tsanakas, who helped to organise the event. First up was Davide De March talking about the cha...

Read more »

R’s Growth Continues to Accelerate

April 19, 2016

Each year I update the growth in R’s capability on The Popularity of Data Analysis Software. And each year, I think R’s incredible rate of growth will finally slow down. Below is a graph of the latest data, and as … Continue reading →

Read more »

