## Le Monde puzzle [#1037]

January 23, 2018
A purely geometric Le Monde mathematical puzzle this (or two independent ones, rather): Find whether or not there are inscribed and circumscribed circles to a convex polygon with 2018 sides of lengths ranging 1,2,…,2018. In the first (or rather second) case, the circle of radius R that is tangential to the polygon and going through […]

## To better enable others to avoid being misled when trying to learn from observations, I promise not to be transparent, open, sincere nor honest?

January 23, 2018
I recently read a paper by Stephen John with the title “Epistemic trust and the ethics of science communication: against transparency, openness, sincerity and honesty”. On a superficial level, John’s paper can be restated as: honesty (transparency, openness and sincerity) is not always the best policy. For instance, “publicising the inner workings of sausage factories does […]

## Books to Read While the Algae Grow in Your Fur, May 2016

January 23, 2018
Attention conservation notice: I have no taste. Amitav Ghosh, Sea of Poppies, River of Smoke and Flood of Fire Collectively, "the Ibis trilogy", three historical novels centered around the First Opium War. They're beautifully written and the viewpo...

## State-space modeling for poll aggregation . . . in Stan!

January 23, 2018
Peter Ellis writes: As part of familiarising myself with the Stan probabilistic programming language, I replicate Simon Jackman’s state space modelling with house effects of the 2007 Australian federal election. . . . It’s not quite the model that I’d use; indeed, Ellis writes, “I’m fairly new to Stan and I’m pretty sure my Stan programs […]

## Two nice examples of interactivity

January 23, 2018
Kaiser Fung, founder of Junk Charts and Principal Analytics Prep, finds two examples of interactive graphics that enhance the reader's experience.

## Hey—here’s the title of my talk for this year’s New York R conference

January 23, 2018
“Toward a Fuller Integration of Graphics in Statistical Analysis.” The talk will be on 20 Apr 2018 at 1:25pm. And here are some things to read ahead of time, if you’re interested: [2003] A Bayesian formulation of exploratory data analysis and goodness-of-fit testing. *International Statistical Review* **71**, 369–382. [2004] Exploratory data analysis for complex […]

## Comonads for scientific and statistical computing in Scala

January 22, 2018
In a previous post I’ve given a brief introduction to monads in Scala, aimed at people interested in scientific and statistical computing. Monads are a concept from category theory which turn out to be exceptionally useful for solving many problems in functional programming. But most categorical concepts have a dual, usually prefixed with “co”, …

## Big Data Needs Big Model

January 22, 2018
Big Data are messy data, available data not random samples, observational data not experiments, available data not measurements of underlying constructs of interest. To make relevant inferences from big data, we need to extrapolate from sample to population, from control to treatment group, and from measurements to latent variables. All these steps require modeling. At […]

## The new airline re-booking policy

January 22, 2018
Kaiser Fung, author of Numbersense and founder of Principal Analytics Prep, discusses the consequences of using algorithms to make decisions by looking at the shift by airlines in rebooking stranded passengers to alternative flights.

## Create lists by using a natural syntax in SAS/IML

January 22, 2018
SAS/IML 14.3 (SAS 9.4M5) introduced a new syntax for creating lists and for assigning and extracting items in a list. Lists (introduced in SAS/IML 14.2) are data structures that are convenient for holding heterogeneous data. A single list can hold character matrices, numeric matrices, scalar values, and other lists, as [...]

## Some datasets for teaching data science

January 22, 2018
In this post I describe the dslabs package, which contains some datasets that I use in my data science courses. A much discussed topic in stats education is that computing should play a more prominent role in the curriculum. I strongly agree, but I thi...

## Averaging for Prediction in Econometrics and ML

January 21, 2018
Random thought. At the risk of belaboring the obvious, it's interesting to heighten collective awareness by thinking about the many appearances of averaging in forecasting, particularly in forecast combination. Some averages are weighted, and...

## PhD studentships @ UCL

January 21, 2018
My department at UCL has been allocated 1 EPSRC Doctoral Training Partnership (DTP) award for 2018/19. The award will be 4 years in duration (or 6 years for part-time candidates), covering UK/EU fees, minimum RCUK stipend and a small allowance for cons...

## How smartly.io productized Bayesian revenue estimation with Stan

January 21, 2018
Markus Ojala writes: Bayesian modeling is becoming mainstream in many application areas. Applying it needs still a lot of knowledge about distributions and modeling techniques but the recent development in probabilistic programming languages have made it much more tractable. Stan is a promising language that suits single analysis cases well. With the improvements in approximation […]

## How to get a sense of Type M and type S errors in neonatology, where trials are often very small? Try fake-data simulation!

January 20, 2018
Tim Disher read my paper with John Carlin, “Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors,” and followed up with a question: I am a doctoral student conducting research within the field of neonatology, where trials are often very small, and I have long suspected that many intervention effects are potentially […]
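
To give a feel for what a fake-data simulation of Type S and Type M errors might look like, here is a minimal Python sketch. It is not the code from the paper or from the post. It assumes the estimate is normally distributed around a true effect with a known standard error, and the function name `type_s_m` and the numbers 0.1 and 0.5 are hypothetical, chosen only to mimic a small, noisy trial.

```python
import random
import statistics


def type_s_m(true_effect, se, n_sims=20000, z_crit=1.96, seed=1):
    """Simulate estimates ~ Normal(true_effect, se) and summarize the
    ones that would be declared statistically significant.

    Returns (power, type_s, type_m):
      power  - share of simulated estimates with |estimate| > z_crit * se
      type_s - share of significant estimates with the wrong sign
      type_m - mean |significant estimate| divided by |true_effect|
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_sims)]
    significant = [e for e in estimates if abs(e) > z_crit * se]
    power = len(significant) / n_sims
    if not significant:
        # Nothing reached significance, so the conditional summaries are undefined.
        return power, float("nan"), float("nan")
    wrong_sign = sum(1 for e in significant if e * true_effect < 0)
    type_s = wrong_sign / len(significant)
    type_m = statistics.fmean(abs(e) for e in significant) / abs(true_effect)
    return power, type_s, type_m


# Hypothetical small trial: true effect 0.1, standard error 0.5.
power, type_s, type_m = type_s_m(true_effect=0.1, se=0.5)
print(f"power={power:.3f}  type S={type_s:.3f}  type M={type_m:.1f}")
```

When the standard error is large relative to the true effect, any estimate that clears the significance threshold has to be far from the truth, so the exaggeration ratio is large by construction.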

## Visualizing effects of a categorical explanatory variable in a regression

January 20, 2018
Recently, I’ve been working on two problems that might be related to semiotic issues in predictive modeling (i.e. instead of a standard regression table, how can we plot coefficient values in a regression model). To be more specific, I have a variable of interest that is observed for several individuals , with explanatory variables , year , in a specific region…

## The Trumpets of Lilliput

January 19, 2018
Gur Huberman pointed me to this paper by George Akerlof and Pascal Michaillat that gives an institutional model for the persistence of false belief. The article begins: This paper develops a theory of promotion based on evaluations by the already promoted. The already promoted show some favoritism toward candidates for promotion with similar beliefs, just […]

## A lesson from the Charles Armstrong plagiarism scandal: Separation of the judicial and the executive functions

January 19, 2018
[updated link] Charles Armstrong is a history professor at Columbia University who, so I’ve heard, has plagiarized and faked references for an award-winning book about Korean history. The violations of the rules of scholarship were so bad that the American Historical Association “reviewed the citation issue after being notified by a member of the concerns […]

## 501 days of Summer (school)

January 19, 2018
As I anticipated earlier, we're now ready to open registration for our Summer School in Florence (I was waiting for UCL to set up the registration system and thought it may take much longer than it actually did, so well done UCL!). We'll probabl...

## The difference between me and you is that I’m not on fire

January 19, 2018
“Eat what you are while you’re falling apart and it opened a can of worms. The gun’s in my hand and I know it looks bad, but believe me I’m innocent.” – Mclusky While the next episode of Madam Secretary buffers on terrible hotel internet, I (the other other white meat) thought I’d pop in […]

## Back To The DT Package After Two Years

As I maintain more and more R packages, I find it more and more difficult to look back on all my previous packages. The DT package is one of them. I think readers of my blog and those who are familiar with my work know that 2016 was my bookdown year, and 2017 was the blogdown year. Of course, life…

## We were measuring the speed of Stan incorrectly—it’s faster than we thought in some cases due to antithetical sampling

January 18, 2018
Aki points out that in cases of antithetical sampling, our effective sample size calculations were unduly truncated above at the number of iterations. It turns out the effective sample size can be greater than the number of iterations if the draws are anticorrelated. And all we really care about for speed is effective sample size […]
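
As a rough illustration of how anticorrelated draws can push effective sample size above the number of iterations, here is a small Python sketch. It does not reproduce Stan's estimator. It assumes a stationary AR(1) chain as a stand-in for MCMC draws, and defines effective sample size empirically as the marginal variance divided by the variance of the chain mean across many replicate chains. The names `ar1_chain` and `empirical_ess` are hypothetical.

```python
import random
import statistics


def ar1_chain(n, phi, rng):
    """Stationary AR(1) chain with mean 0 and unit marginal variance."""
    innov_sd = (1.0 - phi * phi) ** 0.5
    x = rng.gauss(0.0, 1.0)
    chain = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0.0, innov_sd)
        chain.append(x)
    return chain


def empirical_ess(n, phi, n_chains=1000, seed=1):
    """Marginal variance (1.0) divided by the variance of the chain mean.

    The variance of the mean is estimated from n_chains replicate chains,
    using the known true mean of 0.
    """
    rng = random.Random(seed)
    means = [statistics.fmean(ar1_chain(n, phi, rng)) for _ in range(n_chains)]
    return 1.0 / statistics.pvariance(means, mu=0.0)


n = 200
for phi in (-0.5, 0.0, 0.5):
    print(f"phi={phi:+.1f}  iterations={n}  empirical ESS ~ {empirical_ess(n, phi):.0f}")
```

For an AR(1) chain the large-sample ratio of effective sample size to iterations is (1 − φ)/(1 + φ), so a negative φ gives a ratio above one, and an estimator capped at the number of iterations would understate it.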

## (What’s So Funny ‘Bout) Evidence, Policy, and Understanding

January 18, 2018
[link] Kevin Lewis asked me what I thought of this article by Oren Cass, “Policy-Based Evidence Making.” That title sounds wrong at first (shouldn’t it be “evidence-based policy making”?), but when you read the article you get the point, which is that Cass argues that so-called evidence-based policy isn’t so evidence-based at all, that what is considered […]