How do you evaluate the sine of a large number in floating point arithmetic? What does the result even mean? Sine of a trillion Let’s start by finding the sine of a trillion (10^12) using floating point arithmetic. There are a couple ways to think about this. The floating point number t = 1.0e12 can only […]
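As a rough illustration of the question, here is a minimal Python sketch (not taken from the post) that evaluates the sine of 1.0e12 in double precision and checks how far apart neighboring doubles are at that magnitude. The variable names are assumptions made for this example, and `math.ulp` requires Python 3.9 or later.

```python
import math

t = 1.0e12  # a trillion, stored as an IEEE 754 double

# 10^12 is an integer below 2^53, so the double stores it exactly.
is_exact = (int(t) == 10**12)

# Spacing between t and the next representable double above it.
# Since 2^39 < 10^12 < 2^40, the spacing is 2^(39 - 52) = 2^-13.
gap = math.ulp(t)

# The library sine of the stored value.
s = math.sin(t)

print(is_exact, gap, s)
```

The spacing is small compared with the period of the sine function, which is one way to see that the computed value is still meaningful at this size.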

## Niall Ferguson and the perils of playing to your audience

History professor Niall Ferguson had another case of the sillies. Back in 2012, in response to Stephen Marche’s suggestion that Ferguson was serving up political hackery because “he has to please corporations and high-net-worth individuals, the people who can pay 50 to 75K to hear him talk,” I wrote: But I don’t think it’s just […]

## First Look at N-P Methods as Severe Tests: Water plant accident [Exhibit (i) from Excursion 3]

Exhibit (i) N-P Methods as Severe Tests: First Look (Water Plant Accident) There’s been an accident at a water plant where our ship is docked, and the cooling system had to be repaired. It is meant to ensure that the mean temperature of discharged water stays below the temperature that threatens the ecosystem, perhaps not […]

## “Statistical insights into public opinion and politics” (my talk for the Columbia Data Science Society this Wed 9pm)

7pm in Fayerweather 310: Why is it more rational to vote than to answer surveys (but it used to be the other way around)? How does this explain why we should stop overreacting to swings in the polls? How does modern polling work? What are the factors that predict election outcomes? What’s good and bad […]

## Six degrees of Kevin Bacon, Paul Erdos, and Wikipedia

I just discovered the web site Six Degrees of Wikipedia. It lets you enter two topics and it will show you how few hops it can take to get from one to the other. Since the mathematical equivalent of Six Degrees of Kevin Bacon is Six degrees of Paul Erdős, I tried looking for the […]

## Mersenne prime trend

Mersenne primes have the form 2^p − 1 where p is a prime. The graph below plots the trend in the size of these numbers based on the 51 Mersenne primes currently known. The vertical axis plots the exponents p, which are essentially the logs base 2 of the Mersenne primes. The scale is logarithmic, so […]
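To make the definition concrete, the following Python sketch checks small exponents with the Lucas-Lehmer test and confirms that the bit length of 2^p − 1 equals p, which is why the exponent is essentially the base-2 log. The Lucas-Lehmer test is a standard technique brought in for this example, not something the excerpt describes, and `is_mersenne_prime` is a hypothetical helper name.

```python
def is_mersenne_prime(p: int) -> bool:
    """Lucas-Lehmer test: decide whether 2^p - 1 is prime.

    Assumes p itself is prime; the test is only valid in that case.
    """
    if p == 2:
        return True  # 2^2 - 1 = 3, handled separately
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
exponents = [p for p in small_primes if is_mersenne_prime(p)]

# 2^p - 1 is p ones in binary, so its bit length is exactly p.
bit_lengths = [((1 << p) - 1).bit_length() for p in exponents]

print(exponents)
print(bit_lengths == exponents)
```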

## Bayes, statistics, and reproducibility: “Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data.”

This is an abstract I wrote for a talk I didn’t end up giving. (The conference conflicted with something else I had to do that week.) But I thought it might interest some of you, so here it is: Bayes, statistics, and reproducibility The two central ideas in the foundations of statistics—Bayesian inference and frequentist […]

## Very Non-Standard Calling in R

Our group has done a lot of work with non-standard calling conventions in R. Our tools work hard to eliminate non-standard calling (as is the purpose of wrapr::let()), or at least make it cleaner and more controllable (as is done in the wrapr dot pipe). And even so, we still get surprised by some of …

## Spherical trig, Research Triangle, and Mathematica

This post will look at the triangle behind North Carolina’s Research Triangle using Mathematica’s geographic functions. Spherical triangles A spherical triangle is a triangle drawn on the surface of a sphere. It has three vertices, given by points on the sphere, and three sides. The sides of the triangle are portions of great circles running […]
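The post itself uses Mathematica; as a language-neutral sketch of the definition, the Python below computes the sides of a spherical triangle as central angles along great circles, using the haversine formula. The three vertices are a made-up example (the north pole and two equatorial points), not the Research Triangle cities, and `central_angle` is a hypothetical helper name.

```python
import math


def central_angle(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Angle in radians between two points on a sphere, given in degrees.

    Uses the haversine formula; multiply by the sphere's radius
    to get the great-circle distance.
    """
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * math.asin(min(1.0, math.sqrt(h)))


# Hypothetical vertices: north pole, and two points on the equator 90 degrees apart.
vertices = [(90.0, 0.0), (0.0, 0.0), (0.0, 90.0)]

# Each side is the arc between consecutive vertices.
sides = [
    central_angle(*vertices[i], *vertices[(i + 1) % 3])
    for i in range(3)
]

print(sides)  # each side is a quarter of a great circle
```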

## My talk tomorrow (Tues) noon at the Princeton University Psychology Department

Integrating collection, analysis, and interpretation of data in social and behavioral research Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University The replication crisis has made us increasingly aware of the flaws of conventional statistical reasoning based on hypothesis testing. The problem is not just a technical issue with p-values, nor can […]

## In which I demonstrate my ignorance of world literature

Fred Buchanan, a student at Saint Anselm’s Abbey School, writes: I’m writing a paper on the influence of Jorge Luis Borges in academia, in particular his work “The Garden of Forking Paths”. I noticed that a large number of papers from a wide array of academic fields include references to this work. Your paper, “The […]

## StanCon 2018 Helsinki talk slides, notebooks and code online

StanCon 2018 Helsinki talk slides, notebooks and code have been available for some time in the StanCon talks repository, but it seems we forgot to announce this. The StanCon 2018 Helsinki talk list also includes links to videos. StanCon’s version of conference proceedings is a collection of contributed talks based on interactive notebooks. Every submission is peer reviewed […]

## The p-value is 4.76×10^−264

Jerrod Anderson points us to Table 1 of this paper: It seems that the null hypothesis that this particular group of men and this particular group of women are random samples from the same population, is false. Good to know. For a moment there I was worried. On the plus side, as Anderson notes, the […]

## Neyman-Pearson Tests: An Episode in Anglo-Polish Collaboration: Excerpt from Excursion 1 (3.2)

3.2 N-P Tests: An Episode in Anglo-Polish Collaboration* We proceed by setting up a specific hypothesis to test, H0 in Neyman’s and my terminology, the null hypothesis in R. A. Fisher’s . . . in choosing the test, we take into account alternatives to H0 which we believe possible or at any rate consider it most important to be […]

## Visualizing data breaches

The image below is a static screen shot of an interactive visualization of the world’s biggest data breaches. The site lets you filter the data by industry and type of breach. See the site for credits and the raw data.