If you know the sine of any angle, you can find its cosine from the Pythagorean theorem. And if you know the sine of an angle you can find the sine of any multiple of that angle using the identity for the sine of a sum. You can find the sine of 30 degrees from […]
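The two steps in this excerpt can be sketched in a few lines of Python. This is only an illustration, not code from the post: the function names are made up here, and the sketch assumes the angle lies between -90 and 90 degrees so that the cosine is the non-negative square root.

```python
import math

def cos_from_sin(s):
    """Cosine from sine via the Pythagorean theorem.

    Assumption: the angle is in [-90, 90] degrees, so the cosine is >= 0.
    """
    return math.sqrt(1.0 - s * s)

def sin_of_multiple(s, k):
    """sin(k * theta) from s = sin(theta), using only the angle-sum identities."""
    c = cos_from_sin(s)
    sin_k, cos_k = 0.0, 1.0  # sin(0) and cos(0)
    for _ in range(k):
        # sin(a + t) = sin a cos t + cos a sin t
        # cos(a + t) = cos a cos t - sin a sin t
        sin_k, cos_k = sin_k * c + cos_k * s, cos_k * c - sin_k * s
    return sin_k
```

For example, `sin_of_multiple(0.5, 3)` starts from the sine of 30 degrees and builds up the sine of 90 degrees by repeated use of the sum identity.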

# Category: Statistics

## Social science plaig update

OK, we got two items for you, one in political science and one in history. Both are updates on cases we’ve discussed in the past on this blog. I have no personal connection to any of the people involved; my only interest is annoyance at the ways in which plagiarism pollutes scientific understanding and the […]

## quote of the year

“J’ai eu une vie assez droite, je ne me suis jamais conduit comme un salaud” [I have lived a rather righteous life, I never behaved like a bastard] Jean-Marie le Pen [condemned for apology of war crimes (1969), contestation of crimes against humanity (2009), Holocaust denial (1988, 2006), antisemitism (1986), racial hatred (2005, 2008), provocation […]

## Golay codes

Suppose you want to send pictures from Jupiter back to Earth. A lot could happen as a bit travels across the solar system, and so you need some way of correcting errors, or at least detecting errors. The simplest thing to do would be to transmit photos twice. If a bit is received the same […]
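The "transmit twice" idea can be sketched as follows. This is a toy illustration of duplication only, not a Golay code, and the function names are hypothetical. Comparing the two copies detects a mismatch but cannot say which copy is correct.

```python
def send_twice(bits):
    """Encode a message by transmitting every bit twice (two full copies)."""
    return list(bits) + list(bits)

def detect_errors(received):
    """Return the positions where the two received copies disagree.

    A disagreement shows that an error occurred, but not which copy is wrong,
    so this scheme detects single-bit errors without correcting them.
    """
    half = len(received) // 2
    first, second = received[:half], received[half:]
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]
```

If the same bit is flipped in both copies, the error goes unnoticed, which is one reason to look for better codes.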

## The real lesson learned from those academic hoaxes: a key part of getting a paper published in a scholarly journal is to be able to follow the conventions of the journal. And some people happen to be good at that, irrespective of the content of the papers being submitted.

I wrote this email to a colleague: Someone pointed me to this paper. It’s really bad. It was published by The Review of Environmental Economics and Policy, “the official journal of the Association of Environmental and Resource Economists and the European Association of Environmental and Resource Economists.” Is this a real organization? The whole thing […]

## Chinese character frequency and entropy

Yesterday I wrote a post looking at the frequency of Koine Greek letters and the corresponding entropy. David Littleboy asked what an analogous calculation would look like for a language like Japanese. This post answers that question. First of all, information theory defines the Shannon entropy of an “alphabet” to be $H = -\sum_i p_i \log_2 p_i$ bits where $p_i$ is […]
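As a minimal sketch, the Shannon entropy of a text's symbols, minus the sum of p log2 p over the observed symbol frequencies p, can be estimated like this. The function name is an assumption for illustration, not taken from the post, and the frequencies are simply the empirical ones in the given text.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Empirical Shannon entropy of the symbols in `text`, in bits per symbol."""
    counts = Counter(text)
    total = sum(counts.values())
    # Sum -p * log2(p) over the observed relative frequencies p
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

The same function works for Latin letters, Greek letters, or Chinese characters, since Python strings are sequences of Unicode code points.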

## Forecasts are always wrong

Recently I was interviewed for the Monash Business School podcast “Thought Capital” on the topic of forecasting. You can listen to the episode here (or read the transcript).

## three birthdays and a numeral

The riddle of the week on The Riddler was to find the size n of an audience for at least a 50% chance of observing at least one triplet of people sharing a birthday, as is the case in the present U.S. Senate. The question is much harder to solve than for a pair of […]
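One way to check a candidate n is an exact count of the birthday assignments in which no day is shared by three or more people. The sketch below is not from the post: it assumes 365 equally likely birthdays, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def prob_triple(n, days=365):
    """Exact probability that at least three of n people share a birthday.

    Assumes `days` equally likely birthdays. Counts assignments where every
    day holds at most two people, with k days holding exactly two.
    """
    no_triple = 0
    for k in range(n // 2 + 1):
        singles = n - 2 * k  # days holding exactly one person
        # Choose the k "pair" days, then the days for the singles
        day_choices = comb(days, k) * comb(days - k, singles)
        # Ways to place n labelled people into k pairs and `singles` singletons
        people_arrangements = factorial(n) // 2**k
        no_triple += day_choices * people_arrangements
    return 1 - Fraction(no_triple, days**n)
```

Scanning n upward until `prob_triple(n)` first reaches 1/2 gives the audience size the riddle asks for under these assumptions.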

## Practical Data Science with R 2nd Edition update

We are in the last stages of proofing the galleys/typesetting of Zumel, Mount, Practical Data Science with R, 2nd Edition, Manning 2019. So this edition will definitely be out soon! If you ever wanted to see what Nina Zumel and John Mount are like when we have the help of editors, this book is your …

## “Here’s an interesting story right in your sweet spot”

Jonathan Falk writes: Here’s an interesting story right in your sweet spot: Large effects from something whose possible effects couldn’t be that large? Check. Finding something in a sample of 1024 people that requires 34,000 to gain adequate power? Check. Misuse of p values? Check. Science journalist hype? Check. Searching for the cause of an […]

## and it only gets worse [verbatim]

## The science of snow

Kenneth G. Libbrecht has posted a 523-page book on snow to arXiv.

## Greek letter frequency and entropy

Would the letters in an ancient Greek text carry more or less information than in modern English? To address this question, I downloaded a copy of the Greek New Testament from Project Gutenberg and ran the word frequency script from my previous post. This led to the following table of letters and percent frequency. α […]

## Non-Gaussian forecasting using fable

`library(tidyverse); library(tsibble); library(lubridate); library(feasts); library(fable)` In my previous post about the new fable package, we saw how fable can produce forecast distributions, not just point forecasts. All my examples used Gaussian (normal)…

## stochastic magnetic bits, simulated annealing and Gibbs sampling

A paper by Borders et al. in the 19 September issue of Nature offers an interesting mix of computing and electronics and optimisation. With two preparatory tribunes! One [rather overdone] on Feynman’s quest. As a possible alternative to quantum computers for creating probabilistic bits. And making machine learning (as an optimisation program) more efficient. And […]

## File character counts

Once in a while I need to know what characters are in a file and how often each appears. One reason I might do this is to look for statistical anomalies. Another reason might be to see whether a file has any characters it’s not supposed to have, which is often the case. A few […]
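A minimal sketch of such a character count in Python follows. It is not the author's script: the function names, the UTF-8 default, and the idea of passing an allowed character set are assumptions made for illustration.

```python
from collections import Counter

def char_counts(path, encoding="utf-8"):
    """Count how often each character appears in the file at `path`."""
    with open(path, encoding=encoding) as f:
        return Counter(f.read())

def unexpected_chars(counts, allowed):
    """Subset of `counts` for characters outside the `allowed` set."""
    return {ch: n for ch, n in counts.items() if ch not in allowed}
```

Calling `char_counts(path).most_common()` lists characters from most to least frequent, and `unexpected_chars` flags anything outside the set you expect, such as stray non-ASCII characters.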

## The status-reversal heuristic

A while ago we came up with the time-reversal heuristic, which was a reaction to the common situation that there’s a noisy study, followed by an unsuccessful replication, but all sorts of people want to take the original claim as the baseline and construct high walls to make it difficult to move away from that claim. […]

## My talk on visualization and data science this Sunday 9am

Uncovering Principles of Statistical Visualization Visualizations are central to good statistical workflow, but it has been difficult to establish general principles governing their use. We will try to back out some principles of visualization by considering examples of effective and ineffective uses of graphics in our own applied research. We consider connections between three goals […]

## The Current State of Play in Statistical Foundations: A View From a Hot-Air Balloon

Continue to the third, and last stop of Excursion 1 Tour I of Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (2018, CUP)–Section 1.3. It would be of interest to ponder if (and how) the current state of play in the stat wars has shifted in just one year. I’ll do […]

## Le Monde puzzle [#1114]

Another very low-key arithmetic problem as the current Le Monde mathematical puzzle: 32761 is 181² and the difference of two cubes, which ones? And 181=9²+10², the sum of the squares of two consecutive integers. Is this a general rule, i.e. the root z of a perfect square that is the difference of two cubes is always the sum of […]
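The first question can be answered by brute force. The sketch below is only an illustration, not the post's solution: the function name is made up, and the search is restricted to non-negative integers.

```python
def cube_differences(n):
    """All pairs (a, b) with a > b >= 0 and a**3 - b**3 == n."""
    pairs = []
    b = 0
    # Since a >= b + 1, a^3 - b^3 >= 3b^2 + 3b + 1, which bounds the search over b
    while 3 * b * b + 3 * b + 1 <= n:
        target = n + b**3
        a = round(target ** (1 / 3))
        # Check neighbours of the floating-point cube root to avoid rounding errors
        for cand in (a - 1, a, a + 1):
            if cand > b and cand**3 == target:
                pairs.append((cand, b))
        b += 1
    return pairs
```

Calling `cube_differences(32761)` returns the pair of cubes the puzzle asks for, and running it over other perfect squares is a quick way to probe the conjectured general rule.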