Category: Statistics

Exact values of sine and cosine

If you know the sine of any angle, you can find its cosine from the Pythagorean theorem. And if you know the sine of an angle you can find the sine of any multiple of that angle using the identity for the sine of a sum. You can find the sine of 30 degrees from […]
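A minimal sketch of the two steps described in this excerpt. The function names are invented for illustration, and the angle is assumed to lie in the first quadrant so that the cosine is non-negative (the Pythagorean identity alone only determines the cosine up to sign).

```python
import math

def cos_from_sin(s):
    """Cosine of a first-quadrant angle from its sine, using sin^2 + cos^2 = 1."""
    return math.sqrt(1.0 - s * s)

def sin_of_multiple(s, n):
    """Return sin(n*a) given s = sin(a), by applying the angle-sum identities n times.

    Assumes a is in the first quadrant, so cos(a) >= 0.
    """
    c = cos_from_sin(s)
    sin_k, cos_k = 0.0, 1.0  # sin(0*a) and cos(0*a)
    for _ in range(n):
        # sin(k*a + a) = sin(k*a)cos(a) + cos(k*a)sin(a)
        # cos(k*a + a) = cos(k*a)cos(a) - sin(k*a)sin(a)
        sin_k, cos_k = sin_k * c + cos_k * s, cos_k * c - sin_k * s
    return sin_k
```

For instance, starting from the sine of 30 degrees, `sin_of_multiple(0.5, 3)` gives the sine of 90 degrees up to floating-point error.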

Social science plaig update

OK, we got two items for you, one in political science and one in history. Both are updates on cases we’ve discussed in the past on this blog. I have no personal connection to any of the people involved; my only interest is annoyance at the ways in which plagiarism pollutes scientific understanding and the […]

quote of the year

“J’ai eu une vie assez droite, je ne me suis jamais conduit comme un salaud” [I have lived a rather righteous life, I never behaved like a bastard] Jean-Marie Le Pen [convicted of apologia for war crimes (1969), contestation of crimes against humanity (2009), Holocaust denial (1988, 2006), antisemitism (1986), racial hatred (2005, 2008), provocation […]

Golay codes

Suppose you want to send pictures from Jupiter back to Earth. A lot could happen as a bit travels across the solar system, and so you need some way of correcting errors, or at least detecting errors. The simplest thing to do would be to transmit photos twice. If a bit is received the same […]
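A rough sketch of the "transmit twice" idea, with a hypothetical helper name. Comparing the two received copies can flag positions where they disagree, but it cannot tell which copy is right, and it misses the case where both copies are corrupted at the same position.

```python
def detect_errors(copy_a, copy_b):
    """Return the positions where two received copies of the same message disagree."""
    if len(copy_a) != len(copy_b):
        raise ValueError("both copies must have the same length")
    return [i for i, (a, b) in enumerate(zip(copy_a, copy_b)) if a != b]
```

Doubling the message length buys detection only, which is why a real error-correcting code is worth the extra effort.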

The real lesson learned from those academic hoaxes: a key part of getting a paper published in a scholarly journal is to be able to follow the conventions of the journal. And some people happen to be good at that, irrespective of the content of the papers being submitted.

I wrote this email to a colleague: Someone pointed me to this paper. It’s really bad. It was published by The Review of Environmental Economics and Policy, “the official journal of the Association of Environmental and Resource Economists and the European Association of Environmental and Resource Economists.” Is this a real organization? The whole thing […]

Chinese character frequency and entropy

Yesterday I wrote a post looking at the frequency of Koine Greek letters and the corresponding entropy. David Littleboy asked what an analogous calculation would look like for a language like Japanese. This post answers that question. First of all, information theory defines the Shannon entropy of an “alphabet” to be $H = -\sum_i p_i \log_2 p_i$ bits where $p_i$ is […]
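A small sketch of an entropy calculation of this kind. Because the excerpt is cut off, it assumes the probabilities are the relative frequencies of each symbol in the text; the function name is made up for illustration.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy in bits per symbol: -sum(p * log2(p)) over distinct symbols.

    Assumption: p is the relative frequency of each symbol in `text`.
    """
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Four equally likely symbols give 2 bits per symbol, while a text made of a single repeated symbol gives 0.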

three birthdays and a numeral

The riddle of the week on The Riddler was to find the size n of an audience for at least a 50% chance of observing at least one triplet of people sharing a birthday, as is the case in the present U.S. Senate. The question is much harder to solve than for a pair of […]
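One possible way to attack the riddle by exact counting rather than simulation. This is a sketch, not necessarily the approach taken in the post. It assumes 365 equally likely birthdays and no leap day, and the function names are invented here.

```python
from fractions import Fraction
from math import comb, factorial

DAYS = 365  # assumption: uniform birthdays, leap day ignored

def prob_triple(n, days=DAYS):
    """Exact probability that at least three of n people share a birthday."""
    no_triple = 0
    # With no triple, k days are shared by exactly two people and the
    # remaining n - 2k people each have a day to themselves.
    for k in range(n // 2 + 1):
        singles = n - 2 * k
        day_choices = comb(days, k) * comb(days - k, singles)
        # Ways to split n labelled people into k pairs and `singles` singletons,
        # assigned to the chosen days: n! / (2!)^k
        people_arrangements = factorial(n) // 2 ** k
        no_triple += day_choices * people_arrangements
    return 1 - Fraction(no_triple, days ** n)

def smallest_audience(target=Fraction(1, 2)):
    """Smallest n whose triple-birthday probability reaches `target`."""
    n = 1
    while prob_triple(n) < target:
        n += 1
    return n
```

Exact rational arithmetic keeps rounding error from moving the threshold, at the price of working with large integers.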

“Here’s an interesting story right in your sweet spot”

Jonathan Falk writes: Here’s an interesting story right in your sweet spot: Large effects from something whose possible effects couldn’t be that large? Check. Finding something in a sample of 1024 people that requires 34,000 to gain adequate power? Check. Misuse of p values? Check. Science journalist hype? Check. Searching for the cause of an […]

Greek letter frequency and entropy

Would the letters in an ancient Greek text carry more or less information than in modern English? To address this question, I downloaded a copy of the Greek New Testament from Project Gutenberg and ran the word frequency script from my previous post. This led to the following table of letters and percent frequency. α […]

stochastic magnetic bits, simulated annealing and Gibbs sampling

A paper by Borders et al. in the 19 September issue of Nature offers an interesting mix of computing and electronics and optimisation. With two preparatory tribunes! One [rather overdone] on Feynman’s quest. As a possible alternative to quantum computers for creating probabilistic bits. And making machine learning (as an optimisation program) more efficient. And […]

File character counts

Once in a while I need to know what characters are in a file and how often each appears. One reason I might do this is to look for statistical anomalies. Another reason might be to see whether a file has any characters it’s not supposed to have, which is often the case. A few […]
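A minimal sketch of this kind of character tally. It is not the script from the post, and the function names, the UTF-8 default, and the `allowed` parameter are all assumptions made for illustration.

```python
from collections import Counter

def char_counts(path, encoding="utf-8"):
    """Count how often each character appears in the file at `path`."""
    with open(path, encoding=encoding) as f:
        return Counter(f.read())

def unexpected_chars(counts, allowed):
    """Return the counted characters that are not in the `allowed` set."""
    return {ch: n for ch, n in counts.items() if ch not in allowed}
```

Calling `char_counts(path).most_common()` lists characters from most to least frequent, and `unexpected_chars` picks out anything outside the set a file is supposed to contain.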

The status-reversal heuristic

A while ago we came up with the time-reversal heuristic, which was a reaction to the common situation that there’s a noisy study, followed by an unsuccessful replication, but all sorts of people want to take the original claim as the baseline and construct high walls to make it difficult to move away from that claim. […]

My talk on visualization and data science this Sunday 9am

Uncovering Principles of Statistical Visualization Visualizations are central to good statistical workflow, but it has been difficult to establish general principles governing their use. We will try to back out some principles of visualization by considering examples of effective and ineffective uses of graphics in our own applied research. We consider connections between three goals […]

Le Monde puzzle [#1114]

Another very low-key arithmetic problem as the current Le Monde mathematical puzzle: 32761 is 181² and the difference of two cubes, which ones? And 181=9²+10², the sum of the squares of two consecutive integers. Is this a general rule, i.e. the root z of a perfect square that is the difference of two cubes is always the sum of […]
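The first question can be explored by brute force. The sketch below uses a hypothetical helper name, restricts the search to positive integers, and is not necessarily how the post proceeds.

```python
def cube_differences(target):
    """All pairs (a, b) of positive integers with a > b and a**3 - b**3 == target."""
    pairs = []
    b = 1
    # For a given b the smallest possible gap is (b+1)^3 - b^3 = 3b^2 + 3b + 1,
    # so the search can stop once that gap exceeds the target.
    while 3 * b * b + 3 * b + 1 <= target:
        a = b + 1
        while a ** 3 - b ** 3 < target:
            a += 1
        if a ** 3 - b ** 3 == target:
            pairs.append((a, b))
        b += 1
    return pairs
```

Running `cube_differences(32761)` lists the candidate pairs for the puzzle, and the same function can be applied to other perfect squares to probe the conjectured general rule.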