## Harking, Sharking, Tharking

Bert Gunter writes: You may already have seen this [“Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data,” John Hollenbeck, Patrick Wright]. It discusses many of the same themes that you and others have highlighted in the special American Statistician issue and elsewhere, but does so from a slightly different […]

## “Superior: The Return of Race Science,” by Angela Saini

“People so much wanted the story to be true . . . that they couldn’t look past it to more mundane explanations.” – Angela Saini, Superior. I happened to be reading this book around the same time as I attended the Metascience conference, which was motivated by the realization during the past decade or so […]

## Collatz conjecture skepticism

The Collatz conjecture asks whether the following procedure always terminates at 1. Take any positive integer n. If it’s odd, multiply it by 3 and add 1. Otherwise, divide it by 2. For obvious reasons the Collatz conjecture is also known as the 3n + 1 conjecture. It has been computationally verified that the Collatz […]
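The procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `collatz_steps` and the `max_steps` safety cap are assumptions made for the example and do not come from the original post.

```python
from typing import Optional


def collatz_steps(n: int, max_steps: int = 10_000) -> Optional[int]:
    """Count the steps needed to reach 1 under the 3n + 1 procedure.

    Returns None if 1 is not reached within max_steps (a hypothetical
    safety cap, since termination is exactly what the conjecture asks).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        # Odd: multiply by 3 and add 1. Even: divide by 2.
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps


print(collatz_steps(6))   # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
```

The cap is there because a run that failed to reach 1 would otherwise loop forever; no such starting value is known.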

## Feature-based time series analysis

In my last post, I showed how the feasts package can be used to produce various time series graphics.
The feasts package also includes functions for computing FEatures And Statistics from Time Series (hence the name). In this post I will give three exa…

## Practical Data Science with R update

Just got the following note from a new reader: Thank you for writing Practical Data Science with R. It’s challenging for me, but I am learning a lot by following your steps and entering the commands. Wow, this is exactly what Nina Zumel and I hoped for. We wish we could make everything easy, but …

## Le Monde puzzle [#1110]

A low-key sorting problem as Le Monde current mathematical puzzle: If the numbers from 1 to 67 are randomly permuted and if the sorting algorithm consists in picking a number i with a position higher than its rank i and moving it at the correct i-th position, what is the maximal number of steps to […]

## “Boston Globe Columnist Suspended During Investigation Of Marathon Bombing Stories That Don’t Add Up”

I came across this news article by Samer Kalaf and it made me think of some problems we’ve been seeing in recent years involving cargo-cult science. Here’s the story: The Boston Globe has placed columnist Kevin Cullen on “administrative leave” while it conducts a review of his work, after WEEI radio host Kirk Minihane scrutinized […]

## two positions at UBC

A long-time friend at UBC pointed out to me the opening of two tenure-track Assistant Professor positions at the Department of Statistics at the University of British Columbia, Vancouver, with an anticipated start date of July 1, 2020 or January 1, 2021. The deadline for applications is October 18, 2019. Statistics at UBC is an […]

## the three i’s of poverty

Today I made a “quick” (10h door to door!) round trip visit to Marseille (by train) to take part in the PhD thesis defense (committee) of Edwin Fourrier-Nicolaï, whose title was Poverty, inequality and redistribution: an econometric approach. While this was mainly a thesis in economics, meaning defending some theory on inequalities based on East […]

## I think that science is mostly “Brezhnevs.” It’s rare to see a “Gorbachev” who will abandon a paradigm just because it doesn’t do the job. Also, moving beyond naive falsificationism

Sandro Ambuehl writes: I’ve been following your blog and the discussion of replications and replicability across different fields daily, for years. I’m an experimental economist. The following question arose from a discussion I recently had with Anna Dreber, George Loewenstein, and others. You’ve previously written about the importance of sound theories (and the dangers of […]

## email footprint

While I was wondering (im Salzburg) at the carbon impact of sending emails with an endless cascade of the past history of exchanges and replies, I found this (rather rudimentary) assessment that, while standard emails had an average impact of 4g, those with long attachments could cost 50g, quoting from Berners-Lee, leading to the fairly […]

## Deterministic thinking (“dichotomania”): a problem in how we think, not just in how we act

This has come up before: – Basketball Stats: Don’t model the probability of win, model the expected score differential. – Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome, model the underlying continuous variable – Thinking like a statistician (continuously) rather than like a civilian (discretely) – Message to Booleans: It’s […]

## String interpolation in Python and R

One of the things I liked about Perl was string interpolation. If you use a variable name in a string, the variable will expand to its value. For example, if you have a variable \$x which equals 42, then the string “The answer is \$x.” will expand to “The answer is 42.” Perl requires variables to […]
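For comparison, here is a minimal sketch of how the same expansion might look in Python. The excerpt is truncated before it reaches the Python examples, so the choice of f-strings, `str.format`, and `string.Template` below is an assumption about what is relevant, not a summary of the original post.

```python
from string import Template

x = 42

# f-string: the expression inside braces is evaluated in place.
f_message = f"The answer is {x}."

# str.format: the placeholder is filled from a keyword argument.
format_message = "The answer is {x}.".format(x=x)

# string.Template: uses $-prefixed names, the closest to the Perl syntax.
template_message = Template("The answer is $x.").substitute(x=x)

print(f_message)
print(format_message)
print(template_message)
```

All three produce the same string. `string.Template` is the only one that keeps the `$x` notation, at the cost of passing the values explicitly.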

## open reviews

When looking at a question on X validated, on the expected Metropolis-Hastings ratio being one (not all the time!), I was somewhat bemused at the OP linking to an anonymised paper under review for ICLR, as I thought this was breaching standard confidentiality rules for reviews. Digging a wee bit deeper, I realised this was […]

## WVPlots 1.1.2 on CRAN

I have put a new release of the WVPlots package up on CRAN. This release adds palette and/or color controls to most of the plotting functions in the package. WVPlots was originally a catch-all package of ggplot2 visualizations that we at Win-Vector tended to use repeatedly, and wanted to turn into “one-liners.” A consequence of …

## My math is rusty

When I’m giving talks explaining how multilevel modeling can resolve some aspects of the replication crisis, I mention this well-known saying in mathematics: “When a problem is hard, solve it by embedding it in a harder problem.” As applied to statistics, the idea is that it could be hard to analyze a single small study, […]

## Detecting typos with the four color theorem

In my previous post on VIN numbers, I commented that if a check sum has to be one of 11 characters, it cannot detect all possible changes to a string from an alphabet of 33 characters. The number of possible check sum characters must be at least as large as the number of possible characters […]

## Vehicle Identification Number (VIN) check sum

A VIN (vehicle identification number) is a string of 17 characters that uniquely identifies a car or motorcycle. These numbers are used around the world and have three standardized formats: one for North America, one for the EU, and one for the rest of the world. Letters that resemble digits The characters used in a […]

## The uncanny valley of Malcolm Gladwell

Gladwell is a fun writer, and I like how he plays with ideas. To my taste, though, he lives in an uncanny valley between nonfiction and fiction, or maybe I should say between science and storytelling. I’d enjoy him more, and feel better about his influence, if he’d take the David Sedaris route and go […]

## Gelman blogged our exchange on abandoning statistical significance

I came across this post on Gelman’s blog today: Exchange with Deborah Mayo on abandoning statistical significance. It was straight out of blog comments and email correspondence back when the ASA, and significant others, were rising up against the concept of statistical significance. Here it is: Exchange with Deborah Mayo on abandoning statistical significance Posted […]