**Excursion 3 Tour II: It’s The Methods, Stupid**

Tour II disentangles a jungle of conceptual issues at the heart of today’s statistics wars. **(3.4)** unearths the basis for a number of howlers and chestnuts thought to be licensed by Fisherian or N-P tests.* In each exhibit, we study the basis for the joke. Together, they show: the need for an adequate test statistic, the difference between implicationary (i-assumptions) and actual assumptions, and the fact that tail areas serve to raise, and not lower, the bar for rejecting a null hypothesis. (Additional howlers occur in Excursion 3 Tour III.)

*recommended: medium to heavy shovel*
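The point about tail areas can be checked with a few lines of arithmetic. The sketch below is only an illustration: the binomial setup (20 tosses, 15 successes, null of p = 0.5) and the function names are my assumptions, not an example from SIST. It compares the probability of exactly the observed outcome with the tail area, the probability of an outcome at least as extreme.

```python
from math import comb

def point_probability(k, n, p=0.5):
    """Probability of exactly k successes in n trials under the null (assumed binomial)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail_p_value(k, n, p=0.5):
    """Tail area: probability of k or more successes under the null."""
    return sum(point_probability(j, n, p) for j in range(k, n + 1))

# Hypothetical data, chosen only for illustration.
n, observed = 20, 15
point = point_probability(observed, n)
tail = upper_tail_p_value(observed, n)

print(f"Pr(X = {observed})  = {point:.4f}")
print(f"Pr(X >= {observed}) = {tail:.4f}")
```

Because the tail area includes the observed outcome plus everything more extreme, it can never be smaller than the point probability. Demanding that the tail area fall below a threshold is therefore a stricter requirement for rejection than demanding it of the observed outcome alone.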

Stop **(3.5)** pulls back the curtain on the view that Fisher and N-P tests form an incompatible hybrid. Incompatibilist tribes retain caricatures of F & N-P tests, and rob each of notions they need (e.g., power and alternatives for F, P-values & post-data error probabilities for N-P). Those who allege that Fisherian P-values are not error probabilities often mean simply that Fisher wanted an evidential not a performance interpretation. This is a philosophical not a mathematical claim. N-P and Fisher tended to use P-values in both ways. It’s time to get beyond incompatibilism. Even if we couldn’t point to quotes and applications that break out of the strict “evidential versus behavioral” split, we should be the ones to interpret the methods for inference, and supply the statistical philosophy that directs their right use. (p. 181)

*strongly recommended: light to medium shovel, thick-skinned jacket*

In **(3.6)** we slip into the jungle. Critics argue that P-values are for evidence, unlike error probabilities, but then aver P-values aren’t good measures of evidence either, since they disagree with probabilist measures: likelihood ratios, Bayes Factors or posteriors. A famous peace treaty between Fisher, Jeffreys & Neyman promises a unification. A bit of magic ensues! The meaning of error probability changes into a type of Bayesian posterior probability. It’s then possible to say ordinary frequentist error probabilities (e.g., type I & II error probabilities) aren’t error probabilities. We get beyond this marshy swamp by introducing subscripts 1 and 2. Whatever you think of the two concepts, they are very different. This recognition suffices to get you out of quicksand.

*required: easily removed shoes, stiff walking stick (review Souvenir M on day of departure)*
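To see how different the two subscripted notions can be, here is a rough sketch. It rests on my own simplifying assumptions: subscript 1 is read as the ordinary Type I error probability of a test, and subscript 2 as a posterior probability of the null given a rejection, which needs a prior. The prior weights, the power of 0.8, and the function name are hypothetical, chosen for illustration and not taken from SIST.

```python
def posterior_null_given_rejection(alpha, power, prior_null):
    """Bayes' theorem: Pr(H0 | test rejects), given an assumed prior on H0.

    alpha      : Pr(reject; H0), the Type I error probability
    power      : Pr(reject; H1) for a single assumed alternative H1
    prior_null : assumed prior probability of H0 (not part of the test itself)
    """
    reject_and_null = alpha * prior_null
    reject_and_alt = power * (1 - prior_null)
    return reject_and_null / (reject_and_null + reject_and_alt)

alpha, power = 0.05, 0.8  # illustrative values

# The first quantity is fixed by the test; the second moves with the prior.
for prior_null in (0.5, 0.9):
    post = posterior_null_given_rejection(alpha, power, prior_null)
    print(f"prior Pr(H0) = {prior_null}: Type I error = {alpha}, "
          f"posterior Pr(H0 | reject) = {post:.3f}")
```

The Type I error probability stays at 0.05 whatever prior is chosen, while the posterior shifts as the prior shifts. The two numbers answer different questions, which is all the subscripts are meant to flag.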

*Several of these may be found by searching for “Saturday night comedy” on this blog. In SIST, however, I trace out the basis for the jokes.

**selected key terms and ideas**

Howlers and chestnuts of statistical tests

armchair science

Jeffreys tail area criticism

Limb sawing logic

Two machines with different positions

Weak conditionality principle (WCP)

Conditioning (see WCP)

Likelihood principle

Long run performance vs probabilism

Alphas and p’s

Fisher as behaviorist

Hypothetical long-runs

Freudian metaphor for significance tests

Pearson, on cases where there’s no repetition

Armour-piercing naval shell

Error probability₁ and error probability₂

Incompatibilist philosophy (F and N-P must remain separate)

Test statistic requirements (p. 159)

**Please send me your list of key terms in the comments; reports of typos would also be appreciated**

These are Tour Guide Mementos from Excursion 3 Tour II of Mayo (2018, CUP): Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars.

To see an excerpt from Excursion 3 Tour II (and “where you are” in the journey), see my last post.

For all excerpts and mementos (on this blog) from SIST (to Nov.30), see this post.