**Excursion 3 Statistical Tests and Scientific Inference**

Tour I Ingenious and Severe Tests

[T]he impressive thing about [the 1919 tests of Einstein’s theory of gravity] is the risk involved in a prediction of this kind. If observation shows that the predicted effect is definitely absent, then the theory is simply refuted. The theory is incompatible with certain possible results of observation – in fact with results which everybody before Einstein would have expected. This is quite different from the situation I have previously described, [where] . . . it was practically impossible to describe any human behavior that might not be claimed to be a verification of these [psychological] theories. (Popper 1962, p. 36)

The 1919 eclipse experiments opened Popper’s eyes to what made Einstein’s theory so different from other revolutionary theories of the day: Einstein was prepared to subject his theory to risky tests.[1] Einstein was eager to galvanize scientists to test his theory of gravity, knowing the solar eclipse was coming up on May 29, 1919. Leading the expedition to test GTR was a perfect opportunity for Sir Arthur Eddington, a devout follower of Einstein as well as a devout Quaker and conscientious objector. Fearing “a scandal if one of its young stars went to jail as a conscientious objector,” officials at Cambridge argued that Eddington couldn’t very well be allowed to go off to war when the country needed him to prepare the journey to test Einstein’s predicted light deflection (Kaku 2005, p. 113).

The museum ramps up from Popper through a gallery on “Data Analysis in the 1919 Eclipse” (Section 3.1), which then leads to the main gallery on the origins of statistical tests (Section 3.2). Here’s our Museum Guide:

According to Einstein’s theory of gravitation, to an observer on earth, light passing near the sun is deflected by an angle, λ, reaching its maximum of 1.75″ for light just grazing the sun, but the light deflection would be undetectable on earth with the instruments available in 1919. Although the light deflection of stars near the sun (approximately 1 second of arc) would be detectable, the sun’s glare renders such stars invisible, save during a total eclipse, which “by strange good fortune” would occur on May 29, 1919 (Eddington [1920] 1987, p. 113).

There were three hypotheses for which “it was especially desired to discriminate between” (Dyson et al. 1920, p. 291). Each is a statement about a parameter, the deflection of light at the limb of the sun (in arc seconds): λ = 0″ (no deflection), λ = 0.87″ (Newton), λ = 1.75″ (Einstein). The Newtonian predicted deflection stems from assuming light has mass and follows Newton’s Law of Gravity. The difference in statistical prediction masks the deep theoretical differences in how each explains gravitational phenomena. Newtonian gravitation describes a force of attraction between two bodies, while for Einstein gravitational effects are actually the result of the curvature of spacetime. A gravitating body like the sun distorts its surrounding spacetime, and other bodies are reacting to those distortions.
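For readers who want to see where the three values of λ come from, here is a minimal sketch. It rests on assumptions not stated in the text: the standard weak-field formulas, in which the deflection for a ray passing at distance b from the sun's center is 4GM/(c²b) on Einstein's account and half that on the Newtonian corpuscular account, together with modern reference constants. The function name `deflection_arcsec` and the `factor` argument are hypothetical labels chosen for illustration.

```python
import math

# Assumed reference constants (modern values, not taken from the text).
GM_SUN = 1.32712440018e20  # solar gravitational parameter, m^3 / s^2
C = 299_792_458.0          # speed of light, m / s
R_SUN = 6.957e8            # nominal solar radius, m

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def deflection_arcsec(impact_parameter_m: float, factor: float) -> float:
    """Deflection (arc seconds) of a light ray passing the sun's center
    at the given distance.

    factor = 4 gives the general-relativistic value, factor = 2 the
    Newtonian value for light treated as massive corpuscles, and
    factor = 0 corresponds to the no-deflection hypothesis.
    """
    radians = factor * GM_SUN / (C ** 2 * impact_parameter_m)
    return radians * ARCSEC_PER_RADIAN


# The three rival hypotheses, evaluated for light grazing the solar limb.
none_limb = deflection_arcsec(R_SUN, 0)
newton_limb = deflection_arcsec(R_SUN, 2)
einstein_limb = deflection_arcsec(R_SUN, 4)

# The predicted deflection falls off inversely with distance from the sun,
# so a star seen at two solar radii is displaced half as much.
einstein_at_2r = deflection_arcsec(2 * R_SUN, 4)

if __name__ == "__main__":
    print(f"no deflection: {none_limb:.2f} arcsec")
    print(f"Newton:        {newton_limb:.2f} arcsec")
    print(f"Einstein:      {einstein_limb:.2f} arcsec")
    print(f"Einstein, 2 solar radii: {einstein_at_2r:.2f} arcsec")
```

Under these assumed constants the limb values land close to the 0.87″ and 1.75″ quoted above; small differences in the second decimal place reflect the choice of constants, not a different prediction.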

**Where Are Some of the Members of Our Statistical Cast of Characters in 1919?** In 1919, Fisher had just accepted a job as a statistician at Rothamsted Experimental Station. He preferred this temporary slot to a more secure offer by Karl Pearson (KP), which had so many strings attached – requiring KP to approve everything Fisher taught or published – that Joan Fisher Box writes: After years during which Fisher “had been rather consistently snubbed” by KP, “It seemed that the lover was at last to be admitted to his lady’s court – on conditions that he first submit to castration” (J. Box 1978, p. 61). Fisher had already challenged the old guard. Whereas KP, after working on the problem for over 20 years, had only approximated “the first two moments of the sample correlation coefficient; Fisher derived the relevant distribution, not just the first two moments” in 1915 (Spanos 2013a). Unable to fight in WWI due to poor eyesight, Fisher felt that becoming a subsistence farmer during the war, making food coupons unnecessary, was the best way for him to exercise his patriotic duty.

In 1919, Neyman is living a hardscrabble life in a land alternately part of Russia or Poland, while the civil war between Reds and Whites is raging. “It was in the course of selling matches for food” (C. Reid 1998, p. 31) that Neyman was first imprisoned (for a few days) in 1919. Describing life amongst “roaming bands of anarchists, epidemics” (ibid., p. 32), Neyman tells us, “existence” was the primary concern (ibid., p. 31). With little academic work in statistics, and “since no one in Poland was able to gauge the importance of his statistical work (he was ‘sui generis,’ as he later described himself)” (Lehmann 1994, p. 398), Polish authorities sent him to University College in London in 1925/1926 to get the great Karl Pearson’s assessment. Neyman and E. Pearson begin work together in 1926.

Egon Pearson, son of Karl, gets his B.A. in 1919, and begins studies at Cambridge the next year, including a course by Eddington on the theory of errors. Egon is shy and intimidated, reticent and diffident, living in the shadow of his eminent father, whom he gradually starts to question after Fisher’s criticisms. He describes the psychological crisis he’s going through at the time Neyman arrives in London: “I was torn between conflicting emotions: a. finding it difficult to understand R.A.F., b. hating [Fisher] for his attacks on my paternal ‘god,’ c. realizing that in some things at least he was right” (C. Reid 1998, p. 56). As far as appearances amongst the statistical cast: there are the two Pearsons: tall, Edwardian, genteel; there’s hardscrabble Neyman with his strong Polish accent and small, toothbrush mustache; and Fisher: short, bearded, very thick glasses, pipe, and eight children.

Let’s go back to 1919, which saw Albert Einstein go from being a little-known German scientist to becoming an international celebrity.

[1] You will recognize the above as echoing Popperian “theoretical novelty” – Popper developed it to fit the Einstein test.

…To read further see *Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars* (CUP, 2018)

**Excursion 3: Statistical Tests and Scientific Inference**

**Tour I Ingenious and Severe Tests**

3.1 Statistical Inference and Sexy Science: The 1919 Eclipse Test

3.2 N-P Tests: An Episode in Anglo-Polish Collaboration

3.3 How to Do All N-P Tests Do (and more) While a Member of the Fisherian Tribe