(This article was originally published at Statistical Modeling, Causal Inference, and Social Science, and syndicated at StatsBlogs.)
Last week I posted skeptical remarks about Ron Unz’s claim that Harvard admissions discriminate in favor of Jews. The comment thread was getting long enough there that I thought it most fair to give Unz a chance to present his thoughts here as a new post. I’ve done that before in cases where I’ve disagreed with someone and he wanted to make his views clear. I will post Unz’s email and my brief response.
This is what Unz wrote to me:
Since there’s been a great deal of dispute over the numerator and the denominator, it might be useful for each of us to provide our own estimate-range of what we believe are the true figures, and the justification. Perhaps if our ranges actually overlap substantially, then we don’t really disagree much after all. I’d think if you’ve been reading most of the endless comments and refreshing your memory about my claims, you’ve probably now developed your own mental model about the likely reality of the values whereas initially you may have simply been questioning my own numbers or my methodology.
I’ll be glad to start. Based on my detailed analysis of the NMS semifinalist lists, I’d feel pretty confident that the national percentage of Jewish students is within the range 5.5-7.0%. My greatest irritation was that despite considerable effort I never managed to locate lists from NJ or CT, which have large, academically-elite Jewish populations. But the number of NY Jews is 2.5x larger, and unless the NJ/CT Jews dramatically outperform their NY cousins, I just can’t see the national total breaking out of my range. Meanwhile, the non-Jewish white percentage remains 65-70%. Obviously, questions can be raised about whether NMS semifinalist numbers are the best numerator to use as a high-performance proxy, but since SAT distributions aren’t available, I just can’t think of any better one.
The numerator is the percentage of Jews enrolled at Harvard and the Ivies, and the heated dispute there has been a total surprise to me. None of the colleges make their enrollment lists publicly available, so if we don’t use the Hillel figures at least in some modified form, I’m just not sure what we can use instead. I would never claim that the Hillel figures are precisely accurate—I emphasized the uncertainty in my text—but I just doubt they’re wildly inaccurate either.
Let’s take Harvard, which Hillel claims is 25% Jewish. My suspicion is that ethnic advocacy organizations always tend to exaggerate their numbers, so I’d regard the 25% as an upper bound, and could easily see the true figure being as low as 20%. Thus, my plausible range would be 20-25%, with similar sorts of ranges for the other Ivies. But I’d be pretty skeptical of whether the Hillel numbers were inflated by more than about 25% or so (20% => 25%).
Here’s my reasoning. According to the Hillel numbers and the official racial data, Jews constitute between one-half and two-thirds of all the white Americans enrolled at each of the Ivies except for Princeton and Dartmouth. Indeed, I found a reference on the College Confidential discussion forum to a 2012 Harvard Crimson article making the exaggerated claim that 3/4ths of all the whites at Harvard were Jewish, though unfortunately I haven’t yet managed to locate a copy.
These are huge fractions and if the actual reality were totally different, surely *some* Jewish students would have realized that Hillel’s numbers were ridiculous and complained somewhere. If Hillel regularly claims that 60% of the white students at some college are Jewish, but the true figure were 30%, it’s difficult to believe no one would have noticed.
Therefore, if we focus strictly on Harvard, my plausible range over the last few years would be 20-25% Jewish and 24-29% non-Jewish white (assuming the Race Unknown category is split 50-50 white and Asian), with the total Ivy figures following a similar pattern.
So the ranges I get are Jews as 5.5-7.0% of top performing students with NJWs at 65-70%, while the Harvard ranges are 20-25% for Jews and 24-29% for NJWs. Thus, the range of “raw” Jewish over-representation relative to high-performing NJW students is between 540% and 1200%, while the range across the entire Ivy League would be between 420% and 950%. It’s perfectly possible that I’ve made a stupid calculational mistake, so you might want to check these derived figures.
These are “raw” over-representation percentages, and we must obviously adjust for the significant impact of geographical skew, legacy effects, athletic admissions, and various other things, some of which would certainly reduce them. But I’d argue these raw figures are so enormous, it’s difficult to see how they wouldn’t still remain very sizable even after any reasonable provision is made for those factors. As I think I mentioned, during the late 1980s an Ivy “admissions anomaly” of 20-30% for Asians was considered such strong evidence of discrimination that the Federal government launched an investigation.
Now it’s perfectly possible that your own “raw” over-representation estimates might not be too far from my own, and you might just believe that they could be completely accounted for by those various adjustment factors, which are somewhat difficult to quantify. But in that case, our disagreement would then shift into an entirely different area, and the current dispute would have been largely resolved.
As I repeatedly emphasized throughout my paper, it was the sheer magnitude of the anomaly that persuaded me it was real rather than merely an artifact due to a combination of underlying measurement errors.
Anyway, I always prefer dispassionate quantitative analysis to angry exchanges in comment threads, though I admit I may sometimes fall into the latter if I lose my temper. So if you would like to provide me what you think are the plausible ranges for the Jewish numerator (high-ability students nationwide) and denominator (Jewish Harvard/Ivy enrollments), perhaps we can begin to isolate and resolve the nature of our possible disagreement.
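Unz's email invites a check of his derived figures. One reading that appears to reproduce his Harvard range treats "raw over-representation" as the ratio of Jewish to non-Jewish-white (NJW) enrollment, divided by the same ratio among high-performing students, minus one. The email does not state a formula, so this interpretation, the function name, and the way range endpoints are paired are all assumptions made for illustration. A minimal Python sketch under those assumptions:

```python
def overrepresentation_range(jew_enrolled, njw_enrolled, jew_pool, njw_pool):
    """Return (low, high) excess representation in percent.

    Each argument is a (low, high) percentage range taken from the email.
    ASSUMPTION: over-representation = (J/NJW enrolled) / (J/NJW pool) - 1.
    This formula is inferred, not stated in the email.
    """
    def excess(j_enr, n_enr, j_pool, n_pool):
        ratio = (j_enr / n_enr) / (j_pool / n_pool)
        return (ratio - 1.0) * 100.0

    # Smallest excess: fewest Jews and most NJWs enrolled,
    # most Jews and fewest NJWs in the high-performing pool.
    low = excess(jew_enrolled[0], njw_enrolled[1], jew_pool[1], njw_pool[0])
    # Largest excess: the opposite endpoints.
    high = excess(jew_enrolled[1], njw_enrolled[0], jew_pool[0], njw_pool[1])
    return low, high


# Ranges quoted in the email for Harvard and for NMS semifinalists.
harvard_low, harvard_high = overrepresentation_range(
    jew_enrolled=(20.0, 25.0),
    njw_enrolled=(24.0, 29.0),
    jew_pool=(5.5, 7.0),
    njw_pool=(65.0, 70.0),
)
print(f"Harvard raw over-representation: {harvard_low:.0f}% to {harvard_high:.0f}%")
```

Under this reading the Harvard range comes out at roughly 540% to 1,226%, which is consistent with the 540% to 1200% quoted in the email after rounding. This confirms only the arithmetic, not the input ranges. The Ivy-wide figures cannot be checked the same way because the email does not give the Ivy-wide enrollment ranges.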
I don’t want to write a long response because I pretty much said it all in last week’s blog post, but briefly:
I have not directly studied these issues. I have done some work on name scale-up methods (there is a brief article on the topic in yesterday’s New York Times) but not on Jews in particular. I have no reason to believe that the factors of 12 and 20 from the Weyl method are correct. I’m not saying they’re wrong, I just have no particular reason to trust them. Nor do I know anything about how Hillel counts their numbers.
So I can’t supply my own estimates. All I can say is that, according to the person who sent me that email, if the Weyl method is applied to Harvard undergraduates, it gives an estimate of 10-11%, and if it is applied to NMS scholars, it gives something close to that (whatever you get by taking the appropriate weighted average of 9-14% from Massachusetts, 24% from New York, 14-21% from Pennsylvania, etc). That’s what seems to happen if the same method is used to estimate both numbers. But I have no idea what the actual numbers are. The only numbers we seem pretty sure of are those for the Putnam and Olympiad students, because Janet Mertz asked them directly.
So I remain skeptical of Unz’s claims—the direct comparisons I’ve seen don’t seem to support them—but I wanted to give him a chance to present things here from his perspective.
P.S. Unz elsewhere notes that I referred to him as “a ‘political activist’ who used ‘sloppy counting.’” He characterizes those as “insults.” I don’t think these are insulting! First off, it’s not an insult at all to call someone a political activist. That’s what Unz does! He’s run for office, he’s funded political campaigns, he bought a political magazine. There’s nothing wrong with being a political activist. It’s a noble calling. As to the second phrase, I agree that “sloppy counting” could be an insult in some settings but it wasn’t intended as such. It remains a mystery exactly how Unz came up with the claim that over 40% of Math Olympiad participants in the 1970s were Jewish while counting only 2.5% from the 2000s. Such sloppy counting makes a difference: it leads to an impression of a dramatic decline in Jewish performance in this area, while the best estimate from an expert in the area is that the decline is a factor of 2 rather than a factor of 15. The sloppiness in the counting comes from the use of an undefined criterion for classification, which allows unintended bias to creep in. As discussed above, there was also the big mistake of incompatible numerator and denominator in comparing Harvard students to National Merit Scholar semifinalists, but I wouldn’t quite call this sloppiness; it was more of a mistake that arose because of an unexamined assumption from combining two different data sources. We all make unexamined assumptions all the time. If “sloppy counting” is too rude, let me replace it with “inaccurate counting.” Speaking retroactively, I think any count that is off by a factor of 5 is pretty sloppy, but maybe that’s a judgment call. I’m happy to just call it “inaccurate” and remove any perception of insult.
Unz also writes, “I find it highly intriguing that although Gelman chose not to substantially engage the 1,000 word framework of my statistical analysis that I offered him for constructive mutual dialogue.” Just to repeat what I wrote above, I don’t want to write a long response because I pretty much said it all in last week’s blog post. A lack of point-by-point argument does not mean I agree with Unz’s claims; what it means is that I think it’s only fair to allow him to present his claims clearly in one place on this blog, so people can see right here what he has to say without having to sift through blog comments.
Also let me repeat what I wrote in a comment, that I do not view this as a “fight” or a debate. I have not directly studied these issues and would not want to imply otherwise.
Finally, let me emphasize that Unz’s statistical mistakes do not necessarily mean that all of his ideas are wrong or meaningless. There certainly have been large demographic changes in the United States in recent decades, and the result is increasing academic competition. These things are worth studying.