An op-ed piece in yesterday's New York Times by Charles Blow contains the following sentences:
There was one very telling (and virtually ignored) statistic in CNN’s exit poll data [on Proposition 8] that may shed some light: There were far more black women than black men, and a higher percentage of them said that they voted for the measure than the men. How wide was the gap? According to the exit poll, 70 percent of all blacks said that they voted for the proposition. But 75 percent of black women did. There weren’t enough black men in the survey to provide a reliable percentage for them. However, one can mathematically deduce that of the raw number of survey respondents, nearly twice as many black women said that they voted for it than black men.
This seems like a pretty striking finding, and indeed it forms the whole basis of the op-ed piece, which goes on to suggest reasons why black women would be far more likely than black men to support Proposition 8.
But Blow is performing a pretty iffy maneuver when he relies on "the raw number of survey respondents." In fact, I'm pretty sure this column by Blow deserves its own appendix in the next edition of Darrell Huff's classic book How to Lie with Statistics.
In the online version of his article, Blow links to CNN's exit poll data here. We find, in addition to the numbers mentioned by Blow, the fact that African Americans were 10% of the total sample -- 6% of the total sample were black women, and 4% were black men. The percentage of support for Proposition 8 among black men was not reported, presumably because CNN's pollsters considered the sample for that subgroup too small to draw meaningful conclusions about the total population. But drawing such conclusions is exactly what Blow proceeds to do.
First, Blow must have calculated (or "mathematically deduced" -- to use his somewhat more authoritative, not to say grandiose, phrase) the unreported percentage for Proposition 8 among black men in CNN's sample.
Black women were 6% of the total sample and 60% of the African American sample. Three-quarters of them voted for Proposition 8: that would be 4.5% (roughly -- there's always rounding error) of the total sample and 45% of the African American sample. So to get to a 70% level of support for all blacks, a fraction of black men amounting to 25% of the African American sample must have voted for the measure. Black men were 40% of the African American sample. And 25/40 is 62.5%, which is the proportion of black men in the sample who must have supported Proposition 8.
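For anyone who wants to check the arithmetic, here is a minimal sketch of the deduction in Python. The function name and variable names are my own, and the inputs are just the rounded figures quoted above, so the result inherits whatever rounding error is in CNN's published percentages.

```python
from fractions import Fraction


def implied_subgroup_rate(overall_rate, known_share, known_rate):
    """Back out the unreported subgroup's support rate.

    Solves  overall = known_share * known_rate + (1 - known_share) * x  for x.
    """
    other_share = 1 - known_share
    return (overall_rate - known_share * known_rate) / other_share


# Rounded figures from the exit poll as quoted in the text.
women_share = Fraction(6, 10)     # black women as a share of black respondents
women_rate = Fraction(75, 100)    # support among black women
overall_rate = Fraction(70, 100)  # support among all black respondents

men_rate = implied_subgroup_rate(overall_rate, women_share, women_rate)
print(float(men_rate))  # 0.625
```

Exact fractions are used only to keep floating-point noise out of the result; ordinary floats would give the same answer to within rounding.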
So, disregarding the fact that the pollsters felt the sample of black men was too small to draw conclusions from, we find that there was 75% support for Proposition 8 in the sample among black women, and only about 62.5% support among black men. Maybe it's an interesting difference, but would it look compelling as a basis for a whole column in The New York Times?
Perhaps not. Which is why Blow took his next weird and intellectually dubious step.
There were 3 black women in the sample for every 2 black men. 75% of 3 is 2.25, and 62.5% of 2 is 1.25. So 2.25 black women in the sample supported Prop 8 for every 1.25 black men who did. And 2.25 is almost twice 1.25. Ah, that magic almost twice! Now we're in business! Mr. Editor, I'm almost ready to file my column!
But notice the magnitude and chutzpah of Blow's misdirection. He's actually used the larger number of black women sampled by CNN's exit poll to magnify the difference in levels of support between the sexes in the black sample. To fully grasp the degree of dishonesty involved, consider this: using the same methodology, even if black men in the sample had supported Proposition 8 at the very same rate as black women (75%), Blow could still have claimed that, "of the raw number of survey respondents," 50% more black women than black men supported Proposition 8 (2.25 black women for every 1.5 black men). Or some such nonsense.
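The raw-count trick is easy to reproduce. The sketch below (names are mine, figures are the rounded ones used above) computes the ratio of female to male supporters in the sample, first with the deduced 62.5% rate for men and then with the men's rate set equal to the women's.

```python
from fractions import Fraction


def raw_supporter_ratio(share_a, rate_a, share_b, rate_b):
    """Ratio of group A supporters to group B supporters, by head count.

    The result mixes two things: how much more likely A is to support the
    measure, and how many more A respondents happened to be sampled.
    """
    return (share_a * rate_a) / (share_b * rate_b)


women_share = Fraction(3, 5)  # 3 black women in the sample ...
men_share = Fraction(2, 5)    # ... for every 2 black men

# With the deduced rates: 75% of women, 62.5% of men.
as_reported = raw_supporter_ratio(women_share, Fraction(3, 4),
                                  men_share, Fraction(5, 8))

# Counterfactual: men support the measure at exactly the women's rate.
equal_rates = raw_supporter_ratio(women_share, Fraction(3, 4),
                                  men_share, Fraction(3, 4))

print(float(as_reported))  # 1.8
print(float(equal_rates))  # 1.5
```

With identical support rates the ratio is still 1.5, which is nothing more than the 3-to-2 makeup of the sample. Only the step from 1.5 to 1.8 reflects any difference in how the two groups voted.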
So Blow's column turns out to be based entirely on statistical sleight of hand. The horrible thing, though, is that Blow is not alone in his misuse of statistics to justify his column inches. Astonishingly, the people who publish medical and epidemiological studies do something similar -- though not quite as flagrantly bad -- all the time.
In recent years, they've taken to using something called "relative risk." If they want to compare the proportions of their treatment and control groups who get a disease (for example), they don't just subtract one percentage from the other and report the difference in percentage points. They compare the two percentages as a ratio instead. And this offers wonderful possibilities for misleading a statistically unsavvy public.
For example, say that 1.2% of people who live near a chemical plant get cancer, but only 1% of people in the control group do. Now, that may be a statistically significant difference if the sample size is big enough. But it's a little misleading to say, as they do -- and as the drive-by media will eagerly report -- that the people who lived near the chemical plant had a 20% higher chance of getting cancer. It's literally true (1.2% is 20% greater than 1%) but it's misleading, because people will assume that the proportions of cancer victims in the two samples differed by something like 40% as against 60%. See here for more on the use and abuse of "relative risk" and the related concept of "odds ratios."
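To see how differently the two ways of reporting behave, here is a small sketch with function names of my own choosing. It compares the chemical-plant example (1% against 1.2%) with the kind of gap a casual reader might picture (40% against 60%).

```python
from fractions import Fraction


def difference_in_points(rate_exposed, rate_control):
    """Absolute difference between two rates, in percentage points."""
    return (rate_exposed - rate_control) * 100


def relative_increase_percent(rate_exposed, rate_control):
    """How much larger the exposed rate is than the control rate, in percent."""
    return (rate_exposed / rate_control - 1) * 100


# Chemical-plant example from the text: 1.2% against 1%.
plant = (Fraction(12, 1000), Fraction(1, 100))
# The gap a reader might imagine: 60% against 40%.
imagined = (Fraction(60, 100), Fraction(40, 100))

print(float(difference_in_points(*plant)))          # 0.2
print(float(relative_increase_percent(*plant)))     # 20.0
print(float(difference_in_points(*imagined)))       # 20.0
print(float(relative_increase_percent(*imagined)))  # 50.0
```

A "20% higher chance" here corresponds to a fifth of a percentage point. The relative figure on its own says nothing about how rare or common the outcome is, which is why it needs the baseline rate beside it.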
Of course, doctors and medical researchers don't do anything as grossly bad as to exploit a meaningless difference in raw numbers between treatment and control groups. They're generally more sophisticated than Times columnists, and Charles Blow could have learned from them. He could have taken the 75% and 62.5% levels of support for Prop 8 among black women and black men, respectively, and written that black women were 20% more likely than black men to support Proposition 8. In a way, Blow should get credit for his explicit reference to "the raw number of survey respondents." He was, at least, flagrant in his disingenuousness.
