Breaking Down an Unfounded Election Fraud Argument

How to make sense of a bogus election fraud claim that fails under common sense and statistical scrutiny.

Douglas Frank has toured the country alleging every county’s election results are marred by vote manipulation.

Frank’s claim is that election results are determined at the state level, with an unnamed nefarious group stuffing ballots according to a state-level “key”. The key, according to Frank, sets the turnout rate for registered voters of each age, 18 to 100, across the whole state. This key-determined turnout rate is then combined with the number of registered voters of each age in a county to determine the reported number of votes from each age group in that county.

Frank claims that because he can accurately predict turnout in every county in America, he has uncovered this manipulation. Specifically, in his presentations he shows that the correlation between his predicted turnout numbers (the state-level turnout rate key multiplied by the number of registered voters of each age in a county) and the actual turnout numbers is close to 1, the maximum possible value.
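The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not Frank's actual code: the county names and numbers are hypothetical toy data, the function names are made up for this example, and the key is assumed to be the pooled statewide turnout rate for each age (total votes divided by total registered voters).

```python
# Hypothetical toy data, not real election figures:
# {county: {age: (registered_voters, votes_cast)}}
counties = {
    "County A": {18: (200, 80), 45: (500, 350), 80: (100, 70)},
    "County B": {18: (300, 170), 45: (500, 300), 80: (200, 170)},
}

def state_key(counties):
    """Assumed key: statewide turnout rate per age (total votes / total registered)."""
    registered, votes = {}, {}
    for by_age in counties.values():
        for age, (reg, cast) in by_age.items():
            registered[age] = registered.get(age, 0) + reg
            votes[age] = votes.get(age, 0) + cast
    return {age: votes[age] / registered[age] for age in registered}

def predicted_counts(key, county):
    """Predicted votes per age: key rate times the county's registered voters."""
    return {age: key[age] * reg for age, (reg, _) in county.items()}

key = state_key(counties)
for name, county in counties.items():
    predicted = predicted_counts(key, county)
    for age, (reg, cast) in county.items():
        print(name, age, "predicted:", round(predicted[age], 1), "actual:", cast)
```

Note that the predicted count is built directly from the number of registered voters in each age group, so large groups get large predictions regardless of how well the key matches the county's real turnout rate.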

Frank’s claims fail for (at least) three reasons:

Predictability Doesn’t Imply Manipulation

Even if Frank could accurately predict turnout counts, there is no reason to believe that predictability equals manipulation. After all, if the weather channel accurately forecasts the path of a hurricane, we would not conclude that a meteorologist is controlling the weather. Frank regularly claims that human behavior isn’t predictable, but this simply is not true. Given these basic logical failings, it isn’t surprising that Frank never shows that his test actually works, that is, that it can correctly separate elections with manipulation from elections without manipulation.

There is a LOT of Variation in Age Group Turnout Rates in a State

If Douglas Frank were correct, each age group would turn out at the same rate in every county of a state. But this is definitely not the case.

Consider a state with heavy Republican support, Tennessee.

Each Tennessee county has different turnout rates

This plot shows the turnout rate for each age from 18 to 100 in all of Tennessee’s 95 counties.

Each point is the turnout rate for a particular age group in a particular county.

And the red line shows the average turnout rate for the age groups across counties.

There is no statewide key

This line is essentially the state-level “key” that Douglas Frank uses to predict the turnout rate in each county. If Frank were correct, all the points would cluster around this line.

But clearly this isn’t the case. There is a lot of variation across the counties. So not surprisingly, the state-level turnout rate isn’t going to be a perfect prediction of the turnout rate in Tennessee’s counties.

But if there is so much variability, how could Douglas Frank possibly claim to have such strong correlations between his predictions and the actual results? That’s because he performs a sleight of hand and examines the number of people from an age group who turn out to vote, rather than the turnout rate.

Frank’s focus on turnout counts causes him to overstate the predictive power of his voter fraud test. In fact, when predicting counts he reaches the unsurprising conclusion that age groups with more individuals have more people turn out to vote.

Had he focused on turnout rates instead of counts (rates being what his conspiracy theory actually implies should be predictable), he would have noticed that his predictive power is much lower.

We can see this if we zoom in on a county in Tennessee: Union County.

We’ll use Frank’s procedure to estimate a state-level key for Tennessee (basically the red line in the previous plot).

This is his prediction of Union County’s turnout rate.

We can plot the actual turnout rate against this prediction. The 45-degree line shows where the points would fall if we made a perfect prediction. Clearly, the predicted turnout rate differs from the actual turnout rate: the two correlate at only about 0.57.

But if we plot the turnout count rather than the rate it magically appears that the relationship between the prediction and actual results is stronger.

Now the correlation is 0.99, a number that Frank says “ain’t natural.”

What’s going on here?

After all, it seems strange that a poorly predicted turnout rate could result in such a strong correlation for turnout counts. Frank’s test is not a test of voter fraud at all. When we correlate predicted and actual counts we’re only recovering the obvious fact that age groups with more registered voters will have more voters cast ballots than age groups with fewer registered voters.

The variation in age group sizes dominates the correlation coefficient, pushing it toward 1. Because age groups with many people will obviously cast more votes than age groups with few people, we mechanically observe a very strong correlation between group size and the number of votes cast. This is not suspicious in the least.
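A small numerical sketch makes the mechanism concrete. The county below is hypothetical (the group sizes and rates are invented for illustration, and `predicted_rates` is only a stand-in for a state-level key), and the `pearson` helper is written out by hand so the example needs nothing beyond the standard library.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical county: five age groups of very different sizes.
sizes = [400, 900, 1500, 700, 100]
actual_rates = [0.3, 0.5, 0.4, 0.6, 0.5]
predicted_rates = [0.4, 0.4, 0.5, 0.5, 0.6]  # stand-in for a state-level key

# Counts are just rates multiplied by group size.
actual_counts = [n * r for n, r in zip(sizes, actual_rates)]
predicted_counts = [n * r for n, r in zip(sizes, predicted_rates)]

r_rates = pearson(predicted_rates, actual_rates)
r_counts = pearson(predicted_counts, actual_counts)
r_size_only = pearson(sizes, actual_counts)  # no turnout key involved at all

print("rates:", round(r_rates, 2))
print("counts:", round(r_counts, 2))
print("group size vs. actual votes:", round(r_size_only, 2))
```

In this toy county the rate prediction is mediocre (a correlation of roughly 0.42), yet the count correlation is roughly 0.93. Group size alone, with no key at all, correlates with actual votes at roughly 0.95, which suggests the high count correlation is mostly a reflection of group size.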

To understand this basic pattern, let’s consider a simple fictitious county where there is no vote manipulation. In our fictitious county every age group turns out at a different rate, but in our first election every age group has the same number of people—100 residents.

Using this fictitious county, we follow Doug Frank’s procedure to estimate a predicted turnout rate for each group. The predicted turnout rate correlates with the actual turnout rate at about 0.26.

Because every age group is the same size, we find the same correlation when we predict the number of people from each age group who vote, rather than their turnout rate.

But if there is variation in the number of people in each age group, the correlation between our predicted count of voters and the actual number of voters from each age group will appear stronger.

To see this, let’s return to our fictitious county and make one age group slightly bigger than the other age groups—but critically we leave all other age groups the same size.

The plot on the left shows the new relationship between the predicted and actual turnout counts as we increase the size of this single age group.

To get a sense of how increasing the size of the largest group changes the correlation between predicted and actual counts, the plot on the right shows how the correlation changes as we increase the size of the largest group.

The horizontal axis is the size of the largest group and the vertical axis is the correlation between predicted and actual turnout counts—Douglas Frank’s test of voter fraud.

The line tracks the evolution of the correlation as we change the size of the largest group and the point shows the correlation between predicted and actual turnout counts for the plot on the left.

Put simply, the plot on the right shows that Frank’s supposedly unnatural correlation can have quite natural origins. As the largest group gets bigger, the correlation between the predicted and actual counts gets closer to 1, even though the underlying prediction about turnout rates remains identical, with a correlation of about 0.26.

The explanation is simple: making one group bigger results in more variation across age groups and this variation swamps the correlation.
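The same experiment can be sketched in Python. The rates below are hypothetical and are not the ones behind the plots, so the rate correlation here is roughly 0.42 rather than 0.26; `count_correlation` and `pearson` are illustrative helper names. Every group starts at 100 people, and only the last group is enlarged.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical turnout rates for five age groups; these never change.
actual_rates = [0.3, 0.5, 0.4, 0.6, 0.5]
predicted_rates = [0.4, 0.4, 0.5, 0.5, 0.6]

def count_correlation(largest_group_size, base_size=100):
    """Correlation of predicted vs. actual counts when one group is enlarged."""
    sizes = [base_size] * (len(actual_rates) - 1) + [largest_group_size]
    actual = [n * r for n, r in zip(sizes, actual_rates)]
    predicted = [n * r for n, r in zip(sizes, predicted_rates)]
    return pearson(predicted, actual)

r_rates = pearson(predicted_rates, actual_rates)
print("rate correlation:", round(r_rates, 3))
for size in (100, 200, 1000):
    print("largest group =", size, "count correlation:", round(count_correlation(size), 3))
```

With equal group sizes the count correlation matches the rate correlation, because multiplying every rate by the same constant does not change a correlation. Doubling the one group already lifts the count correlation to roughly 0.93, and at ten times the size it exceeds 0.99, all without touching the turnout rates.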

Bottom Line: Douglas Frank’s supposed evidence of fraud is not really evidence at all.

His tests aren’t really tests, there is plenty of county-to-county variation, and his statistical analysis exaggerates his predictive ability.

Learn more:

You can find an analysis of turnout rates and counts for more than 2,800 counties.