The daily polls report vote percentages like 48-45 or 49-47, leaving 3-6 percent of the votes undecided. Most Americans, we're told, made up their minds long ago. What excitement! No doubt this will be another squeaker!
Well, it ain't necessarily so.
If the electorate were really so predetermined, why have we seen such variation in polling numbers? In major polls since June, Bush's percentage has ranged from as low as 41% to as high as 51%, and Kerry's from 42% to 54%. In August alone, Bush's numbers have ranged from 43% to 50% and Kerry's from 44% to 52%. Where does all this movement come from if most of the voters have already decided?
There are several answers, each of which tells part of the story.
- Margin of error. The least significant explanation is the well-known margin of error. Typically 3-5 percentage points, depending on the sample size, it indicates how closely each poll sample is likely to represent the electorate as a whole. In other words, if you draw 500-1000 red or blue marbles out of a vat of 100 million, the laws of probability indicate that you'll likely be within 3-5 percentage points of the actual distribution of red and blue marbles in the vat.
The margin of error is inherent to the science of polling. By pure chance, some polls will come out high and others low. The results of a single poll are thus not necessarily a reliable indicator of changes in public opinion.
Why isn't this a sufficient explanation? Because the variations we see in polls are not limited to minor fluctuations within the margin of error. The margin of error explains how one poll can report significantly different results from others taken under similar circumstances. However, we see clear changes in polling trends over time, which are reflected in multiple polls simultaneously.
In mid-July, Kerry consistently led the polls by 3-5 percentage points; now Bush leads by 0-3 points. These sorts of shifts have nothing to do with the margin of error.
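As a rough check on the 3-5 point range quoted above, here is a minimal Python sketch of the textbook margin-of-error formula. It assumes a simple random sample, a 95% confidence level, and a roughly even split between the candidates; the function name `margin_of_error` is made up for illustration. Real pollsters weight and adjust their samples, so their published margins may differ.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error, in percentage points.

    Assumes a simple random sample. z=1.96 corresponds to a 95%
    confidence level, and proportion=0.5 is the worst case (an even split).
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return 100 * z * standard_error

# Sample sizes taken from the marble example: 500 and 1000 draws.
for n in (500, 1000):
    print(f"n={n}: +/- {margin_of_error(n):.1f} points")
```

Under these assumptions, 500 respondents give a margin of about 4.4 points and 1000 respondents about 3.1 points. Note that the margin shrinks only with the square root of the sample size, which is why doubling the sample does not halve the error.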
- Sampling problems. Polling is not as simple as picking people at random and asking them questions. How do you select people? By phone number? What if some households have three phones for one voter, and others have three voters for one phone? What if these ratios vary by state, or neighborhood, or political preference? And what about people who refuse to participate? Declining to answer a poll doesn't mean they won't vote.
Pollsters go to great lengths to select their samples in a representative manner, or to adjust them afterwards to match the demographics of the voting population. But they don't often say much about what techniques they use, and none of the uncertainty involved in this process is counted as part of the margin of error.
- Voter turnout. Especially in the US, with its low voter turnout, a major factor in election results is which voters actually decide to show up on election day. This is notoriously hard to predict. It's much easier to say on the telephone that you're going to vote than to actually get out to the polling booth.
- Leaners. Finally, the headline poll results don't reveal the biggest secret: they include undecided voters. Generally, the polltaker will ask, "If the election were held today, who would you vote for?" Then, if the response is "I'm not sure," they'll follow up with "Which way are you leaning?" Both the definites and the leaners are reported as supporting their respective candidates.
Take the August Los Angeles Times poll, for example. It reported Bush over Kerry 49%-46%, with 5% "don't know". But when asked whether they're certain of their votes or might vote for someone else, 16% of those who had supported a candidate said they might change their minds. So how many "don't know"s are there really - 5% or 20% (the original don't knows plus those who might change their minds)?
Pollsters might say that most of the leaners will generally end up voting for the candidate they now prefer. But maybe they won't. Isn't that what they mean when they say they might change their minds?
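The arithmetic behind the "5% or 20%" question can be spelled out in a few lines of Python. This is only a sketch using the Los Angeles Times figures quoted above, and it assumes the 16% applies to the 95% of respondents who named a candidate; the variable names are illustrative.

```python
# Figures quoted from the August Los Angeles Times poll.
dont_know = 5.0                        # percent answering "don't know"
named_candidate = 100.0 - dont_know    # 49% Bush + 46% Kerry = 95%
might_change_share = 0.16              # share of those 95% who might change their minds

# Assumption: the 16% is measured against those who named a candidate.
might_change = named_candidate * might_change_share
not_sure = dont_know + might_change
firm = named_candidate - might_change

print(f"Named a candidate but might change: {might_change:.1f}%")
print(f"Not sure in total:                  {not_sure:.1f}%")
print(f"Firmly decided:                     {firm:.1f}%")
```

On that reading, about 15.2% of all respondents named a candidate but might switch, which together with the 5% "don't know" comes to roughly 20% who are not yet sure.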
So when you see poll results reported, remind yourself: About 20% aren't sure of their votes yet. Those who are might not turn out to cast them. And how exactly were the polling samples collected and adjusted?
Only then, keep in mind the margin of error.
Update (31 Aug.): T. Bevan of RealClearPolitics writes, "The Bush campaign estimates the undecided vote at about 7%. [Republican strategist Matthew] Dowd says the number of 'true' undecideds is probably half that, about 3 or 4 percent."