If you follow the political news, you have probably come across discussion of poll results that are within or beyond the “margin of error”. The margin of error is a statistic associated with the poll; the results reported in the newspapers typically include it in the fine print towards the bottom, and occasionally the pundits even mention it. But what exactly is it? Is it really that important? And what is the right way to make use of it? Read on — as little or as much as you’d like — for an explanation.
The short version
Polling involves recruiting a random sample of people and recording their answers to the poll questions. The results are usually reported as precise values, which give us an estimate of the population’s views. But the sample is only a subset of the population, and that estimate will have some amount of error.
The margin of error lets us estimate a range within which we can be reasonably confident the population’s views actually fall. The sample values, our best estimate, sit in the middle of that range, which extends above and below that point by the margin of error. In other words, we estimate that the population’s real support for any given polling response is within one margin of error above or below the percentage response in the poll’s sample.
The margin of error should be taken into account whenever we want to use polls to make inferences about public opinion or changes in political sentiment. When margins of error are not considered, we are left vulnerable to misinterpretations and misrepresentations of the poll’s findings. We might see differences and trends where nothing is really happening. Or we might see a Narrowing™ of a traditional gap between parties when it could just be an effect of the samples selected in the latest poll. The margin of error reminds us that refining our knowledge requires replication and the search for patterns, rather than just plucking a single, neat number derived from a relatively small group of people and treating it as gospel truth.
The example
It can help to have a concrete example to illustrate the concepts as we go. Let’s start, Dennis Shanahan-style, with the only number that (sometimes) matters — the question of preferred Prime Minister. Newspoll asks the question as, “Who do you think would make the better PM?” Respondents have three choices: Julia Gillard, Tony Abbott, or uncommitted. The most recent Newspoll data come from early December 2010 — in that poll, the responses from a sample of 1123 randomly selected voters were 52% to Gillard, 32% to Abbott and 16% uncommitted (the Newspoll data are in this 4.21MB[?!] PDF).
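As a preview of the calculation the later sections build up to, here is a short Python sketch that applies the conventional formula for a 95% margin of error under simple random sampling, MoE = 1.96 * sqrt(p(1 - p) / n), to the Newspoll figures above. The formula and confidence level are the standard textbook choices, not anything Newspoll publishes alongside its results.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a sample proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1123  # Newspoll's sample size
for name, pct in [("Gillard", 0.52), ("Abbott", 0.32), ("Uncommitted", 0.16)]:
    moe = margin_of_error(pct, n)
    print(f"{name}: {pct:.0%} +/- {moe:.1%} "
          f"(range {pct - moe:.1%} to {pct + moe:.1%})")
```

Run on these numbers, the sketch gives roughly +/- 2.9 points for Gillard’s 52%, so our estimate of her support in the full voting population runs from about 49% to about 55%. Pollsters usually quote the worst-case figure, calculated at p = 0.5, which for a sample of 1123 is also about +/- 2.9 points.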
The concept
Polls — along with most other research involving human participants — are conducted by measuring the responses of a random sample of people to the poll’s questions. But the aim is to use the sample to draw conclusions about the attitudes of the population as a whole. We don’t just want to know about the 1123 people who answered the questions — we want to use those people’s responses to infer what Australian voters on the whole think.
The statistics associated with a sample — for instance, the percentage of people who choose a given response — provide an estimate of the corresponding parameters in the population. But there will be some amount of sampling error. In other words, because we only have responses from a sample, our statistics are unlikely to be a perfect representation of the true results we would find if we polled the entire population. By chance, the sample might have included a higher proportion of people who regard Gillard as the better PM than exists in the population, or a lower one.
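To make sampling error concrete, here is a hypothetical simulation sketch (an illustration, not real polling data): suppose the population’s true support for Gillard really were 52%, draw repeated random samples of 1123 voters, and watch how much the sample percentages bounce around that true value.

```python
import random

TRUE_SUPPORT = 0.52   # assumed population value, for illustration only
SAMPLE_SIZE = 1123    # matches the Newspoll sample
NUM_POLLS = 10        # number of hypothetical polls to simulate

for poll in range(1, NUM_POLLS + 1):
    # Each respondent independently names Gillard with probability TRUE_SUPPORT.
    supporters = sum(random.random() < TRUE_SUPPORT for _ in range(SAMPLE_SIZE))
    print(f"Poll {poll}: {supporters / SAMPLE_SIZE:.1%} support")
```

Most of the simulated polls land within a couple of points of 52%, but none of them is obliged to hit it exactly; that scatter is sampling error.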
So, while the exact percentage in our sample might be the best estimate we can make about the population, we also need to recognise that there is some degree of uncertainty about the true value. The margin of error indicates just how much uncertainty there is.
Now, let’s look at the logic behind estimating the margin of error (NB: if you feel you got the concept from the previous section and want to skip the logical and computational details, use the link below and then select a later section).
*This post originally appeared on David Mallard’s blog — click here to read the full version.
Thanks, David, for adding to the understanding of polling. Margins of error regularly go unexplained or underexplained in the Australian media. Beyond the margin of error itself, I would also hope that Australian editors would devote some ink to explaining the degree of confidence researchers have in the results and what that would mean if the entire population were surveyed.
In Canada, there is a convention that is used across the industry and the media. The words are: “A survey with a sample of this size and a 100% response rate would have an estimated margin of error of +/-2.6 percentage points, 19 times out of 20, of what the results would have been had the entire population of adults in Canada been polled.” Alternatively, editors could say that the result is “predicted to be replicated 95% of the time.” Either way, it would help readers understand poll results better and protect all of us from poor-quality (small sample size) polling.
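For context, the 2.6-point figure in that wording can be sanity-checked with the same conventional formula used earlier: inverting MoE = 1.96 * sqrt(p(1 - p) / n) at the worst case of p = 0.5 backs out the sample size the convention implies. The snippet below is illustrative arithmetic only, not something drawn from the Canadian convention itself.

```python
z = 1.96     # 95% confidence level ("19 times out of 20")
moe = 0.026  # +/- 2.6 percentage points
p = 0.5      # worst-case proportion, the usual reporting convention

# Invert moe = z * sqrt(p * (1 - p) / n) to solve for n.
n = (z ** 2) * p * (1 - p) / moe ** 2
print(f"Implied sample size: about {n:.0f} respondents")  # roughly 1,420
```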
In a society in which ‘academic’ research is routinely reported as if correlation meant the same as causation, don’t margins of error, while important, seem further down the list of ‘need to know’ items?