^{COMMON
MISTEAKS
MISTAKES IN
USING STATISTICS: Spotting and Avoiding Them}

^{Expecting
Too Much Certainty}

"Statistics is the 'Science of Uncertainty',"
Noel Cressie and Christopher K. Wikle, Statistics for Spatio-Temporal Data, Wiley, 2011, p. 4

Uncertainty is all around us; we can't expect certainty. But uncertainty can often be "quantified" -- that is, we can talk about degrees of certainty or uncertainty. This is the idea of probability: a higher probability expresses a higher degree of certainty that something will happen.

Statistical techniques are designed to help us understand areas where uncertainty is present and can be quantified. Most statistical techniques are based on probability. W. Edwards Deming, a pioneer in the use of statistics in industry, said, "It is his knowledge and use of the theory of probability that distinguishes the statistician from the expert in chemistry, agriculture, bacteriology, medicine, production, consumer research, engineering, or anything else." ¹

Contemporary statistician Xiao-Li Meng reiterates and expands on this idea, using the words "randomness" and "variation" instead of uncertainty:

Statistics, in a nutshell, is a discipline that studies the best ways of dealing with randomness, or more precisely and broadly, variation. As human beings, we tend to love information, but we hate uncertainty -- especially when we need to make decisions. Information and uncertainty, however, are actually two sides of the same coin. If I ask you to go to the airport to pick up a student you have never met, my description of her is information only because there are variations; if everyone at the airport looks identical, my description has no value. On the other hand, the same variation causes uncertainty. If all I tell you is to pick up a Chinese female student ..., then my description is not informative enough because it still allows too many variations. There may be a substantial number of individuals at the airport who look like a Chinese female student. ²

Statistical techniques can't eliminate uncertainty, but they can help us gain some knowledge despite it. They can help us see patterns through it, and help us quantify how certain or uncertain we should be that the patterns are real and not just chance artifacts of our data or of our perception. The following quote from mathematics educator Alan Schoenfeld nicely expresses reasonable expectations in fields where statistics is likely to be applied:

Consider the theory of evolution, for example. Biologists are in general agreement with regard to its essential correctness, but the evidence marshalled in favor of evolution is quite unlike the kind of evidence used in mathematics or physics. There is no way to prove that evolution is correct in a mathematical sense; the arguments that support it consist of (to borrow the title of one of Pólya’s books) “patterns of plausible reasoning”, along with the careful consideration of alternative hypotheses. In effect, biologists have said the following: “We have mountains of evidence that are consistent with the theory, broadly construed; there is no clear evidence that falsifies the proposed theory, and no rival hypotheses meet the same criteria.” ³

In other words, in many areas, we can't expect certainty, or even anything approaching it, from a single study. But an accumulated body of evidence based on high-quality research can give us a high degree of certainty. Working well in a field with high degrees of uncertainty requires patience, and often humility, while the mountains of evidence accumulate, since that evidence might not turn out to support our pet theories. Statistician Howard Wainer said it well on the last page of one of his fascinating books on visual representations of data:

… to deal with uncertainty successfully we must have a kind of tentative humility. We need a lack of hubris to allow us to see data and let them generate, in combination with what we already know, multiple alternative working hypotheses. These hypotheses are then modified as new data arrive. The sort of humility required was well described by the famous Princeton chemist Hubert N. Alyea, who once told his class, “I say not that it is, but that it seems to be; as it now seems to me to seem to be.”⁴
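The contrast drawn above between a single study and an accumulated body of evidence can be illustrated with a small simulation. This is only a minimal sketch in Python, not a description of any real research: the true effect (0.3), the standard deviation, the study size, and the function names `run_study` and `pooled_estimates` are all assumptions chosen for illustration.

```python
import random
import statistics


def run_study(rng, true_effect=0.3, sd=1.0, n=25):
    """Simulate one hypothetical study; return its estimated effect (the sample mean)."""
    observations = [rng.gauss(true_effect, sd) for _ in range(n)]
    return statistics.fmean(observations)


def pooled_estimates(estimates):
    """Running average of study estimates (studies are equal-sized, so equal weights)."""
    running, total = [], 0.0
    for k, est in enumerate(estimates, start=1):
        total += est
        running.append(total / k)
    return running


rng = random.Random(42)  # fixed seed so the sketch is repeatable
estimates = [run_study(rng) for _ in range(200)]
pooled = pooled_estimates(estimates)

# Individual studies scatter around the assumed true effect ...
print("spread of single-study estimates:", round(statistics.stdev(estimates), 3))
# ... while the pooled estimate settles down as studies accumulate.
print("pooled estimate after 1, 10, 200 studies:",
      [round(pooled[i], 3) for i in (0, 9, 199)])
```

Under these assumptions, any one study can land noticeably above or below the true effect, while the pooled estimate drifts toward it as more studies are added. Real bodies of evidence are messier (studies differ in quality, size, and design), so the sketch shows the general idea rather than a method for combining actual studies.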

In case you need more support to convince yourself or someone else to give uncertainty the respect it is due, here are some more quotes about uncertainty. You might also try the Radiolab episode on Stochasticity, David Spiegelhalter's Times Online article and video on uncertainty in science, David Aldous's Annotated list of contexts where we perceive chance, or Charles Seife's⁵ Edge article on Randomness.

For speculations by a neurologist (and engaging writer) on why we have so much difficulty accepting uncertainty, see Robert Burton's On Being Certain.⁶

Common Mistakes Arising from Not Taking Uncertainty Seriously Enough

One consequence of not taking uncertainty seriously enough is that authors often write results in terms that misleadingly suggest certainty. For example, some authors might conclude from a study that a hypothesis is true or has been proved, when it would be more correct to say that the evidence supports the hypothesis or is consistent with the hypothesis.

Another consequence is (mis)interpreting results of statistical analyses in a deterministic rather than probabilistic (also called stochastic) manner.
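The difference between a deterministic and a probabilistic reading of the same result can be sketched in a few lines of Python. The data below are simulated, and the helper name `mean_with_interval`, the sample size, and the use of a normal-approximation 95% interval (multiplier 1.96) are assumptions made for illustration; they are not taken from any particular study.

```python
import random
import statistics


def mean_with_interval(sample, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% interval around the mean.

    Uses the normal approximation: mean +/- z * (sample SD / sqrt(n)).
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return mean, mean - z * sem, mean + z * sem


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical data: 40 measurements from a process with true mean 10 and SD 2
sample = [rng.gauss(10.0, 2.0) for _ in range(40)]
mean, lower, upper = mean_with_interval(sample)

# Deterministic reading: a single number reported as if it were exact.
print(f"Deterministic reading: the value is {mean:.2f}")
# Probabilistic reading: the same number, with a statement of its uncertainty.
print(f"Probabilistic reading: estimate {mean:.2f}, "
      f"approximate 95% interval ({lower:.2f}, {upper:.2f})")
```

Both lines report the same estimate; only the second acknowledges that a different sample from the same process would have given a somewhat different number.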

Discussion of Terminology: Variation, Variability, Uncertainty

1. Deming, W. Edwards, Walter A. Shewhart, 1891 - 1976, Amstat News, September, 2009, p. 19
2. Meng, Xiao-Li, Statistics: Your Chance for Happiness (or Misery), Amstat News, September, 2009, p. 43
3. Schoenfeld, Alan, Purposes and Methods of Research in Mathematics Education, Notices of the American Mathematical Society, v. 47, 2000, pp. 641 - 649. Available at http://www.ams.org/notices/200006/fea-schoenfeld.pdf
4. Wainer, Howard, Picturing the Uncertain World, Princeton University Press, 2009, p. 210.
5. Seife's book Proofiness: The Dark Arts of Mathematical Deception (Penguin, 2010) is also a good read.
6. Burton, Robert, On Being Certain: Believing You Are Right Even When You're Not, St. Martin's Press, 2008.

Last updated Sept 25, 2011
