Misinterpretations of p-values

Misinterpretations and misuses of p-values

The p-value is the probability of obtaining a test statistic at least as extreme as the one computed from the data at hand, assuming that:

the model assumptions for the inference procedure used are all true, and

the null hypothesis is true, and

the random variable is the same (including the same population), and

the sample size is the same.

Notice that this is a conditional probability: The probability that something happens, given that various other conditions hold. One common misunderstanding is to neglect some or all of the conditions.¹
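To make the conditioning concrete, here is a minimal sketch that estimates a one-sided p-value by simulation for a small two-group trial. It is only an illustration: the counts (9 of 15 improving on a drug, 6 of 15 on a placebo) are hypothetical, the function name `permutation_p_value` is made up for this sketch, and shuffling group labels is just one of several ways to simulate "the null hypothesis is true." Note how each condition in the definition appears in the code: the pooled outcomes and the group sizes are held fixed, and only the assignment of patients to groups varies.

```python
import random

def permutation_p_value(drug_improved, drug_n, placebo_improved, placebo_n,
                        n_sim=20000, seed=0):
    """Estimate P(statistic at least as extreme as observed | null, same sizes).

    Under the null hypothesis the drug/placebo labels are interchangeable,
    so we reshuffle the pooled outcomes and re-split them into groups of the
    SAME sizes as in the original study.
    """
    rng = random.Random(seed)
    total_improved = drug_improved + placebo_improved
    total_n = drug_n + placebo_n
    # 1 = improved, 0 = did not improve, pooled over both groups
    outcomes = [1] * total_improved + [0] * (total_n - total_improved)

    at_least_as_extreme = 0
    for _ in range(n_sim):
        rng.shuffle(outcomes)
        # The first drug_n shuffled outcomes play the role of the drug group
        simulated_drug_improved = sum(outcomes[:drug_n])
        if simulated_drug_improved >= drug_improved:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sim

# Hypothetical counts, chosen only for illustration
p_estimate = permutation_p_value(9, 15, 6, 15)
print(f"estimated one-sided p-value: {p_estimate:.3f}")
```

Changing the group sizes, the population the outcomes come from, or the null model changes the simulated distribution, and therefore the p-value. That is why none of the conditions can be dropped when interpreting it.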

Example: Researcher 1 conducts a clinical trial to test a drug for a certain medical condition on 30 patients, all of whom have that condition. The patients are randomly assigned to either the drug or a look-alike placebo (15 each). Neither the patients nor the medical personnel know which patients receive the drug and which receive the placebo. Treatment is exactly the same for both groups, except for whether the drug or the placebo is used. The hypothesis test has null hypothesis "proportion improving on the drug is the same as proportion improving on the placebo" and alternative hypothesis "proportion improving on the drug is greater than proportion improving on the placebo." The resulting p-value is p = 0.15.

Researcher 2 conducts another clinical trial on the same drug, with the same placebo and everything else the same, except that 200 patients are randomized to the treatments, with 100 in each group. The same hypothesis test is conducted with the new data, and the resulting p-value is p = 0.03.

Are these results contradictory? No. Because the sample sizes differ, the two p-values are not directly comparable, even though everything else is the same. (In fact, when the drug really does have an effect, a larger sample size typically results in a smaller p-value; see the discussion of power.)
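The effect of sample size can be seen in a small calculation. The sketch below uses a one-sided Fisher exact test, which is an assumption made here for illustration; the example does not say which test the researchers used. The counts are also hypothetical (the example reports only the p-values): both trials are given the same observed proportions, 60% improving on the drug and 40% on the placebo, so that only the sample size differs.

```python
from math import comb

def fisher_one_sided(drug_improved, drug_n, placebo_improved, placebo_n):
    """One-sided Fisher exact p-value.

    Probability, under the null hypothesis and with the row and column
    totals held fixed, that the drug group contains at least as many
    improvers as were observed (a hypergeometric tail probability).
    """
    total_n = drug_n + placebo_n
    total_improved = drug_improved + placebo_improved
    max_k = min(drug_n, total_improved)
    tail = sum(
        comb(total_improved, k) * comb(total_n - total_improved, drug_n - k)
        for k in range(drug_improved, max_k + 1)
    )
    return tail / comb(total_n, drug_n)

# Hypothetical counts: same observed proportions, different sample sizes
p_small = fisher_one_sided(9, 15, 6, 15)       # 15 patients per group
p_large = fisher_one_sided(60, 100, 40, 100)   # 100 patients per group

print(f"15 per group:  p = {p_small:.4f}")
print(f"100 per group: p = {p_large:.4f}")
```

With identical observed proportions, the larger trial gives the smaller p-value. The two numbers answer different conditional questions (one assumes 15 patients per group, the other 100), so a difference between them is not a contradiction.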

Another common misunderstanding of p-values is the belief that the p-value is "the probability that the null hypothesis is true." This cannot be right in the frequentist framework, whose basic assumption is that the null hypothesis is either true (in which case the probability that it is true is 1) or false (in which case the probability that it is true is 0).²

1. Neglecting the condition that the population is the same leads to extrapolating the results beyond the population studied, one form of over-interpretation.

2. In the Bayesian perspective, it makes sense to consider "the probability that the null hypothesis is true" as having values other than 0 or 1. In that perspective, we consider "states of nature"; in different states of nature, the null hypothesis may have different probabilities of being true. The goal is then to determine the probability that the null hypothesis is true, given the data. This is the reverse of the conditional probability considered in frequentist inference (the probability of the data, given that the null hypothesis is true).
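The difference between the two conditional probabilities in note 2 can be sketched with Bayes' rule. This is a deliberately simplified illustration, not a full Bayesian analysis: it assumes a single specific alternative hypothesis, and every number in it (the prior probabilities and both likelihoods) is made up. The function name `posterior_prob_null` is likewise hypothetical.

```python
def posterior_prob_null(prior_null, lik_given_null, lik_given_alt):
    """P(H0 | data) from Bayes' rule, assuming one point alternative H1."""
    joint_null = prior_null * lik_given_null         # P(H0) * P(data | H0)
    joint_alt = (1.0 - prior_null) * lik_given_alt   # P(H1) * P(data | H1)
    return joint_null / (joint_null + joint_alt)

# Made-up likelihoods, for illustration only
lik_given_null = 0.03   # P(data | H0): the "frequentist direction"
lik_given_alt = 0.30    # P(data | H1)

# The same data lead to different P(H0 | data) under different priors
posteriors = {}
for prior_null in (0.5, 0.9):
    posteriors[prior_null] = posterior_prob_null(
        prior_null, lik_given_null, lik_given_alt
    )
    print(f"prior P(H0) = {prior_null}: "
          f"P(H0 | data) = {posteriors[prior_null]:.3f}")
```

The probability of the data given the null hypothesis stays at 0.03 throughout, while the probability of the null hypothesis given the data depends on the prior. The two quantities are different, and the second cannot be read off from the first alone.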

Common Mistakes in Using Statistics: Spotting and Avoiding Them
