COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them

- Confidence levels should be chosen before analyzing the data, and preferably before gathering the data.^{1}
- The choice should be based on the consequences of having a large or small confidence level.

Example 1:
Two drugs are being compared for effectiveness in treating
the same condition. Drug 1 is very affordable,
but Drug 2 is extremely expensive. The
confidence interval is for (proportion of patients for whom Drug 2 is
effective) - (proportion of patients for whom Drug 1 is
effective). If 0 is not
in the confidence interval, Drug 2 will be deemed more effective and
will be recommended over Drug 1, resulting in much greater cost for the
patient than if Drug 1 were used. If 0 is in the confidence interval,
this will be taken as evidence of equal effectiveness, and the less
expensive Drug 1 will be recommended. From the patient's perspective,
being steered to the expensive drug when it is not really more
effective would be a serious consequence. Thus the patient would
consider it important to have a high level of confidence. (Note that a
high confidence level corresponds to a small significance level; cf.
Example 1 in Type I and II Errors.)
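
The decision rule in Example 1 can be made concrete with a small sketch. The counts below are hypothetical (they are not from any study), the function name `diff_proportion_ci` is invented for illustration, and the interval is the simple Wald (normal-approximation) interval, which is only one of several ways to build an interval for a difference of proportions.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, level):
    """Wald (normal-approximation) interval for p2 - p1.

    x1, x2 are success counts and n1, n2 are group sizes;
    level is the confidence level as a fraction, e.g. 0.95.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Critical value z* with central probability `level` between -z* and z*.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff - z * se, diff + z * se


# Hypothetical data, assumed for illustration only.
x1, n1 = 120, 200  # Drug 1 effective for 120 of 200 patients
x2, n2 = 140, 200  # Drug 2 effective for 140 of 200 patients

for level in (0.90, 0.95, 0.99):
    lo, hi = diff_proportion_ci(x1, n1, x2, n2, level)
    verdict = "contains 0" if lo <= 0 <= hi else "excludes 0"
    print(f"{level:.0%} interval: ({lo:+.3f}, {hi:+.3f}) {verdict}")
```

With these made-up counts, the 90% and 95% intervals exclude 0 while the 99% interval contains it. The same data therefore lead to recommending the expensive drug at one confidence level and the affordable one at another, which is why the level should be fixed in advance.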


Example 2:
Two drugs are known to be equally effective for a certain
condition. They are also equally affordable. However, there is
some suspicion that Drug 2 causes a serious side effect in some
patients, whereas Drug 1 has been used for decades with no reports of
serious side effects. The confidence interval is for (incidence of side
effect with Drug 2) - (incidence of side effect with Drug 1). If 0 is in
the confidence interval, the investigators will decide that the
incidence of the side effect is the same for both drugs. The higher
the confidence level, the wider the confidence interval, so the more
likely it is to contain 0 and the more likely that this decision will
be made. If this is the wrong decision,
the results could be serious for the patient. Thus the patient would
consider a lower confidence level to be preferable to a higher one.
(Note that a low confidence level corresponds to a large significance
level; cf. Example 2 in Type I and II Errors.)

- Sometimes each alternative may have serious consequences, so some compromise or weighing of priorities may be necessary.
- Sometimes different stakeholders have competing interests (e.g., in the second example above, the developers of Drug 2 might prefer a higher confidence level).
- See the discussion of Power for more on deciding on a significance level.
- These are essentially the same considerations as are involved in setting significance levels for hypothesis tests (see Type I and II Errors).

1. There are (at least) two reasons why this is important. First, the confidence level desired is one criterion in deciding on an appropriate sample size. (See Power for more information.) Second, if more than one confidence interval will be calculated, additional considerations need to be taken into account. (See Multiple Inference for more information.)
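
To illustrate the first reason, here is a rough sketch of how the chosen confidence level feeds into a sample-size calculation. It assumes the goal is a Wald interval for a difference of two proportions with a given half-width (margin of error), equal group sizes, and the conservative guess p = 0.5 for both groups. The function name `n_per_group` and the 0.05 margin are hypothetical, and a power-based calculation would follow different reasoning.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(margin, level, p1=0.5, p2=0.5):
    """Per-group sample size so that the Wald interval for p2 - p1
    has half-width at most `margin` at confidence level `level`.

    p1 and p2 are planning guesses; 0.5 is the most conservative choice.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    # Solve z * sqrt(variance_sum / n) <= margin for n, rounding up.
    return ceil(z ** 2 * variance_sum / margin ** 2)


# Assumed target margin of 0.05 (five percentage points).
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: {n_per_group(0.05, level)} patients per group")
```

Under these assumptions the required sample size grows noticeably as the confidence level rises, so a level chosen after the data are gathered may not be achievable at the desired precision.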