H_{0}: All non-constant coefficients in the regression equation are zero

and alternate hypothesis

H_{a}: At least one of the non-constant coefficients in the regression equation is non-zero.

More explicitly, if Y is the response variable, the predictors are X_{1}, X_{2}, ... , X_{m}, and the model equation is

(*) E(Y|X_{1}, X_{2}, ... , X_{m}) = β_{0} + β_{1}X_{1} + β_{2}X_{2} + ... + β_{m}X_{m},

then the null and alternate hypotheses for this F-test are

H_{0}: β_{1} = β_{2} = ... = β_{m} = 0

and H_{a}: At least one of β_{1}, β_{2}, ... , β_{m} is non-zero.
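For readers who want to see the arithmetic behind this test, here is a minimal sketch (with made-up data; the function name is illustrative, not from the original) of how the overall F-statistic is computed for a simple linear regression with m = 1 predictor:

```python
# Minimal sketch: the overall F-statistic for the simple linear
# regression E(Y|X) = beta_0 + beta_1*X, computed from scratch.
# (Illustrative only; the data and names below are made up.)

def overall_f(x, y):
    """F = (SSR/m) / (SSE/(n - m - 1)) with m = 1 non-constant coefficient."""
    n = len(x)
    m = 1
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = ybar - b1 * xbar               # least-squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # residual SS
    sst = sum((yi - ybar) ** 2 for yi in y)                        # total SS
    ssr = sst - sse                                                # regression SS
    return (ssr / m) / (sse / (n - m - 1))

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 5.0]
print(round(overall_f(x, y), 2))  # → 56.33
```

A large F (compared to the F distribution with m and n - m - 1 degrees of freedom) yields a small p-value and hence rejection of H_{0}.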

Misinterpreting the output of this hypothesis test is a common mistake in regression. Two types of errors are especially frequent.

First type of mistake: Assuming that if the output for this hypothesis test has a small p-value, then the regression equation fits the data well.

Second type of mistake: Assuming that if the output for this hypothesis test does not show statistical significance, then Y does not depend on the variables X_{1}, X_{2}, ... , X_{m}.

Both mistakes are based on neglecting a model assumption -- namely, the assumption expressed by (*): that the conditional mean E(Y|X_{1}, X_{2}, ... , X_{m}) is a linear function of X_{1}, X_{2}, ... , X_{m}.

Examples of each type of mistake:

1. The following graph shows DC output vs. wind speed for a windmill. Fitting the simple linear model

E(DC output|wind speed) = β_{0} + β_{1}×(wind speed)

gives overall F-statistic 160.257 with 1 degree of freedom, and corresponding p-value 7.5455E-12, which is certainly statistically significant.

However, the data clearly have a curved pattern; thus a model equation expressing a suitable curved relationship will fit better than a linear model equation. (For a good way to do this, see Example 3 of Overinterpreting High R².)

Of course, in a case with several predictor variables, it is typically difficult (if not impossible) to tell in advance whether or not a linear model fits. Thus, unless there is other evidence that a linear model does fit, all that a statistically significant F-test can say is that the data give evidence that the best-fitting linear model of the type specified has at least one predictor with a non-zero coefficient.
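The point can be sketched numerically (made-up data standing in for the windmill example, not the original data): even when the relationship is exactly curved, a straight-line fit can still earn a large, "significant" overall F-statistic, while the residuals reveal the curvature.

```python
# Sketch with made-up data: the relationship here is exactly curved
# (y = x^2), yet a straight-line fit still produces a large, highly
# "significant" overall F-statistic.

def linear_fit(x, y):
    """Least-squares intercept and slope for E(Y|X) = b0 + b1*X."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    return ybar - b1 * xbar, b1

x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]               # perfectly curved, not linear

b0, b1 = linear_fit(x, y)
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ybar = sum(y) / len(y)
sse = sum(r ** 2 for r in resid)
sst = sum((yi - ybar) ** 2 for yi in y)
f_stat = (sst - sse) / (sse / (len(x) - 2))

print(round(f_stat, 1))   # → 60.0: "significant", despite the wrong model
print(resid)              # → [5.0, 0.0, -3.0, -4.0, -3.0, 0.0, 5.0]
```

The residuals swing from positive to negative and back -- the systematic pattern that, as in the windmill example, signals that a curved model equation is needed.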

One method that sometimes works to get around this problem is to (attempt to) transform the variables to have a multivariate normal distribution, then work with the transformed variables. This ensures that the conditional means are a linear function of the transformed explanatory variables, no matter which subset of explanatory variables is chosen. Such a transformation is sometimes possible with some variant of a Box-Cox transformation procedure. See, e.g., pp. 236 and 324-329 of Cook and Weisberg's text (Note 2).
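As a rough illustration of the Box-Cox idea, here is a pure-Python sketch of the one-variable version (in practice one would use a packaged routine such as scipy.stats.boxcox, and the multivariate procedure in Cook and Weisberg is more involved; the data below are made up):

```python
# Sketch: one-variable Box-Cox by grid search. The transform is
# (y^lam - 1)/lam for lam != 0 and log(y) for lam = 0; lam is chosen
# to maximize the Box-Cox profile log-likelihood.
import math

def boxcox_transform(y, lam):
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]

def profile_loglik(y, lam):
    """Profile log-likelihood of lam, up to an additive constant."""
    n = len(y)
    z = boxcox_transform(y, lam)
    zbar = sum(z) / n
    var = sum((v - zbar) ** 2 for v in z) / n
    # normal log-likelihood of transformed data plus Jacobian term
    return -n / 2 * math.log(var) + (lam - 1) * sum(math.log(v) for v in y)

def best_lambda(y, grid=None):
    grid = grid or [i / 100 for i in range(-200, 201)]
    return max(grid, key=lambda lam: profile_loglik(y, lam))

y = [1.2, 1.8, 2.9, 4.6, 7.1, 11.3, 18.0]   # skewed, made-up data
lam = best_lambda(y)
```

By construction, the chosen lam does at least as well (in profile log-likelihood) as leaving the data untransformed (lam = 1).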

2. The following graph shows data and the computed regression line.

Notes:

1. In the expression (*) used above, E(Y|X_{1}, X_{2}, ... , X_{m}) denotes the conditional mean of Y given the values of the predictors. The model is often written equivalently as

Y = β_{0} + β_{1}X_{1} + β_{2}X_{2} + ... + β_{m}X_{m} + ε

or as

y_{i} = β_{0} + β_{1}x_{i1} + β_{2}x_{i2} + ... + β_{m}x_{im} + ε_{i}

2. Cook and Weisberg (1999), Applied Regression Including Computing and Graphics, Wiley.

Last updated June 13, 2014