1. Unfortunately, these methods are typically better at telling you when the model assumption does not fit than when it does.

2. Different techniques have different model assumptions, so additional model checking plots may be needed; be sure to consult a good reference for the particular technique you are considering using.

General Rule of Thumb:
First check any independence assumptions, then any equal variance
assumption, then any assumption on distribution (e.g., normal) of
variables.

Rationale:
Techniques are usually least robust to departures from independence^{1}
and most robust to departures from normality^{2, 3}.

Suggestions and Guidelines for Checking Specific Model Assumptions

Checking for Independence

Independence assumptions are
usually formulated in terms of error terms rather than in terms of the
outcome variables. For example, in simple linear regression, the model
equation is

Y = α + βx + ε,

where Y is the outcome (response) variable and ε denotes the error term (also a random variable). It is the error terms that are assumed to be independent^{4}, not the values of the response variable.

We do not know the values of the error terms ε, so we can only plot the residuals e_{i} (defined as the observed value y_{i} minus the fitted value, according to the model), which approximate the error terms.

Rule of Thumb: To check independence, plot residuals against any time variables present (e.g., order of observation), any spatial variables present, and any variables used in the technique (e.g., factors, regressors). A pattern that is not random suggests lack of independence.

Rationale:
Dependence on time or spatial variables is a common source of lack of independence, but the other plots might also detect lack of independence.

Comments:

1. Because time or spatial correlations are so frequent, it is important when making observations to record any time or spatial variables that could conceivably influence results. This not only allows you to make the residual plots to detect possible lack of independence, but also allows you to change to a technique incorporating additional time or spatial variables if lack of independence is detected in these plots.

2. Since it is known that the residuals sum to zero, they are not independent, so the plot is really a very rough approximation.
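
Illustration: The following is a minimal sketch, in Python using only the standard library, of how the residuals for such a plot might be computed. The data are simulated and purely hypothetical, and the helper names (fit_line, residuals) are not from any particular package. The sketch fits a simple linear regression by least squares, pairs each residual with its order of observation, and confirms that the residuals sum to zero up to rounding error, as noted in Comment 2.

```python
import random


def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(x, y):
    """Observed value minus fitted value for each observation."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data: 30 observations recorded in time order.
rng = random.Random(0)
order = list(range(1, 31))
x = [rng.uniform(0, 10) for _ in order]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

e = residuals(x, y)

# (order of observation, residual) pairs, ready to hand to any plotting tool.
points = list(zip(order, e))

# The residuals from a least-squares fit with an intercept sum to zero
# (up to floating-point rounding error).
print(f"sum of residuals: {sum(e):.2e}")
```

The same pairs could be formed with any recorded time or spatial variable in place of the observation order.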

Checking for Equal Variance

Plot residuals against fitted
values (in most cases, these are the estimated conditional means,
according to the model), since it is not uncommon for
conditional variances to depend on conditional means, especially to
increase as conditional means increase. (This would show up as a funnel
or megaphone shape to the residual plot.)

Caution: Hypothesis tests for equality of variance are often not reliable, since they also have model assumptions and are typically not robust to departures from these assumptions.
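
Illustration: The sketch below, in Python using only the standard library, simulates hypothetical data in which the error standard deviation is assumed to grow with the conditional mean, then computes the fitted values and residuals that would go into the plot. As a rough numeric stand-in for the funnel shape, it compares the spread of the residuals in the lower and upper halves of the fitted values. This comparison is only a descriptive summary, not a hypothesis test, and it is no substitute for looking at the plot itself.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: the error standard deviation is assumed to be
# proportional to x, so the variance increases with the conditional mean.
rng = random.Random(1)
x = [float(i) for i in range(1, 201)]
y = [10.0 + 2.0 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

a, b = fit_line(x, y)
fitted = [a + b * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# (fitted value, residual) pairs are what the residual plot would show.
pairs = sorted(zip(fitted, resid))

# Descriptive summary of the funnel: residual spread for the smaller
# fitted values versus the larger fitted values.
half = len(pairs) // 2
spread_low = statistics.pstdev([r for _, r in pairs[:half]])
spread_high = statistics.pstdev([r for _, r in pairs[half:]])

print(f"residual s.d., lower half of fitted values: {spread_low:.1f}")
print(f"residual s.d., upper half of fitted values: {spread_high:.1f}")
```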

Checking for Normality or Other Distribution

Caution:
A histogram (whether of outcome values or of residuals) is not a good way to check for normality, since histograms of the same data but using different bin sizes (class-widths) and/or different cut-points between the bins may look quite different.
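
Illustration: The dependence on cut-points can be seen in a small sketch, in Python using only the standard library. The eight data values are made up for the purpose, and bin_counts is a hypothetical helper rather than a function from any package. Both histograms use the same data and the same bin width; only the cut-points differ.

```python
import math


def bin_counts(data, start, width, n_bins):
    """Count the values falling in n_bins bins of equal width.

    Bin k covers the half-open interval
    [start + k*width, start + (k+1)*width).
    """
    counts = [0] * n_bins
    for value in data:
        index = math.floor((value - start) / width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Made-up data, chosen so that the effect is easy to see.
data = [0.9, 1.1, 1.9, 2.1, 2.9, 3.1, 3.9, 4.1]

# Same data, same bin width, different cut-points.
counts_a = bin_counts(data, start=0.0, width=2.0, n_bins=3)   # [0,2), [2,4), [4,6)
counts_b = bin_counts(data, start=-1.0, width=2.0, n_bins=3)  # [-1,1), [1,3), [3,5)

print(counts_a)  # [3, 4, 1]
print(counts_b)  # [1, 4, 3]
```

With the first set of cut-points the histogram appears to tail off to the right; with the second it appears to tail off to the left, even though the data are identical.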

Instead, use a probability plot (also known as a quantile plot or Q-Q plot). Most statistical software has a function for producing these.

Caution: Probability plots for small data sets are often misleading; it is very hard to tell whether or not a small data set comes from a particular distribution.
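
Illustration: For readers who want to see what goes into a normal probability plot, here is a minimal sketch in Python using only the standard library. It pairs each ordered data value with the corresponding quantile of the standard normal distribution. The plotting positions (i + 0.5)/n are one common convention and are an assumption here; software packages differ in the positions they use. The sample is simulated and hypothetical.

```python
import random
from statistics import NormalDist


def normal_qq_points(data):
    """Return (theoretical quantile, ordered data value) pairs.

    Assumes plotting positions (i + 0.5) / n for i = 0, ..., n - 1;
    other conventions exist.
    """
    n = len(data)
    ordered = sorted(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Hypothetical sample of 100 values (these could equally be residuals).
rng = random.Random(2)
sample = [rng.gauss(50, 10) for _ in range(100)]

points = normal_qq_points(sample)

# If the data are roughly normal, plotting these pairs gives points
# lying close to a straight line.
for q, value in points[:3]:
    print(f"theoretical quantile {q:6.3f}   ordered value {value:6.2f}")
```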

Checking for Linearity

When considering a simple
linear regression model, it is important to check the linearity
assumption -- i.e., that the conditional means of the response variable
are a linear function of the predictor variable. Graphing the response
variable vs the predictor can often give a good idea of whether or not
this is true. However, one or both of the following refinements may be
needed:

1. Plot residuals (instead of response) vs. predictor. A non-random pattern suggests that a simple linear model is not appropriate; you may need to transform the response or predictor, or add a quadratic or higher term to the model.

2. Use a scatterplot smoother such as lowess (also known as loess) to give a visual estimation of the conditional mean. Such smoothers are available in many regression software packages. Caution: You may need to choose a value of a smoothness parameter. Making it too large will oversmooth; making it too small will not smooth enough.
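
Illustration: The two refinements above can be sketched together in Python using only the standard library. Note that the smoother below is a simple centered running mean, used here as a much cruder stand-in for lowess; its window plays the role of the smoothness parameter. The data are simulated and hypothetical, with a conditional mean that is assumed to be quadratic, so a straight-line fit should leave a curved pattern in the residuals.

```python
import random


def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def running_mean(values, window):
    """Centered moving average; the window is truncated near the ends.

    A crude stand-in for lowess, not lowess itself.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical data, already sorted by x: the true conditional mean is x**2.
rng = random.Random(3)
x = [float(i) for i in range(21)]
y = [xi ** 2 + rng.gauss(0, 5) for xi in x]

a, b = fit_line(x, y)
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# A moderate window shows the curved pattern left by the straight-line fit.
trend_narrow = running_mean(resid, window=5)

# A window wider than the data averages everything together (oversmoothing).
trend_wide = running_mean(resid, window=41)

print([round(t) for t in trend_narrow])
print([round(t) for t in trend_wide])
```

With the moderate window, the smoothed residuals are positive at both ends and negative in the middle, which is the kind of non-random pattern described in item 1. With the very wide window, every smoothed value is the overall mean of the residuals, so the curvature is hidden entirely.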

When considering a linear regression with just two terms, plotting response (or residuals) against the two terms (making a three-dimensional graph) can help gauge suitability of a linear model, especially if your software allows you to rotate the graph.

Caution:
It is not possible to gauge from scatterplots whether a linear model in more than two predictors is suitable. One way to address this problem is to try to transform the predictors to approximate multivariate normality.^{5} This will ensure not only that a linear model is appropriate for all (transformed) predictors together, but that a linear model is appropriate even when some transformed predictors are dropped from the model.^{6}

1. Some techniques may merely require uncorrelated errors rather than independent errors, but the model-checking plots needed are the same.

2. Robustness to departures from normality is related to the Central Limit Theorem, since most estimators are linear combinations of the observations, and hence approximately normal if the number of observations is large.

3. In this context, "robustness" can be formulated in terms of the effect of the departure from a model assumption on the Type I error rate. See Van Belle (2008) Statistical Rules of Thumb, pp. 173-177 and the references given there for more detail.

4. In some formulations of regression, the error terms are only assumed to be uncorrelated, not necessarily independent.

5. See Cook and Weisberg (1999) Applied Regression Including Computing and Graphics, pp. 324-329 for one way to do this.

6. If a linear model fits with all predictors included, it is not true that a linear model will still fit when some predictors are dropped. For example, if E(Y|X