Analyzing Data Without Regard to How They Were Collected

Using a two-sample t-test when observations are paired is one example of this. Here is another:

Example:¹ An experiment was conducted to study the effect of two factors (pretreatment and stain) on the water resistance of wood. Two types of pretreatment and four types of stain were considered. For reasons of practicality and economy, the experiment was conducted with a split-plot design as follows:

Six entire boards were the whole plots. One pretreatment was applied to each board, with the two pretreatments randomly assigned to the six boards (three boards per pretreatment). Then each pretreated board was cut into four smaller pieces of equal size (these were the split plots). The four stains were randomly assigned to the four pieces from each board. The water resistance of each of the 24 smaller pieces was measured; this was the response variable.
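The two-stage randomization described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the labels (P1, P2, S1 to S4), the function name randomize_split_plot, and the seed are hypothetical and do not come from the original study.

```python
import random

def randomize_split_plot(n_boards=6, pretreatments=("P1", "P2"),
                         stains=("S1", "S2", "S3", "S4"), seed=0):
    """Return a list of (board, pretreatment, piece, stain) tuples.

    All labels are hypothetical placeholders for illustration.
    """
    rng = random.Random(seed)

    # Stage 1 (whole plots): each pretreatment goes on an equal number of boards.
    per_level = n_boards // len(pretreatments)
    board_pretreatments = list(pretreatments) * per_level
    rng.shuffle(board_pretreatments)

    layout = []
    for board, pretreatment in enumerate(board_pretreatments, start=1):
        # Stage 2 (split plots): stains are randomized separately within each board.
        piece_stains = list(stains)
        rng.shuffle(piece_stains)
        for piece, stain in enumerate(piece_stains, start=1):
            layout.append((board, pretreatment, piece, stain))
    return layout

layout = randomize_split_plot()
for row in layout:
    print(row)
```

Pretreatment is randomized over boards, not over individual pieces, so all four pieces cut from a board share one pretreatment. That restriction on the randomization is what the analysis has to respect.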

If the correct split-plot analysis is used, the interaction of pretreatment and stain and the effect of pretreatment are not statistically significant, but the effect of stain is statistically significant.

However, if you were to do an analysis of variance incorrectly assuming that the experiment used a crossed design, with the 8 treatment combinations (2 pretreatments × 4 stains) randomly assigned to the 24 smaller pieces of wood, the analysis would indicate that the interaction and the effect of stain are not statistically significant, whereas the effect of pretreatment is: a very different conclusion.
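One way to see why the two analyses can disagree is to compare their degrees of freedom. The sketch below uses the standard ANOVA degrees-of-freedom formulas for a split-plot design and for a completely randomized crossed design; the function names split_plot_df and crossed_df are hypothetical, and no data from the study are used.

```python
def split_plot_df(a, b, r):
    """Degrees of freedom for a split-plot design.

    a: whole-plot factor levels (pretreatments)
    b: split-plot factor levels (stains)
    r: whole plots (boards) per whole-plot factor level
    """
    return {
        "pretreatment": a - 1,
        "whole-plot error (boards within pretreatment)": a * (r - 1),
        "stain": b - 1,
        "pretreatment x stain": (a - 1) * (b - 1),
        "split-plot error": a * (r - 1) * (b - 1),
    }

def crossed_df(a, b, r):
    """Degrees of freedom if the same pieces were treated as a crossed,
    completely randomized design with r pieces per treatment combination."""
    return {
        "pretreatment": a - 1,
        "stain": b - 1,
        "pretreatment x stain": (a - 1) * (b - 1),
        "error": a * b * (r - 1),
    }

# The wood example: 2 pretreatments, 4 stains, 3 boards per pretreatment.
for name, table in (("split-plot", split_plot_df(2, 4, 3)),
                    ("crossed", crossed_df(2, 4, 3))):
    print(name)
    for source, df in table.items():
        print(f"  {source}: {df}")
    print(f"  total: {sum(table.values())}")
```

Both tables account for the same 23 total degrees of freedom, but the split-plot analysis tests pretreatment against the board-to-board (whole-plot) error with 4 degrees of freedom and tests stain and the interaction against the split-plot error with 12. The crossed analysis pools these into a single error term with 16 degrees of freedom, so every effect is compared against the wrong yardstick.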

Some of the many considerations to take into account in deciding on an appropriate method of analysis include:

The sampling method

Whether or not there was blocking in an experimental design

Whether factors are nested or crossed

Fixed vs random factors

Pseudoreplication

Missing data

Notes:
1. For details (including data), see Potcner and Kowalski, "How to Analyze a Split-Plot Experiment," Quality Progress, December 2004, pp. 67-74.

Last updated August 28, 2012