While browsing through the November 29, 2013 issue of Science a couple of days ago, I noticed the catchy title of the last report (McNulty et al., “Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying,” Science 29 November 2013: 1119-1120). I suspected that the catchy title would mean the article would be discussed in the popular press (as indeed it has been), and wondered whether, since it appeared in a top-ranked journal, its statistical analysis would be of high quality.
Alas, I was disappointed (but not surprised). In particular, thirteen hypothesis tests (all using the same data) were reported in the article. Eight were declared significant, apparently at an individual significance level of .05, since there was no mention of adjusted p-values, an overall significance level, or anything else that would suggest that the authors took multiple testing into account in reporting “statistical significance.” So I did a quick Bonferroni calculation (i.e., using .05/13 ≈ .0038 as the individual significance level to ensure an overall significance level of at most .05), and found that only three of these eight tests were statistically significant at that conservative adjusted criterion. (I then tried Holm’s procedure, which is never more conservative than Bonferroni and is sometimes more liberal, but got the same result.) So, accounting for multiple testing, the only hypotheses that could be considered statistically significant at an overall .05 significance level are those that were reported as significant at the .001 level, namely:
- “… spouses’ marital satisfaction declined significantly over the 4 years of the study”
- “Spouses’ conscious attitudes … were positively associated with initial levels of marital satisfaction”
- “… spouses’ perceptions of their marital problems at each assessment significantly negatively predicted changes in their satisfaction from that assessment to the next”
Among the tests that are not supported as being statistically significant at an overall .05 level are the ones crucial to the authors’ assertions that spouses’ automatic attitudes predicted changes in their marital satisfaction. (Actually, I’m being rather generous in counting only the thirteen tests in the article itself: the Supplemental Material contains many more significance tests, and including them would make the adjusted criterion stricter still.)
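For readers who want to try the Bonferroni and Holm calculations described above, here is a minimal Python sketch. The p-values in it are invented purely for illustration; they are not the values reported in the article, and were chosen only so that 8 of 13 fall below .05 and three fall below .05/13. The function names `bonferroni` and `holm` are likewise just illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Sort the p-values in increasing order and compare the k-th smallest
    (k = 0, 1, ...) with alpha / (m - k). Stop at the first p-value that
    fails; it and all larger p-values are not rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject


# Hypothetical p-values for 13 tests (NOT taken from the article).
hypothetical_p = [0.0005, 0.0008, 0.0009, 0.01, 0.02, 0.03, 0.04,
                  0.045, 0.08, 0.15, 0.30, 0.50, 0.70]

unadjusted_count = sum(p < 0.05 for p in hypothetical_p)
bonferroni_count = sum(bonferroni(hypothetical_p))
holm_count = sum(holm(hypothetical_p))

print(unadjusted_count)   # 8
print(bonferroni_count)   # 3
print(holm_count)         # 3
```

With these made-up numbers, Holm’s procedure stops at the fourth-smallest p-value (.01 exceeds .05/10 = .005), so it rejects the same three hypotheses as Bonferroni.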
There are other questionable aspects of the paper in addition to the one pointed out above; some are mentioned in Andrew Gelman’s January 1 blog post.