In recent weeks I have seen two references to the article “Plasma 1,8-cineole correlates with cognitive performance following exposure to rosemary essential oil aroma,” Mark Moss and Lorraine Oliver, Therapeutic Advances in Psychopharmacology, published online 24 February 2012 (DOI: 10.1177/2045125312436573). The first reference was in a March 4 posting on the Quantified Self website. My intuition suggested this might be a good case study for misuse of statistics, so I clicked the link – which went not to the article, but to a page titled Could rosemary scent boost brain performance?, which seems to be a PR release for the article. It said in part,
“The investigators tested cognitive performance and mood in a cohort of 20 subjects, who were exposed to varying levels of the rosemary aroma. Using blood samples to detect the amount of 1,8-cineole participants had absorbed, the researchers applied speed and accuracy tests, and mood assessments, to judge the rosemary oil’s affects [sic].
Results indicate for the first time in human subjects that concentration of 1,8-cineole in the blood is related to an individual’s cognitive performance – with higher concentrations resulting in improved performance. Both speed and accuracy were improved, suggesting that the relationship is not describing a speed–accuracy trade off.”
It was not difficult to find the article, and indeed, it provides a good case study. (See details below.)
The second reference to the article was in the May 2012 issue of the University of California Berkeley Wellness Letter, which said, “Though research in people is sparse, a small study in Therapeutic Advances in Psychopharmacology in February found that young adults who had the highest blood levels of a key rosemary compound following inhalation of the essential oil performed better and faster on some cognitive tasks. This suggests that volatile compounds in rosemary oil may be absorbed into the blood and perhaps even cross the blood-brain barrier. But the field of aromatherapy is hard to study, and these findings are preliminary.” Well, this has some caution. But … here’s my reading of the Moss-Oliver article:
First, their sample size was twenty. That raised a red flag – pretty small for this kind of thing. Then there was the study design: the PR article described the study as an “experiment” – and it did have some superficial similarity to an experiment; e.g., “Participants were randomly assigned to be exposed to the aroma in the cubicle for 4, 6, 8 or 10 min prior to completing the cognitive tests.” However, to their credit, the authors correctly said, “The study used a correlational design.” (Both quotes from p. 3 of the article.)
But we all (I hope) know that a correlational study can’t establish causality – so that’s one more reason to be cautious in interpreting the results. The real kicker, though, was (not to my surprise) basing conclusions on several hypothesis tests without adjusting for multiple inference. (If you’re not familiar with the problem of multiple inference, see the Wikipedia page Multiple Comparisons or my page Multiple Inference.) On p. 6, the authors highlight results of ten hypothesis tests, with respective p-values 0.037, 0.024, 0.056, 0.038, 0.624, 0.049, 0.733, 0.044, 0.253, and “p > 0.05”. They claimed that all six with p-values less than 0.05 were significant. I tried a couple of methods of adjusting for multiple inference, but neither of them yielded any significant results at an overall significance level of 0.05.
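If you want to check this sort of thing yourself, it takes only a few lines of code. The sketch below (my own, not the authors’) applies two standard adjustments – Bonferroni and the Holm step-down procedure – to the nine reported p-values. The tenth is reported only as “p > 0.05”, so I code it as 1.0; that assumption is conservative in the safe direction, since it can only make results less significant, and that test was non-significant anyway.

```python
# Applying Bonferroni and Holm adjustments to the ten p-values
# reported on p. 6 of the article. The tenth is given only as
# "p > 0.05", so it is conservatively coded as 1.0.
pvals = [0.037, 0.024, 0.056, 0.038, 0.624, 0.049, 0.733, 0.044, 0.253, 1.0]
alpha = 0.05
m = len(pvals)

# Bonferroni: reject H0 only if p < alpha / m (here 0.05 / 10 = 0.005).
bonferroni_rejections = [p for p in pvals if p < alpha / m]

# Holm step-down: sort p-values ascending; compare the i-th smallest
# to alpha / (m - i), stopping at the first comparison that fails.
holm_rejections = 0
for i, p in enumerate(sorted(pvals)):
    if p < alpha / (m - i):
        holm_rejections += 1
    else:
        break

print(bonferroni_rejections)  # [] -- nothing survives Bonferroni
print(holm_rejections)        # 0  -- Holm rejects nothing either
```

Even the smallest p-value, 0.024, is nowhere near the Bonferroni threshold of 0.005, and since Holm’s first (most lenient for the smallest p-value) comparison also fails, the procedure stops immediately: no test is significant at an overall level of 0.05.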
Morals of the story:
1. Watch out for small sample sizes, especially if more than one hypothesis test is involved.
2. Be alert for the problem of multiple testing.
3. Don’t believe the summary in a PR release.
4. Check it out yourself.