Can placebo effects be harnessed to improve treatment outcomes? Stories of a placebo changing bodily function are important in promoting mind-body medicine, but mostly turn out to be false positives. Was this one an exception?
A lesson in critical appraisal: How to screen complicated studies in order to decide whether to put the time and energy into a closer look.
Howe LC, Goyer JP, Crum AJ. Harnessing the Placebo Effect: Exploring the Influence of Physician Characteristics on Placebo Response. Health Psychology Vol 36(11), Nov 2017, 1074-1082 http://dx.doi.org/10.1037/hea0000499
From the Abstract:
After inducing an allergic reaction in participants through a histamine skin prick test, a health care provider administered a cream with no active ingredients and set either positive expectations (cream will reduce reaction) or negative expectations (cream will increase reaction).
The provider demonstrated either high or low warmth, or either high or low competence.
Results: The impact of expectations on allergic response was enhanced when the provider acted both warmer and more competent and negated when the provider acted colder and less competent.
Conclusion: This study suggests that placebo effects should be construed not as a nuisance variable with mysterious impact but instead as a psychological phenomenon that can be understood and harnessed to improve treatment outcomes.
Why I dismissed this study
The small sample size was set in a power analysis based on the authors' hopes of finding a moderate effect size, not on any existing results. With only 20 participants per cell, most significant findings are likely to be false positives.
The authors had a complicated design with multiple manipulations and time points. They examined 2 physiological measures, but only reported results for one of them in the paper, the one with stronger results.
The authors did not report a key overall test of whether there was a significant main or interaction effect. Without such a finding, jumping down to significant comparisons between groups is likely to yield false positives.
The authors did not adjust for multiple comparisons, despite doing a huge number.
The authors did not report raw mean differences for comparisons, only differences at two time points controlling for gender, race, and the first two time points. No rationale is given.
The authors used language like ‘marginally significant’ and ‘different, but not significantly so,’ which might suggest they were chasing and selectively reporting significant findings.
The phenomenon under study was a mild allergic reaction in the short term: three time points, 9-15 minutes, with data for 2 earlier time points not reported as outcomes. The mechanism by which an experimental manipulation could have an observable effect on such a mild reaction in such a short period of time is unclear.
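The false-positive concern in the first point above can be made concrete with a rough calculation. The sketch below is not the authors' power analysis. It uses a normal approximation for a two-sided comparison of two cells of 20 participants each, and it assumes that "moderate" means a standardized difference of d = 0.5 and that 10% of the hypotheses tested are true. Both numbers are assumptions chosen for illustration, and the function names are hypothetical.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group comparison."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic if the true standardized difference is d.
    shift = d * (n_per_group / 2) ** 0.5
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def positive_predictive_value(power, prior, alpha=0.05):
    """Share of significant results that reflect a true effect."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Assumed inputs: d = 0.5 ("moderate") and a 10% prior chance the effect is real.
power = approx_power(d=0.5, n_per_group=20)
ppv = positive_predictive_value(power, prior=0.10)
print(f"approximate power: {power:.2f}")
print(f"share of significant results that are true: {ppv:.2f}")
```

Under these assumptions the power comes out at roughly one third, and fewer than half of the significant results would reflect true effects. The normal approximation slightly overstates power relative to an exact t-test, and a more generous prior would raise the second number, so the output is a rough guide rather than an estimate for this particular study.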
Claims of placebo effects figure heavily in discussions of the power of the mind over the body. Yet this power is greatly exaggerated by lay persons, in the lay press, and on social media. Effects of a placebo manipulation on objective physiological measures, as opposed to subjective self-report measures, are uncommon and usually turn out to be false positives.
A New England Journal of Medicine review of 130 clinical trials found:
Little evidence in general that placebos had powerful clinical effects. Although placebos had no significant effects on objective or binary outcomes, they had possible small benefits in studies with continuous subjective outcomes and for the treatment of pain. Outside the setting of clinical trials, there is no justification for the use of placebos.
I often cite another great NEJM study showing the sharp contrast between the positive results obtained with subjective self-report measures and the negative results obtained with objective measures of physical functioning.
That is probably the case with a recent report of effects of expectancies and interpersonal relationship on a mild allergic reaction induced by a histamine skin prick test (SPT). The study involved manipulation of the perceived warmth and competence of a provider, as well as whether research participants were told that an inert cream being applied would have a positive or negative effect.
The authors claim their results support the idea that psychological variables do indeed influence a mild allergic reaction. Examining all of the numerous pairwise comparisons would be a long and tedious task. However, based on some details of the design and analysis of the study, I decided not to proceed.
Some notable features of the study.
The key manipulations of high versus low warmth and high versus low competence were in the behavior of a single unblinded experimenter.
The design is described as 2x2x2 with a cell size of n = 20 (19 in one cell).
It is more properly described as 2x2x2x(5) because of the 5 time points after the provider administered the skin prick:
(T1 = 3 min post-SPT, T2 = 6 min post-SPT and cream administered directly afterward, T3 = 9 min post-SPT and 3 min post-cream, T4 = 12 min post-SPT and 6 min post-cream, T5 = 15 min post-SPT and 9 min post-cream).
The small number of participants per cell was set in a power analysis based on the hope that a moderate effect size could be shown, not on past results.
The physiological reaction was measured in terms of size of a wheal (raised bump) and size of the flare (redness surrounding the bump).
Numerous other physiological measures were obtained, including blood pressure and pre-post session saliva samples. It is not stated what was done with these data, but they could have been used to evaluate further the manipulation of experimenter behavior.
No simple correlation between participants’ perceptions of warmth and competence is reported, which would have been helpful in interpreting the 2×2 crossing of warmth and competence.
In the supplementary materials, readers are told ratings of itchiness and mood were obtained after the skin prick. No effects of the experimental manipulation were observed, which would seem not to support the effectiveness of the intervention.
No overall ANOVA or test for significance of interactions is presented.
Instead, numerous paired comparisons are presented without correction for post hoc multiplicity.
Further comparisons were conducted with a sample that was constructed post hoc:
To better understand the mechanism by which expectations differed, within a setting of high warmth and high competence, we compared the wheal and flare size for the positive and negative expectations conditions to a follow-up sample who received neutral expectations. This resulted in a total sample of N=62.
Differences arising using this sample were discussed, despite significance levels being p = .095 and p = .155.
Raw mean scores are neither presented nor discussed. Instead, all comparisons controlled for gender and race and size of the wheal at Times 1 and 2.
Only the size of the wheal is reported in the body of the paper, but the authors state:
The results on the flare of the reaction were mostly similar (see the supplemental material available online).
Actually, the results reported in the supplemental material were considerably weaker, with claims of differences being marginally significant and favoring results that were only significant at particular time points.
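To see why uncorrected paired comparisons matter in a design like this, here is a minimal sketch of how the chance of at least one false positive grows with the number of tests. The counts are assumptions for illustration: the paper does not say how many comparisons were run, so the code simply takes all possible pairs of the 8 cells and, as an upper bound, repeats them at the 3 post-cream time points for both wheal and flare. It also treats the tests as independent, which repeated measurements on the same participants are not.

```python
from math import comb


def familywise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive in m independent tests of true nulls."""
    return 1 - (1 - alpha) ** m


cells = 2 * 2 * 2            # expectation x warmth x competence
pairs = comb(cells, 2)       # every possible pairwise comparison of cells
post_cream_timepoints = 3    # T3, T4, T5
outcomes = 2                 # wheal and flare
# Hypothetical upper bound, not a count taken from the paper.
tests = pairs * post_cream_timepoints * outcomes

for m in (1, pairs, tests):
    fwer = familywise_error_rate(m)
    bonferroni = 0.05 / m    # per-test threshold that holds the family at 0.05
    print(f"{m:4d} tests: FWER = {fwer:.3f}, Bonferroni threshold = {bonferroni:.5f}")
```

With 28 independent pairwise tests, the chance of at least one spurious "significant" result is already about three in four. Correlated tests would push that figure down somewhat, but not enough to make uncorrected p-values near .05 persuasive.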
So, what do you think? If you are interested, take a look at the study and let me know if I was premature to dismiss it.
Preorders are being accepted for e-books providing skeptical looks at mindfulness and positive psychology, and arming citizen scientists with critical thinking skills. Right now there is a special offer for free access to a Mindfulness Master Class. But hurry, it won’t last.
I will also be offering scientific writing courses on the web as I have been doing face-to-face for almost a decade. I want to give researchers the tools to get into the journals where their work will get the attention it deserves.
Sign up at my website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at CoyneoftheRealm.com.