We now know that estimates of the efficacy of antidepressants that were once readily accepted were exaggerated. Sure, the estimates came from meta-analyses of the carefully searched published literature, but that literature was skewed to make antidepressants look better than they were. Pharmaceutical companies had withheld publication of negative and weak trials and had selectively promoted positive trials, even buying reprints of the journal articles that reported them for distribution. Some worried that these sometimes huge purchases gave journals an added incentive to participate in the confirmatory bias.
A much more balanced perspective on how well antidepressants worked was contained in reports submitted to the Food and Drug Administration (FDA). Since 1997 Drug Approval Packages had been available for download by anyone. For drugs approved before then, reports were available on request. But no one had yet made effective use of these reports in a published evaluation of the efficacy of antidepressants. Either those who were potentially interested in doing so did not know how to gain access or they had been discouraged by what had been the FDA’s notoriously slow response to such requests.
Erick Turner once worked for the FDA but is now a Veterans Administration Health System psychiatrist. He made use of these data in a provocative paper. Whereas the published literature up until then suggested that almost all randomized trials involving antidepressants were positive, the FDA records he obtained revealed that only half were positive. He found that trials submitted to the FDA with positive results were 12 times more likely to end up published than trials with negative results. So, just as many had suspected but could not quantify, there was considerable bias in the evidence being made available to clinicians and policymakers. Turner re-analyzed the data concerning the efficacy of antidepressants and found that when the suppressed FDA data were included, the effect size (mean standardized difference) dropped from 0.41 to 0.31. Turner did not claim that antidepressants lacked efficacy in comparison to pill placebo, but he did point out that previous estimates of their advantage were exaggerated.
Turner is a cautious, understated guy. As I noted in another blog post, Rich Bonin, a CBS producer, approached him to appear in a segment about whether antidepressants were no more effective than sugar pills. By Turner’s account, his careful explanation of his work left the producer thinking out loud with his co-producer about whether they should proceed with the story, “or is it too murky…Is it good murky, or is it just murky?”
The producer resolved his dilemma by leaving Turner out of the program. Better a dramatic, even if inaccurate, message than a “murky” one.
Irving Kirsch, a psychologist, conducted analyses on four of the 12 antidepressants Turner had analyzed and obtained essentially the same result, an effect size of 0.32. But Kirsch had much more of a flair for grabbing media attention than Turner did, and some would say much more of a flair for hyperbole. Kirsch had already become well known for his relentless declarations that the differences between antidepressants and a placebo pill were trivial. As in the 60 Minutes segment from which Turner was excluded:
Irving Kirsch: The difference between the effect of a placebo and the effect of an antidepressant is minimal for most people.
Lesley Stahl: So you’re saying if they took a sugar pill, they’d have the same effect?
Irving Kirsch: They’d have almost as large an effect and whatever difference there would be would be clinically insignificant.
Kirsch based his claims on the arbitrary requirement that, to be clinically significant, an effect size had to be at least 0.50 and the difference between an antidepressant and a pill placebo on the rating scale the FDA required, the Hamilton Rating Scale for Depression, had to be at least three points. This standard had previously surfaced in evaluations by the UK’s National Institute for Clinical Excellence (NICE), but it had already been abandoned by the time Kirsch made his claims. I asked Simon Gilbody, a respected British psychiatrist who does lots of meta-analyses, what he thought of the now rejected NICE criterion of 0.50 as a sharp boundary between effective and ineffective treatments. He said:
“I never believe any study these days which finds an ES > 0.5. it usually means the science is wrong and we haven’t found the bias(es) just yet.”
What would an effect size of 0.5 represent? It would indicate that the mean of the treatment group is at the 69th percentile of the control group and that the non-overlap is 33%.
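For readers who want to see where such figures come from, here is a minimal sketch (my own illustration, not from the paper) that converts a standardized effect size into the percentile figure (Cohen's U3) and the percent non-overlap (Cohen's U1), assuming two normal distributions with equal variances:

```python
# Convert a standardized mean difference (Cohen's d) into the two
# descriptive statistics quoted in the text, assuming equal-variance
# normal distributions for the treatment and control groups.
from math import erf, sqrt

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def u3_percentile(d: float) -> float:
    """Cohen's U3: percentile of the control distribution at which
    the treatment-group mean falls."""
    return normal_cdf(d) * 100.0

def u1_nonoverlap(d: float) -> float:
    """Cohen's U1: percentage of the combined area covered by the
    two distributions that does not overlap."""
    p = normal_cdf(d / 2.0)
    return (2.0 * p - 1.0) / p * 100.0

print(round(u3_percentile(0.5)))  # 69 -> treatment mean at 69th percentile
print(round(u1_nonoverlap(0.5)))  # 33 -> 33% non-overlap
```

The same functions reproduce the figures given later in this post for an effect size of about 0.3 (62nd percentile, 21% non-overlap).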
Turner may not have made it onto CBS 60 Minutes, but he was invited to discuss in BMJ the differences between his and Kirsch’s interpretations of what were basically the same results. In a previous blog post, I commented on what Turner said.
He noted that the NICE criterion of an effect size of 0.5 had been taken from Jacob Cohen’s designation of that value as “medium,” but Cohen had distanced himself from his own designation of effect sizes as “small,” “medium,” or “large,” writing that “The values chosen had no more reliable a basis than my own intuition.” Turner suggested that Cohen would undoubtedly have rejected NICE’s rigid use of 0.5 as a categorical cutoff between ineffective and effective treatments. Indeed, Cohen, who is now deceased, offered no hint that he would have welcomed becoming the arbiter of categorical judgments of clinical significance, and he famously lampooned categorical reliance on p < .05 for deciding whether the world is flat rather than round.
Turner went on:
It seems unfair that pharmacological, and not psychotherapeutic, treatment has become the usual first line approach to depression merely for economic reasons. But before we embrace any treatment as first line, it is prudent to ask whether its efficacy is beyond question. For psychotherapy trials, there is no equivalent of the FDA whose records we can examine, so how can we be sure that selective publication is not occurring here as well?
In hindsight, Turner’s challenge has become particularly important now that Ben Goldacre has caused such a stir about hidden pharmaceutical trials. Goldacre showed that Pharma had successfully promoted the drug reboxetine as a safe and effective antidepressant, when data withheld from publication actually indicated that it was dangerous for some people and, on average, ineffective. Does psychotherapy have a problem with hidden trials? In a future blog post, I will demonstrate that indeed it does.
But more immediately, Turner left me puzzling over what would happen if we applied the same requirement of an effect size of 0.50 to comparisons between psychotherapy and pill placebo. Fair is fair. If we are going to judge antidepressants as no better than a sugar pill, what if psychotherapy were no better than a sugar pill by the same standard?
Kirsch attracted a lot of attention, especially among those who already wanted to believe that antidepressants were ineffective and maybe even dangerous – the growing anti-psychiatry crowd that had almost become mainstream, but also clinical psychologists who were frustrated with their declining role in treating all the depression now being detected in primary care. When I raised in blog posts the possibility that Kirsch’s interpretation of the uselessness of antidepressants might also extend to the uselessness of psychotherapy, something none of us wanted to conclude, I found my blogs attacked by trolls and spammed repeatedly, even with praise for what Scientology had done for psychiatry. This was not the first time, and I am sure it will not be the last, that I hit on an emotional issue in which the opposing sides invoked “evidence” but really did not want to look more closely to see whether the evidence actually supported their views.
Nonetheless, I became determined to put the issue of psychotherapy versus pill placebo to an empirical test. I recruited Erick Turner, but also Pim Cuijpers, who had an excellent database of psychotherapy-for-depression trials that he constantly updated with citation alerts and renewed searches. You can find a list of the meta-analyses he has produced here. We assembled a writing team that included Erick, Pim, and myself, but also David Mohr, Stefan Hofmann, Gerhard Andersson, and Matthias Berking. We did a fresh search of the literature, Pim performed the meta-analysis, and together the group drafted the manuscript that became the 2013 Psychological Medicine article I am now going to discuss.
The Question We Asked
We asked whether, if we integrated all of the available data we could find, psychotherapy would prove more effective than pill placebo.
Facing the Criticism That We Should Not Have Even Done This Meta-Analysis
When we first sent our manuscript out for review, the reviewers were critical of our even having undertaken this meta-analysis. Responding to this criticism in revising the paper strengthened it. The basic criticism concerned the lack of blinding, for either the clinicians administering psychotherapy or pill placebo or the patients receiving them. In double-blind pharmaceutical trials, clinicians and patients can both see that the patients are given a pill, but neither knows whether it is an antidepressant or a placebo, because both are kept in the dark. In a comparison between psychotherapy and pill placebo, by contrast, it is obvious which treatment patients are being given. There is lots of room for bias to creep into the study: patients may have a strong preference for one treatment or the other, and that preference can influence their response, either by skewing the answers they give in assessments or by giving them a more positive expectation that affects their actual response.
Elements of this bias are present even in comparisons of psychotherapy and antidepressants. In an earlier study, my colleagues and I showed that over half of patients enrolling in a study of improving the care for depression in primary care would have preferred psychotherapy to antidepressants, and those with a strong preference did less well with the antidepressant. The bias is present any time you compare psychotherapy to a pill, and so such a comparison is not exactly equivalent to comparing an active treatment contained in a pill to an inert one, especially when neither the clinician nor the patient knows which pill is which. The reviewers were correct that, in important ways, psychotherapy versus pill placebo comparisons are different from antidepressant versus pill placebo comparisons.
If so, why do the meta-analysis? Because it allowed us to address important policy issues. Policymakers, clinicians, and patients/consumers were making decisions based on arguments that the difference between antidepressants and pill placebo was trivial, and jumping to the conclusion that psychotherapy would therefore be a more effective treatment. We wanted to put that judgment to a test.
A pill placebo condition is not a sugar pill
I phrased the title of this blog post to parallel a lot of the discussions that have gone on about antidepressants versus pill placebos. But I should be clear that a lot more goes into the effect size calculated for such a trial. Patients receiving either treatment benefit from the expectation that they are getting an active treatment, and both receive considerable attention under an active clinical management protocol, in which patients are regularly asked how well they are doing and provided a lot of social support. These nonspecific factors can themselves lead to substantial improvement, but in a clinical trial they are provided to all conditions. So the effect size that is calculated takes into account that both conditions have this advantage.
This brings us to a point that is not well understood in many popular discussions of how well treatments work. We often talk about efficacy in terms of effect sizes, but we need to keep in mind that effect sizes do not characterize a treatment, but rather a comparison of treatments in a particular context. So, when we discuss a pill placebo condition, a lot more is involved in producing the effect size than the pill itself, namely, the comparison condition and the context in which the treatment and control conditions were administered.
It’s sad that many antidepressants are prescribed in routine depression care in the community without the support and follow-up that patients enrolled in a clinical trial get with active clinical management. Instead, patients in routine care are simply given their prescription and sent off, often without anyone checking in on them further. When their prescriptions need refilling, that can be accomplished simply with a telephone call to the prescribing physician’s office, without anybody asking how they are doing or whether they require adjustments in dosage or a change of medication. Studies show that about 50% of patients prescribed an antidepressant require such an inquiry and a possible adjustment or different treatment to get any effect.
I wouldn’t be surprised if the benefits of getting an antidepressant in the context of primary care are less than the benefits obtained from getting a pill placebo within the support and attention of a clinical trial. For that matter, many patients getting non-evidence-based psychotherapy in the community can come back month after month without any formal assessment of whether they are getting any benefit. But these are stories for another time.
We went into our meta-analysis fully expecting that psychotherapy might not perform as well as some would like when compared to a pill placebo. As we said in the published article,
Because pill placebo more plausibly controls for positive expectations, support and attention, it is conceivable that this approach will lead to effect sizes that are smaller than those historically found using waitlist and no-treatment control conditions. This would have important implications for clinical and policy decisions concerning provision of psychotherapy versus antidepressants for depression. Because earlier research based on head-to-head comparisons of psychotherapy and pharmacotherapy have found that their effects on depression are comparable (Cuijpers et al., 2008), our hypothesis is that the effect size of psychotherapy is comparable to that of pharmacotherapy, i.e. about g = 0.3 (Turner et al., 2008).
So, rather than the 0.5 that Kirsch had proposed as a cutoff for a nontrivial difference, we anticipated that psychotherapy versus pill placebo would come out about the same as antidepressants versus pill placebo. The effect size we anticipated would indicate that patients getting either psychotherapy or antidepressants would be at the 62nd percentile of those getting a pill placebo, with a non-overlap between either active treatment and pill placebo of 21%.
What We Found
We had the requirement that any study we included had to rely on a standardized assessment of depression, such as the Beck Depression Inventory or the Hamilton Depression Rating Scale. An exhaustive search revealed only 10 randomized trials that yielded comparisons between psychotherapy and pill placebo. That’s because researchers typically do not use pill placebo as the comparison condition for psychotherapy. Rather, they are more interested in comparing psychotherapy to an antidepressant, and a placebo condition is there only to establish that the antidepressant was effectively delivered, which would be shown by an antidepressant/pill placebo difference. So we were taking advantage of comparisons that were not necessarily intended by the investigators.
At the end of the clinical trials, the effect size for psychotherapy compared to pill placebo was g = 0.25, a small difference by conventional standards. Translated into the practical terms of Number Needed to Treat (NNT), about 7 patients (7.14) had to be treated with psychotherapy for one of them to do better than they would have on pill placebo. Patients in the psychotherapy conditions scored 2.66 points lower on the Hamilton depression rating scale than those assigned to pill placebo. These differences are well within the range of the differences found between antidepressants and pill placebo in the FDA-registered trials.
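One common way to get from an effect size to an NNT is the Kraemer and Kupfer area-under-the-curve approach; the sketch below is my own illustration of that conversion (the published paper may have used a different method), again assuming normal distributions:

```python
# Convert a standardized effect size into a Number Needed to Treat
# via the Kraemer & Kupfer AUC approach: AUC = Phi(d / sqrt(2)) is the
# probability that a randomly chosen treated patient does better than a
# randomly chosen control patient; the success-rate difference is
# 2 * AUC - 1, and NNT is its reciprocal.
from math import erf, sqrt

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def nnt_from_d(d: float) -> float:
    """Number Needed to Treat implied by a standardized mean difference d."""
    auc = normal_cdf(d / sqrt(2.0))
    return 1.0 / (2.0 * auc - 1.0)

print(round(nnt_from_d(0.25), 1))  # 7.1 -> close to the 7.14 in the text
```

Note how quickly NNT shrinks as the effect size grows: the same formula gives an NNT of under 4 for Kirsch's proposed 0.50 threshold.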
Essentially, when compared to pill placebo, psychotherapy did as well or, if you’d like, as poorly as an antidepressant. So inferring that psychotherapy is the preferred treatment simply on the basis of the small differences between antidepressants and pill placebo is not warranted.
The FDA data on which both Erick Turner and Irving Kirsch drew did not include psychotherapy conditions. Thanks to the FDA’s regulatory requirements, we can be assured that not many antidepressant trials are hidden. But we cannot make that assumption about psychotherapy trials, because psychotherapy investigators do not consistently register their trials and can simply leave unpublished those in which the psychotherapy does not perform well. Add to that that much psychotherapy research is conducted by persons who would benefit from a positive result, in terms of either their academic career or the selling of their treatment on the workshop circuit. There are well-known, strong investigator allegiance effects, so it is unusual for published clinical trials to obtain results counter to the interests of the investigators.
Thus, it may have been a draw between the effects observed for psychotherapy versus pill placebo in our meta-analysis and the effects observed in other meta-analyses of FDA data concerning antidepressants versus pill placebo. But our analyses may have overestimated the advantages of psychotherapy.
Elsewhere, I referred to the controversies about antidepressants as part of the “antidepressant wars.” Combatants in these wars have given a lot of attention to the lack of strong differences between antidepressants and pill placebo. But I think both sides should recognize that neither psychotherapy nor medication has the efficacy that we would like in treating depression, especially when they are ineffectively delivered in routine care in the community. Some have pointed out that the efficacy of neither treatment has greatly changed in the past 50 years. Pharmaceutical companies have stopped investing large amounts of money in research seeking to discover new treatments for depression. Similarly, Pim Cuijpers has shown that past research strongly indicates that no one active psychotherapy for depression is consistently superior to its evidence-based alternatives, so there is little reason to expect some breakthrough treatment on the horizon. We are stuck trying to do the best we can with existing treatments and improving upon their inadequate delivery in the community. That, for now at least, is where progress is to be made.
Acknowledgments: Some of my co-authors on the Psychological Medicine article provided extremely helpful feedback on earlier drafts of this post, Stefan Hofmann, Gerhard Andersson, and Matthias Berking, but especially Erick Turner. However, the views expressed here do not necessarily reflect theirs.
My interest in pursuing a comparison between psychotherapy and pill placebos developed as I worked on earlier blog posts on the efficacy of antidepressants. I drew liberally from those earlier posts in writing this one, among them 1, 2, 3