“As I was saying before I was so rudely interrupted…”— William Connor
This is the second post in a two-part series about claims made in meta-analyses in JAMA and more recently elsewhere that long-term psychodynamic therapy is superior to shorter psychotherapies. This post was intended to be uploaded a while ago. But it got shelved until now because I felt the need to respond to hyped and distorted media coverage of a study in Lancet of CBT for persons with schizophrenia, with two posts examining what turned out to be a significant but not overwhelming study.
That digression probably helped to change the tide of opinion about that Lancet study. When I first posted about it, media coverage was dominated by wild claims about CBT having been shown to be as effective as antipsychotic medication. Coverage in Science was headlined “Schizophrenia: Time to Flush the Meds?” Alternative perspectives were largely limited to a restrained, soft-toned note of skepticism at Mental Elf and louder complaints at Keith Laws’ Dystopia.
Here and elsewhere, I challenged the media coverage. I showed that this study actually had null findings of no difference between CBT and treatment as usual at the end of the intervention. Despite all the enthusiasm shown for the study back then, no one is now responding to my request on Twitter to come forward if they still believe the results that were claimed for the study. A single holdout has been persisting with comments at my other blog site, but he increasingly seems like Hiroo Onoda still wandering in the jungle after the battle is lost.
Stay tuned for what could be an article by Keith Laws and me at the new Lancet Psychiatry and likely a slew of letters at Lancet.
There is certainly more about the JAMA meta-analysis worth writing about. It was accompanied by a gushy editorial and praised by people like Peter Kramer who argued we can’t argue with the authority of JAMA. Yet when I took a look at the JAMA paper, it proved to be
a bizarre meta-analysis compar[ing] 1053 patients assigned to LTPP to 257 patients assigned to a control condition, only 36 of whom were receiving an evidence based therapy for their condition.
In this post I will continue describing what happened next:
- Leichsenring and Rabung responded to critics, dodging basic criticisms and charging that those who reject their claims are bringing in biases of their own.
Yup, I was accused of being a part of a plot of advocates of cognitive therapy trying to beat down legitimate claims of long-term psychoanalysis and psychodynamic therapy being superior. Those who thought I was part of a plot against cognitive therapy during my analyses of the Lancet study, please note.
- Leichsenring and Rabung renewed their claims in another meta-analysis in British Journal of Psychiatry for which 10 of the 11 studies were already included in the JAMA meta-analysis.
When you read my account of this recycling, you will probably wonder why they were allowed to do this. I will give some reasons that do not reflect well on that journal. But then Harvard Review of Psychiatry offered continuing education credits to those who wanted to learn how to interpret a meta-analysis from another recycling.
- The long term psychodynamic/psychoanalytic community responded approvingly, echoing Leichsenring and Rabung’s assessment of skeptics.
- The important question of whether long-term psychoanalytic psychotherapy is better than shorter term therapies got an independent evaluation by another group, which included the world-class meta-analyst and systematic reviewer, John Ioannidis.
When Ioannidis offers a critical analysis of conventional wisdom, it is generally worth paying attention.
Responses from Leichsenring and Rabung and the LTPP community to criticism
I’ve been a skeptical critic long enough not to expect that authors will agree with criticism or that they will substantially modify their conclusions, no matter how incisive the criticisms are. But when there have been such obvious miscalculations as those Leichsenring and Rabung made, which inflated results in their favor, I would expect at least some admission of error and adjustment of conclusions, even if not a retraction.
They admitted no computational error, not even for a claim that in one analysis, the effect size for LTPP was 6.9. They did concede that their effect size estimates were different and even larger than “commonly assessed.” But there was apparently nothing wrong with that, even if readers might be expecting something comparable to what is typically done in meta analyses.
Overall, one would never get a sense from Leichsenring and Rabung’s response that they had published one of the wildest examples of meta-analysis malpractice that can be found in any recent high impact journal. Results were not in any way comparable to what a conventional, well done meta-analysis would produce. [Click here if you want a more detailed analysis.]
Leichsenring and Rabung’s response to our extended critique was also strange. For instance, we had offered the criticism that the small group of studies entered into the meta-analysis was so heterogeneous that any effect size could not be generalized back to any one of the studies or treatments going into the meta-analysis. To this they responded:
Heterogeneity of control conditions and diagnoses (p. 210) are part of the discussion about effectiveness vs. efficacy, which is nearly as old as psychotherapy research itself. Although theoretically important for internal validity, the attempt to create truly homogeneous yet clinically relevant study populations leads psychotherapy ad absurdum.
Really? I guess the Cochrane Collaboration has been getting it all wrong in being so concerned about heterogeneity.
But basically Leichsenring and Rabung’s message was “why were we picking on them and their meta-analysis and not someone else?” They implied we had undisclosed conflicts of interest in criticizing them:
It is quite ironic that the paper of Bhar et al. is published in close proximity to an editorial dealing with the unmasking of special interest groups, which are obviously not limited to somatic medicine and the pharmaceutical industry.
Really, Falk and Sven? Is it a conflict of interest for anyone but a psychoanalyst to have an opinion about a slick sell job of a meta analysis?
I joined with other colleagues because I was deeply offended by your flouting of standards, your abuse of statistics, and your stubborn refusal to acknowledge that you had made mistakes. All in the prestigious JAMA and with a fawning editorial that becomes a further source of offense. I wrote and continue to write about your work because I want to cultivate a gag response in others.
The authors went on to accuse us of being in some sort of a plot, of our joining
ranks with an interesting and surprising movement of others, who publish comments with relatively low empirical novelty but quite harsh language towards the Leichsenring and Rabung article in other journals, let alone internet blogs and pamphlets.
As of March 2014, the JAMA article has racked up over 180 citations according to ISI Web of Science and 454 according to Google Scholar. This unusually large discrepancy reflects in part proponents of psychoanalytic therapy making greater use of chapters in books, rather than peer reviewed articles. Across the LTPP literature, Leichsenring and Rabung’s blatant miscalculations of effect sizes are being uncritically accepted and praised. Recurrent themes get amplified in repetition. Although skepticism is expressed about LTPP being evaluated in RCTs and meta-analyses, the contradictory argument is made, usually in the same article, that the science is solid and that effects are equal to or larger than those for evidence-based therapies, and critics and doubters get slammed as having ulterior motives and undisclosed conflicts of interest.
A Psychology Today counterpoint from a University of Colorado Medical School psychologist to my complaint about the miscalculated effect sizes took the hype to a whole new level:
Indeed, the within-group effect sizes for long-term psychodynamic therapy were quite large (as a rough example: if psychiatric symptoms were SAT scores and long-term psychodynamic therapy were an SAT training program, the average student would expect to increase their score by somewhere around 90-180 points on each section).
You would think that this analogy was a cause for skepticism, but no.
Eighteen months after publication of the JAMA article, Leichsenring and Rabung published another meta-analysis in British Journal of Psychiatry. Ten of the 11 studies entered into it were already in the JAMA article. The article’s title identified it as an “update.”
The sole study “updating” the JAMA meta-analysis was a decade old and had been excluded from the JAMA analyses. The study, by Bateman and Fonagy, compared an 18-month “mentalization-based” therapy to structured clinical management, “a counseling model closest to a supportive approach with case management, advocacy support, and problem-oriented psychotherapeutic interventions (p.357).” This comparison treatment was not manualized or evidence-based. The study did not add much except for further statistical and clinical heterogeneity and confusion.
Leichsenring and Rabung concluded from the redone meta-analysis:
Results suggest that LTPP is superior to less intensive forms of psychotherapy in complex mental disorders.
Leichsenring and Rabung have continued to turn out redundant meta-analyses. Leichsenring’s article in Harvard Review of Psychiatry offers continuing education credit.
Learning Objectives: After participating in this educational activity, the reader should be better able to evaluate the empirical evidence for pre/post changes in psychoanalysis patients with complex mental disorders, and assess the limitations of the meta-analysis.
Updates and meta-analyses are justified by the passage of time and accumulation of new relevant studies. There was neither in the case of BJP or these other meta-analyses. The BJP editor should have recognized the manuscript as an attempt at duplicate publication, aimed at extending a publicity effort into new venues, not a publication justified by new science.
Critics predictably responded to the re-analysis. Only one of the Rapid Responses left on the journal website made it into the print edition, and it was met a month later by an invited editorial in BJP from Jeremy Holmes, author of Introduction to Psychoanalysis, which was not held to the same strict word limits:
Leichsenring & Rabung7 found that long-term psychodynamic psychotherapies (LTPPs) produced large within-group effect sizes (average 0.8–1.2) comparable with those achieved by other psychotherapy modalities; that gains tended to accumulate even after therapy has finished, in contrast to non-psychotherapeutic treatments; and that a dose–effect pattern was present, with longer therapies producing greater and more sustained improvement.
The issue of insurance reimbursement was pushed with the reassurance:
Although expensive, psychodynamic psychiatry is able in some circumstances to ‘pay for itself’,9 thanks to offset costs of other expenses (medication, hospital stays, welfare payments, etc).
This extravagant claim was based on the single study added by Leichsenring and Rabung in their BJP reanalysis, in which LTPP was compared to 18 months of “mentalization-based therapy” with case management, advocacy support, and problem-oriented psychotherapeutic interventions.
Elsewhere, Thombs and colleagues had noted that a trial excluded by Leichsenring and Rabung because it was too short found comparable outcomes for a mean of 232 LTPP sessions, at an estimated cost of $29,000 to $40,600 according to the authors, and for 9.8 sessions of a nurse-delivered solution-focused therapy, at a cost of $735 to $980.
How did this recycling get into British Journal of Psychiatry? It is a matter of conjecture, but the editor at the time was Peter Tyrer, a practicing psychoanalyst. He is also a devout Catholic who gave space to American antiabortionists claiming that a significant portion of the mental health problems of women of childbearing age was due to abortion. He resisted a storm of criticism of the antiabortionists’ meta-analysis, which disproportionately featured their own flawed studies. He even recruited Ben Goldacre to manage the resulting crisis. Goldacre, however, joined with critics in denouncing the meta-analysis.
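To make the scale of that cost difference concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the session counts and cost ranges quoted above. The per-session figures and the cost ratio are derived here, not reported by the trial authors, and pairing the low estimates with each other (and the high with the high) is my own simplifying assumption.

```python
# Figures quoted above from Thombs and colleagues' account of the trial.
ltpp_sessions = 232            # mean number of LTPP sessions
ltpp_cost = (29_000, 40_600)   # estimated total cost range in dollars
brief_sessions = 9.8           # mean number of solution-focused sessions
brief_cost = (735, 980)        # total cost range in dollars


def per_session(cost_range, sessions):
    """Derived cost per session for the low and high ends of a cost range."""
    return tuple(round(cost / sessions, 2) for cost in cost_range)


ltpp_per_session = per_session(ltpp_cost, ltpp_sessions)
brief_per_session = per_session(brief_cost, brief_sessions)

# Assumption: compare low estimate with low estimate, high with high.
cost_ratio = tuple(round(l / b, 1) for l, b in zip(ltpp_cost, brief_cost))

print("LTPP cost per session:", ltpp_per_session)
print("Brief therapy cost per session:", brief_per_session)
print("Total cost ratio (LTPP / brief):", cost_ratio)
```

On these numbers the total cost of LTPP comes out at roughly 40 times that of the brief therapy, for outcomes the trial found to be comparable.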
These are only two examples, but they are extraordinary. Maybe Tyrer had some sort of imperial sense of editorial discretion. But there are rules…
Too many flawed meta-analyses by the same authors
A recent BMJ article has noted the prevalence of multiple meta-analyses of the same literature, and expressed concern that such clusters of meta-analyses often come from the same group.
Re-analyses from the same authors need special justification and risk perpetuating the same problems from one meta-analysis to another. If inadequacies, including miscalculation and biased, unconventional calculation of effect sizes, require a reanalysis, it probably should be done by another group.
Moreover, if the inadequacies of the earlier analyses by particular authors are the rationale for conducting another meta-analysis, the problems that led to the decision to do so should be made explicit. Arguably, if the problems are sufficient to require a reanalysis, the earlier analyses should either be retracted or no longer uncritically cited. Instead, we have a pattern of laudatory self-citation, minimization of any difficulties, and repeated meta-analyses lending false authority.
An independent reanalysis
The Dutch Health Care Insurance Board (CVZ) provided partial funding for an independent reanalysis of the evidence concerning the efficacy of LTPP. All the authors were Dutch, except for John Ioannidis, the Greek-American who is the author of numerous well-executed meta-analyses and systematic reviews. Some have proved game changing, like “Why Most Published Research Findings Are False.” Richard Smith, former editor of BMJ, endorses Ioannidis as
a brilliant researcher who has done more than anybody to identify serious problems with the publishing of science.
The authors defined LTPP as having at least 40 sessions and continuing for at least one year. This differs from Leichsenring and Rabung’s requirement of at least 50 sessions, but the authors noted that weekly sessions may result in a total of fewer than 50 sessions in a year, allowing for patient and therapist vacations and missed sessions.
The authors struggled with the poor quality of the studies they were able to identify, but came up with an excellent solution, a sensitivity analysis. The meta-analysis was conducted with each of the poor quality studies included and then again with each of these studies excluded. As it turned out, whether the studies were included did not influence the results.
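For readers who have not seen a sensitivity analysis, here is a minimal sketch of the idea in Python. The four studies, their effect sizes, and their standard errors are entirely hypothetical, invented for illustration. The fixed-effect inverse-variance pooling is also my assumption for simplicity, not necessarily the model the reanalysis used.

```python
import math


def pooled_effect(studies):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / s["se"] ** 2 for s in studies]
    total = sum(weights)
    estimate = sum(w * s["d"] for w, s in zip(weights, studies)) / total
    return estimate, math.sqrt(1.0 / total)


def ci95(estimate, se):
    """Approximate 95% confidence interval under a normal approximation."""
    return estimate - 1.96 * se, estimate + 1.96 * se


# Hypothetical between-group effect sizes (d) and standard errors (se).
studies = [
    {"name": "A", "d": 0.10, "se": 0.20, "low_quality": False},
    {"name": "B", "d": -0.05, "se": 0.25, "low_quality": False},
    {"name": "C", "d": 0.20, "se": 0.30, "low_quality": True},
    {"name": "D", "d": 0.00, "se": 0.35, "low_quality": True},
]

# Pool once with every study, then again with low-quality studies excluded.
all_est, all_se = pooled_effect(studies)
hq_est, hq_se = pooled_effect([s for s in studies if not s["low_quality"]])

print("All studies:       d = %.3f, 95%% CI %s" % (all_est, ci95(all_est, all_se)))
print("High quality only: d = %.3f, 95%% CI %s" % (hq_est, ci95(hq_est, hq_se)))
```

If the two pooled estimates tell the same story, as they do in this toy example where both intervals include zero, the conclusion does not hinge on whether the weaker studies are counted.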
They explicitly rejected the validity of effect sizes calculated on the basis of within-group differences:
To reliably assess the effectiveness of any treatment, it is necessary to evaluate its outcomes compared to a control group. The change in severity or intensity of a mental disorder over time cannot be attributed solely to the treatment that took place during that time, unless the treatment is controlled for. This is especially so with long-term treatments where the course of symptoms may change (more or less) spontaneously over time, even in personality disorders that were previously thought to be stable and incurable, such as borderline personality disorder.
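The point about within-group effect sizes can be illustrated with a small Python sketch. All of the scores below are made up. The pooled-SD version of Cohen's d is one common convention, and I am not claiming it is the formula Leichsenring and Rabung used. In this toy data set, both arms improve by the same amount over time, so the within-group effect looks enormous while the between-group effect is zero.

```python
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical symptom scores (higher = worse) before and after treatment.
treat_pre = [30, 28, 32, 31, 29, 30]
treat_post = [20, 18, 22, 21, 19, 20]
ctrl_pre = [30, 29, 31, 32, 28, 30]
ctrl_post = [20, 19, 21, 22, 18, 20]

# Within-group: pre versus post inside a single arm, no comparison group.
within_treat = cohens_d(treat_pre, treat_post)
within_ctrl = cohens_d(ctrl_pre, ctrl_post)

# Between-group: control versus treatment at the end of the trial.
between_post = cohens_d(ctrl_post, treat_post)

print("Within-group d, treatment: %.2f" % within_treat)
print("Within-group d, control:   %.2f" % within_ctrl)
print("Between-group d at post:   %.2f" % between_post)
```

A reader shown only the within-group figure for the treatment arm would conclude the therapy was spectacularly effective, when the control arm did exactly as well.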
Results were unambiguous and negative, in terms of the efficacy of LTPP:
The recovery rate of various mental disorders was equal after LTPP or various control treatments, including treatments without a specialized psychotherapy component. Similarly, no statistically significant differences were found for the domains target problems, general psychiatric problems, personality pathology, social functioning, overall effectiveness or quality of life.
Control conditions were heterogeneous and frequently of low quality, e.g. without a specialized psychotherapy component. If anything, this suggests that LTPP is often compared against relatively ineffective “straw man” comparator… LTPP comparisons to specialized non-psychodynamic treatments, like dialectical behavior therapy and schema-focused therapy, suggest that LTPP might not be particularly effective.
The bottom line is that available evidence suggests that LTPP is not worthwhile, at least in terms of the conventional ways of evaluating therapies. The authors noted that many of the studies made comparisons between LTPP and a control condition, which is inappropriate if the critical question is whether LTPP is superior to other psychotherapies.
The authors included a provocative quote from Freud expressing doubt whether LTPP really produces much change:
One has the impression that one ought not to be surprised if it should turn out in the end that the difference between a person who has not been analyzed and the behavior of a person after he has been analyzed is not so thorough-going as we aim at making it and as we expect and maintain it to be (Freud, 1937/1961).
Bewildered yet? Tips for evaluating other meta-analyses
I have been encouraging a healthy skepticism about the quality and credibility of articles published in even the most high-impact prestigious journals. The push to secure insurance coverage for LTPP has produced a lot of bad science, both at the level of poorly done clinical trials intended to prove rather than test the efficacy of LTPP and at the level of horrific meta-analyses that take bizarre steps to ensure LTPP uber alles.
It is troubling to see that bad science repeatedly gets into prestigious, high impact journals with its flaws brazenly displayed. Not only that, it is accompanied by laudatory editorials in various efforts to block and neutralize criticism. Anyone who has participated in the debate described in this blog has to be aware of the presence of an old boy network of aging psychoanalysts and their patients that seeks to control the evidence that is available concerning LTPP. Few will articulate readers’ dilemma as clearly as Peter Kramer, but many readers suffer a discomfort knowing something is wrong with these clinical trials and meta-analyses, while being intimidated by the sheer authority of JAMA, British Journal of Psychiatry, and Harvard Review of Psychiatry. Why, if there is such consensus about the efficacy of LTPP, who can argue?
Reviewing meta analyses of long-term psychoanalytic and psychodynamic psychotherapies reinforces some points that I have made and will be making in future blog posts.
- We can’t necessarily decide on the authority and credibility of articles solely on the basis of the journals in which they appear, and an accompanying editorial does not necessarily give added reassurance.
- We have standards such as CONSORT for guiding the reporting of clinical trials and the Cochrane Collaboration risk of bias criteria for judging whether the reporting has a risk of bias.
- We similarly have standards such as PRISMA for guiding organization, conduct, and reporting of meta-analyses and systematic reviews and AMSTAR for their evaluation.
It’s a good idea to familiarize yourself with these readily available standards.
- We should be wary of the use of meta-analysis for propaganda and marketing of particular interventions and services. Conflicts of interest are not confined to the usual consideration of whether there’s industry support for a clinical trial or meta-analysis.
- We should be concerned about multiple meta-analyses coming from the same group of authors. We need to ask what justification there is for multiple publications in different journals and be alert to uncritical self-citation.
But what if one does not have time or inclination to scrutinize bad meta-analyses with formal rating scales? That is certainly true of a lot of consumers, whether they be clinicians, policymakers, or patients trying to make intelligent decisions about whether they really need long-term treatment. I think that there are some basic things to look for that can serve as a first screen of meta-analyses so the decision can be made whether to accept them, dismiss them, or subject them to further evaluation. The screen can be seen as a good rule-out tool and maybe the first stage of a two-stage process involving a further look, including going back to the original studies and other relevant meta-analyses.
When I pick up a meta-analysis, these are some of the first things that I look for, all of which can be immediately seen to be missing in Leichsenring and Rabung:
- Is there a reasonable number of reasonably sized trials making head-to-head comparisons between the two types of treatments that are being pitted against each other?
- Did the authors rely on conventional between-group calculation of effect sizes, rather than jumping to biased and easily misinterpreted within-group effect sizes?
- Aside from technical questions of statistical heterogeneity, does the lumping and splitting of both intervention and comparison groups make sense in terms of similarities and differences in interventions, patient characteristics, and clinical context?
It has taken years of discussion for all of the problems of Leichsenring and Rabung’s meta-analysis to become apparent, but I think their failure to meet basic criteria should have been apparent in a close read of their JAMA paper lasting less than 30 minutes. An effect size of 6.9? Come on!