I was asked by a Danish journalist to examine a randomized controlled trial (RCT) of cognitive behavior therapy (CBT) for functional somatic symptoms. I had not previously given the study a close look.
I was dismayed by how highly problematic the study was in so many ways.
I doubted that the results of the study showed any benefits to the patients or have any relevance to healthcare.
I then searched and found the website for the senior author’s clinical offerings. I suspected that the study was a mere experimercial or marketing effort of the services he offered.
Overall, I think what I found hiding in plain sight has broader relevance to scrutinizing other studies claiming to evaluate the efficacy of CBT for what are primarily physical illnesses, not psychiatric disorders. Look at the other RCTs. I am confident you will find similar problems. But then there is the bigger picture…
Schröder A, Rehfeld E, Ørnbøl E, Sharpe M, Licht RW, Fink P. Cognitive–behavioural group treatment for a range of functional somatic syndromes: randomised trial. The British Journal of Psychiatry. 2012 Apr 13:bjp-p.
A summary overview of what I found:
- Was unblinded to patients, interventionists, and to the physicians continuing to provide routine care.
- Had a grossly unmatched, inadequate control/comparison group that leads to any benefit from nonspecific (placebo) factors in the trial counting toward the estimated efficacy of the intervention.
- Relied on subjective self-report measures for primary outcomes.
- With such a familiar trio of design flaws, even an inert homeopathic treatment would be found effective, if it were provided with the same positive expectations and support as the CBT in this RCT. [This may seem a flippant comment that reflects on my credibility, not the study. But please keep reading to my detailed analysis where I back it up.]
- The study showed an inexplicably high rate of deterioration in both treatment and control group. Apparent improvement in the treatment group might only reflect less deterioration than in the control group.
- The study is focused on unvalidated psychiatric diagnoses being applied to patients with multiple somatic complaints, some of whom may not yet have a medical diagnosis, but most clearly had confirmed physical illnesses.
But wait, there is more!
- It’s not CBT that was evaluated, but a complex multicomponent intervention in which what was called CBT is embedded in a way that its contribution cannot be evaluated.
The “CBT” did not map well on international understandings of the assumptions and delivery of CBT. The complex intervention included weeks of indoctrination of the patient with an understanding of their physical problems that incorporated simplistic pseudoscience before any CBT was delivered. We focused on goals imposed by a psychiatrist that didn’t necessarily fit with patients’ sense of their most pressing problems and the solutions.
- The authors switched primary outcomes – reconfiguring the scoring of their subjective self-report measures years into the trial, based on a peeking at the results with the original scoring.
Investigators have a website which is marketing services. Rather than a quality contribution to the literature, this study can be seen as an experimercial doomed to bad science and questionable results from before the first patient was enrolled. An undeclared conflict of interest in play? There is another serious undeclared conflict of interest for one of the authors.
For the uninformed and gullible, the study handsomely succeeds as an advertisement for the investigators’ services to professionals and patients.
Personally, I would be indignant if a primary care physician tried to refer me or friend or family member to this trial. In the absence of overwhelming evidence to the contrary, I assume that people around me who complain of physical symptoms have legitimate physical concerns. If they do not yet have a confirmed diagnosis, it serves little purpose to stop the probing and refer them to psychiatrists. This trial operates with an anachronistic Victorian definition of psychosomatic condition.
But why should we care about a patently badly conducted trial with switched outcomes? Is it only a matter of something being rotten in the state of Denmark? Aside from the general impact on the existing literature concerning CBT for somatic conditions, results of this trial were entered into a Cochrane review of nonpharmacological interventions for medically unexplained symptoms. I previously complained about one of the authors of this RCT also being listed as an author on another Cochrane review protocol. Prior to that, I complained to Cochrane about this author’s larger research group influencing a decision to include switched outcomes in another Cochrane review. A lot of us rightfully depend heavily on the verdict of Cochrane reviews for deciding best evidence. That trust is being put into jeopardy.
1.This is an unblinded trial, a particularly weak methodology for examining whether a treatment works.
The letter that alerted physicians to the trial had essentially encouraged them to refer patients they were having difficulty managing.
‘Patients with a long-term illness course due to medically unexplained or functional somatic symptoms who may have received diagnoses like fibromyalgia, chronic fatigue syndrome, whiplash associated disorder, or somatoform disorder.
Patients and the physicians who referred them subsequently got feedback about to which group patients were assigned, either routine care or what was labeled as CBT. This information could have had a strong influence on the outcomes that were reported, particularly for the patients left in routine care.
Patients’ learning that they did not get assigned to the intervention group was undoubtedly disappointing and demoralizing. The information probably did nothing to improve the positive expectations and support available to patients in routine. This could have had a nocebo effect. The feedback may have contributed to the otherwise inexplicably high rates of subjective deterioration [to be noted below] reported by patients left in the routine care condition. In contrast, the authors’ disclosure that patients had been assigned to the intervention group undoubtedly boosted the morale of both patients and physicians and also increased the gratitude of the patients. This would be reflected in the responses to the subjective outcome measures.
The gold standard alternative to an unblinded trial is a double-blind, placebo-controlled trial in which neither providers, nor patients, nor even the assessors rating outcomes know to which group particular patients were assigned. Of course, this is difficult to achieve in a psychotherapy trial. Yet a fair alternative is a psychotherapy trial in which patients and those who refer them are blind to the nature of the different treatments, and in which an effort is made to communicate credible positive expectations about the comparison control group.
Conclusion: A lack of blinding seriously biases this study toward finding a positive effect for the intervention, regardless of whether the intervention has any active, effective component.
2. A claim that this is a randomized controlled trial depends on the adequacy of the control offered by the comparison group, enhanced routine care. Just what is being controlled by the comparison? In evaluating a psychological treatment, it’s important that the comparison/control group offers the same frequency and intensity of contact, positive expectations, attention and support. This trial decidedly did not.
There were large differences between the intervention and control conditions in the amount of contact time. Patients assigned to the cognitive therapy condition received an additional 9 group sessions with a psychiatrist of 3.5 hour duration, plus the option of even more consultations. The over 30 hours of contact time with a psychiatrist should be very attractive to patients who wanted it and could not otherwise obtain it. For some, it undoubtedly represented an opportunity to have someone to listen to their complaints of pain and suffering in a way that had not previously happened. This is also more than the intensity of psychotherapy typically offered in clinical trials, which is closer to 10 to 15, 50-minute sessions.
The intervention group thus received substantially more support and contact time, which was delivered with more positive expectations. This wealth of nonspecific factors favoring the intervention group compromises an effort to disentangle the specific effects of any active ingredient in the CBT intervention package. From what has been said so far, the trials’ providing a fair and generalizable evaluation of the CBT intervention is nigh impossible.
Conclusion: This is a methodologically poor choice of control groups with the dice loaded to obtain a positive effect for CBT.
3.The primary outcomes, both as originally scored and after switching, are subjective self-report measures that are highly responsive to nonspecific treatments, alleviation of mild depressive symptoms and demoralization. They are not consistently related to objective changes in functioning. They are particularly problematic when used as outcome measures in the context of an unblinded clinical trial within an inadequate control group.
There have been consistent demonstrations that assigning patients to inert treatments and measuring the outcomes with subjective measures may register improvements that will not correspond to what would be found with objective measures.
For instance, a provocative New England Journal of Medicine study showed that sham acupuncture as effective as an established medical treatment – an albuterol inhaler – for asthma when judged with subjective measures, but there was a large superiority for the established medical treatment obtained with objective measures.
There have been a number of demonstrations that treatments such as the one offered in the present study to patient populations similar to those in the study produce changes in subjective self-report that are not reflected in objective measures.
Much of the improvement in primary outcomes occurred before the first assessment after baseline and not very much afterwards. The early response is consistent with a placebo response.
The study actually included one largely unnoticed objective measure, utilization of routine care. Presumably if the CBT was effective as claimed, it would have produced a significant reduction in healthcare utilization. After all, isn’t the point of this trial to demonstrate that CBT can reduce health-care utilization associated with (as yet) medically unexplained symptoms? Curiously, utilization of routine care did not differ between groups.
The combination of the choice of subjective outcomes, unblinded nature of the trial, and poorly chosen control group bring together features that are highly likely to produce the appearance of positive effects, without any substantial benefit to the functioning and well-being of the patients.
Conclusion: Evidence for the efficacy of a CBT package for somatic complaints that depends solely on subjective self-report measures is unreliable, and unlikely to generalize to more objective measures of meaningful impact on patients’ lives.
4. We need to take into account the inexplicably high rates of deterioration in both groups, but particularly in the control group receiving enhanced care.
There was an unexplained deterioration of 50% deterioration in the control group and 25% in the intervention group. Rates of deterioration are only given a one-sentence mention in the article, but deserve much more attention. These rates of deterioration need to qualify and dampen any generalizable clinical interpretation of other claims about outcomes attributed to the CBT. We need to keep in mind that the clinical trials cannot determine how effective treatments are, but only how different a treatment is from a control group. So, an effect claimed for a treatment and control can largely or entirely come from deterioration in the control group, not what the treatment offers. The claim of success for CBT probably largely depends on the deterioration in the control group.
One interpretation of this trial is that spending an extraordinary 30 hours with a psychiatrist leads to only half the deterioration experienceddoing nothing more than routine care. But this begs the question of why are half the patients left in routine care deteriorating in such a large proportion. What possibly could be going on?
Conclusion: Unexplained deterioration in the control group may explain apparent effects of the treatment, but both groups are doing badly.
5. The diagnosis of “functional somatic symptoms” or, as the authors prefer – Severe Bodily Distress Syndromes – is considered by the authors to be a psychiatric diagnosis. It is not accepted as a valid diagnosis internationally. Its validation is limited to the work done almost entirely within the author group, which is explicitly labeled as “preliminary.” This biased sample of patients is quite heterogeneous, beyond their physicians having difficulty managing them. They have a full range of subjective complaints and documented physical conditions. Many of these patients would not be considered primarily having a psychiatric disorder internationally and certainly within the US, except where they had major depression or an anxiety disorder. Such psychiatric disorders were not an exclusion criteria.
Once sent on the pathway to a psychiatric diagnosis by their physicians’ making a referral to the study, patients had to meet additional criteria:
To be eligible for participation individuals had to have a chronic (i.e. of at least 2 years duration) bodily distress syndrome of the severe multi-organ type, which requires functional somatic symptoms from at least three of four bodily systems, and moderate to severe impairment.in daily living.
The condition identified in the title of the article is not validated as a psychiatric diagnosis. Two papers to which the authors refer to their own studies ( 1 , 2 ) from a single sample. The title of one of these papers makes a rather immodest claim:
Fink P, Schröder A. One single diagnosis, bodily distress syndrome, succeeded to capture 10 diagnostic categories of functional somatic syndromes and somatoform disorders. Journal of Psychosomatic Research. 2010 May 31;68(5):415-26.
In neither the two papers nor the present RCT is there sufficient effort to rule out a physical basis for the complaints qualifying these patients for a psychiatric diagnosis. There is also a lack of follow-up to see if physical diagnoses were later applied.
Citation patterns of these papers strongly suggest the authors are not having got much traction internationally. The criteria of symptoms from three out of four bodily systems is arbitrary and unvalidated. Many patients with known physical conditions would meet these criteria without any psychiatric diagnosis being warranted.
The authors relate what is their essentially homegrown diagnosis to functional somatic syndromes, diagnoses which are themselves subject to serious criticism. See for instance the work of Allen Frances M.D., who had been the chair of the American Psychiatric Association ‘s Diagnostic and Statistical Manual (DSM-IV) Task Force. He became a harsh critic of its shortcomings and the failures of APA to correct coverage of functional somatic syndromes in the next DSM.
Unless DSM-5 changes these incredibly over inclusive criteria, it will greatly increase the rates of diagnosis of mental disorders in the medically ill – whether they have established diseases (like diabetes, coronary disease or cancer) or have unexplained medical conditions that so far have presented with somatic symptoms of unclear etiology.
The diagnosis of mental disorder will be based solely on the clinician’s subjective and fallible judgment that the patient’s life has become ‘subsumed’ with health concerns and preoccupations, or that the response to distressing somatic symptoms is ‘excessive’ or ‘disproportionate,’ or that the coping strategies to deal with the symptom are ‘maladaptive’.
“These are inherently unreliable and untrustworthy judgments that will open the floodgates to the overdiagnosis of mental disorder and promote the missed diagnosis of medical disorder.
The DSM 5 Task force refused to adopt changes proposed by Dr. Frances.
Leading Frances to apologize to patients:
My heart goes out to all those who will be mislabeled with this misbegotten diagnosis. And I regret and apologize for my failure to be more effective.
The chair of The DSM Somatic Symptom Disorder work group has delivered a scathing critique of the very concept of medically unexplained symptoms.
Dimsdale JE. Medically unexplained symptoms: a treacherous foundation for somatoform disorders?. Psychiatric Clinics of North America. 2011 Sep 30;34(3):511-3.
Dimsdale noted that applying this psychiatric diagnosis sidesteps the quality of medical examination that led up to it. Furthermore:
Many illnesses present initially with nonspecific signs such as fatigue, long before the disease progresses to the point where laboratory and physical findings can establish a diagnosis.
And such diagnoses may encompass far too varied a group of patients for any intervention to make sense:
One needs to acknowledge that diseases are very heterogeneous. That heterogeneity may account for the variance in response to intervention. Histologically, similar tumors have different surface receptors, which affect response to chemotherapy. Particularly in chronic disease presentations such as irritable bowel syndrome or chronic fatigue syndrome, the heterogeneity of the illness makes it perilous to diagnose all such patients as having MUS and an underlying somatoform disorder.
I tried making sense of a table of the additional diagnoses that the patients in this study had been given. A considerable proportion of patients had physical conditions that would not be considered psychiatric problems in the United States.. Many patients could be suffering from multiple symptoms not only from the conditions, but side effects of the medications being offered. It is very difficult to manage multiple medications required by multiple comorbidities. Physicians from the community found their competence and ability to spend time with these patients taxing.
Most patients had a diagnosis of “functional headaches.” It’s not clear what this designation means, but conceivably it could include migraine headaches, which are accompanied by multiple physical complaints. CBT is not an evidence-based treatment of choice for functional headaches, much less migraines.
Over a third of the patients had irritable bowel syndrome (IBS). A systematic review of the comorbidity of irritable bowel syndrome concluded physical comorbidity is the norm in IBS:
The nongastrointestinal nonpsychiatric disorders with the best-documented association are fibromyalgia (median of 49% have IBS), chronic fatigue syndrome (51%), temporomandibular joint disorder (64%), and chronic pelvic pain (50%).
In the United States, many patients and specialists would consider considering irritable bowel syndrome as a psychiatric condition offensive and counterproductive. There is growing evidence that irritable bowel syndrome is a disturbance in the gut microbiota. It involves a gut-brain interaction, but the primary direction of influence is of the disturbance in the gut on the brain. Anxiety and depression symptoms are secondary manifestations, a product of activity in the gut influencing the nervous system.
Most of the patients in the sample had a diagnosis of fibromyalgia and over half of all patients in this study had a diagnosis of chronic fatigue syndrome.
Other patients had diagnosable anxiety and depressive disorders, which, particularly at the lower end of severity, are responsive to nonspecific treatments.
Undoubtedly many of these patients, perhaps most of them, are demoralized by not been able to get a diagnosis for what they have good basis to believe is a medical condition, aside from the discomfort, pain, and interference with their life that they are experiencing. They could be experiencing a demoralization secondary to physical illness.
These patients presented with pain, fatigue, general malaise, and demoralization. I have trouble imagining how their specific most pressing concerns could be addressed in group settings. These patients pose particular problems for making substantive clinical interpretation of outcomes that are highly general and subjective.
Conclusion: Diagnosing patients with multiple physical symptoms as having a psychiatric condition is highly controversial. Results will not generalize to countries and settings where the practice is not accepted. Many of the patients involved in the study had recognizable physical conditions, and yet they are being shunted to psychiatrists who focused only on their attitude towards the symptoms. They are being denied the specialist care and treatments that might conceivably reduce the impact of their conditions on their lives
6. The “CBT” offered in this study is as part of a complex, multicomponent treatment that does not resemble cognitive behavior therapy as it is practiced in the United States.
As seen in figure 1 in the article, The multicomponent intervention is quite complex and consists of more than cognitive behavior therapy. Moreover, at least in the United States, CBT has distinctive elements of collaborative empiricism. Patients and therapist work together selecting issues on which to focus, developing strategies, with the patients reporting back on efforts to implement them. From the details available in the article, the treatment sounded much more like an exhortation or indoctrination, even arguing with the patients, if necessary. An English version available on the web of the educational material used in initial sessions confirmed a lot of condescending pseudoscience was presented to convince the patients that their problems were largely in their heads.
Without a clear application of learning theory, behavioral analysis, or cognitive science, the “CBT” treatment offered in this RCT has much more in common with the creative novation therapy offered by Hans Eysenck, which is now known to have been justified with fraudulent data. Indeed, the educational materials for this study to what is offered in Eysenck’s study reveal striking similarities. Eysenck was advancing the claim that his intervention could prevent cardiovascular disease and cancer and overcome the iatrogenic effects. I know, this sounds really crazy, but see my careful documentation elsewhere.
Conclusion: The embedding of an unorthodox “CBT” in a multicomponent intervention in this study does not allow isolating any specific, active component ofCBT that might be at work.
7. The investigators disclose having altered their scoring of their primary outcome years after the trial began, and probably after a lot of outcome data had been collected.
I found a casual disclosure in the method section of this article unsettling, particularly noting that the original trial registration was:
We found an unexpected moderate negative correlation of the physical and mental component summary measures, which are constructed as independent measures. According to the SF-36 manual, a low or zero correlation of the physical and mental components is a prerequisite of their use.23 Moreover, three SF-36 scales that contribute considerably to the PCS did not fulfil basic scaling assumptions.31 These findings, together with a recent report of problems with the PCS in patients with physical and mental comorbidity,32 made us concerned that the PCS would not reliably measure patients’ physical health in the study sample. We therefore decided before conducting the analysis not to use the PCS, but to use instead the aggregate score as outlined above as our primary outcome measure. This decision was made on 26 February 2009 and registered as a protocol change at clinical trials. gov on 11 March 2009. Only baseline data had been analysed when we made our decision and the follow-up data were still concealed.
Switching outcomes, particularly after some results are known, constitutes a serious violation of best research practices and leads to suspicion of the investigators refining their hypotheses after they had peeked at the data. See How researchers dupe the public with a sneaky practice called “outcome switching”
The authors had originally proposed a scoring consistent with a very large body of literature. Dropping the original scoring precludes any direct comparison with this body of research, including basic norms. They claim that they switched scoring because two key subscales were correlated in the opposite direction of what is reported in the larger literature. This is troubling indication that something has gone terribly wrong in authors’ recruitment of a sample. It should not be pushed under the rug.
The authors claim that they switched outcomes based only on examining of baseline data from their study. However, one of the authors, Michael Sharpe is also an author on the controversial PACE trial A parallel switch was made to the scoring of the subjective self-reports in that trial. When the data were eventually re-analyzed using the original scoring, any positive findings for the trial were substantially reduced and arguably disappeared.
Even if the authors of the present RCT did not peekat their outcome data before deciding to switch scoring of the primary outcome, they certainly had strong indications from other sources that the original scoring would produce weak or null findings. In 2009, one of the authors, Michael Sharpe had access to results of a relevant trial. What is called the FINE trial had null findings, which affected decisions to switch outcomes in the PACE trial. Is it just a coincidence that the scoring of the outcomes was then switched for the present RCT?
Conclusion: The outcome switching for the present trial represents bad research practices. For the trial to have any credibility, the investigators should make their data publicly available so these data could be independently re-analyzed with the original scoring of primary outcomes.
The senior author’s clinic
I invite readers to take a virtual tour of the website for the senior author’s clinical services ]. Much of it is available in English. Recently, I blogged about dubious claims of a health care system in Detroit achieving a goal of “zero suicide.” . I suggested that the evidence for this claim was quite dubious, but was a powerful advertisement for the health care system. I think the present report of an RCT can similarly be seen as an infomercial for training and clinical services available in Denmark.
Conflict of interest
No conflict of interest is declared for this RCT. Under somewhat similar circumstances, I formally complained about undeclared conflicts of interest in a series of papers published in PLOS One. A correction has been announced, but not yet posted.
Aside from the senior author’s need to declare a conflict of interest, the same can be said for one of the authors, Michael Sharpe.
Apart from the professional and reputational interest, (his whole career has been built making strong claims about such interventions) Sharpe works for insurance companies, and publishes on the subject. He declared a conflict of interest for the for PACE trial.
MS has done voluntary and paid consultancy work for government and for legal and insurance companies, and has received royalties from Oxford University Press.
Here’s Sharpe’s report written for the social benefits reinsurance company UnumProvident.
If results of this are accepted at face, they will lend credibility to the claims that effective interventions are available to reduce social disability. It doesn’t matter that the intervention is not effective. Rather persons receiving social disability payments can be disqualified because they are not enrolled in such treatment.
Effects on the credibility of Cochrane collaboration report
The switched outcomes of the trial were entered into a Cochrane systematic review, to which primary care health professionals look for guidance in dealing with a complex clinical situation. The review gives no indication of the host of problems that I exposed here. Furthermore, I have glanced at some of the other trials included and I see similar difficulties.
I been unable to convince the Cochrane to clean up conflicts of interest that are attached to switched outcomes being entered in reviews. Perhaps some of my readers will want to approach Cochrane to revisit this issue.
I think this post raises larger issues about whether Cochrane has any business conducting and disseminating reviews of such a bogus psychiatric diagnosis, medically unexplained symptoms. These reviews do patients no good, and may sidetrack them from getting the medical care they deserve. The reviews do serve the interest of special interests, including disability insurance companies.
Special thanks to John Peters and to Skeptical Cat for their assistance with my writing this blog. However, I have sole responsibility for any excesses or distortions.