Neurobalm: the pseudo-neuroscience of couples therapy

Special thanks to Professor Keith Laws, blogger at LawsDystopiaBlog, and especially the pseudonymous Neurocritic for their helpful comments. But any excesses or inaccuracies are entirely my own responsibility.


You may be more able to debunk bad neuroscience than you think.

In my last blog post, I began critically examining whether emotionally focused couples therapy (EFT) could be said to soothe the brains of wives who had received it.

Claims were made in a peer-reviewed article available here and amplified in a University of Ottawa press release that EFT was a particularly potent form of couples therapy. An fMRI study supposedly demonstrated how EFT changed the way the brain encoded threatening situations.

True love creates resilience, turning off fear and pain in the brain

OTTAWA, May 1, 2014— New research led by Dr. Sue Johnson of the University of Ottawa’s School of Psychology confirms that those with a truly felt loving connection to their partner seem to be calmer, stronger and more resilient to stress and threat.

In the first part of the study, which was recently published in PLOS ONE, couples learned how to reach for their lover and ask for what they need in a “Hold Me Tight” conversation. They learned the secrets of emotional responsiveness and connection.

The second part of the study, summarized here, focused on how this also changed their brain. It compared the activation of the female partner’s brain when a signal was given that an electric shock was pending before and after the “Hold Me Tight” conversation.

The experiment explored three different conditions. In the first, the subject lay alone in a scanner knowing that when she saw a red X on a screen in front of her face there was a 20% chance she would receive a shock to her ankles. In the second, a male stranger held her hand throughout the same procedure. In the third, her partner held her hand. Subjects also pressed a screen after each shock to rate how painful they perceived it to be.

Before the “Hold Me Tight” conversation, even when the female partner was holding her mate’s hand, her brain became very activated by the threat of the shock — especially in areas such as the inferior frontal gyrus, anterior insula, frontal operculum and orbitofrontal cortex, where fear is controlled. These are all areas that process alarm responses. Subjects also rated the shock as painful under all conditions.

However, after the partners were guided through intense bonding conversations (a structured therapy titled Emotionally Focused Couple Therapy or EFT), the brain activation and reported level of pain changed —under one condition. While the shock was again described as painful in the alone and in the stranger hand holding conditions (albeit with some small change compared to before), the shock was described as merely uncomfortable when the husband offered his hand. Even more interesting, in the husband hand-holding condition, the subject’s brain remained calm with minimal activation in the face of threat.

These results support the effectiveness of EFT and its ability to shape secure bonding. The physiological effects are exactly what one would expect from more secure bonding. This study also adds to the evidence that attachment bonds and their soothing impact are a key part of adult romantic love. Results shed new light on other positive findings on secure attachment in adults, suggesting the mechanisms by which safe haven contact fosters more stability and less reactivity to threat.

You can find my succinct deconstruction of the press release here.

I invite you to carefully read the article or my last blog post and this one. This should prepare you to detect some important signs that this press release is utter nonsense, designed to mislead and falsely impress clinicians to whom EFT workshops and trainings are marketed. For instance, where in the procedures described in the PLOS One article is there any indication of the “Hold Me Tight” conversation? But that is just the start of the nonsense.

The PLOS One article ends with the claim that this “experiment” was conducted with a rigor comparable to a randomized clinical trial. Reading the article or these blog posts, you should also be able to see that this claim too is utter nonsense.

In my last blog post, I showed a lack of compelling evidence that EFT was better than any other couples treatment. To the extent to which EFT has been evaluated at all, the studies are quite small and all supervised by promoters of EFT. Couples in the EFT studies are recruited to be less maritally dissatisfied than in other couples therapy research, and there is some evidence that improvement in marital functioning does not persist after therapy ends.

I called attention to the neuroscientist Neurocritic’s caution against expecting fMRI studies to reveal much about the process or effectiveness of psychotherapy that we do not know already.

Of course, we should expect some effects of psychotherapy to be apparent in pre-post therapy fMRI studies. But we should also expect the same of bowling or watching a TV series for an equivalent amount of time. Does an fMRI really tell us much more than what we can observe in couples’ behavior or what they report after therapy? And without a comparison group, studies are not particularly revealing.

The larger problem looming in the background is authors intentionally or unintentionally intimidating readers with glib interpretations of neuroscience. Few readers feel confident in their ability to interpret such claims, especially the therapists to whom author Susan Johnson’s workshops are promoted.

This blog post could surprise you.

Maybe it will reassure you that you possess basic critical faculties with which you can debunk the journal article, if you are willing to commit the time and energy to reading and rereading it with skepticism.

I would settle, however, for leaving you thoroughly confused and skeptical about the claims in the PLOS One article. There are lots of things that do not make sense and that should be confusing if you think about them.

Confusion is a healthy reaction, particularly if the alternative is gullibility and being persuaded by pseudoscience.

I begin by ignoring that this was specifically an fMRI study.  Instead, I will look at some numbers and details of the study that you can readily discover. Maybe you would have had to look some things up on the Internet, but many of you could replicate my efforts.

In the text below, I have inserted some numbers in brackets. If you click on them, you will be taken to a secondary blog site where there are some further explanations.

The 23 wives for whom data were reported are an unrepresentative and highly select subsample of the 666 wives in couples expressing an interest in response to advertisements for the study.

With such a small number of participants:

  • Including or excluding one or two participants can change results [1]. There is some evidence this could have occurred after initial results were known [2].
  • Any positive significant findings are likely to be false, and of necessity, significant findings will be large in magnitude, even when they are false positives [3].
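
To see why significant findings in a sample this small must be large, here is a rough sketch. It assumes a simple one-sample (paired) t-test on 23 wives, which is a simplification of the models the article actually used, and it hard-codes the standard two-sided critical t value for 22 degrees of freedom. The function name is made up for illustration.

```python
import math

N_WIVES = 23          # number of wives with reported data
T_CRIT_DF22 = 2.074   # two-sided .05 critical t for df = 22 (standard table value)

def smallest_significant_effect(n: int, t_crit: float) -> float:
    """Smallest standardized mean difference (Cohen's d) that can reach
    significance in a one-sample t-test, using t = d * sqrt(n)."""
    return t_crit / math.sqrt(n)

d_min = smallest_significant_effect(N_WIVES, T_CRIT_DF22)
print(f"Smallest effect that can reach p < .05 with n = {N_WIVES}: d = {d_min:.2f}")
```

Under these assumptions, nothing smaller than roughly 0.43 standard deviations can come out significant, so whatever passes the filter looks impressive whether or not it is real.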

The sample was restricted to couples experiencing only mild to moderate marital dissatisfaction. So the couples in the study sample were less dissatisfied with their marriages than, and thus not comparable to, those recruited by other research groups for couples intervention studies.

Given the selection procedure, it was impossible for the authors to obtain a sample of couples with the mean levels of marital dissatisfaction that they reported for baseline assessments.

They stated that they recruited couples with the criterion that their marital dissatisfaction initially be between 80 and 96 on the DAS. They then report that the initial mean DAS score was 81.2 (SD=14.0). Impossible. [4]

Yup, and this throws into doubt all the other results that are reported, especially when the authors find they need to explain results that did not occur as expected in differences between pre- and post-EFT fMRI, but only in a complex interaction between pre/post fMRI and initial DAS scores.
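
One way to check the impossibility of the reported baseline numbers is simple arithmetic. If every score lies between a lower and an upper limit, the variance can be no larger than (upper − mean) × (mean − lower), a result known as the Bhatia-Davis inequality. The sketch below applies it to the stated eligibility range and reported mean. The sample size of 23 is an assumption (the number of wives with reported data), and the bound barely depends on it.

```python
import math

def max_possible_sd(low: float, high: float, mean: float, n: int) -> float:
    """Upper bound on the sample SD (n - 1 denominator) when every score lies
    in [low, high] and the sample mean equals `mean` (Bhatia-Davis inequality)."""
    max_population_variance = (high - mean) * (mean - low)
    return math.sqrt(max_population_variance * n / (n - 1))

# Eligibility range and baseline mean as reported; n = 23 is an assumption.
sd_bound = max_possible_sd(low=80, high=96, mean=81.2, n=23)
reported_sd = 14.0

print(f"Largest possible SD: {sd_bound:.1f}; reported SD: {reported_sd}")
print("Consistent with the stated criterion?", reported_sd <= sd_bound)
```

With a mean only 1.2 points above the floor of the eligibility range, the scores cannot spread far enough to produce a standard deviation anywhere near 14.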

Couples therapy was continued until some vaguely defined clinical goal had been achieved. None of the details that one would expect in a scientific paper were presented for how it was decided that this was enough therapy.

We were not told who decided, by what criteria, or with what interrater reliability the judgments were made. We do know that Susan Johnson, CEO of the nonprofit and profit-making companies promoting EFT, supervised all therapy and the study.

Basically, Dr. Johnson was probably able to prolong the therapy and postpone the follow-up fMRI assessment until she believed that the wives’ responses would make the therapy look good. And with no further follow-up, she implies that “how the brain processes threat” had been changed without any evidence of whether changes in fMRI persisted or were transient.

This might be fine for the pseudo-magic of a workshop presentation, but is unacceptable for a peer-reviewed article for which readers are supposed to be able to arrive at an independent judgment. And it is far removed from the experimental control of a clinical trial, in which the timing of follow-up assessments is fixed.

Randomized clinical trials take this kind of control away from investigators and put it into the design and the phenomenon being studied so that maybe investigators can be proved incorrect.

The amount of therapy that these wives received (M = 22.9, range = 13-35) was substantially more than what was provided in past EFT outcome studies. Whatever therapeutic gains were observed in the sample could not be expected to generalize to past studies. [5]

Despite the therapy that they had received and despite the low levels of marital dissatisfaction with which they had begun, the average couple finishing the study still qualified for entering it. [6]

There is no explanation given why only wives’ data are presented. No theoretical or clinical rationale is given for not studying husbands or presenting their data as well. [7]

A great deal is made of whether particular results are statistically significant or not. However, keep in mind that there was a very small sample size and the seemingly sharp distinction between significant and nonsignificant is arbitrary. Certainly, the size of most differences between results characterized as significant versus nonsignificant is not itself statistically significant. [8]

And, we will see, much is being made of small differences that did not occur for all wives, only those initially with the lowest marital satisfaction.
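
The point that a significant result and a nonsignificant result may not differ significantly from each other can be shown with a small sketch. The numbers below are hypothetical and chosen only for illustration; they are not taken from the article. The sketch also assumes the two estimates are independent and approximately normal.

```python
import math

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical estimates and standard errors, for illustration only.
effect_a, se_a = 25.0, 10.0   # would be reported as "significant"
effect_b, se_b = 10.0, 10.0   # would be reported as "nonsignificant"

p_a = two_sided_p(effect_a / se_a)
p_b = two_sided_p(effect_b / se_b)

# Test of the difference between the two independent estimates.
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
p_diff = two_sided_p((effect_a - effect_b) / se_diff)

print(f"A: p = {p_a:.3f}; B: p = {p_b:.3f}; A minus B: p = {p_diff:.3f}")
```

Effect A clears the .05 threshold and effect B does not, yet the test of A against B is itself far from significant. Claiming that two conditions differ because one is significant and the other is not skips the test that actually matters.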

The number of statistical tests conducted was many times the number of women in the study. The authors do not indicate all the analyses they conducted and selectively reported a subset of them, but there was considerable room for capitalizing on chance.

Multiple statistical tests in a small sample without adjustment for there being so many tests is a common complaint about small fMRI studies, but this study is a particularly bad example. Happy cherrypicking!
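
How much room for capitalizing on chance do unadjusted multiple tests leave? A back-of-the-envelope sketch follows. It assumes 100 tests, an illustrative round number since the article does not report the full count, and it treats the tests as independent with every null hypothesis true, which is a simplification.

```python
ALPHA = 0.05
N_TESTS = 100   # illustrative assumption; the full count is not reported

def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

fwer = familywise_error_rate(ALPHA, N_TESTS)
expected_false_positives = ALPHA * N_TESTS
bonferroni_threshold = ALPHA / N_TESTS

print(f"Chance of at least one false positive: {fwer:.3f}")
print(f"Expected number of false positives: {expected_false_positives:.0f}")
print(f"Bonferroni-adjusted threshold per test: {bonferroni_threshold}")
```

Under these assumptions, at least one spurious "finding" is a near certainty and about five are expected, which is why a handful of significant results out of many tests carries little weight.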

The article and Johnson’s promotional materials make much of differences that were observed from fMRI data collected before and after therapy. But the article never reports results of actually testing these differences. This is an important discovery. Let’s stop and explore it.

The article leads off its presentation of the fMRI results with

The omnibus test of EFT and handholding on all voxels activated in the original Coan et al. handholding study indicated a significant interaction between EFT, handholding and DAS, F (2, 72.6) = 3.6, p= .03 (Alone x EFT x DAS b= 10.3, SE =3.7; Stranger x EFT x DAS b = 2.5, SE =3.3).

What is oddly missing here is any test of the simple interaction between EFT (before versus after therapy) and handholding, i.e., EFT x handholding. The authors do not tell us whether the overall effects of handholding (partner versus alone versus stranger) differed before versus after completion of EFT, but that is the difference they want to discuss.

Basically, the authors only report interactions between EFT and handholding as qualified by level of initial marital satisfaction.

So? The authors proposed the simple hypothesis that receiving EFT will affect fMRI results in a situation involving threat of pain. They are about to do a very large number of multiple statistical tests and they want to reassure the reader that they are not capitalizing on chance.

For reassurance, they need an interaction between EFT and handholding in the omnibus test. Apparently they did not get it. What they end up doing is going back and forth between whatever few statistical tests are significant from the well over 100 tests that they conducted for pre-/post-fMRI findings. When most of those tests proved nonsignificant they went to a more complex interaction between fMRI results qualified by wives’ level of marital satisfaction.

This is a classic fishing expedition with a high probability that many of the fish should be thrown back as false positives. And the authors do not even have the fishing license that they hoped significant omnibus results would have provided.

The article makes repeated references to following up and replicating an earlier study by one of the authors, Jim Coan. That study involved only 16 women selected for higher marital satisfaction, so much so that they were called “supercouples” in press coverage of the study. You can find Neurocritic’s critique of that study here.

The levels of marital satisfaction for the two small samples were discontinuous with each other: any couple eligible for one would be disqualified from the other by a wide margin. Most of the general population of married people would fall in between these two studies in level of marital satisfaction. And any reference, as these authors make, to findings for women with low marital satisfaction in the Coan study is bunk. The highly select sample in the Coan study did not have any women with low marital satisfaction.

The two  samples are very different, but neither study presented data in a way that allowed direct comparison with the other. Both studies departed from transparent, conventional presentation of data. Maybe the results for the original Coan study were weak as well and were simply covered up. That is suggested in the Neurocritic blog post.

But the problem is worse than that. The authors claim that they preselected the regions of interest (ROIs) based on the results that Coan obtained with his sample of 16 women. If you take the trouble to examine Table 1 from this article and compare it to Coan’s results, you will see that some of the areas of the brain they are examining did not produce significant results in Coan’s study. More evidence of a fishing expedition.

It is apparent that the authors changed their hypotheses after seeing the data. They did not expect changes in the stranger condition and scrambled to explain these results. If you jump to the Discussion section concerning fMRI results for the stranger condition, you get a lot of amazing post-hoc gobbledygook as the authors try to justify the results they obtained. They should simply have admitted that their hypothesis was not confirmed.

Figure 2. Point estimates of percent signal change graphed as a function of EFT (pre vs. post) by handholding (alone, stranger, partner) and DAS score.

The graphic representations in Figures 2 and 4 were produced by throwing away two thirds of the available data [9].  Yup. Each line represents results for two wives. It is unclear what interpretation is possible, except that it appears that after throwing away all this data, differences between pre- and post-therapy were not apparent for the group that started with higher marital satisfaction. It is nearly flat in the partner condition, which the authors consider so important.

We do not want to make too much of these graphs because they are based on so few wives. But they do seem to suggest that not much was happening for women with higher marital satisfaction to begin with. And this may be particularly true for the responses when they were holding the hand of their partner. Yikes!


In looking at the graphical representations of self-report data in Figure 1 and the fMRI data in Figures 3 and 5, pay particular attention to the bracketing +/- zones, not just the heights of the bars. Some of the brackets overlap or nearly do, and you can see that small differences are being discussed.

And, oh, the neuroscience….

It is helpful to know something about fMRI studies to go much further in evaluating this one. But I can provide you with some light weaponry for dispensing with common nonsense.

First, beware of multiple statistical tests from small samples. The authors reassure us that their omnibus test reduced that threat, but they did not report relevant results and they probably did not obtain the results they needed for reassurance. And the results they expected for the omnibus test would not have been much reassurance anyway; they would still be largely capitalizing on chance. The authors also claim that they were testing regions of interest (ROIs), but if you take a careful look, they were testing other regions of the brain and they generally did not replicate much of Coan’s findings from his small study.

Second, beware of suggestions that particular complex mental functions are localized in single regions of the brain so that a difference for that mental function can be inferred from a specific finding for that region. The tendency of investigators to lapse into such claims has been labeled the new phrenology, phrenology being the 19th-century pseudoscience of bumps on the skull. The authors of this study lead us into this trap when they attempt, in the Discussion section, to explain findings they did not expect.

Third, beware of glib interpretations of activation in a particular region of the brain as meaning that certain mental processes are occurring. It is often hard to tell what activation means. More activity can mean that more mental activity is occurring or it can mean the same mental activity requires more effort.

Fourth, beware of investigators claiming that changes in activation observed in fMRI data represent changes in the structure of the brain or mental processes (in this case, the authors’ claim that processing of threat had been changed). They are simply changes in activity and they may or may not persist and they may or may not be compensated by other changes. Keep in mind the brain is complex and function is interconnected.

Overall, the fMRI results were weak, inconsistent, and obscured by the authors’ failure to report simple pre-post differences in any straightforward fashion. And what is presented really does not allow direct comparison between the earlier Coan study and the present one.

The authors started with the simple hypothesis that fMRI assessments conducted before and after EFT would show changes in wives’ response to threat of pain relative to whether their hand was being held by their partner, a stranger, or no one. Results were inconsistent and the authors were left struggling with findings that after a course of EFT, among other things, the wives were more comfortable with their hands being held by a stranger and less comfortable being alone. And overall, results that they expected to be simply a result of the wives getting EFT actually were limited to wives who got EFT but who had the lowest marital satisfaction to begin with.

We could continue our analysis by getting into the specific areas of brain functioning for which significant results were or were not obtained. That is dubious business because so many of the results are likely to be due to chance. If we nonetheless continue, we have to confront post-hoc gobbledygook efforts to explain results like

In the substantia nigra/red nucleus, threat-related activity was generally greater during stranger than partner handholding, F (1, 47.4) = 6.5, p = .01. In the vmPFC, left NAcc, left pallidum, right insula, right pallidum, and right planum polare, main effects of EFT revealed general decreases from pre- to post- therapy in threat activation, regardless of whose hand was held, all Fs (1, 41.1 to 58.6) > 3.9, all ps < .05.

Okay, now we have started talking about seemingly serious neuroscience and fMRIs and you are confused. But you ought to be confused. Even a neuroscientist would be confused, because the authors are not providing a transparent presentation of their findings, only a lot of razzle dazzle designed to shock and awe, not really inform.

Magneto, the BS-fighting superhero summoned by Neurocritic

In an earlier blog post concerning the PLOS One study, Neurocritic detected nonsense and announced that Magneto, a BS-fighting superhero, was being summoned. But even mighty Magneto was thwarted by the confused presentation of ambiguous results and the absence of knowledge of what other results had been examined but were suppressed because they did not support the story the authors wanted to tell.

I’m not sure that I understand this formulation, or that a dissociation between behavioral self-report and dACC activity warrants a reinterpretation of EFT’s therapeutic effects. Ultimately, I don’t feel like a BS-fighting superhero either, because it’s not clear whether Magneto has effectively corrected the misperceptions and overinterpretations that have arisen from this fMRI research.

Some of you may be old enough to recall Ronald Reagan doing advertisements for General Electric on television. He would always end with “Progress is our most important product.” We have been trying to make sense of neuroscience data being inappropriately used to promote psychotherapy, and have had to deal with all the confusion, contradictory results, and outright cover-up in an article in PLOS One. To paraphrase Reagan, “Confusion is our most important product.” If you are not confused, you don’t sufficiently grasp what is being done in the PLOS One article and the press coverage and promotional video.

