A study of adolescent depression in Lancet Psychiatry that simply doesn’t make sense

  • Serious discrepancy between sample size and numbers appearing in tables.
  • Claim that teens who did not access services were 7x more likely to be depressed three years later is incorrect and apparently depends on misinterpreting an odds ratio as a relative risk.
  • Complex statistical analyses were inappropriate for the modest sample size.
  • Where was the study statistician when this paper was submitted?
  • How were these problems missed by reviewers?

I’m uploading this post on PubPeer and I expect I will have some comments to add later.

Update: This post is getting some attention on social media. Someone observed on Facebook that editors and reviewers mostly don’t bother to read tables. That was certainly true in this instance, and I suspect it is true more generally.

But on PubPeer, someone made a similar observation and noticed more irregularities in the tables; I have appended their comment at the end. These irregularities further underscore the need for the journal to make corrections.

Reduction in adolescent depression after contact with mental health services: a longitudinal cohort study in the UK Lancet Psychiatry

I saw problems right away when I examined the abstract, and these were confirmed when I looked at the tables.

What I found was a cohort of 1238 14-year-old adolescents in which the authors intended to track reduction in depression following contact with mental health services.

The findings were improbable and even astonishing.


14-year-old adolescents who had contact with mental health services in the past year had a greater decrease in depressive symptoms than those without contact (adjusted coefficient −1·68, 95% CI −3·22 to −0·14; p=0·033). By age 17 years, the odds of reporting clinical depression were more than seven times higher in individuals without contact than in service users who had been similarly depressed at baseline (adjusted odds ratio 7·38, 1·73–31·50; p=0·0069).

“Seven times higher”?!!

I can certainly understand the authors’ advocacy of “improvement of access to adolescent mental health services”, but that advocacy does not follow from appropriate analyses of data from such a small cohort.

The first and obvious concern is that there are unlikely to be enough 14-year-old adolescents who are depressed and either contacting or not contacting mental health services to generate reliable multivariate statistics.

The abstract claims that those who were depressed and did not have contact with services were seven times more likely to be depressed at 17 than those who were depressed and had contact. This is both improbable and a misinterpretation of an odds ratio (which is not the more easily interpretable relative risk, RR).
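The gap between an odds ratio and a relative risk is easy to demonstrate with a 2×2 table. A minimal sketch, using hypothetical cell counts (the paper does not fully report its own), showing that the OR always sits further from 1 than the RR when the outcome is common:

```python
# Hypothetical 2x2 table (NOT the study's actual counts):
# rows = no-contact vs contact groups; columns = depressed vs not depressed at follow-up
depressed_no_contact, ok_no_contact = 30, 48   # no service contact
depressed_contact, ok_contact = 8, 40          # service contact

# Relative risk: ratio of the two groups' probabilities of the outcome
risk_no_contact = depressed_no_contact / (depressed_no_contact + ok_no_contact)
risk_contact = depressed_contact / (depressed_contact + ok_contact)
relative_risk = risk_no_contact / risk_contact

# Odds ratio: ratio of the two groups' odds of the outcome
odds_no_contact = depressed_no_contact / ok_no_contact
odds_contact = depressed_contact / ok_contact
odds_ratio = odds_no_contact / odds_contact

print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```

With these illustrative counts the OR (about 3.1) already overstates the RR (about 2.3), and the distortion grows as the outcome becomes more common, so reading "odds ratio 7.38" as "seven times more likely" is simply wrong.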

I start with the assumptions that

(1) Few depressed adolescents in the community have contact with formal mental health services;

(2) Much of the contact with mental health services is of inadequate quality and intensity; and

(3) To the extent that their “depression” represents a recurrent episodic disorder, as major depression is, an episode of treatment is followed by relapse and recurrence of one or more episodes, for which services may not be obtained.

There was a gross discrepancy between the number of adolescents (1238) in the cohort and the numbers reported in the tables. For instance, in table 1, the number of adolescents varies, ranging up to 3302.

A sample size of 1238 may seem like a lot of teens. It is not large enough to generate and test explanations for effects of contact with mental health services. When the sample is drawn from the general community, contacts with services are likely to be too few to serve as predictors in multivariate analyses with control of confounds.

We need to recognize that, in terms of statistical power, it is the number of events to explain, not the overall size of the sample, that matters most.

In the results section we can readily see the expected reduction from the full sample size to much smaller subgroups:


Of the 1238 participants recruited, 1190 adolescents had data for T1 current mental disorder and past-year mental health service contact (appendix p 6). The number of respondents with complete data for all outcomes and covariates at all timepoints was 995 (84%) for T1, 778 (65%) for T2, and 806 (68%) for T3. 64 (5%) adolescents made past-year contact with mental health services; 126 (11%) had a current mental disorder. Among individuals with a disorder, 48 (38%) reported past-year service contact and 46 (96%) of these contacts were based on T1 past-year recall; 36 (84%) of 43 of these adolescents attended five or more sessions (n=5 had missing data for treatment length). In the disorder-and-services group (n=48), disorders were affective (n=16 [33%]), anxiety (n=10 [21%]), behavioural (n=25 [52%]), and other (n=5 [10%]); 14 (29%) of these participants had a comorbid K-SADS diagnosis (appendix p 9).
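To be fair, the percentages inside this quoted paragraph do reconcile with their stated denominators; a quick check, using only the numerators and denominators given in the text itself:

```python
# Recompute the percentages quoted in the results paragraph.
# Each tuple: (numerator, denominator, reported percentage), all from the quoted text.
checks = [
    (64, 1190, 5),    # past-year service contact
    (126, 1190, 11),  # current mental disorder
    (48, 126, 38),    # disorder group with past-year service contact
    (46, 48, 96),     # contacts based on T1 past-year recall
    (36, 43, 84),     # attended five or more sessions
    (16, 48, 33),     # affective disorders
    (25, 48, 52),     # behavioural disorders
    (14, 48, 29),     # comorbid K-SADS diagnosis
]
for numerator, denominator, reported_pct in checks:
    computed = round(100 * numerator / denominator)
    assert computed == reported_pct, (numerator, denominator, computed)
print("all quoted percentages reconcile")
```

The internal arithmetic of this paragraph is consistent; the inconsistencies lie in the tables.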

Statistical controls. The article reports a full range of statistical controls that might be appropriate for a larger sample, but not for predicting such a small number of events. Overfitting of multivariate equations is likely to produce spurious and inflated results.
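The overfitting concern can be made concrete with the events-per-variable (EPV) heuristic for logistic regression, which calls for roughly ten outcome events per candidate predictor (Peduzzi et al.’s rule of thumb). A sketch using the group sizes the paper itself reports (126 adolescents with a disorder, 48 of them with service contact) and the nine covariates the article says went into the propensity model:

```python
# Events-per-variable (EPV) rule of thumb for logistic regression:
# roughly 10 outcome events per candidate predictor.
disorder_group = 126    # adolescents with a current disorder (from the paper)
service_contact = 48    # of those, with past-year service contact (from the paper)
covariates = 9          # baseline covariates in the propensity model (from the paper)

# The smaller cell of the dichotomy bounds the "events" available to the model.
events = min(service_contact, disorder_group - service_contact)
epv = events / covariates
print(f"events per variable: {epv:.1f} (rule of thumb: >= 10)")
```

On the paper’s own numbers, EPV is about 5.3, roughly half the conventional threshold, and that is before counting the additional post-baseline adjustments.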

Data were adjusted as follows: gender, sociodemographics (ethnic origin, Index of Multiple Deprivation, adolescent living with biological parents), environmental factors (number of stressful life events in the past year, current family dysfunction and friendships, any family-focused adversities by T1), and mental health factors (any past Schedule for Affective Disorders and Schizophrenia for School-Age Children diagnosis, any mental health services after T1, any emotional problems in a family member [past 3 years or present], current antisocial traits). Variables not included were any mental health service referral age 0–13 years (p=0·19 in base model) and pubertal status (not a true confounder as p>0·10 and ρ<0·10 with predictor). MFQ=Mood and Feelings Questionnaire. T1=timepoint 1 (age 14·5 years). T2=timepoint 2 (age 16 years). T3=timepoint 3 (age 17·5 years).

The article later states:

We included nine baseline covariates in the propensity score weighting (table 2). Propensity score weighted GLM revealed that among adolescents with a mental disorder, those without contact with mental health services at T1 had nearly four times the odds of being depressed by T3 compared with those in the disorder-and-services group (table 2). Inclusion of post-baseline confounding variables increased odds by more than five times, and in the common support sample, to more than seven times (table 2).

Again, there is a gross mismatch between heavy-duty statistical analyses and the modest sample to which they are applied. I don’t understand how a statistician would agree to provide these analyses. Perhaps that explains why odds ratios were misinterpreted as relative risks, a common problem when statisticians are not involved in interpreting data.

Overall, this study involved a wild mismatch between statistical analyses and sample size. Such a sample size does not even allow testing of the appropriateness of the assumptions that are being made.

The article nonetheless concludes:

In the present study, we considered mental health services from all sectors irrespective of treatment length, we multiply imputed missing data, used propensity score weighting to adjust for participants’ initial likelihood to access services, and data yielded clinically relevant results robust to a wide range of confounds. Contact with mental health services appeared to be of such value that after 3 years the levels of depressive symptoms of service users with a mental disorder were similar to those of unaffected individuals.

I conclude this article is quite misleading and should not remain uncorrected.

From PubPeer:

>>How were these problems missed by reviewers?
Because reviewers don’t read tables or statistics.

At first I thought that the sample sizes like the “imputed” 3302 might be due to counting each of three time points as a case. But then 3302 is too small (1238*3=3714) and 2469 is too big (778*3=2334).

Also, the breakdown of the numbers in the middle of Table 1 (“Categorical analysis of age”) doesn’t work. It’s OK for the right-hand side (2257+140+52=2469) but not the left (2965+202+126=3293, not 3302).

I don’t think the numbers in this study have any credibility at all until this table is sorted out.

I will soon be offering e-books providing skeptical looks at mindfulness and positive psychology, as well as scientific writing courses on the web, as I have been doing face-to-face for almost a decade.

Sign up at my new website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at CoyneoftheRealm.com.