4.7 Article

Consistency checks to improve measurement with the Hamilton Rating Scale for Depression (HAM-D)

Journal

JOURNAL OF AFFECTIVE DISORDERS
Volume 302, Issue -, Pages 273-279

Publisher

ELSEVIER
DOI: 10.1016/j.jad.2022.01.105

Keywords

HAM-D17; Hamilton Rating Scale for Depression; Consistency of measurement; NEWMEDS; Careless ratings; Inconsistent ratings

Funding

  1. Innovative Medicine Initiative Joint Undertaking [115008]
  2. European Union
  3. Elie Wiesel Chair at Bar Ilan University

This study examines how measurement imprecision affects outcome assessment in depression trials and proposes logical and statistical consistency flags for the HAM-D17. Almost 30% of HAM-D administrations had at least one logical scoring inconsistency flag, and almost 22% had at least one statistical outlier flag. Reviewing and addressing such flags may improve the reliability and validity of clinical trial data.
Background: Symptom manifestations in mood disorders can be subtle. Cumulatively, small imprecisions in measurement can limit our ability to measure treatment response accurately. Logical and statistical consistency checks between item responses (i.e., cross-sectionally) and across administrations (i.e., longitudinally) can contribute to improving measurement fidelity.

Methods: The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that assembled flags indicating consistency/inconsistency ratings for the Hamilton Rating Scale for Depression (HAM-D17), a widely used rating scale in studies of depression. Proposed flags were applied to assessments derived from the NEWMEDS data repository of 95,468 HAM-D administrations from 32 registration trials of antidepressant medications and to Monte Carlo-simulated data as a proxy for applying flags under conditions of known inconsistency.

Results: Two types of flags were derived: logical consistency checks and statistical outlier-response pattern checks. Almost 30% of the HAM-D administrations had at least one logical scoring inconsistency flag. Seven percent had flags judged to suggest that a thorough review of the rating is warranted. Almost 22% of the administrations had at least one statistical outlier flag and 7.9% had more than one. Most of the administrations in the Monte Carlo-simulated data raised multiple flags.

Limitations: Flagged ratings may represent less-common presentations of administrations done correctly.

Conclusions: Application of flags to clinical ratings may aid in detecting imprecise measurement. Reviewing and addressing these flags may improve reliability and validity of clinical trial data.
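
The abstract does not list the Working Group's actual flags, so the following is only a minimal sketch of what the two flag families might look like in code. Every rule, threshold, and function name here (`logical_flags`, `longitudinal_flag`, `outlier_flags`, `max_jump`, `z_cut`) is a hypothetical illustration, not the published method. The outlier check is a plain z-score on total scores, swapped in as a stand-in for the paper's outlier-response pattern checks.

```python
from statistics import mean, stdev


def logical_flags(items):
    """Cross-sectional check on one administration.

    `items` maps HAM-D17 item number (1-17) to its score.
    The rule below is a hypothetical example, not a published flag.
    """
    flags = []
    # Hypothetical rule: high suicide rating (item 3) with
    # depressed mood (item 1) rated absent.
    if items[3] >= 3 and items[1] == 0:
        flags.append("suicide_high_mood_absent")
    return flags


def longitudinal_flag(prev_total, curr_total, max_jump=15):
    """Longitudinal check between two consecutive administrations.

    Flags a change in total score larger than `max_jump`, an
    arbitrary threshold assumed here for illustration.
    """
    return abs(curr_total - prev_total) > max_jump


def outlier_flags(totals, z_cut=3.0):
    """Statistical check: mark totals more than `z_cut` sample
    standard deviations from the mean of the supplied totals."""
    mu, sd = mean(totals), stdev(totals)
    return [sd > 0 and abs(t - mu) / sd > z_cut for t in totals]


if __name__ == "__main__":
    rating = {i: 0 for i in range(1, 18)}
    rating[3] = 4  # suicide item rated high, mood item left at 0
    print(logical_flags(rating))
    print(longitudinal_flag(25, 5))
    print(outlier_flags([20] * 20 + [50]))
```

As the Limitations note, a raised flag would only mark an administration for review: an unusual but correctly rated presentation can trip the same rule.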