Optimization of multi-stage dynamic treatment regimes utilizing accumulated data

STATISTICS IN MEDICINE (2015)

Journal

STATISTICS IN MEDICINE

Volume 34, Issue 26, Pages 3424-3443

Publisher

WILEY-BLACKWELL

DOI: 10.1002/sim.6558

Keywords

backward induction; multi-stage treatment; optimal treatment sequence; Q-learning; treatment decision-making

Funding

USA National Institutes of Health [U54 CA096300, U01 CA152958, 5P50 CA100632, R01 CA 83932, 5P01 CA055164]

Abstract

In medical therapies involving multiple stages, a physician's choice of a subject's treatment at each stage depends on the subject's history of previous treatments and outcomes. The sequence of decisions is known as a dynamic treatment regime or treatment policy. We consider dynamic treatment regimes in settings where each subject's final outcome can be defined as the sum of longitudinally observed values, each corresponding to a stage of the regime. Q-learning, a backward induction method, first optimizes the last-stage treatment and then sequentially optimizes each previous stage's treatment until the first-stage treatment is optimized. During this process, model-based expectations of late-stage outcomes are used in the optimization of earlier stages. When the outcome models are misspecified, bias can accumulate from stage to stage and become severe, especially when the number of treatment stages is large. We demonstrate that a modification of standard Q-learning can help reduce the accumulated bias. We provide a computational algorithm, estimators, and closed-form variance formulas. Simulation studies show that the modified Q-learning method has a higher probability of identifying the optimal treatment regime even in settings with misspecified models for outcomes. The method is applied to identify optimal treatment regimes in a study of advanced prostate cancer and to estimate and compare the final mean rewards of all the possible discrete two-stage treatment sequences. Copyright (c) 2015 John Wiley & Sons, Ltd.
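
To make the backward induction step concrete, the following is a minimal sketch of standard (unmodified) two-stage Q-learning on simulated data. It is not the modified method proposed in the paper, whose details are not given in the abstract. The helper names (`fit_linear`, `design`, `q_learning_two_stage`, `optimal_action`), the linear working models, the -1/+1 treatment coding, and the data-generating process are all assumptions made for illustration. The final outcome is taken to be the sum of the two stage rewards, as in the setting the abstract describes.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares coefficients via numpy's lstsq."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def design(x, a):
    """Hypothetical working model: intercept, state, treatment, interaction."""
    return np.column_stack([np.ones_like(x), x, a, a * x])

def q_learning_two_stage(x1, a1, y1, x2, a2, y2):
    """Standard two-stage Q-learning with treatments coded -1/+1."""
    # Last stage first: regress the stage-2 reward on stage-2 state and treatment.
    beta2 = fit_linear(design(x2, a2), y2)
    # Model-based stage-2 value: best predicted reward over the two treatments.
    q2_plus = design(x2, np.ones_like(x2)) @ beta2
    q2_minus = design(x2, -np.ones_like(x2)) @ beta2
    v2 = np.maximum(q2_plus, q2_minus)
    # Stage 1: the pseudo-outcome is the observed stage-1 reward plus the
    # predicted optimal stage-2 value, so stage-2 model error carries backward.
    beta1 = fit_linear(design(x1, a1), y1 + v2)
    return beta1, beta2

def optimal_action(beta, x):
    """Treatment maximizing the fitted Q-function at state x."""
    return np.where(beta[2] + beta[3] * x >= 0, 1.0, -1.0)

# Simulated (assumed) data: randomized treatments, linear rewards.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
a1 = rng.choice([-1.0, 1.0], size=n)
y1 = x1 + a1 * (0.5 - x1) + rng.normal(size=n)
x2 = 0.5 * x1 + 0.3 * a1 + 0.5 * rng.normal(size=n)
a2 = rng.choice([-1.0, 1.0], size=n)
y2 = x2 + a2 * x2 + rng.normal(size=n)  # true optimal a2 is sign(x2)

beta1, beta2 = q_learning_two_stage(x1, a1, y1, x2, a2, y2)
print("stage-2 coefficients:", beta2)
print("stage-1 coefficients:", beta1)
```

In this sketch the stage-2 working model matches the simulated reward, but the stage-1 pseudo-outcome includes a maximum over treatments and is therefore not linear in the stage-1 state. That is one simple way the stage-1 model can be misspecified, which is the kind of situation in which the abstract says bias can accumulate across stages.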
