Journal
BIOMETRICS
Volume 74, Issue 3, Pages 891-899Publisher
WILEY
DOI: 10.1111/biom.12836
Keywords
A-learning; Augmented inverse probability weighted estimator; CART; Dynamic treatment regime; Precision medicine; Q-learning
Funding
- National Natural Science Foundation of China [71701120, 61472205]
- Program for Innovative Research Team of Shanghai University of Finance and Economics
Ask authors/readers for more resources
A dynamic treatment regime is a sequence of decision rules, each corresponding to a decision point, that determine that next treatment based on each individual's own available characteristics and treatment history up to that point. We show that identifying the optimal dynamic treatment regime can be recast as a sequential optimization problem and propose a direct sequential optimization method to estimate the optimal treatment regimes. In particular, at each decision point, the optimization is equivalent to sequentially minimizing a weighted expected misclassification error. Based on this classification perspective, we propose a powerful and flexible C-learning algorithm to learn the optimal dynamic treatment regimes backward sequentially from the last stage until the first stage. C-learning is a direct optimization method that directly targets optimizing decision rules by exploiting powerful optimization/classification techniques and it allows incorporation of patient's characteristics and treatment history to improve performance, hence enjoying advantages of both the traditional outcome regression-based methods (Q- and A-learning) and the more recent direct optimization methods. The superior performance and flexibility of the proposed methods are illustrated through extensive simulation studies.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available