Article

Online Learning and Optimization of (Some) Cyclic Pricing Policies in the Presence of Patient Customers

M&SOM-MANUFACTURING & SERVICE OPERATIONS MANAGEMENT (2022)

Journal

M&SOM-MANUFACTURING & SERVICE OPERATIONS MANAGEMENT

Volume 24, Issue 2, Pages 1165-1182

Publisher

INFORMS

DOI: 10.1287/msom.2021.0979

Keywords

pricing; customer behavior; dynamic programming; nonparametric algorithms; asymptotic analysis

Categories

Management; Operations Research & Management Science

Summary

The paper studies revenue management with patient customers from a joint learning and optimization perspective, in a setting where the seller does not know the joint distribution of customers' valuations and patience levels. It introduces new dynamic programming formulations and two upper confidence bound-based learning algorithms, and it analyzes both decreasing cyclic policies and threshold-regulated policies. The learning algorithms converge to the optimal clairvoyant policies at a near-optimal rate. They perform significantly better than benchmark algorithms that either ignore customer patience or use a standard estimate-then-optimize framework, which highlights the importance of smart learning in data-driven decision making. Combining the algorithms with smart estimation methods, such as linear interpolation or least squares estimation, further improves their empirical performance and practical viability.

Abstract

Problem definition: We consider the problem of joint learning and optimization of cyclic pricing policies in the presence of patient customers. In our problem, some customers are patient, and they are willing to wait in the system for several periods to make a purchase until the price is lower than their valuation. The seller does not know the joint distribution of customers' valuation and patience level a priori and can only learn this from the realized total sales in every period.

Academic/practical relevance: The revenue management problem with patient customers has been studied in the literature as an optimization problem, and cyclic policies have been shown to be optimal in some cases. We contribute to the literature by studying this problem from the joint learning and optimization perspective. Indeed, to the best of our knowledge, our paper is the first work that studies online learning and optimization for multiperiod pricing with patient customers.

Methodology: We introduce new dynamic programming formulations for this problem, and we develop two nontrivial upper confidence bound-based learning algorithms.

Results: We analyze both decreasing cyclic policies and so-called threshold-regulated policies, which contain both the decreasing cyclic policies and the nested decreasing cyclic policies. We show that our learning algorithms for these policies converge to the optimal clairvoyant decreasing cyclic policy and threshold-regulated policy at a near-optimal rate.

Managerial implications: Our proposed algorithms perform significantly better than benchmark algorithms that either ignore the patient customer characteristic or simply use the standard estimate-then-optimize framework, which does not encourage enough exploration; this highlights the importance of smart learning in the context of data-driven decision making. In addition, our numerical results show that combining our algorithms with smart estimation methods, such as linear interpolation or least squares estimation, can significantly improve their empirical performance; this highlights the benefit of combining smart learning with smart estimation, which further increases the practical viability of the algorithms.
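
To make the customer model in the abstract concrete, the following is a minimal sketch of how patient customers might respond to a decreasing cyclic price path. It is not the paper's algorithm or its dynamic programming formulation. The function name `simulate_cyclic_pricing`, the `(arrival, valuation, patience)` tuple layout, the sample prices, and the purchase rule (buy in the first period where the price is at most the valuation) are all illustrative assumptions.

```python
from typing import List, Tuple

# Hypothetical customer record: (arrival period, valuation, patience).
# Patience is the number of extra periods the customer will wait.
Customer = Tuple[int, float, int]


def simulate_cyclic_pricing(
    cycle: List[float], customers: List[Customer], horizon: int
) -> Tuple[List[int], float]:
    """Return (sales per period, total revenue) under a repeating price cycle.

    Assumption: a customer buys in the first period of their waiting window
    in which the posted price is at most their valuation, then leaves.
    """
    sales = [0] * horizon
    revenue = 0.0
    for arrival, valuation, patience in customers:
        last_period = min(arrival + patience, horizon - 1)
        for t in range(arrival, last_period + 1):
            price = cycle[t % len(cycle)]  # the price path repeats every len(cycle) periods
            if price <= valuation:
                sales[t] += 1
                revenue += price
                break  # each customer buys at most once
    return sales, revenue


# A decreasing cycle of three prices, repeated over six periods.
cycle = [10.0, 8.0, 6.0]
customers = [
    (0, 9.0, 0),   # impatient: sees price 10 only and leaves
    (0, 9.0, 1),   # waits one period and buys at 8
    (1, 6.5, 1),   # waits one period and buys at 6
    (3, 12.0, 2),  # buys immediately at 10
    (4, 5.0, 3),   # never sees a price at or below 5
]
sales, revenue = simulate_cyclic_pricing(cycle, customers, horizon=6)
print(sales, revenue)
```

In this sketch the seller would observe only the `sales` list, which matches the abstract's statement that learning relies on realized total sales in every period; individual valuations and patience levels stay hidden.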
