Article

Bayesian dynamic learning and pricing with strategic customers

PRODUCTION AND OPERATIONS MANAGEMENT (2022)

Journal

PRODUCTION AND OPERATIONS MANAGEMENT

Volume 31, Issue 8, Pages 3125-3142

Publisher

WILEY

DOI: 10.1111/poms.13741

Keywords

Bayesian learning; dynamic game; pricing; strategic customers

Categories

Engineering, Manufacturing; Operations Research & Management Science

Funding

National Natural Science Foundation of China (NSFC) [NSFC-71971132, 72192832]
NSFC [NSFC-72150002]

AI Summary

This study investigates how a seller learns about a strategically behaving customer in personalized revenue management. It finds that a commonly used policy, the myopic Bayesian policy (MBP), may result in incomplete learning of the customer's true type in certain cases. By analyzing a Stackelberg game in a two-period model, it shows that the seller's optimal policy can significantly reduce regret relative to the myopic policy. Building on the two-period model, it proposes a randomized Bayesian policy and a deterministic Bayesian policy that let the seller learn the customer's type exponentially fast and keep the regret bounded by a constant even when the customer is strategic.

Abstract

We consider a seller who repeatedly sells a nondurable product to a single customer whose valuations of the product are drawn from a certain distribution. The seller, who initially does not know the valuation distribution, may use the customer's purchase history to learn it and wishes to choose a pricing policy that maximizes her long-run revenue. Such a problem is at the core of personalized revenue management, where the seller can access each customer's individual purchase history and offer personalized prices. In this paper, we study such a learning problem when the customer is aware of the seller's policy and thus may behave strategically when making a purchase decision. Using a Bayesian setting with a binary prior, we first show that a popular policy in this setting, the myopic Bayesian policy (MBP), may lead to incomplete learning by the seller: the seller may never be able to ascertain the true type of the customer, and the regret may grow linearly over time. The failure of the MBP is due to the strategic action taken by the customer. To address the customer's strategic behavior, we first analyze a Stackelberg game under a two-period model. We derive the optimal policy of the seller in the two-period model and show that the regret can be significantly reduced by using the optimal policy rather than the myopic policy. However, such a game is hard to analyze in general. Nevertheless, based on the idea used in the two-period model, we propose a randomized Bayesian policy (RBP), in which the seller updates her posterior belief about the customer in each period with a certain probability, as well as a deterministic Bayesian policy (DBP), in which the seller updates the posterior belief periodically and always defers her update to the next cycle. For both the RBP and the DBP, we show that the seller can learn the customer type exponentially fast even if the customer is strategic, and the regret is bounded by a constant. We also propose policies that achieve asymptotically optimal regrets when only a finite number of price changes are allowed.
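
To make the binary-prior setting concrete, the following is a minimal sketch of how a seller's belief could be updated from a single purchase decision. It is not the paper's model or algorithm: the exponential valuation distributions, the means `mean_low` and `mean_high`, and all function names are assumptions introduced only for illustration, and the customer here is assumed to buy whenever her valuation exceeds the price (that is, she is not strategic). The `randomized_update` helper only loosely mirrors the abstract's description of updating "with a certain probability".

```python
import math
import random


def purchase_prob(price, mean_valuation):
    """P(valuation >= price) for an exponential valuation (assumed model)."""
    return math.exp(-price / mean_valuation)


def bayes_update(belief_high, price, purchased, mean_low=1.0, mean_high=2.0):
    """Posterior probability that the customer is the 'high' type.

    Assumes price > 0 and a non-strategic customer who buys exactly
    when her valuation is at least the posted price.
    """
    like_high = purchase_prob(price, mean_high)
    like_low = purchase_prob(price, mean_low)
    if not purchased:
        # Likelihood of observing "no purchase" under each type.
        like_high, like_low = 1.0 - like_high, 1.0 - like_low
    numerator = belief_high * like_high
    return numerator / (numerator + (1.0 - belief_high) * like_low)


def randomized_update(belief_high, price, purchased, update_prob, rng):
    """Apply the Bayes update only with probability update_prob.

    Hypothetical illustration of a randomized update rule; otherwise
    the belief is left unchanged for this period.
    """
    if rng.random() < update_prob:
        return bayes_update(belief_high, price, purchased)
    return belief_high


if __name__ == "__main__":
    rng = random.Random(0)
    belief = 0.5
    for purchased in [True, True, False, True]:
        belief = randomized_update(belief, 1.0, purchased, 0.5, rng)
        print(round(belief, 4))
```

In this sketch a purchase at a given price raises the belief in the high type and a refusal lowers it. A customer who knows the update rule could refuse profitable purchases to push the belief down and obtain lower future prices, which is the kind of strategic behavior the abstract says defeats the myopic policy.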
