Article

Default prediction in P2P lending from high-dimensional data based on machine learning

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS (2019)

Journal

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS

Volume 534, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.physa.2019.122370

Keywords

Default prediction; High-dimensional data; Imbalanced data; Machine learning; P2P lending

Categories

Physics, Multidisciplinary

Funding

National Natural Science Foundation of China [91846107, 71571058, 61773286, 71690235]
Fundamental Research Funds for the Central Universities [PA2019GDQT0021]

Abstract

In recent years, peer-to-peer (P2P) lending, a new Internet-based unsecured credit model, has flourished and become a successful complement to the traditional credit business. However, credit risk remains inevitable. A key challenge for a P2P lending platform is building a default prediction model that can accurately estimate the default probability of each loan. Because P2P lending credit data are high-dimensional and class-imbalanced, conventional statistical models and machine learning algorithms cannot predict default probability accurately. To address this issue, this paper proposes a heterogeneous ensemble default prediction model based on decision tree models for predicting customer default in P2P lending. Gradient boosting decision trees (GBDT), extreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM) are employed as the individual classifiers of the heterogeneous ensemble. Model-based feature ranking is applied to the P2P lending credit data, and the individual classifiers undergo hyperparameter optimization. Finally, a comparison with benchmark models shows that the proposed model achieves desirable prediction results and thus effectively addresses the challenge of prediction from high-dimensional and imbalanced data. (C) 2019 Elsevier B.V. All rights reserved.
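
The abstract names the three individual classifiers but does not say how their outputs are combined. As a minimal sketch, assuming a soft-voting rule (a weighted average of each model's predicted default probability), the combination step might look like the following. The function name `soft_vote`, the weights, the 0.5 decision threshold, and the probability values are hypothetical placeholders for illustration, not the paper's method or results.

```python
from typing import List, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-loan default probabilities from several classifiers.

    model_probs[m][i] is model m's predicted default probability for loan i.
    Assumption: the ensemble is a weighted average (soft voting); the
    abstract does not specify the actual combination rule.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_loans = len(model_probs[0])
    if any(len(p) != n_loans for p in model_probs):
        raise ValueError("all models must score the same loans")
    if weights is None:
        weights = [1.0] * len(model_probs)  # equal weights by default
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the models' probabilities, loan by loan
    return [
        sum(w * p[i] for w, p in zip(weights, model_probs)) / total
        for i in range(n_loans)
    ]


# Hypothetical scores standing in for GBDT, XGBoost and LightGBM outputs
gbdt_probs = [0.10, 0.80, 0.30]
xgb_probs = [0.20, 0.70, 0.40]
lgbm_probs = [0.30, 0.90, 0.20]

ensemble_probs = soft_vote([gbdt_probs, xgb_probs, lgbm_probs])
# Flag a loan as a predicted default when the averaged probability reaches 0.5
predicted_default = [p >= 0.5 for p in ensemble_probs]
```

With class-imbalanced data, a threshold other than 0.5 or unequal model weights may be more appropriate; both would normally be chosen on a validation set rather than fixed in advance.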
