Article

The value of text for small business default prediction: A Deep Learning approach

Journal

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
Volume 295, Issue 2, Pages 758-771

Publisher

ELSEVIER
DOI: 10.1016/j.ejor.2021.03.008

Keywords

OR in banking; Risk analysis; Deep Learning; Text mining; Small business lending

Funding

  1. Economic and Social Research Council [ES/P000673/1]
  2. Natural Sciences and Engineering Research Council of Canada (NSERC) [RGPIN202007114]
  3. Canada Research Chairs program

Compared to consumer lending, mSME credit risk modelling is more challenging because less data is available, so loan officers routinely provide a textual loan assessment. Deep Learning and NLP techniques, including the BERT model, are used to extract information from these assessments, and the text alone proves surprisingly effective at predicting default. However, combining text with traditional data yields no additional predictive capability, and performance depends on text length. The proposed Deep Learning model appears robust to text quality and is therefore suitable for partly automating the mSME lending process.
Compared to consumer lending, Micro, Small and Medium Enterprise (mSME) credit risk modelling is particularly challenging, as, often, the same sources of information are not available. Therefore, it is standard policy for a loan officer to provide a textual loan assessment to mitigate limited data availability. In turn, this statement is analysed by a credit expert alongside any available standard credit data. In our paper, we exploit recent advances from the field of Deep Learning and Natural Language Processing (NLP), including the BERT (Bidirectional Encoder Representations from Transformers) model, to extract information from 60,000 textual assessments provided by a lender. We consider the performance in terms of the AUC (Area Under the receiver operating characteristic Curve) and Brier Score metrics and find that the text alone is surprisingly effective for predicting default. However, when combined with traditional data, it yields no additional predictive capability, with performance dependent on the text's length. Our proposed Deep Learning model does, however, appear to be robust to the quality of the text and therefore suitable for partly automating the mSME lending process. We also demonstrate how the content of loan assessments influences performance, leading us to a series of recommendations on a new strategy for collecting future mSME loan assessments. (c) 2021 Elsevier B.V. All rights reserved.
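The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The labels and predicted default probabilities below are invented toy values, not data or results from the paper, and the helper names `auc` and `brier_score` are hypothetical. The AUC is computed here through its pairwise-ranking interpretation (the share of defaulter/non-defaulter pairs in which the defaulter receives the higher score, with ties counted as half), which is one standard way to compute it and not necessarily the procedure the authors used.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the fraction of (default, non-default) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]  # scores of defaulted loans
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]  # scores of repaid loans
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = default, 0 = no default.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(auc(y_true, y_prob))
print(brier_score(y_true, y_prob))
```

AUC measures only how well the model ranks defaulters above non-defaulters, while the Brier Score also penalises poorly calibrated probabilities, which is presumably why the two are reported together.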

