Article

On the combination of graph data for assessing thin-file borrowers' creditworthiness

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 213

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2022.118809

Keywords

Credit scoring; Machine learning; Social network analysis; Network data; Graph neural networks

This paper introduces an information-processing framework that combines feature engineering, graph embeddings, and graph neural networks to improve credit scoring models. The results show that this approach outperforms traditional methods in assessing creditworthiness using social interaction data. Additionally, in the field of corporate lending, considering the relationships between companies and other entities is crucial for evaluating thin-file companies. The study also highlights the significant value of graph data in helping companies with little or no credit history enter the financial system.
Thin-file borrowers are customers whose creditworthiness is hard to assess because they lack a credit history. To compensate for the missing credit information, many researchers have used borrowers' social interactions as an alternative data source. Social network data has traditionally been exploited through hand-crafted feature engineering, but graph neural networks have recently emerged as a promising alternative. Here we introduce an information-processing framework that improves credit scoring models by blending several methods of graph representation learning: feature engineering, graph embeddings, and graph neural networks. In this approach, we aggregate the methods' outputs and feed them to a gradient boosting classifier, which produces the final creditworthiness score. We validated the framework on a unique multi-source dataset that characterizes the relationships, interactions, and credit history of the entire population of a Latin American country, applying it to both application and behavioral credit risk models. The dataset also allows us to study both individuals and companies. Our results show that the methods of graph representation learning should be used as complements; they should not be treated as self-sufficient methods, as is currently done. We improve creditworthiness assessment performance as measured by the Area Under the ROC Curve (AUC) and the Kolmogorov-Smirnov (KS) statistic, outperforming traditional methods of exploiting social interaction data. In corporate lending, where the potential gain is much higher, our results confirm that the evaluation of a thin-file company cannot rely solely on the company's own characteristics. The business ecosystem in which these companies interact with their owners, suppliers, customers, and other companies provides novel knowledge that enables financial institutions to enhance their creditworthiness assessment.
Our results indicate when and for which population graph data should be used, and what effect on performance to expect. They also show the enormous value of graph data for the credit scoring problem for thin-file borrowers, chiefly in helping companies with thin or no credit history enter the financial system.
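The abstract reports performance as AUC and KS. As a minimal sketch of how these two measures can be computed from a model's scores, the Python below uses only the standard library. The function names, the toy labels and scores, and the convention that label 1 marks a defaulting borrower and a higher score means higher predicted risk are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence, Tuple, List


def _split_by_label(labels: Sequence[int],
                    scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split scores into (positives, negatives); label 1 = default (assumed)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return pos, neg


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos, neg = _split_by_label(labels, scores)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ks_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """KS as the largest gap between the empirical score distributions
    (CDFs) of the positive and the negative class."""
    pos, neg = _split_by_label(labels, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Toy data, invented for illustration only.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]

toy_auc = auc(toy_labels, toy_scores)          # 7 of 9 pairs ranked correctly
toy_ks = ks_statistic(toy_labels, toy_scores)  # largest CDF gap is 2/3
print(f"AUC = {toy_auc:.3f}, KS = {toy_ks:.3f}")
```

The pairwise AUC loop is quadratic in the number of borrowers, which is fine for a toy example; on a population-scale dataset a rank-based implementation from an established library would be the more sensible choice.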
