Proceedings Paper

Assessment of Creditworthiness Models Privacy-Preserving Training with Synthetic Data

Journal

HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2022
Volume 13469, Pages 375-384

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-031-15471-3_32

Keywords

Credit scoring; Synthetic data; Generative adversarial networks; Variational autoencoders

Funding

  1. CONICYT-PFCHA/DOCTORADO BECAS CHILE [201921190345]
  2. Natural Sciences and Engineering Research Council of Canada (NSERC) [RGPIN-202007114]
  3. Canada Research Chairs program
  4. MICIN [PID2020-116346GB-I00]

Credit scoring models are the primary instrument used by financial institutions to manage credit risk. However, research on behavioral scoring is scarce due to difficulties in data access. This study presents a methodology for evaluating model performance when trained with synthetic data and applied to real-world data. Results show that the quality of synthetic data decreases as the number of attributes increases, and models trained with synthetic data show a reduction of 3% in AUC and 6% in KS compared to those trained with real data.
Credit scoring models are the primary instrument used by financial institutions to manage credit risk. Research on behavioral scoring is scarce because data access is difficult: financial institutions must maintain the privacy and security of borrowers' information, which keeps them from collaborating in research initiatives. In this work, we present a methodology for evaluating the performance of models trained with synthetic data when they are applied to real-world data. Our results show that synthetic data quality deteriorates as the number of attributes increases. Nevertheless, creditworthiness assessment models trained with synthetic data show a reduction of 3% in AUC and 6% in KS compared with models trained with real data. These results are significant because they encourage credit risk research based on synthetic data, making it possible to maintain borrowers' privacy and to address problems that until now have been hampered by the limited availability of information.
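
The abstract compares models by AUC and KS. As a rough illustration of how those two metrics and a performance drop might be computed on a held-out real-world test set, here is a minimal standard-library sketch. The function names, the pairwise AUC formulation, and the reading of "reduction" as a relative drop are assumptions made for illustration; the record does not state how the authors computed these figures.

```python
from typing import Sequence, Tuple


def _split_by_class(labels: Sequence[int], scores: Sequence[float]) -> Tuple[list, list]:
    """Separate scores of positives (label 1) from negatives (label 0)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return pos, neg


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos, neg = _split_by_class(labels, scores)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """KS as the largest gap between the empirical score CDFs of the two classes."""
    pos, neg = _split_by_class(labels, scores)
    gaps = []
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)


def relative_drop(real_metric: float, synthetic_metric: float) -> float:
    """Assumed definition: fraction of the real-data metric lost by the synthetic-data model."""
    return (real_metric - synthetic_metric) / real_metric


# Hypothetical toy data, not taken from the paper: labels and scores on a real test set.
y_true = [0, 0, 1, 1]
scores_from_model = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, scores_from_model), ks_statistic(y_true, scores_from_model))
```

In a train-on-synthetic, test-on-real setup, both models would be scored on the same real test set, and `relative_drop` would then be applied to each metric pair.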
