4.7 Article

Instance-dependent cost-sensitive learning for detecting transfer fraud

Journal

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
Volume 297, Issue 1, Pages 291-300

Publisher

ELSEVIER
DOI: 10.1016/j.ejor.2021.05.028

Keywords

Decision analysis; Fraud detection; Cost-based model evaluation; Cost-sensitive classification

Funding

  1. BNP Paribas Fortis Chair in Fraud Analytics and Internal Funds KU Leuven [C16/15/068]

Credit card transaction fraud is a global issue, and financial institutions are increasingly relying on data-driven methods to develop fraud detection systems for detecting and preventing fraudulent transactions. The article introduces two novel classifiers that minimize the instance-dependent cost measure when learning a classification model, highlighting the potential to reduce fraud losses.
Card transaction fraud is a growing problem affecting card holders worldwide. Financial institutions increasingly rely on data-driven methods to develop fraud detection systems that automatically detect and block fraudulent transactions. From a machine learning perspective, detecting fraudulent transactions is a binary classification problem. Classification models are commonly trained and evaluated in terms of statistical performance measures, such as likelihood and AUC, respectively. These measures, however, do not take into account the actual business objective, which is to minimize the financial losses due to fraud. Fraud detection should instead be treated as an instance-dependent cost-sensitive classification problem, in which the costs of misclassification vary between instances and which therefore requires adapted approaches for learning a classification model. In this article, an instance-dependent threshold is derived from the instance-dependent cost matrix for transfer fraud detection; it allows the optimal cost-based decision to be made for each transaction. Two novel classifiers are presented, based on lasso-regularized logistic regression and gradient tree boosting, which directly minimize the proposed instance-dependent cost measure when learning a classification model. The proposed methods are implemented in the R packages cslogit and csboost, and are compared against state-of-the-art methods on a publicly available data set from the machine learning competition website Kaggle and on a proprietary card transaction data set. The results of the experiments highlight the potential of reducing fraud losses by adopting the proposed methods. (C) 2021 Elsevier B.V. All rights reserved.
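
The idea of an instance-dependent threshold can be illustrated with a minimal Python sketch. It is not the code of cslogit or csboost, and the abstract does not spell out the cost matrix, so the one used here is an assumption: investigating a flagged transaction costs a fixed amount whether or not it turns out to be fraud, a missed fraud costs the transferred amount, and a correctly ignored legitimate transaction costs nothing. The function names, the parameter names and the default fixed cost of 10 are hypothetical. The threshold follows from the standard minimum-expected-cost rule: flag the transaction when the expected cost of flagging does not exceed the expected cost of not flagging.

```python
def cost_threshold(c_tp, c_fp, c_fn, c_tn):
    """Probability threshold that minimizes expected cost for one instance.

    Flag when  p*c_tp + (1-p)*c_fp <= p*c_fn + (1-p)*c_tn,
    i.e. when  p >= (c_fp - c_tn) / ((c_fp - c_tn) + (c_fn - c_tp)).
    """
    denom = (c_fp - c_tn) + (c_fn - c_tp)
    if denom <= 0:
        raise ValueError("cost matrix does not yield a valid threshold")
    return (c_fp - c_tn) / denom


def flag_transaction(p_fraud, amount, fixed_cost=10.0):
    """Decide whether to flag one transaction.

    Assumed (hypothetical) cost matrix for this sketch:
      true positive  -> fixed_cost (investigation)
      false positive -> fixed_cost (investigation)
      false negative -> amount     (money lost to fraud)
      true negative  -> 0
    Under these costs the threshold reduces to fixed_cost / amount.
    """
    threshold = cost_threshold(
        c_tp=fixed_cost, c_fp=fixed_cost, c_fn=amount, c_tn=0.0
    )
    return p_fraud >= threshold


if __name__ == "__main__":
    # Same predicted fraud probability, different amounts, different decisions.
    for amount in (20.0, 1000.0):
        print(amount, flag_transaction(p_fraud=0.05, amount=amount))
```

Under these assumed costs, a transaction with a 5% predicted fraud probability is flagged when it moves 1000 but not when it moves 20. A single global threshold cannot express that difference.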
