A principle component analysis-based random forest with the potential nearest neighbor method for automobile insurance fraud identification

APPLIED SOFT COMPUTING (2018)

Journal

APPLIED SOFT COMPUTING

Volume 70, Pages 1000-1009

Publisher

ELSEVIER

DOI: 10.1016/j.asoc.2017.07.027

Keywords

Ensemble; Random forest; Principle component analysis; Potential nearest neighbors; Voting mechanism; Automobile insurance fraud

Funding

Project of the National Natural Science Foundation of China [61502280, 61472228]
Project of Qingdao Applied Basic Research of Qingdao [14-2-4-55-jch]
Natural Science Foundation of Shandong province [ZR2014FM009]
Graduate Education Innovation Program Project of Shandong University of Science and Technology [KDYC14016]

Abstract

As a successful ensemble method, Random Forest has attracted much attention. In this paper, individual classifiers are appropriately combined and a multiple classifier system with an increase in classification accuracy is presented. According to Breiman's methodology, we propose a multiple classifier system based on the Random Forest, Principle Component Analysis and Potential Nearest Neighbor methods. As Breiman suggested, the performance of the Random Forest depends on the strength of the weak learners in the forest and the diversity among them. The Principle Component Analysis method is applied to transform data at each node to another space when computing the best split at this node. This process increases the diversity of each tree in the forest and thereby improves the overall accuracy. The Random Forest is studied through the perspective of the Adaptive Nearest Neighbor. We introduce the concept of monotone distance measures and potential nearest neighbors and show that the Random Forest can be viewed as an adaptive learning mechanism of k Potential Nearest Neighbors. Considering the information loss caused by out-of-bag samples, a new voting mechanism based on Potential Nearest Neighbor is also presented to replace the traditional majority vote. The proposed algorithm improves the classification accuracy of the ensemble classifier by increasing the diversity of the base classifiers. The performance of the proposed method is compared with those of the Oblique Decision Tree Ensemble, Rotation Forest and basic Random Forest on the data sets. The experimental results show that the proposed method produces a better classification accuracy and lower variance. The proposed method is also applied to detect automobile insurance fraud, and the fraud rules are obtained. © 2017 Elsevier B.V. All rights reserved.
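The node-level idea in the abstract (transform the data at a node with Principle Component Analysis, then search for the best split in the transformed space) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the split criterion or how features are sampled, so the Gini impurity criterion, the use of all components, and the helper names `gini`, `pca_rotation` and `best_pca_split` are assumptions made purely for illustration.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def pca_rotation(X):
    """Return (mean, components); columns of `components` are principal axes."""
    mean = X.mean(axis=0)
    # SVD of the centred data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt.T

def best_pca_split(X, y):
    """Search single-threshold splits in the PCA-rotated space of one node.

    Assumed criterion: weighted Gini impurity of the two children.
    Returns (component_index, threshold, impurity, mean, components).
    """
    mean, comps = pca_rotation(X)
    Z = (X - mean) @ comps          # node samples expressed in PCA coordinates
    best = (None, None, np.inf)
    n = len(y)
    for j in range(Z.shape[1]):
        values = np.unique(Z[:, j])
        # Candidate thresholds are midpoints between consecutive sorted values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = Z[:, j] <= t
            score = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / n
            if score < best[2]:
                best = (j, t, score)
    return best + (mean, comps)

# Toy node: the two classes are separated along the diagonal direction (1, 1),
# so no single threshold on a raw feature separates them perfectly.
grid = [(t, s) for t in range(-3, 4) for s in (-0.5, 0.5)]
X = np.array([[t + s, -t + s] for t, s in grid])
y = np.array([int(s > 0) for _, s in grid])

j, threshold, impurity, mean, comps = best_pca_split(X, y)
print(j, threshold, impurity)
```

On this toy node the rotated space exposes a pure split along the second principal axis, which is the kind of oblique boundary an axis-aligned tree would need several levels to approximate. In a full forest, applying such a transform at every node of every tree would be one source of the extra diversity the abstract refers to.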
