3.8 Proceedings Paper

Detecting Malicious Accounts in Online Developer Communities Using Deep Learning

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3357384.3357971

Keywords

Online Developer Community; Malicious Account Detection; Deep Learning; Social Networks

Funding

  1. National Natural Science Foundation of China [61602122, 71731004, 61572278, U1736209]
  2. Research Grants Council of Hong Kong [16214817]
  3. Academy of Finland

Ask authors/readers for more resources

Online developer communities like GitHub provide services such as distributed version control and task management, which allow a massive number of developers to collaborate online. However, the openness of the communities makes themselves vulnerable to different types of malicious attacks, since the attackers can easily join and interact with legitimate users. In this work, we formulate the malicious account detection problem in online developer communities, and propose GitSec, a deep learning-based solution to detect malicious accounts. GitSec distinguishes malicious accounts from legitimate ones based on the account profiles as well as dynamic activity characteristics. On one hand, GitSec makes use of users' descriptive features from the profiles. On the other hand, GitSec processes users' dynamic behavioral data by constructing two user activity sequences and applying a parallel neural network design to deal with each of them, respectively. An attention mechanism is used to integrate the information generated by the parallel neural networks. The final judgement is made by a decision maker implemented by a supervised machine learning-based classifier. Based on the real-world data of GitHub users, our extensive evaluations show that GitSec is an accurate detection system, with an F1-score of 0.922 and an AUC value of 0.940.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

3.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available