Article

Protocol clustering of unknown traffic based on embedding of protocol specification

Journal

COMPUTERS & SECURITY
Volume 136

Publisher

ELSEVIER ADVANCED TECHNOLOGY
DOI: 10.1016/j.cose.2023.103575

Keywords

Private protocols; Unknown traffic; Protocol reverse engineering; Embedding; Unsupervised clustering


Protocol Reverse Engineering (PRE) is a direct approach for analyzing unknown traffic. This paper proposes FEAC, a method for clustering unknown traffic into groups labeled by private protocol, and experimental results on real-world network traffic demonstrate its advantages.
Protocol Reverse Engineering (PRE) has been widely studied in recent years as the most direct approach for analyzing unknown traffic, which is predominantly generated by private protocols. As private protocols proliferate, a growing share of network traffic is unknown, and supervised learning methods struggle to obtain effective models when prior knowledge is absent. Furthermore, unknown traffic captured in real-world environments is mixed, so it cannot be passed directly to PRE for further analysis because it lacks labels identifying the private protocols. To address this issue, we propose FEAC, an approach that divides unknown traffic into clusters, each labeled with a different private protocol. First, we propose a general structure of protocol specification, derived from an extensive investigation of protocols. Then, the unknown traffic is characterized as a Protocol Specification Fusion Vector (PSFV) based on word embedding, which fuses the multidimensional information of that protocol specification. Next, representation learning refines the PSFVs and compresses their dimension, reducing computational complexity. Finally, we combine the refined PSFVs with the DBSCAN algorithm to cluster unknown traffic by protocol, improving the ability of PRE to analyze unknown traffic. Comprehensive comparative experiments on real-world network traffic demonstrate that FEAC achieves ideal clustering performance and has advantages over previous work.
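The three stages named in the abstract (vectorize, compress, cluster with DBSCAN) can be illustrated with a minimal pure-Python sketch. This is not the authors' FEAC implementation: the byte n-gram hashing in `featurize` is only a stand-in for the word-embedding-based PSFV, the variance-based selection in `reduce_dims` is a crude stand-in for representation learning, and every function name, parameter, and default value here is an assumption made for illustration.

```python
import math
import zlib


def featurize(payload, dim=32, n=2):
    """Hash byte n-grams into a fixed-size, L2-normalised vector.

    Assumption: a hashing-trick stand-in for the PSFV, not the paper's embedding.
    """
    vec = [0.0] * dim
    for i in range(len(payload) - n + 1):
        vec[zlib.crc32(payload[i:i + n]) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def reduce_dims(vectors, k):
    """Keep the k highest-variance coordinates.

    Assumption: a crude stand-in for the representation-learning step.
    """
    dim, count = len(vectors[0]), len(vectors)
    means = [sum(v[d] for v in vectors) / count for d in range(dim)]
    var = [sum((v[d] - means[d]) ** 2 for v in vectors) / count for d in range(dim)]
    keep = sorted(range(dim), key=lambda d: var[d], reverse=True)[:k]
    return [[v[d] for d in keep] for v in vectors]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN with Euclidean distance; returns one label per point, -1 = noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(reachable)
    return labels


def cluster_traffic(payloads, k=8, eps=0.5, min_pts=2):
    """Vectorize, compress, and cluster raw payloads (all defaults are illustrative)."""
    vectors = reduce_dims([featurize(p) for p in payloads], k)
    return dbscan(vectors, eps, min_pts)
```

Calling `cluster_traffic` on a list of `bytes` payloads returns one integer label per payload, with `-1` marking messages that DBSCAN treats as noise. DBSCAN suits this setting because the number of protocols in mixed traffic is not known in advance, and the algorithm does not require a cluster count as input.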
