4.6 Article

A Fake Online Repository Generation Engine for Cyber Deception

Journal

IEEE Transactions on Dependable and Secure Computing

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TDSC.2019.2898661

Keywords

Fake documents; online repository; cyber deception

Funding

  1. ONR [N00014-161-2896, N00014-15-1-2007, N00014-18-1-2670]
  2. ARO [W911NF-13-1-0421, W911NF-15-1-0576, W911NF-14-1-0358]
  3. Maryland Procurement Office [H9823014C0137]
  4. SERB, DST [ECR/2017/001691]
  5. Ramanujan Faculty Award
  6. Infosys center of AI at IIIT Delhi

This paper addresses the defense of enterprises that have already been hacked, by imposing additional a posteriori costs on the attacker. The FORGE system generates highly believable fake documents using multi-layer graphs and a new notion of meta-centrality.
Today, major corporations and government organizations must face the reality that they will be hacked by malicious actors. In this paper, we consider the case of defending enterprises that have already been successfully hacked, by imposing additional a posteriori costs on the attacker. Our idea is simple: for every real document d, we develop methods to automatically generate a set Fake(d) of fake documents that are very similar to d. An attacker who steals documents must then wade through a large number of them in detail to separate the real one from the fakes. Our FORGE system focuses on technical documents (e.g., engineering/design documents) and involves three major innovations. First, we represent the semantic content of documents via multi-layer graphs (MLGs). Second, we propose a novel concept of meta-centrality for multi-layer graphs. A meta-centrality (MC) measure takes a classical centrality measure (defined for ordinary graphs, not MLGs) as input and generalizes it to MLGs. The idea is to generate fake documents by selecting concepts on the basis of meta-centrality and replacing them with related concepts according to an ontology. Our third innovation is to show that the problem of generating the set Fake(d) of fakes can be viewed as an optimization problem. We prove that this problem is NP-complete and then develop efficient heuristics to solve it in practice. We ran detailed experiments on two datasets, one with a panel of 20 human subjects and the other with a panel of 10. Our results show that FORGE generates highly believable fakes.
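
The meta-centrality idea in the abstract can be illustrated with a small Python sketch. It is not the FORGE implementation: the abstract does not say how a classical centrality measure is lifted to an MLG, so the sketch assumes the simplest option, averaging per-layer scores. It also assumes that the most central concepts are the ones replaced and that the ontology is a plain mapping from a concept to related concepts. The names meta_centrality, make_fake, and the toy layers and ontology are all hypothetical.

```python
from typing import Callable, Dict, List, Set

Graph = Dict[str, Set[str]]          # one layer: concept -> neighbouring concepts
MultiLayerGraph = Dict[str, Graph]   # layer name -> graph over concepts


def degree_centrality(graph: Graph) -> Dict[str, float]:
    """Classical degree centrality for an ordinary (single-layer) graph."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def meta_centrality(
    mlg: MultiLayerGraph,
    centrality: Callable[[Graph], Dict[str, float]],
) -> Dict[str, float]:
    """Lift a single-layer centrality measure to a multi-layer graph.

    ASSUMPTION: the lifted score is the mean of the per-layer scores, and a
    concept missing from a layer contributes 0 for that layer.
    """
    concepts = {node for layer in mlg.values() for node in layer}
    totals = {concept: 0.0 for concept in concepts}
    for layer in mlg.values():
        for node, value in centrality(layer).items():
            totals[node] += value
    return {concept: total / len(mlg) for concept, total in totals.items()}


def make_fake(
    tokens: List[str],
    mlg: MultiLayerGraph,
    ontology: Dict[str, List[str]],
    k: int = 1,
) -> List[str]:
    """Replace the k highest-scoring concepts that have an ontology entry.

    ASSUMPTION: the first related concept listed in the ontology is used.
    """
    scores = meta_centrality(mlg, degree_centrality)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))  # ties broken by name
    targets = [c for c in ranked if ontology.get(c)][:k]
    substitution = {c: ontology[c][0] for c in targets}
    return [substitution.get(token, token) for token in tokens]


# Toy data, purely illustrative.
mlg = {
    "co_occurrence": {
        "rotor": {"shaft", "bearing"},
        "shaft": {"rotor"},
        "bearing": {"rotor"},
    },
    "dependency": {"rotor": {"shaft"}, "shaft": {"rotor"}, "bearing": set()},
}
ontology = {"rotor": ["impeller"], "shaft": ["axle"]}
fake = make_fake(["the", "rotor", "drives", "the", "shaft"], mlg, ontology, k=1)
print(" ".join(fake))
```

In this toy example "rotor" has the highest averaged score, so it is the only concept replaced when k is 1. Choosing which concepts to replace so that the whole set Fake(d) is good is the optimization problem the abstract describes as NP-complete; the greedy top-k selection here merely stands in for the paper's heuristics.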
