Article; Proceedings Paper

Robust fingerprinting of genomic databases

Journal

BIOINFORMATICS
Volume 38, Issue SUPPL 1, Pages 143-152

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btac243

Funding

  1. National Library of Medicine of the National Institutes of Health [R01LM013429]
  2. National Science Foundation (NSF) [2050410]
  3. Division Of Computer and Network Systems
  4. Direct For Computer & Info Scie & Enginr [2050410] Funding Source: National Science Foundation

This study aims to fill the gap in fingerprinting schemes for genomic databases and develop mitigation techniques against correlation attacks. Experimental results demonstrate that correlation attacks have a significant impact on fingerprinting schemes, but the proposed mitigation techniques effectively alleviate the attacks while preserving database utility.
Motivation: Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing a means to identify the source of data leakages. However, no existing fingerprinting scheme aims to provide liability guarantees when genomic databases are shared. We fill this gap by devising a vanilla fingerprinting scheme specifically for genomic databases. Moreover, malicious recipients of a genomic database may compromise the embedded fingerprint (i.e. distort the steganographic marks that encode the fingerprint bit-string) by launching correlation attacks, which leverage the intrinsic correlations among genomic data (e.g. Mendel's law and linkage disequilibrium). We therefore augment the vanilla scheme with mitigation techniques to achieve robust fingerprinting of genomic databases against correlation attacks.

Results: Via experiments on a real-world genomic database, we first show that correlation attacks against fingerprinting schemes for genomic databases are very powerful. In particular, the correlation attacks can distort more than half of the fingerprint bits while causing only a small utility loss (e.g. in database accuracy and in the consistency of SNP-phenotype associations measured via P-values). Next, we show experimentally that the proposed mitigation techniques effectively counter these attacks: the attacker can hardly compromise a large portion of the fingerprint bits even if it pays a higher cost in terms of degraded database utility. For example, with around 24% loss in accuracy and 20% loss in the consistency of SNP-phenotype associations, the attacker can distort only about 30% of the fingerprint bits, which is insufficient for it to avoid being accused. The proposed mitigation techniques also preserve the utility of the shared genomic databases, e.g. they lead to only around 3% loss in accuracy.
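The trade-off described in the abstract, between the fraction of fingerprint bits an attacker distorts and the utility it sacrifices, can be illustrated with a toy sketch. This is not the scheme from the paper: it assumes a generic keyed, parity-based embedding over SNP values in {0, 1, 2} and a plain random-flipping attack, which is much weaker than the correlation attacks the abstract describes. All function names, parameters (such as `gamma`), and the key are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
import random

SNP_VALUES = (0, 1, 2)  # assumed encoding: copies of the minor allele at each SNP


def _slot(key, row, col, gamma, fp_len):
    """Keyed pseudo-random decision for one cell: (selected, bit index, mask bit, tie bit)."""
    digest = hmac.new(key, f"{row}:{col}".encode(), hashlib.sha256).digest()
    h = int.from_bytes(digest, "big")
    selected = h % gamma == 0      # roughly 1 in gamma cells carries a mark
    h //= gamma
    index = h % fp_len             # which fingerprint bit this cell encodes
    h //= fp_len
    return selected, index, h & 1, (h >> 1) & 1


def embed(db, fingerprint, key, gamma=4):
    """Return a marked copy; the parity of each selected cell encodes fingerprint[index] XOR mask."""
    marked = [row[:] for row in db]
    for r, row in enumerate(marked):
        for c, value in enumerate(row):
            selected, index, mask, tie = _slot(key, r, c, gamma, len(fingerprint))
            if not selected:
                continue
            mark = fingerprint[index] ^ mask
            if value % 2 != mark:
                # 0 or 2 becomes 1 when an odd value is needed; 1 becomes 0 or 2 otherwise.
                row[c] = 1 if mark == 1 else 2 * tie
    return marked


def extract(db, fp_len, key, gamma=4):
    """Majority-vote each fingerprint bit; None when a bit has no votes or a tie."""
    votes = [[0, 0] for _ in range(fp_len)]
    for r, row in enumerate(db):
        for c, value in enumerate(row):
            selected, index, mask, _ = _slot(key, r, c, gamma, fp_len)
            if selected:
                votes[index][(value % 2) ^ mask] += 1
    return [None if zeros == ones else int(ones > zeros) for zeros, ones in votes]


def random_flip_attack(db, fraction, rng):
    """Replace each cell, with probability `fraction`, by a different random SNP value."""
    return [
        [rng.choice([v for v in SNP_VALUES if v != value]) if rng.random() < fraction else value
         for value in row]
        for row in db
    ]


def accuracy(db_a, db_b):
    """Fraction of cells on which two equally shaped databases agree."""
    cells = [(a, b) for row_a, row_b in zip(db_a, db_b) for a, b in zip(row_a, row_b)]
    return sum(a == b for a, b in cells) / len(cells)


def distorted_fraction(fingerprint, extracted):
    """Fraction of fingerprint bits recovered incorrectly (None counts as incorrect)."""
    return sum(e != f for e, f in zip(extracted, fingerprint)) / len(fingerprint)


# Synthetic data only; no real genomic database is used here.
rng = random.Random(0)
original = [[rng.choice(SNP_VALUES) for _ in range(50)] for _ in range(200)]
fingerprint = [rng.randint(0, 1) for _ in range(32)]
key = b"hypothetical-owner-secret"

marked = embed(original, fingerprint, key)
attacked = random_flip_attack(marked, 0.2, rng)
print("accuracy after marking:", accuracy(original, marked))
print("accuracy after attack:", accuracy(marked, attacked))
print("fingerprint bits distorted:", distorted_fraction(fingerprint, extract(attacked, 32, key)))
```

Because each fingerprint bit is spread over many cells and recovered by majority vote, an attacker who perturbs cells at random generally has to sacrifice a lot of accuracy to flip many bits. A correlation attack, as the abstract notes, would instead target cells that look inconsistent with known genomic correlations, which this sketch does not model.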
