Article

Anti-forensics of fake stereo audio using generative adversarial network

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 81, Issue 12, Pages 17155-17167

Publisher

SPRINGER
DOI: 10.1007/s11042-022-12448-4

Keywords

Generative adversarial network; Anti-forensics; Stereo faking

Funding

  1. National Natural Science Foundation of China [61300055]
  2. Zhejiang Natural Science Foundation [LY20F020010, LY17F020010]
  3. Ningbo Natural Science Foundation [202003N4089]
  4. Ningbo Science and Technology Innovation 2025 Major Project [2018B10010, 2019B10075]
  5. K.C. Wong Magna Fund in Ningbo University

Fake-quality audio detection, specifically the detection of stereo-faked audio, is an important field in digital audio forensics. This study proposes an anti-forensic framework based on a generative adversarial network to expose the weaknesses of stereo-faking detectors. By generating fake stereo audio from mono audio, the researchers demonstrate that the detectors' accuracy can drop significantly while their false acceptance rate rises.
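The two figures reported here, detection accuracy and false acceptance rate, can be illustrated with a small sketch. It assumes that the false acceptance rate is the fraction of fake stereo samples that a detector accepts as genuine; this page does not give the paper's exact definition, and the function name and label strings below are hypothetical.

```python
def detection_metrics(labels, predictions):
    """Return (accuracy, false_acceptance_rate) for a stereo-faking detector.

    labels and predictions are equal-length sequences of "fake" or "real".
    Assumption (not taken from the paper): the false acceptance rate is the
    share of fake samples that the detector classifies as "real".
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    pairs = list(zip(labels, predictions))
    total = len(pairs)
    correct = sum(1 for y, p in pairs if y == p)
    fake_total = sum(1 for y, _ in pairs if y == "fake")
    fake_accepted = sum(1 for y, p in pairs if y == "fake" and p == "real")

    accuracy = correct / total if total else 0.0
    far = fake_accepted / fake_total if fake_total else 0.0
    return accuracy, far


# Toy data: 4 fake and 4 real samples; 3 of the fakes slip past the detector.
labels = ["fake"] * 4 + ["real"] * 4
predictions = ["fake", "real", "real", "real", "real", "real", "real", "fake"]
accuracy, far = detection_metrics(labels, predictions)
```

Under this definition, a successful anti-forensic attack pushes the false acceptance rate up, and accuracy falls with it whenever the test set contains fake samples.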
Fake-quality audio detection is an important branch of digital audio forensics. Resampling and recompression are the two typical operations used to fake audio quality: audio with a low sampling rate or bit rate is converted to a higher sampling rate or bit rate so that it appears to be of high quality. Stereo-faking is another fake-quality operation, in which mono audio is converted into stereo. A few forensic methods have been proposed to detect stereo-faking, but little consideration has been given to the security of these methods themselves. To expose the weaknesses of these stereo-faking detectors, an anti-forensic framework based on a generative adversarial network is proposed. The fake stereo audio is created by generating a new audio channel from mono audio. A skip connection is adopted to ensure the quality of the generated audio. Because stereo is mostly used in music and film recording, a large number of music and film recordings were downloaded from the Internet as datasets and used to train the model. The anti-forensic samples generated by the model are used to attack the most effective fake stereo audio detectors. Experimental results show that, for music, the generated fake stereo audio reduces detection accuracy from about 99% to about 30% and increases the false acceptance rate from 0.08% to about 69%. For film recordings, it reduces detection accuracy from about 99% to about 1.7% and increases the false acceptance rate from 0.02% to about 98%.
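The idea of generating a second channel from mono audio with a skip connection can be sketched as follows. This is only an illustration under stated assumptions: the paper's generator is a trained neural network whose architecture is not described on this page, so `transform` here is a hypothetical stand-in for it, and the skip connection is assumed to be additive (input added to the generator output).

```python
def generate_second_channel(mono, transform):
    """Build a new channel as mono + transform(mono).

    `transform` is a hypothetical placeholder for the learned generator.
    The additive skip connection (an assumption) keeps the new channel
    close to the original signal, which helps preserve audio quality.
    """
    residual = transform(mono)
    if len(residual) != len(mono):
        raise ValueError("transform must preserve the number of samples")
    return [x + r for x, r in zip(mono, residual)]


def to_fake_stereo(mono, transform):
    """Pair the original mono samples with the generated channel."""
    generated = generate_second_channel(mono, transform)
    return list(zip(mono, generated))


# Toy stand-in for the generator: a small scaled copy of the input.
mono = [0.0, 0.5, -0.5]
stereo = to_fake_stereo(mono, lambda xs: [0.1 * x for x in xs])
```

If `transform` returns all zeros, the generated channel equals the mono input, which shows why a skip connection gives the generator a high-quality starting point that it only needs to perturb.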
