Recent advances in generative artificial intelligence have spurred the development of realistic speech synthesis. While this technology has the potential to improve lives through personalized voice assistants and accessibility-enhancing communication tools, it has also led to the emergence of deepfakes, in which synthesized speech can be misused to deceive humans and machines for nefarious purposes.
In response to this evolving threat, Ning Zhang, assistant professor of computer science and engineering at the McKelvey School of Engineering at Washington University in St. Louis, developed a tool called AntiFake, a new defense mechanism designed to thwart unauthorized speech synthesis before it happens. Zhang presented AntiFake on November 27 at the ACM Conference on Computer and Communications Security in Copenhagen, Denmark.
Unlike traditional deepfake detection methods, which evaluate and uncover synthetic audio only after an attack has occurred, AntiFake takes a proactive stance. It uses adversarial techniques to prevent deceptive speech synthesis by making it difficult for AI tools to read the necessary features from voice recordings. The code is freely available to users.
“AntiFake ensures that when we put voice data out there, it is difficult for criminals to use that information to synthesize our voices and impersonate us,” Zhang said. “The tool uses an adversarial AI technique that was originally part of the cybercriminals’ toolkit, but now we are using it to defend against them. We mess up the recorded audio signal just a little bit, distort or perturb it just enough that it still sounds right to human listeners, but it is completely different to AI.”
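At a high level, this kind of protection can be framed as an optimization problem: find a perturbation small enough that a person barely notices it, but large enough to shift what a voice-cloning model extracts from the recording. The sketch below is a minimal illustration of that general idea, not AntiFake's released implementation; it assumes a hypothetical differentiable speaker_encoder (any pre-trained network that maps a waveform to a speaker embedding) and uses a crude amplitude bound where the actual tool applies more careful perceptual constraints.

    import torch

    def protect(waveform: torch.Tensor, speaker_encoder,
                steps: int = 200, epsilon: float = 0.002,
                lr: float = 1e-3) -> torch.Tensor:
        """Add a small adversarial perturbation that pushes the recording's
        speaker embedding away from the original, so cloning tools that
        rely on that embedding capture the wrong vocal identity."""
        # Embedding of the unprotected recording: the identity to move away from.
        original = speaker_encoder(waveform).detach()

        # Learnable perturbation, initialized to silence.
        delta = torch.zeros_like(waveform, requires_grad=True)
        optimizer = torch.optim.Adam([delta], lr=lr)

        for _ in range(steps):
            embedding = speaker_encoder(waveform + delta)
            # Minimizing cosine similarity drives the perturbed embedding
            # away from the original speaker identity.
            loss = torch.nn.functional.cosine_similarity(
                embedding, original, dim=-1).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            # Keep the perturbation quiet enough that the audio still sounds
            # right to humans (a simple L-infinity bound; an assumption here,
            # standing in for AntiFake's perceptual constraints).
            with torch.no_grad():
                delta.clamp_(-epsilon, epsilon)

        return (waveform + delta).detach()

Since the defense must also hold up against synthesizers the defender has never seen, a practical version would likely optimize against several encoders at once rather than the single model shown here; this loop is only the skeleton of the approach.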
To ensure AntiFake could withstand an ever-changing landscape of potential attackers and unknown synthesis models, Zhang and first author Zhiyuan Yu, a graduate student in Zhang’s lab, built the tool to be generalizable and tested it against five state-of-the-art speech synthesizers. AntiFake achieved a protection rate of over 95%, even against unseen commercial synthesizers. They also tested AntiFake’s usability with 24 human participants to confirm that the tool is accessible to diverse populations.
Currently, AntiFake can protect short snippets of speech, targeting the most common type of voice impersonation. But, Zhang said, there’s nothing stopping this tool from being extended to protect longer recordings, or even music, in the ongoing fight against misinformation.
“Eventually, we want to be able to fully protect voice recordings,” Zhang said. “While I don’t know what will be next in AI voice technology (new tools and features are being developed all the time), I do think our strategy of turning adversaries’ techniques against them will continue to be effective. AI remains vulnerable to adversarial perturbations, even if the engineering specifics may need to shift to maintain this as a winning strategy.”
More information:
Zhiyuan Yu et al., AntiFake: Using Adversarial Audio to Prevent Unauthorized Speech Synthesis, Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (2023). DOI: 10.1145/3576915.3623209
Provided by Washington University in St. Louis
Citation: Defend your voice against deepfakes (November 27, 2023) retrieved on November 28, 2023 from
This document is subject to copyright. Apart from fair use for private study or research purposes, no part may be reproduced without written permission. The content is provided for information only.