Abstract:

Use of data in AI systems without consent or in violation of copyright agreements has become a pressing concern as AI systems have become increasingly capable and commercialized. In the audio domain, these concerns extend to music styles, voiceprints, and lyrics being replicated and extended, causing economic, security, and representation harms. In this paper, we present initial work on protecting audio from unauthorized AI inference, including voice cloning and music extension. We utilize encoder-based attacks that add small perturbations to audio, distorting its encoded latent representation while minimally changing the original audio. We conduct small-scale experiments demonstrating the effectiveness of our protection, and discuss the next steps needed to develop a defense that is both effective and acceptable to audio workers. Our results are available at tinyurl.com/audio-protection.
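As a rough illustration of the encoder-based attack described above, the sketch below runs projected gradient ascent to push an audio encoder's latent away from that of the clean signal, under an L∞ budget that keeps the perturbation small. This is a minimal sketch under assumed choices: the function name `protect_audio`, the MSE latent objective, the stand-in encoder, and the hyperparameters `epsilon`, `step_size`, and `steps` are all illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of an encoder-based protection attack (PGD-style).
# The encoder, loss, and hyperparameters are illustrative assumptions;
# the abstract does not specify the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F


def protect_audio(encoder: nn.Module, audio: torch.Tensor,
                  epsilon: float = 0.002, step_size: float = 0.0005,
                  steps: int = 100) -> torch.Tensor:
    """Add a bounded perturbation that pushes the encoder's latent
    away from the clean latent, leaving the waveform nearly unchanged."""
    encoder.eval()
    with torch.no_grad():
        clean_latent = encoder(audio)  # latent of the unprotected audio

    delta = torch.zeros_like(audio, requires_grad=True)
    for _ in range(steps):
        latent = encoder(audio + delta)
        # Distance between the perturbed latent and the clean latent;
        # we ascend on this quantity to distort what the model "hears".
        loss = F.mse_loss(latent, clean_latent)
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()  # gradient ascent step
            delta.clamp_(-epsilon, epsilon)         # project onto L-inf ball
        delta.grad.zero_()
    return (audio + delta).detach()


# Usage with a stand-in encoder (the real target would be, e.g., the
# audio encoder of a voice-cloning or music-extension model):
dummy_encoder = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=31, stride=8), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=31, stride=8),
)
clean = torch.randn(1, 1, 16000)  # one second of audio at 16 kHz
protected = protect_audio(dummy_encoder, clean)
```

The L∞ clamp is one plausible way to operationalize "minimally changing the original audio"; a perceptual constraint (e.g., a psychoacoustic masking threshold) could be substituted without changing the overall structure of the attack.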