Posts by Collection

portfolio

publications

Towards Reactive Acoustic Jamming for Personal Voice Assistants

Published in Proceedings of the 2nd International Workshop on Multimedia Privacy and Security, 2018

This paper develops Reactive Acoustic Jamming, a method that emits ultrasonic signals upon wake-word detection to block unauthorized voice recordings, providing proactive privacy protection for voice assistant users.

Recommended citation: Peng Cheng, Ibrahim Ethem Bagci, Jie Yan, Utz Roedig. (2018). “Towards Reactive Acoustic Jamming for Personal Voice Assistants.” Proceedings of the 2nd International Workshop on Multimedia Privacy and Security, Toronto, Canada, 1–13.

Smart Speaker Privacy Control—Acoustic Tagging for Personal Voice Assistants

Published in IEEE Security and Privacy Workshops (SPW 2019), 2019

This paper introduces acoustic tagging for privacy control, embedding imperceptible tags into voice streams to enable privacy preference signaling and unauthorized recording traceability in voice assistant systems.

Recommended citation: Peng Cheng, Ibrahim Ethem Bagci, Jie Yan, Utz Roedig. (2019). “Smart Speaker Privacy Control—Acoustic Tagging for Personal Voice Assistants.” IEEE Security and Privacy Workshops (SPW 2019), San Francisco, CA, USA, 144–149.

SonarSnoop: Active Acoustic Side-Channel Attacks

Published in International Journal of Information Security (IJIS), 2020

This paper demonstrates novel sonar-like attacks that use a smartphone's acoustic hardware to infer user interactions such as unlock patterns. The work was a finalist for the Pwnie Award 2019 for most innovative research and received recognition from security experts.

Recommended citation: Peng Cheng, Ibrahim Ethem Bagci, Utz Roedig, Jie Yan. “SonarSnoop: Active Acoustic Side-Channel Attacks.” International Journal of Information Security. 19(2), pp. 213-228. 2020. doi: 10.1007/s10207-019-00452-6.

Adversarial Command Detection Using Parallel Speech Recognition Systems

Published in Computer Security - ESORICS 2021 International Workshops, 2021

This paper proposes a defense mechanism leveraging parallel speech recognition systems to detect inaudible malicious commands targeting voice assistants, countering adversarial exploitation of voice-controlled systems.

Recommended citation: Peng Cheng, MS Arun Sankar, Ibrahim Ethem Bagci, Utz Roedig. (2021). “Adversarial Command Detection Using Parallel Speech Recognition Systems.” Computer Security - ESORICS 2021 International Workshops, Darmstadt, Germany (Virtual), 238–255.

Personal Voice Assistant Security and Privacy—A Survey

Published in Proceedings of the IEEE, 2022

This comprehensive survey examines acoustic-channel-driven security and privacy threats in voice assistants, providing a systematic analysis of vulnerabilities and defense mechanisms in personal voice assistant systems.

Recommended citation: Peng Cheng, Utz Roedig. “Personal Voice Assistant Security and Privacy—A Survey.” Proceedings of the IEEE. 110(4), pp. 476-507. 2022. doi: 10.1109/JPROC.2022.3154330.

UniAP: Protecting Speech Privacy With Non-Targeted Universal Adversarial Perturbations

Published in IEEE Transactions on Dependable and Secure Computing (TDSC), 2023

This paper proposes UniAP, a non-targeted adversarial attack framework that obfuscates speech signals to protect speech privacy, achieving a success rate above 87% in real-world scenarios.

Recommended citation: Peng Cheng, Yuexin Wu, Yuan Hong, Zhongjie Ba, Feng Lin, Li Lu, Kui Ren. “UniAP: Protecting Speech Privacy With Non-Targeted Universal Adversarial Perturbations.” IEEE Transactions on Dependable and Secure Computing. 21(1), pp. 31-46. 2024. doi: 10.1109/TDSC.2023.3235266.

InfoMasker: Preventing Eavesdropping Using Phoneme-Based Noise

Published in Network and Distributed System Security Symposium (NDSS 2023), 2023

This paper designs ultrasonic noise injection systems that disrupt unauthorized recordings while preserving authorized access, reducing speech recognition accuracy to below 50% even at low energy levels.

Recommended citation: Peng Huang, Yao Wei, Peng Cheng*, Zhongjie Ba, Li Lu, Feng Lin, Fengwei Zhang, Kui Ren. “InfoMasker: Preventing Eavesdropping Using Phoneme-Based Noise.” Proceedings of the Network and Distributed System Security Symposium. San Diego, CA, USA. 2023. doi: to appear.

Transferring Audio Deepfake Detection Capability Across Languages

Published in Proceedings of the ACM Web Conference (WWW 2023), 2023

This paper introduces domain adaptation techniques to transfer deepfake detection capabilities across languages, validated on 137-hour multilingual datasets to address the challenge of detecting deepfakes in low-resource languages.

Recommended citation: Zhongjie Ba, Qing Wen, Peng Cheng*, Yuwei Wang, Feng Lin, Li Lu, Zhiyi Liu. “Transferring Audio Deepfake Detection Capability Across Languages.” Proceedings of the ACM Web Conference. Austin, TX, USA. 2023. doi: 10.1145/3543507.3583392.

Masked Diffusion Models Are Fast and Privacy-Aware Learners

Published in arXiv preprint, 2023

This paper explores privacy-preserving capabilities of masked diffusion models, demonstrating their potential for fast learning while maintaining privacy guarantees in generative AI applications.

Recommended citation: Jiachen Lei, Qingni Wang, Peng Cheng*, Zhongjie Ba, Zhan Qin, Zhibo Wang, Zhiyi Liu, Kui Ren. “Masked Diffusion Models Are Fast and Privacy-Aware Learners.” arXiv preprint arXiv:2306.11363. 2023. doi: 10.48550/arXiv.2306.11363.

Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege

Published in IEEE Transactions on Dependable and Secure Computing (TDSC), 2024

This paper extends the InfoMasker framework with controlled recording privilege mechanisms, enabling selective privacy protection while maintaining authorized access to voice communications.

Recommended citation: Peng Huang, Yao Wei, Peng Cheng*, Zhongjie Ba, Li Lu, Feng Lin, Yuwei Wang, Kui Ren. “Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege.” IEEE Transactions on Dependable and Secure Computing. 22(2), pp. 1074-1090. 2025. doi: 10.1109/TDSC.2024.3408163.

ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms Using Linguistic Features

Published in IEEE Symposium on Security and Privacy (SP 2024), 2024

This paper proposes linguistic feature-based attacks that exploit TTS/ASR reciprocity, enabling single-query adversarial samples with a 97.7% reduction in query cost. The approach was validated on four commercial systems and adopted by NVIDIA for its AI security toolkit.

Recommended citation: Peng Cheng, Yuwei Wang, Peng Huang, Zhongjie Ba, Xiaohong Lin, Feng Lin, Li Lu, Kui Ren. “ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms Using Linguistic Features.” Proceedings of IEEE Symposium on Security and Privacy. San Francisco, CA, USA. 2024. doi: 10.1109/SP54263.2024.00047.

Indelible “Footprints” of Inaudible Command Injection

Published in IEEE Transactions on Information Forensics and Security (TIFS), 2024

This paper discovers hardware-specific artifacts in ultrasound injections and designs DolphinTag, which detects attacks via abnormal demodulation with 100% accuracy and, using software methods, achieves 99.8% accuracy on interference signatures.

Recommended citation: Zhongjie Ba, Bin Gong, Yuwei Wang, Yiwen Liu, Peng Cheng*, Feng Lin, Li Lu, Kui Ren. “Indelible “Footprints” of Inaudible Command Injection.” IEEE Transactions on Information Forensics and Security. 19, pp. 6589-6604. 2024. doi: 10.1109/TIFS.2024.3421486.

SurrogatePrompt: Bypassing the Safety Filter of Text-to-Image Models via Substitution

Published in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS 2024), 2024

This paper exposes critical vulnerabilities in commercial text-to-image models through SurrogatePrompt, achieving an 88% success rate in bypassing safety filters to generate unsafe content. The findings were acknowledged by Midjourney and Stability.ai.

Recommended citation: Zhongjie Ba, Jieming Zhong, Jiachen Lei, Peng Cheng*, Qingni Wang, Zhan Qin, Zhibo Wang, Kui Ren. “SurrogatePrompt: Bypassing the Safety Filter of Text-to-Image Models via Substitution.” Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. Salt Lake City, UT, USA. 2024. doi: 10.1145/3658644.3670317.

SecHeadset: A Practical Privacy Protection System for Real-time Voice Communication

Published in Proceedings of the ACM MobiSys 2025, 2025

This paper presents SecHeadset, a practical privacy protection system that prevents third parties from eavesdropping on speech content in VoIP and voice message applications by adding vowel-based noise to speech audio signals, validated through user studies with 204 participants.

Recommended citation: Peng Huang, Kun Pan, Qingni Wang, Peng Cheng*, Li Lu, Zhongjie Ba, Kui Ren. “SecHeadset: A Practical Privacy Protection System for Real-time Voice Communication.” Proceedings of the ACM MobiSys. Anaheim, CA, USA. 2025. doi: to appear.

Robust Watermarks Leak: Channel-Aware Feature Extraction Enables Adversarial Watermark Manipulation

Published in arXiv preprint, 2025

This paper reveals inherent tradeoffs in watermark robustness, enabling single-image attacks to extract and forge watermarks while maintaining visual fidelity, exposing fundamental vulnerabilities in current watermarking approaches.

Recommended citation: Zhongjie Ba, Yaoxin Zhang, Peng Cheng (corresponding author), Bin Gong, Xiaoyuan Zhang, Qinglong Wang, Kui Ren. (2025). “Robust Watermarks Leak: Channel-Aware Feature Extraction Enables Adversarial Watermark Manipulation.” arXiv preprint, arXiv:2502.06418, 2025, https://doi.org/10.48550/arXiv.2502.06418.

Deepfake Detection: Key Challenges and Technical Approaches

Published in Computing Magazine of the CCF, 1(2): 8–15, 2025

This article surveys key challenges and technical approaches in deepfake detection, covering audio, visual, and multimodal detection methods.

Recommended citation: Kui Ren, Fangjun Lin, Zhongjie Ba, Zhuotao Liu, Peng Cheng. (2025). “Deepfake Detection: Key Challenges and Technical Approaches.” Computing Magazine of the CCF. 1(2): 8–15.

Beyond Content: A Comprehensive Speech Toxicity Dataset and Detection Framework Incorporating Paralinguistic Cues

Published in The 40th AAAI Conference on Artificial Intelligence (AAAI 2026), 2026

This paper presents a comprehensive speech toxicity dataset and detection framework that goes beyond semantic content by incorporating paralinguistic cues such as tone, emotion, and prosody for more accurate toxicity detection.

Recommended citation: Zhongjie Ba, Lixiang Yi, Peng Cheng* (corresponding author), Qiwei Li, Qinglong Wang, Li Lu. “Beyond Content: A Comprehensive Speech Toxicity Dataset and Detection Framework Incorporating Paralinguistic Cues.” The 40th AAAI Conference on Artificial Intelligence (AAAI 2026).

Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection

Published in The 14th International Conference on Learning Representations (ICLR 2026), 2026

This paper proposes an attack-resistant watermarking scheme for AI-generated content (AIGC) image forensics using diffusion-based semantic deflection, achieving robustness against adversarial watermark removal and forgery attacks.

Recommended citation: Qian Liu, Yaoxin Zhang, Zhongjie Ba, Chao Shuai, Peng Cheng, Tianwei Zheng, Zhibo Wang. “Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection.” The 14th International Conference on Learning Representations (ICLR 2026).

HyperPotter: Spell the Charm of High-Order Interactions in Audio Deepfake Detection

Published in International Conference on Machine Learning (ICML 2026), 2026

This paper proposes HyperPotter, which leverages high-order feature interactions to significantly improve audio deepfake detection performance across diverse synthesis methods.

Recommended citation: Qing Wen, Hao Li, Zhongjie Ba, Peng Cheng* (corresponding author), Mingyi He, Li Lu, Kui Ren. “HyperPotter: Spell the Charm of High-Order Interactions in Audio Deepfake Detection.” International Conference on Machine Learning (ICML 2026).

MixFake: Benchmarking and Enhancing Audio Deepfake Detection in Diverse Real-world Mixed Audio

Published in IEEE International Conference on Multimedia and Expo (ICME 2026) [Spotlight], 2026

This paper introduces MixFake, a benchmark and enhanced detection framework for audio deepfake detection in real-world mixed audio scenarios, addressing the challenge of diverse synthesis methods co-existing in a single audio sample. It was accepted as a Spotlight paper.

Recommended citation: Qingcao Li, Yipeng Lin, Weichen Lian, Zhongjie Ba, Peng Cheng* (corresponding author), Zhichao Lian. “MixFake: Benchmarking and Enhancing Audio Deepfake Detection in Diverse Real-world Mixed Audio.” IEEE International Conference on Multimedia and Expo (ICME 2026). [Spotlight]

talks

teaching
