Security for Artificial Intelligence


Research Statement

As intelligent systems become pervasive, safeguarding their security and privacy is critical. However, recent research has demonstrated that machine learning systems, including state-of-the-art deep neural networks, can be easily fooled by an adversary. For example, it is easy to generate adversarial examples: inputs that are close to benign inputs but are misclassified by machine learning models. Moreover, in our recent work, we have shown that such attacks can succeed even without access to the model internals, i.e., in a black-box setting. These attacks can have severe consequences: for example, an adversary can mislead the perception systems of autonomous vehicles into misidentifying road signs, which can result in catastrophic traffic accidents. Such security issues therefore hinder the application of machine learning to security-critical systems.
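To make the notion of an adversarial example concrete, here is a minimal sketch, assuming a hand-picked linear classifier in place of a deep network. The weights, the input, and the perturbation budget `eps` are illustrative values and are not taken from any of the publications listed below. The perturbation follows the sign of the gradient of the score with respect to the input (the fast gradient sign idea), so each coordinate moves by at most `eps`, yet the predicted label flips.

```python
# Toy linear classifier: predict 1 if w.x + b > 0, else 0.
# All numbers here are made up for illustration (assumption).
W = [1.0, -2.0, 0.5]
B = 0.1


def score(x):
    """Linear score w.x + b."""
    return sum(wi * xi for wi, xi in zip(W, x)) + B


def predict(x):
    """Return the predicted class label (0 or 1)."""
    return 1 if score(x) > 0 else 0


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def perturb(x, eps):
    """Move each coordinate by at most eps, in the direction that
    pushes the score toward the opposite class.

    For a linear model the gradient of the score with respect to x
    is simply W, so the step direction is sign(W).
    """
    direction = -1 if predict(x) == 1 else 1
    return [xi + direction * eps * sign(wi) for xi, wi in zip(x, W)]


benign = [0.5, 0.2, 0.4]
adversarial = perturb(benign, eps=0.15)

print("benign prediction:     ", predict(benign))
print("adversarial prediction:", predict(adversarial))
print("largest coordinate change:",
      max(abs(a - b) for a, b in zip(adversarial, benign)))
```

For a deep network the gradient would be obtained by backpropagation instead of read off directly, and in a black-box setting it would have to be estimated or transferred from a substitute model.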

In our AI security research, we aim to investigate the vulnerabilities of machine learning systems and, ultimately, to develop robust defense strategies against such sophisticated adversarial manipulations in real-world applications.

Recent Publications

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

Dan Hendrycks, Steven Basart*, Norman Mu*, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer.

International Conference on Computer Vision (ICCV). October, 2021.

 

Extracting Training Data from Large Language Models

Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Colin Raffel.

USENIX Security Symposium. August, 2021.

 

Towards Robustness of Text-to-SQL Models against Synonym Substitution

Yujian Gan, Xinyun Chen, Qiuping Huang, Matthew Purver, John R. Woodward, Jinxia Xie, Pengsheng Huang.

Annual Meeting of the Association for Computational Linguistics (ACL). August, 2021.

 

BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning

Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, Dawn Song.

International Joint Conference on Artificial Intelligence (IJCAI). August, 2021.

 

Natural Adversarial Examples

Dan Hendrycks, Kevin Zhao*, Steven Basart*, Jacob Steinhardt, Dawn Song.

The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2021.

 

REFIT: A Unified Watermark Removal Framework for Deep Learning Systems with Limited Data

Xinyun Chen*, Wenxiao Wang*, Chris Bender, Yiming Ding, Ruoxi Jia, Bo Li, Dawn Song.

ACM Asia Conference on Computer and Communications Security (AsiaCCS). June, 2021.

 

Understanding Robustness in Teacher-Student Setting: A New Perspective

Zhuolin Yang*, Zhaoxi Chen, Tiffany (Tianhui) Cai, Xinyun Chen, Bo Li, Yuandong Tian*.

International Conference on Artificial Intelligence and Statistics (AISTATS). April, 2021.

 

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses

Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein.

December, 2020.

 

Imitation Attacks and Defenses for Black-box Machine Translation Systems

Eric Wallace, Mitchell Stern, Dawn Song.

Conference on Empirical Methods in Natural Language Processing (EMNLP). November, 2020.

 


Towards Inspecting and Eliminating Trojan Backdoors in Deep Neural Networks

Wenbo Guo*, Lun Wang*, Yan Xu, Xinyu Xing, Min Du, Dawn Song.

IEEE International Conference on Data Mining (ICDM). November, 2020.

 

Pretrained Transformers Improve Out-of-Distribution Robustness

Dan Hendrycks*, Xiaoyuan Liu*, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, Dawn Song.

Annual Meeting of the Association for Computational Linguistics (ACL). July, 2020.

 

The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks

Yuheng Zhang*, Ruoxi Jia*, Hengzhi Pei, Wenxiao Wang, Bo Li, Dawn Song.

The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2020.

 

Robust Anomaly Detection and Backdoor Attack Detection Via Differential Privacy

Min Du, Ruoxi Jia, Dawn Song.

International Conference on Learning Representations (ICLR). May, 2020.

 

Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty

Dan Hendrycks, Mantas Mazeika*, Saurav Kadavath*, Dawn Song.

Advances in Neural Information Processing Systems (NeurIPS). December, 2019.

 

AdvIT: Adversarial Frames Identifier Based on Temporal Consistency In Videos

Chaowei Xiao, Ruizhi Deng, Bo Li, Taesung Lee, Benjamin Edwards, Jinfeng Yi, Dawn Song, Mingyan Liu, Ian Molloy.

International Conference on Computer Vision (ICCV). October, 2019.

 

The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets

Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, Dawn Song.

USENIX Security Symposium. August, 2019.

 

Press: The Register | Schneier on Security

How You Act Tells a Lot: Privacy-Leakage Attack on Deep Reinforcement Learning

Xinlei Pan, Weiyao Wang, Xiaoshuai Zhang, Bo Li, Jinfeng Yi, Dawn Song.

International Conference on Autonomous Agents and Multiagent Systems (AAMAS). May, 2019.

 

Characterizing Audio Adversarial Examples Using Temporal Dependency

Zhuolin Yang, Bo Li, Pin-Yu Chen, Dawn Song.

International Conference on Learning Representations (ICLR). May, 2019.

 

Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation

Chaowei Xiao, Ruizhi Deng, Bo Li, Fisher Yu, Mingyan Liu, Dawn Song.

European Conference on Computer Vision (ECCV). September, 2018.

 

Exploring the Space of Black-box Attacks on Deep Neural Networks

Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song.

The European Conference on Computer Vision (ECCV). September, 2018.

 

Generating Adversarial Examples with Adversarial Networks

Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, Dawn Song.

The International Joint Conference on Artificial Intelligence (IJCAI). July, 2018.

 

Curriculum Adversarial Training

Qizhi Cai, (Min Du), Chang Liu, Dawn Song.

The International Joint Conference on Artificial Intelligence (IJCAI). July, 2018.

 

Fooling Vision and Language Models Despite Localization and Attention Mechanism

Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, Dawn Song.

The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2018.

 

Robust Physical-World Attacks on Deep Learning Visual Classification

Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, Chaowei Xiao, Dawn Song.

The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2018.

 

Press: IEEE Spectrum | Yahoo News | Wired | Engadget | Telegraph | Car and Driver | CNET | Digital Trends | SCMagazine | Schneier on Security | Ars Technica | Fortune | Science Magazine

Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality

Xingjun Ma, Bo Li, Yisen Wang, Sarah M. Erfani, Sudanthi Wijewickrema, Michael E. Houle, Grant Schoenebeck, Dawn Song, James Bailey.

International Conference on Learning Representations (ICLR). May, 2018.

 

Spatially Transformed Adversarial Examples

Chaowei Xiao*, Jun-Yan Zhu*, Bo Li, Mingyan Liu, Dawn Song.

International Conference on Learning Representations (ICLR). May, 2018.

 

Decision Boundary Analysis of Adversarial Examples

Warren He, Bo Li, Dawn Song.

International Conference on Learning Representations (ICLR). May, 2018.

 

Adversarial examples for generative models

Jernej Kos, Ian Fischer, Dawn Song.

IEEE S&P Workshop on Deep Learning and Security. May, 2018.

 

Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning

Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, Dawn Song.

December, 2017.

 

Press: Motherboard | The Register

Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong

Warren He, James Wei, Xinyun Chen, Nicholas Carlini, Dawn Song.

USENIX Workshop on Offensive Technologies (WOOT). August, 2017.

 

Delving into Transferable Adversarial Examples and Black-box Attacks

Yanpei Liu, Xinyun Chen, Chang Liu, Dawn Song.

International Conference on Learning Representations (ICLR). April, 2017.

 

Delving into adversarial attacks on deep policies

Jernej Kos and Dawn Song.

ICLR Workshop. April, 2017.

 


Members