LLM-PBE: Assessing Data Privacy in Large Language Models Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song. International Conference on Very Large Data Bases (VLDB). Best Paper Award Finalist. August, 2024.
|
SHINE: Shielding Backdoors in Deep Reinforcement Learning Zhuowen Yuan, Wenbo Guo, Jinyuan Jia, Bo Li, Dawn Song. The International Conference on Machine Learning (ICML). July, 2024.
|
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li. The International Conference on Machine Learning (ICML). July, 2024.
|
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang “Atlas” Wang, Bo Li. The International Conference on Machine Learning (ICML). July, 2024.
|
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li. The International Conference on Machine Learning (ICML). July, 2024.
|
The False Promise of Imitating Proprietary Language Models Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song. International Conference on Learning Representations (ICLR). May, 2024.
|
TextGuard: Provable Defense against Backdoor Attacks on Text Classification Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song. The Network and Distributed System Security Symposium (NDSS). February, 2024.
|
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li. Advances in Neural Information Processing Systems (NeurIPS). Outstanding Paper Award. December, 2023.
|
DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification Mintong Kang, Dawn Song, Bo Li. Advances in Neural Information Processing Systems (NeurIPS). December, 2023.
|
BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning Xuan Chen, Wenbo Guo, Guanhong Tao, Xiangyu Zhang, Dawn Song. Advances in Neural Information Processing Systems (NeurIPS). December, 2023.
|
PATROL: Provable Defense against Adversarial Policy in Two-player Games Wenbo Guo, Xian Wu, Lun Wang, Xinyu Xing, Dawn Song. USENIX Security Symposium. August, 2023.
|
Extracting Training Data from Diffusion Models Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace. USENIX Security Symposium. August, 2023.
|
Poisoning Instruction-Tuned Language Models Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein. The International Conference on Machine Learning (ICML). July, 2023.
|
TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets Weixin Chen, Dawn Song, Bo Li. The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2023.
|
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). February, 2023.
|
Scaling Out-of-Distribution Detection for Real-World Settings Dan Hendrycks, Steven Basart, Mantas Mazeika, Andy Zou, Joe Kwon, Mohammadreza Mostajabi, Jacob Steinhardt, Dawn Song. The International Conference on Machine Learning (ICML). July, 2022.
|
Deduplicating Training Data Mitigates Privacy Risks in Language Models Nikhil Kandpal, Eric Wallace, Colin Raffel. The International Conference on Machine Learning (ICML). July, 2022.
|
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Bo Li, Dawn Song, Jacob Steinhardt. The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2022.
|
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization Dan Hendrycks, Steven Basart*, Norman Mu*, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer. International Conference on Computer Vision (ICCV). October, 2021.
|
Extracting Training Data from Large Language Models Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Colin Raffel. USENIX Security Symposium. August, 2021.
|
Towards Robustness of Text-to-SQL Models against Synonym Substitution Yujian Gan, Xinyun Chen, Qiuping Huang, Matthew Purver, John R. Woodward, Jinxia Xie, Pengsheng Huang. Annual Meeting of the Association for Computational Linguistics (ACL). August, 2021.
|
BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, Dawn Song. International Joint Conference on Artificial Intelligence (IJCAI). August, 2021.
|
Natural Adversarial Examples Dan Hendrycks, Kevin Zhao*, Steven Basart*, Jacob Steinhardt, Dawn Song. The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2021.
|
REFIT: a Unified Watermark Removal Framework for Deep Learning Systems with Limited Data Xinyun Chen*, Wenxiao Wang*, Chris Bender, Yiming Ding, Ruoxi Jia, Bo Li, Dawn Song. ACM Asia Conference on Computer and Communications Security (AsiaCCS). June, 2021.
|
Understanding Robustness in Teacher-Student Setting: A New Perspective Zhuolin Yang*, Zhaoxi Chen, Tiffany (Tianhui) Cai, Xinyun Chen, Bo Li, Yuandong Tian*. International Conference on Artificial Intelligence and Statistics (AISTATS). April, 2021.
|
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein. December, 2020.
|
Imitation Attacks and Defenses for Black-box Machine Translation Systems Eric Wallace, Mitchell Stern, Dawn Song. Conference on Empirical Methods in Natural Language Processing (EMNLP). November, 2020.
|
Towards Inspecting and Eliminating Trojan Backdoors in Deep Neural Networks Wenbo Guo*, Lun Wang*, Yan Xu, Xinyu Xing, Min Du, Dawn Song. IEEE International Conference on Data Mining (ICDM). November, 2020.
|
Pretrained Transformers Improve Out-of-Distribution Robustness Dan Hendrycks*, Xiaoyuan Liu*, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, Dawn Song. Annual Meeting of the Association for Computational Linguistics (ACL). July, 2020.
|
The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks Yuheng Zhang*, Ruoxi Jia*, Hengzhi Pei, Wenxiao Wang, Bo Li, Dawn Song. The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2020.
|
Robust Anomaly Detection and Backdoor Attack Detection Via Differential Privacy Min Du, Ruoxi Jia, Dawn Song. International Conference on Learning Representations (ICLR). May, 2020.
|
Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty Dan Hendrycks, Mantas Mazeika*, Saurav Kadavath*, Dawn Song. Advances in Neural Information Processing Systems (NeurIPS). December, 2019.
|
AdvIT: Adversarial Frames Identifier Based on Temporal Consistency In Videos Chaowei Xiao, Ruizhi Deng, Bo Li, Taesung Lee, Benjamin Edwards, Jinfeng Yi, Dawn Song, Mingyan Liu, Ian Molloy. International Conference on Computer Vision (ICCV). October, 2019.
|
The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, Dawn Song. USENIX Security. August, 2019.
Press: The Register | Schneier on Security |
How You Act Tells a Lot: Privacy-Leakage Attack on Deep Reinforcement Learning Xinlei Pan, Weiyao Wang, Xiaoshuai Zhang, Bo Li, Jinfeng Yi, Dawn Song. International Conference on Autonomous Agents and Multiagent Systems (AAMAS). May, 2019.
|
Characterizing Audio Adversarial Examples Using Temporal Dependency Zhuolin Yang, Bo Li, Pin-Yu Chen, Dawn Song. International Conference on Learning Representations (ICLR). May, 2019.
|
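Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation Chaowei Xiao, Ruizhi Deng, Bo Li, Fisher Yu, Mingyan Liu, Dawn Song. European Conference on Computer Vision (ECCV). September, 2018.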
|
Exploring the Space of Black-box Attacks on Deep Neural Networks Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song. The European Conference on Computer Vision (ECCV). September, 2018.
|
Generating Adversarial Examples with Adversarial Networks Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, Dawn Song. The International Joint Conference on Artificial Intelligence (IJCAI). July, 2018.
|
Curriculum Adversarial Training Qizhi Cai, Min Du, Chang Liu, Dawn Song. The International Joint Conference on Artificial Intelligence (IJCAI). July, 2018.
|
Fooling Vision and Language Models Despite Localization and Attention Mechanism Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, Dawn Song. The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2018.
|
Robust Physical-World Attacks on Deep Learning Visual Classification Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, Chaowei Xiao, Dawn Song. The Conference on Computer Vision and Pattern Recognition (CVPR). June, 2018.
Press: IEEE Spectrum | Yahoo News | Wired | Engadget | Telegraph | Car and Driver | CNET | Digital Trends | SCMagazine | Schneier on Security | Ars Technica | Fortune | Science Magazine |
Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality Xingjun Ma, Bo Li, Yisen Wang, Sarah M. Erfani, Sudanthi Wijewickrema, Michael E. Houle, Grant Schoenebeck, Dawn Song, James Bailey. International Conference on Learning Representations (ICLR). May, 2018.
|
Spatially Transformed Adversarial Examples Chaowei Xiao*, Jun-Yan Zhu*, Bo Li, Mingyan Liu, Dawn Song. International Conference on Learning Representations (ICLR). May, 2018.
|
Decision Boundary Analysis of Adversarial Examples Warren He, Bo Li, Dawn Song. International Conference on Learning Representations (ICLR). May, 2018.
|
Adversarial Examples for Generative Models Jernej Kos, Ian Fischer, Dawn Song. IEEE S&P Workshop on Deep Learning and Security. May, 2018.
|
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, Dawn Song. December, 2017.
Press: Motherboard | The Register |
Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong Warren He, James Wei, Xinyun Chen, Nicholas Carlini, Dawn Song. USENIX Workshop on Offensive Technologies (WOOT). August, 2017.
|
Delving into Transferable Adversarial Examples and Black-box Attacks Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. International Conference on Learning Representations (ICLR). April, 2017.
|
Delving into Adversarial Attacks on Deep Policies Jernej Kos and Dawn Song. ICLR Workshop. April, 2017.
|
Faculty:
Postdocs:
Ph.D. Students:
Alumni:
Min Du
Warren He
Jernej Kos (NUS)