Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models

11 June 2024

Papers citing "Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models"

3 / 3 papers shown

Title
The Role of Predictive Uncertainty and Diversity in Embodied AI and Robot Learning | Ransalu Senanayake | 32 8 0 | 06 May 2024
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned | Deep Ganguli, Liane Lovitt, John Kernion, Amanda Askell, Yuntao Bai, ..., Nicholas Joseph, Sam McCandlish, C. Olah, Jared Kaplan, Jack Clark | 218 441 0 | 23 Aug 2022
Safety Verification of Deep Neural Networks | Xiaowei Huang, M. Kwiatkowska, Sen Wang, Min Wu | AAML | 178 929 0 | 21 Oct 2016