Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.02632
Cited By
SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models
5 August 2024
Muxi Diao
Rumei Li
Shiyang Liu
Guogang Liao
Jingang Wang
Xunliang Cai
Weiran Xu
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models"
5 / 5 papers shown
Title
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
Runqi Qiao
Qiuna Tan
Guanting Dong
Minhui Wu
Chong Sun
...
Yida Xu
Muxi Diao
Zhimin Bao
Chen Li
Honggang Zhang
VLM
LRM
31
3
0
01 Jul 2024
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Xiaoshuai Song
Muxi Diao
Guanting Dong
Zhengyang Wang
Yujia Fu
...
Yejie Wang
Zhuoma Gongque
Jianing Yu
Qiuna Tan
Weiran Xu
ELM
25
10
0
12 Jun 2024
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Mikayel Samvelyan
Sharath Chandra Raparthy
Andrei Lupu
Eric Hambro
Aram H. Markosyan
...
Minqi Jiang
Jack Parker-Holder
Jakob Foerster
Tim Rocktaschel
Roberta Raileanu
SyDa
57
61
0
26 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
110
292
0
19 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
1