ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2408.02632
  4. Cited By
SEAS: Self-Evolving Adversarial Safety Optimization for Large Language
  Models

SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models

5 August 2024
Muxi Diao
Rumei Li
Shiyang Liu
Guogang Liao
Jingang Wang
Xunliang Cai
Weiran Xu
    AAML
ArXivPDFHTML

Papers citing "SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models"

5 / 5 papers shown
Title
We-Math: Does Your Large Multimodal Model Achieve Human-like
  Mathematical Reasoning?
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
Runqi Qiao
Qiuna Tan
Guanting Dong
Minhui Wu
Chong Sun
...
Yida Xu
Muxi Diao
Zhimin Bao
Chen Li
Honggang Zhang
VLM
LRM
31
30
0
01 Jul 2024
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Xiaoshuai Song
Muxi Diao
Guanting Dong
Zhengyang Wang
Yujia Fu
...
Yejie Wang
Zhuoma Gongque
Jianing Yu
Qiuna Tan
Weiran Xu
ELM
27
10
0
12 Jun 2024
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Mikayel Samvelyan
Sharath Chandra Raparthy
Andrei Lupu
Eric Hambro
Aram H. Markosyan
...
Minqi Jiang
Jack Parker-Holder
Jakob Foerster
Tim Rocktaschel
Roberta Raileanu
SyDa
59
61
0
26 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated
  Jailbreak Prompts
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
110
292
0
19 Sep 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
1