arXiv: 2402.01586
TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
2 February 2024
Wenyue Hua
Xianjun Yang
Zelong Li
Cheng Wei
Yongfeng Zhang
LLMAG
Papers citing
"TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution"
11 / 11 papers shown
Title
Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering
Joshua Owotogbe
LLMAG
52
0
0
06 May 2025
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Bang Zhang
Ruotian Ma
Qingxuan Jiang
Peisong Wang
Jiaqi Chen
...
Fanghua Ye
Jian Li
Yifan Yang
Zhaopeng Tu
Xiaolong Li
LLMAG
ELM
ALM
97
25
1
01 May 2025
RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models
Bang An
Shiyue Zhang
Mark Dredze
54
0
0
25 Apr 2025
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Hanrong Zhang
Jingyuan Huang
Kai Mei
Yifei Yao
Zhenting Wang
Chenlu Zhan
Hongwei Wang
Yongfeng Zhang
AAML
LLMAG
ELM
48
18
0
03 Oct 2024
War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars
Wenyue Hua
Lizhou Fan
Lingyao Li
Kai Mei
Jianchao Ji
Yingqiang Ge
Libby Hemphill
Yongfeng Zhang
LM&Ro
LLMAG
125
87
0
28 Nov 2023
Controlled Text Generation with Natural Language Instructions
Wangchunshu Zhou
Yuchen Eleanor Jiang
Ethan Gotlieb Wilcox
Ryan Cotterell
Mrinmaya Sachan
152
84
0
27 Apr 2023
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
215
1,701
0
07 Apr 2023
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
225
495
0
28 Sep 2022
Housekeep: Tidying Virtual Households using Commonsense Reasoning
Yash Kant
Arun Ramachandran
Sriram Yenamandra
Igor Gilitschenski
Dhruv Batra
Andrew Szot
Harsh Agrawal
LM&Ro
LRM
152
70
0
22 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
Torsten Scholak
Nathan Schucher
Dzmitry Bahdanau
146
373
0
10 Sep 2021