2506.15253
RAS-Eval: A Comprehensive Benchmark for Security Evaluation of LLM Agents in Real-World Environments
18 June 2025
Yuchuan Fu
Xiaohan Yuan
Dongxia Wang
LLMAG
ELM
Papers citing
"RAS-Eval: A Comprehensive Benchmark for Security Evaluation of LLM Agents in Real-World Environments"
4 / 4 papers shown
EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law
Ilija Lichkovski
Alexander Müller
Mariam Ibrahim
Tiwai Mhundwa
LLMAG
AILaw
ELM
273
1
0
24 Oct 2025
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents
Yihong Tang
Kehai Chen
Liang Yue
Jinxin Fan
Caishen Zhou
...
Kaiyang Guo
Xingshan Zeng
Wenjing Cun
L. Shang
Min Zhang
LLMAG
158
0
0
20 Oct 2025
MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents
Dongsen Zhang
Zekun Li
Xu Luo
Xuannan Liu
Peipei Li
Wenjun Xu
ELM
180
1
0
14 Oct 2025
Evaluating LLM Generated Detection Rules in Cybersecurity
Anna Bertiger
Bobby Filar
Aryan Luthra
Stefano Meschiari
Aiden Mitchell
Sam Scholten
Vivek Sharath
117
0
0
20 Sep 2025