arXiv:2506.15253

RAS-Eval: A Comprehensive Benchmark for Security Evaluation of LLM Agents in Real-World Environments

18 June 2025

Papers citing "RAS-Eval: A Comprehensive Benchmark for Security Evaluation of LLM Agents in Real-World Environments"

4 / 4 papers shown

EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law

Ilija Lichkovski

Alexander Müller

LLMAG AILaw ELM

273

1

0

24 Oct 2025

Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents

...

158

0

0

20 Oct 2025

MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

180

1

0

14 Oct 2025

Evaluating LLM Generated Detection Rules in Cybersecurity

Stefano Meschiari

117

0

0

20 Sep 2025