OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
arXiv:2506.14866
Versions: v1, v2 (latest)

17 June 2025
Thomas Kuntz
Agatha Duzan
Hao Zhao
Francesco Croce
Zico Kolter
Nicolas Flammarion
Maksym Andriushchenko
LLMAG, ELM
ArXiv (abs) | PDF | HTML | HuggingFace (5 upvotes) | GitHub (38★)

Papers citing "OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents"

10 / 10 papers shown
Are Your Agents Upward Deceivers?
Dadi Guo
Qingyu Liu
Dongrui Liu
Qihan Ren
Shuai Shao
...
Z. Chen
Jialing Tao
Yaodong Yang
Jing Shao
Xia Hu
LLMAG
146
0
0
04 Dec 2025
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
Qiushi Sun
Mukai Li
Zhoumianze Liu
Zhihui Xie
F. Xu
...
Qi Liu
Z. Wu
Zhuosheng Zhang
B. Kao
Lingpeng Kong
181
2
0
28 Oct 2025
OSWorld-MCP: Benchmarking MCP Tool Invocation In Computer-Use Agents
Hongrui Jia
Jitong Liao
X. Zhang
Haiyang Xu
Tianbao Xie
Chaoya Jiang
Ming Yan
Si Liu
Wei Ye
Fei Huang
173
4
0
28 Oct 2025
Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges
Shrestha Datta
Shahriar Kabir Nahin
Anshuman Chhabra
P. Mohapatra
LLMAG, LM&Ro
299
4
0
27 Oct 2025
GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments?
Chiyu Chen
Xinhao Song
Yunkai Chai
Yang Yao
Haodong Zhao
Lijun Li
Jie Li
Yan Teng
Gongshen Liu
Y. Wang
AAML, LLMAG
196
0
0
23 Oct 2025
Just Do It!? Computer-Use Agents Exhibit Blind Goal-Directedness
Erfan Shayegani
Keegan Hines
Yue Dong
Nael B. Abu-Ghazaleh
Roman Lutz
Spencer Whitehead
Vidhisha Balachandran
Besmira Nushi
Vibhav Vineet
148
0
0
02 Oct 2025
SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control
Quanfeng Lu
Zhantao Ma
Shuai Zhong
Jin Wang
Dahai Yu
Michael K. Ng
Ping Luo
209
0
0
27 Aug 2025
Reliable Weak-to-Strong Monitoring of LLM Agents
Neil Kale
Chen Bo Calvin Zhang
Kevin Zhu
Ankit Aich
Paula Rodriguez
Scale Red Team
Christina Q. Knight
Zifan Wang
184
2
0
26 Aug 2025
RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
Zeyi Liao
Jaylen Jones
Linxi Jiang
Eric Fosler-Lussier
Yu-Chuan Su
Zhiqiang Lin
Huan Sun
ELM
438
10
0
28 May 2025
A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
Ada Chen
Yongjiang Wu
Jing Zhang
Shu Yang
Jen-tse Huang
Wenxuan Wang
S. Wang
ELM
441
11
0
16 May 2025