Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy

Nature Communications (Nat. Commun.), 2024
6 February 2024
Xiangru Tang
Qiao Jin
Kunlun Zhu
Tongxin Yuan
Yichi Zhang
Wangchunshu Zhou
Meng Qu
Yilun Zhao
Jian Tang
Zhuosheng Zhang
Arman Cohan
Zhiyong Lu
Mark B. Gerstein
LLMAG, ELM

Papers citing "Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy"

28 / 28 papers shown
SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents
Kunlun Zhu
Jiaxun Zhang
Ziheng Qi
Nuoxing Shang
Zijia Liu
Peixuan Han
Yue Su
Haofei Yu
Jiaxuan You
201
5
0
29 May 2025
ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools
Zhucong Li
Bowei Zhang
Jin Xiao
Zhijian Zhou
Fenglei Cao
Jiaqing Liang
Yuan Qi
204
0
0
27 May 2025
Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks
Sirui Chen
Shuqin Ma
Shu Yu
Hanwang Zhang
Shengjie Zhao
Chaochao Lu
742
3
0
26 May 2025
ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense
Shiyu Xiang
Tong Zhang
Ronghao Chen
AAML
242
1
0
25 May 2025
SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator
Xueyang Zhou
Weidong Wang
Lin Lu
Jiawen Shi
Guiyao Tie
Yongtian Xu
Lixing Chen
Pan Zhou
Neil Zhenqiang Gong
Lichao Sun
LLMAG
469
0
0
23 May 2025
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents
Haoyu Wang
Christopher M. Poskitt
Jun Sun
446
20
0
24 Mar 2025
Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ziyou Jiang
Mingyang Li
Guowei Yang
Peng Li
Yuekai Huang
Zhiyuan Chang
Qing Wang
AAML
274
4
0
17 Feb 2025
ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain
Haochen Zhao
Xiangru Tang
Ziran Yang
Xiao Han
Xuanzhi Feng
...
Senhao Cheng
Di Jin
Yilun Zhao
Arman Cohan
Mark B. Gerstein
ELM
256
6
0
23 Nov 2024
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs
Yujun Zhou
Jingdong Yang
Yue Huang
Kehan Guo
Zoe Emory
...
Tian Gao
Werner Geyer
Nuno Moniz
Nitesh Chawla
Xiangliang Zhang
459
12
0
18 Oct 2024
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
International Conference on Learning Representations (ICLR), 2024
Yougang Lyu
Lingyong Yan
Zihan Wang
D. Yin
Sudipta Singha Roy
Maarten de Rijke
Zhaochun Ren
582
15
0
10 Oct 2024
FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses over SORRY-Bench
Aman Priyanshu
Supriti Vijay
AAML
206
1
0
28 Aug 2024
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
Xingyao Wang
Boxuan Li
Yufan Song
Frank F. Xu
Xiangru Tang
...
Junyang Lin
Robert Brennan
Yuan Yao
Heng Ji
Graham Neubig
VLM
574
13
0
23 Jul 2024
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma
Satyapriya Krishna
Sebastian Gehrmann
Madhavan Seshadri
Anu Pradhan
Tom Ault
Leslie Barrett
David Rabinowitz
John Doucette
Nhathai Phan
432
41
0
20 Jul 2024
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
Chun-Yi Kuan
Chih-Kai Yang
Wei-Ping Huang
Ke-Han Lu
Hung-yi Lee
279
17
0
13 Jul 2024
Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents
Yulong Yang
Xinshan Yang
Shuaidong Li
Chenhao Lin
Subrat Kishore Dutta
Chao Shen
Tianwei Zhang
276
3
0
12 Jul 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
393
125
0
04 Jun 2024
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRL, KELM, AILaw
254
68
0
03 Jun 2024
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
Junhao Cheng
Xi Lu
Hanhui Li
Khun Loun Zai
Baiqiao Yin
Yuhao Cheng
Yiqiang Yan
Xiaodan Liang
DiffM, VGen
398
16
0
03 Jun 2024
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs
Zeyi Liao
Huan Sun
AAML
294
149
0
11 Apr 2024
Exploring Autonomous Agents through the Lens of Large Language Models: A Review
Saikat Barua
LM&MA, LLMAG
251
34
0
05 Apr 2024
Empowering Biomedical Discovery with AI Agents
Cell (Cell), 2024
Shanghua Gao
Ada Fang
Yepeng Huang
Valentina Giunchiglia
Ayush Noori
Jonathan Richard Schwarz
Yasha Ektefaie
Jovana Kondic
Marinka Zitnik
LLMAG, AI4CE
263
203
0
03 Apr 2024
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
Zhengqing Yuan
Ruoxi Chen
Zhaoxu Li
Haolong Jia
Lifang He
Chi Wang
Lichao Sun
VGen
302
41
0
20 Mar 2024
Data Interpreter: An LLM Agent For Data Science
Sirui Hong
Yizhang Lin
Bang Liu
Bangbang Liu
Binhao Wu
...
Jinhao Tu
Yaying Fei
Yuheng Cheng
Zongze Xu
Chenglin Wu
LLMAG, AI4CE
421
144
0
28 Feb 2024
Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark
Preslav Nakov
Tairan Wang
Qingqing Zhu
Taicheng Guo
Shen Gao
Zhiyong Lu
Xin Gao
Xiangliang Zhang
439
3
0
22 Feb 2024
TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
Qingfeng Lan
Xianjun Yang
Zelong Li
Cheng Wei
Yongfeng Zhang
LLMAG
421
4
0
02 Feb 2024
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang
Yangyi Chen
Lifan Yuan
Yizhe Zhang
Yunzhu Li
Yuan Yao
Heng Ji
ELM, LLMAG, LM&Ro
842
313
0
01 Feb 2024
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
Tongxin Yuan
Zhiwei He
Lingzhong Dong
Yiming Wang
Ruijie Zhao
...
Binglin Zhou
Fangqi Li
Zhuosheng Zhang
Rui Wang
Gongshen Liu
ELM
391
136
0
18 Jan 2024
Structured Chemistry Reasoning with Large Language Models
Siru Ouyang
Zhuosheng Zhang
Bing Yan
Xuan Liu
Yejin Choi
Jiawei Han
Lianhui Qin
LRM
145
26
0
16 Nov 2023