Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy
Nature Communications (Nat. Commun.), 2024
6 February 2024
Xiangru Tang
Qiao Jin
Kunlun Zhu
Tongxin Yuan
Yichi Zhang
Wangchunshu Zhou
Meng Qu
Yilun Zhao
Jian Tang
Zhuosheng Zhang
Arman Cohan
Zhiyong Lu
Mark B. Gerstein
LLMAG, ELM

Papers citing "Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy"

28 / 28 papers shown
SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents
Kunlun Zhu
Jiaxun Zhang
Ziheng Qi
Nuoxing Shang
Zijia Liu
Peixuan Han
Yue Su
Haofei Yu
Jiaxuan You
219
7
0
29 May 2025
ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools
Zhucong Li
Bowei Zhang
Jin Xiao
Zhijian Zhou
Fenglei Cao
Jiaqing Liang
Yuan Qi
216
0
0
27 May 2025
Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks
Sirui Chen
Shuqin Ma
Shu Yu
Hanwang Zhang
Shengjie Zhao
Chaochao Lu
774
4
0
26 May 2025
ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense
Shiyu Xiang
Tong Zhang
Ronghao Chen
AAML
261
1
0
25 May 2025
SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator
Xueyang Zhou
Weidong Wang
Lin Lu
Jiawen Shi
Guiyao Tie
Yongtian Xu
Lixing Chen
Pan Zhou
Neil Zhenqiang Gong
Lichao Sun
LLMAG
481
0
0
23 May 2025
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents
Haoyu Wang
Christopher M. Poskitt
Jun Sun
461
23
0
24 Mar 2025
Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ziyou Jiang
Mingyang Li
Guowei Yang
Peng Li
Yuekai Huang
Zhiyuan Chang
Qing Wang
AAML
288
5
0
17 Feb 2025
ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain
Haochen Zhao
Xiangru Tang
Ziran Yang
Xiao Han
Xuanzhi Feng
...
Senhao Cheng
Di Jin
Yilun Zhao
Arman Cohan
Mark B. Gerstein
ELM
259
6
0
23 Nov 2024
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs
Yujun Zhou
Jingdong Yang
Yue Huang
Kehan Guo
Zoe Emory
...
Tian Gao
Werner Geyer
Nuno Moniz
Nitesh Chawla
Xiangliang Zhang
472
13
0
18 Oct 2024
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
International Conference on Learning Representations (ICLR), 2024
Yougang Lyu
Lingyong Yan
Zihan Wang
D. Yin
Sudipta Singha Roy
Maarten de Rijke
Zhaochun Ren
590
15
0
10 Oct 2024
FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses over SORRY-Bench
Aman Priyanshu
Supriti Vijay
AAML
224
2
0
28 Aug 2024
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
Xingyao Wang
Boxuan Li
Yufan Song
Frank F. Xu
Xiangru Tang
...
Junyang Lin
Robert Brennan
Yuan Yao
Heng Ji
Graham Neubig
VLM
604
13
0
23 Jul 2024
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma
Satyapriya Krishna
Sebastian Gehrmann
Madhavan Seshadri
Anu Pradhan
Tom Ault
Leslie Barrett
David Rabinowitz
John Doucette
Nhathai Phan
443
42
0
20 Jul 2024
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
Chun-Yi Kuan
Chih-Kai Yang
Wei-Ping Huang
Ke-Han Lu
Hung-yi Lee
292
19
0
13 Jul 2024
Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents
Yulong Yang
Xinshan Yang
Shuaidong Li
Chenhao Lin
Subrat Kishore Dutta
Chao Shen
Tianwei Zhang
295
1
0
12 Jul 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
409
138
0
04 Jun 2024
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRL, KELM, AILaw
261
73
0
03 Jun 2024
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
Junhao Cheng
Xi Lu
Hanhui Li
Khun Loun Zai
Baiqiao Yin
Yuhao Cheng
Yiqiang Yan
Xiaodan Liang
DiffM, VGen
428
16
0
03 Jun 2024
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs
Zeyi Liao
Huan Sun
AAML
321
151
0
11 Apr 2024
Exploring Autonomous Agents through the Lens of Large Language Models: A Review
Saikat Barua
LM&MA, LLMAG
266
38
0
05 Apr 2024
Empowering Biomedical Discovery with AI Agents
Cell, 2024
Shanghua Gao
Ada Fang
Yepeng Huang
Valentina Giunchiglia
Ayush Noori
Jonathan Richard Schwarz
Yasha Ektefaie
Jovana Kondic
Marinka Zitnik
LLMAG, AI4CE
270
216
0
03 Apr 2024
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
Zhengqing Yuan
Ruoxi Chen
Zhaoxu Li
Haolong Jia
Lifang He
Chi Wang
Lichao Sun
VGen
333
41
0
20 Mar 2024
Data Interpreter: An LLM Agent For Data Science
Sirui Hong
Yizhang Lin
Bang Liu
Bangbang Liu
Binhao Wu
...
Jinhao Tu
Yaying Fei
Yuheng Cheng
Zongze Xu
Chenglin Wu
LLMAG, AI4CE
440
149
0
28 Feb 2024
Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark
Preslav Nakov
Tairan Wang
Qingqing Zhu
Taicheng Guo
Shen Gao
Zhiyong Lu
Xin Gao
Xiangliang Zhang
452
3
0
22 Feb 2024
TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
Qingfeng Lan
Xianjun Yang
Zelong Li
Cheng Wei
Yongfeng Zhang
LLMAG
448
4
0
02 Feb 2024
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang
Yangyi Chen
Lifan Yuan
Yizhe Zhang
Yunzhu Li
Yuan Yao
Heng Ji
ELM, LLMAG, LM&Ro
867
334
0
01 Feb 2024
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
Tongxin Yuan
Zhiwei He
Lingzhong Dong
Yiming Wang
Ruijie Zhao
...
Binglin Zhou
Fangqi Li
Zhuosheng Zhang
Rui Wang
Gongshen Liu
ELM
413
142
0
18 Jan 2024
Structured Chemistry Reasoning with Large Language Models
Siru Ouyang
Zhuosheng Zhang
Bing Yan
Xuan Liu
Yejin Choi
Jiawei Han
Lianhui Qin
LRM
170
27
0
16 Nov 2023