Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy

Nature Communications (Nat. Commun.), 2024

6 February 2024

Papers citing "Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy"

SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents

201

29 May 2025

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

204

27 May 2025

Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

742

26 May 2025

ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense

242

25 May 2025

SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator

469

23 May 2025

AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents

Haoyu Wang

Christopher M. Poskitt

Jun Sun

446

24 Mar 2025

Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

274

17 Feb 2025

ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain

...

256

23 Nov 2024

LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs

...

459

18 Oct 2024

MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization

International Conference on Learning Representations (ICLR), 2024

582

10 Oct 2024

FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses over SORRY-Bench

Aman Priyanshu

Supriti Vijay

AAML

206

28 Aug 2024

OpenHands: An Open Platform for AI Software Developers as Generalist Agents

...

Heng Ji

Graham Neubig

VLM

574

23 Jul 2024

Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)

432

20 Jul 2024

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Hung-yi Lee

279

13 Jul 2024

Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents

276

12 Jul 2024

AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

Sheng Wen

393

125

04 Jun 2024

Safeguarding Large Language Models: A Survey

...

254

03 Jun 2024

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

398

03 Jun 2024

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

Zeyi Liao

Huan Sun

AAML

294

149

11 Apr 2024

Exploring Autonomous Agents through the Lens of Large Language Models: A Review

Saikat Barua

LM&MA LLMAG

251

05 Apr 2024

Empowering Biomedical Discovery with AI Agents

Cell (Cell), 2024

Shanghua Gao

Ada Fang

Yepeng Huang

Valentina Giunchiglia

Ayush Noori

Jonathan Richard Schwarz

263

203

03 Apr 2024

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

Lichao Sun

302

20 Mar 2024

Data Interpreter: An LLM Agent For Data Science

...

421

144

28 Feb 2024

Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark

439

22 Feb 2024

TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution

421

02 Feb 2024

Executable Code Actions Elicit Better LLM Agents

Heng Ji

842

313

01 Feb 2024

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents

...

Rui Wang

391

136

18 Jan 2024

Structured Chemistry Reasoning with Large Language Models

Yejin Choi

145

16 Nov 2023