Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy

Nature Communications (Nat. Commun.), 2024

6 February 2024

Papers citing "Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy"

28 / 28 papers shown

SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents

219

29 May 2025

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

216

27 May 2025

Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

774

26 May 2025

ALRPHFS: Adversarially Learned Risk Patterns with Hierarchical Fast & Slow Reasoning for Robust Agent Defense

261

25 May 2025

SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator

481

23 May 2025

AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents

Haoyu Wang

Christopher M. Poskitt

Jun Sun

461

24 Mar 2025

Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

288

17 Feb 2025

ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain

...

259

23 Nov 2024

LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs

...

472

18 Oct 2024

MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization

International Conference on Learning Representations (ICLR), 2024

590

10 Oct 2024

FRACTURED-SORRY-Bench: Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses over SORRY-Bench

Aman Priyanshu

Supriti Vijay

AAML

224

28 Aug 2024

OpenHands: An Open Platform for AI Software Developers as Generalist Agents

...

Heng Ji

Graham Neubig

VLM

604

23 Jul 2024

Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)

443

20 Jul 2024

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Hung-yi Lee

292

13 Jul 2024

Systematic Categorization, Construction and Evaluation of New Attacks against Multi-modal Mobile GUI Agents

295

12 Jul 2024

AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

Sheng Wen

409

138

04 Jun 2024

Safeguarding Large Language Models: A Survey

...

261

03 Jun 2024

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

428

03 Jun 2024

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

Zeyi Liao

Huan Sun

AAML

321

151

11 Apr 2024

Exploring Autonomous Agents through the Lens of Large Language Models: A Review

Saikat Barua

LM&MA LLMAG

266

05 Apr 2024

Empowering Biomedical Discovery with AI Agents

Cell (Cell), 2024

Shanghua Gao

Ada Fang

Yepeng Huang

Valentina Giunchiglia

Ayush Noori

Jonathan Richard Schwarz

270

216

03 Apr 2024

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

Lichao Sun

333

20 Mar 2024

Data Interpreter: An LLM Agent For Data Science

...

440

149

28 Feb 2024

Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark

452

22 Feb 2024

TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution

448

02 Feb 2024

Executable Code Actions Elicit Better LLM Agents

Heng Ji

867

334

01 Feb 2024

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents

...

Rui Wang

413

142

18 Jan 2024

Structured Chemistry Reasoning with Large Language Models

Yejin Choi

170

16 Nov 2023