
Do LLMs "know" internally when they follow instructions?

18 October 2024
Juyeon Heo
Christina Heinze-Deml
Oussama Elachqar
Shirley Ren
Udhay Nallasamy
Andy Miller
Kwan Ho Ryan Chan
Jaya Narain
arXiv:2410.14516 · PDF · HTML

Papers citing "Do LLMs "know" internally when they follow instructions?"

2 / 2 papers shown
Prefill-Based Jailbreak: A Novel Approach of Bypassing LLM Safety Boundary
Yakai Li
Jiekang Hu
Weiduan Sang
Luping Ma
Jing Xie
Weijuan Zhang
Aimin Yu
Shijie Zhao
Qingjia Huang
Qihang Zhou
AAML
28 Apr 2025
I'm Sorry Dave: How the old world of personnel security can inform the new world of AI insider risk
Paul Martin
Sarah Mercer
26 Mar 2025