ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.00799
  4. Cited By
Are you still on track!? Catching LLM Task Drift with Activations

Are you still on track!? Catching LLM Task Drift with Activations

2 June 2024
Sahar Abdelnabi
Aideen Fay
Giovanni Cherubin
Ahmed Salem
Mario Fritz
Andrew J. Paverd
ArXivPDFHTML

Papers citing "Are you still on track!? Catching LLM Task Drift with Activations"

3 / 3 papers shown
Title
WildChat: 1M ChatGPT Interaction Logs in the Wild
WildChat: 1M ChatGPT Interaction Logs in the Wild
Wenting Zhao
Xiang Ren
Jack Hessel
Claire Cardie
Yejin Choi
Yuntian Deng
14
44
0
02 May 2024
The Instruction Hierarchy: Training LLMs to Prioritize Privileged
  Instructions
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Eric Wallace
Kai Y. Xiao
R. Leike
Lilian Weng
Johannes Heidecke
Alex Beutel
SILM
14
71
0
19 Apr 2024
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Julien Piet
Maha Alrashed
Chawin Sitawarin
Sizhe Chen
Zeming Wei
Elizabeth Sun
Basel Alomair
David A. Wagner
AAML
SyDa
50
22
0
29 Dec 2023
1