BadJudge: Backdoor Vulnerabilities of LLM-as-a-Judge. International Conference on Learning Representations (ICLR), 2025 |
Attention Tracker: Detecting Prompt Injection Attacks in LLMs. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 |