The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

4 December 2023
Bill Yuchen Lin
Abhilasha Ravichander
Ximing Lu
Nouha Dziri
Melanie Sclar
Khyathi Chandu
Chandra Bhagavatula
Yejin Choi

Papers citing "The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning"

50 / 112 papers shown
Self-Guided Defense: Adaptive Safety Alignment for Reasoning Models via Synthesized Guidelines
Yuhang Wang
Yanxu Zhu
Dongyuan Lu
Jitao Sang
26 Nov 2025
Factors That Support Grounded Responses in LLM Conversations: A Rapid Review
Gabriele Cesar Iwashima
Claudia Susie Rodrigues
Claudio Dipolitto
Geraldo Xexéo
24 Nov 2025
Rethinking Deep Alignment Through The Lens Of Incomplete Learning
Thong Bach
D. Nguyen
T. Le
T. Tran
15 Nov 2025
Data Trajectory Alignment for LLM Domain Adaptation: A Two-Phase Synthesis Framework for Telecommunications Mathematics
Z. Zhou
Jing Li
Suming Qiu
J. Huang
Linyuan Qiu
Zhijie Sun
10 Nov 2025
Inference-Time Personalized Alignment with a Few User Preference Queries
Victor-Alexandru Pădurean
Parameswaran Kamalaruban
Nachiket Kotalwar
Alkis Gotovos
Adish Singla
04 Nov 2025
Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing
Rongzhi Zhang
Meghaj Tarte
Yuzhao Heng
Xiang Chen
Tong Yu
Lingkai Kong
Sudheer Chava
Chao Zhang
14 Oct 2025
ADEPT: Continual Pretraining via Adaptive Expansion and Dynamic Decoupled Tuning
Jinyang Zhang
Yue Fang
Hongxin Ding
Weibin Liao
Muyang Ye
Xu Chu
Junfeng Zhao
Yasha Wang
11 Oct 2025
IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery in the Absence of Tabular Data
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Tao Feng
Lizhen Qu
Niket Tandon
Gholamreza Haffari
10 Oct 2025
Understanding the Effects of Domain Finetuning on LLMs
Eshaan Tanwar
Deepak Nathani
William Yang Wang
Tanmoy Chakraborty
10 Oct 2025
Reasoning for Hierarchical Text Classification: The Case of Patents
Lekang Jiang
Wenjun Sun
Stephan Goetz
08 Oct 2025
Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance
Ahmed Alajrami
Xingwei Tan
Nikolaos Aletras
03 Oct 2025
Shape Happens: Automatic Feature Manifold Discovery in LLMs via Supervised Multi-Dimensional Scaling
Federico Tiblias
Irina Bigoulaeva
Jingcheng Niu
Simone Balloccu
Iryna Gurevych
01 Oct 2025
On Theoretical Interpretations of Concept-Based In-Context Learning
Huaze Tang
Tianren Peng
Shao-Lun Huang
25 Sep 2025
Diagnosing the Performance Trade-off in Moral Alignment: A Case Study on Gender Stereotypes
Guangliang Liu
Bocheng Chen
Xitong Zhang
K. Johnson
25 Sep 2025
MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models
Siyu Yan
Long Zeng
Xuecheng Wu
Chengcheng Han
Kongcheng Zhang
Chong Peng
Xuezhi Cao
Xunliang Cai
Chenjuan Guo
18 Sep 2025
RoboInspector: Unveiling the Unreliability of Policy Code for LLM-enabled Robotic Manipulation
Chenduo Ying
L. Du
Peng Cheng
Yuanchao Shu
29 Aug 2025
Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap
Jun Wang
Ninglun Gu
Kailai Zhang
Zijiao Zhang
Yelun Bao
...
Liwei Liu
Yihuan Liu
Pengyong Li
Gary G. Yen
Junchi Yan
26 Aug 2025
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
Khaoula Chehbouni
Mohammed Haddou
Jackie CK Cheung
G. Farnadi
25 Aug 2025
Speculative Safety-Aware Decoding
Xuekang Wang
Shengyu Zhu
Xueqi Cheng
25 Aug 2025
NeuronTune: Fine-Grained Neuron Modulation for Balanced Safety-Utility Alignment in LLMs
Birong Pan
Mayi Xu
Qiankun Pi
Jianhao Chen
Yuanyuan Zhu
Ming Zhong
T. Qian
13 Aug 2025
IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization
Yuzhuo Bai
Shitong Duan
Muhua Huang
Jing Yao
Zhenghao Liu
Peng Zhang
Tun Lu
Xiaoyuan Yi
Maosong Sun
Xing Xie
12 Aug 2025
A Survey on Training-free Alignment of Large Language Models
Birong Pan
Yongqi Li
Jiasheng Si
Sibo Wei
Mayi Xu
Shen Zhou
Yuanyuan Zhu
Ming Zhong
T. Qian
12 Aug 2025
P-Aligner: Enabling Pre-Alignment of Language Models via Principled Instruction Synthesis
Feifan Song
Bofei Gao
Yifan Song
Yi Liu
Weimin Xiong
Yuyang Song
Tianyu Liu
Guoyin Wang
Houfeng Wang
06 Aug 2025
The Homogenizing Effect of Large Language Models on Human Expression and Thought
Zhivar Sourati
Alireza S. Ziabari
Morteza Dehghani
02 Aug 2025
Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations
Wenhao Wang
Yanyan Li
Long Jiao
Jiawei Yuan
02 Jul 2025
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Maggie Huan
Yuetai Li
Tuney Zheng
Xiaoyu Xu
Seungone Kim
Minxin Du
Radha Poovendran
Graham Neubig
Xiang Yue
01 Jul 2025
LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
Chenghao Yang
Ari Holtzman
22 Jun 2025
LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning
Gabrel J. Perin
Runjin Chen
Xuxi Chen
Nina S. T. Hirata
Zinan Lin
Junyuan Hong
18 Jun 2025
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
Zijie Wu
Chaohui Yu
Fan Wang
Xiang Bai
11 Jun 2025
SoK: Machine Unlearning for Large Language Models
Jie Ren
Yue Xing
Yingqian Cui
Charu C. Aggarwal
Hui Liu
10 Jun 2025
Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Feifan Song
Shaohang Wei
Wen Luo
Yuxuan Fan
Tianyu Liu
Guoyin Wang
Houfeng Wang
09 Jun 2025
United Minds or Isolated Agents? Exploring Coordination of LLMs under Cognitive Load Theory
HaoYang Shang
Xuan Liu
Zi Liang
J. Zhang
Haibo Hu
Song Guo
07 Jun 2025
High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning
Tim Franzmeyer
Archie Sravankumar
Lijuan Liu
Yuning Mao
Rui Hou
Sinong Wang
Jakob Foerster
Luke Zettlemoyer
Madian Khabsa
04 Jun 2025
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu
Faisal Hamman
Sanghamitra Dutta
02 Jun 2025
RAST: Reasoning Activation in LLMs via Small-model Transfer
Siru Ouyang
Xinyu Zhu
Zilin Xiao
Minhao Jiang
Yu Meng
Jiawei Han
30 May 2025
TRIDENT: Enhancing Large Language Model Safety with Tri-Dimensional Diversified Red-Teaming Data Synthesis
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xiaorui Wu
Xiaofeng Mao
Fei Li
Xin Zhang
Xuanhong Li
Chong Teng
Donghong Ji
Zhuang Li
30 May 2025
Adaptive Detoxification: Safeguarding General Capabilities of LLMs through Toxicity-Aware Knowledge Editing
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yifan Lu
Jing Li
Yigeng Zhou
Yihui Zhang
Wenya Wang
Xiucheng Li
Meishan Zhang
Fangming Liu
Jun-chen Yu
Min Zhang
28 May 2025
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models
Yi Liu
Dianqing Liu
Mingye Zhu
Junbo Guo
Yongdong Zhang
Zhendong Mao
26 May 2025
The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
Kefan Yu
Qingcheng Zeng
Weihao Xuan
Wanxin Li
Jingyi Wu
Rob Voigt
24 May 2025
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
Xiaohao Liu
Xiaobo Xia
Weixiang Zhao
Manyi Zhang
Xianzhi Yu
Xiu Su
Shuo Yang
See-Kiong Ng
Tat-Seng Chua
23 May 2025
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Fanqi Wan
Weizhou Shen
Shengyi Liao
Yingcheng Shi
Chenliang Li
Ziyi Yang
Ji Zhang
Fei Huang
Jingren Zhou
Ming Yan
23 May 2025
One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models
Haoran Gu
Handing Wang
Yi Mei
Mengjie Zhang
Yaochu Jin
12 May 2025
LLAMAPIE: Proactive In-Ear Conversation Assistants
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Tuochao Chen
Nicholas Batchelder
Alisa Liu
Noah A. Smith
Shyamnath Gollakota
07 May 2025
Base Models Beat Aligned Models at Randomness and Creativity
Peter West
Christopher Potts
30 Apr 2025
Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
Takuma Udagawa
Yang Zhao
H. Kanayama
Bishwaranjan Bhattacharjee
19 Apr 2025
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
Ayoung Lee
Ryan Sungmo Kwon
Peter Railton
Lu Wang
15 Apr 2025
A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models
Zhouhang Xie
Junda Wu
Yiran Shen
Yu Xia
Xintong Li
...
Sachin Kumar
Bodhisattwa Prasad Majumder
Jingbo Shang
Prithviraj Ammanabrolu
Julian McAuley
09 Apr 2025
Representation Bending for Large Language Model Safety
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ashkan Yousefpour
Taeheon Kim
Ryan S. Kwon
Seungbeen Lee
Wonje Jeung
Seungju Han
Alvin Wan
Harrison Ngan
Youngjae Yu
Jonghyun Choi
02 Apr 2025
LightDefense: A Lightweight Uncertainty-Driven Defense against Jailbreaks via Shifted Token Distribution
Zhuoran Yang
Jie Peng
02 Apr 2025
Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
S. Lam
Qingcheng Zeng
Jingyi Wu
Rob Voigt
21 Mar 2025
Page 1 of 3