Word Embeddings Are Steers for Language Models

Word Embeddings Are Steers for Language Models

22 May 2023

Tarek F. Abdelzaher

Heng Ji

Papers citing "Word Embeddings Are Steers for Language Models"

10 / 10 papers shown

Title
Personalize Your LLM: Fake it then Align it Yijing Zhang Dyah Adila Changho Shin Frederic Sala 86 0 0 02 Mar 2025
PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent Jiateng Liu Lin Ai Zizhou Liu Payam Karisani Zheng Hui May Fung Preslav Nakov Julia Hirschberg Heng Ji DiffM 83 3 0 17 Feb 2025
Evaluating the Prompt Steerability of Large Language Models Erik Miehling Michael Desmond K. Ramamurthy Elizabeth M. Daly Pierre L. Dognin Jesus Rios Djallel Bouneffouf Miao Liu LLMSV 87 3 0 19 Nov 2024
Programming Refusal with Conditional Activation Steering Bruce W. Lee Inkit Padhi K. Ramamurthy Erik Miehling Pierre L. Dognin Manish Nagireddy Amit Dhurandhar LLMSV 89 13 0 06 Sep 2024
Controlled Text Generation with Natural Language Instructions Wangchunshu Zhou Yuchen Eleanor Jiang Ethan Gotlieb Wilcox Ryan Cotterell Mrinmaya Sachan 143 84 0 27 Apr 2023
Large Language Models are Zero-Shot Reasoners Takeshi Kojima S. Gu Machel Reid Yutaka Matsuo Yusuke Iwasawa ReLM LRM 291 2,712 0 24 May 2022
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP Timo Schick Sahana Udupa Hinrich Schütze 254 374 0 28 Feb 2021
Debiasing Pre-trained Contextualised Embeddings Masahiro Kaneko Danushka Bollegala 210 138 0 23 Jan 2021
The Woman Worked as a Babysitter: On Biases in Language Generation Emily Sheng Kai-Wei Chang Premkumar Natarajan Nanyun Peng 204 607 0 03 Sep 2019
Efficient Estimation of Word Representations in Vector Space Tomáš Mikolov Kai Chen G. Corrado J. Dean 3DV 228 29,632 0 16 Jan 2013