2210.10332
Revision Transformers: Instructing Language Models to Change their Values
19 October 2022
Felix Friedrich
Wolfgang Stammer
P. Schramowski
Kristian Kersting
KELM
Papers citing
"Revision Transformers: Instructing Language Models to Change their Values"
14 / 14 papers shown
SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs
Ruben Härle
Felix Friedrich
Manuel Brack
Bjorn Deiseroth
P. Schramowski
Kristian Kersting
23
0
0
11 Nov 2024
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Chen Cecilia Liu
Iryna Gurevych
Anna Korhonen
33
5
0
06 Jun 2024
Towards Measuring and Modeling "Culture" in LLMs: A Survey
Muhammad Farid Adilazuarda
Sagnik Mukherjee
Pradhyumna Lavania
Siddhant Singh
Alham Fikri Aji
Jacki O'Neill
Ashutosh Modi
Monojit Choudhury
50
53
0
05 Mar 2024
Learning by Self-Explaining
Wolfgang Stammer
Felix Friedrich
David Steinmann
Manuel Brack
Hikaru Shindo
Kristian Kersting
20
7
0
15 Sep 2023
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
Felix Friedrich
Manuel Brack
Lukas Struppek
Dominik Hintersdorf
P. Schramowski
Sasha Luccioni
Kristian Kersting
27
119
0
07 Feb 2023
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
225
500
0
28 Sep 2022
Does CLIP Know My Face?
Dominik Hintersdorf
Lukas Struppek
Manuel Brack
Felix Friedrich
P. Schramowski
Kristian Kersting
VLM
13
9
0
15 Sep 2022
Does Moral Code Have a Moral Code? Probing Delphi's Moral Philosophy
Kathleen C. Fraser
S. Kiritchenko
Esma Balkir
107
37
0
25 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
211
1,654
0
15 Oct 2021
Can Machines Learn Morality? The Delphi Experiment
Liwei Jiang
Jena D. Hwang
Chandra Bhagavatula
Ronan Le Bras
Jenny T Liang
...
Yulia Tsvetkov
Oren Etzioni
Maarten Sap
Regina A. Rini
Yejin Choi
FaML
117
110
0
14 Oct 2021
Interactively Providing Explanations for Transformer Language Models
Felix Friedrich
P. Schramowski
Christopher Tauchmann
Kristian Kersting
LRM
33
6
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,835
0
18 Apr 2021
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
406
2,576
0
03 Sep 2019