2405.17345
Exploring and steering the moral compass of Large Language Models
27 May 2024
Alejandro Tlaie
LLMSV
Papers citing "Exploring and steering the moral compass of Large Language Models"
4 / 4 papers shown
Title
Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple
100
0
0
24 Feb 2025
Programming Refusal with Conditional Activation Steering
Bruce W. Lee, Inkit Padhi, K. Ramamurthy, Erik Miehling, Pierre L. Dognin, Manish Nagireddy, Amit Dhurandhar
LLMSV
91
13
0
06 Sep 2024
How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions
Julia Kharchenko, Tanya Roosta, Aman Chadha, Chirag Shah
27
16
0
21 Jun 2024
In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah
240
456
0
24 Sep 2022