arXiv:2406.11107
Exploring Safety-Utility Trade-Offs in Personalized Language Models
17 June 2024
Anvesh Rao Vijjini
Somnath Basu Roy Chowdhury
Snigdha Chaturvedi
Papers citing "Exploring Safety-Utility Trade-Offs in Personalized Language Models"
11 / 11 papers shown
The Power of Personality: A Human Simulation Perspective to Investigate Large Language Model Agents
Yifan Duan
Yihong Tang
Xuefeng Bai
Kehai Chen
J. Li
Min Zhang
LLMAG
69
0
0
28 Feb 2025
The Rise of Darkness: Safety-Utility Trade-Offs in Role-Playing Dialogue Agents
Yihong Tang
Kehai Chen
X. Bai
Zhengyu Niu
B. Wang
Jie Liu
Min Zhang
LLMAG
39
0
0
28 Feb 2025
A Survey of Personalized Large Language Models: Progress and Future Directions
Jiahong Liu
Zexuan Qiu
Zhongyang Li
Quanyu Dai
Jieming Zhu
Minda Hu
Menglin Yang
Irwin King
LM&MA
38
2
0
17 Feb 2025
Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization
Yu-Min Tseng
Yu-Chao Huang
Teng-Yun Hsiao
Yu-Ching Hsu
Chao-Wei Huang
Jia-Yin Foo
Yun-Nung Chen
LLMAG
235
63
0
03 Jun 2024
Context Steering: Controllable Personalization at Inference Time
Jerry Zhi-Yang He
Sashrika Pandey
Mariah L. Schrum
Anca Dragan
26
3
0
02 May 2024
Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs
Shashank Gupta
Vaishnavi Shrivastava
A. Deshpande
A. Kalyan
Peter Clark
Ashish Sabharwal
Tushar Khot
99
49
0
08 Nov 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
197
2,232
0
22 Mar 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Adversarial Scrubbing of Demographic Information for Text Classification
Somnath Basu Roy Chowdhury
Sayan Ghosh
Yiyuan Li
Junier B. Oliva
Shashank Srivastava
Snigdha Chaturvedi
34
14
0
17 Sep 2021
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
202
121
0
23 Jan 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
195
607
0
03 Sep 2019