Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2008.02275
Cited By
Aligning AI With Shared Human Values
5 August 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
J. Li
D. Song
Jacob Steinhardt
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Aligning AI With Shared Human Values"
50 / 347 papers shown
Title
NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context
Ben Yao
Qiuchi Li
Yazhou Zhang
Siyu Yang
Bohan Zhang
Prayag Tiwari
Jing Qin
26
0
0
13 May 2025
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2
Vytenis Šliogeris
Povilas Daniušis
Arturas Nakvosas
CLL
32
0
0
09 May 2025
Advancing and Benchmarking Personalized Tool Invocation for LLMs
X. Huang
Yuefeng Huang
W. Liu
Xingshan Zeng
Y. Wang
Ruiming Tang
Hong Xie
Defu Lian
50
0
0
07 May 2025
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
45
0
0
01 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
1
0
26 Apr 2025
Auditing the Ethical Logic of Generative AI Models
W. Russell Neuman
Chad Coleman
Ali Dasdan
Safinah Ali
Manan Shah
ELM
LRM
68
1
0
24 Apr 2025
The Digital Cybersecurity Expert: How Far Have We Come?
Dawei Wang
Geng Zhou
Xianglong Li
Yu Bai
Li Chen
Ting Qin
Jian Sun
D. Li
ELM
57
0
0
16 Apr 2025
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
Ayoung Lee
Ryan Sungmo Kwon
Peter Railton
Lu Wang
ELM
49
0
0
15 Apr 2025
HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving
Avinash Kumar
Shashank Nag
Jason Clemons
L. John
Poulami Das
26
0
0
14 Apr 2025
Visual moral inference and communication
Warren Zhu
Aida Ramezani
Yang Xu
31
0
0
12 Apr 2025
RAISE: Reinforenced Adaptive Instruction Selection For Large Language Models
Lv Qingsong
Yangning Li
Zihua Lan
Zishan Xu
Jiwei Tang
Yinghui Li
Wenhao Jiang
Hai-tao Zheng
Philip S. Yu
32
0
0
09 Apr 2025
Separator Injection Attack: Uncovering Dialogue Biases in Large Language Models Caused by Role Separators
Xitao Li
H. Wang
Jiang Wu
Ting Liu
AAML
26
0
0
08 Apr 2025
PipeDec: Low-Latency Pipeline-based Inference with Dynamic Speculative Decoding towards Large-scale Models
Haofei Yin
Mengbai Xiao
Rouzhou Lu
Xiao Zhang
Dongxiao Yu
Guanghui Zhang
AI4CE
24
0
0
05 Apr 2025
Entropy-Based Block Pruning for Efficient Large Language Models
Liangwei Yang
Yuhui Xu
Juntao Tan
Doyen Sahoo
S.
Caiming Xiong
H. Wang
Shelby Heinecke
AAML
23
0
0
04 Apr 2025
Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation
A. Myntti
Erik Henriksson
Veronika Laippala
S. Pyysalo
31
0
0
02 Apr 2025
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM
Kshitij Ambilduke
Ben Peters
Sonal Sannigrahi
Anil Keshwani
Tsz Kin Lam
Bruno Martins
Marcely Zanon Boito
André F. T. Martins
47
0
0
13 Mar 2025
Backtracking for Safety
Bilgehan Sel
Dingcheng Li
Phillip Wallis
Vaishakh Keshava
Ming Jin
Siddhartha Reddy Jonnalagadda
KELM
55
0
0
11 Mar 2025
Stay Focused: Problem Drift in Multi-Agent Debate
Jonas Becker
Lars Benedikt Kaesberg
Andreas Stephan
Jan Philip Wahle
Terry Ruas
Bela Gipp
57
1
0
26 Feb 2025
Speaking the Right Language: The Impact of Expertise Alignment in User-AI Interactions
Shramay Palta
Nirupama Chandrasekaran
Rachel Rudinger
Scott Counts
57
0
0
25 Feb 2025
Are Sparse Autoencoders Useful? A Case Study in Sparse Probing
Subhash Kantamneni
Joshua Engels
Senthooran Rajamanoharan
Max Tegmark
Neel Nanda
59
5
0
23 Feb 2025
Self-Taught Agentic Long Context Understanding
Yufan Zhuang
Xiaodong Yu
Jialian Wu
X. Sun
Z. Wang
Jiang Liu
Yusheng Su
Jingbo Shang
Zicheng Liu
Emad Barsoum
LRM
36
0
0
21 Feb 2025
Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral
Shivani Kumar
David Jurgens
LRM
41
0
0
21 Feb 2025
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation
Abdelrahman Abdallah
Bhawna Piryani
Jamshid Mozafari
Mohammed Ali
Adam Jatowt
81
1
0
21 Feb 2025
Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models
M. Russinovich
Ahmed Salem
MU
CLL
57
0
0
20 Feb 2025
Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages
Shreyan Biswas
Alexander Erlei
U. Gadiraju
103
4
0
13 Feb 2025
The Odyssey of the Fittest: Can Agents Survive and Still Be Good?
Dylan Waldner
Risto Miikkulainen
51
0
0
08 Feb 2025
Breaking Focus: Contextual Distraction Curse in Large Language Models
Yue Huang
Yanbo Wang
Zixiang Xu
Chujie Gao
Siyuan Wu
Jiayi Ye
Xiuying Chen
Pin-Yu Chen
X. Zhang
AAML
43
1
0
03 Feb 2025
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang
H. Shi
Ligong Han
Dimitris N. Metaxas
Hao Wang
BDL
UQLM
104
6
0
28 Jan 2025
The Goofus & Gallant Story Corpus for Practical Value Alignment
Md Sultan al Nahian
Tasmia Tasrin
Spencer Frazier
Mark O. Riedl
Brent Harrison
45
0
0
17 Jan 2025
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun
Penghan Wang
Fan Lai
113
1
0
15 Jan 2025
AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds
Yinfang Chen
Manish Shetty
Gagan Somashekar
Minghua Ma
Yogesh L. Simmhan
Jonathan Mace
Chetan Bansal
Rujia Wang
Saravan Rajmohan
46
1
0
12 Jan 2025
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
LLM-jp
Akiko Aizawa
Eiji Aramaki
Bowen Chen
Fei Cheng
...
Yuya Yamamoto
Yusuke Yamauchi
Hitomi Yanaka
Rio Yokota
Koichiro Yoshino
45
14
0
31 Dec 2024
M
3
^3
3
oralBench: A MultiModal Moral Benchmark for LVLMs
Bei Yan
Jie M. Zhang
Zhiyuan Chen
Shiguang Shan
Xilin Chen
ELM
41
1
0
31 Dec 2024
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization
Tan-Hanh Pham
Hoang-Nam Le
Phu-Vinh Nguyen
Chris Ngo
Truong Son-Hy
AuLLM
LRM
81
1
0
21 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
74
0
0
17 Dec 2024
Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor
Ashwin Baluja
65
3
0
01 Dec 2024
TAROT: Targeted Data Selection via Optimal Transport
Lan Feng
Fan Nie
Yuejiang Liu
Alexandre Alahi
OT
125
1
0
30 Nov 2024
Towards Robust Evaluation of Unlearning in LLMs via Data Transformations
Abhinav Joshi
Shaswati Saha
Divyaksh Shukla
Sriram Vema
Harsh Jhamtani
Manas Gaur
Ashutosh Modi
MU
83
3
0
23 Nov 2024
Inducing Human-like Biases in Moral Reasoning Language Models
Artem Karpov
Seong Hah Cho
Austin Meek
Raymond Koopmanschap
Lucy Farnik
Bogdan-Ionut Cirstea
LRM
66
0
0
23 Nov 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Haonan Wang
Qian Liu
Chao Du
Tongyao Zhu
Cunxiao Du
Kenji Kawaguchi
Tianyu Pang
109
6
0
20 Nov 2024
Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment
Allison Huang
Yulu Niki Pi
Carlos Mougan
71
0
0
18 Nov 2024
Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements
Antonia Karamolegkou
Sandrine Schiller Hansen
Ariadni Christopoulou
Filippos Stamatiou
Anne Lauscher
Anders Søgaard
31
0
0
12 Nov 2024
Minion: A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications
Xianzhe Fan
Qing Xiao
Xuhui Zhou
Yuran Su
Zhicong Lu
Maarten Sap
Hong Shen
38
0
0
11 Nov 2024
Evaluating Moral Beliefs across LLMs through a Pluralistic Framework
Xuelin Liu
Yanfei Zhu
Shucheng Zhu
Pengyuan Liu
Ying Liu
Dong Yu
31
1
0
06 Nov 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMe
VLM
21
0
0
01 Nov 2024
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu
Bowen Shi
Avi Caciularu
Idan Szpektor
Arman Cohan
58
3
0
30 Oct 2024
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation
Yifang Chen
David Zhu
SyDa
33
0
0
27 Oct 2024
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
70
1
0
26 Oct 2024
From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages
Artur Kiulian
Anton Polishko
M. Khandoga
Yevhen Kostiuk
Guillermo Gabrielli
...
Hrishikesh Garud
Wendy Wing Yee Mak
Dmytro Chaplynskyi
Selma Belhadj Amor
Grigol Peradze
32
0
0
24 Oct 2024
PLDR-LLM: Large Language Model from Power Law Decoder Representations
Burc Gokden
16
1
0
22 Oct 2024
1
2
3
4
5
6
7
Next