
Aligning AI With Shared Human Values

5 August 2020

Papers citing "Aligning AI With Shared Human Values"

50 / 463 papers shown

Inducing Human-like Biases in Moral Reasoning Language Models

235

23 Nov 2024

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

469

20 Nov 2024

Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment

Allison Huang

Yulu Niki Pi

Carlos Mougan

234

18 Nov 2024

Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets

Neural Information Processing Systems (NeurIPS), 2024

Ike Obi

Rohan Pant

Srishti Shekhar Agrawal

Maham Ghazanfar

Aaron Basiletti

233

18 Nov 2024

Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Antonia Karamolegkou

Sandrine Schiller Hansen

Ariadni Christopoulou

Filippos Stamatiou

Anne Lauscher

Anders Søgaard

170

12 Nov 2024

Minion: A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications

188

11 Nov 2024

Evaluating Moral Beliefs across LLMs through a Pluralistic Framework

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

266

06 Nov 2024

MoD: A Distribution-Based Approach for Merging Large Language Models

Quy-Anh Dang

Chris Ngo

MoMe VLM

252

01 Nov 2024

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Gabrielle Kaili-May Liu

619

30 Oct 2024

Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation

Yifang Chen

David Zhu

SyDa

136

27 Oct 2024

Improving Model Evaluation using SMART Filtering of Benchmark Datasets

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

704

26 Oct 2024

From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages

...

212

24 Oct 2024

PLDR-LLM: Large Language Model from Power Law Decoder Representations

Burc Gokden

145

22 Oct 2024

Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence

BigData Congress [Services Society] (BSS), 2024

...

Vasileios Mavroeidis

Audun Josang

263

20 Oct 2024

Speciesism in Natural Language Processing Research

AI and Ethics (AI & Ethics), 2024

Masashi Takeshita

Rafal Rzepka

222

18 Oct 2024

Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models

262

17 Oct 2024

BenTo: Benchmark Task Reduction with In-Context Transferability

Hongyu Zhao

Ming Li

Lichao Sun

Tianyi Zhou

298

17 Oct 2024

Learning to Route LLMs with Confidence Tokens

Yu-Neng Chuang

Helen Zhou

Prathusha Kameswara Sarma

285

17 Oct 2024

LLM-Human Pipeline for Cultural Context Grounding of Conversations

Rajkumar Pujari

Dan Goldwasser

266

17 Oct 2024

Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection

International Conference on Learning Representations (ICLR), 2024

312

14 Oct 2024

Evaluating Gender Bias of LLMs in Making Morality Judgements

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

160

13 Oct 2024

SocialGaze: Improving the Integration of Human Social Norms in Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

219

11 Oct 2024

Do Unlearning Methods Remove Information from Language Model Weights?

Aghyad Deeb

Fabien Roger

AAML MU

429

11 Oct 2024

TRIAGE: Ethical Benchmarking of AI Models Through Mass Casualty Simulations

Nathalie Maria Kirch

Konstantin Hebenstreit

Matthias Samwald

195

10 Oct 2024

Fine-Tuning Language Models for Ethical Ambiguity: A Comparative Study of Alignment with Human Responses

Pranav Senthilkumar

Visshwa Balasubramanian

146

10 Oct 2024

The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making

221

09 Oct 2024

Scaling Laws For Mixed Quantization

347

09 Oct 2024

Intuitions of Compromise: Utilitarianism vs. Contractualism

Jared Moore

Yejin Choi

Sydney Levine

230

07 Oct 2024

Unlocking Structured Thinking in Language Models with Cognitive Prompting

Oliver Kramer

Jill Baumann

ReLM LRM

276

03 Oct 2024

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

International Conference on Learning Representations (ICLR), 2024

Yu Ying Chiu

Liwei Jiang

Yejin Choi

334

03 Oct 2024

Examining the Role of Relationship Alignment in Large Language Models

Kristen M. Altenburger

179

02 Oct 2024

LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits

450

02 Oct 2024

Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability

Weitong Zhang

Chengqi Zang

Bernhard Kainz

221

01 Oct 2024

Predicting memorization within Large Language Models fine-tuned for classification

346

27 Sep 2024

Post-hoc Reward Calibration: A Case Study on Length Bias

International Conference on Learning Representations (ICLR), 2024

300

25 Sep 2024

JMedBench: A Benchmark for Evaluating Japanese Biomedical Large Language Models

International Conference on Computational Linguistics (COLING), 2024

192

20 Sep 2024

Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models

The Web Conference (WWW), 2024

Yazhou Zhang

375

19 Sep 2024

ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs

330

15 Sep 2024

DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning

...

Weipeng Chen

Guosheng Dong

Bin Cui

Wentao Zhang

217

02 Sep 2024

ToolACE: Winning the Points of LLM Function Calling

International Conference on Learning Representations (ICLR), 2024

...

303

112

02 Sep 2024

Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning

Maxime Méloux

Christophe Cerisara

KELM CLL

261

30 Aug 2024

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models

International Conference on Learning Representations (ICLR), 2024

497

27 Aug 2024

Investigating LLM Applications in E-Commerce

163

23 Aug 2024

Beyond Labels: Aligning Large Language Models with Human-like Reasoning

International Conference on Pattern Recognition (ICPR), 2024

Muhammad Rafsan Kabir

Rafeed Mohammad Sultan

Mohammad Ruhul Amin

190

20 Aug 2024

Promoting Equality in Large Language Models: Identifying and Mitigating the Implicit Bias based on Bayesian Theory

Xihe Qiu

Yinghui Xu

Yuan Qi

213

20 Aug 2024

Value Alignment from Unstructured Text

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Inkit Padhi

Karthikeyan N. Ramamurthy

227

19 Aug 2024

CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Yufei Huang

...

Tao Liu

Deyi Xiong

ELM

136

19 Aug 2024

How Well Do LLMs Identify Cultural Unity in Diversity?

Jialin Li

Junli Wang

Junjie Hu

Ming Jiang

229

09 Aug 2024

226

07 Aug 2024

Pula: Training Large Language Models for Setswana

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Nathan Brown

Vukosi Marivate

OSLM

317

05 Aug 2024