Zephyr: Direct Distillation of LM Alignment

25 October 2023
Lewis Tunstall
E. Beeching
Nathan Lambert
Nazneen Rajani
Kashif Rasul
Younes Belkada
Shengyi Huang
Leandro von Werra
Clémentine Fourrier
Nathan Habib
Nathan Sarrazin
Omar Sanseviero
Alexander M. Rush
Thomas Wolf
    ALM

Papers citing "Zephyr: Direct Distillation of LM Alignment"

50 / 257 papers shown
Large Language Models for Computer-Aided Design: A Survey
Licheng Zhang
Bach Le
Naveed Akhtar
Siew-Kei Lam
Tuan Ngo
3DV
AI4CE
32
0
0
13 May 2025
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions
Lata Pangtey
Anukriti Bhatnagar
Shubhi Bansal
Shahid Shafi Dar
Nagendra Kumar
18
0
0
13 May 2025
Camera Control at the Edge with Language Models for Scene Understanding
Alexiy Buynitsky
Sina Ehsani
Bhanu Pallakonda
Pragyana Mishra
VLM
30
0
0
09 May 2025
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
Xiaoyu Xu
Minxin Du
Qingqing Ye
Haibo Hu
MU
52
0
0
07 May 2025
Aligning Language Models for Icelandic Legal Text Summarization
Þórir Hrafn Harðarson
Hrafn Loftsson
Stefán Ólafsson
AILaw
AI4TS
ELM
75
0
0
25 Apr 2025
LLMTaxo: Leveraging Large Language Models for Constructing Taxonomy of Factual Claims from Social Media
H. Zhang
Zhengyuan Zhu
Zeyu Zhang
Chengkai Li
22
0
0
11 Apr 2025
2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization
Mengyang Li
Zhong Zhang
27
0
0
10 Apr 2025
Bridging the Gap Between Preference Alignment and Machine Unlearning
Xiaohua Feng
Yuyuan Li
Huwei Ji
Jiaming Zhang
L. Zhang
Tianyu Du
Chaochao Chen
MU
38
0
0
09 Apr 2025
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty
Xiaohua Feng
Yuyuan Li
C. Wang
Junlin Liu
L. Zhang
Chaochao Chen
MU
29
0
0
09 Apr 2025
Revealing the Intrinsic Ethical Vulnerability of Aligned Large Language Models
Jiawei Lian
Jianhong Pan
L. Wang
Yi Wang
Shaohui Mei
Lap-Pui Chau
AAML
24
0
0
07 Apr 2025
R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation
M. Weyssow
Chengran Yang
Junkai Chen
Yikun Li
Huihui Huang
...
Han Wei Ang
Frank Liauw
Eng Lieh Ouh
Lwin Khin Shar
David Lo
LRM
33
0
0
07 Apr 2025
Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy
Yinan Sun
Xiongkuo Min
Zicheng Zhang
Yixuan Gao
Y. Cao
Guangtao Zhai
VLM
59
0
0
26 Mar 2025
InCo-DPO: Balancing Distribution Shift and Data Quality for Enhanced Preference Optimization
Yunan Wang
Jijie Li
Bo Zhang
Liangdong Wang
Guang Liu
58
0
0
20 Mar 2025
R.U.Psycho? Robust Unified Psychometric Testing of Language Models
Julian Schelb
Orr Borin
David Garcia
Andreas Spitz
37
0
0
13 Mar 2025
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees
Zhiyuan Zeng
Yizhong Wang
Hannaneh Hajishirzi
Pang Wei Koh
ELM
53
3
0
11 Mar 2025
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs
Jongwoo Ko
Tianyi Chen
Sungnyun Kim
Tianyu Ding
Luming Liang
Ilya Zharkov
Se-Young Yun
VLM
83
0
0
10 Mar 2025
Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning
Shengyao Zhuang
Xueguang Ma
Bevan Koopman
Jimmy Lin
Guido Zuccon
LRM
53
1
0
08 Mar 2025
Stackelberg Game Preference Optimization for Data-Efficient Alignment of Language Models
Xu Chu
Zhixin Zhang
Tianyu Jia
Yujie Jin
72
0
0
25 Feb 2025
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao
Fanqi Wan
Jiajian Guo
Xiaojun Quan
Qifan Wang
ALM
58
0
0
25 Feb 2025
Simplify RLHF as Reward-Weighted SFT: A Variational Method
Yuhao Du
Z. Li
Pengyu Cheng
Zhihong Chen
Yuejiao Xie
Xiang Wan
Anningzhe Gao
35
1
0
20 Feb 2025
Small Models Struggle to Learn from Strong Reasoners
Yuetai Li
Xiang Yue
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Bill Yuchen Lin
Bhaskar Ramasubramanian
Radha Poovendran
LRM
44
12
0
17 Feb 2025
Enhancing Knowledge Graph Construction: Evaluating with Emphasis on Hallucination, Omission, and Graph Similarity Metrics
Hussam Ghanem
C. Cruz
56
0
0
07 Feb 2025
Improving Your Model Ranking on Chatbot Arena by Vote Rigging
Rui Min
Tianyu Pang
Chao Du
Qian Liu
Minhao Cheng
Min-Bin Lin
AAML
57
2
0
29 Jan 2025
Learning to Explore and Select for Coverage-Conditioned Retrieval-Augmented Generation
Takyoung Kim
Kyungjae Lee
Y. Jang
Ji Yong Cho
Gangwoo Kim
Minseok Cho
Moontae Lee
92
0
0
28 Jan 2025
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings
Tong Liu
Xiao Yu
Wenxuan Zhou
Jindong Gu
Volker Tresp
37
0
0
11 Jan 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
48
0
0
07 Jan 2025
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
Eunsu Kim
Juyoung Suk
Seungone Kim
Niklas Muennighoff
Dongkwan Kim
Alice H. Oh
ELM
78
1
0
31 Dec 2024
Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model
Yuzhong Hong
Hanshan Zhang
Junwei Bao
Hongfei Jiang
Yang Song
OffRL
74
1
0
18 Dec 2024
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Haozhao Wang
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
112
1
0
18 Dec 2024
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models
Yuchen Fan
Yuzhong Hong
Qiushi Wang
Junwei Bao
Hongfei Jiang
Yang Song
65
1
0
17 Dec 2024
Evaluating Zero-Shot Multilingual Aspect-Based Sentiment Analysis with Large Language Models
Chengyan Wu
Bolei Ma
Zheyu Zhang
Ningyuan Deng
Yanqing He
Yun Xue
LRM
78
0
0
17 Dec 2024
MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples
Shuo Xie
Fangzhi Zhu
Jiahui Wang
Lulu Wen
Wei Dai
Xiaowei Chen
Junxiong Zhu
Kai Zhou
Bo Zheng
66
0
0
13 Dec 2024
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
...
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
Mamba
100
21
0
20 Nov 2024
The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection
Tomas Horych
Christoph Mandl
Terry Ruas
André Greiner-Petter
Bela Gipp
Akiko Aizawa
Timo Spinde
96
4
0
17 Nov 2024
Self-Calibrated Listwise Reranking with Large Language Models
Ruiyang Ren
Yuhao Wang
K. Zhou
Wayne Xin Zhao
W. Wang
Jing Liu
Ji-Rong Wen
Tat-Seng Chua
LRM
KELM
36
0
0
07 Nov 2024
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
Sonia K. Murthy
Tomer Ullman
Jennifer Hu
ALM
41
10
0
07 Nov 2024
TODO: Enhancing LLM Alignment with Ternary Preferences
Yuxiang Guo
Lu Yin
Bo Jiang
Jiaqi Zhang
33
1
0
02 Nov 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Y. Qi
Hao Peng
X. Wang
Bin Xu
Lei Hou
Juanzi Li
56
0
0
31 Oct 2024
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Y. Liu
Argyris Oikonomou
Weiqiang Zheng
Yang Cai
Arman Cohan
29
1
0
30 Oct 2024
Fine-tuning Large Language Models for DGA and DNS Exfiltration Detection
Md Abu Sayed
Asif Rahman
Christopher Kiekintveld
Sebastian Garcia
25
0
0
29 Oct 2024
Fine-Tuning and Evaluating Open-Source Large Language Models for the Army Domain
Daniel C. Ruiz
John Sell
11
1
0
27 Oct 2024
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Yujuan Fu
Özlem Uzuner
Meliha Yetisgen
Fei Xia
55
3
0
24 Oct 2024
Augmenting Legal Decision Support Systems with LLM-based NLI for Analyzing Social Media Evidence
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Kanwal Mehreen
Subhasya Tippareddy
Ashay Srivastava
AILaw
25
0
0
21 Oct 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja
Lester James Validad Miranda
Shayekh Bin Islam
Rishabh Maheshwary
Drishti Sharma
Gusti Winata
Nathan Lambert
Sebastian Ruder
Sara Hooker
Marzieh Fadaee
LRM
35
15
0
20 Oct 2024
Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens
Zhepeng Cen
Yao Liu
Siliang Zeng
Pratik Chaudhar
Huzefa Rangwala
George Karypis
Rasool Fakoor
SyDa
AIFin
18
3
0
18 Oct 2024
Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models
Iaroslav Chelombitko
Egor Safronov
Aleksey Komissarov
15
1
0
16 Oct 2024
Understanding Likelihood Over-optimisation in Direct Alignment Algorithms
Zhengyan Shi
Sander Land
Acyr F. Locatelli
Matthieu Geist
Max Bartolo
46
4
0
15 Oct 2024
SkillAggregation: Reference-free LLM-Dependent Aggregation
Guangzhi Sun
Anmol Kagrecha
Potsawee Manakul
Phil Woodland
Mark J. F. Gales
22
0
0
14 Oct 2024
Modeling User Preferences with Automatic Metrics: Creating a High-Quality Preference Dataset for Machine Translation
Sweta Agrawal
José G. C. de Souza
Ricardo Rei
António Farinhas
Gonçalo Faria
Patrick Fernandes
Nuno M. Guerreiro
Andre Martins
24
5
0
10 Oct 2024
OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting
Xukai Liu
Ye Liu
Kai Zhang
Kehang Wang
Qi Liu
Enhong Chen
29
1
0
10 Oct 2024