Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation

3 April 2022
Kushal Arora
Layla El Asri
Hareesh Bahuleyan
Jackie C.K. Cheung

Papers citing "Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation"

50 / 55 papers shown
Looking beyond the next token
Abitha Thankaraj
Yiding Jiang
J. Zico Kolter
Yonatan Bisk
LRM
57
1
0
15 Apr 2025
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
Lixing Xiao
Shunlin Lu
Huaijin Pi
Ke Fan
Liang Pan
Yueer Zhou
Ziyong Feng
Xiaowei Zhou
Sida Peng
Jingbo Wang
DiffM
VGen
43
3
0
19 Mar 2025
DatawiseAgent: A Notebook-Centric LLM Agent Framework for Automated Data Science
Ziming You
Yumiao Zhang
Dexuan Xu
Yiwei Lou
Yandong Yan
Wei Wang
H. Zhang
Yu Huang
LLMAG
57
0
0
10 Mar 2025
Frequency Autoregressive Image Generation with Continuous Tokens
Hu Yu
Hao Luo
Hangjie Yuan
Yu Rong
Feng Zhao
VGen
37
2
0
07 Mar 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
76
6
0
27 Feb 2025
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents
A. Lewis
Michael White
Jing Liu
T. Koike-Akino
K. Parsons
Y. Wang
HILM
58
0
0
26 Feb 2025
Patient Trajectory Prediction: Integrating Clinical Notes with Transformers
Sifal Klioui
Sana Sellami
Youssef Trardi
71
0
0
25 Feb 2025
Sequence-level Large Language Model Training with Contrastive Preference Optimization
Zhili Feng
Dhananjay Ram
Cole Hawkins
Aditya Rawal
Jinman Zhao
Sheng Zha
57
0
0
23 Feb 2025
Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models
Gyeongman Kim
Gyouk Chu
Eunho Yang
MoE
54
0
0
18 Feb 2025
Few-shot LLM Synthetic Data with Distribution Matching
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
75
0
0
09 Feb 2025
Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning
Manh Luong
Khai Nguyen
Dinh Q. Phung
Gholamreza Haffari
Lizhen Qu
47
0
0
08 Feb 2025
Decoupled Sequence and Structure Generation for Realistic Antibody Design
Nayoung Kim
Minsu Kim
Sungsoo Ahn
Jinkyoo Park
47
0
0
20 Jan 2025
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages
Junho Myung
Nayeon Lee
Yi Zhou
Jiho Jin
Rifki Afina Putri
...
Seid Muhie Yimam
Mohammad Taher Pilehvar
N. Ousidhoum
Jose Camacho-Collados
Alice H. Oh
87
32
0
17 Jan 2025
MeasureNet: Measurement Based Celiac Disease Identification
Aayush Kumar Tyagi
Vaibhav Mishra
Ashok Tiwari
Lalita Mehra
Prasenjit Das
G. Makharia
Prathosh AP
Mausam
75
0
0
02 Dec 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo
Yerin Hwang
Yongil Kim
Taegwan Kang
Hyunkyung Bae
Kyomin Jung
40
0
0
25 Oct 2024
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
LRM
27
3
0
17 Oct 2024
Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL
Qihuang Zhong
Kunfeng Chen
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
34
0
0
15 Oct 2024
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs
Yu Ying Chiu
Liwei Jiang
Bill Yuchen Lin
Chan Young Park
Shuyue Stella Li
...
Mehar Bhatia
Maria Antoniak
Yulia Tsvetkov
Vered Shwartz
Yejin Choi
ELM
ALM
45
18
0
03 Oct 2024
Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models
Jun Rao
Xuebo Liu
Zepeng Lin
Liang Ding
Jing Li
Dacheng Tao
Min Zhang
30
2
0
19 Sep 2024
LLMR: Knowledge Distillation with a Large Language Model-Induced Reward
Dongheng Li
Yongchang Hao
Lili Mou
34
1
0
19 Sep 2024
Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation
Muraleekrishna Gopinathan
Martin Masek
Jumana Abu-Khalaf
David Suter
LM&Ro
26
1
0
09 Sep 2024
Imitating Language via Scalable Inverse Reinforcement Learning
Markus Wulfmeier
Michael Bloesch
Nino Vieillard
Arun Ahuja
Jorg Bornschein
...
Jost Tobias Springenberg
Nikola Momchev
Olivier Bachem
Matthieu Geist
Martin Riedmiller
34
9
0
02 Sep 2024
Self-Improving Robust Preference Optimization
Eugene Choi
Arash Ahmadian
Matthieu Geist
Olivier Pietquin
M. G. Azar
23
8
0
03 Jun 2024
Dissociation of Faithful and Unfaithful Reasoning in LLMs
Evelyn Yee
Alice Li
Chenyu Tang
Yeon Ho Jung
R. Paturi
Leon Bergen
LRM
32
4
0
23 May 2024
Stream of Search (SoS): Learning to Search in Language
Kanishk Gandhi
Denise Lee
Gabriel Grand
Muxin Liu
Winson Cheng
Archit Sharma
Noah D. Goodman
RALM
AIFin
LRM
44
44
0
01 Apr 2024
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
Gyeongman Kim
Doohyuk Jang
Eunho Yang
VLM
35
13
0
20 Feb 2024
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
Ruijie Zheng
Ching-An Cheng
Hal Daumé
Furong Huang
Andrey Kolobov
27
9
0
16 Feb 2024
Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking
Yi Ren Fung
Ruining Zhao
Jae Doo
Chenkai Sun
Heng Ji
26
27
0
14 Feb 2024
DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko
Sungnyun Kim
Tianyi Chen
SeYoung Yun
61
25
0
06 Feb 2024
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
31
714
0
09 Nov 2023
Do Stochastic Parrots have Feelings Too? Improving Neural Detection of Synthetic Text via Emotion Recognition
Alan Cowap
Yvette Graham
Jennifer Foster
DeLMO
25
0
0
24 Oct 2023
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression
Adam Block
Dylan J. Foster
Akshay Krishnamurthy
Max Simchowitz
Cyril Zhang
23
4
0
17 Oct 2023
Multi-level Adaptive Contrastive Learning for Knowledge Internalization in Dialogue Generation
Chenxu Yang
Zheng Lin
Lanrui Wang
Chong Tian
Liang Pang
JiangNan Li
Qirong Ho
Yanan Cao
Weiping Wang
16
1
0
13 Oct 2023
Language Model Decoding as Direct Metrics Optimization
Haozhe Ji
Pei Ke
Hongning Wang
Minlie Huang
11
7
0
02 Oct 2023
Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation
Junjie Yang
Liang Ding
Li Shen
Matthieu Labeau
Yibing Zhan
Weifeng Liu
Dacheng Tao
VLM
21
4
0
28 Sep 2023
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
52
14
0
23 Aug 2023
Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction
Eunseop Yoon
Hee Suk Yoon
Dhananjaya N. Gowda
Soohwan Eom
Daehyeok Kim
John Harvill
Heting Gao
M. Hasegawa-Johnson
Chanwoo Kim
Chang-Dong Yoo
18
1
0
16 Aug 2023
Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation
Jian-Yu Guan
Minlie Huang
24
0
0
04 Jul 2023
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
Rishabh Agarwal
Nino Vieillard
Yongchao Zhou
Piotr Stańczyk
Sabela Ramos
Matthieu Geist
Olivier Bachem
35
84
0
23 Jun 2023
MiniLLM: Knowledge Distillation of Large Language Models
Yuxian Gu
Li Dong
Furu Wei
Minlie Huang
ALM
31
76
0
14 Jun 2023
SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking
Chris Cundy
Stefano Ermon
16
10
0
08 Jun 2023
Deductive Verification of Chain-of-Thought Reasoning
Z. Ling
Yunhao Fang
Xuanlin Li
Zhiao Huang
Mingu Lee
Roland Memisevic
Hao Su
ReLM
LRM
24
123
0
06 Jun 2023
How Language Model Hallucinations Can Snowball
Muru Zhang
Ofir Press
William Merrill
Alisa Liu
Noah A. Smith
HILM
LRM
78
252
0
22 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
21
25
0
15 May 2023
KEPR: Knowledge Enhancement and Plausibility Ranking for Generative Commonsense Question Answering
Zhifeng Li
Bowei Zou
Yifan Fan
Yu Hong
14
3
0
15 May 2023
Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation
Songming Zhang
Yunlong Liang
Shuaibo Wang
Wenjuan Han
Jian Liu
Jinan Xu
Yufeng Chen
16
7
0
14 May 2023
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Nitay Calderon
Subhabrata Mukherjee
Roi Reichart
Amir Kantor
24
17
0
03 May 2023
Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation
Fei Huang
Pei Ke
Minlie Huang
AI4CE
33
7
0
24 Apr 2023
Tailoring Language Generation Models under Total Variation Distance
Haozhe Ji
Pei Ke
Zhipeng Hu
Rongsheng Zhang
Minlie Huang
28
18
0
26 Feb 2023
Reward Gaming in Conditional Text Generation
Richard Yuanzhe Pang
Vishakh Padmakumar
Thibault Sellam
Ankur P. Parikh
He He
21
24
0
16 Nov 2022