Representation Degeneration Problem in Training Natural Language Generation Models

International Conference on Learning Representations (ICLR), 2019
28 July 2019
Jun Gao
Di He
Xu Tan
Tao Qin
Liwei Wang
Tie-Yan Liu

Papers citing "Representation Degeneration Problem in Training Natural Language Generation Models"

50 / 161 papers shown
SR-GRPO: Stable Rank as an Intrinsic Geometric Reward for Large Language Model Alignment
Yixuan Tang
Yi Yang
ALM
159
0
0
02 Dec 2025
Lifting Manifolds to Mitigate Pseudo-Alignment in LLM4TS
Liangwei Nathan Zheng
Wenhao Liang
Wei Emma Zhang
Miao Xu
Olaf Maennel
Weitong Chen
AI4TS
112
0
0
14 Oct 2025
Scaling Language-Centric Omnimodal Representation Learning
Chenghao Xiao
Hou Pong Chan
Hao Zhang
Weiwen Xu
Mahani Aljunied
Yu Rong
143
0
0
13 Oct 2025
Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zhaoxin Feng
Jianfei Ma
Emmanuele Chersoni
Xiaojing Zhao
Xiaoyi Bao
151
3
0
02 Oct 2025
Optimizing What Matters: AUC-Driven Learning for Robust Neural Retrieval
Nima Sheikholeslami
Erfan Hosseini
Patrice Bechard
Srivatsava Daruru
Sai Rajeswar
119
0
0
30 Sep 2025
Demystifying Network Foundation Models
Sylee Beltiukov
Satyandra Guthula
Wenbo Guo
W. Willinger
194
1
0
27 Sep 2025
Binary Autoencoder for Mechanistic Interpretability of Large Language Models
Hakaze Cho
Haolin Yang
Brian M. Kurkoski
Naoya Inoue
MQ
200
0
0
25 Sep 2025
Probability Signature: Bridging Data Semantics and Embedding Structure in Language Models
Junjie Yao
Zhi-hai Xu
139
0
0
24 Sep 2025
Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation
Evgeniia Tokarchuk
S. Troshin
Vlad Niculae
105
0
0
20 Sep 2025
Modality Alignment with Multi-scale Bilateral Attention for Multimodal Recommendation
Kelin Ren
Chan-Yang Ju
Dong-Ho Lee
112
0
0
11 Sep 2025
ECG-Soup: Harnessing Multi-Layer Synergy for ECG Foundation Models
P. Nguyen
Huy P Phan
Hieu Pham
Christos Chatzichristos
Bert Vandenberk
M. D. Vos
MedIm
264
1
0
27 Aug 2025
Vec2Summ: Text Summarization via Probabilistic Sentence Embeddings
Mao Li
Fred Conrad
Johann Gagnon-Bartsch
85
0
0
09 Aug 2025
From Neurons to Semantics: Evaluating Cross-Linguistic Alignment Capabilities of Large Language Models via Neurons Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chongxuan Huang
Yongshi Ye
Biao Fu
Qifeng Su
Xiaodong Shi
343
1
0
20 Jul 2025
Large Language Models Encode Semantics and Alignment in Linearly Separable Representations
Baturay Saglam
Paul Kassianik
Blaine Nelson
Sajana Weerawardhena
Yaron Singer
Amin Karbasi
161
3
0
13 Jul 2025
Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
IEEE International Conference on Data Engineering (ICDE), 2025
Yiming Niu
Jinliang Deng
Lulu Zhang
Zimu Zhou
Yongxin Tong
AI4TS
381
0
0
09 May 2025
llm-jp-modernbert: A ModernBERT Model Trained on a Large-Scale Japanese Corpus with Long Context Length
Issa Sugiura
Kouta Nakayama
Yusuke Oda
182
7
0
22 Apr 2025
Measuring Intrinsic Dimension of Token Embeddings
Takuya Kataiwa
Cho Hakaze
Tetsushi Ohki
231
4
0
04 Mar 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
427
17
0
20 Feb 2025
DEUCE: Dual-diversity Enhancement and Uncertainty-awareness for Cold-start Active Learning
Transactions of the Association for Computational Linguistics (TACL), 2024
Jiaxin Guo
Cheng Chen
Shuzhen Li
Tianze Zhang
414
1
0
01 Feb 2025
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
...
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli
457
389
0
18 Dec 2024
USTCCTSU at SemEval-2024 Task 1: Reducing Anisotropy for Cross-lingual Semantic Textual Relatedness Task
International Workshop on Semantic Evaluation (SemEval), 2024
Jianjian Li
Shengwei Liang
Yong Liao
Hongping Deng
Haiyang Yu
346
2
0
28 Nov 2024
Long-context Protein Language Modeling Using Bidirectional Mamba with Shared Projection Layers
bioRxiv (bioRxiv), 2024
Yingheng Wang
Zichen Wang
Gil Sadeh
Luca Zancato
Alessandro Achille
George Karypis
Huzefa Rangwala
422
1
0
29 Oct 2024
Building A Coding Assistant via the Retrieval-Augmented Language Model
Xinze Li
Hanbin Wang
Zhenghao Liu
S. Yu
Kaiyan Zhang
Shi Yu
Yukai Fu
Yu Gu
Ge Yu
3DV
RALM
186
10
0
21 Oct 2024
Self-Supervised Learning of Disentangled Representations for Multivariate Time-Series
Ching Chang
Chiao-Tung Chan
Wei-Yao Wang
Chao-Han Huck Yang
Tien-Fu Chen
AI4TS
274
1
0
16 Oct 2024
How much do contextualized representations encode long-range context?
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Simeng Sun
Cheng-Ping Hsieh
336
0
0
16 Oct 2024
Tackling Dimensional Collapse toward Comprehensive Universal Domain Adaptation
Hung-Chieh Fang
Po-Yi Lu
Hsuan-Tien Lin
260
0
0
15 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
International Conference on Learning Representations (ICLR), 2024
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
311
12
0
15 Oct 2024
Contrastive Learning for Implicit Social Factors in Social Media Popularity Prediction
Zhizhen Zhang
Ruihong Qiu
Xiaohui Xie
186
2
0
12 Oct 2024
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
Wenyuan Liu
Xindian Ma
Peng Zhang
Yan Wang
MQ
167
2
0
10 Oct 2024
Unveiling Transformer Perception by Exploring Input Manifolds
A. Benfenati
Alfio Ferrara
A. Marta
Davide Riva
Elisabetta Rocchetti
348
0
0
08 Oct 2024
NoTeeline: Supporting Real-Time, Personalized Notetaking with LLM-Enhanced Micronotes
International Conference on Intelligent User Interfaces (IUI), 2024
Faria Huq
Abdus Samee
David Chuan-en Lin
Xiaodi Alice Tang
Jeffrey P. Bigham
224
0
0
24 Sep 2024
Diversity-grounded Channel Prototypical Learning for Out-of-Distribution Intent Detection
Bo Liu
Liming Zhan
Yujie Feng
Zexin Lu
Chengqiang Xie
Lei Xue
Albert Y. S. Lam
Xiao-Ming Wu
OODD
263
1
0
17 Sep 2024
Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning
ACM Multimedia (MM), 2024
Hongze Zhu
Guoyang Xie
Chengbin Hou
Tao Dai
Can Gao
Jinbao Wang
Linlin Shen
3DPC
212
24
0
08 Aug 2024
Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models
Ying Zhang
Zhuoran Liu
Manabu Okumura
180
0
0
02 Aug 2024
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Chaofan Tao
Qian Liu
Longxu Dou
Niklas Muennighoff
Zhongwei Wan
Ping Luo
Min Lin
Ngai Wong
PILM
319
94
0
18 Jul 2024
One Stone, Four Birds: A Comprehensive Solution for QA System Using Supervised Contrastive Learning
Bo Wang
Tsunenori Mine
AAML
268
0
0
12 Jul 2024
Exploring the Impact of a Transformer's Latent Space Geometry on Downstream Task Performance
Anna C. Marbut
John W. Chandler
Travis J. Wheeler
284
1
0
18 Jun 2024
Understanding Token Probability Encoding in Output Embeddings
Hakaze Cho
Yoshihiro Sakai
Kenshiro Tanaka
Mariko Kato
Naoya Inoue
296
3
0
03 Jun 2024
Understanding and Minimising Outlier Features in Neural Network Training
Bobby He
Lorenzo Noci
Daniele Paliotta
Imanol Schlag
Thomas Hofmann
291
9
0
29 May 2024
On the Role of Attention Masks and LayerNorm in Transformers
Xinyi Wu
A. Ajorlou
Yifei Wang
Stefanie Jegelka
Ali Jadbabaie
258
29
0
29 May 2024
Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
Mustafa Shukor
Matthieu Cord
300
13
0
26 May 2024
DefSent+: Improving sentence embeddings of language models by projecting definition sentences into a quasi-isotropic or isotropic vector space of unlimited dictionary entries
Xiaodong Liu
405
0
0
25 May 2024
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
198
19
0
11 Apr 2024
Understanding Cross-Lingual Alignment -- A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Katharina Hämmerl
Jindřich Libovický
Kangyang Luo
270
31
0
09 Apr 2024
Event-enhanced Retrieval in Real-time Search
Yanan Zhang
Xiaoling Bai
Tianhua Zhou
231
2
0
09 Apr 2024
LAN: Learning Adaptive Neighbors for Real-Time Insider Threat Detection
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024
Xiangrui Cai
Yang Wang
Sihan Xu
Hao Li
Ying Zhang
Zheli Liu
Xiaojie Yuan
225
12
0
14 Mar 2024
Pixel Sentence Representation Learning
Chenghao Xiao
Zhuoxu Huang
Danlu Chen
G. Hudson
Yi Zhou
Haoran Duan
Chenghua Lin
Jie Fu
Jungong Han
Noura Al Moubayed
SSL
207
5
0
13 Feb 2024
NNOSE: Nearest Neighbor Occupational Skill Extraction
Mike Zhang
Rob van der Goot
Min-Yen Kan
Barbara Plank
247
10
0
30 Jan 2024
Anisotropy Is Inherent to Self-Attention in Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
247
29
0
22 Jan 2024
Why "classic" Transformers are shallow and how to make them go deep
Yueyao Yu
Yin Zhang
ViT
271
0
0
11 Dec 2023