Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of Machine Learning Research (JMLR), 2019
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 11,958 papers shown
Title
ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
Lin Xv
Jingsheng Gao
Xian Gao
Ting Liu
Yuzhuo Fu
96
0
0
22 Oct 2025
Restoring Pruned Large Language Models via Lost Component Compensation
Zijian Feng
Hanzhang Zhou
Zixiao Zhu
Tianjiao Li
Jia Jim Deryl Chua
Lee Onn Mak
Gee Wah Ng
Kezhi Mao
125
0
0
22 Oct 2025
Data-Centric Lessons To Improve Speech-Language Pretraining
Vishaal Udandarao
Zhiyun Lu
Xuankai Chang
Yongqiang Wang
Violet Z. Yao
Albin Madapally Jose
Fartash Faghri
Josh Gardner
Chung-Cheng Chiu
124
0
0
22 Oct 2025
A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring
Julian Schulz
LRM
104
0
0
22 Oct 2025
CPSVD: Enhancing Large Language Model Compression via Column-Preserving Singular Value Decomposition
Lin Xv
Jingsheng Gao
Xian Gao
Ting Li
Yuzhuo Fu
52
0
0
22 Oct 2025
Energy-Efficient and Dequantization-Free Q-LLMs: A Spiking Neural Network Approach to Salient Value Mitigation
Chenyu Wang
Zhanglu Yan
Zhi Zhou
Xu Chen
Weng-Fai Wong
MQ
148
0
0
22 Oct 2025
ELUTQ: Efficient LUT-Aware Quantization for Deploying Large Language Models on Edge Devices
Xin Nie
Liang Dong
H. Zhang
JiaWang Xiao
G. Sun
MQ
380
0
0
22 Oct 2025
Difficulty-Controllable Multiple-Choice Question Generation Using Large Language Models and Direct Preference Optimization
Yuto Tomikawa
Masaki Uto
124
0
0
22 Oct 2025
From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering
Lei Li
Xiao Zhou
Y. Zhang
X. Wu
RALM, MedIm
147
0
0
21 Oct 2025
ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning
Xiaohan Qin
Xiaoxing Wang
Ning Liao
Cancheng Zhang
Xiangdong Zhang
Mingquan Feng
Jingzhi Wang
Junchi Yan
126
0
0
21 Oct 2025
Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning
Chenghao Zhu
Meiling Tao
Tiannan Wang
Dongyi Ding
Yuchen Eleanor Jiang
Wangchunshu Zhou
132
2
0
21 Oct 2025
CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs
Shaobo Wang
Yongliang Miao
Yuancheng Liu
Qianli Ma
Ning Liao
Linfeng Zhang
LRM
145
1
0
21 Oct 2025
Reasoning Language Model Inference Serving Unveiled: An Empirical Study
Qi Li
Junpan Wu
Xiang Liu
Yuxin Wang
Z. Li
Zhenheng Tang
Yuhan Chen
Shaohuai Shi
Xiaowen Chu
ReLM, LRM
232
1
0
21 Oct 2025
CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
Soroush Tabesh
M. Safaryan
Alexandra Volkova
Dan Alistarh
MQ
183
0
0
21 Oct 2025
Large language models for folktale type automation based on motifs: Cinderella case study
Tjaša Arčon
Marko Robnik-Šikonja
Polona Tratnik
40
0
0
21 Oct 2025
Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection
Hongyi He
Xiao Liu
Zhenghao Lin
Mingni Tang
Y. Cheng
Jintao Wang
W. Li
Peng Cheng
Yeyun Gong
OODD
161
0
0
21 Oct 2025
Identity-Aware Large Language Models require Cultural Reasoning
Alistair Plum
Anne-Marie Lutgen
Christoph Purschke
Achim Rettinger
LRM
88
0
0
21 Oct 2025
Unbiased Gradient Low-Rank Projection
Rui Pan
Yang Luo
Yuxing Liu
Yang You
Tong Zhang
136
0
0
20 Oct 2025
AFRICAPTION: Establishing a New Paradigm for Image Captioning in African Languages
Mardiyyah Oduwole
Prince Mireku
Fatimo Adebanjo
Oluwatosin Olajide
Mahi Aminu Aliyu
Jekaterina Novikova
89
0
0
20 Oct 2025
Rethinking On-policy Optimization for Query Augmentation
Zhichao Xu
Shengyao Zhuang
Xueguang Ma
Bingsen Chen
Yijun Tian
Fengran Mo
Jie Cao
Vivek Srikumar
RALM, LRM
135
0
0
20 Oct 2025
Annotation-Efficient Universal Honesty Alignment
Shiyu Ni
Keping Bi
Jiafeng Guo
Minghao Tang
Jingtong Wu
Zengxin Han
Xueqi Cheng
HILM
132
0
0
20 Oct 2025
Benchmarking Probabilistic Time Series Forecasting Models on Neural Activity
Ziyu Lu
Anna J. Li
Alexander E. Ladd
Pascha Matveev
Aditya Deole
Eric Shea-Brown
J. Nathan Kutz
Nicholas A. Steinmetz
BDL, AI4TS
162
0
0
20 Oct 2025
Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains
Austin Xu
Xuan-Phi Nguyen
Yilun Zhou
Chien-Sheng Wu
Caiming Xiong
Shafiq Joty
OffRL, ALM, LRM, ELM
213
0
0
20 Oct 2025
Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models
Dayan Pan
Zhaoyang Fu
Jingyuan Wang
Xiao Han
Yue Zhu
Xiangyu Zhao
KELM, CLL
112
0
0
20 Oct 2025
Generation then Reconstruction: Accelerating Masked Autoregressive Models via Two-Stage Sampling
Feihong Yan
P. Wang
Yao Zhu
Kaiyu Pang
Qingyan Wei
Huiqi Li
Linfeng Zhang
DiffM
110
0
0
20 Oct 2025
DSEBench: A Test Collection for Explainable Dataset Search with Examples
Qing Shi
Jing He
Qiaosheng Chen
Gong Cheng
125
0
0
20 Oct 2025
DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning
Yongxin He
Shan Zhang
Yixuan Cao
Lei Ma
Ping Luo
DeLMO
204
0
0
20 Oct 2025
Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization
Masahiro Kaneko
Zeerak Talat
Timothy Baldwin
AAML
133
1
0
19 Oct 2025
All You Need is One: Capsule Prompt Tuning with a Single Vector
Yiyang Liu
James Chenhao Liang
Heng Fan
Wenhao Yang
Yiming Cui
Xiaotian Han
Lifu Huang
Dongfang Liu
Qifan Wang
Cheng Han
VLM
113
1
0
19 Oct 2025
Uncovering Brain-Like Hierarchical Patterns in Vision-Language Models through fMRI-Based Neural Encoding
Yudan Ren
Xinlong Wang
Kexin Wang
Tian Xia
Zihan Ma
Zhaowei Li
Xiangrong Bi
Xiao Li
Xiaowei He
68
0
0
19 Oct 2025
Watermark Robustness and Radioactivity May Be at Odds in Federated Learning
Leixu Huang
Zedian Shao
Teodora Baluta
WaLM
207
0
0
19 Oct 2025
Connecting Domains and Contrasting Samples: A Ladder for Domain Generalization
Knowledge Discovery and Data Mining (KDD), 2025
Tianxin Wei
Yifan Chen
Xinrui He
Wenxuan Bao
Jingrui He
143
4
0
19 Oct 2025
Back to Bytes: Revisiting Tokenization Through UTF-8
Amit Moryossef
Clara Meister
Pavel Stepachev
Desmond Elliott
87
0
0
19 Oct 2025
Neuronal Group Communication for Efficient Neural representation
Zhengqi Pei
Qingming Huang
Shuhui Wang
103
0
0
19 Oct 2025
Mixed-Precision Quantization for Language Models: Techniques and Prospects
M. Rakka
Marios Fournarakis
Olga Krestinskaya
Jinane Bazzi
K. Salama
Fadi J. Kurdahi
A. Eltawil
M. Fouda
MQ
187
0
0
19 Oct 2025
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu
Y. Zhang
Yiming Dong
Chenheng Zhang
Cong Fang
Kun Yuan
Zhouchen Lin
131
0
0
19 Oct 2025
SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning
Xiaojun Guo
Runyu Zhou
Yifei Wang
Qi Zhang
Chenheng Zhang
...
Xiaohan Wang
Jiajun Chai
Guojun Yin
Wei Lin
Y. Wang
LRM, VLM
132
2
0
18 Oct 2025
SHIELD: Suppressing Hallucinations In LVLM Encoders via Bias and Vulnerability Defense
Y. Huang
Liang Shi
Yitian Zhang
Yi Tian Xu
Yun Fu
AAML
88
0
0
18 Oct 2025
Probing the Hidden Talent of ASR Foundation Models for L2 English Oral Assessment
Fu-An Chao
Bi-Cheng Yan
Berlin Chen
73
0
0
18 Oct 2025
TokenAR: Multiple Subject Generation via Autoregressive Token-level enhancement
Haiyue Sun
Qingdong He
Jinlong Peng
Peng Tang
Jiangning Zhang
Junwei Zhu
Xiaobin Hu
Shuicheng Yan
DiffM, VGen
95
0
0
18 Oct 2025
Utilising Large Language Models for Generating Effective Counter Arguments to Anti-Vaccine Tweets
Utsav Dhanuka
Soham Poddar
Saptarshi Ghosh
72
0
0
18 Oct 2025
Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures
Minh Khoi Nguyen Nhat
R. Teo
Laziz U. Abdullaev
Maurice Mok
Viet-Hoang Tran
T. Nguyen
MoE
158
0
0
18 Oct 2025
Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
Jihoon Kwon
Kyle Min
Jy-yong Sohn
CoGe
156
0
0
18 Oct 2025
REALM: An MLLM-Agent Framework for Open World 3D Reasoning Segmentation and Editing on Gaussian Splatting
Changyue Shi
Minghao Chen
Yiping Mao
Chuxiao Yang
Xinyuan Hu
Jiajun Ding
Zhou Yu
LRM
72
2
0
18 Oct 2025
RL makes MLLMs see better than SFT
Junha Song
Sangdoo Yun
Dongyoon Han
Jaegul Choo
Byeongho Heo
OffRL
179
0
0
18 Oct 2025
MoS-VLA: A Vision-Language-Action Model with One-Shot Skill Adaptation
Ruihan Zhao
Tyler Ingebrand
Sandeep Chinchali
Ufuk Topcu
VLM
78
0
0
18 Oct 2025
Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI
Zheng Huang
Enpei Zhang
Yinghao Cai
Weikang Qiu
Carl Yang
Elynn Chen
Xiang Zhang
Rex Ying
Dawei Zhou
Yujun Yan
DiffM
92
0
0
17 Oct 2025
Chronos-2: From Univariate to Universal Forecasting
Abdul Fatir Ansari
Oleksandr Shchur
Jaris Küken
Andreas Auer
Boran Han
...
Hao Wang
Huzefa Rangwala
George Karypis
Yuyang Wang
Michael Bohlke-Schneider
AI4TS, BDL
193
6
0
17 Oct 2025
Expert Merging in Sparse Mixture of Experts with Nash Bargaining
Dung V. Nguyen
Anh T. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
Shiqi Jiang
Ethan Fetaya
Linh Duy Tran
Gal Chechik
T. Nguyen
MoMe
164
0
0
17 Oct 2025
Large-scale User Game Lifecycle Representation Learning
Yanjie Gou
Jiangming Liu
Kouying Xue
Yi Hu
OffRL
93
0
0
17 Oct 2025