Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of Machine Learning Research (JMLR), 2019
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat
arXiv:1910.10683

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 12,026 papers shown
Flight Delay Prediction via Cross-Modality Adaptation of Large Language Models and Aircraft Trajectory Representation
Thaweerath Phisannupawong
J. J. Damanik
Han-Lim Choi
24 Oct 2025
Bi-Level Optimization for Generative Recommendation: Bridging Tokenization and Generation
Yimeng Bai
Chang Liu
Y. Zhang
D. Wang
Frank Yang
Andrew Rabinovich
Wenge Rong
Fuli Feng
24 Oct 2025
PARL: Prompt-based Agents for Reinforcement Learning
Yarik Menchaca Resendiz
Roman Klinger
LLMAG, LRM
24 Oct 2025
Evaluating Latent Knowledge of Public Tabular Datasets in Large Language Models
Matteo Silvestri
Flavio Giorgi
Fabrizio Silvestri
Gabriele Tolomei
LMTD
23 Oct 2025
Relative-Based Scaling Law for Neural Language Models
Baoqing Yue
Jinyuan Zhou
Zixi Wei
Jingtao Zhan
Qingyao Ai
Yiqun Liu
23 Oct 2025
Better Tokens for Better 3D: Advancing Vision-Language Modeling in 3D Medical Imaging
Ibrahim Ethem Hamamci
Sezgin Er
Antonio Terpin
Hadrien Reynaud
Dong Yang
Pengfei Guo
Marc Edgar
Daguang Xu
Bernhard Kainz
Bjoern Menze
MedIm
23 Oct 2025
A Reinforcement Learning Framework for Robust and Secure LLM Watermarking
Li An
Yujian Liu
Yepeng Liu
Yuheng Bu
Yang Zhang
Shiyu Chang
23 Oct 2025
Plan Then Retrieve: Reinforcement Learning-Guided Complex Reasoning over Knowledge Graphs
Yanlin Song
Ben Liu
Víctor Gutiérrez-Basulto
Zhiwei Hu
Qianqian Xie
Min Peng
Sophia Ananiadou
Jeff Z. Pan
RALM, ReLM, LRM
23 Oct 2025
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging
Bowen Wang
Haiyuan Wan
Liwen Shi
Chen Yang
Peng He
...
Tiao Tan
Yongjian Li
Fangming Liu
Yifan Gong
Sheng Zhang
CLL, MoMe, KELM
23 Oct 2025
From Masks to Worlds: A Hitchhiker's Guide to World Models
Jinbin Bai
Yu Lei
H. Wu
Yuchen Zhu
Shufan Li
Yi Xin
Xiangtai Li
Molei Tao
Aditya Grover
Ming-Hsuan Yang
VGen, SyDa
23 Oct 2025
Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
Xi Zhang
Xiaolin Wu
Jiamang Wang
W. Lin
MQ
23 Oct 2025
Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2025
Dean L. Slack
G. Hudson
T. Winterbottom
Noura Al Moubayed
23 Oct 2025
xMem: A CPU-Based Approach for Accurate Estimation of GPU Memory in Deep Learning Training Workloads
Jiabo Shi
Dimitrios Pezaros
Yehia Elkhatib
23 Oct 2025
Data Efficient Any Transformer-to-Mamba Distillation via Attention Bridge
Penghao Wang
Yuhao Zhou
Mengxuan Wu
Panpan Zhang
Zhangyang Wang
Kai Wang
Mamba
22 Oct 2025
Machine Text Detectors are Membership Inference Attacks
Ryuto Koike
Liam Dugan
Masahiro Kaneko
Chris Callison-Burch
Naoaki Okazaki
22 Oct 2025
Data-Centric Lessons To Improve Speech-Language Pretraining
Vishaal Udandarao
Zhiyun Lu
Xuankai Chang
Yongqiang Wang
Violet Z. Yao
Albin Madapally Jose
Fartash Faghri
Josh Gardner
Chung-Cheng Chiu
22 Oct 2025
Restoring Pruned Large Language Models via Lost Component Compensation
Zijian Feng
Hanzhang Zhou
Zixiao Zhu
Tianjiao Li
Jia Jim Deryl Chua
Lee Onn Mak
Gee Wah Ng
Kezhi Mao
22 Oct 2025
A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring
Julian Schulz
LRM
22 Oct 2025
ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
Lin Xv
Jingsheng Gao
Xian Gao
Ting Liu
Yuzhuo Fu
22 Oct 2025
Difficulty-Controllable Multiple-Choice Question Generation Using Large Language Models and Direct Preference Optimization
Yuto Tomikawa
Masaki Uto
22 Oct 2025
Beyond Uniform SVD: Dual-Level Optimization across Columns and Modules for LLM Compression
Lin Xv
Jingsheng Gao
Xian Gao
Ting Li
22 Oct 2025
ELUTQ: Optimizing Quantization Accuracy under LUT-Based Computation for Edge LLMs
Xin Nie
Liang Dong
H. Zhang
JiaWang Xiao
G. Sun
MQ
22 Oct 2025
Energy-Efficient and Dequantization-Free Q-LLMs: A Spiking Neural Network Approach to Salient Value Mitigation
Chenyu Wang
Zhanglu Yan
Zhi Zhou
Xu Chen
Weng-Fai Wong
MQ
22 Oct 2025
Identity-Aware Large Language Models require Cultural Reasoning
Alistair Plum
Anne-Marie Lutgen
Christoph Purschke
Achim Rettinger
LRM
21 Oct 2025
Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning
Chenghao Zhu
Meiling Tao
Tiannan Wang
Dongyi Ding
Yuchen Eleanor Jiang
Wangchunshu Zhou
21 Oct 2025
Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection
Hongyi He
Xiao Liu
Zhenghao Lin
Mingni Tang
Y. Cheng
Jintao Wang
W. Li
Peng Cheng
Yeyun Gong
OODD
21 Oct 2025
ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning
Xiaohan Qin
Xiaoxing Wang
Ning Liao
Cancheng Zhang
Xiangdong Zhang
Mingquan Feng
Jingzhi Wang
Junchi Yan
21 Oct 2025
Reasoning Language Model Inference Serving Unveiled: An Empirical Study
Qi Li
Junpan Wu
Xiang Liu
Yuxin Wang
Z. Li
Zhenheng Tang
Yuhan Chen
Shaohuai Shi
Xiaowen Chu
ReLM, LRM
21 Oct 2025
Large language models for folktale type automation based on motifs: Cinderella case study
Tjaša Arčon
Marko Robnik-Šikonja
Polona Tratnik
21 Oct 2025
CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
Soroush Tabesh
M. Safaryan
Dan Alistarh
Alexandra Volkova
Dan Alistarh
MQ
21 Oct 2025
From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering
Lei Li
Xiao Zhou
Y. Zhang
X. Wu
RALM, MedIm
21 Oct 2025
CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs
Shaobo Wang
Yongliang Miao
Yuancheng Liu
Qianli Ma
Ning Liao
Linfeng Zhang
LRM
21 Oct 2025
Rethinking On-policy Optimization for Query Augmentation
Zhichao Xu
Shengyao Zhuang
Xueguang Ma
Bingsen Chen
Yijun Tian
Fengran Mo
Jie Cao
Vivek Srikumar
RALM, LRM
20 Oct 2025
Benchmarking Probabilistic Time Series Forecasting Models on Neural Activity
Ziyu Lu
Anna J. Li
Alexander E. Ladd
Pascha Matveev
Aditya Deole
Eric Shea-Brown
J. Nathan Kutz
Nicholas A. Steinmetz
BDL, AI4TS
20 Oct 2025
Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains
Austin Xu
Xuan-Phi Nguyen
Yilun Zhou
Chien-Sheng Wu
Caiming Xiong
Shafiq Joty
OffRL, ALM, LRM, ELM
20 Oct 2025
DSEBench: A Test Collection for Explainable Dataset Search with Examples
Qing Shi
Jing He
Qiaosheng Chen
Gong Cheng
20 Oct 2025
Annotation-Efficient Universal Honesty Alignment
Shiyu Ni
Keping Bi
Jiafeng Guo
Minghao Tang
Jingtong Wu
Zengxin Han
Xueqi Cheng
HILM
20 Oct 2025
Generation then Reconstruction: Accelerating Masked Autoregressive Models via Two-Stage Sampling
Feihong Yan
P. Wang
Yao Zhu
Kaiyu Pang
Qingyan Wei
Huiqi Li
Linfeng Zhang
DiffM
20 Oct 2025
Unbiased Gradient Low-Rank Projection
Rui Pan
Yang Luo
Yuxing Liu
Yang You
Tong Zhang
20 Oct 2025
AFRICAPTION: Establishing a New Paradigm for Image Captioning in African Languages
Mardiyyah Oduwole
Prince Mireku
Fatimo Adebanjo
Oluwatosin Olajide
Mahi Aminu Aliyu
Jekaterina Novikova
20 Oct 2025
Contextual Attention Modulation: Towards Efficient Multi-Task Adaptation in Large Language Models
Dayan Pan
Zhaoyang Fu
Jingyuan Wang
Xiao Han
Yue Zhu
Xiangyu Zhao
KELM, CLL
20 Oct 2025
DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning
Yongxin He
Shan Zhang
Yixuan Cao
Lei Ma
Ping Luo
DeLMO
20 Oct 2025
Watermark Robustness and Radioactivity May Be at Odds in Federated Learning
Leixu Huang
Zedian Shao
Teodora Baluta
WaLM
19 Oct 2025
Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization
Masahiro Kaneko
Zeerak Talat
Timothy Baldwin
AAML
19 Oct 2025
Neuronal Group Communication for Efficient Neural representation
Zhengqi Pei
Qingming Huang
Shuhui Wang
19 Oct 2025
All You Need is One: Capsule Prompt Tuning with a Single Vector
Yiyang Liu
James Chenhao Liang
Heng Fan
Wenhao Yang
Yiming Cui
Xiaotian Han
Lifu Huang
Dongfang Liu
Qifan Wang
Cheng Han
VLM
19 Oct 2025
Mixed-Precision Quantization for Language Models: Techniques and Prospects
M. Rakka
Marios Fournarakis
Olga Krestinskaya
Jinane Bazzi
K. Salama
Fadi J. Kurdahi
A. Eltawil
M. Fouda
MQ
19 Oct 2025
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu
Y. Zhang
Yiming Dong
Chenheng Zhang
Cong Fang
Kun Yuan
Zhouchen Lin
19 Oct 2025
Uncovering Brain-Like Hierarchical Patterns in Vision-Language Models through fMRI-Based Neural Encoding
Yudan Ren
Xinlong Wang
Kexin Wang
Tian Xia
Zihan Ma
Zhaowei Li
Xiangrong Bi
Xiao Li
Xiaowei He
19 Oct 2025
Back to Bytes: Revisiting Tokenization Through UTF-8
Amit Moryossef
Clara Meister
Pavel Stepachev
Desmond Elliott
19 Oct 2025