Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of Machine Learning Research (JMLR), 2019
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 11,877 papers shown
Title
Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications
IEEE Access, 2025
Kento Kawaharazuka
Jihoon Oh
Jun Yamada
Ingmar Posner
Yuke Zhu
LM&Ro
189
17
0
08 Oct 2025
Textual interpretation of transient image classifications from large language models
Nature Astronomy (Nat. Astron.), 2025
F. Stoppa
Turan Bulmus
S. Bloemen
Stephen J. Smartt
P. Groot
P. Vreeswijk
Ken W. Smith
72
0
0
08 Oct 2025
Quick-CapsNet (QCN): A fast alternative to Capsule Networks
ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), 2020
Pouya Shiri
Ramin Sharifi
A. Baniasadi
3DPC
121
0
0
08 Oct 2025
Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
Peter Ochieng
60
1
0
07 Oct 2025
Training Dynamics Impact Post-Training Quantization Robustness
Albert Catalan-Tatjer
Niccolò Ajroldi
Jonas Geiping
MQ
125
0
0
07 Oct 2025
Towards Label-Free Biological Reasoning Synthetic Dataset Creation via Uncertainty Filtering
Josefa Lia Stoisser
Lawrence Phillips
Aditya Misra
Tom A. Lamb
Philip Torr
Marc Boubnovski Martell
Julien Fauqueur
Kaspar Martens
LRM
60
0
0
07 Oct 2025
BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining
Jie Hao
Rui Yu
W. Zhang
Huixia Wang
Jie Xu
Mingrui Liu
144
0
0
07 Oct 2025
Test-Time Efficient Pretrained Model Portfolios for Time Series Forecasting
Mert Kayaalp
Caner Turkmen
Oleksandr Shchur
Pedro Mercado
Abdul Fatir Ansari
Michael Bohlke-Schneider
Bernie Wang
AI4TS
68
0
0
07 Oct 2025
AgentDR Dynamic Recommendation with Implicit Item-Item Relations via LLM-based Agents
Mingdai Yang
Nurendra Choudhary
Jiangshu Du
Edward W. Huang
Philip S. Yu
Karthik Subbian
Danai Koutra
100
0
0
07 Oct 2025
Gradient-Sign Masking for Task Vector Transport Across Pre-Trained Models
Filippo Rinaldi
Aniello Panariello
Giacomo Salici
Fengyuan Liu
Marco Ciccone
Angelo Porrello
Simone Calderara
131
0
0
07 Oct 2025
GUIDE: Guided Initialization and Distillation of Embeddings
Khoa Trinh
Gaurav Menghani
Erik Vee
76
0
0
07 Oct 2025
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels
Zhepeng Cen
H. Chen
Shiyu Wang
Zuxin Liu
Zhiwei Liu
Ding Zhao
Silvio Savarese
Caiming Xiong
Huan Wang
Weiran Yao
OffRL
81
1
0
07 Oct 2025
Membership Inference Attacks on Tokenizers of Large Language Models
Meng Tong
Yuntao Du
Kejiang Chen
Weiming Zhang
Ninghui Li
MIALM
227
0
0
07 Oct 2025
UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
Wenhao Guan
Zhikang Niu
Ziyue Jiang
Kaidi Wang
Peijie Chen
Q. Hong
Lin Li
Xie Chen
AuLLM
197
0
0
06 Oct 2025
Adaptive Memory Momentum via a Model-Based Framework for Deep Learning Optimization
Kristi Topollai
A. Choromańska
ODL
251
0
0
06 Oct 2025
HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks
Zheng Xiong
Kang Li
Zilin Wang
Matthew Jackson
Jakob Foerster
Shimon Whiteson
VLM
156
1
0
06 Oct 2025
Factuality Matters: When Image Generation and Editing Meet Structured Visuals
Le Zhuo
Songhao Han
Yuandong Pu
Boxiang Qiu
Sayak Paul
...
Yihao Liu
Jie Shao
Xi Chen
Si Liu
Hongsheng Li
EGVM
198
1
0
06 Oct 2025
SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
Ronen Kamenetsky
Sara Dorfman
Daniel Garibi
Roni Paiss
Or Patashnik
Daniel Cohen-Or
DiffM
279
0
0
06 Oct 2025
SpikingMamba: Towards Energy-Efficient Large Language Models via Knowledge Distillation from Mamba
Y. Huang
Jianxiong Tang
Chao Wang
Ziyi Wang
Jianguo Zhang
Zhichao Lu
Bojun Cheng
Luziwei Leng
Mamba
132
0
0
06 Oct 2025
FoilDiff: A Hybrid Transformer Backbone for Diffusion-based Modelling of 2D Airfoil Flow Fields
Kenechukwu Ogbuagu
S. Maleki
G. Bruni
S. Krishnababu
DiffM, AI4CE
328
0
0
05 Oct 2025
Thai Semantic End-of-Turn Detection for Real-Time Voice Agents
Thanapol Popit
Natthapath Rungseesiripak
Monthol Charattrakool
Saksorn Ruangtanusak
68
0
0
05 Oct 2025
Large Language Models Hallucination: A Comprehensive Survey
Aisha Alansari
Hamzah Luqman
HILM, LRM
373
1
0
05 Oct 2025
Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-time
Daniel Tan
Anders Woodruff
Niels Warncke
Arun Jose
Maxime Riché
David Demitri Africa
Mia Taylor
271
0
0
05 Oct 2025
Activation Steering with a Feedback Controller
Dung V. Nguyen
Hieu M. Vu
Nhi Y. Pham
Lei Zhang
T. Nguyen
LLMSV
155
0
0
05 Oct 2025
Spectral Alignment as Predictor of Loss Explosion in Neural Network Training
Haiquan Qiu
You Wu
Yingjie Tan
Yaqing Wang
Quanming Yao
83
0
0
05 Oct 2025
Smart Paste: Automatically Fixing Copy/Paste for Google Developers
Vincent Nguyen
Guilherme Herzog
José Cambronero
Marcus Revaj
Aditya Kini
Alexander Frömmgen
Maxim Tabachnyk
KELM, VLM
60
0
0
04 Oct 2025
On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
Weiqing He
Xiang Li
Tianqi Shang
Li Shen
Weijie J. Su
Q. Long
WaLM
173
0
0
04 Oct 2025
Allocation of Parameters in Transformers
Ruoxi Yu
Haotian Jiang
Jingpu Cheng
Penghao Yu
Qianxiao Li
Zhong Li
MoE
118
0
0
04 Oct 2025
Brain-Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage
Angela Lopez-Cardona
Sebastian Idesis
Mireia Masias Bruns
Sergi Abadal
Ioannis Arapakis
68
0
0
03 Oct 2025
Distributed Low-Communication Training with Decoupled Momentum Optimization
S. Nedelkoski
Alexander Acker
O. Kao
Soeren Becker
Dominik Scheinert
75
0
0
03 Oct 2025
DMark: Order-Agnostic Watermarking for Diffusion Large Language Models
Linyu Wu
Linhao Zhong
Wenjie Qu
Y. Li
Yue Liu
Shengfang Zhai
Chunhua Shen
Jiaheng Zhang
172
0
0
03 Oct 2025
MoGIC: Boosting Motion Generation via Intention Understanding and Visual Context
Junyu Shi
Yong Sun
Zhiyuan Zhang
Lijiang Liu
Zhengjie Zhang
Yuxin He
Qiang Nie
VGen
57
0
0
03 Oct 2025
Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction
Kaisi Guan
Xihua Wang
Zhengfeng Lai
Xin Cheng
Peng Zhang
Xiaojiang Liu
Ruihua Song
Meng Cao
DiffM
216
3
0
03 Oct 2025
AgenticRAG: Tool-Augmented Foundation Models for Zero-Shot Explainable Recommender Systems
Bo Ma
Hang Li
ZeHua Hu
XiaoFan Gui
LuYao Liu
Simon Liu
LRM
77
0
0
03 Oct 2025
When and Where do Events Switch in Multi-Event Video Generation?
Ruotong Liao
Guowen Huang
Qing Cheng
Thomas Seidl
Daniel Cremers
Volker Tresp
DiffM, VGen
172
0
0
03 Oct 2025
Truth-Aware Decoding: A Program-Logic Approach to Factual Language Generation
Faruk Alpay
Hamdi Alakkad
20
0
0
03 Oct 2025
Small is Sufficient: Reducing the World AI Energy Consumption Through Model Selection
Tiago da Silva Barros
Frédéric Giroire
Ramon Aparicio-Pardo
Joanna Moulierac
88
0
0
02 Oct 2025
PyramidStyler: Transformer-Based Neural Style Transfer with Pyramidal Positional Encoding and Reinforcement Learning
Raahul Krishna Durairaju
K. Saruladha
99
0
0
02 Oct 2025
Hierarchical Semantic Retrieval with Cobweb
Anant Gupta
Karthik Singaravadivelan
Zekun Wang
60
0
0
02 Oct 2025
The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM
Kwanhee Lee
Hyeondo Jang
Dongyeop Lee
Dan Alistarh
Namhoon Lee
68
1
0
02 Oct 2025
SCRIBES: Web-Scale Script-Based Semi-Structured Data Extraction with Reinforcement Learning
Shicheng Liu
Kai Sun
Lisheng Fu
Xilun Chen
Xinyuan Zhang
...
Rulin Shao
Yue Liu
Anuj Kumar
Wen-tau Yih
Xin Luna Dong
RALM
116
0
0
02 Oct 2025
Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction
Adam Filipek
76
1
0
02 Oct 2025
Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
Feiyang Kang
Newsha Ardalani
Michael Kuchnik
Youssef Emad
Mostafa Elhoushi
Shubhabrata Sengupta
Shang-Wen Li
Ramya Raghavendra
R. Jia
Carole-Jean Wu
SyDa
104
0
0
02 Oct 2025
Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving
Shunfeng Zheng
Yudi Zhang
Meng Fang
Zihan Zhang
Zhitan Wu
Mykola Pechenizkiy
Ling-Hao Chen
ReLM, RALM, LRM
176
0
0
01 Oct 2025
Learn to Guide Your Diffusion Model
Alexandre Galashov
Ashwini Pokle
Arnaud Doucet
Arthur Gretton
Mauricio Delbracio
Valentin De Bortoli
DiffM
304
0
0
01 Oct 2025
Microsaccade-Inspired Probing: Positional Encoding Perturbations Reveal LLM Misbehaviours
Rui Melo
Rui Abreu
C. Păsăreanu
118
0
0
01 Oct 2025
HalluGuard: Evidence-Grounded Small Reasoning Models to Mitigate Hallucinations in Retrieval-Augmented Generation
Loris Bergeron
Ioana Buhnila
Jérôme François
Radu State
HILM, LRM
92
0
0
01 Oct 2025
Erased, But Not Forgotten: Erased Rectified Flow Transformers Still Remain Unsafe Under Concept Attack
Nanxiang Jiang
Zhaoxin Fan
Enhan Kang
Daiheng Gao
Yun Zhou
Yanxia Chang
Zheng Zhu
Yeying Jin
Wenjun Wu
AAML
112
0
0
01 Oct 2025
CarbonX: An Open-Source Tool for Computational Decarbonization Using Time Series Foundation Models
Diptyaroop Maji
Kang Yang
Prashant J. Shenoy
R. Sitaraman
Mani B. Srivastava
AI4TS
74
0
0
01 Oct 2025
Black-Box Time-Series Domain Adaptation via Cross-Prompt Foundation Models
Muhammad Furqon
Mahardhika Pratama
Igor Skrjanc
Lin Liu
Habibullah Habibullah
K. Doğançay
AI4TS
84
0
0
01 Oct 2025