Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of machine learning research (JMLR), 2019
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 11,873 papers shown
Cautious Weight Decay
Lizhang Chen
Jonathan Li
Kaizhao Liang
Baiyu Su
Cong Xie
Nuo Wang Pierse
Chen Liang
Ni Lao
Qiang Liu
68
0
0
14 Oct 2025
Ethic-BERT: An Enhanced Deep Learning Model for Ethical and Non-Ethical Content Classification
Mahamodul Hasan Mahadi
Md. Nasif Safwan
Souhardo Rahman
Shahnaj Parvin
Aminun Nahar
Kamruddin Nur
VLM
54
0
0
14 Oct 2025
PAGE: Prompt Augmentation for text Generation Enhancement
Mauro Jose Pacchiotti
Luciana Ballejos
Mariel Ale
20
0
0
13 Oct 2025
Learning to Watermark: A Selective Watermarking Framework for Large Language Models via Multi-Objective Optimization
C. Wang
Junyi Shu
Billy Chiu
Yu Li
Saleh Alharbi
Min Zhang
Jing Li
92
0
0
13 Oct 2025
Saudi Sign Language Translation Using T5
Ali Alhejab
Tomas Zelezny
Lamya Alkanhal
Ivan Gruber
Yazeed Alharbi
Jakub Straka
Vaclav Javorek
Marek Hruz
Badriah Alkalifah
Ahmed M. Ali
SLR
201
0
0
13 Oct 2025
Are Large Language Models Effective Knowledge Graph Constructors?
R. Chen
Weifeng Jiang
Chengwei Qin
Bo Xiong
Fiona Liausvia
Dongkyu Choi
Boon Kiat Quek
AI4MH
112
0
0
13 Oct 2025
MC#: Mixture Compressor for Mixture-of-Experts Large Models
Wei Huang
Yue Liao
Yukang Chen
Jianhui Liu
Haoru Tan
Si Liu
Shiming Zhang
Shuicheng Yan
Xiaojuan Qi
MoE
MQ
168
0
0
13 Oct 2025
GenCNER: A Generative Framework for Continual Named Entity Recognition
Yawen Yang
Fukun Ma
Shiao Meng
Aiwei Liu
Lijie Wen
64
0
0
13 Oct 2025
Evaluating Line-level Localization Ability of Learning-based Code Vulnerability Detection Models
Marco Pintore
Giorgio Piras
Angelo Sotgiu
Maura Pintor
Battista Biggio
AAML
51
0
0
13 Oct 2025
CoSPED: Consistent Soft Prompt Targeted Data Extraction and Defense
Yang Zhuochen
Fok Kar Wai
Thing Vrizlynn
AAML
SILM
198
0
0
13 Oct 2025
Neural Weight Compression for Language Models
Jegwang Ryu
Minkyu Kim
Seungjun Shin
Hee Min Choi
Dokwan Oh
Jaeho Lee
80
0
0
13 Oct 2025
Early Detection and Reduction of Memorisation for Domain Adaptation and Instruction Tuning
Dean L. Slack
Noura Al Moubayed
92
0
0
13 Oct 2025
Into the Unknown: Towards using Generative Models for Sampling Priors of Environment Uncertainty for Planning in Configuration Spaces
Subhransu S. Bhattacharjee
Hao Lu
Dylan Campbell
Rahul Shome
3DPC
76
0
0
13 Oct 2025
Point Prompting: Counterfactual Tracking with Video Diffusion Models
Ayush Shrivastava
Sanyam Mehta
Daniel Geng
Andrew Owens
DiffM
VGen
88
1
0
13 Oct 2025
Domain-Specific Data Generation Framework for RAG Adaptation
Chris Xing Tian
Weihao Xie
Zhen Chen
Zhengyuan Yi
Hui Liu
Haoliang Li
Shiqi Wang
Siwei Ma
48
0
0
13 Oct 2025
A Theorem-Proving-Based Evaluation of Neural Semantic Parsing
Hayate Funakura
Hyunsoo Kim
Koji Mineshima
48
0
0
13 Oct 2025
Direct Multi-Token Decoding
Xuan Luo
Weizhi Wang
Xifeng Yan
OffRL
64
0
0
13 Oct 2025
Catch Your Breath: Adaptive Computation for Self-Paced Sequence Production
Alexandre Galashov
Matt Jones
Rosemary Ke
Yuan Cao
Vaishnavh Nagarajan
Michael C. Mozer
92
0
0
13 Oct 2025
Vision-LLMs for Spatiotemporal Traffic Forecasting
Ning Yang
Hengyu Zhong
Haijun Zhang
Randall Berry
AI4TS
52
0
0
13 Oct 2025
Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Rohit Gupta
Anirban Roy
Claire Christensen
Sujeong Kim
Sarah Gerard
Madeline Cincebeaux
Ajay Divakaran
Todd Grindal
M. Shah
88
19
0
13 Oct 2025
Head-wise Adaptive Rotary Positional Encoding for Fine-Grained Image Generation
Jiaye Li
Baoyou Chen
Hui Li
Zilong Dong
Jingdong Wang
Siyu Zhu
52
0
0
12 Oct 2025
Rethinking LLM Evaluation: Can We Evaluate LLMs with 200x Less Data?
Shaobo Wang
C. Wang
Wenjie Fu
Yue Min
Mingquan Feng
...
Kexin Yang
Xingzhang Ren
Fei Huang
Dayiheng Liu
Linfeng Zhang
100
0
0
12 Oct 2025
R2T: Rule-Encoded Loss Functions for Low-Resource Sequence Tagging
Mamadou K. Keita
Christopher Homan
Sébastien Diarra
3DV
100
0
0
12 Oct 2025
AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs
Gunho Park
Jeongin Bae
Beomseok Kwon
Byeongwook Kim
S. Kwon
Dongsoo Lee
MQ
120
1
0
12 Oct 2025
UltraLLaDA: Scaling the Context Length to 128K for Diffusion Large Language Models
Guangxin He
Shen Nie
Fengqi Zhu
Yuankang Zhao
Tianyi Bai
Ran Yan
Jie Fu
Chongxuan Li
Binhang Yuan
60
4
0
12 Oct 2025
Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization
Bowei He
Lihao Yin
Huiling Zhen
Shuqi Liu
Han Wu
Xiaokun Zhang
Mingxuan Yuan
Chen Ma
72
0
0
12 Oct 2025
Translution: Unifying Self-attention and Convolution for Adaptive and Relative Modeling
Hehe Fan
Yi Yang
Mohan S. Kankanhalli
Fei Wu
ViT
48
0
0
11 Oct 2025
PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Lancheng Zou
Shuo Yin
Zehua Pei
Tsung-Yi Ho
Farzan Farnia
Bei Yu
56
0
0
11 Oct 2025
ReMix: Towards a Unified View of Consistent Character Generation and Editing
Benjia Zhou
Bin-Bin Fu
Pei Cheng
Y. Wang
Jiayuan Fan
Tao Chen
DiffM
76
0
0
11 Oct 2025
Lost in the Middle: An Emergent Property from Information Retrieval Demands in LLMs
Nikolaus Salvatore
Hao Wang
Qiong Zhang
RALM
53
0
0
11 Oct 2025
The Potential of Second-Order Optimization for LLMs: A Study with Full Gauss-Newton
Natalie Abreu
Nikhil Vyas
Sham Kakade
Depen Morwani
87
2
0
10 Oct 2025
Provable Watermarking for Data Poisoning Attacks
Yifan Zhu
Lijia Yu
Xiao-Shan Gao
AAML
92
0
0
10 Oct 2025
Closing the Data-Efficiency Gap Between Autoregressive and Masked Diffusion LLMs
Xu Pan
Ely Hahami
Jingxuan Fan
Ziqian Xie
H. Sompolinsky
96
0
0
10 Oct 2025
The Speech-LLM Takes It All: A Truly Fully End-to-End Spoken Dialogue State Tracking Approach
Nizar El Ghazal
Antoine Caubrière
Valentin Vielzeuf
AuLLM
150
0
0
10 Oct 2025
AdaPM: a Partial Momentum Algorithm for LLM Training
Yimu Zhang
Yuanshi Liu
Cong Fang
76
0
0
10 Oct 2025
Hybrid Models for Natural Language Reasoning: The Case of Syllogistic Logic
Manuel Vargas Guzmán
Jakub Szymanik
Maciej Malicki
NAI
LRM
ELM
44
0
0
10 Oct 2025
PatentVision: A multimodal method for drafting patent applications
Ruo Yang
Sai Krishna Reddy Mudhiganti
Manali Sharma
32
0
0
10 Oct 2025
Patentformer: A demonstration of AI-assisted automated patent drafting
Sai Krishna Reddy Mudhiganti
Juanyan Wang
Ruo Yang
Manali Sharma
53
0
0
10 Oct 2025
Q-Router: Agentic Video Quality Assessment with Expert Model Routing and Artifact Localization
Shuo Xing
Soumik Dey
Mingyang Wu
Ashirbad Mishra
Naveen Ravipati
Binbin Li
Hansi Wu
Zhengzhong Tu
107
1
0
09 Oct 2025
LOTION: Smoothing the Optimization Landscape for Quantized Training
Mujin Kwun
Depen Morwani
Chloe Huangyuan Su
Stephanie Gil
Nikhil Anand
Sham Kakade
MQ
161
1
0
09 Oct 2025
USIM and U0: A Vision-Language-Action Dataset and Model for General Underwater Robots
Junwen Gu
Zhiheng Wu
Pengxuan Si
Shuang Qiu
Yukai Feng
Luoyang Sun
Laien Luo
Lianyi Yu
Jian Wang
Zhengxing Wu
68
1
0
09 Oct 2025
Fewer Weights, More Problems: A Practical Attack on LLM Pruning
Kazuki Egashira
Robin Staab
Thibaud Gloaguen
Mark Vero
Martin Vechev
AAML
147
0
0
09 Oct 2025
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution
Shian Du
Menghan Xia
Chang-rui Liu
Quande Liu
Xintao Wang
Pengfei Wan
Xiangyang Ji
VGen
SupR
200
0
0
09 Oct 2025
DACIP-RC: Domain Adaptive Continual Instruction Pre-Training via Reading Comprehension on Business Conversations
Elena Khasanova
Harsh Saini
Md Tahmid Rahman Laskar
Xue-Yong Fu
Cheng Chen
Shashi Bhushan TN
CLL
64
0
0
09 Oct 2025
Textual interpretation of transient image classifications from large language models
Nature Astronomy (Nat. Astron.), 2025
F. Stoppa
Turan Bulmus
S. Bloemen
Stephen J. Smartt
P. Groot
P. Vreeswijk
Ken W. Smith
52
0
0
08 Oct 2025
Sunflower: A New Approach To Expanding Coverage of African Languages in Large Language Models
Benjamin Akera
Evelyn Nafula Ouma
Gilbert Yiga
Patrick Walukagga
Phionah Natukunda
...
Imran Sekalala
Nimpamya Janat Namara
Engineer Bainomugisha
Ernest Mwebaze
John Quinn
92
0
0
08 Oct 2025
Auto-Stega: An Agent-Driven System for Lifelong Strategy Evolution in LLM-Based Text Steganography
Jiuan Zhou
Yu Cheng
Yuan Xie
Z. Yin
86
2
0
08 Oct 2025
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models
Cheng-Han Chiang
Xiaofei Wang
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
Shujie Liu
Zhendong Wang
Zhengyuan Yang
Hung-yi Lee
Lijuan Wang
LLMAG
ReLM
RALM
LRM
144
1
0
08 Oct 2025
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
Yeskendir Koishekenov
Aldo Lipani
Nicola Cancedda
LRM
74
2
0
08 Oct 2025
Quick-CapsNet (QCN): A fast alternative to Capsule Networks
ACS/IEEE International Conference on Computer Systems and Applications (AICCSA), 2020
Pouya Shiri
Ramin Sharifi
A. Baniasadi
3DPC
121
0
0
08 Oct 2025