1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 8,020 papers shown
Activation Patching for Interpretable Steering in Music Generation
Simone Facchiano
Giorgio Strano
Donato Crisostomi
Irene Tallini
Tommaso Mencattini
Fabio Galasso
Emanuele Rodolà
LLMSV
19
0
0
06 Apr 2025
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression
Ivan Ilin
Peter Richtárik
24
0
0
06 Apr 2025
On the Spatial Structure of Mixture-of-Experts in Transformers
Daniel Bershatsky
Ivan V. Oseledets
MoE
35
0
0
06 Apr 2025
Universal Item Tokenization for Transferable Generative Recommendation
Bowen Zheng
Hongyu Lu
Yu Chen
Wayne Xin Zhao
Ji-Rong Wen
29
0
0
06 Apr 2025
Pre-training Generative Recommender with Multi-Identifier Item Tokenization
Bowen Zheng
Enze Liu
Z. Chen
Zhongrui Ma
Yue Wang
Wayne Xin Zhao
Ji-Rong Wen
33
0
0
06 Apr 2025
A Comprehensive Survey of Challenges and Opportunities of Few-Shot Learning Across Multiple Domains
Andrea Gajic
Sudip Vhaduri
OOD
VLM
51
0
0
05 Apr 2025
Investigating and Mitigating Stereotype-aware Unfairness in LLM-based Recommendations
Zihuai Zhao
Wenqi Fan
Yao Wu
Qing Li
75
1
0
05 Apr 2025
Foundation Models for Time Series: A Survey
Siva Rama Krishna Kottapalli
Karthik Hubli
Sandeep Chandrashekhara
Garima Jain
Sunayana Hubli
Gayathri Botla
Ramesh Doddaiah
AI4TS
AI4CE
23
0
0
05 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
23
0
0
05 Apr 2025
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
Yuheng Wu
Wentao Guo
Zirui Liu
Heng Ji
Zhaozhuo Xu
Denghui Zhang
33
0
0
05 Apr 2025
Transformer representation learning is necessary for dynamic multi-modal physiological data on small-cohort patients
Bingxu Wang
Kunzhi Cai
Yuqi Zhang
Yachong Guo
Zeyi Zhou
Wenjiao Li
Yachong Guo
Wei Wang
Qing Zhou
MedIm
29
0
0
05 Apr 2025
Sigma: A dataset for text-to-code semantic parsing with statistical analysis
Saleh Almohaimeed
Shenyang Liu
May Alsofyani
Saad Almohaimeed
Liqiang Wang
37
0
0
05 Apr 2025
UniRVQA: A Unified Framework for Retrieval-Augmented Vision Question Answering via Self-Reflective Joint Training
Jiaqi Deng
Kaize Shi
Zonghan Wu
Huan Huo
Dingxian Wang
Guandong Xu
21
0
0
05 Apr 2025
STEP: Staged Parameter-Efficient Pre-training for Large Language Models
Kazuki Yano
Takumi Ito
Jun Suzuki
LRM
47
1
0
05 Apr 2025
Entropy-Based Block Pruning for Efficient Large Language Models
Liangwei Yang
Yuhui Xu
Juntao Tan
Doyen Sahoo
S.
Caiming Xiong
H. Wang
Shelby Heinecke
AAML
23
0
0
04 Apr 2025
Safe Screening Rules for Group OWL Models
Runxue Bao
Quanchao Lu
Yanfu Zhang
36
0
0
04 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
MegaMath: Pushing the Limits of Open Math Corpora
Fan Zhou
Zengzhi Wang
Nikhil Ranjan
Zhoujun Cheng
Liping Tang
Guowei He
Zhengzhong Liu
Eric P. Xing
LRM
46
1
0
03 Apr 2025
Leveraging LLM For Synchronizing Information Across Multilingual Tables
Siddharth Khincha
Tushar Kataria
Ankita Anand
Dan Roth
Vivek Gupta
49
0
0
03 Apr 2025
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments
Chenyu Zhang
Daniil Cherniavskii
Andrii Zadaianchuk
Antonios Tragoudaras
Antonios Vozikis
Thijmen Nijdam
Derck W. E. Prinzhorn
Mark Bodracska
N. Sebe
E. Gavves
EGVM
VGen
46
0
0
03 Apr 2025
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
Hao Li
Hao Fei
Zechao Hu
Zhengwei Yang
Zheng Wang
45
0
0
03 Apr 2025
CoLa -- Learning to Interactively Collaborate with Large LMs
Abhishek Sharma
Dan Goldwasser
LLMAG
SyDa
58
0
0
03 Apr 2025
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration
Yuhang Li
Ruokai Yin
Donghyun Lee
Shiting Xiao
Priyadarshini Panda
MQ
45
0
0
03 Apr 2025
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Leonardo Iurada
Marco Ciccone
Tatiana Tommasi
KELM
MoMe
40
0
0
03 Apr 2025
Large (Vision) Language Models are Unsupervised In-Context Learners
Artyom Gadetsky
Andrei Atanov
Yulun Jiang
Zhitong Gao
Ghazal Hosseini Mighan
Amir Zamir
Maria Brbić
VLM
MLLM
LRM
67
0
0
03 Apr 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
36
0
0
03 Apr 2025
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
Chunhui Zhang
Li Liu
Jialin Gao
Xin Sun
Hao Wen
Xi Zhou
Shiming Ge
Y. Wang
33
0
0
02 Apr 2025
Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation
A. Myntti
Erik Henriksson
Veronika Laippala
S. Pyysalo
31
0
0
02 Apr 2025
Real-time Ad retrieval via LLM-generative Commercial Intention for Sponsored Search Advertising
Tongtong Liu
Zhaohui Wang
Meiyue Qin
Zenghui Lu
Xudong Chen
Yuekui Yang
Peng Shu
36
0
0
02 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang
Duo Peng
Feng Chen
Y. Yang
Yinjie Lei
DiffM
74
0
0
02 Apr 2025
Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations
Chongjie Si
Zhiyi Shi
Xuehui Wang
Yichen Xiao
Xiaokang Yang
Wei-Ming Shen
AI4CE
60
0
0
01 Apr 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng
Yuying Ge
Yixiao Ge
Jing Liao
Ying Shan
VGen
AI4CE
51
0
0
01 Apr 2025
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models
Guy Kaplan
Michael Toker
Yuval Reif
Yonatan Belinkov
Roy Schwartz
DiffM
48
0
0
01 Apr 2025
Detecting Financial Fraud with Hybrid Deep Learning: A Mix-of-Experts Approach to Sequential and Anomalous Patterns
Diego Vallarino
20
1
0
01 Apr 2025
On Benchmarking Code LLMs for Android Malware Analysis
Yiling He
Hongyu She
Xingzhi Qian
Xinran Zheng
Zhuo Chen
Z. Qin
Lorenzo Cavallaro
ELM
43
1
0
01 Apr 2025
Catch Me if You Search: When Contextual Web Search Results Affect the Detection of Hallucinations
Mahjabin Nahar
Eun-Ju Lee
Jin Won Park
Dongwon Lee
HILM
71
0
0
01 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao W. Wang
Songruoyao Wu
Jiaxing Yu
K. Zhang
MGen
VGen
65
1
0
01 Apr 2025
SeizureTransformer: Scaling U-Net with Transformer for Simultaneous Time-Step Level Seizure Detection from Long EEG Recordings
Kerui Wu
Ziyue Zhao
Bülent Yener
ViT
46
0
0
01 Apr 2025
QG-VTC: Question-Guided Visual Token Compression in MLLMs for Efficient VQA
Shuai Li
Jian Xu
Xiao-Hui Li
Chao Deng
Lin-Lin Huang
MQ
41
0
0
01 Apr 2025
VNJPTranslate: A comprehensive pipeline for Vietnamese-Japanese translation
Hoang Hai Phan
Nguyen Duc Minh Vu
Nam Dang Phuong
46
0
0
01 Apr 2025
Accelerating Causal Network Discovery of Alzheimer Disease Biomarkers via Scientific Literature-based Retrieval Augmented Generation
Xiaofan Zhou
Liangjie Huang
Pinyang Cheng
Wenpen Yin
Rui Zhang
Wenrui Hao
Lu Cheng
21
0
0
01 Apr 2025
Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation
Y. Li
Bo Liu
Sheng Huang
Z. Zhang
Xiaotong Yuan
Richang Hong
41
0
0
31 Mar 2025
DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning Guidance
Junjie Zheng
Zihao Chen
Chaofan Ding
Xinhan Di
VGen
67
1
0
31 Mar 2025
UniSep: Universal Target Audio Separation with Language Models at Scale
Y. Wang
Hangting Chen
Dongchao Yang
Weiqin Li
Dan Luo
Guangzhi Li
Shan Yang
Zhiyong Wu
H. Meng
Xixin Wu
VLM
44
1
0
31 Mar 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
37
0
0
31 Mar 2025
Adaptive Layer-skipping in Pre-trained LLMs
Xuan Luo
Weizhi Wang
Xifeng Yan
107
0
0
31 Mar 2025
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Muhammad Shahroz Khan
Dongwen Tang
Pingzhi Li
Kai Wang
Tianlong Chen
AI4CE
101
0
0
31 Mar 2025
Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval
Enrico Palumbo
Gustavo Penha
Andreas Damianou
José Luis Redondo García
Timothy Christopher Heath
Alice Wang
Hugues Bouchard
M. Lalmas
49
0
0
31 Mar 2025
Is analogy enough to draw novel adjective-noun inferences?
Hayley Ross
Kathryn Davidson
Najoung Kim
NAI
39
0
0
31 Mar 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Z. Li
L. Zhang
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
54
0
0
31 Mar 2025