1901.02860
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Papers citing
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 2,017 papers shown
Title
Exploration of Masked and Causal Language Modelling for Text Generation
Nicolo Micheletti
Samuel Belkadi
Lifeng Han
Goran Nenadic
197
11
0
21 May 2024
Mamba in Speech: Towards an Alternative to Self-Attention
Xiangyu Zhang
Qiquan Zhang
Hexin Liu
Tianyi Xiao
Xinyuan Qian
Beena Ahmed
E. Ambikairajah
Haizhou Li
Julien Epps
Mamba
307
86
0
21 May 2024
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
International Conference on Machine Learning (ICML), 2024
Victor Agostinelli
Sanghyun Hong
Lizhong Chen
KELM
175
3
0
18 May 2024
The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving
Pai Zeng
Zhenyu Ning
Jieru Zhao
Weihao Cui
Mengwei Xu
Liwei Guo
Xusheng Chen
Yizhou Shan
LLMAG
224
5
0
18 May 2024
Layer-Condensed KV Cache for Efficient Inference of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Haoyi Wu
Kewei Tu
MQ
278
35
0
17 May 2024
A Hybrid Deep Learning Framework for Stock Price Prediction Considering the Investor Sentiment of Online Forum Enhanced by Popularity
Huiyu Li
Junhua Hu
87
0
0
17 May 2024
Positional encoding is not the same as context: A study on positional encoding for sequential recommendation
Alejo López-Ávila
Jinhua Du
Abbas Shimary
Ze Li
219
5
0
16 May 2024
Robust Singing Voice Transcription Serves Synthesis
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruiqi Li
Yu Zhang
Yongqi Wang
Zhiqing Hong
Rongjie Huang
Zhou Zhao
208
16
0
16 May 2024
Enhancing Maritime Trajectory Forecasting via H3 Index and Causal Language Modelling (CLM)
Nicolas Drapier
Aladine Chetouani
A. Chateigner
110
5
0
15 May 2024
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
International Conference on Machine Learning (ICML), 2024
Junfeng Chen
Kailiang Wu
383
10
0
15 May 2024
A Survey on Transformers in NLP with Focus on Efficiency
Wazib Ansar
Saptarsi Goswami
Amlan Chakrabarti
MedIm
269
11
0
15 May 2024
Improving Transformers with Dynamically Composable Multi-Head Attention
International Conference on Machine Learning (ICML), 2024
Da Xiao
Qingye Meng
Shengping Li
Xingyuan Yuan
179
5
0
14 May 2024
Automated Deep Learning for Load Forecasting
Julie Keisler
Sandra Claudel
Gilles Cabriel
Margaux Brégère
AI4TS
165
3
0
14 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Computer Vision and Pattern Recognition (CVPR), 2024
Weihao Yu
Xinchao Wang
Mamba
245
159
0
13 May 2024
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Jianyi Chen
Wei Xue
Xu Tan
Zhen Ye
Qi-fei Liu
Yi-Ting Guo
123
4
0
13 May 2024
Towards Subgraph Isomorphism Counting with Graph Kernels
Xin Liu
Weiqi Wang
Jiaxin Bai
Yangqiu Song
146
1
0
13 May 2024
Transforming the Bootstrap: Using Transformers to Compute Scattering Amplitudes in Planar N = 4 Super Yang-Mills Theory
Tianji Cai
G. W. Merz
François Charton
Niklas Nolte
Matthias Wilhelm
K. Cranmer
Lance J. Dixon
293
22
0
09 May 2024
Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation
Mo Guan
Yan Wang
Guangkun Ma
Jiarui Liu
Mingzu Sun
SLR
178
12
0
09 May 2024
Smurfs: Multi-Agent System using Context-Efficient DFSDT for Tool Planning
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Junzhi Chen
Juhao Liang
Benyou Wang
LLMAG
169
4
0
09 May 2024
Whole Genome Transformer for Gene Interaction Effects in Microbiome Habitat Specificity
AAAI Conference on Artificial Intelligence (AAAI), 2024
Zhufeng Li
S. S. Cranganore
Nicholas D. Youngblut
Niki Kilbertus
286
4
0
09 May 2024
Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents
Yanfei Dong
Lambert Deng
Jiazheng Zhang
Xiaodong Yu
Ting Lin
Francesco Gelli
Soujanya Poria
W. Lee
163
0
0
08 May 2024
SUTRA: Scalable Multilingual Language Model Architecture
Abhijit Bendale
Michael Sapienza
Steven Ripplinger
Simon Gibbs
Jaewon Lee
Pranav Mistry
LRM
ELM
185
8
0
07 May 2024
A Transformer with Stack Attention
Jiaoda Li
Jennifer C. White
Mrinmaya Sachan
Robert Bamler
191
4
0
07 May 2024
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
Zhuoyi Yang
Heyang Jiang
Wenyi Hong
Jiayan Teng
Wendi Zheng
Yuxiao Dong
Ming Ding
Jie Tang
SupR
104
10
0
07 May 2024
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding
ACM Multimedia (MM), 2024
Tao Liu
Feilong Chen
Shuai Fan
Chenpeng Du
Qi Chen
Xie Chen
Kai Yu
DiffM
PINN
177
54
0
06 May 2024
Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation
Kaize Shi
Xueyao Sun
Qing Li
Guandong Xu
212
20
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
278
74
0
06 May 2024
Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making
Zhuang Lei
Jingdong Zhao
Yuntao Li
Zichun Xu
Liangliang Zhao
Hong Liu
163
3
0
30 Apr 2024
Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics
J. Michaelov
Catherine Arnett
Benjamin Bergen
172
5
0
30 Apr 2024
Decoding Radiologists' Intentions: A Novel System for Accurate Region Identification in Chest X-ray Image Analysis
Akash Awasthi
Safwan Ahmad
Bryant Le
Hien Nguyen
93
2
0
29 Apr 2024
Research and application of artificial intelligence based webshell detection model: A literature review
Mingrui Ma
Lansheng Han
Chunjie Zhou
268
5
0
28 Apr 2024
Setting up the Data Printer with Improved English to Ukrainian Machine Translation
Yurii Paniv
Dmytro Chaplynskyi
Nikita Trynus
Volodymyr Kyrylov
AI4CE
242
3
0
23 Apr 2024
Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory
Hung Le
D. Nguyen
Kien Do
Svetha Venkatesh
T. Tran
177
0
0
18 Apr 2024
Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation
Jingmin Sun
Yuxuan Liu
Zecheng Zhang
Hayden Schaeffer
AI4CE
319
33
0
18 Apr 2024
Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study
Zooey Nguyen
Anthony Annunziata
Vinh Luong
Sang Dinh
Quynh Le
Anh Hai Ha
Chanh Le
Hong An Phan
Shruti Raghavan
Christopher Nguyen
LRM
137
7
0
17 Apr 2024
AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts
Meng Jiang
Y. Yu
Qing Zhao
Jianqiang Li
Changwei Song
...
Wei-dong Zhai
Dan Luo
Xiaoqin Wang
Guanghui Fu
Bing Xiang Yang
142
3
0
17 Apr 2024
Position Engineering: Boosting Large Language Models through Positional Information Manipulation
Zhiyuan He
Huiqiang Jiang
Zilong Wang
Yuqing Yang
Luna Qiu
Lili Qiu
LLMAG
81
12
0
17 Apr 2024
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
Woomin Song
Seunghyuk Oh
Sangwoo Mo
Jaehyung Kim
Sukmin Yun
Jung-Woo Ha
Jinwoo Shin
170
29
0
16 Apr 2024
TEL'M: Test and Evaluation of Language Models
G. Cybenko
Joshua Ackerman
Paul Lintilhac
ALM
ELM
305
1
0
16 Apr 2024
TransformerFAM: Feedback attention is working memory
Dongseong Hwang
Weiran Wang
Zhuoyuan Huo
K. Sim
P. M. Mengibar
332
17
0
14 Apr 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
232
13
0
13 Apr 2024
NeurIT: Pushing the Limit of Neural Inertial Tracking for Indoor Robotic IoT
Xinzhe Zheng
Sijie Ji
Yipeng Pan
Kaiwen Zhang
Chenshu Wu
239
2
0
13 Apr 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Xuezhe Ma
Xiaomeng Yang
Wenhan Xiong
Beidi Chen
Lili Yu
Hao Zhang
Jonathan May
Luke Zettlemoyer
Omer Levy
Chunting Zhou
155
48
0
12 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM
LLMAG
CLL
269
158
0
10 Apr 2024
Bidirectional Long-Range Parser for Sequential Data Understanding
George Leotescu
Daniel Voinea
A. Popa
185
1
0
08 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
245
25
0
05 Apr 2024
Training LLMs over Neurally Compressed Text
Brian Lester
Jaehoon Lee
A. Alemi
Jeffrey Pennington
Adam Roberts
Jascha Narain Sohl-Dickstein
Noah Constant
175
9
0
04 Apr 2024
A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation
International Conference on Language Resources and Evaluation (LREC), 2024
Jifan Yu
Xiaohan Zhang
Yifan Xu
Xuanyu Lei
Zijun Yao
Jing Zhang
Lei Hou
Juanzi Li
HILM
245
4
0
04 Apr 2024
Streaming Dense Video Captioning
Xingyi Zhou
Anurag Arnab
Shyamal Buch
Shen Yan
Austin Myers
Xuehan Xiong
Arsha Nagrani
Cordelia Schmid
VLM
221
72
0
01 Apr 2024
Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training
Vivian Liu
Yiqiao Yin
270
38
0
01 Apr 2024