1901.02860
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Papers citing
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 2,017 papers shown
Title
AISPACE at SemEval-2024 task 8: A Class-balanced Soft-voting System for Detecting Multi-generator Machine-generated Text
Renhua Gu
Xiangfeng Meng
DeLMO
181
4
0
01 Apr 2024
LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
162
8
0
01 Apr 2024
Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods
Yuji Cao
Huan Zhao
Yuheng Cheng
Ting Shu
Guolong Liu
Gaoqi Liang
Junhua Zhao
Yun Li
LLMAG
KELM
OffRL
LM&Ro
324
142
0
30 Mar 2024
RealKIE: Five Novel Datasets for Enterprise Key Information Extraction
Benjamin Townsend
Madison May
Katherine Mackowiak
Christopher Wells
SyDa
233
0
0
29 Mar 2024
Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality
Sishuo Chen
Lei Li
Shuhuai Ren
Rundong Gao
Yuanxin Liu
Xiaohan Bi
Xu Sun
Lu Hou
194
3
0
28 Mar 2024
A Survey on Large Language Models from Concept to Implementation
Chen Wang
Jin Zhao
Jiaqi Gong
LLMAG
LM&MA
315
7
0
27 Mar 2024
EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention
Zhen Tian
Wayne Xin Zhao
Changwang Zhang
Xin Zhao
Zhongrui Ma
Ji-Rong Wen
195
5
0
26 Mar 2024
Equipping Sketch Patches with Context-Aware Positional Encoding for Graphic Sketch Representation
Sicong Zang
Zhijun Fang
258
1
0
26 Mar 2024
Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention
USENIX Annual Technical Conference (USENIX ATC), 2024
Bin Gao
Zhuomin He
Puru Sharma
Qingxuan Kang
Djordje Jevdjic
Junbo Deng
Xingkun Yang
Zhou Yu
Pengfei Zuo
302
96
0
23 Mar 2024
EAGLE: A Domain Generalization Framework for AI-generated Text Detection
Amrita Bhattacharjee
Raha Moraffah
Joshua Garland
Huan Liu
DeLMO
171
15
0
23 Mar 2024
Differentially Private Next-Token Prediction of Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
James Flemings
Meisam Razaviyayn
Murali Annavaram
337
19
0
22 Mar 2024
KeyPoint Relative Position Encoding for Face Recognition
Minchul Kim
Yiyang Su
Feng Liu
Anil Jain
Xiaoming Liu
CVBM
167
20
0
21 Mar 2024
Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network
Zih-Syuan Huang
Ching-pei Lee
110
0
0
21 Mar 2024
Large Language Models for Blockchain Security: A Systematic Literature Review
Zheyuan He
Zihao Li
Sen Yang
Ao Qiao
Xiaosong Zhang
Xiapu Luo
Ting Chen
PILM
485
30
0
21 Mar 2024
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control
Enshen Zhou
Yiran Qin
Zhen-fei Yin
Yuzhou Huang
Ruimao Zhang
Lu Sheng
Yu Qiao
Jing Shao
LM&Ro
AI4CE
232
48
0
18 Mar 2024
Towards Understanding the Relationship between In-context Learning and Compositional Generalization
International Conference on Language Resources and Evaluation (LREC), 2024
Sungjun Han
Sebastian Padó
CoGe
185
4
0
18 Mar 2024
Scaling Instructable Agents Across Many Simulated Worlds
Sima Team
Maria Abi Raad
Arun Ahuja
Catarina Barros
F. Besse
...
Daan Wierstra
Duncan Williams
Nathaniel Wong
Sarah York
Nick Young
LM&Ro
325
64
0
13 Mar 2024
StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses
Neural Information Processing Systems (NeurIPS), 2024
Jia-Nan Li
Quan Tu
Cunli Mao
Zhengtao Yu
Ji-Rong Wen
Rui Yan
OffRL
174
6
0
13 Mar 2024
LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model
AAAI Conference on Artificial Intelligence (AAAI), 2024
Linmei Hu
Hongyu He
Duokang Wang
Ziwang Zhao
Yingxia Shao
Liqiang Nie
108
37
0
12 Mar 2024
Memory-based Adapters for Online 3D Scene Perception
Computer Vision and Pattern Recognition (CVPR), 2024
Xiuwei Xu
Chong Xia
Ziwei Wang
Linqing Zhao
Yueqi Duan
Jie Zhou
Jiwen Lu
3DPC
150
12
0
11 Mar 2024
LIEDER: Linguistically-Informed Evaluation for Discourse Entity Recognition
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xiaomeng Zhu
Robert Frank
208
0
0
10 Mar 2024
Mastering Memory Tasks with World Models
Mohammad Reza Samsami
Artem Zholus
Janarthanan Rajendran
Sarath Chandar
CLL
OffRL
247
38
0
07 Mar 2024
TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax
Tobias Christian Nauen
Sebastián M. Palacio
Andreas Dengel
236
6
0
05 Mar 2024
NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function
Abdullah Nazhat Abdullah
Tarkan Aydin
305
0
0
04 Mar 2024
xT: Nested Tokenization for Larger Context in Large Images
Ritwik Gupta
Shufan Li
Tyler Lixuan Zhu
Jitendra Malik
Trevor Darrell
K. Mangalam
ViT
166
7
0
04 Mar 2024
Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models
Amal Rannen-Triki
J. Bornschein
Razvan Pascanu
Marcus Hutter
Andras Gyorgy
Alexandre Galashov
Yee Whye Teh
Michalis K. Titsias
KELM
124
4
0
03 Mar 2024
Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey
Dinh-Viet-Toan Le
Louis Bigo
Mikaela Keller
Dorien Herremans
MedIm
199
26
0
27 Feb 2024
Video as the New Language for Real-World Decision Making
Sherry Yang
Jacob Walker
Jack Parker-Holder
Yilun Du
Jake Bruce
Andre Barreto
Pieter Abbeel
Dale Schuurmans
VGen
223
79
0
27 Feb 2024
Unifying Linear-Time Attention via Latent Probabilistic Modelling
Rares Dolga
Marius Cobzarenco
David Barber
142
1
0
27 Feb 2024
Label Informed Contrastive Pretraining for Node Importance Estimation on Knowledge Graphs
Tianyu Zhang
Chengbin Hou
Rui Jiang
Xuegong Zhang
Chenghu Zhou
Ke Tang
Hairong Lv
160
8
0
26 Feb 2024
On Languaging a Simulation Engine
Han Liu
Liantang Li
108
0
0
26 Feb 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
503
141
0
26 Feb 2024
MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models
Nathanaël Carraz Rakotonirina
Marco Baroni
VLM
KELM
111
1
0
23 Feb 2024
Spatially-Aware Transformer for Embodied Agents
Junmo Cho
Jaesik Yoon
Sungjin Ahn
184
1
0
23 Feb 2024
Scaling Efficient LLMs
B. N. Kausik
302
3
0
22 Feb 2024
Do Efficient Transformers Really Save Computation?
Kai-Bo Yang
Jan Ackermann
Zhenyu He
Guhao Feng
Bohang Zhang
Yunzhen Feng
Qiwei Ye
Di He
Liwei Wang
202
27
0
21 Feb 2024
Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective Questions
Liyan Xu
JiangNan Li
Mo Yu
Jie Zhou
184
10
0
21 Feb 2024
CAMELoT: Towards Large Language Models with Training-Free Consolidated Associative Memory
Zexue He
Leonid Karlinsky
Donghyun Kim
Julian McAuley
Dmitry Krotov
Rogerio Feris
KELM
RALM
174
18
0
21 Feb 2024
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
181
1
0
20 Feb 2024
Detecting misinformation through Framing Theory: the Frame Element-based Model
Guan-Hua Wang
Rebecca Frederick
Jinglong Duan
William Wong
V. Rupar
Weihua Li
Quan-wei Bai
152
6
0
19 Feb 2024
Sequential Recommendation on Temporal Proximities with Contrastive Learning and Self-Attention
Hansol Jung
Hyunwoo Seo
Chiehyeon Lim
AI4TS
146
1
0
15 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
240
85
0
15 Feb 2024
Changes by Butterflies: Farsighted Forecasting with Group Reservoir Transformer
Md Kowsher
Abdul Rafae Khan
Jia Xu
253
0
0
14 Feb 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
240
64
0
14 Feb 2024
Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks
Aaron Traylor
Jack Merullo
Michael J. Frank
Ellie Pavlick
150
6
0
13 Feb 2024
On the Resurgence of Recurrent Models for Long Sequences -- Survey and Research Opportunities in the Transformer Era
Matteo Tiezzi
Michele Casoni
Alessandro Betti
Tommaso Guidi
Marco Gori
S. Melacci
219
18
0
12 Feb 2024
Highly Accurate Disease Diagnosis and Highly Reproducible Biomarker Identification with PathFormer
Research Square (RS), 2023
Zehao Dong
Qihang Zhao
Philip R. O. Payne
Michael Province
C. Cruchaga
Muhan Zhang
Tianyu Zhao
Yixin Chen
Fuhai Li
AI4CE
127
8
0
11 Feb 2024
Large-Language-Model Empowered Dose Volume Histogram Prediction for Intensity Modulated Radiotherapy
Zehao Dong
Yixin Chen
Hiram Gay
Yao Hao
Geoff Hugo
Pamela Samson
Tianyu Zhao
121
1
0
11 Feb 2024
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jian Wang
Chak Tou Leong
Jiashuo Wang
Dongding Lin
Wenjie Li
Xiao-Yong Wei
212
14
0
10 Feb 2024
Memory Consolidation Enables Long-Context Video Understanding
Ivana Balažević
Yuge Shi
Pinelopi Papalampidi
Rahma Chaabouni
Skanda Koppula
Olivier J. Hénaff
391
45
0
08 Feb 2024