Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
    VLM

Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

50 / 2,022 papers shown
Signformer is all you need: Towards Edge AI for Sign Language
Eta Yang
SLR
305
0
0
19 Nov 2024
KLCBL: An Improved Police Incident Classification Model
Liu Zhuoxian
Shi Tuo
Hu Xiaofeng
182
0
0
11 Nov 2024
EviRerank: Adaptive Evidence Construction for Long-Document LLM Reranking
Minghan Li
Eric Gaussier
Juntao Li
Guodong Zhou
ALM
211
0
0
09 Nov 2024
The Evolution of RWKV: Advancements in Efficient Language Modeling
Akul Datta
VLM
188
1
0
05 Nov 2024
Provable Length Generalization in Sequence Prediction via Spectral Filtering
Annie Marsden
Evan Dogariu
Naman Agarwal
Xinyi Chen
Daniel Suo
Elad Hazan
346
1
0
01 Nov 2024
Video Token Merging for Long-form Video Understanding
Seon-Ho Lee
Jue Wang
Zhikang Zhang
D. Fan
Xinyu Li
291
15
0
31 Oct 2024
Generating Realistic Tabular Data with Large Language Models
Industrial Conference on Data Mining (IDM), 2024
Dang Nguyen
Sunil Gupta
Kien Do
Thin Nguyen
Svetha Venkatesh
LMTD
SyDa
215
6
0
29 Oct 2024
Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Aosong Feng
Rex Ying
Leandros Tassiulas
247
3
0
28 Oct 2024
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
Applied Soft Computing (Appl. Soft Comput.), 2024
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
232
3
0
24 Oct 2024
Large Body Language Models
Saif Punjwani
Larry Heck
172
0
0
21 Oct 2024
Generalized Probabilistic Attention Mechanism in Transformers
DongNyeong Heo
Heeyoul Choi
277
3
0
21 Oct 2024
Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization
Bin Lin
Yanzhen Yu
Jianhao Ye
Ruitao Lv
Yue Yang
Ruoye Xie
Pan Yu
Hongbin Zhou
VGen
261
3
0
18 Oct 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Neural Information Processing Systems (NeurIPS), 2024
Honglin Li
Yunlong Zhang
Pingyi Chen
Honglin Li
Chenglu Zhu
Lin Yang
MedIm
289
12
0
18 Oct 2024
An Evolved Universal Transformer Memory
International Conference on Learning Representations (ICLR), 2024
Edoardo Cetin
Qi Sun
Tianyu Zhao
Yujin Tang
1.3K
4
0
17 Oct 2024
How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Guhao Feng
Kai-Bo Yang
Yuntian Gu
Xinyue Ai
Shengjie Luo
Jiacheng Sun
Di He
Hao Sun
Liwei Wang
LRM
310
13
0
17 Oct 2024
Super-resolving Real-world Image Illumination Enhancement: A New Dataset and A Conditional Diffusion Model
Yang Liu
Yaofang Liu
J. Pan
Yuxiang Hui
Fan Jia
Raymond H. Chan
T. Zeng
221
1
0
16 Oct 2024
Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications
IEEE Wireless Communications and Networking Conference (WCNC), 2024
Thaina Saraiva
Marco Sousa
Pedro Vieira
António Rodrigues
282
4
0
15 Oct 2024
Survey and Evaluation of Converging Architecture in LLMs based on Footsteps of Operations
IEEE Open Journal of the Computer Society (JOCS), 2024
Seongho Kim
Jihyun Moon
Juntaek Oh
Insu Choi
Joon-Sung Yang
162
0
0
15 Oct 2024
SLaNC: Static LayerNorm Calibration
Mahsa Salmani
Nikita Trukhanov
I. Soloveychik
MQ
244
0
0
14 Oct 2024
BookWorm: A Dataset for Character Description and Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Argyrios Papoudakis
Mirella Lapata
Frank Keller
195
2
0
14 Oct 2024
ChuLo: Chunk-Level Key Information Representation for Long Document Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yan Li
Soyeon Caren Han
Yue Dai
Feiqi Cao
451
1
0
14 Oct 2024
GEM-VPC: A dual Graph-Enhanced Multimodal integration for Video Paragraph Captioning
Eileen Wang
Caren Han
Josiah Poon
212
1
0
12 Oct 2024
ACER: Automatic Language Model Context Extension via Retrieval
Luyu Gao
Yunyi Zhang
Jamie Callan
RALM
177
0
0
11 Oct 2024
On the token distance modeling ability of higher RoPE attention dimension
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xiangyu Hong
Che Jiang
Biqing Qi
Fandong Meng
Mo Yu
Bowen Zhou
Jie Zhou
245
10
0
11 Oct 2024
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
Asian Conference on Computer Vision (ACCV), 2024
Nguyen Huu Bao Long
Chenyu Zhang
Yuzhi Shi
Tsubasa Hirakawa
Takayoshi Yamashita
Tohgoroh Matsui
H. Fujiyoshi
221
10
0
11 Oct 2024
HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction
Neural Information Processing Systems (NeurIPS), 2024
Qianyue Hao
Jingyang Fan
Fengli Xu
Jian Yuan
Yong Li
199
16
0
10 Oct 2024
Chain-of-Sketch: Enabling Global Visual Reasoning
Aryo Lotfi
Enrico Fini
Samy Bengio
Moin Nabi
Emmanuel Abbe
LRM
296
0
0
10 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
869
5
0
10 Oct 2024
TouchInsight: Uncertainty-aware Rapid Touch and Text Input for Mixed Reality from Egocentric Vision
ACM Symposium on User Interface Software and Technology (UIST), 2024
Paul Streli
Mark Richardson
Fadi Botros
Shugao Ma
Robert Wang
Christian Holz
188
15
0
08 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
362
11
0
07 Oct 2024
Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xinyu Liu
Runsong Zhao
Pengcheng Huang
Chunyang Xiao
Bei Li
Jingang Wang
Tong Xiao
Jingbo Zhu
162
5
0
07 Oct 2024
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting
International Conference on Learning Representations (ICLR), 2024
Yong Liu
Guo Qin
Xiangdong Huang
Jianmin Wang
Mingsheng Long
AI4TS
333
38
0
07 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
631
46
0
06 Oct 2024
Correlation-Aware Select and Merge Attention for Efficient Fine-Tuning and Context Length Extension
Ning Wang
Zekun Li
Tongxin Bai
Guoqi Li
141
0
0
05 Oct 2024
S7: Selective and Simplified State Space Layers for Sequence Modeling
Taylan Soydan
Nikola Zubić
Nico Messikommer
Siddhartha Mishra
Davide Scaramuzza
273
13
0
04 Oct 2024
ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question Answering
Huayang Li
Pat Verga
Priyanka Sen
Bowen Yang
Vijay Viswanathan
Patrick Lewis
Taro Watanabe
Yixuan Su
RALM
LRM
209
19
0
04 Oct 2024
MELODI: Exploring Memory Compression for Long Contexts
International Conference on Learning Representations (ICLR), 2024
Yinpeng Chen
DeLesley Hutchins
Aren Jansen
Andrey Zhmoginov
David Racz
Jesper Andersen
194
3
0
04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs
Wei Wu
Chao Wang
L. Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
381
3
0
04 Oct 2024
Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Sudipta Singha Roy
Xindi Wang
Robert E. Mercer
Frank Rudzicz
173
0
0
03 Oct 2024
ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI
Ahmad Elawady
Gunjan Chhablani
Ram Ramrakhya
Karmesh Yadav
Dhruv Batra
Z. Kira
Andrew Szot
OffRL
344
2
0
03 Oct 2024
Efficient Streaming LLM for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Junteng Jia
Gil Keren
Wei Zhou
Egor Lakomkin
Xiaohui Zhang
Chunyang Wu
Frank Seide
Jay Mahadeokar
Ozlem Kalinli
AuLLM
214
6
0
02 Oct 2024
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Pengfei Cai
Yan Song
Nan Jiang
Qing Gu
Ian Mcloughlin
218
5
0
26 Sep 2024
Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions
Zeyneb N. Kaya
Souvick Ghosh
129
0
0
25 Sep 2024
Generative AI-driven forecasting of oil production
Yash Gandhi
Kexin Zheng
Birendra Jha
K. Nomura
A. Nakano
P. Vashishta
R. Kalia
201
1
0
24 Sep 2024
Ads that Talk Back: Implications and Perceptions of Injecting Personalized Advertising into LLM Chatbots
Brian Tang
Kaiwen Sun
Noah T. Curran
F. Schaub
Kang G. Shin
SILM
282
3
0
23 Sep 2024
PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference
Zeyu Zhang
Haiying Shen
VLM
348
1
0
23 Sep 2024
FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs
International Conference on Field-Programmable Technology (ICFPT), 2024
Ehsan Kabir
Md. Arafat Kabir
Austin R. J. Downey
Jason D. Bakos
David Andrews
Miaoqing Huang
GNN
271
2
0
21 Sep 2024
"I Never Said That": A dataset, taxonomy and baselines on response clarity classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Konstantinos Thomas
Giorgos Filandrianos
Maria Lymperaiou
Chrysoula Zerva
Giorgos Stamou
177
0
0
20 Sep 2024
Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey
Sourav Verma
RALM
3DV
259
7
0
20 Sep 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
995
8
0
20 Sep 2024