ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1901.02860
  4. Cited By
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
    VLM
ArXivPDFHTML

Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

50 / 597 papers shown
Title
Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait
Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait
Feng Liu
Nicholas Chimitt
Lanqing guo
Jitesh Jain
Aditya Kane
...
Arun Ross
Humphrey Shi
Zhangyang Wang
A. Jain
Xiaoming Liu
CVBM
27
1
0
07 May 2025
Compact Recurrent Transformer with Persistent Memory
Compact Recurrent Transformer with Persistent Memory
Edison Mucllari
Z. Daniels
David C. Zhang
Qiang Ye
CLL
VLM
49
0
0
02 May 2025
A Character-based Diffusion Embedding Algorithm for Enhancing the Generation Quality of Generative Linguistic Steganographic Texts
A Character-based Diffusion Embedding Algorithm for Enhancing the Generation Quality of Generative Linguistic Steganographic Texts
Yingquan Chen
Qianmu Li
Xiaocong Wu
Huifeng Li
Qing Chang
DiffM
29
0
0
02 May 2025
Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures
Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures
Heng-Sheng Chang
P. Mehta
36
0
0
01 May 2025
From Attention to Atoms: Spectral Dictionary Learning for Fast, Interpretable Language Models
From Attention to Atoms: Spectral Dictionary Learning for Fast, Interpretable Language Models
Andrew Kiruluta
24
0
0
29 Apr 2025
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Yi Lu
Wanxu Zhao
Xin Zhou
Chenxin An
C. Wang
...
Jun Zhao
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
39
0
0
26 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao W. Wang
Songruoyao Wu
Jiaxing Yu
K. Zhang
MGen
VGen
70
1
0
01 Apr 2025
TRA: Better Length Generalisation with Threshold Relative Attention
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper
Roland Fernandez
P. Smolensky
Jianfeng Gao
46
0
0
29 Mar 2025
Semi-supervised Node Importance Estimation with Informative Distribution Modeling for Uncertainty Regularization
Semi-supervised Node Importance Estimation with Informative Distribution Modeling for Uncertainty Regularization
Yankai Chen
Taotao Wang
Yixiang Fang
Yunyu Xiao
BDL
100
1
0
26 Mar 2025
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
45
0
0
22 Mar 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola
Aaron Gokaslan
Justin T Chiu
Zhihan Yang
Zhixuan Qi
Jiaqi Han
S. Sahoo
Volodymyr Kuleshov
DiffM
72
5
0
12 Mar 2025
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
155
0
0
10 Mar 2025
Transformer Meets Twicing: Harnessing Unattended Residual Information
Laziz U. Abdullaev
Tan M. Nguyen
41
2
0
02 Mar 2025
FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference
FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference
Bingzhe Zhao
Ke Cheng
Aomufei Yuan
Yuxuan Tian
Ruiguang Zhong
Chengchen Hu
Tong Yang
Lian Yu
51
0
0
19 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
77
0
0
18 Feb 2025
Associative Recurrent Memory Transformer
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Mikhail Burtsev
68
2
0
17 Feb 2025
ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition
ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition
Muhammad Waseem Akram
Stefano Dettori
V. Colla
Giorgio Buttazzo
52
0
0
17 Feb 2025
Theoretical Benefit and Limitation of Diffusion Language Model
Theoretical Benefit and Limitation of Diffusion Language Model
Guhao Feng
Yihan Geng
Jian-Yu Guan
Wei Yu Wu
Liwei Wang
Di He
DiffM
55
0
0
13 Feb 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
39
0
0
06 Feb 2025
Music Generation using Human-In-The-Loop Reinforcement Learning
Aju Ani Justus
34
0
0
28 Jan 2025
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
Kai-Tuo Xu
Feng-Long Xie
Xu Tang
Yao Hu
69
4
0
24 Jan 2025
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
Thibaut Thonet
Jos Rozen
Laurent Besacier
RALM
137
2
0
20 Jan 2025
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition
Shucong Zhang
Titouan Parcollet
Rogier van Dalen
Sourav Bhattacharya
44
0
0
10 Jan 2025
Deep Neural Networks and Brain Alignment: Brain Encoding and Decoding (Survey)
Deep Neural Networks and Brain Alignment: Brain Encoding and Decoding (Survey)
S. Oota
Zijiao Chen
Manish Gupta
R. Bapi
G. Jobard
F. Alexandre
X. Hinaut
3DV
AI4CE
49
11
0
31 Dec 2024
Investigating Length Issues in Document-level Machine Translation
Investigating Length Issues in Document-level Machine Translation
Ziqian Peng
Rachel Bawden
François Yvon
69
1
0
23 Dec 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide
  Image Analysis
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
38
4
0
18 Oct 2024
An Evolved Universal Transformer Memory
An Evolved Universal Transformer Memory
Edoardo Cetin
Qi Sun
Tianyu Zhao
Yujin Tang
146
0
0
17 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
90
0
0
10 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
16
0
06 Oct 2024
S7: Selective and Simplified State Space Layers for Sequence Modeling
S7: Selective and Simplified State Space Layers for Sequence Modeling
Taylan Soydan
Nikola Zubić
Nico Messikommer
Siddhartha Mishra
Davide Scaramuzza
37
4
0
04 Oct 2024
Towards LifeSpan Cognitive Systems
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
141
1
0
20 Sep 2024
The Compressor-Retriever Architecture for Language Model OS
The Compressor-Retriever Architecture for Language Model OS
Yuan Yang
Siheng Xiong
Ehsan Shareghi
Faramarz Fekri
RALM
KELM
32
1
0
02 Sep 2024
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection
Yonghui Wang
Shaokai Liu
Li Li
Wengang Zhou
Houqiang Li
ViT
49
1
0
07 Aug 2024
Jailbreaking Text-to-Image Models with LLM-Based Agents
Jailbreaking Text-to-Image Models with LLM-Based Agents
Yingkai Dong
Zheng Li
Xiangtao Meng
Ning Yu
Shanqing Guo
LLMAG
45
13
0
01 Aug 2024
Genomic Language Models: Opportunities and Challenges
Genomic Language Models: Opportunities and Challenges
Gonzalo Benegas
Chengzhong Ye
C. Albors
Jianan Canal Li
Yun S. Song
AI4CE
LM&MA
ELM
41
18
0
16 Jul 2024
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long
  Motion Generation
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
Zeyu Zhang
Akide Liu
Qi Chen
Feng Chen
Ian Reid
Richard Hartley
Bohan Zhuang
Hao Tang
Mamba
31
9
0
14 Jul 2024
Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text
Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text
Lucio La Cava
Davide Costa
Andrea Tagarelli
DeLMO
40
2
0
12 Jul 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
88
2
0
09 Jul 2024
CLIPVQA:Video Quality Assessment via CLIP
CLIPVQA:Video Quality Assessment via CLIP
Fengchuang Xing
Mingjie Li
Yuan-Gen Wang
Guopu Zhu
Xiaochun Cao
CLIP
ViT
38
4
0
06 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Z. Zhang
Di He
KELM
33
0
0
03 Jul 2024
VoCo-LLaMA: Towards Vision Compression with Large Language Models
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLM
VLM
35
22
0
18 Jun 2024
UniZero: Generalized and Efficient Planning with Scalable Latent World Models
UniZero: Generalized and Efficient Planning with Scalable Latent World Models
Yuan Pu
Yazhe Niu
Jiyuan Ren
Zhenjie Yang
Hongsheng Li
Yu Liu
OffRL
43
1
0
15 Jun 2024
Visual Representation Learning with Stochastic Frame Prediction
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
42
2
0
11 Jun 2024
CTC-based Non-autoregressive Textless Speech-to-Speech Translation
CTC-based Non-autoregressive Textless Speech-to-Speech Translation
Qingkai Fang
Zhengrui Ma
Yan Zhou
Min Zhang
Yang Feng
52
0
0
11 Jun 2024
X-VILA: Cross-Modality Alignment for Large Language Model
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Wei Ping
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
45
29
0
29 May 2024
Categorical Flow Matching on Statistical Manifolds
Categorical Flow Matching on Statistical Manifolds
Chaoran Cheng
Jiahan Li
Jian-wei Peng
Ge Liu
61
10
0
26 May 2024
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Nikola Zubić
Federico Soldá
Aurelio Sulser
Davide Scaramuzza
LRM
BDL
50
5
0
26 May 2024
Towards Better Understanding of In-Context Learning Ability from
  In-Context Uncertainty Quantification
Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification
Shang Liu
Zhongze Cai
Guanting Chen
Xiaocheng Li
UQCV
46
1
0
24 May 2024
A Structure-Aware Framework for Learning Device Placements on Computation Graphs
A Structure-Aware Framework for Learning Device Placements on Computation Graphs
Shukai Duan
Heng Ping
Nikos Kanakaris
Xiongye Xiao
Panagiotis Kyriakis
...
Guixiang Ma
Mihai Capota
Shahin Nazarian
Theodore L. Willke
Paul Bogdan
45
2
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
80
42
0
23 May 2024
1234...101112
Next