1901.02860
v3 (latest)
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Papers citing
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 2,022 papers shown
CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Junlin Lv
Yuan Feng
Xike Xie
Xin Jia
Qirong Peng
Guiming Xie
287
5
0
19 Sep 2024
StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Suzhen Wang
Yifeng Ma
Yu Ding
Zhipeng Hu
Changjie Fan
Tangjie Lv
Zhidong Deng
Xin Yu
260
19
0
14 Sep 2024
Exploring SSL Discrete Tokens for Multilingual ASR
Mingyu Cui
Daxin Tan
Yifan Yang
Dingdong Wang
Huimeng Wang
Xiao Chen
Xie Chen
Xunying Liu
257
5
0
13 Sep 2024
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Mingyu Cui
Yifan Yang
Jiajun Deng
Jiawen Kang
Shujie Hu
Tianzi Wang
Zhaoqing Li
Shiliang Zhang
Xie Chen
Xunying Liu
246
2
0
13 Sep 2024
Inf-MLLM: Efficient Streaming Inference of Multimodal Large Language Models on a Single GPU
Zhenyu Ning
Jieru Zhao
Qihao Jin
Wenchao Ding
Minyi Guo
70
17
0
11 Sep 2024
Multimodal Emotion Recognition with Vision-language Prompting and Modality Dropout
Anbin QI
Zhongliang Liu
Xinyong Zhou
Jinba Xiao
Fengrun Zhang
Qi Gan
Ming Tao
Gaozheng Zhang
Lu Zhang
VLM
152
10
0
11 Sep 2024
DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models
Maryam Akhavan Aghdam
Hongpeng Jin
Yanzhao Wu
MoE
214
6
0
10 Sep 2024
DetoxBench: Benchmarking Large Language Models for Multitask Fraud & Abuse Detection
Joymallya Chakraborty
Wei Xia
Anirban Majumder
Dan Ma
Walid Chaabene
Naveed Janvekar
144
7
0
09 Sep 2024
An overview of domain-specific foundation model: key technologies, applications and challenges
Science China Information Sciences (Sci. China Inf. Sci.), 2024
Haolong Chen
Hanzhi Chen
Zijian Zhao
Kaifeng Han
Guangxu Zhu
Yichen Zhao
Ying Du
Wei Xu
Qingjiang Shi
ALM
VLM
482
19
0
06 Sep 2024
Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis
ACM Multimedia (MM), 2024
Xianbing Zhao
Zhuang Li
Tao Feng
Jianfei Cai
Buzhou Tang
211
2
0
05 Sep 2024
The Compressor-Retriever Architecture for Language Model OS
Yuan Yang
Siheng Xiong
Ehsan Shareghi
Faramarz Fekri
RALM
KELM
266
2
0
02 Sep 2024
DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning
Keer Lu
Xiaonan Nie
Zhuoran Zhang
Zheng Liang
Da Pan
...
Weipeng Chen
Guosheng Dong
Bin Cui
Wentao Zhang
217
2
0
02 Sep 2024
MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT
International Society for Music Information Retrieval Conference (ISMIR), 2024
Jinlong Zhu
Keigo Sakurai
Ren Togo
Takahiro Ogawa
Miki Haseyama
GAN
244
1
0
02 Sep 2024
MemLong: Memory-Augmented Retrieval for Long Text Modeling
Weijie Liu
Zecheng Tang
Juntao Li
Kehai Chen
Min Zhang
RALM
162
9
0
30 Aug 2024
Maelstrom Networks
Matthew Evanusa
Cornelia Fermuller
Yiannis Aloimonos
157
2
0
29 Aug 2024
HLogformer: A Hierarchical Transformer for Representing Log Data
Zhichao Hou
Mina Ghashami
Mikhail Kuznetsov
MohamadAli Torkamani
194
1
0
29 Aug 2024
Evaluating Credit VIX (CDS IV) Prediction Methods with Incremental Batch Learning
Robert Taylor
98
0
0
27 Aug 2024
A Survey of Large Language Models for European Languages
Wazir Ali
S. Pyysalo
379
6
0
27 Aug 2024
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
M. Russak
Umar Jamil
Christopher Bryant
Kiran Kamble
Axel Magnuson
Mateusz Russak
Waseem Alshikh
197
6
0
27 Aug 2024
Bengali Sign Language Recognition through Hand Pose Estimation using Multi-Branch Spatial-Temporal Attention Model
Abu Saleh Musa Miah
Md. Al-Hasan
Md Hadiuzzaman
Muhammad Nazrul Islam
Jungpil Shin
SLR
163
0
0
26 Aug 2024
Mixed Sparsity Training: Achieving 4× FLOP Reduction for Transformer Pretraining
Pihe Hu
Shaolong Li
Longbo Huang
193
0
0
21 Aug 2024
Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Amey Hengle
Prasoon Bajpai
Soham Dan
Tanmoy Chakraborty
LRM
228
7
0
19 Aug 2024
Image-based Freeform Handwriting Authentication with Energy-oriented Self-Supervised Learning
IEEE Transactions on Multimedia (IEEE TMM), 2024
Wenwen Qiang
Luntian Mou
Changwen Zheng
Wen Gao
AAML
183
4
0
19 Aug 2024
MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection
Interspeech (Interspeech), 2024
Pengfei Cai
Yan Song
Kang Li
Haoyu Song
Ian Mcloughlin
225
9
0
16 Aug 2024
Survey: Transformer-based Models in Data Modality Conversion
Elyas Rashno
Amir Eskandari
Aman Anand
F. Zulkernine
MedIm
225
6
0
08 Aug 2024
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yilong Chen
Guoxia Wang
Junyuan Shang
Shiyao Cui
Zhenyu Zhang
Tingwen Liu
Shuohuan Wang
Yu Sun
Dianhai Yu
Hua Wu
253
31
0
07 Aug 2024
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection
Yonghui Wang
Shaokai Liu
Li Li
Wengang Zhou
Houqiang Li
ViT
215
5
0
07 Aug 2024
Clustering and Mining Accented Speech for Inclusive and Fair Speech Recognition
Jaeyoung Kim
Han Lu
S. Khorram
Anshuman Tripathi
Qian Zhang
Hasim Sak
118
0
0
05 Aug 2024
Long Input Benchmark for Russian Analysis
I. Churin
Murat Apishev
Maria Tikhonova
Denis Shevelev
Aydar Bulatov
Yuri Kuratov
Sergej Averkiev
Alena Fenogenova
161
2
0
05 Aug 2024
DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting
International Conference on Information and Knowledge Management (CIKM), 2024
Ruixin Ding
Yuqi Chen
Yu-Ting Lan
Wei Zhang
AI4TS
177
13
0
05 Aug 2024
Fuzz-Testing Meets LLM-Based Agents: An Automated and Efficient Framework for Jailbreaking Text-To-Image Generation Models
IEEE Symposium on Security and Privacy (S&P), 2024
Yingkai Dong
Xiangtao Meng
Ning Yu
Zheng Li
Shanqing Guo
LLMAG
433
17
0
01 Aug 2024
Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional Representation
Jingyue Huang
Ke Chen
Yi-Hsuan Yang
CoGe
230
13
0
30 Jul 2024
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Neural Information Processing Systems (NeurIPS), 2024
Gagan Jain
Nidhi Hegde
Aditya Kusupati
Arsha Nagrani
Shyamal Buch
Prateek Jain
Anurag Arnab
Sujoy Paul
MoE
269
17
0
29 Jul 2024
Practical and Reproducible Symbolic Music Generation by Large Language Models with Structural Embeddings
Seungyeon Rhyu
Kichang Yang
Sungjun Cho
Jaehyeon Kim
Kyogu Lee
Moontae Lee
253
0
0
29 Jul 2024
QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning
Mostafa Kotb
C. Weber
Muhammad Burhan Hafez
Stefan Wermter
234
6
0
26 Jul 2024
VILA^2: VILA Augmented VILA
Yunhao Fang
Ligeng Zhu
Yao Lu
Yan Wang
Pavlo Molchanov
Jang Hyun Cho
Marco Pavone
Song Han
Hongxu Yin
VLM
250
4
0
24 Jul 2024
Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models
Yida Zhao
Chao Lou
Kewei Tu
212
2
0
24 Jul 2024
What Matters in Explanations: Towards Explainable Fake Review Detection Focusing on Transformers
Md. Shajalal
Md. Atabuzzaman
Alexander Boden
Gunnar Stevens
Delong Du
254
0
0
24 Jul 2024
MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training
Cheng Luo
Jiawei Zhao
Zhuoming Chen
Beidi Chen
A. Anandkumar
264
5
0
22 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
471
84
0
20 Jul 2024
On Pre-training of Multimodal Language Models Customized for Chart Understanding
Wan-Cyuan Fan
Yen-Chun Chen
Xiyang Dai
Lu Yuan
Leonid Sigal
360
10
0
19 Jul 2024
Transformer-based Single-Cell Language Model: A Survey
Wei Lan
Guohang He
Mingyang Liu
Qingfeng Chen
Junyue Cao
Wei Peng
MedIm
LRM
217
19
0
18 Jul 2024
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities
To Eun Kim
Alireza Salemi
Andrew Drozdov
Fernando Diaz
Hamed Zamani
367
10
0
17 Jul 2024
Genomic Language Models: Opportunities and Challenges
Gonzalo Benegas
Chengzhong Ye
C. Albors
Jianan Canal Li
Yun S. Song
AI4CE
LM&MA
ELM
339
54
0
16 Jul 2024
Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs
W. J. Meijer
A. C. Kemmeren
E.H.J. Riemens
J. E. Fransman
M. V. Bekkum
G. J. Burghouts
J. D. V. Mil
249
0
0
15 Jul 2024
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
Zeyu Zhang
Akide Liu
Qi Chen
Feng Chen
Ian Reid
Richard Hartley
Bohan Zhuang
Hao Tang
Mamba
155
19
0
14 Jul 2024
Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text
Lucio La Cava
Davide Costa
Andrea Tagarelli
DeLMO
374
9
0
12 Jul 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
414
5
0
09 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
349
36
0
06 Jul 2024
CLIPVQA: Video Quality Assessment via CLIP
Fengchuang Xing
Mingjie Li
Yuan-Gen Wang
Guopu Zhu
Xiaochun Cao
CLIP
ViT
287
16
0
06 Jul 2024