1901.02860
Cited By
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Papers citing
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 2,022 papers shown
Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey
Victoria Smith
Ali Shahin Shamsabadi
Carolyn Ashurst
Adrian Weller
PILM
484
41
0
27 Sep 2023
Segmentation-Free Streaming Machine Translation
Transactions of the Association for Computational Linguistics (TACL), 2023
Javier Iranzo-Sánchez
Jorge Iranzo-Sánchez
Adria Giménez
Jorge Civera Saiz
Alfons Juan
VOS
245
3
0
26 Sep 2023
Natural Language based Context Modeling and Reasoning for Ubiquitous Computing with Large Language Models: A Tutorial
Haoyi Xiong
Jiang Bian
Sijia Yang
Xiaofei Zhang
Linghe Kong
Daqing Zhang
LRM
LLMAG
276
8
0
24 Sep 2023
Lexical Squad@Multimodal Hate Speech Event Detection 2023: Multimodal Hate Speech Detection using Fused Ensemble Approach
CASE (CASE), 2023
Mohammad Kashif
Mohammad Zohair
Saquib Ali
88
4
0
23 Sep 2023
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Zican Dong
Tianyi Tang
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
RALM
ALM
409
48
0
23 Sep 2023
Unlocking Model Insights: A Dataset for Automated Model Card Generation
Shruti Singh
Hitesh Lodwal
Husain Malwat
Rakesh Thakur
Mayank Singh
SyDa
162
4
0
22 Sep 2023
Classification of Alzheimers Disease with Deep Learning on Eye-tracking Data
International Conference on Multimodal Interaction (ICMI), 2023
Harshinee Sriram
Cristina Conati
Thalia S. Field
163
19
0
22 Sep 2023
What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Shakila Mahjabin Tonni
Mark Dras
TDI
AAML
GAN
420
0
0
19 Sep 2023
Interactive Distillation of Large Single-Topic Corpora of Scientific Papers
International Conference on Machine Learning and Applications (ICMLA), 2023
N. Solovyev
Ryan Barron
Manish Bhattarai
M. Eren
Kim Ø. Rasmussen
Boian S. Alexandrov
152
3
0
19 Sep 2023
MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
Xinda Wu
Zhijie Huang
Kejun Zhang
Jiaxing Yu
Xu Tan
Tieyao Zhang
Zihao Wang
Lingyun Sun
208
8
0
19 Sep 2023
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
International Conference on Learning Representations (ICLR), 2023
Dawei Zhu
Nan Yang
Liang Wang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
444
100
0
19 Sep 2023
Collaborative Three-Stream Transformers for Video Captioning
Computer Vision and Image Understanding (CVIU), 2023
Hao Wang
Libo Zhang
Hengrui Fan
Tiejian Luo
196
8
0
18 Sep 2023
Music Generation based on Generative Adversarial Networks with Transformer
Ziyi Jiang
Ruoxue Wu
Zhenghan Chen
Xiaoxuan Liang
GAN
MGen
219
1
0
16 Sep 2023
Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Hao Yen
Sabato Marco Siniscalchi
Chin-Hui Lee
177
4
0
16 Sep 2023
Augmenting conformers with structured state-space sequence models for online speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Haozhe Shan
Albert Gu
Zhong Meng
Weiran Wang
Krzysztof Choromanski
Tara N. Sainath
RALM
204
8
0
15 Sep 2023
Résumé Parsing as Hierarchical Sequence Labeling: An Empirical Study
Federico Retyk
H. Fabregat
Juan Aizpuru
Mariana Taglio
Rabih Zbib
82
4
0
13 Sep 2023
Native Language Identification with Big Bird Embeddings
International Conference on Language Resources and Evaluation (LREC), 2023
Sergey Kramp
Giovanni Cassani
Chris Emmery
63
0
0
13 Sep 2023
BodyFormer: Semantics-guided 3D Body Gesture Synthesis with Transformer
ACM Transactions on Graphics (TOG), 2023
Kunkun Pang
Dafei Qin
Yingruo Fan
Julian Habekost
Takaaki Shiratori
Junichi Yamagishi
Taku Komura
SLR
ViT
136
26
0
07 Sep 2023
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023
Zhihang Xu
Shaofei Zhang
Xi Wang
Jiajun Zhang
Wenning Wei
Lei He
Sheng Zhao
236
2
0
06 Sep 2023
Language Models for Novelty Detection in System Call Traces
Quentin Fournier
Daniel Aloise
Leandro R. Costa
AI4TS
207
5
0
05 Sep 2023
ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer
Gyeongdong Yang
Yungwook Kwon
Hyunjin Kim
ViT
99
2
0
04 Sep 2023
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Chi Han
Qifan Wang
Yuan Yao
Wenhan Xiong
Yu Chen
Heng Ji
Sinong Wang
581
96
0
30 Aug 2023
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yushi Bai
Xin Lv
Jiajie Zhang
Hong Lyu
Jiankai Tang
...
Aohan Zeng
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
LLMAG
RALM
334
932
0
28 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Lin Geng Foo
Hossein Rahmani
Jing Liu
760
47
0
27 Aug 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
464
2,808
0
24 Aug 2023
Stabilizing RNN Gradients through Pre-training
Luca Herranz-Celotti
Jean Rouat
242
1
0
23 Aug 2023
How Much Temporal Long-Term Context is Needed for Action Segmentation?
IEEE International Conference on Computer Vision (ICCV), 2023
Emad Bahrami Rad
Gianpiero Francesca
Juergen Gall
ViT
241
45
0
22 Aug 2023
SimDA: Simple Diffusion Adapter for Efficient Video Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Zhen Xing
Jingdong Sun
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
268
105
0
18 Aug 2023
Learning Computational Efficient Bots with Costly Features
Anthony Kobanda
Valliappan C. A.
Joshua Romoff
Ludovic Denoyer
OffRL
140
2
0
18 Aug 2023
Story Visualization by Online Text Augmentation with Context Memory
IEEE International Conference on Computer Vision (ICCV), 2023
Daechul Ahn
Daneul Kim
Gwangmo Song
Seung Wook Kim
Honglak Lee
Luan Tuyen Chau
Jonghyun Choi
DiffM
263
8
0
15 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
International Conference on Learning Representations (ICLR), 2023
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
365
186
0
14 Aug 2023
A Novel Ehanced Move Recognition Algorithm Based on Pre-trained Models with Positional Embeddings
H. Wen
Jie Wang
Xiaodong Qiao
169
0
0
14 Aug 2023
Detecting Spells in Fantasy Literature with a Transformer Based Artificial Intelligence
Marcel Moravek
Alexander Zender
Andreas Müller
42
0
0
07 Aug 2023
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
Michaël Mathieu
Sherjil Ozair
Srivatsan Srinivasan
Çağlar Gülçehre
Shangtong Zhang
...
Sergio Gomez Colmenarejo
Aaron van den Oord
Wojciech M. Czarnecki
Nando de Freitas
Oriol Vinyals
OffRL
176
14
0
07 Aug 2023
RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling
Herman Sugiharto
Aradea
H. Mubarok
249
1
0
07 Aug 2023
Exploring Different Time-series-Transformer (TST) Architectures: A Case Study in Battery Life Prediction for Electric Vehicles (EVs)
Niranjan Sitapure
Atharva Kulkarni
AI4TS
74
6
0
07 Aug 2023
Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining
IAES International Journal of Artificial Intelligence (IJ-AI), 2023
Nour Eddine Zekaoui
Siham Yousfi
Maryem Rhanoui
M. Mikram
177
4
0
07 Aug 2023
Multi-scale Alternated Attention Transformer for Generalized Stereo Matching
Chinese Control and Decision Conference (CCDC), 2023
Wei Miao
Hong Zhao
Tom Tongjia Chen
Wei Huang
Changyan Xiao
ViT
142
0
0
06 Aug 2023
DETR Doesn't Need Multi-Scale or Locality Design
Yutong Lin
Yuhui Yuan
Zheng Zhang
Chen Li
Nanning Zheng
Han Hu
281
5
0
03 Aug 2023
Knowledge-aware Collaborative Filtering with Pre-trained Language Model for Personalized Review-based Rating Prediction
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Quanxiu Wang
Xinlei Cao
Jianyong Wang
Wei Zhang
VLM
85
10
0
02 Aug 2023
LLMs4OL: Large Language Models for Ontology Learning
International Workshop on the Semantic Web (SW), 2023
Hamed Babaei Giglou
Jennifer D'Souza
Sören Auer
196
136
0
31 Jul 2023
Thinker: Learning to Plan and Act
Neural Information Processing Systems (NeurIPS), 2023
Stephen Chung
Ivan Anokhin
David M. Krueger
LLMAG
OffRL
LRM
294
12
0
27 Jul 2023
Explainable Topic-Enhanced Argument Mining from Heterogeneous Sources
Jiasheng Si
Yingjie Zhu
Xingyu Shi
Deyu Zhou
Yulan He
104
0
0
22 Jul 2023
L-Eval: Instituting Standardized Evaluation for Long Context Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Chen An
Shansan Gong
Ming Zhong
Xingjian Zhao
Mukai Li
Jun Zhang
Lingpeng Kong
Xipeng Qiu
ELM
ALM
469
202
0
20 Jul 2023
Integrating a Heterogeneous Graph with Entity-aware Self-attention using Relative Position Labels for Reading Comprehension Model
Shima Foolad
Kourosh Kiani
304
1
0
19 Jul 2023
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
Tom Lieberum
Matthew Rahtz
János Kramár
Neel Nanda
G. Irving
Rohin Shah
Vladimir Mikulik
316
141
0
18 Jul 2023
Attention over pre-trained Sentence Embeddings for Long Document Classification
Amine Abdaoui
Sourav Dutta
129
3
0
18 Jul 2023
CSSL-RHA: Contrastive Self-Supervised Learning for Robust Handwriting Authentication
Wenwen Qiang
Luntian Mou
Changwen Zheng
Wen Gao
AAML
217
4
0
18 Jul 2023
Copy Is All You Need
International Conference on Learning Representations (ICLR), 2023
Tian Lan
Deng Cai
Yan Wang
Heyan Huang
Xian-Ling Mao
246
32
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Lin Wang
OffRL
865
1,229
0
12 Jul 2023