Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019

Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

50 / 2,017 papers shown

Title
Exploration of Masked and Causal Language Modelling for Text Generation Nicolo Micheletti Samuel Belkadi Lifeng Han Goran Nenadic 197 11 0 21 May 2024
Mamba in Speech: Towards an Alternative to Self-Attention Xiangyu Zhang Qiquan Zhang Hexin Liu Tianyi Xiao Xinyuan Qian Beena Ahmed E. Ambikairajah Haizhou Li Julien Epps Mamba 307 86 0 21 May 2024
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions International Conference on Machine Learning (ICML), 2024 Victor Agostinelli Sanghyun Hong Lizhong Chen KELM 175 3 0 18 May 2024
The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving Pai Zeng Zhenyu Ning Jieru Zhao Weihao Cui Mengwei Xu Liwei Guo Xusheng Chen Yizhou Shan LLMAG 224 5 0 18 May 2024
Layer-Condensed KV Cache for Efficient Inference of Large Language Models Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Haoyi Wu Kewei Tu MQ 278 35 0 17 May 2024
A Hybrid Deep Learning Framework for Stock Price Prediction Considering the Investor Sentiment of Online Forum Enhanced by Popularity Huiyu Li Junhua Hu 87 0 0 17 May 2024
Positional encoding is not the same as context: A study on positional encoding for sequential recommendation Alejo López-Ávila Jinhua Du Abbas Shimary Ze Li 219 5 0 16 May 2024
Robust Singing Voice Transcription Serves Synthesis Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Ruiqi Li Yu Zhang Yongqi Wang Zhiqing Hong Rongjie Huang Zhou Zhao 208 16 0 16 May 2024
Enhancing Maritime Trajectory Forecasting via H3 Index and Causal Language Modelling (CLM) Nicolas Drapier Aladine Chetouani A. Chateigner 110 5 0 15 May 2024
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning International Conference on Machine Learning (ICML), 2024 Junfeng Chen Kailiang Wu 383 10 0 15 May 2024
A Survey on Transformers in NLP with Focus on Efficiency Wazib Ansar Saptarsi Goswami Amlan Chakrabarti MedIm 269 11 0 15 May 2024
Improving Transformers with Dynamically Composable Multi-Head Attention International Conference on Machine Learning (ICML), 2024 Da Xiao Qingye Meng Shengping Li Xingyuan Yuan 179 5 0 14 May 2024
Automated Deep Learning for Load Forecasting Julie Keisler Sandra Claudel Gilles Cabriel Margaux Brégère AI4TS 165 3 0 14 May 2024
MambaOut: Do We Really Need Mamba for Vision? Computer Vision and Pattern Recognition (CVPR), 2024 Weihao Yu Xinchao Wang Mamba 245 159 0 13 May 2024
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation International Joint Conference on Artificial Intelligence (IJCAI), 2024 Jianyi Chen Wei Xue Xu Tan Zhen Ye Qi-fei Liu Yi-Ting Guo 123 4 0 13 May 2024
Towards Subgraph Isomorphism Counting with Graph Kernels Xin Liu Weiqi Wang Jiaxin Bai Yangqiu Song 146 1 0 13 May 2024
Transforming the Bootstrap: Using Transformers to Compute Scattering Amplitudes in Planar N = 4 Super Yang-Mills Theory Tianji Cai G. W. Merz François Charton Niklas Nolte Matthias Wilhelm K. Cranmer Lance J. Dixon 293 22 0 09 May 2024
Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation Mo Guan Yan Wang Guangkun Ma Jiarui Liu Mingzu Sun SLR 178 12 0 09 May 2024
Smurfs: Multi-Agent System using Context-Efficient DFSDT for Tool Planning North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Junzhi Chen Juhao Liang Benyou Wang LLMAG 169 4 0 09 May 2024
Whole Genome Transformer for Gene Interaction Effects in Microbiome Habitat Specificity AAAI Conference on Artificial Intelligence (AAAI), 2024 Zhufeng Li S. S. Cranganore Nicholas D. Youngblut Niki Kilbertus 286 4 0 09 May 2024
Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents Yanfei Dong Lambert Deng Jiazheng Zhang Xiaodong Yu Ting Lin Francesco Gelli Soujanya Poria W. Lee 163 0 0 08 May 2024
SUTRA: Scalable Multilingual Language Model Architecture Abhijit Bendale Michael Sapienza Steven Ripplinger Simon Gibbs Jaewon Lee Pranav Mistry LRM ELM 185 8 0 07 May 2024
A Transformer with Stack Attention Jiaoda Li Jennifer C. White Mrinmaya Sachan Robert Bamler 191 4 0 07 May 2024
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer Zhuoyi Yang Heyang Jiang Wenyi Hong Jiayan Teng Wendi Zheng Yuxiao Dong Ming Ding Jie Tang SupR 104 10 0 07 May 2024
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding ACM Multimedia (MM), 2024 Tao Liu Feilong Chen Shuai Fan Chenpeng Du Qi Chen Xie Chen Kai Yu DiffM PINN 177 54 0 06 May 2024
Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation Kaize Shi Xueyao Sun Qing Li Guandong Xu 212 20 0 06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond Zheng Zhu Xiaofeng Wang Wangbo Zhao Chen Min Nianchen Deng ... Dawei Zhao Liang Xiao Jian-jun Zhao Jiwen Lu Guan Huang VGen LM&Ro 278 74 0 06 May 2024
Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making Zhuang Lei Jingdong Zhao Yuntao Li Zichun Xu Liangliang Zhao Hong Liu 163 3 0 30 Apr 2024
Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics J. Michaelov Catherine Arnett Benjamin Bergen 172 5 0 30 Apr 2024
Decoding Radiologists' Intentions: A Novel System for Accurate Region Identification in Chest X-ray Image Analysis Akash Awasthi Safwan Ahmad Bryant Le Hien Nguyen 93 2 0 29 Apr 2024
Research and application of artificial intelligence based webshell detection model: A literature review Mingrui Ma Lansheng Han Chunjie Zhou 268 5 0 28 Apr 2024
Setting up the Data Printer with Improved English to Ukrainian Machine Translation Yurii Paniv Dmytro Chaplynskyi Nikita Trynus Volodymyr Kyrylov AI4CE 242 3 0 23 Apr 2024
Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory Hung Le D. Nguyen Kien Do Svetha Venkatesh T. Tran 177 0 0 18 Apr 2024
Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation Jingmin Sun Yuxuan Liu Zecheng Zhang Hayden Schaeffer AI4CE 319 33 0 18 Apr 2024
Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study Zooey Nguyen Anthony Annunziata Vinh Luong Sang Dinh Quynh Le Anh Hai Ha Chanh Le Hong An Phan Shruti Raghavan Christopher Nguyen LRM 137 7 0 17 Apr 2024
AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts Meng Jiang Y. Yu Qing Zhao Jianqiang Li Changwei Song ... Wei-dong Zhai Dan Luo Xiaoqin Wang Guanghui Fu Bing Xiang Yang 142 3 0 17 Apr 2024
Position Engineering: Boosting Large Language Models through Positional Information Manipulation Zhiyuan He Huiqiang Jiang Zilong Wang Yuqing Yang Luna Qiu Lili Qiu LLMAG 81 12 0 17 Apr 2024
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs Woomin Song Seunghyuk Oh Sangwoo Mo Jaehyung Kim Sukmin Yun Jung-Woo Ha Jinwoo Shin 170 29 0 16 Apr 2024
TEL'M: Test and Evaluation of Language Models G. Cybenko Joshua Ackerman Paul Lintilhac ALM ELM 305 1 0 16 Apr 2024
TransformerFAM: Feedback attention is working memory Dongseong Hwang Weiran Wang Zhuoyuan Huo K. Sim P. M. Mengibar 332 17 0 14 Apr 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies Benjue Weng LM&MA 232 13 0 13 Apr 2024
NeurIT: Pushing the Limit of Neural Inertial Tracking for Indoor Robotic IoT Xinzhe Zheng Sijie Ji Yipeng Pan Kaiwen Zhang Chenshu Wu 239 2 0 13 Apr 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Xuezhe Ma Xiaomeng Yang Wenhan Xiong Beidi Chen Lili Yu Hao Zhang Jonathan May Luke Zettlemoyer Omer Levy Chunting Zhou 155 48 0 12 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Tsendsuren Munkhdalai Manaal Faruqui Siddharth Gopal LRM LLMAG CLL 269 158 0 10 Apr 2024
Bidirectional Long-Range Parser for Sequential Data Understanding George Leotescu Daniel Voinea A. Popa 185 1 0 08 Apr 2024
Learning Correlation Structures for Vision Transformers Manjin Kim Paul Hongsuck Seo Cordelia Schmid Minsu Cho ViT 245 25 0 05 Apr 2024
Training LLMs over Neurally Compressed Text Brian Lester Jaehoon Lee A. Alemi Jeffrey Pennington Adam Roberts Jascha Narain Sohl-Dickstein Noah Constant 175 9 0 04 Apr 2024
A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation International Conference on Language Resources and Evaluation (LREC), 2024 Jifan Yu Xiaohan Zhang Yifan Xu Xuanyu Lei Zijun Yao Jing Zhang Lei Hou Juanzi Li HILM 245 4 0 04 Apr 2024
Streaming Dense Video Captioning Xingyi Zhou Anurag Arnab Shyamal Buch Shen Yan Austin Myers Xuehan Xiong Arsha Nagrani Cordelia Schmid VLM 221 72 0 01 Apr 2024
Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training Vivian Liu Yiqiao Yin 270 38 0 01 Apr 2024