Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019

Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

50 / 2,022 papers shown

Signformer is all you need: Towards Edge AI for Sign Language

Eta Yang

SLR

305

19 Nov 2024

KLCBL: An Improved Police Incident Classification Model

Liu Zhuoxian

Shi Tuo

Hu Xiaofeng

182

11 Nov 2024

EviRerank: Adaptive Evidence Construction for Long-Document LLM Reranking

211

09 Nov 2024

The Evolution of RWKV: Advancements in Efficient Language Modeling

Akul Datta

VLM

188

05 Nov 2024

Provable Length Generalization in Sequence Prediction via Spectral Filtering

346

01 Nov 2024

Video Token Merging for Long-form Video Understanding

291

31 Oct 2024

Generating Realistic Tabular Data with Large Language Models

Industrial Conference on Data Mining (IDM), 2024

Dang Nguyen

Thin Nguyen

215

29 Oct 2024

Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Aosong Feng

Rex Ying

Leandros Tassiulas

247

28 Oct 2024

Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques

Applied Soft Computing (Appl. Soft Comput.), 2024

David Ortiz-Perez

Manuel Benavent-Lledo

José García Rodríguez

David Tomás

M. Flores Vizcaya-Moreno

232

24 Oct 2024

Large Body Language Models

Saif Punjwani

Larry Heck

172

21 Oct 2024

Generalized Probabilistic Attention Mechanism in Transformers

DongNyeong Heo

Heeyoul Choi

277

21 Oct 2024

Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization

261

18 Oct 2024

Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis

Neural Information Processing Systems (NeurIPS), 2024

Lin Yang

289

18 Oct 2024

An Evolved Universal Transformer Memory

International Conference on Learning Representations (ICLR), 2024

1.3K

17 Oct 2024

How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

310

17 Oct 2024

Super-resolving Real-world Image Illumination Enhancement: A New Dataset and A Conditional Diffusion Model

Yang Liu

Yaofang Liu

Raymond H. Chan

221

16 Oct 2024

Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications

IEEE Wireless Communications and Networking Conference (WCNC), 2024

282

15 Oct 2024

Survey and Evaluation of Converging Architecture in LLMs based on Footsteps of Operations

IEEE Open Journal of the Computer Society (JOCS), 2024

162

15 Oct 2024

SLaNC: Static LayerNorm Calibration

Mahsa Salmani

Nikita Trukhanov

I. Soloveychik

244

14 Oct 2024

BookWorm: A Dataset for Character Description and Analysis

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Argyrios Papoudakis

Mirella Lapata

Frank Keller

195

14 Oct 2024

ChuLo: Chunk-Level Key Information Representation for Long Document Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

451

14 Oct 2024

GEM-VPC: A dual Graph-Enhanced Multimodal integration for Video Paragraph Captioning

Eileen Wang

Caren Han

Josiah Poon

212

12 Oct 2024

ACER: Automatic Language Model Context Extension via Retrieval

177

11 Oct 2024

On the token distance modeling ability of higher RoPE attention dimension

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

245

11 Oct 2024

DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention

Asian Conference on Computer Vision (ACCV), 2024

221

11 Oct 2024

HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction

Neural Information Processing Systems (NeurIPS), 2024

Yong Li

199

10 Oct 2024

Chain-of-Sketch: Enabling Global Visual Reasoning

296

10 Oct 2024

Masked Generative Priors Improve World Models Sequence Modelling Capabilities

869

10 Oct 2024

TouchInsight: Uncertainty-aware Rapid Touch and Text Input for Mixed Reality from Egocentric Vision

ACM Symposium on User Interface Software and Technology (UIST), 2024

Christian Holz

188

08 Oct 2024

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Jing Xiong

...

Michael Ng

Xin Jiang

Zhenguo Li

Yu Li

362

07 Oct 2024

Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Xinyu Liu

Runsong Zhao

Pengcheng Huang

Chunyang Xiao

Bei Li

Jingang Wang

Tong Xiao

Jingbo Zhu

162

07 Oct 2024

Timer-XL: Long-Context Transformers for Unified Time Series Forecasting

International Conference on Learning Representations (ICLR), 2024

Yong Liu

Guo Qin

Xiangdong Huang

Jianmin Wang

Mingsheng Long

AI4TS

333

07 Oct 2024

Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective

...

631

06 Oct 2024

Correlation-Aware Select and Merge Attention for Efficient Fine-Tuning and Context Length Extension

141

05 Oct 2024

S7: Selective and Simplified State Space Layers for Sequence Modeling

273

04 Oct 2024

ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question Answering

Yixuan Su

209

04 Oct 2024

MELODI: Exploring Memory Compression for Long Contexts

International Conference on Learning Representations (ICLR), 2024

194

04 Oct 2024

Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs

381

04 Oct 2024

Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

173

03 Oct 2024

ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI

344

03 Oct 2024

Efficient Streaming LLM for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Ozlem Kalinli

214

02 Oct 2024

Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

218

26 Sep 2024

Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions

Zeyneb N. Kaya

Souvick Ghosh

129

25 Sep 2024

Generative AI-driven forecasting of oil production

201

24 Sep 2024

Ads that Talk Back: Implications and Perceptions of Injecting Personalized Advertising into LLM Chatbots

282

23 Sep 2024

PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference

Zeyu Zhang

Haiying Shen

VLM

348

23 Sep 2024

FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs

International Conference on Field-Programmable Technology (ICFPT), 2024

271

21 Sep 2024

"I Never Said That": A dataset, taxonomy and baselines on response clarity classification

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Konstantinos Thomas

Giorgos Filandrianos

Maria Lymperaiou

Chrysoula Zerva

Giorgos Stamou

177

20 Sep 2024

Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey

Sourav Verma

RALM 3DV

259

20 Sep 2024

Towards LifeSpan Cognitive Systems

Yu Wang

...

Wei Wang

Heng Ji

Julian McAuley

KELM CLL

995

20 Sep 2024