arXiv:2012.15688
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer

Annual Meeting of the Association for Computational Linguistics (ACL), 2021
31 December 2020
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang

Papers citing "ERNIE-Doc: A Retrospective Long-Document Modeling Transformer"

25 papers shown
AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Ten Modalities via Language as a Reference Framework
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Run Shao
Cheng Yang
Qiujun Li
Qing Zhu
Yongjun Zhang
...
Yu Liu
Yong Tang
Dapeng Liu
Shizhong Yang
Haifeng Li
08 Jan 2025
Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Sudipta Singha Roy
Xindi Wang
Robert E. Mercer
Frank Rudzicz
03 Oct 2024
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yilong Chen
Guoxia Wang
Junyuan Shang
Shiyao Cui
Zhenyu Zhang
Tingwen Liu
Shuohuan Wang
Yu Sun
Dianhai Yu
Hua Wu
07 Aug 2024
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen
Linhao Zhang
Junyuan Shang
Ying Tai
Tingwen Liu
Shuohuan Wang
Yu Sun
03 Jun 2024
Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification
Jungmin Yun
Mihyeon Kim
Youngbin Kim
03 Jun 2024
TransformerFAM: Feedback attention is working memory
Dongseong Hwang
Weiran Wang
Zhuoyuan Huo
K. Sim
P. M. Mengibar
14 Apr 2024
Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks
AAAI Conference on Artificial Intelligence (AAAI), 2023
Bo Li
Wei Ye
Quan-ding Wang
Wen Zhao
Shikun Zhang
14 Dec 2023
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
21 Nov 2023
Large Language Models are legal but they are not: Making the case for a powerful LegalLLM
Thanmay Jayakumar
Fauzan Farooqui
Luqman Farooqui
15 Nov 2023
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs
Or Sharir
Anima Anandkumar
27 Jul 2023
S2vNTM: Semi-supervised vMF Neural Topic Modeling
Weijie Xu
Jay Desai
Srinivasan H. Sengamedu
Xiaoyu Jiang
Francis Iannacci
06 Jul 2023
Recurrent Action Transformer with Memory
A. Staroverov
A. Bessonov
Dmitry A. Yudin
A. Kovalev
Aleksandr I. Panov
15 Jun 2023
Neural Natural Language Processing for Long Texts: A Survey on Classification and Summarization
Engineering Applications of Artificial Intelligence (Eng. Appl. Artif. Intell.), 2023
Dimitrios Tsirmpas
Ioannis Gkionis
Georgios Th. Papadopoulos
Ioannis Mademlis
25 May 2023
A General-Purpose Multilingual Document Encoder
Onur Galoglu
Robert Litschko
Goran Glavaš
11 May 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Andrey Kravchenko
19 Apr 2023
A Survey on Long Text Modeling with Transformers
Zican Dong
Tianyi Tang
Lunyi Li
Wayne Xin Zhao
28 Feb 2023
Processing Long Legal Documents with Pre-trained Transformers: Modding LegalBERT and Longformer
Dimitris Mamakas
Petros Tsotsi
Ion Androutsopoulos
Ilias Chalkidis
02 Nov 2022
Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Neural Information Processing Systems (NeurIPS), 2022
Botao Yu
Peiling Lu
Rui Wang
Wei Hu
Xu Tan
Wei Ye
Shikun Zhang
Tao Qin
Tie-Yan Liu
19 Oct 2022
Recurrent Memory Transformer
Neural Information Processing Systems (NeurIPS), 2022
Aydar Bulatov
Yuri Kuratov
Andrey Kravchenko
14 Jul 2022
Revisiting Transformer-based Models for Long Document Classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xiang Dai
Ilias Chalkidis
Kenny Erleben
Desmond Elliott
14 Apr 2022
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Shuohuan Wang
Yu Sun
Yang Xiang
Zhihua Wu
Siyu Ding
...
Tian Wu
Wei Zeng
Ge Li
Wen Gao
Haifeng Wang
23 Dec 2021
PoNet: Pooling Network for Efficient Token Mixing in Long Sequences
Chao-Hong Tan
Qian Chen
Wen Wang
Qinglin Zhang
Siqi Zheng
Zhenhua Ling
06 Oct 2021
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Yu Sun
Shuohuan Wang
Shikun Feng
Siyu Ding
Chao Pang
...
Ouyang Xuan
Dianhai Yu
Hao Tian
Hua Wu
Haifeng Wang
05 Jul 2021
A Survey of Transformers
AI Open (AO), 2021
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
08 Jun 2021
Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question-Answering Data
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Dian Yu
Kai Sun
Dong Yu
Claire Cardie
01 Feb 2021