arXiv:2310.02409 · Cited By
Dodo: Dynamic Contextual Compression for Decoder-only LMs
3 October 2023
Guanghui Qin, Corby Rosset, Ethan C. Chau, Nikhil Rao, Benjamin Van Durme
Papers citing "Dodo: Dynamic Contextual Compression for Decoder-only LMs" (13 of 13 papers shown)
1. KV-Distill: Nearly Lossless Learnable Context Compression for LLMs
   Vivek Chari, Guanghui Qin, Benjamin Van Durme · VLM · 66 / 0 / 0 · 13 Mar 2025

2. Training Plug-n-Play Knowledge Modules with Deep Context Distillation
   Lucas Page-Caccia, Alan Ansell, E. Ponti, Ivan Vulić, Alessandro Sordoni · SyDa · 114 / 0 / 0 · 11 Mar 2025

3. Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning
   Giulio Corallo, Orion Weller, Fabio Petroni, Paolo Papotti · MQ, VLM · 49 / 0 / 0 · 06 Mar 2025

4. Lost in the Passage: Passage-level In-context Learning Does Not Necessarily Need a "Passage"
   Hao Sun, Chenming Tang, Gengyang Li, Yunfang Wu · AIMat · 45 / 0 / 0 · 15 Feb 2025

5. Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models
   Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, H. Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu · 75 / 2 / 0 · 21 Dec 2024

6. Compressed Chain of Thought: Efficient Reasoning Through Dense Representations
   Jeffrey Cheng, Benjamin Van Durme · LRM · 69 / 24 / 0 · 17 Dec 2024

7. CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation
   Abe Bohan Hou, Orion Weller, Guanghui Qin, Eugene Yang, Dawn J Lawrie, Nils Holzenberger, Andrew Blair-Stanek, Benjamin Van Durme · AILaw, ELM · 63 / 5 / 0 · 24 Jun 2024

8. Unlimiformer: Long-Range Transformers with Unlimited Length Input
   Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley · RALM · 94 / 122 / 0 · 02 May 2023

9. GLM-130B: An Open Bilingual Pre-trained Model
   Aohan Zeng, Xiao Liu, Zhengxiao Du, Zihan Wang, Hanyu Lai, ..., Jidong Zhai, Wenguang Chen, Peng-Zhen Zhang, Yuxiao Dong, Jie Tang · BDL, LRM · 242 / 1,071 / 0 · 05 Oct 2022

10. ABC: Attention with Bounded-memory Control
    Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith · 61 / 22 / 0 · 06 Oct 2021

11. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
    Ofir Press, Noah A. Smith, M. Lewis · 242 / 695 / 0 · 27 Aug 2021

12. The Pile: An 800GB Dataset of Diverse Text for Language Modeling
    Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy · AIMat · 245 / 1,986 / 0 · 31 Dec 2020

13. Big Bird: Transformers for Longer Sequences
    Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed · VLM · 249 / 2,009 / 0 · 28 Jul 2020