Mixture of In-Context Experts Enhance LLMs' Long Context Awareness

28 June 2024

Yang Song

Hengshu Zhu

Rui Yan

Papers citing "Mixture of In-Context Experts Enhance LLMs' Long Context Awareness"

Title | Authors | Tag | Counts | Date
BiGSCoder: State Space Model for Code Understanding | Shweta Verma, Abhinav Anand, Mira Mezini | Mamba | 36 0 0 | 02 May 2025
More Expressive Attention with Negative Weights | Ang Lv, Ruobing Xie, Shuaipeng Li, Jiayi Liao, X. Sun, Zhanhui Kang, Di Wang, Rui Yan | | 30 0 0 | 11 Nov 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation | Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, ..., Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li | | 26 1 0 | 07 Oct 2024
PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead | Tao Tan, Yining Qian, Ang Lv, Hongzhan Lin, Songhao Wu, Yongbo Wang, Feng Wang, Jingtong Wu, Xin Lu, Rui Yan | | 22 1 0 | 29 Sep 2024
Language Models "Grok" to Copy | Ang Lv, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Rui Yan | LLMAG | 26 1 0 | 14 Sep 2024
StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses | Jia-Nan Li, Quan Tu, Cunli Mao, Zhengtao Yu, Ji-Rong Wen, Rui Yan | OffRL | 19 3 0 | 13 Mar 2024
Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory | Xin Cheng, Di Luo, Xiuying Chen, Lemao Liu, Dongyan Zhao, Rui Yan | RALM | 142 86 0 | 03 May 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small | Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt | | 210 486 0 | 01 Nov 2022
In-context Learning and Induction Heads | Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah | | 240 453 0 | 24 Sep 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation | Ofir Press, Noah A. Smith, M. Lewis | | 237 690 0 | 27 Aug 2021