Make Your LLM Fully Utilize the Context (arXiv 2404.16811)
25 April 2024
Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou

Papers citing "Make Your LLM Fully Utilize the Context" (14 papers shown)

Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models
Matthew Dahl (05 May 2025)

Shifting Long-Context LLMs Research from Input to Output
Yuhao Wu, Yushi Bai, Zhiqing Hu, Shangqing Tu, Ming Shan Hee, Juanzi Li, Roy Ka-Wei Lee (06 Mar 2025)

LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm
Siwei Wu, Y. Li, Xingwei Qu, Rishi Ravikumar, Y. Li, Tyler Loakman, Shanghaoran Quan, Xiaoyong Wei, R. Batista-Navarro, C. Lin (26 Feb 2025)

Parallel Key-Value Cache Fusion for Position Invariant RAG
Philhoon Oh, Jinwoo Shin, James Thorne (13 Jan 2025)

Lost-in-Distance: Impact of Contextual Proximity on LLM Performance in Graph Tasks
Hamed Firooz, Maziar Sanjabi, Wenlong Jiang, Xiaoling Zhai (03 Jan 2025)

Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts, Kai Han, Samuel Albanie (07 Nov 2024)

On the Loss of Context-awareness in General Instruction Fine-tuning
Yihan Wang, Andrew Bai, Nanyun Peng, Cho-Jui Hsieh (05 Nov 2024)

What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang, Yifei Wang, Zhaoyang Liu, Chenheng Zhang, Stefanie Jegelka, Jinyang Gao, Bolin Ding, Yisen Wang (31 Oct 2024)

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu, Bowen Shi, Avi Caciularu, Idan Szpektor, Arman Cohan (30 Oct 2024)

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Haotian Tang, Yecheng Wu, Shang Yang, Enze Xie, Junsong Chen, Junyu Chen, Zhuoyang Zhang, Han Cai, Y. Lu, Song Han (14 Oct 2024)

How to Train Long-Context Language Models (Effectively)
Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen (03 Oct 2024)

Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction
Kuniaki Saito, Kihyuk Sohn, Chen-Yu Lee, Yoshitaka Ushiku (16 Feb 2024)

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis (27 Aug 2021)

Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier (12 Mar 2020)