v1v2 (latest)

CogCanvas: Verbatim-Grounded Artifact Extraction for Long LLM Conversations

23 December 2025

Tao An

ArXiv (abs)PDF HTML Github (20537★)

Main:11 Pages

5 Figures

Bibliography:2 Pages

11 Tables

Appendix:5 Pages

Abstract

Conversation summarization loses nuanced details: when asked about coding preferences after 40 turns, summarization recalls "use type hints" but drops the critical constraint "everywhere" (19.0% exact match vs. 93.0% for our approach).We present CogCanvas, a training-free framework inspired by how teams use whiteboards to anchor shared memory. Rather than compressing conversation history, CogCanvas extracts verbatim-grounded artifacts (decisions, facts, reminders) and retrieves them via temporal-aware graph.On the LoCoMo benchmark (all 10 conversations from the ACL 2024 release), CogCanvas achieves the highest overall accuracy among training-free methods (32.4%), outperforming RAG (24.6%) by +7.8pp, with decisive advantages on complex reasoning tasks: +20.6pp on temporal reasoning (32.7% vs. 12.1% RAG) and +1.1pp on multi-hop questions (41.7% vs. 40.6% RAG). CogCanvas also leads on single-hop retrieval (26.6% vs. 24.6% RAG). Ablation studies reveal that BGE reranking contributes +7.7pp, making it the largest contributor to CogCanvas's performance.While heavily-optimized approaches achieve higher absolute scores through dedicated training (EverMemOS: ~92%), our training-free approach provides practitioners with an immediately-deployable alternative that significantly outperforms standard baselines. Code and data:this https URL

View on arXiv

Comments on this paper