DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory

10 October 2024

Yutong Wang

Jiali Zeng

Xuebo Liu

Derek F. Wong

Fandong Meng

Jie Zhou

Min Zhang


Abstract

Large language models (LLMs) have achieved reasonable quality improvements in machine translation (MT). However, most current research on MT-LLMs still faces significant challenges in maintaining translation consistency and accuracy when processing entire documents. In this paper, we introduce DelTA, a Document-levEL Translation Agent designed to overcome these limitations. DelTA features a multi-level memory structure that stores information across various granularities and spans, including Proper Noun Records, Bilingual Summary, Long-Term Memory, and Short-Term Memory, which are continuously retrieved and updated by auxiliary LLM-based components. Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality across four open/closed-source LLMs and two representative document translation datasets, improving consistency scores by up to 4.58 percentage points and COMET scores by up to 3.16 points on average. DelTA employs a sentence-by-sentence translation strategy, which ensures that no sentence is omitted and is more memory-efficient than the mainstream method. Furthermore, DelTA improves pronoun and context-dependent translation accuracy, and the summary component of the agent also shows promise as a tool for query-based summarization tasks. The code and data of our approach are released at this https URL.
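The memory structure and sentence-by-sentence loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the class `MultiLevelMemory`, the function `translate_document`, the callbacks `translate` and `extract_nouns`, and the window size of 3 are all hypothetical names and values chosen for illustration. In DelTA the retrieval and update steps are performed by LLM-based components; here they are plain Python callables, and the Bilingual Summary field is carried but never updated.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


@dataclass
class MultiLevelMemory:
    """Hypothetical container for the four memory levels named in the abstract."""
    short_term_size: int = 3                                     # assumed window size
    proper_nouns: Dict[str, str] = field(default_factory=dict)  # Proper Noun Records
    summary: str = ""                                            # Bilingual Summary (not updated here)
    long_term: List[Pair] = field(default_factory=list)          # all pairs seen so far
    short_term: Deque[Pair] = field(default_factory=deque)       # most recent pairs only

    def update(self, src: str, tgt: str, new_nouns: Dict[str, str]) -> None:
        # Keep the first recorded translation of each proper noun so that
        # later sentences can reuse it consistently.
        for noun, translation in new_nouns.items():
            self.proper_nouns.setdefault(noun, translation)
        self.long_term.append((src, tgt))
        self.short_term.append((src, tgt))
        while len(self.short_term) > self.short_term_size:
            self.short_term.popleft()


def translate_document(
    sentences: List[str],
    translate: Callable[[str, MultiLevelMemory], str],
    extract_nouns: Callable[[str, str], Dict[str, str]],
    memory: Optional[MultiLevelMemory] = None,
) -> List[str]:
    """Translate one sentence at a time, updating memory after each step."""
    if memory is None:
        memory = MultiLevelMemory()
    outputs: List[str] = []
    for src in sentences:
        tgt = translate(src, memory)  # stand-in for the LLM translation call
        memory.update(src, tgt, extract_nouns(src, tgt))
        outputs.append(tgt)
    return outputs  # exactly one output per input sentence


# Toy stand-ins for the LLM components, for demonstration only.
GLOSSARY = {"Köln": "Cologne"}


def toy_translate(src: str, memory: MultiLevelMemory) -> str:
    out = src
    for noun, translation in memory.proper_nouns.items():
        out = out.replace(noun, translation)
    return "[tgt] " + out


def toy_extract_nouns(src: str, tgt: str) -> Dict[str, str]:
    return {n: t for n, t in GLOSSARY.items() if n in src}
```

Because the loop emits exactly one translation per source sentence, the output cannot skip a sentence, which is the property the abstract attributes to the sentence-by-sentence strategy. Only the bounded short-term window and the compact records need to be passed to the translator at each step, rather than the whole document.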


@article{wang2025_2410.08143,
  title={DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory},
  author={Yutong Wang and Jiali Zeng and Xuebo Liu and Derek F. Wong and Fandong Meng and Jie Zhou and Min Zhang},
  journal={arXiv preprint arXiv:2410.08143},
  year={2025}
}
