Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding

Main: 8 pages · Appendix: 9 pages · Bibliography: 6 pages · 14 figures · 14 tables
Abstract

Large language models (LLMs) have emerged as strong contenders in machine translation. However, they still struggle to adequately handle document-level discourse phenomena such as pronoun resolution and lexical cohesion. In this study, we thoroughly investigate how LLMs handle discourse phenomena in context-aware translation. We demonstrate that discourse knowledge is encoded within LLMs and propose quality-aware decoding (QAD) to effectively extract this knowledge, showing through comprehensive analysis that it outperforms other decoding approaches. Furthermore, we illustrate that QAD enhances the semantic richness of translations and aligns them more closely with human preferences.
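Quality-aware decoding is commonly realized by sampling several candidate translations and reranking them with a quality-estimation metric rather than taking the single most probable output. The sketch below illustrates that reranking loop under stated assumptions: `generate_candidates` and `quality_score` are hypothetical placeholders standing in for an LLM sampler and a learned QE metric (e.g., a COMET-style estimator), not the paper's actual implementation.

```python
def generate_candidates(source, n=4):
    # Placeholder for sampling n candidate translations from an LLM
    # (e.g., via temperature or nucleus sampling). Hypothetical stub.
    return [f"{source} [hyp {i}]" for i in range(n)]

def quality_score(source, hypothesis):
    # Placeholder quality-estimation metric (hypothetical stub);
    # a real QAD setup would call a reference-free QE model here.
    # This toy scorer just penalizes length mismatch with the source.
    return -abs(len(hypothesis) - len(source))

def quality_aware_decode(source, n=4):
    # Core QAD idea: generate an N-best list, then return the
    # candidate that maximizes the quality-estimation score.
    candidates = generate_candidates(source, n)
    return max(candidates, key=lambda h: quality_score(source, h))
```

Swapping the toy scorer for a QE model that is sensitive to context (pronoun agreement, lexical consistency) is what lets this style of decoding surface the discourse knowledge already present in the model's candidate distribution.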
