mbrs: A Library for Minimum Bayes Risk Decoding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

8 August 2024

Hiroyuki Deguchi

Yusuke Sakai

Hidetaka Kamigaito

Taro Watanabe


Main: 6 pages; Bibliography: 3 pages; Appendix: 3 pages; 3 figures; 9 tables

Abstract

Minimum Bayes risk (MBR) decoding is a decision rule for text generation tasks that outperforms conventional maximum a posteriori (MAP) decoding with beam search by selecting outputs that score highly under a utility function rather than outputs with high probability. Typically, it selects the hypothesis with the highest expected utility from a set of candidate hypotheses, where the expectation is estimated over sampled pseudo-references. mbrs is a library for MBR decoding that can flexibly combine various metrics, alternative expectation estimations, and algorithmic variants. It is designed with a focus on measuring the speed and call counts of code blocks, transparency, reproducibility, and extensibility, which are essential for researchers and developers. We have released mbrs as an MIT-licensed open-source project; the code is available on GitHub: https://github.com/naist-nlp/mbrs
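As a rough illustration of the decision rule described in the abstract, the following Python sketch scores each hypothesis by its average utility against a set of pseudo-references and returns the highest-scoring one. It does not use the mbrs API: the function names `unigram_f1` and `mbr_decode`, the toy unigram-overlap utility, and the example sentences are all assumptions made for illustration. In practice the utility would be a metric such as BLEU or a neural metric.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy utility (an assumption, not a metric from the paper):
    F1 over whitespace-token overlap between hypothesis and reference."""
    hyp_tokens = Counter(hypothesis.split())
    ref_tokens = Counter(reference.split())
    overlap = sum((hyp_tokens & ref_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_tokens.values())
    recall = overlap / sum(ref_tokens.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    hypotheses: Sequence[str],
    pseudo_references: Sequence[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> Tuple[str, List[float]]:
    """Return the hypothesis with the highest expected utility, together
    with the expected utility of every hypothesis.

    The expectation is a plain Monte Carlo average over the pseudo-references.
    """
    if not hypotheses or not pseudo_references:
        raise ValueError("hypotheses and pseudo_references must be non-empty")
    expected = [
        sum(utility(hyp, ref) for ref in pseudo_references) / len(pseudo_references)
        for hyp in hypotheses
    ]
    best_index = max(range(len(hypotheses)), key=expected.__getitem__)
    return hypotheses[best_index], expected


if __name__ == "__main__":
    # Hypothetical model samples, reused as both hypotheses and pseudo-references.
    samples = [
        "the cat sat on the mat",
        "a cat sat on the mat",
        "the cat is on the mat",
        "dogs bark loudly",
    ]
    best, scores = mbr_decode(samples, samples)
    print(best)
    for sample, score in zip(samples, scores):
        print(f"{score:.3f}  {sample}")
```

Reusing the same samples as hypotheses and pseudo-references is a common setup, but the two sets can differ. Note that this naive version evaluates the utility once per hypothesis-reference pair, so its cost grows with the product of the two set sizes.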
