arXiv:2004.04696
BLEURT: Learning Robust Metrics for Text Generation
9 April 2020
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
Papers citing
"BLEURT: Learning Robust Metrics for Text Generation"
50 / 206 papers shown
Title
Natural Language Generation in Healthcare: A Review of Methods and Applications
Mengxian Lyu
Xiaohan Li
Ziyi Chen
Jinqian Pan
Cheng Peng
Sankalp Talankar
Yonghui Wu
LM&MA
38
0
0
07 May 2025
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
Dylan Bouchard
Mohit Singh Chauhan
HILM
70
0
0
27 Apr 2025
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline
Biao Fu
Minpeng Liao
Kai Fan
Chengxi Li
L. Zhang
Yidong Chen
Xiaodong Shi
OffRL
76
1
0
13 Apr 2025
FUSE : A Ridge and Random Forest-Based Metric for Evaluating MT in Indigenous Languages
Rahul Raja
A. Vats
34
1
0
28 Mar 2025
Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems
Jędrzej Warczyński
Mateusz Lango
Ondrej Dusek
36
0
0
28 Feb 2025
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Zhaopeng Feng
Jiayuan Su
Jiamei Zheng
Jiahan Ren
Yan Zhang
Jian Wu
Hongwei Wang
Zuozhu Liu
ELM
201
0
0
21 Feb 2025
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
Shreya Shukla
Jose Torres
Abhijit Mishra
Jacek Gwizdka
Shounak Roychowdhury
43
0
0
20 Feb 2025
Learning to Substitute Words with Model-based Score Ranking
Hongye Liu
Ricardo Henao
41
0
0
09 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
64
1
0
02 Feb 2025
MDEval: Evaluating and Enhancing Markdown Awareness in Large Language Models
Zhongpu Chen
Y. Liu
Long Shi
Zhi-Jie Wang
Xingyan Chen
Yu Zhao
Fuji Ren
43
0
0
28 Jan 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
28
1
0
28 Jan 2025
BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation
Suvodip Dey
M. Desarkar
OffRL
41
0
0
20 Jan 2025
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
122
0
0
20 Jan 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
51
18
0
03 Jan 2025
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat
Dongkeun Yoon
Woori Jang
Jiwoo Choi
Alice H. Oh
Seohyon Jung
91
1
0
03 Jan 2025
LLM-based Translation Inference with Iterative Bilingual Understanding
Andong Chen
Kehai Chen
Yang Xiang
Xuefeng Bai
Muyun Yang
Yang Feng
T. Zhao
Min Zhang
LRM
82
5
0
31 Dec 2024
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
108
63
0
25 Nov 2024
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
31
1
0
08 Nov 2024
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
Ran Zhang
Wei-Ye Zhao
Steffen Eger
71
4
0
24 Oct 2024
EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs
Yijie Li
Yuan Sun
ELM
31
0
0
13 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
32
2
0
12 Oct 2024
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
H. Xia
Zhengbang Yang
Junbo Zou
Rhys Tracy
Yuqing Wang
...
Xun Shao
Zhuoqing Xie
Yuan-fang Wang
Weining Shen
Hanjie Chen
ReLM
LRM
ELM
31
2
0
11 Oct 2024
What do Large Language Models Need for Machine Translation Evaluation?
Shenbin Qian
Archchana Sindhujan
Minnie Kabra
Diptesh Kanojia
Constantin Orasan
Tharindu Ranasinghe
Frédéric Blain
ELM
LRM
ALM
LM&MA
26
0
0
04 Oct 2024
Better Instruction-Following Through Minimum Bayes Risk
Ian Wu
Patrick Fernandes
Amanda Bertsch
Seungone Kim
Sina Pakazad
Graham Neubig
48
9
0
03 Oct 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
86
5
0
13 Sep 2024
An Efficient Sign Language Translation Using Spatial Configuration and Motion Dynamics with LLMs
Eui Jun Hwang
Sukmin Cho
Junmyeong Lee
Jong C. Park
SLR
66
4
0
20 Aug 2024
Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts
Jiaqing Liu
Chong Deng
Qinglin Zhang
Shilin Zhou
Hai Yu
Wen Wang
34
0
0
19 Aug 2024
StyEmp: Stylizing Empathetic Response Generation via Multi-Grained Prefix Encoder and Personality Reinforcement
Yahui Fu
Chenhui Chu
Tatsuya Kawahara
29
2
0
05 Aug 2024
Don't Throw Away Data: Better Sequence Knowledge Distillation
Jun Wang
Eleftheria Briakou
Hamid Dadkhahi
Rishabh Agarwal
Colin Cherry
Trevor Cohn
39
5
0
15 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
29
18
0
08 Jul 2024
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
48
1
0
08 Jul 2024
MINDECHO: Role-Playing Language Agents for Key Opinion Leaders
Rui Xu
Dakuan Lu
Xiaoyu Tan
Xintao Wang
Siyu Yuan
Jiangjie Chen
Wei Chu
Xu Yinghui
LLMAG
29
3
0
07 Jul 2024
Sentence-level Aggregation of Lexical Metrics Correlates Stronger with Human Judgements than Corpus-level Aggregation
Paulo Cavalin
P. Domingues
Claudio S. Pinhanez
29
0
0
03 Jul 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
50
13
0
24 Jun 2024
SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading
Tu Anh Dinh
Carlos Mullov
Leonard Barmann
Zhaolin Li
Danni Liu
...
Michael Beigl
Rainer Stiefelhagen
Carsten Dachsbacher
Klemens Bohm
Jan Niehues
ELM
35
8
0
14 Jun 2024
DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms
Andong Chen
Lianzhang Lou
Kehai Chen
Xuefeng Bai
Yang Xiang
Muyun Yang
Tiejun Zhao
Min Zhang
VLM
35
12
0
11 Jun 2024
Evaluating Durability: Benchmark Insights into Multimodal Watermarking
Jielin Qiu
William Jongwon Han
Xuandong Zhao
Shangbang Long
Christos Faloutsos
Lei Li
51
1
0
06 Jun 2024
The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches
Bhashithe Abeysinghe
Ruhan Circi
ELM
29
21
0
05 Jun 2024
Large Language Models as Evaluators for Recommendation Explanations
Xiaoyu Zhang
Yishan Li
Jiayin Wang
Bowen Sun
Weizhi Ma
Peijie Sun
Min Zhang
LRM
ELM
35
12
0
05 Jun 2024
Which Side Are You On? A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation
Hao Li
Yuping Wu
Viktor Schlegel
R. Batista-Navarro
Tharindu Madusanka
...
Jiayan Zeng
Xiaochi Wang
Xinran He
Yizhi Li
Goran Nenadic
31
6
0
05 Jun 2024
XRec: Large Language Models for Explainable Recommendation
Qiyao Ma
Xubin Ren
Chao Huang
LRM
29
17
0
04 Jun 2024
Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories
Tianlong Wang
Xianfeng Jiao
Yifan He
Zhongzhi Chen
Yinghao Zhu
Xu Chu
Junyi Gao
Yasha Wang
Liantao Ma
LLMSV
59
7
0
26 May 2024
What Have We Achieved on Non-autoregressive Translation?
Yafu Li
Huajian Zhang
Jianhao Yan
Yongjing Yin
Yue Zhang
29
1
0
21 May 2024
Chasing COMET: Leveraging Minimum Bayes Risk Decoding for Self-Improving Machine Translation
Kamil Guttmann
Miko Pokrywka
Adrian Charkiewicz
Artur Nowakowski
58
3
0
20 May 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu
Jiahao Xu
Yulin Yuan
Gholamreza Haffari
Longyue Wang
Weihua Luo
Kaifu Zhang
LLMAG
114
22
0
20 May 2024
PromptMind Team at MEDIQA-CORR 2024: Improving Clinical Text Correction with Error Categorization and LLM Ensembles
Kesav Gundabathula
Sriram R Kolar
LRM
30
7
0
14 May 2024
Efficient Data Generation for Source-grounded Information-seeking Dialogs: A Use Case for Meeting Transcripts
Lotem Golany
Filippo Galgani
Maya Mamo
Nimrod Parasol
Omer Vandsburger
Nadav Bar
Ido Dagan
27
2
0
02 May 2024
MediFact at MEDIQA-CORR 2024: Why AI Needs a Human Touch
Nadia Saeed
17
1
0
27 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
67
45
0
23 Apr 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
41
42
0
15 Apr 2024