1508.06034
Cited By
Better Summarization Evaluation with Word Embeddings for ROUGE
25 August 2015
Jun-Ping Ng
Viktoria Abrecht
Papers citing
"Better Summarization Evaluation with Word Embeddings for ROUGE"
50 / 81 papers shown
Evaluation Should Not Ignore Variation: On the Impact of Reference Set Choice on Summarization Metrics
Silvia Casola
Yang Liu
Siyao Peng
Oliver Kraus
Albert Gatt
Barbara Plank
27
0
0
17 Jun 2025
LecEval: An Automated Metric for Multimodal Knowledge Acquisition in Multimedia Learning
Joy Lim Jia Yin
Daniel Zhang-Li
Jifan Yu
Haoyang Li
Shangqing Tu
...
Zhiyuan Liu
Huiqin Liu
Lei Hou
Juanzi Li
Bin Xu
83
0
0
04 May 2025
Summarization Metrics for Spanish and Basque: Do Automatic Scores and LLM-Judges Correlate with Humans?
Jeremy Barnes
Naiara Perez
Alba Bonet-Jover
Begoña Altuna
110
2
0
21 Mar 2025
ProMRVL-CAD: Proactive Dialogue System with Multi-Round Vision-Language Interactions for Computer-Aided Diagnosis
Xueshen Li
Xinlong Hou
Ziyi Huang
Yu Gan
LM&MA
MedIm
98
0
0
15 Feb 2025
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
184
11
0
03 Oct 2024
Model-based Preference Optimization in Abstractive Summarization without Human Feedback
Jaepill Choi
Kyubyung Chae
Jiwoo Song
Yohan Jo
Taesup Kim
68
2
0
27 Sep 2024
Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization?
Roshan S. Sharma
Suwon Shon
Mark Lindsey
Hira Dhamyal
Rita Singh
Bhiksha Raj
107
1
0
12 Aug 2024
Rethinking Transformer-based Multi-document Summarization: An Empirical Investigation
Congbo Ma
Wei Emma Zhang
Dileepa Pitawela
Haojie Zhuang
Yanfeng Shu
58
0
0
16 Jul 2024
Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
Jonas Becker
Jan Philip Wahle
Bela Gipp
Terry Ruas
122
11
0
24 May 2024
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
Abhinav Agarwalla
Abhay Gupta
Alexandre Marques
Shubhra Pandit
Michael Goin
...
Tuan Nguyen
Mahmoud Salem
Dan Alistarh
Sean Lie
Mark Kurtz
MoE
SyDa
142
11
0
06 May 2024
ROUGE-K: Do Your Summaries Have Keywords?
Sotaro Takeshita
Simone Paolo Ponzetto
Kai Eckert
73
1
0
08 Mar 2024
Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence
Yinhong Liu
Yixuan Su
Ehsan Shareghi
Nigel Collier
89
4
0
15 Feb 2024
LUNA: A Framework for Language Understanding and Naturalness Assessment
Marat Saidov
A. Bakalova
Ekaterina Taktasheva
Vladislav Mikhailov
Ekaterina Artemova
ELM
75
2
0
09 Jan 2024
Comparative Experimentation of Accuracy Metrics in Automated Medical Reporting: The Case of Otitis Consultations
Wouter Faber
Renske Eline Bootsma
Tom Huibers
S. Dulmen
S. Brinkkemper
36
1
0
22 Nov 2023
Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey
Ashok Urlana
Pruthwik Mishra
Tathagato Roy
Rahul Mishra
78
11
0
15 Nov 2023
Generative Judge for Evaluating Alignment
Junlong Li
Shichao Sun
Weizhe Yuan
Run-Ze Fan
Hai Zhao
Pengfei Liu
ELM
ALM
119
91
0
09 Oct 2023
Automatic Personalized Impression Generation for PET Reports Using Large Language Models
Xin Tie
Muheon Shin
Ali Pirasteh
Nevein Ibrahim
Zachary Huemann
...
K. M. Kelly
John W. Garrett
Junjie Hu
Steve Y. Cho
Tyler Bradshaw
LM&MA
122
10
0
18 Sep 2023
Redundancy Aware Multi-Reference Based Gainwise Evaluation of Extractive Summarization
Mousumi Akter
Shubhra (Santu) Karmaker
65
1
0
04 Aug 2023
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types
K. Murugesan
Sarathkrishna Swaminathan
Soham Dan
Subhajit Chaudhury
Chulaka Gunasekara
...
Ibrahim Abdelaziz
Achille Fokoue
Pavan Kapanipathi
Salim Roukos
Alexander G. Gray
96
5
0
18 Jun 2023
UMSE: Unified Multi-scenario Summarization Evaluation
Shen Gao
Zhitao Yao
Chongyang Tao
Preslav Nakov
Fajie Yuan
Zhaochun Ren
Zhumin Chen
91
5
0
26 May 2023
Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks
Xiao Pu
Mingqi Gao
Xiaojun Wan
ELM
93
4
0
24 May 2023
Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics using Measurement Theory
Ziang Xiao
Susu Zhang
Vivian Lai
Q. V. Liao
ELM
117
30
0
24 May 2023
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method
Yiming Wang
Zhuosheng Zhang
Rui Wang
117
88
0
22 May 2023
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection
Fatma Elsafoury
Stamos Katsigiannis
79
1
0
22 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Stephan Clémençon
Pierre Colombo
196
9
0
17 May 2023
SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism
Mehwish Fatima
Tim Kolber
K. Markert
Michael Strube
41
0
0
04 Apr 2023
Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review
Oliver Vinzelberg
M. Jenkins
Gordon Morison
David McMinn
Z. Tieges
61
6
0
24 Mar 2023
Curriculum-Guided Abstractive Summarization
Sajad Sotudeh
Hanieh Deilamsalehy
Franck Dernoncourt
Nazli Goharian
90
2
0
02 Feb 2023
A comprehensive review of automatic text summarization techniques: method, data, evaluation and coding
D. Cajueiro
A. G. Nery
Igor Tavares
Maísa Kely de Melo
Silvia A. dos Reis
Weigang Li
V. R. R. Celestino
88
15
0
04 Jan 2023
Towards Abstractive Timeline Summarisation using Preference-based Reinforcement Learning
Yuxuan Ye
Edwin Simpson
36
0
0
14 Nov 2022
How Far are We from Robust Long Abstractive Summarization?
Huan Yee Koh
Jiaxin Ju
He Zhang
Ming Liu
Shirui Pan
HILM
113
40
0
30 Oct 2022
Towards Interpretable Summary Evaluation via Allocation of Contextual Embeddings to Reference Text Topics
Ben Schaper
Christopher Lohse
Marcell Streile
Andrea Giovannini
Richard Osuala
52
1
0
25 Oct 2022
DATScore: Evaluating Translation with Data Augmented Translations
Moussa Kamal Eddine
Guokan Shang
Michalis Vazirgiannis
73
5
0
12 Oct 2022
WikiDes: A Wikipedia-Based Dataset for Generating Short Descriptions from Paragraphs
Hoang Thang Ta
Abu Bakar Siddiqur Rahman
Navonil Majumder
Amir Hussain
Lotfollah Najjar
N. Howard
Soujanya Poria
Alexander Gelbukh
83
11
0
27 Sep 2022
Text Summarization with Oracle Expectation
Yumo Xu
Mirella Lapata
VLM
68
4
0
26 Sep 2022
The Glass Ceiling of Automatic Evaluation in Natural Language Generation
Pierre Colombo
Maxime Peyrard
Nathan Noiry
Robert West
Pablo Piantanida
216
11
0
31 Aug 2022
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation
Cyril Chhun
Pierre Colombo
Chloé Clavel
Fabian M. Suchanek
191
55
0
24 Aug 2022
Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm
Alicia Y. Tsai
Laurent El Ghaoui
31
1
0
19 Aug 2022
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
79
22
0
01 Aug 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
149
128
0
03 Jul 2022
MentSum: A Resource for Exploring Summarization of Mental Health Online Posts
Sajad Sotudeh
Nazli Goharian
Zachary Young
AI4MH
71
13
0
02 Jun 2022
A global analysis of metrics used for measuring performance in natural language processing
Kathrin Blagec
Georg Dorffner
M. Moradi
Simon Ott
Matthias Samwald
95
28
0
25 Apr 2022
Towards Explainable Evaluation Metrics for Natural Language Generation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei Zhao
Yang Gao
Steffen Eger
AAML
ELM
76
20
0
21 Mar 2022
What are the best systems? New perspectives on NLP Benchmarking
Pierre Colombo
Nathan Noiry
Ekhine Irurozki
Stephan Clémençon
205
42
0
08 Feb 2022
DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence
Wei Zhao
Michael Strube
Steffen Eger
121
38
0
26 Jan 2022
WIDAR -- Weighted Input Document Augmented ROUGE
Raghav Jain
Vaibhav Mavi
Anubhav Jangra
S. Saha
61
4
0
23 Jan 2022
Multi-Narrative Semantic Overlap Task: Evaluation and Benchmark
Naman Bansal
Mousumi Akter
Shubhra (Santu) Karmaker
84
0
0
14 Jan 2022
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation
Pierre Colombo
Chloé Clavel
Pablo Piantanida
137
44
0
02 Dec 2021
Better than Average: Paired Evaluation of NLP Systems
Maxime Peyrard
Wei Zhao
Steffen Eger
Robert West
ELM
114
26
0
20 Oct 2021
Using Natural Language Processing to Understand Reasons and Motivators Behind Customer Calls in Financial Domain
Ankit Patil
Ankush Chopra
Sohom Ghosh
Vamshi Vadla
AI4TS
42
1
0
18 Oct 2021