Answers Unite! Unsupervised Metrics for Reinforced Summarization Models


4 September 2019
Thomas Scialom
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano

Papers citing "Answers Unite! Unsupervised Metrics for Reinforced Summarization Models"

36 / 36 papers shown
Summarization Metrics for Spanish and Basque: Do Automatic Scores and LLM-Judges Correlate with Humans?
Jeremy Barnes
Naiara Perez
Alba Bonet-Jover
Begoña Altuna
59
1
0
21 Mar 2025
SteLLA: A Structured Grading System Using LLMs with RAG
Hefei Qiu
Brian White
Ashley Ding
Reinaldo Costa
Ali Hachem
Wei Ding
Ping Chen
AI4Ed
56
0
0
17 Jan 2025
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
37
7
0
03 Oct 2024
OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization
Yuchen Shen
Xiaojun Wan
25
9
0
27 Oct 2023
UMSE: Unified Multi-scenario Summarization Evaluation
Shen Gao
Zhitao Yao
Chongyang Tao
Xiuying Chen
Pengjie Ren
Z. Ren
Zhumin Chen
30
5
0
26 May 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
Rongxin Zhu
Jianzhong Qi
Jey Han Lau
31
9
0
26 May 2023
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Yixin Liu
Alexander R. Fabbri
Pengfei Liu
Yilun Zhao
Linyong Nan
...
Simeng Han
Shafiq R. Joty
Chien-Sheng Wu
Caiming Xiong
Dragomir R. Radev
ALM
10
132
0
15 Dec 2022
RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question
Alireza Mohammadshahi
Thomas Scialom
Majid Yazdani
Pouya Yanki
Angela Fan
James Henderson
Marzieh Saeidi
26
20
0
02 Nov 2022
Just ClozE! A Novel Framework for Evaluating the Factual Consistency Faster in Abstractive Summarization
Yiyang Li
Lei Li
Marina Litvak
N. Vanetik
Dingxing Hu
Yuze Li
Yanquan Zhou
HILM
32
0
0
06 Oct 2022
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation
Cyril Chhun
Pierre Colombo
Chloé Clavel
Fabian M. Suchanek
53
50
0
24 Aug 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
73
122
0
03 Jul 2022
Conditional Generation with a Question-Answering Blueprint
Shashi Narayan
Joshua Maynez
Reinald Kim Amplayo
Kuzman Ganchev
Annie Louis
Fantine Huot
Anders Sandholm
Dipanjan Das
Mirella Lapata
54
47
0
01 Jul 2022
Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code
Daniel Deutsch
Dan Roth
AI4CE
37
2
0
29 Apr 2022
Evaluation of Automatic Text Summarization using Synthetic Facts
J. Ahn
Foaad Khosmood
HILM
11
0
0
11 Apr 2022
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
23
294
0
22 Sep 2021
Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries
Xiangru Tang
Alexander R. Fabbri
Haoran Li
Ziming Mao
Griffin Adams
Borui Wang
Asli Celikyilmaz
Yashar Mehdad
Dragomir R. Radev
HILM
13
19
0
19 Sep 2021
Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation
Mingkai Deng
Bowen Tan
Zhengzhong Liu
Eric P. Xing
Zhiting Hu
16
72
0
14 Sep 2021
Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
Yuexiang Xie
Fei Sun
Yang Deng
Yaliang Li
Bolin Ding
HILM
10
53
0
30 Aug 2021
QACE: Asking Questions to Evaluate an Image Caption
Hwanhee Lee
Thomas Scialom
Seunghyun Yoon
Franck Dernoncourt
Kyomin Jung
CoGe
17
18
0
28 Aug 2021
BookSum: A Collection of Datasets for Long-form Narrative Summarization
Wojciech Kryściński
Nazneen Rajani
Divyansh Agarwal
Caiming Xiong
Dragomir R. Radev
RALM
19
145
0
18 May 2021
Towards Human-Free Automatic Quality Evaluation of German Summarization
Neslihan Iskender
Oleg V. Vasilyev
Tim Polzehl
John Bohannon
Sebastian Möller
21
1
0
13 May 2021
The Summary Loop: Learning to Write Abstractive Summaries Without Examples
Philippe Laban
Andrew Hsi
John F. Canny
Marti A. Hearst
17
56
0
11 May 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
219
143
0
18 Apr 2021
What's in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization
Griffin Adams
Emily Alsentzer
Mert Ketenci
Jason Zucker
Noémie Elhadad
35
46
0
12 Apr 2021
QuestEval: Summarization Asks for Fact-based Evaluation
Thomas Scialom
Paul-Alexis Dray
Patrick Gallinari
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
Alex Wang
HILM
11
267
0
23 Mar 2021
Towards Faithfulness in Open Domain Table-to-text Generation from an Entity-centric View
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
119
35
0
17 Feb 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
248
285
0
02 Feb 2021
PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation
Clément Rebuffel
Laure Soulier
Geoffrey Scoutheeten
Patrick Gallinari
6
9
0
21 Oct 2020
SummEval: Re-evaluating Summarization Evaluation
Alexander R. Fabbri
Wojciech Kryściński
Bryan McCann
Caiming Xiong
R. Socher
Dragomir R. Radev
HILM
38
687
0
24 Jul 2020
SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling
F. S. Bao
Hebi Li
Ge Luo
Minghui Qiu
Yinfei Yang
Youbiao He
Cen Chen
16
4
0
13 May 2020
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
Yang Gao
Wei-Ye Zhao
Steffen Eger
ELM
16
124
0
07 May 2020
Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward
Luyang Huang
Lingfei Wu
Lu Wang
RALM
24
161
0
03 May 2020
MLSUM: The Multilingual Summarization Corpus
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
17
172
0
30 Apr 2020
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
Alex Wang
Kyunghyun Cho
M. Lewis
HILM
10
468
0
08 Apr 2020
Fill in the BLANC: Human-free quality estimation of document summaries
Oleg V. Vasilyev
Vedant Dharnidharka
John Bohannon
3DH
31
116
0
23 Feb 2020
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
L. Varshney
Caiming Xiong
R. Socher
AI4CE
52
1,232
0
11 Sep 2019