Evaluation of Text Generation: A Survey


26 June 2020
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
    ELM
    LM&MA

Papers citing "Evaluation of Text Generation: A Survey"

45 / 45 papers shown
Evaluating and Mitigating Bias in AI-Based Medical Text Generation
Xiuying Chen
Tairan Wang
Juexiao Zhou
Zirui Song
Xin Gao
X. Zhang
MedIm
37
0
0
24 Apr 2025
SCORE: Story Coherence and Retrieval Enhancement for AI Narratives
Qiang Yi
Yangfan He
J. Wang
Xinyuan Song
Shiyao Qian
...
K. Li
Kuan Lu
Menghao Huo
Jiaqi Chen
Tianyu Shi
RALM
42
6
0
30 Mar 2025
Natural Language Generation
Emiel van Miltenburg
Chenghua Lin
36
2
0
20 Mar 2025
Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework
Esteban Garces Arias
Hannah Blocher
Julian Rodemann
Meimingwei Li
Christian Heumann
Matthias Aßenmacher
23
1
0
24 Oct 2024
4-LEGS: 4D Language Embedded Gaussian Splatting
Gal Fiebelman
Tamir Cohen
Ayellet Morgenstern
Peter Hedman
Hadar Averbuch-Elor
3DGS
33
1
0
14 Oct 2024
A Perspective on Literary Metaphor in the Context of Generative AI
Imke van Heerden
Anil Bas
18
1
0
02 Sep 2024
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Asmar Nadeem
Faegheh Sardari
R. Dawes
Syed Sameed Husain
Adrian Hilton
Armin Mustafa
47
4
0
10 Jun 2024
Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
Zhe Hu
Tuo Liang
Jing Li
Yiren Lu
Yunlai Zhou
Yiran Qiao
Jing Ma
Yu Yin
36
4
0
29 May 2024
Automating Customer Needs Analysis: A Comparative Study of Large Language Models in the Travel Industry
Simone Barandoni
F. Chiarello
Lorenzo Cascone
Emiliano Marrale
Salvatore Puccio
51
5
0
27 Apr 2024
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models
Yukyung Lee
Soonwon Ka
Bokyung Son
Pilsung Kang
Jaewook Kang
LLMAG
42
6
0
22 Apr 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
22
12
0
26 Jan 2024
Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models
K. M. S. Islam
A. S. Nipu
Praveen Madiraju
Priya Deshpande
LM&MA
29
6
0
11 Jan 2024
Metric Space Magnitude for Evaluating the Diversity of Latent Representations
K. Limbeck
R. Andreeva
Rik Sarkar
Bastian Alexander Rieck
22
3
0
27 Nov 2023
Creating a silver standard for patent simplification
Silvia Casola
A. Lavelli
Horacio Saggion
AILaw
14
3
0
24 Oct 2023
Semantic and Expressive Variation in Image Captions Across Languages
Andre Ye
Sebastin Santy
Jena D. Hwang
Amy X. Zhang
Ranjay Krishna
VLM
46
3
0
22 Oct 2023
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering
Pei Ke
Fei Huang
Fei Mi
Yasheng Wang
Qun Liu
Xiaoyan Zhu
Minlie Huang
ReLM
ELM
29
10
0
13 Jul 2023
A Critical Evaluation of Evaluations for Long-form Question Answering
Fangyuan Xu
Yixiao Song
Mohit Iyyer
Eunsol Choi
ELM
27
94
0
29 May 2023
Model-Based Simulation for Optimising Smart Reply
Benjamin Towle
Ke Zhou
30
1
0
26 May 2023
HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis
Christoforos Vasilatos
Manaar Alam
Talal Rahwan
Yasir Zaki
Michail Maniatakos
DeLMO
32
32
0
26 May 2023
Similarity of Neural Network Models: A Survey of Functional and Representational Measures
Max Klabunde
Tobias Schumacher
M. Strohmaier
Florian Lemmerich
45
63
0
10 May 2023
SkillQG: Learning to Generate Question for Reading Comprehension Assessment
Xiaoqiang Wang
Bang Liu
Siliang Tang
Lingfei Wu
8
3
0
08 May 2023
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark
D. Ribeiro
Shen Wang
Xiaofei Ma
He Zhu
Rui Dong
...
William Yang Wang
Zhiheng Huang
George Karypis
Bing Xiang
Dan Roth
LRM
ReLM
12
23
0
13 Feb 2023
Music Playlist Title Generation Using Artist Information
Haven Kim
Seungheon Doh
Junwon Lee
Juhan Nam
16
3
0
14 Jan 2023
MAUVE Scores for Generative Models: Theory and Practice
Krishna Pillutla
Lang Liu
John Thickstun
Sean Welleck
Swabha Swayamdipta
Rowan Zellers
Sewoong Oh
Yejin Choi
Zaïd Harchaoui
EGVM
23
21
0
30 Dec 2022
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
Tianxing He
Jingyu Zhang
Tianle Wang
Sachin Kumar
Kyunghyun Cho
James R. Glass
Yulia Tsvetkov
25
44
0
20 Dec 2022
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
O. Yu. Golovneva
Moya Chen
Spencer Poff
Martin Corredor
Luke Zettlemoyer
Maryam Fazel-Zarandi
Asli Celikyilmaz
ReLM
LRM
13
137
0
15 Dec 2022
CREPE: Open-Domain Question Answering with False Presuppositions
Xinyan Velocity Yu
Sewon Min
Luke Zettlemoyer
Hannaneh Hajishirzi
14
45
0
30 Nov 2022
Dialect-robust Evaluation of Generated Text
Jiao Sun
Thibault Sellam
Elizabeth Clark
Tu Vu
Timothy Dozat
Dan Garrette
Aditya Siddhant
Jacob Eisenstein
Sebastian Gehrmann
13
19
0
02 Nov 2022
Unsupervised Sentence Textual Similarity with Compositional Phrase Semantics
Zihao W. Wang
Jiaheng Dou
Yong Zhang
OT
19
4
0
05 Oct 2022
A Bayesian Bradley-Terry model to compare multiple ML algorithms on multiple data sets
Jacques Wainer
11
10
0
09 Aug 2022
RankGen: Improving Text Generation with Large Ranking Models
Kalpesh Krishna
Yapei Chang
John Wieting
Mohit Iyyer
AIMat
11
68
0
19 May 2022
Towards Explainable Evaluation Metrics for Natural Language Generation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei-Ye Zhao
Yang Gao
Steffen Eger
AAML
ELM
14
20
0
21 Mar 2022
E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning
Jiangjie Chen
Rui Xu
Ziquan Fu
Wei Shi
Zhongqiao Li
Xinbo Zhang
Changzhi Sun
Lei Li
Yanghua Xiao
Hao Zhou
ELM
21
35
0
16 Mar 2022
Probing BERT's priors with serial reproduction chains
Takateru Yamakoshi
Thomas L. Griffiths
Robert D. Hawkins
18
12
0
24 Feb 2022
A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang
Haolin Song
Shaoyu Li
Ming Zhou
Dawei Song
33
213
0
14 Jan 2022
Towards more patient friendly clinical notes through language models and ontologies
Francesco Moramarco
Damir Juric
Aleksandar Savkov
Jack Flann
Maria Lehl
...
Tessa Grafen
V. Zhelezniak
Sunir Gohil
Alex Papadopoulos Korfiatis
Nils Y. Hammerla
26
7
0
23 Dec 2021
Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings
Kalpesh Krishna
Deepak Nathani
Xavier Garcia
Bidisha Samanta
Partha P. Talukdar
19
24
0
14 Oct 2021
Learning Compact Metrics for MT
Amy Pu
Hyung Won Chung
Ankur P. Parikh
Sebastian Gehrmann
Thibault Sellam
22
97
0
12 Oct 2021
Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Zechen Bai
Yuta Nakashima
Noa Garcia
66
42
0
13 Sep 2021
Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text
Yao Dou
Maxwell Forbes
Rik Koncel-Kedziorski
Noah A. Smith
Yejin Choi
DeLMO
6
125
0
02 Jul 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
243
284
0
02 Feb 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
15
340
0
02 Feb 2021
Text Style Transfer: A Review and Experimental Evaluation
Zhiqiang Hu
Roy Ka-Wei Lee
Charu C. Aggarwal
Aston Zhang
AI4TS
37
26
0
24 Oct 2020
Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches
Juan Cruz-Benito
Sanjay Vishwakarma
Francisco Martín-Fernández
Ismael Faro
22
30
0
16 Sep 2020
Language GANs Falling Short
Massimo Caccia
Lucas Page-Caccia
W. Fedus
Hugo Larochelle
Joelle Pineau
Laurent Charlin
117
214
0
06 Nov 2018