2006.14799
v1
v2 (latest)
Evaluation of Text Generation: A Survey
26 June 2020
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
Papers citing
"Evaluation of Text Generation: A Survey"
50 / 243 papers shown
OPOR-Bench: Evaluating Large Language Models on Online Public Opinion Report Generation
Jinzheng Yu
Yang Xu
Haozhen Li
Junqi Li
Yifan Feng
Ligu Zhu
Hao Shen
Lei Shi
ELM
320
2
0
01 Dec 2025
Rating Roulette: Self-Inconsistency in LLM-As-A-Judge Frameworks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Rajarshi Haldar
Julia Hockenmaier
197
11
0
31 Oct 2025
CreativityPrism: A Holistic Evaluation Framework for Large Language Model Creativity
Zhaoyi Joey Hou
Bowei Alvin Zhang
Yining Lu
Bhiman Kumar Baghel
Anneliese Brei
...
Faeze Brahman
Snigdha Chaturvedi
Haw-Shiuan Chang
Daniel Khashabi
Xiang Lorraine Li
199
1
0
23 Oct 2025
A Layered Intuition -- Method Model with Scope Extension for LLM Reasoning
Hong Su
LRM
107
3
0
12 Oct 2025
Evaluating Spatiotemporal Consistency in Automatically Generated Sewing Instructions
Luisa Geiger
Mareike Hartmann
Michael Sullivan
Alexander Koller
121
0
0
29 Sep 2025
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang
Chenghao Xiao
Chia-Yi Hsiao
Zi Yan Chang
Chi-Li Chen
Tyler Loakman
Chenghua Lin
346
1
0
04 Sep 2025
The illusion of a perfect metric: Why evaluating AI's words is harder than it looks
Maria Paz Oliva
Adriana Correia
Ivan Vankov
Viktor Botev
ALM
222
0
0
19 Aug 2025
References Matter: Investigating the Impact of Reference Set Variation on Summarization Evaluation
Silvia Casola
Wenshu Fan
Siyao Peng
Oliver Kraus
Albert Gatt
Barbara Plank
410
0
0
17 Jun 2025
From Multimodal Perception to Strategic Reasoning: A Survey on AI-Generated Game Commentary
Qirui Zheng
Xingbo Wang
Keyuan Cheng
Muhammad Asif Ali
Yunlong Lu
Wenxin Li
216
0
0
17 Jun 2025
COGENT: A Curriculum-oriented Framework for Generating Grade-appropriate Educational Content
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2025
Zhengyuan Liu
Stella Xin Yin
Dion Hoe-Lian Goh
Nancy F. Chen
ELM
325
2
0
11 Jun 2025
Design of Trimmed Helicoid Soft-Rigid Hybrid Robots
International Conference on Soft Robotics (ICSR), 2025
Zach J. Patterson
Emily R. Sologuren
Daniela Rus
130
2
0
03 Jun 2025
APE: Selective Fine-tuning with Acceptance Criteria for Language Model Adaptation
Javier Marín
245
0
0
26 May 2025
Evaluating and Mitigating Bias in AI-Based Medical Text Generation
Nature Computational Science (Nat. Comput. Sci.), 2025
Xiuying Chen
Tairan Wang
Juexiao Zhou
Zirui Song
Xin Gao
Wei Wei
MedIm
285
13
0
24 Apr 2025
The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You
Daniel Lowd
348
1
0
24 Apr 2025
CPR: Leveraging LLMs for Topic and Phrase Suggestion to Facilitate Comprehensive Product Reviews
Ekta Gujral
Apurva Sinha
Lishi Ji
Bijayani Sanghamitra Mishra
171
0
0
18 Apr 2025
LLMs as Span Annotators: A Comparative Study of LLMs and Humans
Zdeněk Kasner
Vilém Zouhar
Patrícia Schmidtová
Ivan Kartáč
Kristýna Onderková
Ondřej Plátek
Dimitra Gkatzia
Saad Mahamood
Ondřej Dušek
Simone Balloccu
ALM
663
8
0
11 Apr 2025
SCORE: Story Coherence and Retrieval Enhancement for AI Narratives
Qiang Yi
Yangfan He
Jing Wang
Xinyuan Song
Shiyao Qian
...
Menghao Huo
Kuan Lu
Jiaqi Chen
Lewei He
Tianyu Shi
RALM
835
84
0
30 Mar 2025
When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
Tuo Liang
Zhe Hu
Jing Li
Hao Zhang
Yiren Lu
...
Yiran Qiao
Disheng Liu
Jeirui Peng
Jing Ma
Yu Yin
398
2
0
29 Mar 2025
Natural Language Generation
Theoretical Issues In Natural Language Processing (TINLP), 2018
Emiel van Miltenburg
Chenghua Lin
350
2
0
20 Mar 2025
Argument Summarization and its Evaluation in the Era of Large Language Models
Moritz Altemeyer
Steffen Eger
Johannes Daxenberger
Yanran Chen
Tim Altendorf
Philipp Cimiano
Benjamin Schiller
LM&MA
ELM
LRM
482
6
0
02 Mar 2025
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
Rylan Schaeffer
Punit Singh Koura
Binh Tang
R. Subramanian
Aaditya K. Singh
...
Vedanuj Goswami
Sergey Edunov
Dieuwke Hupkes
Sanmi Koyejo
Sharan Narang
ALM
463
2
0
24 Feb 2025
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Bosi Wen
Pei Ke
Yufei Sun
C. Wang
Xiaohan Zhang
Jinfeng Zhou
Jie Tang
Hongning Wang
Minlie Huang
271
5
0
18 Feb 2025
Reference-free Evaluation Metrics for Text Generation: A Survey
Takumi Ito
Kees van Deemter
Jun Suzuki
ELM
447
11
0
21 Jan 2025
Interactive Information Need Prediction with Intent and Context
Kevin Ros
Dhyey Pandya
ChengXiang Zhai
174
0
0
05 Jan 2025
AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility
Yixian Shen
Hang Zhang
Yanxin Shen
Lun Wang
Chuanqi Shi
Shaoshuai Du
Yiyi Tao
312
16
0
03 Jan 2025
QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Shiyue Zhang
David Wan
Arie Cattan
Ayal Klein
Ido Dagan
Joey Tianyi Zhou
413
5
0
10 Dec 2024
Challenges in Trustworthy Human Evaluation of Chatbots
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Wenting Zhao
Alexander M. Rush
Tanya Goyal
ALM
362
12
0
05 Dec 2024
I Can Tell What I am Doing: Toward Real-World Natural Language Grounding of Robot Experiences
Conference on Robot Learning (CoRL), 2024
Zihan Wang
Brian Liang
Varad Dhat
Zander Brumbaugh
Nick Walker
Ranjay Krishna
Maya Cakmak
347
17
0
20 Nov 2024
Script-Strategy Aligned Generation: Aligning LLMs with Expert-Crafted Dialogue Scripts and Therapeutic Strategies for Psychotherapy
Proceedings of the ACM on Human-Computer Interaction (PACMHCI), 2024
Xin Sun
Jan de Wit
Zhuying Li
Jiahuan Pei
Abdallah El Ali
Jos A. Bosch
526
7
0
11 Nov 2024
Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning
Dong Shu
Jundong Li
253
4
0
30 Oct 2024
Prove Your Point!: Bringing Proof-Enhancement Principles to Argumentative Essay Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Ruiyu Xiao
Lei Wu
Yuhang Gou
Weinan Zhang
Ting Liu
184
2
0
30 Oct 2024
Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework
Esteban Garces Arias
Hannah Blocher
Julian Rodemann
Meimingwei Li
Gaojuan Fan
Yi Men
449
5
0
24 Oct 2024
OpenMU: Your Swiss Army Knife for Music Understanding
Mengjie Zhao
Zhi-Wei Zhong
Zhuoyuan Mao
Shiqi Yang
Wei-Hsiang Liao
Shusuke Takahashi
Hiromi Wakaki
Yuki Mitsufuji
OSLM
426
13
0
21 Oct 2024
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration
Neural Information Processing Systems (NeurIPS), 2024
Yun-Yen Chuang
Hung-Min Hsu
Kevin Lin
Chen-Sheng Gu
Ling Zhen Li
Ray-I Chang
Hung-yi Lee
DiffM
VLM
320
1
0
17 Oct 2024
4-LEGS: 4D Language Embedded Gaussian Splatting
Gal Fiebelman
Tamir Cohen
Ayellet Morgenstern
Peter Hedman
Hadar Averbuch-Elor
3DGS
522
4
0
14 Oct 2024
MLP-SLAM: Multilayer Perceptron-Based Simultaneous Localization and Mapping
Taozhe Li
Wei Sun
395
1
0
14 Oct 2024
Investigating Human-Computer Interaction and Visual Comprehension in Text Generation Process of Natural Language Generation Models
Yunchao Wang
Zihang Fu
Chaoqing Xu
Guodao Sun
Ronghua Liang
212
0
0
11 Oct 2024
Debate, Deliberate, Decide (D3): A Cost-Aware Adversarial Framework for Reliable and Interpretable LLM Evaluation
Chaithanya Bandi
Abir Harrasse
Hari Bandi
LLMAG
ELM
423
11
0
07 Oct 2024
Natural Language Generation for Visualizations: State of the Art, Challenges and Future Directions
Enamul Hoque
Mohammed Saidul Islam
246
9
0
29 Sep 2024
Quality Matters: Evaluating Synthetic Data for Tool-Using LLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Shadi Iskander
Nachshon Cohen
Zohar Karnin
Ori Shapira
Sofia Tolmach
SyDa
129
21
0
24 Sep 2024
LLMs are One-Shot URL Classifiers and Explainers
Fariza Rashid
Nishavi Ranaweera
Ben Doyle
Suranga Seneviratne
LRM
283
17
0
22 Sep 2024
The Effect of Education in Prompt Engineering: Evidence from Journalists
Amirsiavosh Bashardoust
Yuanjun Feng
Dominique Geissler
Stefan Feuerriegel
Y. Shrestha
221
8
0
18 Sep 2024
Exploring Fine-tuned Generative Models for Keyphrase Selection: A Case Study for Russian
Anna Glazkova
Dmitry A. Morozov
243
1
0
16 Sep 2024
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation
Ingo Ziegler
Abdullatif Köksal
Desmond Elliott
Hinrich Schütze
333
19
0
03 Sep 2024
A Perspective on Literary Metaphor in the Context of Generative AI
Imke van Heerden
Anil Bas
232
3
0
02 Sep 2024
Summarizing long regulatory documents with a multi-step pipeline
Mika Sie
Ruby Beek
Michiel Bots
S. Brinkkemper
Albert Gatt
AILaw
ELM
202
8
0
19 Aug 2024
Automatic Metrics in Natural Language Generation: A Survey of Current Evaluation Practices
International Conference on Natural Language Generation (INLG), 2024
Patrícia Schmidtová
Saad Mahamood
Simone Balloccu
Ondřej Dušek
Albert Gatt
Dimitra Gkatzia
David M. Howcroft
Ondřej Plátek
Adarsa Sivaprasad
256
25
0
17 Aug 2024
What should I wear to a party in a Greek taverna? Evaluation for Conversational Agents in the Fashion Domain
Antonis Maronikolakis
Ana Peleteiro Ramallo
Weiwei Cheng
Thomas Kober
LLMAG
203
2
0
13 Aug 2024
Exploring Personality-Driven Personalization in XAI: Enhancing User Trust in Gameplay
Zhaoxin Li
Sophie Yang
Shijie Wang
178
1
0
08 Aug 2024
Interpretable Differential Diagnosis with Dual-Inference Large Language Models
Shuang Zhou
Sirui Ding
Jiashuo Wang
Mingquan Lin
Genevieve B. Melton
Rui Zhang
LM&MA
243
4
0
10 Jul 2024