All Papers
Title |
|---|
MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |
Rethinking Response Evaluation from Interlocutor's Eye for Open-Domain Dialogue Systems. International Joint Conference on Natural Language Processing (IJCNLP), 2024 |
BatchEval: Towards Human-like Text Evaluation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
CESAR: Automatic Induction of Compositional Instructions for Multi-turn Dialogs. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Fusion-Eval: Integrating Assistant Evaluators with LLMs. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
DialogBench: Evaluating LLMs as Human-like Dialogue Systems. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Calibrating LLM-Based Evaluator. International Conference on Language Resources and Evaluation (LREC), 2023 |
RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
Towards Multilingual Automatic Dialogue Evaluation. SIGDIAL Conferences (SIGDIAL), 2023 |
GPTEval: A Survey on Assessments of ChatGPT and GPT-4. International Conference on Language Resources and Evaluation (LREC), 2023 |
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 |
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
C-PMI: Conditional Pointwise Mutual Information for Turn-level Dialogue Evaluation. Workshop on Document-grounded Dialogue and Conversational Question Answering (DialDoc), 2023 |
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
Correction of Errors in Preference Ratings from Automated Metrics for Text Generation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
Asking Clarification Questions to Handle Ambiguity in Open-Domain QA. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study. International Joint Conference on Natural Language Processing (IJCNLP), 2023 |
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
KPEval: Towards Fine-Grained Semantic-Based Keyphrase Evaluation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
GPTScore: Evaluate as You Desire. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
Opportunities and Challenges in Neural Dialog Tutoring. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 |
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
Don't Forget Your ABC's: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 |