Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation

31 August 2023

Patrícia Pereira

Joao Paulo Carvalho

Isabel Trancoso

Papers citing "Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation"

Title | Authors | Tags | Counts | Date
Grammar Control in Dialogue Response Generation for Language Learning Chatbots | Dominik Glandorf, Peng Cui, Detmar Meurers, Mrinmaya Sachan | KELM | 50 1 0 | 11 Feb 2025
Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs | John Mendonça, Isabel Trancoso, A. Lavie | ALM | 29 1 0 | 20 Aug 2024
ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues | John Mendonça, Isabel Trancoso, A. Lavie | - | 29 3 0 | 16 Jul 2024
Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues | Kuanchao Chu, Yi-Pei Chen, Hideki Nakayama | LLMAG | 34 2 0 | 13 Jul 2024
On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation | John Mendonça, A. Lavie, Isabel Trancoso | ELM | 43 2 0 | 04 Jul 2024
ConvoCache: Smart Re-Use of Chatbot Responses | Conor Atkins, Ian D. Wood, M. Kâafar, H. Asghar, Nardine Basta, Michal Kepkowski | - | 33 0 0 | 26 Jun 2024
A survey of dynamic graph neural networks | Yanping Zheng, Lu Yi, Zhewei Wei | AI4TS, AI4CE | 25 7 0 | 28 Apr 2024
Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers | Libo Qin, Qiguang Chen, Yuhang Zhou, Zhi Chen, Yinghui Li, Lizi Liao, Min Li, Wanxiang Che, Philip S. Yu | LRM | 47 36 0 | 07 Apr 2024
Are LLM-based Evaluators Confusing NLG Quality Criteria? | Xinyu Hu, Mingqi Gao, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu, Xiaojun Wan | AAML, ELM | 34 21 0 | 19 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges | Mingqi Gao, Xinyu Hu, Jie Ruan, Xiao Pu, Xiaojun Wan | ELM, LM&MA | 53 29 0 | 02 Feb 2024
Building a Llama2-finetuned LLM for Odia Language Utilizing Domain Knowledge Instruction Set | Guneet Singh Kohli, Shantipriya Parida, Sambit Sekhar, Samirit Saha, Nipun B. Nair, Parul Agarwal, Sonal Khosla, Kusumlata Patiyal, Debasish Dhal | - | 28 13 0 | 19 Dec 2023
A Survey of the Evolution of Language Model-Based Dialogue Systems | Hongru Wang, Lingzhi Wang, Yiming Du, Liang Chen, Jing Zhou, Yufei Wang, Kam-Fai Wong | LRM | 49 20 0 | 28 Nov 2023
Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task | Neema Kotonya, Saran Krishnasamy, Joel R. Tetreault, Alejandro Jaimes | - | 16 9 0 | 01 Nov 2023
Appropriateness is all you need! | Hendrik Kempt, A. Lavie, S. Nagel | - | 18 1 0 | 27 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, J. Gehrke, Eric Horvitz, ..., Scott M. Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang | ELM, AI4MH, AI4CE, ALM | 236 2,232 0 | 22 Mar 2023
EnDex: Evaluation of Dialogue Engagingness at Scale | Guangxuan Xu, Ruibo Liu, Fabrice Harel-Canada, Nischal Reddy Chandra, Nanyun Peng | - | 13 5 0 | 22 Oct 2022
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task | Ricardo Rei, Marcos Vinícius Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, ..., T. Glushkova, Duarte M. Alves, A. Lavie, Luísa Coheur, André F. T. Martins | - | 52 138 0 | 13 Sep 2022
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 303 11,881 0 | 04 Mar 2022