LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond

23 May 2023

Papers citing "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"

Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina
Yuan Gao, Dokyun Lee, Gordon Burtch, Sina Fazelpour
LRM | 40 | 7 | 0 | 25 Oct 2024

Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian, Elahe Khatibi, Iman Azimi, David Oniani, Zahra Shakeri Hossein Abad, ..., Bryant Lin, Olivier Gevaert, Li-Jia Li, Ramesh C. Jain, Amir M. Rahmani
LM&MA, ELM, AI4MH | 20 | 63 | 0 | 21 Sep 2023

ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts
Rajdeep Mukherjee, Abhinav Bohra, Akash Banerjee, Soumya Sharma, Manjunath Hegde, ..., Shivani Shrivastava, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal
RALM | 35 | 44 | 0 | 22 Oct 2022

Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets
Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
33 | 5 | 0 | 13 May 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
LM&Ro, LRM, AI4CE, ReLM | 315 | 8,261 | 0 | 28 Jan 2022

Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni, Vidhisha Balachandran, Yulia Tsvetkov
HILM | 215 | 305 | 0 | 27 Apr 2021