Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.00908
Cited By
FineSurE: Fine-grained Summarization Evaluation using LLMs
1 July 2024
Hwanjun Song
Hang Su
Igor Shalyminov
Jason (Jinglun) Cai
Saab Mansour
HILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"FineSurE: Fine-grained Summarization Evaluation using LLMs"
23 / 23 papers shown
Title
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025
Exploration of Plan-Guided Summarization for Narrative Texts: the Case of Small Language Models
Matt Grenander
Siddharth Varia
Paula Czarnowska
Yogarshi Vyas
Kishaloy Halder
Bonan Min
HILM
29
0
0
12 Apr 2025
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization?
Daniil Larionov
Sotaro Takeshita
Ran Zhang
Yanran Chen
Christoph Leiter
Zhipin Wang
Christian Greisinger
Steffen Eger
ReLM
ELM
LRM
69
0
0
10 Apr 2025
From Speech to Summary: A Comprehensive Survey of Speech Summarization
Fabian Retkowski
Maike Züfle
Andreas Sudmann
Dinah Pfau
Jan Niehues
Alexander Waibel
39
0
0
10 Apr 2025
FinGrAct: A Framework for FINe-GRrained Evaluation of ACTionability in Explainable Automatic Fact-Checking
Islam Eldifrawi
Shengrui Wang
Amine Trabelsi
26
0
0
07 Apr 2025
Crowdsourcing-Based Knowledge Graph Construction for Drug Side Effects Using Large Language Models with an Application on Semaglutide
Zhijie Duan
Kai Wei
Zhaoqian Xue
Jiayan Zhou
Shu Yang
Siyuan Ma
Jin Jin
Lingyao Li
35
0
0
06 Apr 2025
ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback
Taewon Yun
Jihwan Oh
Hyangsuk Min
Yuho Lee
Jihwan Bang
Jason (Jinglun) Cai
Hwanjun Song
OffRL
LRM
34
0
0
27 Mar 2025
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings
Austin Xu
Srijan Bansal
Yifei Ming
Semih Yavuz
Shafiq R. Joty
ELM
89
2
0
19 Mar 2025
Rehearse With User: Personalized Opinion Summarization via Role-Playing based on Large Language Models
Yanyue Zhang
Yulan He
Deyu Zhou
31
0
0
01 Mar 2025
LCTG Bench: LLM Controlled Text Generation Benchmark
K. K.
Masato Mita
Peinan Zhang
S. Sasaki
Ryosuke Ishigami
Naoaki Okazaki
55
0
0
28 Jan 2025
CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions
Mourad Heddaya
Kyle MacMillan
Anup Malani
Hongyuan Mei
Chenhao Tan
AILaw
ELM
22
0
0
03 Jan 2025
Chain-of-MetaWriting: Linguistic and Textual Analysis of How Small Language Models Write Young Students Texts
Ioana Buhnila
Georgeta Cislaru
Amalia Todirascu
80
1
0
19 Dec 2024
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models
Ming Cheng
Jiaying Gong
Chenhan Yuan
William A. Ingram
Edward A. Fox
Hoda Eldardiry
40
0
0
07 Nov 2024
Prompting and Fine-Tuning of Small LLMs for Length-Controllable Telephone Call Summarization
David Thulke
Yingbo Gao
Rricha Jalota
Christian Dugast
Hermann Ney
14
3
0
24 Oct 2024
Disentangling Likes and Dislikes in Personalized Generative Explainable Recommendation
Ryotaro Shimizu
Takashi Wada
Yu Wang
Johannes Kruse
Sean O'Brien
...
Yuya Yoshikawa
Yuki Saito
Fugee Tsung
M. Goto
Julian McAuley
19
0
0
17 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
48
7
0
09 Oct 2024
Enhancing Temporal Understanding in Audio Question Answering for Large Audio Language Models
A. Sridhar
Yinyi Guo
Erik M. Visser
AuLLM
25
0
0
10 Sep 2024
Report Cards: Qualitative Evaluation of Language Models Using Natural Language Summaries
Blair Yang
Fuyang Cui
Keiran Paster
Jimmy Ba
Pashootan Vaezipoor
Silviu Pitis
Michael Ruogu Zhang
18
1
0
01 Sep 2024
Reference-Guided Verdict: LLMs-as-Judges in Automatic Evaluation of Free-Form Text
Sher Badshah
Hassan Sajjad
ELM
36
9
0
17 Aug 2024
Towards Better Chain-of-Thought Prompting Strategies: A Survey
Zihan Yu
Liang He
Zhen Wu
Xinyu Dai
Jiajun Chen
LRM
118
40
0
08 Oct 2023
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
215
305
0
27 Apr 2021
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomás Kociský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
170
3,504
0
10 Jun 2015
1