Less is More for Long Document Summary Evaluation by LLMs


14 September 2023
Yunshu Wu
Hayate Iso
Pouya Pezeshkpour
Nikita Bhutani
Estevam R. Hruschka

Papers citing "Less is More for Long Document Summary Evaluation by LLMs"

26 / 26 papers shown
Tempo: Application-aware LLM Serving with Mixed SLO Requirements
Wei Zhang
Zhiyu Wu
Yi Mu
Banruo Liu
Myungjin Lee
Fan Lai
51
0
0
24 Apr 2025
Orchestrating Agents and Data for Enterprise: A Blueprint Architecture for Compound AI
Eser Kandogan
Nikita Bhutani
Dan Zhang
Rafael Li Chen
Sairam Gurajada
Estevam R. Hruschka
AIFin
34
0
0
10 Apr 2025
Evaluating Bias in LLMs for Job-Resume Matching: Gender, Race, and Education
Hayate Iso
Pouya Pezeshkpour
Nikita Bhutani
Estevam R. Hruschka
61
0
0
24 Mar 2025
NExtLong: Toward Effective Long-Context Training without Long Documents
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
64
1
0
22 Jan 2025
A Survey on Time-Series Distance Measures
John Paparrizos
Haojun Li
Fan Yang
Kaize Wu
Jens E. d'Hondt
Odysseas Papapetrou
AI4TS
26
0
0
31 Dec 2024
Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
Abdullah Al Rabeyah
Fabrício Góes
Marco Volpe
Talles Medeiros
69
1
0
23 Nov 2024
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Taewhoo Lee
Chanwoong Yoon
Kyochul Jang
Donghyeon Lee
Minju Song
Hyunjae Kim
Jaewoo Kang
ELM
35
1
0
22 Oct 2024
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Seiji Maekawa
Hayate Iso
Nikita Bhutani
RALM
95
1
0
15 Oct 2024
Mitigating the Impact of Reference Quality on Evaluation of Summarization Systems with Reference-Free Metrics
Théo Gigant
Camille Guinaudeau
Marc Decombas
Frédéric Dufaux
40
1
0
08 Oct 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban
Alexander R. Fabbri
Caiming Xiong
Chien-Sheng Wu
RALM
38
41
0
01 Jul 2024
Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP
Omer Goldman
Alon Jacovi
Aviv Slobodkin
Aviya Maimon
Ido Dagan
Reut Tsarfaty
58
10
0
29 Jun 2024
A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
Haopeng Zhang
Philip S. Yu
Jiawei Zhang
30
14
0
17 Jun 2024
Hierarchical Attention Graph for Scientific Document Summarization in Global and Local Level
Chenlong Zhao
Xiwen Zhou
Xiaopeng Xie
Yong Zhang
16
3
0
16 May 2024
Large Language Models are Inconsistent and Biased Evaluators
Rickard Stureborg
Dimitris Alikaniotis
Yoshi Suhara
ALM
32
50
0
02 May 2024
AmbigNLG: Addressing Task Ambiguity in Instruction for NLG
Ayana Niwa
Hayate Iso
20
4
0
27 Feb 2024
Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy
Liyan Xu
Zhenlin Su
Mo Yu
Jin Xu
Jinho D. Choi
Jie Zhou
Fei Liu
HILM
24
2
0
20 Feb 2024
Generating Zero-shot Abstractive Explanations for Rumour Verification
I. Bilal
Preslav Nakov
Rob Procter
M. Liakata
16
0
0
23 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA
ELM
26
9
0
13 Jan 2024
Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks
Aditi Mishra
Sajjadur Rahman
H. Kim
Kushan Mitra
Estevam R. Hruschka
13
6
0
09 Nov 2023
GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions
Ting-Yao Hsu
Chieh-Yang Huang
Ryan A. Rossi
Sungchul Kim
C. Lee Giles
Ting-Hao 'Kenneth' Huang
15
11
0
23 Oct 2023
Redundancy Aware Multi-Reference Based Gainwise Evaluation of Extractive Summarization
Mousumi Akter
Shubhra (Santu) Karmaker
18
1
0
04 Aug 2023
Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang
Hung-yi Lee
ALM
LM&MA
209
568
0
03 May 2023
A Meta-Evaluation of Faithfulness Metrics for Long-Form Hospital-Course Summarization
Griffin Adams
Jason Zucker
Noémie Elhadad
46
22
0
07 Mar 2023
AutoTemplate: A Simple Recipe for Lexically Constrained Text Generation
Hayate Iso
8
7
0
15 Nov 2022
How Far are We from Robust Long Abstractive Summarization?
Huan Yee Koh
Jiaxin Ju
He Zhang
Ming Liu
Shirui Pan
HILM
23
39
0
30 Oct 2022
On the Limitations of Reference-Free Evaluations of Generated Text
Daniel Deutsch
Rotem Dror
Dan Roth
32
45
0
22 Oct 2022