2405.02178
Cited By
Assessing and Verifying Task Utility in LLM-Powered Applications
3 May 2024
Negar Arabzadeh
Siqing Huo
Nikhil Mehta
Qingyun Wu
Chi Wang
Ahmed Hassan Awadallah
Charles L. A. Clarke
Julia Kiseleva
Papers citing "Assessing and Verifying Task Utility in LLM-Powered Applications"
7 / 7 papers shown
Title
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri
Melissa Z. Pan
Shuyi Yang
Lakshya A Agrawal
Bhavya Chopra
...
Dan Klein
Kannan Ramchandran
Matei A. Zaharia
Joseph E. Gonzalez
Ion Stoica
LLMAG
Presented at ResearchTrend Connect | LLMAG on 23 Apr 2025
112
5
0
17 Mar 2025
Benchmarking Prompt Sensitivity in Large Language Models
Amirhossein Razavi
Mina Soltangheis
Negar Arabzadeh
Sara Salamat
Morteza Zihayat
Ebrahim Bagheri
59
1
0
09 Feb 2025
LLM-Human Pipeline for Cultural Context Grounding of Conversations
Rajkumar Pujari
Dan Goldwasser
16
1
0
17 Oct 2024
Creative Agents: Empowering Agents with Imagination for Creative Tasks
Chi Zhang
Penglin Cai
Yuhui Fu
Haoqi Yuan
Zongqing Lu
LM&Ro
LLMAG
49
20
0
05 Dec 2023
Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang
Hung-yi Lee
ALM
LM&MA
201
559
0
03 May 2023
Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
Victor C. Dibia
Adam Fourney
Gagan Bansal
Forough Poursabzi-Sangdeh
Han Liu
Saleema Amershi
ALM
OffRL
28
12
0
29 Oct 2022
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
196
791
0
13 Sep 2019