ResearchTrend.AI
Assessing and Verifying Task Utility in LLM-Powered Applications


3 May 2024
Negar Arabzadeh
Siqing Huo
Nikhil Mehta
Qingyun Wu
Chi Wang
Ahmed Hassan Awadallah
Charles L. A. Clarke
Julia Kiseleva

Papers citing "Assessing and Verifying Task Utility in LLM-Powered Applications"

7 / 7 papers shown
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri
Melissa Z. Pan
Shuyi Yang
Lakshya A Agrawal
Bhavya Chopra
...
Dan Klein
Kannan Ramchandran
Matei A. Zaharia
Joseph E. Gonzalez
Ion Stoica
LLMAG
Presented at ResearchTrend Connect | LLMAG on 23 Apr 2025
112
5
0
17 Mar 2025
Benchmarking Prompt Sensitivity in Large Language Models
Amirhossein Razavi
Mina Soltangheis
Negar Arabzadeh
Sara Salamat
Morteza Zihayat
Ebrahim Bagheri
59
1
0
09 Feb 2025
LLM-Human Pipeline for Cultural Context Grounding of Conversations
Rajkumar Pujari
Dan Goldwasser
16
1
0
17 Oct 2024
Creative Agents: Empowering Agents with Imagination for Creative Tasks
Chi Zhang
Penglin Cai
Yuhui Fu
Haoqi Yuan
Zongqing Lu
LM&Ro
LLMAG
49
20
0
05 Dec 2023
Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang
Hung-yi Lee
ALM
LM&MA
201
559
0
03 May 2023
Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
Victor C. Dibia
Adam Fourney
Gagan Bansal
Forough Poursabzi-Sangdeh
Han Liu
Saleema Amershi
ALM
OffRL
28
12
0
29 Oct 2022
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
196
791
0
13 Sep 2019