arXiv:2305.14802
Estimating Large Language Model Capabilities without Labeled Test Data
24 May 2023
Harvey Yiyun Fu
Qinyuan Ye
Albert Xu
Xiang Ren
Robin Jia
Papers citing "Estimating Large Language Model Capabilities without Labeled Test Data"
13 / 13 papers shown
Title
Plug-and-Play Performance Estimation for LLM Services without Relying on Labeled Data
Can Wang
Dianbo Sui
Hongliang Sun
Hao Ding
Bolin Zhang
Zhiying Tu
19
0
0
10 Oct 2024
When Raw Data Prevails: Are Large Language Model Embeddings Effective in Numerical Data Representation for Medical Machine Learning Applications?
Yanjun Gao
Skatje Myers
Shan Chen
Dmitriy Dligach
Timothy A Miller
Danielle S. Bitterman
M. Churpek
Majid Afshar
29
7
0
15 Aug 2024
Active Testing of Large Language Model via Multi-Stage Sampling
Yuheng Huang
Jiayang Song
Qiang Hu
Felix Juefei-Xu
Lei Ma
16
2
0
07 Aug 2024
Third-Party Language Model Performance Prediction from Instruction
Rahul Nadkarni
Yizhong Wang
Noah A. Smith
ELM
LRM
26
0
0
19 Mar 2024
The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey
Dhruv Dhamani
Mary Lou Maher
16
1
0
29 Dec 2023
Predicting generalization performance with correctness discriminators
Yuekun Yao
Alexander Koller
16
0
0
15 Nov 2023
Effective Proxy for Human Labeling: Ensemble Disagreement Scores in Large Language Models for Industrial NLP
Wei Du
Laksh Advani
Yashmeet Gambhir
Daniel J. Perry
Prashant Shiralkar
Zhengzheng Xing
Aaron Colak
ALM
17
1
0
11 Sep 2023
On the Relation between Sensitivity and Accuracy in In-context Learning
Yanda Chen
Chen Zhao
Zhou Yu
Kathleen McKeown
He He
178
77
0
16 Sep 2022
Re-Examining Calibration: The Case of Question Answering
Chenglei Si
Chen Zhao
Sewon Min
Jordan L. Boyd-Graber
38
30
0
25 May 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
202
167
0
18 Apr 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
274
882
0
18 Apr 2021
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
231
288
0
17 Mar 2020