ResearchTrend.AI
UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions

18 June 2024
Xunzhi Wang
Zhuowei Zhang
Qiongyu Li
Gaonan Chen
Mengting Hu
Zhiyu Li
Bitong Luo
Hang Gao
Zhixin Han
Haotian Wang
    ELM

Papers citing "UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions"

5 / 5 papers shown
Title
Entropy Heat-Mapping: Localizing GPT-Based OCR Errors with Sliding-Window Shannon Analysis
Alexei Kaltchenko
30
0
0
30 Apr 2025
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang
Haneul Yoo
Hwaran Lee
48
1
0
13 Aug 2024
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
237
840
0
05 Oct 2022
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
193
108
0
30 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,003
0
20 Apr 2018