2404.09135
Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
14 April 2024
Taojun Hu, Xiao-Hua Zhou
ELM
Papers citing "Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions"
9 / 9 papers shown
Evaluating the Process Modeling Abilities of Large Language Models -- Preliminary Foundations and Results
Peter Fettke, Constantin Houy
ELM
35    0    0
14 Mar 2025
Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
Fatma Yasmine Loumachi, Mohamed Chahine Ghanem
AI4CE
36    1    0
04 Sep 2024
Fairness in Large Language Models in Three Hours
Thang Doan Viet, Zichong Wang, Minh Nhat Nguyen, Wenbin Zhang
33    8    0
02 Aug 2024
ORAN-Bench-13K: An Open Source Benchmark for Assessing LLMs in Open Radio Access Networks
Pranshav Gajjar, Vijay K. Shah
25    5    0
08 Jul 2024
Compositional Zero-Shot Domain Transfer with Text-to-Text Models
Fangyu Liu, Qianchu Liu, Shruthi Bannur, Fernando Pérez-García, Naoto Usuyama, ..., A. Nori, Hoifung Poon, Javier Alvarez-Valle, Ozan Oktay, Stephanie L. Hyland
VLM
35    5    0
23 Mar 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng, Xiao Liu, Zhengxiao Du, Zihan Wang, Hanyu Lai, ..., Jidong Zhai, Wenguang Chen, Peng-Zhen Zhang, Yuxiao Dong, Jie Tang
BDL, LRM
240    1,070    0
05 Oct 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli, Liane Lovitt, John Kernion, Amanda Askell, Yuntao Bai, ..., Nicholas Joseph, Sam McCandlish, C. Olah, Jared Kaplan, Jack Clark
216    327    0
23 Aug 2022
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang, Haohan Wang, Diyi Yang
117    130    0
15 Dec 2021
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W. Cohen, Xinghua Lu
196    791    0
13 Sep 2019