Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions

14 April 2024
Taojun Hu
Xiao-Hua Zhou
    ELM

Papers citing "Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions"

9 / 9 papers shown
Evaluating the Process Modeling Abilities of Large Language Models -- Preliminary Foundations and Results
Peter Fettke
Constantin Houy
ELM
35
0
0
14 Mar 2025
Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
Fatma Yasmine Loumachi
Mohamed Chahine Ghanem
AI4CE
36
1
0
04 Sep 2024
Fairness in Large Language Models in Three Hours
Thang Doan Viet
Zichong Wang
Minh Nhat Nguyen
Wenbin Zhang
33
8
0
02 Aug 2024
ORAN-Bench-13K: An Open Source Benchmark for Assessing LLMs in Open Radio Access Networks
Pranshav Gajjar
Vijay K. Shah
25
5
0
08 Jul 2024
Compositional Zero-Shot Domain Transfer with Text-to-Text Models
Fangyu Liu
Qianchu Liu
Shruthi Bannur
Fernando Pérez-García
Naoto Usuyama
...
A. Nori
Hoifung Poon
Javier Alvarez-Valle
Ozan Oktay
Stephanie L. Hyland
VLM
35
5
0
23 Mar 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
240
1,070
0
05 Oct 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
216
327
0
23 Aug 2022
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang
Haohan Wang
Diyi Yang
117
130
0
15 Dec 2021
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
196
791
0
13 Sep 2019