Evaluating the Performance of Large Language Models on GAOKAO Benchmark

21 May 2023
Xiaotian Zhang
Chun-yan Li
Yi Zong
Zhengyu Ying
Liang He
Xipeng Qiu
    ALM ELM

Papers citing "Evaluating the Performance of Large Language Models on GAOKAO Benchmark"

16 / 66 papers shown
FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models
Wei Li
Ren Ma
Jiang Wu
Chenya Gu
Jiahui Peng
Jinyang Len
Songyang Zhang
Hang Yan
Dahua Lin
Conghui He
ELM
166
1
0
29 Apr 2024
Yi: Open Foundation Models by 01.AI
01.AI
Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM LRM
829
764
0
07 Mar 2024
GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation
Yi Zong
Xipeng Qiu
ELM VLM
151
13
0
24 Feb 2024
PRE: A Peer Review Based Large Language Model Evaluator
Zhumin Chu
Jiaxin Mao
Yiteng Tu
Haitao Li
Yiqun Liu
LRM ALM
257
27
0
28 Jan 2024
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Jun Zhao
Zhihao Zhang
Luhui Gao
Tao Gui
Xuanjing Huang
ELM
368
104
0
02 Jan 2024
Urban Generative Intelligence (UGI): A Foundational Platform for Agents in Embodied City Environment
Fengli Xu
Jun Zhang
Chen Gao
J. Feng
Yong Li
AI4CE LLMAG
324
45
0
19 Dec 2023
Evaluating GPT-4's Vision Capabilities on Brazilian University Admission Exams
Ramon Pires
Thales Sales Almeida
Hugo Queiroz Abonizio
Rodrigo Nogueira
ELM
153
7
0
23 Nov 2023
CFBenchmark: Chinese Financial Assistant Benchmark for Large Language Model
Yang Lei
Jiangtong Li
Dawei Cheng
Zhijun Ding
Changjun Jiang
168
19
0
10 Nov 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
793
3,036
0
28 Sep 2023
LawBench: Benchmarking Legal Knowledge of Large Language Models
Zhiwei Fei
Xiaoyu Shen
D. Zhu
Fengzhe Zhou
Zhuo Han
Songyang Zhang
Kai-xiang Chen
Zongwen Shen
Jidong Ge
ELM AILaw
254
85
0
28 Sep 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian
Elahe Khatibi
Iman Azimi
David Oniani
Zahra Shakeri Hossein Abad
...
Bryant Lin
Olivier Gevaert
Li-Jia Li
Ramesh C. Jain
Amir M. Rahmani
LM&MA ELM AI4MH
497
119
0
21 Sep 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM LRM
800
923
0
19 Sep 2023
AGIBench: A Multi-granularity, Multimodal, Human-referenced, Auto-scoring Benchmark for Large Language Models
BenchCouncil International Symposium (ISB), 2023
Fei Tang
Wanling Gao
Luzhou Peng
Jianfeng Zhan
ELM
122
2
0
05 Sep 2023
CLEVA: Chinese Language Models EVAluation Platform
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yanyang Li
Jianqiao Zhao
Duo Zheng
Zi-Yuan Hu
Zhi Chen
...
Yongfeng Huang
Shijia Huang
Dahua Lin
Michael R. Lyu
Liwei Wang
ALM ELM
316
15
0
09 Aug 2023
Model Spider: Learning to Rank Pre-Trained Models Efficiently
Neural Information Processing Systems (NeurIPS), 2023
Yi-Kai Zhang
Ting Huang
Yao-Xiang Ding
De-Chuan Zhan
Han-Jia Ye
291
39
0
06 Jun 2023
ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dongfang Li
Jindi Yu
Baotian Hu
Zhenran Xu
Hao Fei
ELM
177
14
0
22 May 2023