2311.02049
Post Turing: Mapping the landscape of LLM Evaluation
3 November 2023
Alexey Tikhonov
Ivan P. Yamshchikov
ELM
Papers citing
"Post Turing: Mapping the landscape of LLM Evaluation"
8 / 8 papers shown
Title
AI Predicts AGI: Leveraging AGI Forecasting and Peer Review to Explore LLMs' Complex Reasoning Capabilities
Fabrizio Davide
Pietro Torre
Andrea Gaggioli
ELM
107
0
0
12 Dec 2024
Beyond Turing Test: Can GPT-4 Sway Experts' Decisions?
Takehiro Takayanagi
Hiroya Takamura
Kiyoshi Izumi
Chung-Chi Chen
ELM
DeLMO
20
1
0
25 Sep 2024
PLUGH: A Benchmark for Spatial Understanding and Reasoning in Large Language Models
Alexey Tikhonov
ELM
ReLM
LRM
20
0
0
03 Aug 2024
Humor Mechanics: Advancing Humor Generation with Multistep Reasoning
Alexey Tikhonov
Pavel Shtykovskiy
LRM
ReLM
26
1
0
12 May 2024
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
ELM
ALM
178
780
0
02 May 2023
Open-Domain Dialog Evaluation using Follow-Ups Likelihood
Maxime De Bruyn
Ehsan Lotfi
Jeska Buhmann
Walter Daelemans
24
9
0
12 Sep 2022
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018