2308.10855
LatEval: An Interactive LLMs Evaluation Benchmark with Incomplete Information from Lateral Thinking Puzzles
21 August 2023
Shulin Huang
Shirong Ma
Yinghui Li
Mengzuo Huang
Wuhe Zou
Weidong Zhang
Haitao Zheng
LLMAG
LRM
Papers citing "LatEval: An Interactive LLMs Evaluation Benchmark with Incomplete Information from Lateral Thinking Puzzles"
24 / 24 papers shown
Solving Situation Puzzles with Large Language Model and External Reformulation
Kun Li
Xinwei Chen
Tianyou Song
Chengrui Zhou
Zhuoran Liu
Zhenyan Zhang
Jiangjian Guo
Qing Shan
ReLM
LRM
60
2
0
24 Mar 2025
Corrections Meet Explanations: A Unified Framework for Explainable Grammatical Error Correction
Jingheng Ye
Shang Qin
Yinghui Li
Hai-Tao Zheng
Shen Wang
Qingsong Wen
50
0
0
24 Feb 2025
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Yinghui Li
Jiayi Kuang
Haojing Huang
Zhikun Xu
Xinnian Liang
...
Xiaoyu Tan
C. Qu
Ying Shen
Hai-Tao Zheng
Philip S. Yu
LRM
41
3
0
12 Feb 2025
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
Yinghui Li
Haojing Huang
Jiayi Kuang
Yangning Li
Shu Guo
C. Qu
Xiaoyu Tan
Hai-Tao Zheng
Ying Shen
Philip S. Yu
CLL
63
5
0
11 Feb 2025
Exploring the Implicit Semantic Ability of Multimodal Large Language Models: A Pilot Study on Entity Set Expansion
Hebin Wang
Yangning Li
Yinghui Li
Hai-Tao Zheng
Wenhao Jiang
Hong-Gee Kim
32
0
0
03 Jan 2025
Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles
Qi Chen
Bowen Zhang
Gang Wang
Qi Wu
ReLM
LRM
16
3
0
09 Oct 2024
A Survey on Complex Tasks for Goal-Directed Interactive Agents
Mareike Hartmann
Alexander Koller
LM&Ro
LLMAG
32
0
0
27 Sep 2024
IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering
Ruosen Li
Barry Wang
Ruochen Li
Xinya Du
ELM
21
5
0
24 Aug 2024
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Xianjie Wu
Jian Yang
Linzheng Chai
Ge Zhang
Jiaheng Liu
...
Xianfu Cheng
Tianzhen Sun
Guanglin Niu
Tongliang Li
Zhoujun Li
LMTD
ELM
60
17
0
17 Aug 2024
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?
Nemika Tyagi
Mihir Parmar
Mohith Kulkarni
Aswin Rrv
Nisarg Patel
Mutsumi Nakamura
Arindam Mitra
Chitta Baral
LRM
22
0
0
20 Jul 2024
BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
Baktash Ansari
Mohammadmostafa Rostamkhani
Sauleh Eetemadi
LRM
21
1
0
07 Jun 2024
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers
Harshit Gupta
Manav Chaudhary
Tathagata Raha
Shivansh Subramanian
Vasudeva Varma
ReLM
LRM
14
1
0
25 May 2024
MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thoughts
Md. Nishat Raihan
Dhiman Goswami
Al Nahian Bin Emran
Sadiya Sayara Chowdhury Puspo
Amrita Ganguly
Marcos Zampieri
ReLM
LRM
25
1
0
22 Mar 2024
Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark
Zhikun Xu
Yinghui Li
Ruixue Ding
Xinyu Wang
Boli Chen
Yong-jia Jiang
Hai-Tao Zheng
Wenlian Lu
Pengjun Xie
Fei Huang
33
11
0
29 Feb 2024
Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction
Yinghui Li
Shang Qin
Jingheng Ye
Shirong Ma
Yangning Li
Libo Qin
Xuming Hu
Wenhao Jiang
Hai-Tao Zheng
Philip S. Yu
LRM
15
5
0
18 Feb 2024
Puzzle Solving using Reasoning of Large Language Models: A Survey
Panagiotis Giadikiaroglou
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
ELM
ReLM
LRM
11
24
0
17 Feb 2024
When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
Yinghui Li
Qingyu Zhou
Yuanzhen Luo
Shirong Ma
Yangning Li
Hai-Tao Zheng
Xuming Hu
Philip S. Yu
LRM
36
13
0
16 Feb 2024
EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data
Shirong Ma
Shen Huang
Shulin Huang
Xiaobin Wang
Yangning Li
Hai-Tao Zheng
Pengjun Xie
Fei Huang
Yong-jia Jiang
22
6
0
25 Dec 2023
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
120
137
0
19 Sep 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro
OffRL
LRM
AI4CE
87
148
0
07 Mar 2023
Towards Attribute-Entangled Controllable Text Generation: A Pilot Study of Blessing Generation
Shulin Huang
Shirong Ma
Yinghui Li
Y. Li
Shiyang Lin
Haitao Zheng
Ying Shen
20
4
0
29 Oct 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
53
96
0
07 Nov 2021
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish
Angelica Chen
Nikita Nangia
Vishakh Padmakumar
Jason Phang
Jana Thompson
Phu Mon Htut
Sam Bowman
205
364
0
15 Oct 2021