CogBench: a large language model walks into a psychology lab


28 February 2024
Julian Coda-Forno
Marcel Binz
Jane X. Wang
Eric Schulz
ELM, ALM, LLMAG, LM&MA

Papers citing "CogBench: a large language model walks into a psychology lab"

35 / 35 papers shown
Are Large Language Models Sensitive to the Motives Behind Communication?
Addison J. Wu
Ryan Liu
Kerem Oktar
T. Sumers
Thomas L. Griffiths
164
0
0
22 Oct 2025
Unraveling the cognitive patterns of Large Language Models through module communities
Kushal Raj Bhandari
Pin-Yu Chen
Jianxi Gao
96
0
0
25 Aug 2025
How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations
Brandon Jaipersaud
David M. Krueger
Ekdeep Singh Lubana
100
2
0
07 Aug 2025
How LLMs Comprehend Temporal Meaning in Narratives: A Case Study in Cognitive Evaluation of LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Karin de Langis
J. Park
Andreas Schramm
Bin Hu
Khanh Chi Le
Michael C. Mensink
Ahn Thu Tong
Luan Tuyen Chau
116
1
0
18 Jul 2025
Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
Badr AlKhamissi
C. Nicolò De Sabbata
Greta Tuckute
Zeming Chen
Martin Schrimpf
Antoine Bosselut
MoE, LRM
253
4
0
16 Jun 2025
Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Dongyue Li
Ziniu Zhang
Lu Wang
Hongyang R. Zhang
169
6
0
28 May 2025
Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems
Jiayi Geng
Howard Chen
Dilip Arumugam
Thomas L. Griffiths
326
3
0
23 May 2025
Understanding LLM Scientific Reasoning through Promptings and Model's Explanation on the Answers
Alice Rueda
Mohammed S. Hassan
Argyrios Perivolaris
Bazen G. Teferra
Reza Samavi
...
Yanzhe Zhang
Bo Cao
Divya Sharma
Sridhar Krishnan
Venkat Bhat
ELM, LRM
353
5
0
02 May 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM, CLL, RALM
317
3
0
30 Apr 2025
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
406
10
0
29 Apr 2025
Strong Memory, Weak Control: An Empirical Study of Executive Functioning in LLMs
Karin de Langis
J. Park
Bin Hu
Khanh Chi Le
Andreas Schramm
Michael C. Mensink
Andrew Elfenbein
Dongyeop Kang
356
3
0
03 Apr 2025
The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas
Giovanni Franco Gabriel Marraffini
Andrés Cotton
Noe Fabian Hsueh
Axel Fridman
Juan Wisznia
Luciano Del Corro
186
6
0
25 Mar 2025
Levels of Analysis for Large Language Models
Alexander Ku
Declan Campbell
Xuechunzi Bai
Jiayi Geng
Ryan Liu
...
Ilia Sucholutsky
Veniamin Veselovsky
Liyi Zhang
Jian-Qiao Zhu
Thomas L. Griffiths
ELM
365
5
0
17 Mar 2025
LLM Agents Display Human Biases but Exhibit Distinct Learning Patterns
Idan Horowitz
Ori Plonsky
266
2
0
13 Mar 2025
On Benchmarking Human-Like Intelligence in Machines
Lance Ying
Katherine M. Collins
L. Wong
Ilia Sucholutsky
Ryan Liu
Adrian Weller
Tianmin Shu
Thomas Griffiths
Joshua B. Tenenbaum
ALM, ELM
912
19
0
27 Feb 2025
Human Cognitive Benchmarks Reveal Foundational Visual Gaps in MLLMs
Zhenyu Zhao
Dasen Dai
Jen-Yuan Huang
Youliang Yuan
Xiaoyuan Liu
Wenxuan Wang
Wenxiang Jiao
Pinjia He
Zhaopeng Tu
Haodong Duan
LRM
438
2
0
23 Feb 2025
Paradigms of AI Evaluation: Mapping Goals, Methodologies and Culture
International Joint Conference on Artificial Intelligence (IJCAI), 2024
John Burden
Marko Tesic
Lorenzo Pacchiardi
José Hernández-Orallo
308
7
0
21 Feb 2025
The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories
Raj Sanjay Shah
Sashank Varma
LRM
413
2
0
22 Jan 2025
Meaningless is better: hashing bias-inducing words in LLM prompts improves performance in logical reasoning and statistical learning
Milena Chadimová
Eduard Jurášek
Tomáš Kliegr
437
0
0
26 Nov 2024
VideoCogQA: A Controllable Benchmark for Evaluating Cognitive Abilities in Video-Language Models
Chenglin Li
Qianglong Chen
Zhi Li
Feng Tao
Yin Zhang
418
0
0
14 Nov 2024
Game-theoretic LLM: Agent Workflow for Negotiation Games
Qingfeng Lan
Ollie Liu
Jinkui Chi
Alfonso Amayuelas
Julie Chen
...
Lizhou Fan
Fei Sun
William Yang Wang
Xinze Wang
Zelong Li
368
47
0
08 Nov 2024
Can LLMs make trade-offs involving stipulated pain and pleasure states?
Geoff Keeling
Winnie Street
Martyna Stachaczyk
Daria Zakharova
Iulia M. Comsa
Anastasiya Sakovych
Isabella Logothesis
Zejia Zhang
Blaise Agüera y Arcas
Jonathan Birch
218
11
0
01 Nov 2024
Large Language Model Benchmarks in Medical Tasks
Lawrence K. Q. Yan
Ming Li
Yujiao Shi
Cheng Fei
...
Junyu Liu
Xinyuan Song
Riyang Bao
Zekun Jiang
Ziyuan Qin
LM&MA, AI4MH
695
19
0
28 Oct 2024
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse
Ryan Liu
Jiayi Geng
Addison J. Wu
Ilia Sucholutsky
Tania Lombrozo
Thomas Griffiths
ReLM, LRM
429
87
0
27 Oct 2024
TeachTune: Reviewing Pedagogical Agents Against Diverse Student Profiles with Simulated Students
International Conference on Human Factors in Computing Systems (CHI), 2024
Hyoungwook Jin
Minju Yoo
Jeongeon Park
Yokyung Lee
Xu Wang
Juho Kim
ELM
340
25
0
05 Oct 2024
How Does Code Pretraining Affect Language Model Task Performance?
Jackson Petty
Sjoerd van Steenkiste
Tal Linzen
367
17
0
06 Sep 2024
Large Language Models and Cognitive Science: A Comprehensive Review of Similarities, Differences, and Challenges
Qian Niu
Junyu Liu
Ziqian Bi
Pohsun Feng
Benji Peng
...
Ming Li
Lawrence KQ Yan
Yichao Zhang
Caitlyn Heqi Yin
Cheng Fei
404
47
0
04 Sep 2024
Evaluating AI Evaluation: Perils and Prospects
John Burden
ELM
220
13
0
12 Jul 2024
Large Language Model Recall Uncertainty is Modulated by the Fan Effect
Jesse Roberts
Kyle Moore
Thao Pham
Oseremhen Ewaleifoh
Doug Fisher
302
6
0
08 Jul 2024
Large Language Models Assume People are More Rational than We Really are
Ryan Liu
Jiayi Geng
Joshua C. Peterson
Ilia Sucholutsky
Thomas Griffiths
508
35
0
24 Jun 2024
M3GIA: A Cognition Inspired Multilingual and Multimodal General Intelligence Ability Benchmark
Wei Song
Yadong Li
Jianhua Xu
Guowei Wu
Lingfeng Ming
...
Weihua Luo
Houyi Li
Yi Du
Fangda Guo
Kaicheng Yu
ELM, LRM
272
12
0
08 Jun 2024
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice
Jian-Qiao Zhu
Haijiang Yan
Thomas Griffiths
300
8
0
29 May 2024
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning
Phakphum Artkaew
LRM
171
0
0
28 May 2024
Large Language Models are Biased Reinforcement Learners
William M. Hayes
Nicolas Yax
Stefano Palminteri
OffRL
202
3
0
19 May 2024
Can large language models explore in-context?
Neural Information Processing Systems (NeurIPS), 2024
Akshay Krishnamurthy
Keegan Harris
Dylan J. Foster
Cyril Zhang
Aleksandrs Slivkins
LM&Ro, LLMAG, LRM
586
53
0
22 Mar 2024