The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain

11 May 2023

Papers citing "The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain"

Boosting Performance on ARC is a Matter of Perspective Daniel Franzen Jan Disselhoff David Hartmann RALM LRM 47 0 0 08 May 2025
Can Large Reasoning Models do Analogical Reasoning under Perceptual Uncertainty? Giacomo Camposampiero Michael Hersche Roger Wattenhofer Abu Sebastian Abbas Rahimi LRM 56 1 0 14 Mar 2025
An Empirical Comparison of Cost Functions in Inductive Logic Programming Céline Hocquette Andrew Cropper 54 0 0 10 Mar 2025
Probing the Capacity of Language Model Agents to Operationalize Disparate Experiential Context Despite Distraction Sonny George Chris Sypherd Dylan Cashman LLMAG 71 0 0 19 Nov 2024
Combining Induction and Transduction for Abstract Reasoning Wen-Ding Li Keya Hu Carter Larsen Yuqing Wu Simon Alford ... Dat Nguyen Wei-Long Zheng Zenna Tavares Yewen Pu Kevin Ellis AI4CE 35 7 0 04 Nov 2024
Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad? Antonia Wüst Tim Nelson Tobiasch Lukas Helff Inga Ibs Wolfgang Stammer Devendra Singh Dhami Constantin Rothkopf Kristian Kersting CoGe ReLM VLM LRM 68 1 0 25 Oct 2024
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models Jiachun Li Pengfei Cao Zhuoran Jin Yubo Chen Kang-Jun Liu Jun Zhao LRM ELM 37 3 0 12 Oct 2024
Mars: Situated Inductive Reasoning in an Open-World Environment Xiaojuan Tang Jiaqi Li Yitao Liang Song-chun Zhu Muhan Zhang Zilong Zheng LM&Ro LRM LLMAG 29 1 0 10 Oct 2024
System 2 Reasoning via Generality and Adaptation Sejin Kim Sundong Kim LRM AI4CE 73 0 0 10 Oct 2024
Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects Wenhao Li Yudong Xu Scott Sanner Elias Boutros Khalil ViT 36 3 0 08 Oct 2024
Addressing and Visualizing Misalignments in Human Task-Solving Trajectories Sejin Kim Hosung Lee Sundong Kim 33 0 0 21 Sep 2024
H-ARC: A Robust Estimate of Human Performance on the Abstraction and Reasoning Corpus Benchmark Solim LeGris Wai Keen Vong Brenden Lake Todd M. Gureckis LRM 40 8 0 02 Sep 2024
Defining and Evaluating Decision and Composite Risk in Language Models Applied to Natural Language Inference Ke Shen Mayank Kejriwal 37 0 0 04 Aug 2024
ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning Hosung Lee Sejin Kim Seungpil Lee Sanha Hwang Jihwan Lee Byung-Jun Lee Sundong Kim LRM 37 8 0 30 Jul 2024
Intelligence Analysis of Language Models Liane Galanti Ethan Baron LRM 37 1 0 20 Jul 2024
Autonomous Prompt Engineering in Large Language Models Daan Kepel Konstantina Valogianni LLMAG 48 6 0 25 Jun 2024
A-I-RAVEN and I-RAVEN-Mesh: Two New Benchmarks for Abstract Visual Reasoning Mikołaj Małkiński Jacek Mańdziuk 36 0 0 16 Jun 2024
What is the Visual Cognition Gap between Humans and Multimodal LLMs? Xu Cao Bolin Lai Wenqian Ye Yunsheng Ma Joerg Heintz Jintai Chen Jianguo Cao James M. Rehg 45 8 0 14 Jun 2024
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models Marianna Nezhurina Lucia Cipolina-Kun Mehdi Cherti J. Jitsev LLMAG LRM ELM ReLM 58 25 0 04 Jun 2024
Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience Martina G. Vilas Federico Adolfi David Poeppel Gemma Roig 48 5 0 03 Jun 2024
What is it for a Machine Learning Model to Have a Capability? Jacqueline Harding Nathaniel Sharadin ELM 38 3 0 14 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs Jordan Dotzel Yuzong Chen Bahaa Kotb Sushma Prasad Gang Wu Sheng Li Mohamed S. Abdelfattah Zhiru Zhang 31 8 0 06 May 2024
MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning Yifan Jiang Jiarui Zhang Kexuan Sun Zhivar Sourati Kian Ahrabian Kaixin Ma Filip Ilievski Jay Pujara LRM 37 11 0 21 Apr 2024
PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns Yew Ken Chia Vernon Toh Yan Han Deepanway Ghosal Lidong Bing Soujanya Poria LRM ReLM 41 13 0 20 Mar 2024
Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus Seungpil Lee Woochang Sim Donghyeon Shin Sanha Hwang Wongyu Seo Jiwon Park Seokki Lee Sejin Kim Sundong Kim LRM 42 19 0 18 Mar 2024
Do Large Language Models Solve ARC Visual Analogies Like People Do? Gustaw Opielka Hannes Rosenbusch Veerle Vijverberg Claire E. Stevenson LRM 32 6 0 13 Mar 2024
Limits of Transformer Language Models on Learning to Compose Algorithms Jonathan Thomm Aleksandar Terzić Giacomo Camposampiero Michael Hersche Bernhard Schölkopf Abbas Rahimi 39 3 0 08 Feb 2024
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Natasha Butt Blazej Manczak Auke Wiggers Corrado Rainone David W. Zhang Michaël Defferrard Taco S. Cohen ReLM LRM 46 17 0 07 Feb 2024
Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language Conversion for Language Models Yile Wang Sijie Cheng Zixin Sun Peng Li Yang Liu ReLM LRM 32 4 0 22 Jan 2024
Generalized Planning for the Abstraction and Reasoning Corpus Chao Lei N. Lipovetzky Krista A. Ehinger LRM AI4CE 53 9 0 15 Jan 2024
Evolving Code with A Large Language Model Erik Hemberg Stephen Moskal Una-May O’Reilly 22 26 0 13 Jan 2024
In Generative AI we Trust: Can Chatbots Effectively Verify Political Information? Elizaveta Kuznetsova M. Makhortykh Victoria Vziatysheva Martha Stolze Ani Baghumyan Aleksandra Urman 22 2 0 20 Dec 2023
Inherent limitations of LLMs regarding spatial information He Yan Xinyao Hu Xiangpeng Wan Chengyu Huang Kai Zou Shiqi Xu LRM 28 2 0 05 Dec 2023
Solving ARC visual analogies with neural embeddings and vector arithmetic: A generalized method Luca H. Thoms Karel A. Veldkamp Hannes Rosenbusch Claire E. Stevenson DML 35 5 0 14 Nov 2023
Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks Melanie Mitchell Alessandro B. Palmarini A. Moskvichev LRM 27 49 0 14 Nov 2023
OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning Rim Assouel Pau Rodríguez Perouz Taslakian David Vazquez Yoshua Bengio LRM OCL 21 0 0 28 Oct 2023
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement Linlu Qiu Liwei Jiang Ximing Lu Melanie Sclar Valentina Pyatkin ... Bailin Wang Yoon Kim Yejin Choi Nouha Dziri Xiang Ren LRM ReLM 43 75 0 12 Oct 2023
ChatGPT v Bard v Bing v Claude 2 v Aria v human-expert. How good are AI chatbots at scientific writing? Edisa Lozić Benjamin Štular 34 29 0 14 Sep 2023
Building Trust in Conversational AI: A Comprehensive Review and Solution Architecture for Explainable, Privacy-Aware Systems using LLMs and Knowledge Graph Ahtsham Zafar V. Parthasarathy Chan Le Van Saad Shahid A. Khan Arsalan Shahid 16 13 0 13 Aug 2023
Large Language Models as General Pattern Machines Suvir Mirchandani F. Xia Peter R. Florence Brian Ichter Danny Driess Montse Gonzalez Arenas Kanishka Rao Dorsa Sadigh Andy Zeng LLMAG 54 184 0 10 Jul 2023
Unraveling the ARC Puzzle: Mimicking Human Solutions with Object-Centric Decision Transformer Jaehyun Park Jaegyun Im Sanha Hwang Mintaek Lim Sabina Ualibekova Sejin Kim Sundong Kim AI4CE 22 12 0 14 Jun 2023
LLMs and the Abstraction and Reasoning Corpus: Successes, Failures, and the Importance of Object-based Representations Yudong Xu Wenhao Li Pashootan Vaezipoor Scott Sanner Elias Boutros Khalil LRM 28 54 0 26 May 2023
Prompting is not a substitute for probability measurements in large language models Jennifer Hu R. Levy 33 38 0 22 May 2023
The Debate Over Understanding in AI's Large Language Models Melanie Mitchell D. Krakauer ELM 74 203 0 14 Oct 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices Mikołaj Małkiński Jacek Mańdziuk 120 41 0 28 Jan 2022
Neural-guided, Bidirectional Program Search for Abstraction and Reasoning Simon Alford Anshul Gandhi Akshay Rangamani Andrzej Banburski Tony Wang Sylee Dandekar John Chin T. Poggio S. Chin LRM 103 21 0 22 Oct 2021