Auxiliary task demands mask the capabilities of smaller language models

3 April 2024

Papers citing "Auxiliary task demands mask the capabilities of smaller language models"


Title
A Survey on Collaborative Mechanisms Between Large and Small Language Models Yi Chen JiaHao Zhao HaoHao Han 28 0 0 12 May 2025
Do Large Language Models know who did what to whom? Joseph M. Denning Xiaohan Bryor Snefjella Idan A. Blank 50 1 0 23 Apr 2025
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models Tyler A. Chang Benjamin Bergen 41 0 0 21 Apr 2025
Linking forward-pass dynamics in Transformers and real-time human processing Jennifer Hu Michael A. Lepori Michael Franke AI4CE 65 0 0 18 Apr 2025
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks Pavel Chizhov Mattia Nee Pierre-Carl Langlais Ivan P. Yamshchikov ReLM ELM LRM 39 1 0 10 Apr 2025
Not All Data Are Unlearned Equally Aravind Krishnan Siva Reddy Marius Mosbach MU 65 0 0 07 Apr 2025
Language Models Fail to Introspect About Their Knowledge of Language Siyuan Song Jennifer Hu Kyle Mahowald LRM KELM HILM ELM 79 2 0 10 Mar 2025
Re-evaluating Theory of Mind evaluation in large language models Jennifer Hu Felix Sosa T. Ullman 40 0 0 28 Feb 2025
Distributional Scaling Laws for Emergent Capabilities Rosie Zhao Tian Qin David Alvarez-Melis Sham Kakade Naomi Saphra LRM 37 0 0 24 Feb 2025
The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories Raj Sanjay Shah Sashank Varma LRM 89 0 0 22 Jan 2025
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity Sonia K. Murthy Tomer Ullman Jennifer Hu ALM 41 10 0 07 Nov 2024
Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers Lorenzo Pacchiardi Marko Tesic Lucy G. Cheke José Hernández-Orallo 31 3 0 15 Oct 2024
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case Vagrant Gautam Julius Steuer Eileen Bingert Ray Johns Anne Lauscher Dietrich Klakow 46 3 0 09 Sep 2024
Anthropocentric bias and the possibility of artificial cognition Raphael Milliere Charles Rathkopf 29 1 0 04 Jul 2024
Social Bias Evaluation for Large Language Models Requires Prompt Variations Rem Hida Masahiro Kaneko Naoaki Okazaki 38 13 0 03 Jul 2024
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages Andrew M. Bean Simi Hellsten Harry Mayne Jabez Magomere Ethan A. Chi Ryan A. Chi Scott A. Hale Hannah Rose Kirk ELM LRM 34 6 0 10 Jun 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions Polina Tsvilodub Paul Marty Sonia Ramotowska Jacopo Romoli Michael Franke 19 0 0 09 May 2024
Evidence from counterfactual tasks supports emergent analogical reasoning in large language models Taylor W. Webb K. Holyoak Hongjing Lu LRM ELM 27 4 0 14 Apr 2024
Robust Pronoun Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased? Vagrant Gautam Eileen Bingert D. Zhu Anne Lauscher Dietrich Klakow 38 8 0 04 Apr 2024
Language models align with human judgments on key grammatical constructions Jennifer Hu Kyle Mahowald G. Lupyan Anna A. Ivanova Roger Levy 30 22 0 19 Jan 2024
Can language models handle recursively nested grammatical structures? A case study on comparing models and humans Andrew Kyle Lampinen ReLM ELM 25 36 0 27 Oct 2022
Large Language Models are Zero-Shot Reasoners Takeshi Kojima S. Gu Machel Reid Yutaka Matsuo Yusuke Iwasawa ReLM LRM 291 4,048 0 24 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 315 8,261 0 28 Jan 2022
How Can We Accelerate Progress Towards Human-like Linguistic Generalization? Tal Linzen 216 188 0 03 May 2020