ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Auxiliary task demands mask the capabilities of smaller language models

3 April 2024
Jennifer Hu
Michael C. Frank
    ELM

Papers citing "Auxiliary task demands mask the capabilities of smaller language models"

24 / 24 papers shown
A Survey on Collaborative Mechanisms Between Large and Small Language Models
Yi Chen
JiaHao Zhao
HaoHao Han
28
0
0
12 May 2025
Do Large Language Models know who did what to whom?
Joseph M. Denning
Xiaohan
Bryor Snefjella
Idan A. Blank
50
1
0
23 Apr 2025
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler A. Chang
Benjamin Bergen
41
0
0
21 Apr 2025
Linking forward-pass dynamics in Transformers and real-time human processing
Jennifer Hu
Michael A. Lepori
Michael Franke
AI4CE
65
0
0
18 Apr 2025
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks
Pavel Chizhov
Mattia Nee
Pierre-Carl Langlais
Ivan P. Yamshchikov
ReLM
ELM
LRM
39
1
0
10 Apr 2025
Not All Data Are Unlearned Equally
Aravind Krishnan
Siva Reddy
Marius Mosbach
MU
65
0
0
07 Apr 2025
Language Models Fail to Introspect About Their Knowledge of Language
Siyuan Song
Jennifer Hu
Kyle Mahowald
LRM
KELM
HILM
ELM
79
2
0
10 Mar 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
40
0
0
28 Feb 2025
Distributional Scaling Laws for Emergent Capabilities
Rosie Zhao
Tian Qin
David Alvarez-Melis
Sham Kakade
Naomi Saphra
LRM
37
0
0
24 Feb 2025
The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories
Raj Sanjay Shah
Sashank Varma
LRM
89
0
0
22 Jan 2025
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
Sonia K. Murthy
Tomer Ullman
Jennifer Hu
ALM
41
10
0
07 Nov 2024
Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers
Lorenzo Pacchiardi
Marko Tesic
Lucy G. Cheke
José Hernández Orallo
31
3
0
15 Oct 2024
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case
Vagrant Gautam
Julius Steuer
Eileen Bingert
Ray Johns
Anne Lauscher
Dietrich Klakow
46
3
0
09 Sep 2024
Anthropocentric bias and the possibility of artificial cognition
Raphael Milliere
Charles Rathkopf
29
1
0
04 Jul 2024
Social Bias Evaluation for Large Language Models Requires Prompt Variations
Rem Hida
Masahiro Kaneko
Naoaki Okazaki
38
13
0
03 Jul 2024
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages
Andrew M. Bean
Simi Hellsten
Harry Mayne
Jabez Magomere
Ethan A. Chi
Ryan A. Chi
Scott A. Hale
Hannah Rose Kirk
ELM
LRM
34
6
0
10 Jun 2024
Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions
Polina Tsvilodub
Paul Marty
Sonia Ramotowska
Jacopo Romoli
Michael Franke
19
0
0
09 May 2024
Evidence from counterfactual tasks supports emergent analogical reasoning in large language models
Taylor W. Webb
K. Holyoak
Hongjing Lu
LRM
ELM
27
4
0
14 Apr 2024
Robust Pronoun Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased?
Vagrant Gautam
Eileen Bingert
D. Zhu
Anne Lauscher
Dietrich Klakow
38
8
0
04 Apr 2024
Language models align with human judgments on key grammatical constructions
Jennifer Hu
Kyle Mahowald
G. Lupyan
Anna A. Ivanova
Roger Levy
30
22
0
19 Jan 2024
Can language models handle recursively nested grammatical structures? A case study on comparing models and humans
Andrew Kyle Lampinen
ReLM
ELM
25
36
0
27 Oct 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
216
188
0
03 May 2020