Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

4 February 2019
R. Thomas McCoy
Ellie Pavlick
Tal Linzen

Papers citing "Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference"

50 / 863 papers shown
Efficient PRM Training Data Synthesis via Formal Verification
Ryo Kamoi
Yusen Zhang
Nan Zhang
Sarkar Snigdha Sarathi Das
Rui Zhang
Wenpeng Yin
Rui Zhang
LRM
362
2
0
10 Apr 2026
Different types of syntactic agreement recruit the same units within large language models
Daria Kryvosheieva
Andrea de Varda
Evelina Fedorenko
Greta Tuckute
162
1
0
03 Dec 2025
Auxiliary Metrics Help Decoding Skill Neurons in the Wild
Yixiu Zhao
Xiaozhi Wang
Zijun Yao
Lei Hou
Juanzi Li
408
0
0
26 Nov 2025
BengaliFig: A Low-Resource Challenge for Figurative and Culturally Grounded Reasoning in Bengali
Abdullah Al Sefat
222
1
0
25 Nov 2025
MapFormer: Self-Supervised Learning of Cognitive Maps with Input-Dependent Positional Embeddings
Victor Rambaud
Salvador Mascarenhas
Yair Lakretz
193
0
0
24 Nov 2025
Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Translation
Marii Ojastu
Hele-Andra Kuulmets
Aleksei Dorkin
Marika Borovikova
Dage Särg
Kairit Sirts
222
0
0
21 Nov 2025
Don't Learn, Ground: A Case for Natural Language Inference with Visual Grounding
Daniil Ignatev
Ayman Santeer
Albert Gatt
Denis Paperno
191
0
0
21 Nov 2025
Cognitive Foundations for Reasoning and Their Manifestation in LLMs
Priyanka Kargupta
Shuyue Stella Li
Haocheng Wang
Jinu Lee
Shan Chen
...
Thomas L. Griffiths
Max Kleiman-Weiner
Jiawei Han
Asli Celikyilmaz
Yulia Tsvetkov
LRM
264
8
0
20 Nov 2025
Analyzing and Mitigating Negation Artifacts using Data Augmentation for Improving ELECTRA-Small Model Accuracy
Mojtaba Noghabaei
111
0
0
09 Nov 2025
Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation
Zhiwei Zhang
Xiaomin Li
Yudi Lin
Hui Liu
Ramraj Chandradevan
...
Minhua Lin
Fali Wang
Xianfeng Tang
Qi He
Suhang Wang
LLMAG, LRM
298
6
0
04 Nov 2025
Measuring what Matters: Construct Validity in Large Language Model Benchmarks
Andrew M. Bean
Ryan Kearns
Angelika Romanou
Franziska Sofia Hafner
Harry Mayne
...
Christopher Summerfield
Philip Torr
Cozmin Ududec
Luc Rocher
Adam Mahdi
ALM
586
32
0
03 Nov 2025
LingGym: How Far Are LLMs from Thinking Like Field Linguists?
Changbing Yang
Franklin Ma
Freda Shi
Jian Zhu
ReLM, LRM
327
4
0
01 Nov 2025
Do Students Debias Like Teachers? On the Distillability of Bias Mitigation Methods
Jiali Cheng
Chirag Agarwal
Hadi Amiri
156
1
0
30 Oct 2025
MERGE: Minimal Expression-Replacement GEneralization Test for Natural Language Inference
Mădălina Zgreabăn
Tejaswini Deoskar
Lasha Abzianidze
184
0
0
28 Oct 2025
StreetMath: Study of LLMs' Approximation Behaviors
Chiung-Yi Tseng
Somshubhra Roy
Maisha Thasin
Danyang Zhang
Blessing Effiong
LRM
177
1
0
27 Oct 2025
Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data
Qilin Ye
Deqing Fu
Robin Jia
Vatsal Sharan
217
0
0
22 Oct 2025
LLM-Augmented Symbolic NLU System for More Reliable Continuous Causal Statement Interpretation
Xin Lian
Kenneth D. Forbus
193
0
0
22 Oct 2025
Moneyball with LLMs: Analyzing Tabular Summarization in Sports Narratives
Ritam Upadhyay
Naman Ahuja
Rishabh Baral
Aparna Garimella
Vivek Gupta
LMTD
229
0
0
20 Oct 2025
Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain
Jingmin An
Yilong Song
Ruolin Yang
Nai Ding
Lingxi Lu
Yuxuan Wang
Wei Wang
Chu Zhuang
Q. Wang
Fang Fang
210
2
0
15 Oct 2025
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models
Gagan Bhatia
Somayajulu G Sripada
Kevin Allan
Jacobo Azcona
HILM, LRM
328
3
0
07 Oct 2025
Reward Models are Metrics in a Trench Coat
Sebastian Gehrmann
189
0
0
03 Oct 2025
Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
Chantal Shaib
Vinith Suriyakumar
Levent Sagun
Byron C. Wallace
Elisa Kreiss
LRM
230
3
0
25 Sep 2025
GRPO++: Enhancing Dermatological Reasoning under Low Resource Settings
Ismam Nur Swapnil
Aranya Saha
Tanvir Ahmed Khan
Mohammad Ariful Haque
134
0
0
23 Sep 2025
Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
Dongjun Kim
Gyuho Shim
YongChan Chun
Minhyuk Kim
Chanjun Park
Heuiseok Lim
186
2
0
23 Sep 2025
Extractive Fact Decomposition for Interpretable Natural Language Inference in one Forward Pass
Nicholas Popovic
Michael Färber
145
1
0
23 Sep 2025
The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies
Jiaxu Zhou
Jen-tse Huang
Xuhui Zhou
Man Ho Lam
Xintao Wang
Hao Zhu
Wenxuan Wang
Maarten Sap
ALM
242
6
0
22 Sep 2025
Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations
Linyang He
Qiaolin Wang
Xilin Jiang
Nima Mesgarani
303
4
0
19 Sep 2025
Can Large Language Models Robustly Perform Natural Language Inference for Japanese Comparatives?
Yosuke Mikami
Daiki Matsuoka
Hitomi Yanaka
ELM
115
0
0
17 Sep 2025
Do Natural Language Descriptions of Model Activations Convey Privileged Information?
Millicent Li
Alberto Mario Ceballos Arroyo
Giordano Rogers
Naomi Saphra
Byron C. Wallace
260
4
0
16 Sep 2025
MORABLES: A Benchmark for Assessing Abstract Moral Reasoning in LLMs with Fables
Matteo Marcuzzo
A. Zangari
A. Albarelli
Jose Camacho-Collados
Mohammad Taher Pilehvar
264
6
0
15 Sep 2025
Compartmentalised Agentic Reasoning for Clinical NLI
Mael Jullien
Lei Xu
Marco Valentino
André Freitas
LRM
195
0
0
12 Sep 2025
On Aligning Prediction Models with Clinical Experiential Learning: A Prostate Cancer Case Study
Jacqueline Jil Vallon
William Overman
Wanqiao Xu
Neil Panjwani
Xi Ling
...
Geoffrey Sonn
Sandy Srinivas
E. Pollom
Mark K. Buyyounouski
Mohsen Bayati
206
1
0
04 Sep 2025
Can Out-of-Distribution Evaluations Uncover Reliance on Shortcuts? A Case Study in Question Answering
Michal Štefánik
Timothee Mickus
Marek Kadlcík
Michal Spiegel
Josef Kuchař
157
0
0
25 Aug 2025
Natural Language Satisfiability: Exploring the Problem Distribution and Evaluating Transformer-based Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Tharindu Madusanka
Ian Pratt-Hartmann
Riza Batista-Navarro
LRM
135
4
0
23 Aug 2025
LLMs Struggle with NLI for Perfect Aspect: A Cross-Linguistic Study in Chinese and Japanese
Jie Lu
Du Jin
Hitomi Yanaka
100
0
0
16 Aug 2025
Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics
Carter Blum
Katja Filipova
Ann Yuan
Asma Ghandeharioun
Julian Zimmert
...
Jessica Hoffmann
Tal Linzen
Martin Wattenberg
Lucas Dixon
Mor Geva
273
2
0
14 Aug 2025
What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
Keyon Vafa
Peter G. Chang
Ashesh Rambachan
S. Mullainathan
772
28
0
09 Jul 2025
Discourse Heuristics For Paradoxically Moral Self-Correction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Guangliang Liu
Zimo Qi
Xitong Zhang
K. Johnson
LRM
220
4
0
01 Jul 2025
Model Guidance via Robust Feature Attribution
Mihnea Ghitu
Vihari Piratla
Matthew Wicker
AAML
264
0
0
24 Jun 2025
CC-LEARN: Cohort-based Consistency Learning
Xiao Ye
Shaswat Shrivastava
Zhaonan Li
Jacob Dineen
Shijie Lu
Avneet Ahuja
Ming Shen
Zhikun Xu
Ben Zhou
OffRL, LRM
461
2
0
18 Jun 2025
When Does Meaning Backfire? Investigating the Role of AMRs in NLI
Junghyun Min
Xiulin Yang
Shira Wein
LLMSV
350
2
0
17 Jun 2025
LoRA Users Beware: A Few Spurious Tokens Can Manipulate Your Finetuned Model
Pradyut Sekhsaria
Marcel Mateos Salles
Hai Huang
Randall Balestriero
386
1
0
13 Jun 2025
A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs
Benno Krojer
Mojtaba Komeili
Candace Ross
Q. Garrido
Koustuv Sinha
Nicolas Ballas
Mahmoud Assran
471
11
0
11 Jun 2025
Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
J. Michaelov
Reeka Estacio
Zhien Zhang
Benjamin Bergen
ReLM, LRM
251
2
0
07 Jun 2025
RELIC: Evaluating Compositional Instruction Following via Language Recognition
Jackson Petty
Michael Y. Hu
Wentao Wang
Shauli Ravfogel
William Merrill
Tal Linzen
350
2
0
05 Jun 2025
Exploring Explanations Improves the Robustness of In-Context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ukyo Honda
Tatsushi Oka
LRM
341
0
0
03 Jun 2025
Image Generation from Contextually-Contradictory Prompts
Saar Huberman
Or Patashnik
Omer Dahary
Ron Mokady
Daniel Cohen-Or
DiffM
283
4
0
02 Jun 2025
Spurious Correlations and Beyond: Understanding and Mitigating Shortcut Learning in SDOH Extraction with Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Fardin Ahsan Sakib
Ziwei Zhu
Karen Trister Grace
Meliha Yetisgen
Özlem Uzuner
281
1
0
30 May 2025
Flying Pigs, FaR and Beyond: Evaluating LLM Reasoning in Counterfactual Worlds
Ishwar B Balappanawar
Vamshi Krishna Bonagiri
Anish Joishy
Manas Gaur
K. Thirunarayan
Ponnurangam Kumaraguru
ReLM, LRM
334
0
0
28 May 2025
Research Community Perspectives on "Intelligence" and Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Bertram Højer
Terne Sasha Thorn Jakobsen
Anna Rogers
Stefan Heinrich
229
3
0
27 May 2025