From Form(s) to Meaning: Probing the Semantic Depths of Language Models
Using Multisense Consistency

From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency

18 April 2024

Papers citing "From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency"

10 / 10 papers shown

Title
Consistency in Language Models: Current Landscape, Challenges, and Future Directions Jekaterina Novikova Carol Anderson Borhane Blili-Hamelin Subhabrata Majumdar HILM 69 0 0 01 May 2025
MultiLoKo: a multilingual local knowledge benchmark for LLMs spanning 31 languages Dieuwke Hupkes Nikolay Bogoychev 46 0 0 14 Apr 2025
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges Aman Singh Thakur Kartik Choudhary Venkat Srinik Ramayapally Sankaran Vaidyanathan Dieuwke Hupkes ELM ALM 45 55 0 18 Jun 2024
Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities Suhas Arehalli Brian Dillon Tal Linzen 23 35 0 21 Oct 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 301 11,730 0 04 Mar 2022
BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief Nora Kassner Oyvind Tafjord Hinrich Schütze Peter Clark KELM LRM 220 64 0 29 Sep 2021
Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention S. Ryu Richard L. Lewis 31 25 0 26 Apr 2021
Measuring and Improving Consistency in Pretrained Language Models Yanai Elazar Nora Kassner Shauli Ravfogel Abhilasha Ravichander Eduard H. Hovy Hinrich Schütze Yoav Goldberg HILM 255 343 0 01 Feb 2021
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets Mor Geva Yoav Goldberg Jonathan Berant 232 319 0 21 Aug 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 294 6,927 0 20 Apr 2018