Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little

14 April 2021

Robin Jia

Douwe Kiela

Papers citing "Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little"

50 / 165 papers shown

Title
Do Large Language Models know who did what to whom? Joseph M. Denning Xiaohan Bryor Snefjella Idan A. Blank 50 1 0 23 Apr 2025
Linguistic Interpretability of Transformer-based Language Models: a systematic review Miguel López-Otal Jorge Gracia Jordi Bernad Carlos Bobed Lucía Pitarch-Ballesteros Emma Anglés-Herrero VLM 36 0 0 09 Apr 2025
PAD: Towards Efficient Data Generation for Transfer Learning Using Phrase Alignment Jong Myoung Kim Young-Jun Lee Ho-Jin Choi Sangkeun Jung 58 0 0 24 Mar 2025
A Survey on Federated Fine-tuning of Large Language Models Yebo Wu Chunlin Tian Jingguang Li He Sun Kahou Tam Li Li Chengzhong Xu FedML 81 0 0 15 Mar 2025
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia Chenxi Wang Tianle Gu Zhongyu Wei Lang Gao Zirui Song Xiuying Chen OffRL 56 2 0 03 Mar 2025
Comply: Learning Sentences with Complex Weights inspired by Fruit Fly Olfaction Alexei Figueroa Justus Westerhoff Golzar Atefi Dennis Fast B. Winter Felix Alexander Gers Alexander Loser Wolfgang Nejdl 52 0 0 03 Feb 2025
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding Jiajun Zhu Peihao Wang Ruisi Cai Jason D. Lee Pan Li Z. Wang KELM 36 1 0 03 Jan 2025
What makes a good metric? Evaluating automatic metrics for text-to-image consistency Candace Ross Melissa Hall Adriana Romero Soriano Adina Williams 90 3 0 18 Dec 2024
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers MohammadReza Ebrahimi Sunny Panchal Roland Memisevic 33 5 0 10 Aug 2024
From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks? Tao Feng Lizhen Qu Niket Tandon Zhuang Li Xiaoxi Kang Gholamreza Haffari LRM 29 4 0 29 Jul 2024
Continual Learning for Temporal-Sensitive Question Answering Wanqi Yang Yunqiu Xu Yanda Li Kunze Wang Binbin Huang Ling-Hao Chen CLL 27 3 0 17 Jul 2024
Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers Jong Myoung Kim Young-Jun Lee Yong-jin Han Sangkeun Jung Ho-Jin Choi 32 2 0 12 Jul 2024
Black Big Boxes: Do Language Models Hide a Theory of Adjective Order? Jaap Jumelet Lisa Bylinina Willem H. Zuidema Jakub Szymanik 59 4 0 02 Jul 2024
Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP Ayush Ranjan Daniel Wen Karthik Bhat 24 0 0 30 Jun 2024
Changing Answer Order Can Decrease MMLU Accuracy Vipul Gupta David Pantoja Candace Ross Adina Williams Megan Ung 50 22 0 27 Jun 2024
Fairness and Bias in Multimodal AI: A Survey Tosin P. Adewumi Lama Alkhaled Namrata Gurung G. V. Boven Irene Pagliai 48 9 0 27 Jun 2024
Exploring the Impact of a Transformer's Latent Space Geometry on Downstream Task Performance Anna C. Marbut John W. Chandler Travis J. Wheeler 27 0 0 18 Jun 2024
Bag of Lies: Robustness in Continuous Pre-training BERT I. Gevers Walter Daelemans 36 0 0 14 Jun 2024
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages Trinh Pham Khoi M. Le Luu Anh Tuan 34 1 0 14 Jun 2024
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More O. Kitouni Niklas Nolte Diane Bouchacourt Adina Williams Mike Rabbat Mark Ibrahim LRM CLL 46 12 0 07 Jun 2024
Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation Kaize Shi Xueyao Sun Qing Li Guandong Xu 43 12 0 06 May 2024
T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining Yiitan Yuan Zhuo Chen Xubo Liu Haohe Liu Xuenan Xu Dongya Jia Yuanzhe Chen Mark D. Plumbley Wenwu Wang CLIP VLM 40 9 0 27 Apr 2024
Open-Source AI-based SE Tools: Opportunities and Challenges of Collaborative Software Learning Zhihao Lin Wei Ma Tao Lin Yaowen Zheng Jingquan Ge Jun Wang Jacques Klein Tegawende F. Bissyande Yang Liu Li Li VLM 30 4 0 09 Apr 2024
A Morphology-Based Investigation of Positional Encodings Poulami Ghosh Shikhar Vashishth Raj Dabre Pushpak Bhattacharyya 24 1 0 06 Apr 2024
Language Models Learn Rare Phenomena from Less Rare Phenomena: The Case of the Missing AANNs Kanishka Misra Kyle Mahowald 27 22 0 28 Mar 2024
Reverse Training to Nurse the Reversal Curse O. Yu. Golovneva Zeyuan Allen-Zhu Jason Weston Sainbayar Sukhbaatar 26 32 0 20 Mar 2024
Word Order's Impacts: Insights from Reordering and Generation Analysis Qinghua Zhao Jiaang Li Lei Li Zenghui Zhou Junfeng Liu 27 0 0 18 Mar 2024
Topic Aware Probing: From Sentence Length Prediction to Idiom Identification how reliant are Neural Language Models on Topic? Vasudevan Nedumpozhimana John D. Kelleher 29 1 0 04 Mar 2024
Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training Qingyan Guo Rui Wang Junliang Guo Xu Tan Jiang Bian Yujiu Yang LRM 16 5 0 01 Mar 2024
Word Order and World Knowledge Qinghua Zhao Vinit Ravishankar Nicolas Garneau Anders Søgaard 16 0 0 01 Mar 2024
Pointing out the Shortcomings of Relation Extraction Models with Semantically Motivated Adversarials Gennaro Nolano Moritz Blum Basil Ell Philipp Cimiano 22 1 0 29 Feb 2024
When does word order matter and when doesn't it? Xuanda Chen T. O'Donnell Siva Reddy 25 0 0 29 Feb 2024
Semantics of Multiword Expressions in Transformer-Based Models: A Survey Filip Miletic Sabine Schulte im Walde 40 6 0 27 Jan 2024
Mission: Impossible Language Models Julie Kallini Isabel Papadimitriou Richard Futrell Kyle Mahowald Christopher Potts ELM LRM 42 19 0 12 Jan 2024
I am a Strange Dataset: Metalinguistic Tests for Language Models Tristan Thrush Jared Moore Miguel Monares Christopher Potts Douwe Kiela 14 5 0 10 Jan 2024
Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation Shan Zhong Zhongzhan Huang Shanghua Gao Wushao Wen Liang Lin Marinka Zitnik Pan Zhou LLMAG LRM 19 35 0 05 Dec 2023
Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text Qi Cao Takeshi Kojima Yutaka Matsuo Yusuke Iwasawa 12 18 0 30 Nov 2023
Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions? Wang Zhu Ishika Singh Yuan Huang Robin Jia Jesse Thomason 31 2 0 28 Nov 2023
BIM: Block-Wise Self-Supervised Learning with Masked Image Modeling Yixuan Luo Mengye Ren Sai Qian Zhang 15 0 0 28 Nov 2023
System 2 Attention (is something you might need too) Jason Weston Sainbayar Sukhbaatar RALM OffRL LRM 22 57 0 20 Nov 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study Maike Zufle Verna Dankers Ivan Titov 25 0 0 16 Nov 2023
Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals Yanai Elazar Bhargavi Paranjape Hao Peng Sarah Wiegreffe Khyathi Raghavi Vivek Srikumar Sameer Singh Noah A. Smith AAML OOD 21 0 0 16 Nov 2023
Multilingual Nonce Dependency Treebanks: Understanding how Language Models represent and process syntactic structure David Arps Laura Kallmeyer Younes Samih Hassan Sajjad 19 1 0 13 Nov 2023
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance? Ahmed Alajrami Katerina Margatina Nikolaos Aletras AAML 19 1 0 26 Oct 2023
On Surgical Fine-tuning for Language Encoders Abhilasha Lodha Gayatri Belapurkar Saloni Chalkapurkar Yuanming Tao Reshmi Ghosh Samyadeep Basu Dmitrii Petrov Soundararajan Srinivasan 14 3 0 25 Oct 2023
The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining Ting-Rui Chiang Dani Yogatama 25 1 0 25 Oct 2023
Unnatural language processing: How do language models handle machine-generated prompts? Corentin Kervadec Francesca Franzon Marco Baroni 23 5 0 24 Oct 2023
Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary Myeongjun Jang Thomas Lukasiewicz 14 4 0 24 Oct 2023
The Locality and Symmetry of Positional Encodings Lihu Chen Gaël Varoquaux Fabian M. Suchanek 23 0 0 19 Oct 2023
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models Sreyan Ghosh Ashish Seth Sonal Kumar Utkarsh Tyagi Chandra Kiran Reddy Evuru S. Ramaneswaran S. Sakshi Oriol Nieto R. Duraiswami Dinesh Manocha AuLLM VLM CoGe 35 21 0 12 Oct 2023