Neural Network Acceptability Judgments

31 May 2018

Alex Warstadt

Amanpreet Singh

Samuel R. Bowman

Papers citing "Neural Network Acceptability Judgments"

50 / 950 papers shown

Variational Learning is Effective for Large Deep Networks

...

Mohammad Emtiyaz Khan

Thomas Möllenhoff

310

27 Feb 2024

Sinkhorn Distance Minimization for Knowledge Distillation

Yulei Qin

203

27 Feb 2024

MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning

478

27 Feb 2024

Layer-wise Regularized Dropout for Neural Language Models

Shiwen Ni

Min Yang

Ruifeng Xu

Chengming Li

Xiping Hu

123

26 Feb 2024

LoRA Meets Dropout under a Unified Framework

Lingpeng Kong

Chuan Wu

329

25 Feb 2024

Towards Efficient Active Learning in NLP via Pretrained Representations

Artem Vysogorets

Achintya Gopal

144

23 Feb 2024

Advancing Parameter Efficiency in Fine-tuning via Representation Editing

Xuanjing Huang

428

23 Feb 2024

PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning

Jianling Sun

254

23 Feb 2024

Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond

Yulan He

296

22 Feb 2024

Beyond Simple Averaging: Improving NLP Ensemble Performance with Topological-Data-Analysis-Based Weighting

P. Proskura

Alexey Zaytsev

290

22 Feb 2024

Improving Language Understanding from Screenshots

201

21 Feb 2024

On Sensitivity of Learning with Limited Labelled Data to the Effects of Randomness: Impact of Interactions and Systematic Choices

Branislav Pecher

Ivan Srba

Maria Bielikova

255

20 Feb 2024

HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts

457

20 Feb 2024

Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance

336

20 Feb 2024

In-Context Learning Demonstration Selection via Influence Analysis

Vinay M.S.

Minh-Hao Van

Xintao Wu

295

19 Feb 2024

Induced Model Matching: Restricted Models Help Train Full-Featured Models

Usama Muneeb

Mesrob I. Ohannessian

111

19 Feb 2024

LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

195

18 Feb 2024

Contrastive Instruction Tuning

270

17 Feb 2024

Uncertainty Quantification for In-Context Learning of Large Language Models

...

283

15 Feb 2024

Reusing Softmax Hardware Unit for GELU Computation in Transformers

C. Peltekis

K. Alexandridis

G. Dimitrakopoulos

123

15 Feb 2024

JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models

240

13 Feb 2024

Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning

220

13 Feb 2024

Should I try multiple optimizers when fine-tuning pre-trained Transformers for NLP tasks? Should I tune their hyperparameters?

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024

Nefeli Gkouti

Prodromos Malakasiotis

Stavros Toumpis

Ion Androutsopoulos

193

10 Feb 2024

A Unified Causal View of Instruction Tuning

175

09 Feb 2024

Learn To be Efficient: Build Structured Sparsity in Large Language Models

Beidi Chen

283

09 Feb 2024

SoftEDA: Rethinking Rule-Based Data Augmentation with Soft Labels

Juhwan Choi

100

08 Feb 2024

AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes

Juhwan Choi

218

08 Feb 2024

The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry

213

06 Feb 2024

Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes

246

02 Feb 2024

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing

341

30 Jan 2024

A Survey on Data Augmentation in Large Model Era

480

27 Jan 2024

HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Shi Feng

294

26 Jan 2024

Instructional Fingerprinting of Large Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Pang Wei Koh

278

21 Jan 2024

Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024

Aly M. Kassem

Sherif Saad

AAML

299

21 Jan 2024

Quantum Transfer Learning for Acceptability Judgements

Quantum Machine Intelligence (QMI), 2024

243

15 Jan 2024

Model Editing at Scale leads to Gradual and Catastrophic Forgetting

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Akshat Gupta

Anurag Rao

Gopala Anumanchipalli

KELM CLL

212

15 Jan 2024

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

198

13 Jan 2024

The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Model Performance

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

A. Salinas

Fred Morstatter

356

08 Jan 2024

MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining

Neural Information Processing Systems (NeurIPS), 2023

311

29 Dec 2023

Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression

Luis Balderas

Miguel Lastra

José M. Benítez

123

17 Dec 2023

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

Dirk Groeneveld

Anas Awadalla

Iz Beltagy

Akshita Bhagia

Ian H. Magnusson

Hao Peng

Oyvind Tafjord

Pete Walsh

Kyle Richardson

Jesse Dodge

265

15 Dec 2023

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

International Conference on Machine Learning (ICML), 2023

...

349

386

14 Dec 2023

GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction

197

12 Dec 2023

Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks

Mohammad-Javad Davari

Eugene Belilovsky

MoMe

262

11 Dec 2023

GTA: Gated Toxicity Avoidance for LM Performance Preservation

Heegyu Kim

Hyunsouk Cho

163

11 Dec 2023

Beyond Gradient and Priors in Privacy Attacks: Leveraging Pooler Layer Inputs of Language Models in Federated Learning

Jianwei Li

Sheng Liu

Qi Lei

PILM SILM AAML

255

10 Dec 2023

Graph Convolutions Enrich the Self-Attention in Transformers!

Jeongwhan Choi

399

07 Dec 2023

LayerCollapse: Adaptive compression of neural networks

Soheil Zibakhsh Shabgahi

Mohammad Soheil Shariff

F. Koushanfar

AI4CE

208

29 Nov 2023

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

International Joint Conference on Natural Language Processing (IJCNLP), 2023

334

23 Nov 2023

Sparse Low-rank Adaptation of Pre-trained Language Models

Zhiyuan Liu

Maosong Sun

312

20 Nov 2023