1805.12471
Cited By
v1
v2
v3 (latest)
Neural Network Acceptability Judgments
31 May 2018
Alex Warstadt
Amanpreet Singh
Samuel R. Bowman
Papers citing "Neural Network Acceptability Judgments"
50 / 950 papers shown
Variational Learning is Effective for Large Deep Networks
Yuesong Shen
Nico Daheim
Bai Cong
Peter Nickl
Gian Maria Marconi
...
Rio Yokota
Iryna Gurevych
Daniel Cremers
Mohammad Emtiyaz Khan
Thomas Möllenhoff
310
43
0
27 Feb 2024
Sinkhorn Distance Minimization for Knowledge Distillation
Xiao Cui
Yulei Qin
Yuting Gao
Enwei Zhang
Zihan Xu
Tong Wu
Ke Li
Xing Sun
Wen-gang Zhou
Houqiang Li
203
19
0
27 Feb 2024
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning
Sudipta Singha Roy
Chengshun Shi
Shiguang Wu
Mengqi Zhang
Zhaochun Ren
Maarten de Rijke
Zhumin Chen
Jiahuan Pei
MoE
478
15
0
27 Feb 2024
Layer-wise Regularized Dropout for Neural Language Models
Shiwen Ni
Min Yang
Ruifeng Xu
Chengming Li
Xiping Hu
123
0
0
26 Feb 2024
LoRA Meets Dropout under a Unified Framework
Sheng Wang
Liheng Chen
Jiyue Jiang
Boyang Xue
Lingpeng Kong
Chuan Wu
329
22
0
25 Feb 2024
Towards Efficient Active Learning in NLP via Pretrained Representations
Artem Vysogorets
Achintya Gopal
144
0
0
23 Feb 2024
Advancing Parameter Efficiency in Fine-tuning via Representation Editing
Muling Wu
Tianlong Li
Xiaohua Wang
Changze Lv
Zixuan Ling
Jianhao Zhu
Cenyuan Zhang
Xiaoqing Zheng
Xuanjing Huang
428
33
0
23 Feb 2024
PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning
Zhisheng Lin
Han Fu
Chenghao Liu
Zhuo Li
Jianling Sun
MoE
MoMe
254
6
0
23 Feb 2024
Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond
Xinyu Wang
Hainiu Xu
Lin Gui
Yulan He
MoMe
AIFin
296
2
0
22 Feb 2024
Beyond Simple Averaging: Improving NLP Ensemble Performance with Topological-Data-Analysis-Based Weighting
P. Proskura
Alexey Zaytsev
290
0
0
22 Feb 2024
Improving Language Understanding from Screenshots
Tianyu Gao
Zirui Wang
Adithya Bhaskar
Danqi Chen
VLM
201
13
0
21 Feb 2024
On Sensitivity of Learning with Limited Labelled Data to the Effects of Randomness: Impact of Interactions and Systematic Choices
Branislav Pecher
Ivan Srba
Maria Bielikova
255
5
0
20 Feb 2024
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
MoE
457
24
0
20 Feb 2024
Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance
Branislav Pecher
Ivan Srba
Maria Bielikova
ALM
336
15
0
20 Feb 2024
In-Context Learning Demonstration Selection via Influence Analysis
Vinay M.S.
Minh-Hao Van
Xintao Wu
295
12
0
19 Feb 2024
Induced Model Matching: Restricted Models Help Train Full-Featured Models
Usama Muneeb
Mesrob I. Ohannessian
111
0
0
19 Feb 2024
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models
Yifan Yang
Jiajun Zhou
Ngai Wong
Zheng Zhang
195
16
0
18 Feb 2024
Contrastive Instruction Tuning
Tianyi Yan
Fei Wang
James Y. Huang
Wenxuan Zhou
Fan Yin
Aram Galstyan
Wenpeng Yin
Muhao Chen
ALM
270
10
0
17 Feb 2024
Uncertainty Quantification for In-Context Learning of Large Language Models
Chen Ling
Xujiang Zhao
Xuchao Zhang
Wei Cheng
Yanchi Liu
...
Katsushi Matsuda
Jie Ji
Guangji Bai
Bo Pan
Haifeng Chen
283
33
0
15 Feb 2024
Reusing Softmax Hardware Unit for GELU Computation in Transformers
C. Peltekis
K. Alexandridis
G. Dimitrakopoulos
123
9
0
15 Feb 2024
JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models
Jillian R. Fisher
Ximing Lu
Jaehun Jung
Liwei Jiang
Zaid Harchaoui
Yejin Choi
240
9
0
13 Feb 2024
Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning
Haeju Lee
Minchan Jeong
SeYoung Yun
Kee-Eung Kim
AAML
VPVLM
220
4
0
13 Feb 2024
Should I try multiple optimizers when fine-tuning pre-trained Transformers for NLP tasks? Should I tune their hyperparameters?
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Nefeli Gkouti
Prodromos Malakasiotis
Stavros Toumpis
Ion Androutsopoulos
193
6
0
10 Feb 2024
A Unified Causal View of Instruction Tuning
Luyao Chen
Wei Huang
Ruqing Zhang
Wei Chen
Jiafeng Guo
Xueqi Cheng
175
1
0
09 Feb 2024
Learn To be Efficient: Build Structured Sparsity in Large Language Models
Haizhong Zheng
Xiaoyan Bai
Xueshen Liu
Z. Morley Mao
Beidi Chen
Fan Lai
Atul Prakash
283
23
0
09 Feb 2024
SoftEDA: Rethinking Rule-Based Data Augmentation with Soft Labels
Juhwan Choi
Kyohoon Jin
Junho Lee
Sang-hyŏn Song
Youngbin Kim
100
6
0
08 Feb 2024
AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes
Juhwan Choi
Kyohoon Jin
Junho Lee
Sangmin Song
Youngbin Kim
218
3
0
08 Feb 2024
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry
Michael Zhang
Kush S. Bhatia
Hermann Kumbong
Christopher Ré
213
84
0
06 Feb 2024
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Yingyi Chen
Qinghua Tao
F. Tonin
Johan A. K. Suykens
246
1
0
02 Feb 2024
SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
Sheng Li
Geng Yuan
Yuezhen Dai
Youtao Zhang
Yanzhi Wang
Xulong Tang
341
26
0
30 Jan 2024
A Survey on Data Augmentation in Large Model Era
Yue Zhou
Chenlu Guo
Xu Wang
Yi-Ju Chang
Yuan Wu
LM&MA
VLM
480
49
0
27 Jan 2024
HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yongkang Liu
Yiqun Zhang
Qian Li
Tong Liu
Shi Feng
Daling Wang
Yifei Zhang
Hinrich Schütze
294
14
0
26 Jan 2024
Instructional Fingerprinting of Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Lyne Tchapmi
Fei Wang
Mingyu Derek Ma
Pang Wei Koh
Chaowei Xiao
Muhao Chen
WaLM
278
57
0
21 Jan 2024
Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Aly M. Kassem
Sherif Saad
AAML
299
3
0
21 Jan 2024
Quantum Transfer Learning for Acceptability Judgements
Quantum Machine Intelligence (QMI), 2024
Giuseppe Buonaiuto
Raffaele Guarasci
Aniello Minutolo
G. De Pietro
M. Esposito
243
16
0
15 Jan 2024
Model Editing at Scale leads to Gradual and Catastrophic Forgetting
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Akshat Gupta
Anurag Rao
Gopala Anumanchipalli
KELM
CLL
212
72
0
15 Jan 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Zhengxin Zhang
Dan Zhao
Xupeng Miao
Qing Li
Yong Jiang
Zhihao Jia
MQ
198
11
0
13 Jan 2024
The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Model Performance
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
A. Salinas
Fred Morstatter
356
85
0
08 Jan 2024
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
Neural Information Processing Systems (NeurIPS), 2023
Jacob P. Portes
Alex Trott
Sam Havens
Daniel King
Abhinav Venigalla
Moin Nadeem
Nikhil Sardana
D. Khudia
Jonathan Frankle
311
32
0
29 Dec 2023
Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression
Luis Balderas
Miguel Lastra
José M. Benítez
123
2
0
17 Dec 2023
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
Dirk Groeneveld
Anas Awadalla
Iz Beltagy
Akshita Bhagia
Ian H. Magnusson
Hao Peng
Oyvind Tafjord
Pete Walsh
Kyle Richardson
Jesse Dodge
265
2
0
15 Dec 2023
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
International Conference on Machine Learning (ICML), 2023
Collin Burns
Pavel Izmailov
Jan Hendrik Kirchner
Bowen Baker
Leo Gao
...
Adrien Ecoffet
Manas Joglekar
Jan Leike
Ilya Sutskever
Jeff Wu
ELM
349
386
0
14 Dec 2023
GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction
Jiacheng Ruan
Jingsheng Gao
Mingye Xie
Suncheng Xiang
Zefang Yu
Ting Liu
Yuzhuo Fu
MoE
197
8
0
12 Dec 2023
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
262
97
0
11 Dec 2023
GTA: Gated Toxicity Avoidance for LM Performance Preservation
Heegyu Kim
Hyunsouk Cho
163
2
0
11 Dec 2023
Beyond Gradient and Priors in Privacy Attacks: Leveraging Pooler Layer Inputs of Language Models in Federated Learning
Jianwei Li
Sheng Liu
Qi Lei
PILM
SILM
AAML
255
4
0
10 Dec 2023
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
399
12
0
07 Dec 2023
LayerCollapse: Adaptive compression of neural networks
Soheil Zibakhsh Shabgahi
Mohammad Soheil Shariff
F. Koushanfar
AI4CE
208
1
0
29 Nov 2023
Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Daryna Dementieva
Daniil Moskovskiy
David Dale
Sergey Petrakov
334
25
0
23 Nov 2023
Sparse Low-rank Adaptation of Pre-trained Language Models
Ning Ding
Xingtai Lv
Qiaosen Wang
Yulin Chen
Bowen Zhou
Zhiyuan Liu
Maosong Sun
312
95
0
20 Nov 2023