Positional Artefacts Propagate Through Masked Language Model Embeddings
arXiv:2011.04393 · 9 November 2020
Ziyang Luo, Artur Kulmizev, Xiaoxi Mao

Papers citing "Positional Artefacts Propagate Through Masked Language Model Embeddings" (32 papers):

Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Haozheng Luo, Chenghao Qiu, Maojiang Su, Zhihan Zhou, Zoe Mehta, Guo Ye, Jerry Yao-Chieh Hu, Han Liu
01 May 2025

MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
Jinguang Wang, Jiangming Wang, Haifeng Sun, Tingting Yang, Zirui Zhuang, Wanyi Ning, Yuexi Yin, Qi Qi, Jianxin Liao
07 Mar 2025

Robust AI-Generated Text Detection by Restricted Embeddings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Kristian Kuznetsov, Eduard Tulchinskii, Laida Kushnareva, German Magai, Serguei Barannikov, Sergey I. Nikolenko, Irina Piontkovskaya
10 Oct 2024

OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Jinguang Wang, Yuexi Yin, Haifeng Sun, Qi Qi, Jingyu Wang, Zirui Zhuang, Tingting Yang, Jianxin Liao
27 Jun 2024

Improving Interpretability and Robustness for the Detection of AI-Generated Images
T. Gaintseva, Laida Kushnareva, German Magai, Irina Piontkovskaya, Sergey I. Nikolenko, Ziquan Liu, Serguei Barannikov, Gregory Slabaugh
21 Jun 2024

Outlier Reduction with Gated Attention for Improved Post-training Quantization in Large Sequence-to-sequence Speech Foundation Models
Dominik Wagner, Ilja Baumann, Korbinian Riedhammer, Tobias Bocklet
16 Jun 2024

Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs
Jaewoo Yang, Hayun Kim, Younghoon Kim
23 May 2024

Unveiling Linguistic Regions in Large Language Models
Zhihao Zhang, Jun Zhao, Tao Gui, Xuanjing Huang
22 Feb 2024

A Simple and Effective Pruning Approach for Large Language Models
International Conference on Learning Representations (ICLR), 2024
Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
20 Jun 2023

Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Katharina Hämmerl, Alina Fastowski, Jindřich Libovický, Kangyang Luo
01 Jun 2023

The Impact of Positional Encoding on Length Generalization in Transformers
Neural Information Processing Systems (NeurIPS), 2023
Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan N. Ramamurthy, Payel Das, Siva Reddy
31 May 2023

Intriguing Properties of Quantization at Scale
Neural Information Processing Systems (NeurIPS), 2023
Arash Ahmadian, Saurabh Dash, Hongyu Chen, Bharat Venkitesh, Stephen Gou, Phil Blunsom, Ahmet Üstün, Sara Hooker
30 May 2023

Feature-Learning Networks Are Consistent Across Widths At Realistic Scales
Neural Information Processing Systems (NeurIPS), 2023
Nikhil Vyas, Alexander B. Atanasov, Blake Bordelon, Depen Morwani, Sabarish Sainathan, Cengiz Pehlevan
28 May 2023

Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhong Zhang, Bang Liu, Junming Shao
27 May 2023

Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ta-Chung Chi, Ting-Han Fan, Li-Wei Chen, Alexander I. Rudnicky, Peter J. Ramadge
23 May 2023

Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned Language Models
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Na Li, Hanane Kteich, Zied Bouraoui, Steven Schockaert
16 May 2023

Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
International Conference on Learning Representations (ICLR), 2024
Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui
01 Feb 2023

Representation biases in sentence transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Dmitry Nikolaev, Sebastian Padó
30 Jan 2023

The case for 4-bit precision: k-bit Inference Scaling Laws
International Conference on Machine Learning (ICML), 2023
Tim Dettmers, Luke Zettlemoyer
19 Dec 2022

The Curious Case of Absolute Position Embeddings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Koustuv Sinha, Amirhossein Kazemnejad, Siva Reddy, J. Pineau, Dieuwke Hupkes, Adina Williams
23 Oct 2022

Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Neural Information Processing Systems (NeurIPS), 2022
Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Yazhe Niu, Shanghang Zhang, Tao Gui, F. Yu, Xianglong Liu
27 Sep 2022

Isotropic Representation Can Improve Dense Retrieval
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2022
Euna Jung, J. Park, Jaekeol Choi, Sungyoon Kim, Wonjong Rhee
01 Sep 2022

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers, M. Lewis, Younes Belkada, Luke Zettlemoyer
15 Aug 2022

Outlier Dimensions that Disrupt Transformers Are Driven by Frequency
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Giovanni Puccetti, Anna Rogers, Aleksandr Drozd, F. Dell'Orletta
23 May 2022

GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Ali Modarressi, Mohsen Fayyaz, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
06 May 2022

DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks
Ziyang Luo, Yadong Xi, Jing Ma, Zhiwei Yang, Xiaoxi Mao, Changjie Fan, Rongsheng Zhang
19 Apr 2022

Measuring the Mixing of Contextual Information in the Transformer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà
08 Mar 2022

An Isotropy Analysis in the Multilingual BERT Embedding Space
Findings of the Association for Computational Linguistics, 2021
S. Rajaee, Mohammad Taher Pilehvar
09 Oct 2021

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations
Ekaterina Taktasheva, Vladislav Mikhailov, Ekaterina Artemova
28 Sep 2021

On Isotropy Calibration of Transformers
First Workshop on Insights from Negative Results in NLP (Insights), 2021
Yue Ding, Karolis Martinkus, Damian Pascual, Simon Clematide, Roger Wattenhofer
27 Sep 2021

All Bark and No Bite: Rogue Dimensions in Transformer Language Models Obscure Representational Quality
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
William Timkey, Marten van Schijndel
09 Sep 2021

BERT Busters: Outlier Dimensions that Disrupt Transformers
Findings of the Association for Computational Linguistics, 2021
Olga Kovaleva, Saurabh Kulshreshtha, Anna Rogers, Anna Rumshisky
14 May 2021