Positional Artefacts Propagate Through Masked Language Model Embeddings
Ziyang Luo, Artur Kulmizev, Xiaoxi Mao
arXiv:2011.04393 · 9 November 2020
Papers citing "Positional Artefacts Propagate Through Masked Language Model Embeddings" (32 papers shown)
1. "Fast and Low-Cost Genomic Foundation Models via Outlier Removal" (01 May 2025) [AAML]
   Haozheng Luo, Chenghao Qiu, Maojiang Su, Zhihan Zhou, Zoe Mehta, Guo Ye, Jerry Yao-Chieh Hu, Han Liu

2. "MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration" (07 Mar 2025) [MQ, MoMe]
   Jinguang Wang, J. Wang, Haifeng Sun, Tingting Yang, Zirui Zhuang, Wanyi Ning, Yuexi Yin, Q. Qi, Jianxin Liao

3. "Robust AI-Generated Text Detection by Restricted Embeddings" (10 Oct 2024) [DeLMO]
   Kristian Kuznetsov, Eduard Tulchinskii, Laida Kushnareva, German Magai, Serguei Barannikov, Sergey I. Nikolenko, Irina Piontkovskaya

4. "OutlierTune: Efficient Channel-Wise Quantization for Large Language Models" (27 Jun 2024)
   Jinguang Wang, Yuexi Yin, Haifeng Sun, Qi Qi, Jingyu Wang, Zirui Zhuang, Tingting Yang, Jianxin Liao

5. "Improving Interpretability and Robustness for the Detection of AI-Generated Images" (21 Jun 2024)
   T. Gaintseva, Laida Kushnareva, German Magai, Irina Piontkovskaya, Sergey I. Nikolenko, Martin Benning, S. Barannikov, Gregory Slabaugh

6. "Outlier Reduction with Gated Attention for Improved Post-training Quantization in Large Sequence-to-sequence Speech Foundation Models" (16 Jun 2024) [MQ]
   Dominik Wagner, Ilja Baumann, K. Riedhammer, Tobias Bocklet

7. "Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs" (23 May 2024)
   Jaewoo Yang, Hayun Kim, Younghoon Kim

8. "Unveiling Linguistic Regions in Large Language Models" (22 Feb 2024)
   Zhihao Zhang, Jun Zhao, Qi Zhang, Tao Gui, Xuanjing Huang

9. "A Simple and Effective Pruning Approach for Large Language Models" (20 Jun 2023)
   Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter

10. "Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity" (01 Jun 2023)
    Katharina Hämmerl, Alina Fastowski, Jindrich Libovický, Alexander M. Fraser

11. "The Impact of Positional Encoding on Length Generalization in Transformers" (31 May 2023)
    Amirhossein Kazemnejad, Inkit Padhi, K. Ramamurthy, Payel Das, Siva Reddy

12. "Intriguing Properties of Quantization at Scale" (30 May 2023) [MQ]
    Arash Ahmadian, Saurabh Dash, Hongyu Chen, Bharat Venkitesh, Stephen Gou, Phil Blunsom, A. Ustun, Sara Hooker

13. "Feature-Learning Networks Are Consistent Across Widths At Realistic Scales" (28 May 2023)
    Nikhil Vyas, Alexander B. Atanasov, Blake Bordelon, Depen Morwani, Sabarish Sainathan, C. Pehlevan

14. "Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models" (27 May 2023)
    Zhong Zhang, Bang Liu, Junming Shao

15. "Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings" (23 May 2023) [VLM, MILM]
    Ta-Chung Chi, Ting-Han Fan, Li-Wei Chen, Alexander I. Rudnicky, Peter J. Ramadge

16. "Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned Language Models" (16 May 2023)
    Na Li, Hanane Kteich, Zied Bouraoui, Steven Schockaert

17. "Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps" (01 Feb 2023)
    Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

18. "Representation biases in sentence transformers" (30 Jan 2023)
    Dmitry Nikolaev, Sebastian Padó

19. "The case for 4-bit precision: k-bit Inference Scaling Laws" (19 Dec 2022) [MQ]
    Tim Dettmers, Luke Zettlemoyer

20. "The Curious Case of Absolute Position Embeddings" (23 Oct 2022)
    Koustuv Sinha, Amirhossein Kazemnejad, Siva Reddy, J. Pineau, Dieuwke Hupkes, Adina Williams

21. "Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models" (27 Sep 2022) [MQ]
    Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, F. Yu, Xianglong Liu

22. "Isotropic Representation Can Improve Dense Retrieval" (01 Sep 2022) [OOD]
    Euna Jung, J. Park, Jaekeol Choi, Sungyoon Kim, Wonjong Rhee

23. "LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale" (15 Aug 2022) [MQ]
    Tim Dettmers, M. Lewis, Younes Belkada, Luke Zettlemoyer

24. "Outlier Dimensions that Disrupt Transformers Are Driven by Frequency" (23 May 2022)
    Giovanni Puccetti, Anna Rogers, Aleksandr Drozd, F. Dell'Orletta

25. "GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers" (06 May 2022) [ViT]
    Ali Modarressi, Mohsen Fayyaz, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

26. "DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks" (19 Apr 2022)
    Ziyang Luo, Yadong Xi, Jing Ma, Zhiwei Yang, Xiaoxi Mao, Changjie Fan, Rongsheng Zhang

27. "Measuring the Mixing of Contextual Information in the Transformer" (08 Mar 2022)
    Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà

28. "An Isotropy Analysis in the Multilingual BERT Embedding Space" (09 Oct 2021)
    S. Rajaee, Mohammad Taher Pilehvar

29. "Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations" (28 Sep 2021)
    Ekaterina Taktasheva, Vladislav Mikhailov, Ekaterina Artemova

30. "On Isotropy Calibration of Transformers" (27 Sep 2021)
    Yue Ding, Karolis Martinkus, Damian Pascual, Simon Clematide, Roger Wattenhofer

31. "All Bark and No Bite: Rogue Dimensions in Transformer Language Models Obscure Representational Quality" (09 Sep 2021)
    William Timkey, Marten van Schijndel

32. "BERT Busters: Outlier Dimensions that Disrupt Transformers" (14 May 2021)
    Olga Kovaleva, Saurabh Kulshreshtha, Anna Rogers, Anna Rumshisky