A Primer in BERTology: What we know about how BERT works

Transactions of the Association for Computational Linguistics (TACL), 2020
27 February 2020
Anna Rogers
Olga Kovaleva
Anna Rumshisky
    OffRL

Papers citing "A Primer in BERTology: What we know about how BERT works"

50 / 780 papers shown
Search-R3: Unifying Reasoning and Embedding in Large Language Models
Yuntao Gui
James Cheng
KELM, LRM
260
2
0
10 Apr 2026
What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models
Tianchen Deng
Yue Pan
Shenghai Yuan
Dong Li
Chen Wang
...
Danwei W. Wang
Jingchuan Wang
Javier Civera
Hesheng Wang
Weidong Chen
159
17
0
03 Dec 2025
Layer Probing Improves Kinase Functional Prediction with Protein Language Models
Ajit Kumar
IndraPrakash Jha
44
0
0
29 Nov 2025
Standard Occupation Classifier -- A Natural Language Processing Approach
Sidharth Rony
Jack Patman
163
0
0
28 Nov 2025
Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts
Mosab Rezaei
Mina Rajaei Moghadam
A. Shaikh
Hamed Alhoori
Reva Freedman
170
0
0
25 Nov 2025
A Hybrid Classical-Quantum Fine Tuned BERT for Text Classification
Abu Kaisar Mohammad Masum
Naveed Mahmud
M. H. Najafi
Sercan Aygün
141
0
0
21 Nov 2025
N-GLARE: An Non-Generative Latent Representation-Efficient LLM Safety Evaluator
Zheyu Lin
Jirui Yang
Hengqi Guo
Yubing Bao
Yao Guan
Yao Guan
191
0
0
18 Nov 2025
SPEAR-MM: Selective Parameter Evaluation and Restoration via Model Merging for Efficient Financial LLM Adaptation
Berkcan Kapusuzoglu
Supriyo Chakraborty
Renkun Ni
Stephen Rawls
Sambit Sahu
MoMe, CLL
278
0
0
11 Nov 2025
Catching Contamination Before Generation: Spectral Kill Switches for Agents
Valentin Noël
144
0
0
08 Nov 2025
Quantitative Bounds for Length Generalization in Transformers
Zachary Izzo
Eshaan Nichani
Jason D. Lee
300
5
0
30 Oct 2025
Enhancing Sentiment Classification with Machine Learning and Combinatorial Fusion
Sean Patten
Pin-Yu Chen
Christina Schweikert
D. Frank Hsu
132
0
0
30 Oct 2025
Decomposition-Enhanced Training for Post-Hoc Attributions In Language Models
Sriram Balasubramaniam
S. Basu
Koustava Goswami
Ryan Rossi
Varun Manjunatha
Roshan Santhosh
Ruiyi Zhang
Soheil Feizi
Nedim Lipka
LRM, ReLM
421
1
0
29 Oct 2025
Forging GEMs: Advancing Greek NLP through Quality-Based Corpus Curation
Alexandra Apostolopoulou
Konstantinos Kanaris
Athanasios Koursaris
Dimitris Tsakalidis
George Domalis
I. Livieris
221
0
0
22 Oct 2025
That's Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation
Jaesung Bae
Cameron Churchwell
Mitchell Hermon
Tsun-An Hsieh
Jocelyn Xu
Yekaterina Yegorova
Mark Hasegawa-Johnson
Heng Ji
145
0
0
21 Oct 2025
Training-Free Spectral Fingerprints of Voice Processing in Transformers
Valentin Noël
216
2
0
21 Oct 2025
Attention Is All You Need for KV Cache in Diffusion LLMs
Quan Nguyen-Tri
Mukul Ranjan
Zhiqiang Shen
228
10
0
16 Oct 2025
CAST: Compositional Analysis via Spectral Tracking for Understanding Transformer Layer Functions
Zihao Fu
Ming Liao
Chris Russell
Zhenguang G. Cai
162
1
0
16 Oct 2025
Ethic-BERT: An Enhanced Deep Learning Model for Ethical and Non-Ethical Content Classification
Mahamodul Hasan Mahadi
Md. Nasif Safwan
Souhardo Rahman
Shahnaj Parvin
Aminun Nahar
Kamruddin Nur
VLM
123
0
0
14 Oct 2025
Fairness Metric Design Exploration in Multi-Domain Moral Sentiment Classification using Transformer-Based Models
Battemuulen Naranbat
Seyed Sahand Mohammadi Ziabari
Yousuf Nasser Al Husaini
Ali Mohammed Mansoor Alsahag
102
1
0
13 Oct 2025
Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning
Minsik Choi
Hyegang Son
Changhoon Kim
Young Geun Kim
AAML
184
0
0
10 Oct 2025
Mapping Semantic & Syntactic Relationships with Geometric Rotation
Michael Freenor
Lauren Alvarez
LLMSV
238
1
0
10 Oct 2025
SkipSR: Faster Super Resolution with Token Skipping
Rohan Choudhury
Shanchuan Lin
Jianyi Wang
Hao Chen
Qi Zhao
Feng Cheng
Lu Jiang
Kris Kitani
László A. Jeni
SupR
279
0
0
09 Oct 2025
Reasoning for Hierarchical Text Classification: The Case of Patents
Lekang Jiang
Wenjun Sun
Stephan Goetz
BDL
208
9
0
08 Oct 2025
Mechanistic Interpretability of Socio-Political Frames in Language Models
Hadi Asghari
Sami Nenno
128
0
0
04 Oct 2025
Allocation of Parameters in Transformers
Ruoxi Yu
Haotian Jiang
Jingpu Cheng
Penghao Yu
Qianxiao Li
Zhong Li
MoE
197
0
0
04 Oct 2025
A Hierarchical Error Framework for Reliable Automated Coding in Communication Research: Applications to Health and Political Communication
Zhilong Zhao
Yindi Liu
AILaw
257
0
0
29 Sep 2025
Investigating Multi-layer Representations for Dense Passage Retrieval
Zhongbin Xie
Thomas Lukasiewicz
170
1
0
28 Sep 2025
Uncovering Graph Reasoning in Decoder-only Transformers with Circuit Tracing
Xinnan Dai
Chung-Hsiang Lo
Kai Guo
Shenglai Zeng
Dongsheng Luo
Shucheng Zhou
188
1
0
24 Sep 2025
A Novel Differential Feature Learning for Effective Hallucination Detection and Classification
Wenkai Wang
Vincent C. S. Lee
Yizhen Zheng
130
0
0
20 Sep 2025
Steering Language Models in Multi-Token Generation: A Case Study on Tense and Aspect
Alina Klerings
Jannik Brinkmann
Daniel Ruffinelli
Simone Paolo Ponzetto
LLMSV
208
0
0
15 Sep 2025
Documents Are People and Words Are Items: A Psychometric Approach to Textual Data with Contextual Embeddings
Jinsong Chen
98
0
0
10 Sep 2025
Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?
Junjie Mu
Zonghao Ying
Zhekui Fan
Zonglei Jing
Yaoyuan Zhang
Zhengmin Yu
Wenxin Zhang
Quanchen Zou
Xiangzheng Zhang
AAML
214
5
0
08 Sep 2025
Comparative Analysis of Transformer Models in Disaster Tweet Classification for Public Safety
Sharif Noor Zisad
N. M. Istiak Chowdhury
Ragib Hasan
273
1
0
04 Sep 2025
Learning Mechanism Underlying NLP Pre-Training and Fine-Tuning
Yarden Tzach
Ronit D. Gross
Ella Koresh
Shalom Rosner
Or Shpringer
Tal Halevi
Ido Kanter
184
3
0
03 Sep 2025
Towards Fundamental Language Models: Does Linguistic Competence Scale with Model Size?
Jaime Collado-Montañez
L. Alfonso Ureña-López
Arturo Montejo-Ráez
HILM, ELM, LRM
140
0
0
02 Sep 2025
MindGuard: Intrinsic Decision Inspection for Securing LLM Agents Against Metadata Poisoning
Zhiqiang Wang
Junyang Zhang
Guanquan Shi
Haoran Cheng
Yunhao Yao
Kaiwen Guo
Haohua Du
Xiang-Yang Li
415
5
0
28 Aug 2025
Transplant Then Regenerate: A New Paradigm for Text Data Augmentation
Guangzhan Wang
Hongyu Zhang
Beijun Shen
Xiaodong Gu
330
0
0
20 Aug 2025
Semantic Anchoring in Agentic Memory: Leveraging Linguistic Structures for Persistent Conversational Context
Maitreyi Chatterjee
Devansh Agarwal
RALM, KELM
191
1
0
18 Aug 2025
Cognitive Decision Routing in Large Language Models: When to Think Fast, When to Think Slow
Y. Du
C. Guo
W. Wang
G. Tang
LRM
152
1
0
17 Aug 2025
Streamlining Admission with LOR Insights: AI-Based Leadership Assessment in Online Master's Program
Meryem Yilmaz Soylu
Adrian Gallard
Jeonghyun Lee
Gayane Grigoryan
Rushil Desai
Stephen Harmon
234
0
0
07 Aug 2025
I Think, Therefore I Am Under-Qualified? A Benchmark for Evaluating Linguistic Shibboleth Detection in LLM Hiring Evaluations
Julia Kharchenko
Tanya Roosta
Aman Chadha
Chirag Shah
152
1
0
06 Aug 2025
When Truth Is Overridden: Uncovering the Internal Origins of Sycophancy in Large Language Models
Keyu Wang
Jin Li
Shu Yang
Zhuoran Zhang
Haiyan Zhao
583
18
0
04 Aug 2025
Length Representations in Large Language Models
Sangjun Moon
Dasom Choi
Jingun Kwon
Hidetaka Kamigaito
Manabu Okumura
MILM
263
2
0
27 Jul 2025
Explainable Mapper: Charting LLM Embedding Spaces Using Perturbation-Based Explanation and Verification Agents
Xinyuan Yan
Rita Sevastjanova
Sinie van der Ben
Mennatallah El-Assady
Bei Wang
308
3
0
24 Jul 2025
Discourse Heuristics For Paradoxically Moral Self-Correction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Guangliang Liu
Zimo Qi
Xitong Zhang
K. Johnson
LRM
220
4
0
01 Jul 2025
Can structural correspondences ground real world representational content in Large Language Models?
Iwan Williams
241
3
0
19 Jun 2025
A Vietnamese Dataset for Text Segmentation and Multiple Choices Reading Comprehension
Toan Nguyen Hai
Ha Nguyen Viet
Truong Quan Xuan
Duc Do Minh
193
1
0
19 Jun 2025
Targeted Lexical Injection: Unlocking Latent Cross-Lingual Alignment in Lugha-Llama via Early-Layer LoRA Fine-Tuning
Stanley Ngugi
233
0
0
18 Jun 2025
From Raw Corpora to Domain Benchmarks: Automated Evaluation of LLM Domain Expertise
Nitin Sharma
Thomas Wolfers
Çağatay Yıldız
ALM
251
0
0
09 Jun 2025
Towards an Explainable Comparison and Alignment of Feature Embeddings
Mohammad Jalali
Bahar Dibaei Nia
Farzan Farnia
438
5
0
06 Jun 2025