BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 13,105 papers shown
Sybil-based Virtual Data Poisoning Attacks in Federated Learning
Changxun Zhu
Qilong Wu
Lingjuan Lyu
Shibei Xue
AAML
FedML
5
0
0
15 May 2025
Demystifying AI Agents: The Final Generation of Intelligence
Kevin J McNamara
Rhea Pritham Marpu
15
0
0
15 May 2025
Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models
Annie Wong
Thomas Bäck
Aske Plaat
N. V. Stein
Anna V. Kononova
ReLM
ELM
LRM
41
0
0
15 May 2025
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts
Jing-Cheng Pang
Kaiyuan Li
Y. Wang
Si-Hang Yang
Shengyi Jiang
Yang Yu
OffRL
LLMAG
LM&Ro
LRM
15
0
0
15 May 2025
LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations
Yile Wang
Zhanyu Shen
Hui Huang
15
0
0
15 May 2025
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
Yidan Wang
Yubing Ren
Yanan Cao
Binxing Fang
16
0
0
15 May 2025
ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention
Jintian Shao
Hongyi Huang
Jiayi Wu
Beiwen Zhang
ZhiYu Wu
You Shan
MingKai Zheng
17
0
0
15 May 2025
The Devil Is in the Word Alignment Details: On Translation-Based Cross-Lingual Transfer for Token Classification Tasks
Benedikt Ebing
Goran Glavas
22
0
0
15 May 2025
Task-Core Memory Management and Consolidation for Long-term Continual Learning
Tianyu Huai
Jie Zhou
Yuxuan Cai
Qin Chen
Wen Wu
Xingjiao Wu
Xipeng Qiu
Liang He
CLL
15
0
0
15 May 2025
The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks
Zhonghao Lyu
Ming Xiao
Jie Xu
Mikael Skoglund
Marco Di Renzo
15
0
0
14 May 2025
Interim Report on Human-Guided Adaptive Hyperparameter Optimization with Multi-Fidelity Sprints
Michael Kamfonas
7
0
0
14 May 2025
A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning
Berkay Guler
Giovanni Geraci
Hamid Jafarkhani
20
0
0
14 May 2025
SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures
Julian Kranz
Davide Gallon
Steffen Dereich
Arnulf Jentzen
11
0
0
14 May 2025
AdaFortiTran: An Adaptive Transformer Model for Robust OFDM Channel Estimation
Berkay Guler
Hamid Jafarkhani
6
1
0
14 May 2025
LiDDA: Data Driven Attribution at LinkedIn
John Bencina
Erkut Aykutlug
Yue Chen
Zerui Zhang
Stephanie Sorenson
Shao Tang
Changshuai Wei
12
0
0
14 May 2025
A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias
Brandon Smith
Mohamed Reda Bouadjenek
Tahsin Alamgir Kheya
Phillip Dawson
S. Aryal
ALM
ELM
21
0
0
14 May 2025
Multilingual Machine Translation with Quantum Encoder Decoder Attention-based Convolutional Variational Circuits
Subrit Dikshit
Ritu Tiwari
Priyank Jain
16
0
0
14 May 2025
ELIS: Efficient LLM Iterative Scheduling System with Response Length Predictor
Seungbeom Choi
Jeonghoe Goo
Eunjoo Jeon
Mingyu Yang
Minsung Jang
16
0
0
14 May 2025
Analog Foundation Models
Julian Büchel
Iason Chalas
Giovanni Acampa
An Chen
Omobayode Fagbohungbe
Sidney Tsai
K. E. Maghraoui
M. Le Gallo
Abbas Rahimi
A. Sebastian
MQ
23
0
0
14 May 2025
A Scalable Unsupervised Framework for multi-aspect labeling of Multilingual and Multi-Domain Review Data
Jiin Park
Misuk Kim
11
0
0
14 May 2025
Adversarial Suffix Filtering: a Defense Pipeline for LLMs
David Khachaturov
Robert D. Mullins
AAML
13
0
0
14 May 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
Julian Tanke
Takashi Shibuya
Kengo Uchida
Koichi Saito
Yuki Mitsufuji
Mamba
32
0
0
14 May 2025
Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models
Junda Zhao
Yuliang Song
Eldan Cohen
11
0
0
14 May 2025
The Truth Becomes Clearer Through Debate! Multi-Agent Systems with Large Language Models Unmask Fake News
Yuhan Liu
Y. Liu
Xiaoqing Zhang
X. Chen
Rui Yan
LLMAG
41
0
0
13 May 2025
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush Rai
Kyle Min
Tarun Krishna
Feiyan Hu
A. Smeaton
Noel E. O'Connor
VGen
19
0
0
13 May 2025
Automatic Task Detection and Heterogeneous LLM Speculative Decoding
Danying Ge
Jianhua Gao
Qizhi Jiang
Yifei Feng
Weixing Ji
26
0
0
13 May 2025
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
43
0
0
13 May 2025
Small but Significant: On the Promise of Small Language Models for Accessible AIED
Yumou Wei
Paulo Carvalho
John Stamper
SyDa
40
0
0
13 May 2025
Next Word Suggestion using Graph Neural Network
Abisha Thapa Magar
Anup Shakya
GNN
28
0
0
13 May 2025
Exploiting Text Semantics for Few and Zero Shot Node Classification on Text-attributed Graph
Yuxiang Wang
Xiao Yan
Shiyu Jin
Quanqing Xu
Chuang Hu
Yuanyuan Zhu
Bo Du
Jia Wu
Jiawei Jiang
24
0
0
13 May 2025
LM-Scout: Analyzing the Security of Language Model Integration in Android Apps
Muhammad Ibrahim
Güliz Seray Tuncay
Z. Berkay Celik
Aravind Machiry
Antonio Bianchi
31
0
0
13 May 2025
An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models
J. Mao
Itay Griniasty
Yan Sun
M. Transtrum
J. Sethna
Pratik Chaudhari
14
0
0
13 May 2025
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies
Xiaoliang Luo
Xinyi Xu
Michael Ramscar
Bradley C. Love
25
0
0
13 May 2025
LCES: Zero-shot Automated Essay Scoring via Pairwise Comparisons Using Large Language Models
Takumi Shibata
Yuichi Miyamura
25
0
0
13 May 2025
Ultrasound Report Generation with Multimodal Large Language Models for Standardized Texts
Peixuan Ge
Tongkun Su
Faqin Lv
Baoliang Zhao
Peng Zhang
...
Liang Yao
Yu Sun
Zenan Wang
Pak Kin Wong
Ying Hu
MedIm
11
0
0
13 May 2025
Learning Advanced Self-Attention for Linear Transformers in the Singular Value Domain
Hyowon Wi
Jeongwhan Choi
Noseong Park
23
0
0
13 May 2025
RepCali: High Efficient Fine-tuning Via Representation Calibration in Latent Space for Pre-trained Language Models
Fujun Zhang
Xiangdong Su
29
0
0
13 May 2025
Guiding LLM-based Smart Contract Generation with Finite State Machine
Hao Luo
Yuhao Lin
Xiao Yan
Xintong Hu
Y. Wang
Qiming Zeng
Hao Wang
Jiawei Jiang
31
0
0
13 May 2025
MilChat: Introducing Chain of Thought Reasoning and GRPO to a Multimodal Small Language Model for Remote Sensing
Aybora Koksal
Aydin Alatan
LRM
24
0
0
12 May 2025
AI-Enabled Accurate Non-Invasive Assessment of Pulmonary Hypertension Progression via Multi-Modal Echocardiography
Jiewen Yang
Taoran Huang
Shangwei Ding
Xiaowei Xu
Qinhua Zhao
...
Bin Pu
Jiexuan Zheng
Caojin Zhang
Hongwen Fei
X. Li
16
0
0
12 May 2025
KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification
Hajar Sakai
Sarah Lam
VLM
38
0
0
12 May 2025
Efficient and Reproducible Biomedical Question Answering using Retrieval Augmented Generation
Linus Stuhlmann
Michael Alexander Saxer
Jonathan Fürst
RALM
31
0
0
12 May 2025
Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems
Ippokratis Koukoulis
Ilias Syrigos
Thanasis Korakis
13
0
0
12 May 2025
Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity
Guang Yan
Yuhui Zhang
Zimu Guo
Lutan Zhao
Xiaojun Chen
Chen Wang
Wenhao Wang
Dan Meng
Rui Hou
29
0
0
12 May 2025
Tagging fully hadronic exotic decays of the vectorlike $\mathbf{B}$ quark using a graph neural network
Jai Bardhan
Tanumoy Mandal
Subhadip Mitra
Cyrin Neeraj
Mihir Rawat
25
0
0
12 May 2025
Domain Regeneration: How well do LLMs match syntactic properties of text domains?
Da Ju
Hagen Blix
Adina Williams
DeLMO
33
0
0
12 May 2025
A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny
Karahan Sarıtaş
Çağatay Yıldız
24
0
0
12 May 2025
Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition
Zheng Yao
Shuai Wang
Guido Zuccon
21
0
0
12 May 2025
Chronocept: Instilling a Sense of Time in Machines
Krish Goel
Sanskar Pandey
KS Mahadevan
Harsh Kumar
Vishesh Khadaria
23
0
0
12 May 2025
HAMLET: Healthcare-focused Adaptive Multilingual Learning Embedding-based Topic Modeling
Hajar Sakai
Sarah Lam
34
0
0
12 May 2025