Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2011.01513
Cited By
CharBERT: Character-aware Pre-trained Language Model
3 November 2020
Wentao Ma
Yiming Cui
Chenglei Si
Ting Liu
Shijin Wang
Guoping Hu
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CharBERT: Character-aware Pre-trained Language Model"
21 / 21 papers shown
Title
Using External knowledge to Enhanced PLM for Semantic Matching
Min Li
Chun Yuan
26
0
0
10 May 2025
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
32
2
0
28 Oct 2024
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
48
3
0
18 Oct 2024
Advancing Post-OCR Correction: A Comparative Study of Synthetic Data
Shuhao Guan
Derek Greene
26
6
0
05 Aug 2024
Beyond Binary Gender Labels: Revealing Gender Biases in LLMs through Gender-Neutral Name Predictions
Zhiwen You
Haejin Lee
Shubhanshu Mishra
Sullam Jeoung
Apratim Mishra
Jinseok Kim
Jana Diesner
27
9
0
07 Jul 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
K. Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Janet Liu
H. Wang
31
23
0
08 May 2024
READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises
Chenglei Si
Zhengyan Zhang
Yingfa Chen
Xiaozhi Wang
Zhiyuan Liu
Maosong Sun
AAML
22
1
0
14 Feb 2023
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
30
46
0
14 Jul 2022
Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation
Shengyao Zhuang
Houxing Ren
Linjun Shou
Jian Pei
Ming Gong
Guido Zuccon
Daxin Jiang
32
64
0
21 Jun 2022
Down and Across: Introducing Crossword-Solving as a New NLP Benchmark
Saurabh Kulshreshtha
Olga Kovaleva
Namrata Shivagunde
Anna Rumshisky
ELM
LRM
26
4
0
20 May 2022
Signal in Noise: Exploring Meaning Encoded in Random Character Sequences with Character-Aware Language Models
Mark Chu
Bhargav Srinivasa Desikan
E. Nadler
Ruggerio L. Sardo
Elise Darragh-Ford
Douglas Guilbeault
18
0
0
15 Mar 2022
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke
Zaid Alyafeai
Elizabeth Salesky
Colin Raffel
Manan Dey
...
Arun Raja
Chenglei Si
Wilson Y. Lee
Benoît Sagot
Samson Tan
30
140
0
20 Dec 2021
Using Distributional Principles for the Semantic Study of Contextual Language Models
Olivier Ferret
17
1
0
23 Nov 2021
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
Arij Riabi
Benoît Sagot
Djamé Seddah
26
15
0
26 Oct 2021
BERT Cannot Align Characters
Antonis Maronikolakis
Philipp Dufter
Hinrich Schütze
23
0
0
20 Sep 2021
Local Structure Matters Most: Perturbation Study in NLU
Louis Clouâtre
Prasanna Parthasarathi
Amal Zouaq
Sarath Chandar
22
13
0
29 Jul 2021
Evaluating Various Tokenizers for Arabic Text Classification
Zaid Alyafeai
Maged S. Al-Shaibani
Mustafa Ghaleb
Irfan Ahmad
23
41
0
14 Jun 2021
UIT-ISE-NLP at SemEval-2021 Task 5: Toxic Spans Detection with BiLSTM-CRF and ToxicBERT Comment Classification
Son T. Luu
N. Nguyen
34
8
0
20 Apr 2021
Generating Natural Language Adversarial Examples
M. Alzantot
Yash Sharma
Ahmed Elgohary
Bo-Jhang Ho
Mani B. Srivastava
Kai-Wei Chang
AAML
245
914
0
21 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,950
0
20 Apr 2018
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
252
13,364
0
25 Aug 2014
1