ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

A Multiscale Visualization of Attention in the Transformer Model
Jesse Vig
arXiv: 1906.05714 · 12 June 2019
[ViT]

Papers citing "A Multiscale Visualization of Attention in the Transformer Model"

32 / 82 papers shown
- GenNI: Human-AI Collaboration for Data-Backed Text Generation
  Hendrik Strobelt, J. Kinley, Robert Krueger, Johanna Beyer, Hanspeter Pfister, Alexander M. Rush (19 Oct 2021)
- Detecting Gender Bias in Transformer-based Models: A Case Study on BERT
  Bingbing Li, Hongwu Peng, Rajat Sainju, Junhuan Yang, Lei Yang, Yueying Liang, Weiwen Jiang, Binghui Wang, Hang Liu, Caiwen Ding (15 Oct 2021)
- BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models
  Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan (06 Oct 2021) [SILM]
- Automated and Explainable Ontology Extension Based on Deep Learning: A Case Study in the Chemical Domain
  A. Memariani, Martin Glauer, Fabian Neuhaus, Till Mossakowski, Janna Hastings (19 Sep 2021)
- Puzzle Solving without Search or Human Knowledge: An Unnatural Language Approach
  David A. Noever, Ryerson Burdick (07 Sep 2021) [ReLM]
- T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
  Raymond Li, Wen Xiao, Lanjun Wang, Hyeju Jang, Giuseppe Carenini (31 Aug 2021) [ViT]
- Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models
  Yiming Cui, Weinan Zhang, Wanxiang Che, Ting Liu, Zhigang Chen, Shijin Wang (26 Aug 2021) [LRM]
- An Evaluation of Generative Pre-Training Model-based Therapy Chatbot for Caregivers
  Lu Wang, Munif Ishad Mujib, Jake Williams, G. Demiris, Jina Huh-Yoo (28 Jul 2021) [AI4MH]
- Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff
  Michael J. Naylor, C. French, Samantha R. Terker, Uday Kamath (12 Jul 2021)
- Elbert: Fast Albert with Confidence-Window Based Early Exit
  Keli Xie, Siyuan Lu, Meiqi Wang, Zhongfeng Wang (01 Jul 2021)
- Do Models Learn the Directionality of Relations? A New Evaluation: Relation Direction Recognition
  Shengfei Lyu, Xingyu Wu, Jinlong Li, Qiuju Chen, Huanhuan Chen (19 May 2021)
- VisQA: X-raying Vision and Language Reasoning in Transformers
  Theo Jaunet, Corentin Kervadec, Romain Vuillemot, G. Antipov, M. Baccouche, Christian Wolf (02 Apr 2021)
- Synthesis of Compositional Animations from Textual Descriptions
  Anindita Ghosh, N. Cheema, Cennet Oguz, Christian Theobalt, P. Slusallek (26 Mar 2021)
- GPT Understands, Too
  Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang (18 Mar 2021) [VLM]
- ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction
  Seyone Chithrananda, Gabriel Grand, Bharath Ramsundar (19 Oct 2020) [AI4CE]
- The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
  Jasmijn Bastings, Katja Filippova (12 Oct 2020) [XAI, LRM]
- Plan ahead: Self-Supervised Text Planning for Paragraph Completion Task
  Dongyeop Kang, Eduard H. Hovy (11 Oct 2020) [LRM]
- Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
  Jue Wang, Wei Lu (08 Oct 2020)
- Transformer-GCRF: Recovering Chinese Dropped Pronouns with General Conditional Random Fields
  Jingxuan Yang, Kerui Xu, Jun Xu, Si Li, Sheng Gao, Jun Guo, Ji-Rong Wen, Nianwen Xue (07 Oct 2020)
- Rethinking Attention with Performers
  K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller (30 Sep 2020)
- Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
  Joseph F DeRose, Jiayao Wang, M. Berger (03 Sep 2020)
- BERTology Meets Biology: Interpreting Attention in Protein Language Models
  Jesse Vig, Ali Madani, L. Varshney, Caiming Xiong, R. Socher, Nazneen Rajani (26 Jun 2020)
- IMoJIE: Iterative Memory-Based Joint Open Information Extraction
  Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti (17 May 2020) [VLM]
- Pre-trained Models for Natural Language Processing: A Survey
  Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang (18 Mar 2020) [LM&MA, VLM]
- ProGen: Language Modeling for Protein Generation
  Ali Madani, Bryan McCann, Nikhil Naik, N. Keskar, N. Anand, Raphael R. Eguchi, Po-Ssu Huang, R. Socher (08 Mar 2020)
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
  Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou (25 Feb 2020) [VLM]
- Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
  Carlos Aspillaga, Andrés Carvallo, Vladimir Araujo (14 Feb 2020) [ELM]
- Knowledge Guided Named Entity Recognition for BioMedical Text
  Pratyay Banerjee, Kuntal Kumar Pal, M. Devarakonda, Chitta Baral (10 Nov 2019)
- Generalizing Natural Language Analysis through Span-relation Representations
  Zhengbao Jiang, Wenyuan Xu, Jun Araki, Graham Neubig (10 Nov 2019)
- Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
  Dhruva Sahrawat, Debanjan Mahata, Mayank Kulkarni, Haimin Zhang, Rakesh Gosangi, Amanda Stent, Agniv Sharma, Yaman Kumar Singla, R. Shah, Roger Zimmermann (19 Oct 2019)
- VL-BERT: Pre-training of Generic Visual-Linguistic Representations
  Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai (22 Aug 2019) [VLM, MLLM, SSL]
- Analyzing the Structure of Attention in a Transformer Language Model
  Jesse Vig, Yonatan Belinkov (07 Jun 2019)