1803.07416
Tensor2Tensor for Neural Machine Translation
16 March 2018
Ashish Vaswani
Samy Bengio
E. Brevdo
François Chollet
Aidan Gomez
Stephan Gouws
Llion Jones
Lukasz Kaiser
Nal Kalchbrenner
Niki Parmar
Ryan Sepassi
Noam M. Shazeer
Jakob Uszkoreit
Papers citing
"Tensor2Tensor for Neural Machine Translation"
50 / 264 papers shown
Title
Facilitating Cognitive Accessibility with LLMs: A Multi-Task Approach to Easy-to-Read Text Generation
François Ledoyen
Gaël Dias
Jeremie Pantin
Alexis Lechervy
Fabrice Maurel
Youssef Chahir
88
0
0
01 Oct 2025
Large Language Models for Summarizing Czech Historical Documents and Beyond
International Conference on Agents and Artificial Intelligence (ICAART), 2025
Václav Tran
Jakub Šmíd
J. Martínek
Ladislav Lenc
Pavel Král
112
1
0
14 Aug 2025
Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic Forecast
Ji Qi
Tam Thuc Do
Mingxiao Liu
Zhuoshi Pan
Yuzhe Li
Gene Cheung
H. Vicky Zhao
AI4TS
238
0
0
19 May 2025
A Local Polyak-Lojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models
Ziqing Xu
Hancheng Min
Salma Tarmoun
Enrique Mallada
Rene Vidal
253
2
0
16 May 2025
Efficient Time Series Forecasting via Hyper-Complex Models and Frequency Aggregation
Eyal Yakir
Dor Tsur
Haim Permuter
AI4TS
314
0
0
27 Feb 2025
EVaDE: Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning
Asian Conference on Machine Learning (ACML), 2025
Siddharth Aravindan
Dixant Mittal
Wee Sun Lee
BDL
271
0
0
17 Jan 2025
Domain adapted machine translation: What does catastrophic forgetting forget and why?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Danielle Saunders
Steve DeNeefe
AI4CE
105
4
0
23 Dec 2024
Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch
Donglin Di
Weinan Zhang
Yue Zhang
Fanglin Wang
270
1
0
24 Oct 2024
A survey of neural-network-based methods utilising comparable data for finding translation equivalents
Michaela Denisová
Pavel Rychlý
230
0
0
19 Oct 2024
Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations
Zhicheng Ren
Zhiping Xiao
Luke Huan
260
0
0
04 Sep 2024
DLP: towards active defense against backdoor attacks with decoupled learning process
Zonghao Ying
Bin Wu
AAML
270
12
0
18 Jun 2024
Separable Physics-Informed Neural Networks for the solution of elasticity problems
V. A. Es'kin
Danil V. Davydov
Julia V. Guréva
Alexey O. Malkhanov
Mikhail E. Smorkalov
PINN
AI4CE
275
6
0
24 Jan 2024
Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Stephen Lawrence Bothwell
Justin DeBenedetto
Theresa Crnkovich
Hildegund Müller
David Chiang
ObjD
279
3
0
30 Nov 2023
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
International Conference on Learning Representations (ICLR), 2023
Nadezhda Chirkova
Sergey Troshin
225
9
0
01 Aug 2023
3D Medical Image Segmentation based on multi-scale MPU-Net
Zeqiu Yu
Shuo Han
Ziheng Song
3DV
189
5
0
11 Jul 2023
Urania: Visualizing Data Analysis Pipelines for Natural Language-Based Data Exploration
Yi Guo
Nana Cao
Xiaoyu Qi
Haoyang Li
Danqing Shi
Jing Zhang
Qing Chen
Daniel Weiskopf
149
5
0
13 Jun 2023
HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shantipriya Parida
Idris Abdulmumin
Shamsuddeen Hassan Muhammad
Aneesh Bose
Guneet Singh Kohli
Ibrahim Said Ahmad
Ketan Kotwal
S. Sarkar
Ondrej Bojar
Habeebah Adamu Kakudi
286
10
0
28 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Neural Information Processing Systems (NeurIPS), 2023
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurelien Lucchi
Thomas Hofmann
358
70
0
25 May 2023
Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhuoyuan Mao
Mary Dabre
Qianying Liu
Israfel Salazar
Chenhui Chu
Sadao Kurohashi
120
7
0
16 May 2023
AttentionViz: A Global View of Transformer Attention
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023
Catherine Yeh
Yida Chen
Aoyu Wu
Cynthia Chen
Fernanda Viégas
Martin Wattenberg
ViT
281
87
0
04 May 2023
string2string: A Modern Python Library for String-to-String Algorithms
Mirac Suzgun
Stuart M. Shieber
Dan Jurafsky
148
10
0
27 Apr 2023
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
Z. Fu
W. Lam
Qian Yu
Anthony Man-Cho So
Shengding Hu
Zhiyuan Liu
Nigel Collier
AuLLM
147
59
0
08 Apr 2023
Datamator: An Intelligent Authoring Tool for Creating Datamations via Data Query Decomposition
Yi Guo
Nana Cao
Ligan Cai
Yanqiu Wu
Daniel Weiskopf
Danqing Shi
Qing Chen
194
2
0
06 Apr 2023
About optimal loss function for training physics-informed neural networks under respecting causality
V. A. Es'kin
Danil V. Davydov
Ekaterina D. Egorova
Alexey O. Malkhanov
Mikhail A. Akhukov
Mikhail E. Smorkalov
PINN
233
8
0
05 Apr 2023
Synthetically generated text for supervised text analysis
Political Analysis (PA), 2023
Andrew Halterman
DeLMO
147
12
0
28 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Muhammad Usama
Junaid Qadir
424
66
0
21 Mar 2023
Mutation-Based Adversarial Attacks on Neural Text Detectors
G. Liang
Jesus Guerrero
I. Alsmadi
DeLMO
174
11
0
11 Feb 2023
ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine Learning Model for Detecting Short ChatGPT-generated Text
Sandra Mitrović
Davide Andreoletti
Omran Ayoub
DeLMO
156
181
0
30 Jan 2023
CUNI Systems for the WMT22 Czech-Ukrainian Translation Task
Conference on Machine Translation (WMT), 2022
Martin Popel
Jindrich Libovický
Jindřich Helcl
132
6
0
01 Dec 2022
QNet: A Quantum-native Sequence Encoder Architecture
International Conference on Quantum Computing and Engineering (ICQCE), 2022
Wei-Yen Day
Hao-Sheng Chen
Min Sun
231
1
0
31 Oct 2022
Tools for Extracting Spatio-Temporal Patterns in Meteorological Image Sequences: From Feature Engineering to Attention-Based Neural Networks
A. S. Bansal
Yoonjin Lee
Kyle Hilburn
I. Ebert‐Uphoff
AI4TS
284
2
0
22 Oct 2022
On the Explainability of Natural Language Processing Deep Models
ACM Computing Surveys (ACM CSUR), 2022
Julia El Zini
M. Awad
232
109
0
13 Oct 2022
PARAGEN: A Parallel Generation Toolkit
Jiangtao Feng
Yi Zhou
Jun Zhang
Xian Qian
Liwei Wu
Zhexi Zhang
Yanming Liu
Mingxuan Wang
Lei Li
Hao Zhou
VLM
181
3
0
07 Oct 2022
A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion
International Conference on Neural Information Processing (ICONIP), 2022
Muhan Na
Rui Liu
Feilong
Guanglai Gao
121
0
0
24 Sep 2022
Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
International Conference on Machine Learning (ICML), 2022
Lily H. Zhang
Veronica Tozzo
J. Higgins
Rajesh Ranganath
BDL
MoE
226
24
0
23 Jun 2022
B2T Connection: Serving Stability and Performance in Deep Transformers
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sho Takase
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
302
15
0
01 Jun 2022
How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing
Artificial Intelligence Review (Artif Intell Rev), 2022
Samuel Sousa
Roman Kern
PILM
AILaw
190
58
0
20 May 2022
Optimizing Mixture of Experts using Dynamic Recompilations
Ferdinand Kossmann
Zhihao Jia
A. Aiken
218
5
0
04 May 2022
Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
International Conference on Language Resources and Evaluation (LREC), 2022
Idris Abdulmumin
S. Dash
Musa Abdullahi Dawud
Shantipriya Parida
Shamsuddeen Hassan Muhammad
Ibrahim Said Ahmad
Subhadarshi Panda
Ondrej Bojar
B. Galadanci
Bello Shehu Bello
263
21
0
02 May 2022
NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural Language Understanding in Task-Oriented Dialogue
I. Casanueva
Ivan Vulić
Georgios P. Spithourakis
Paweł Budzianowski
229
16
0
27 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
International Conference on Language Resources and Evaluation (LREC), 2022
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
277
9
0
11 Apr 2022
Scaling Up Models and Data with t5x and seqio
Journal of machine learning research (JMLR), 2022
Adam Roberts
Hyung Won Chung
Anselm Levskaya
Gaurav Mishra
James Bradbury
...
Brennan Saeta
Ryan Sepassi
A. Spiridonov
Joshua Newlan
Andrea Gesmundo
ALM
269
211
0
31 Mar 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
International Conference on Machine Learning (ICML), 2022
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
224
75
0
15 Feb 2022
Capitalization and Punctuation Restoration: a Survey
Artificial Intelligence Review (AIR), 2021
V. Pais
D. Tufis
202
21
0
21 Nov 2021
Benchmarking and scaling of deep learning models for land cover image classification
Ioannis Papoutsis
Nikolaos Ioannis Bountos
Angelos Zavras
Dimitrios Michail
Christos Tryfonopoulos
421
70
0
18 Nov 2021
Say What? Collaborative Pop Lyric Generation Using Multitask Transfer Learning
International Conference on Human-Agent Interaction (HAI), 2021
Naveen Ram
Tanay Gummadi
Rahul Bhethanabotla
Richard J. Savery
Gil Weinberg
159
9
0
15 Nov 2021
Leveraging redundancy in attention with Reuse Transformers
Srinadh Bhojanapalli
Ayan Chakrabarti
Andreas Veit
Michal Lukasik
Himanshu Jain
Frederick Liu
Yin-Wen Chang
Sanjiv Kumar
147
36
0
13 Oct 2021
The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation
Orevaoghene Ahia
Julia Kreutzer
Sara Hooker
302
58
0
06 Oct 2021
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mańke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
395
182
0
17 Sep 2021
Miðeind's WMT 2021 submission
Haukur Barri Símonarson
Vésteinn Snæbjarnarson
Pétur Orri Ragnarsson
Haukur Páll Jónsson
Vilhjálmur Þorsteinsson
VLM
121
13
0
15 Sep 2021