Tensor2Tensor for Neural Machine Translation

16 March 2018
Ashish Vaswani
Samy Bengio
E. Brevdo
François Chollet
Aidan N. Gomez
Stephan Gouws
Llion Jones
Lukasz Kaiser
Nal Kalchbrenner
Niki Parmar
Ryan Sepassi
Noam M. Shazeer
Jakob Uszkoreit

Papers citing "Tensor2Tensor for Neural Machine Translation"

50 / 261 papers shown
Efficient Time Series Forecasting via Hyper-Complex Models and Frequency Aggregation
Eyal Yakir
Dor Tsur
H. Permuter
AI4TS
66
0
0
27 Feb 2025
EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning
Siddharth Aravindan
Dixant Mittal
Wee Sun Lee
BDL
79
0
0
17 Jan 2025
Domain adapted machine translation: What does catastrophic forgetting forget and why?
Danielle Saunders
Steve DeNeefe
AI4CE
26
0
0
23 Dec 2024
Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch
Donglin Di
Weinan Zhang
Yue Zhang
Fanglin Wang
23
1
0
24 Oct 2024
A survey of neural-network-based methods utilising comparable data for finding translation equivalents
Michaela Denisová
Pavel Rychlý
24
0
0
19 Oct 2024
Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations
Zhicheng Ren
Zhiping Xiao
Yizhou Sun
38
0
0
04 Sep 2024
DLP: towards active defense against backdoor attacks with decoupled learning process
Zonghao Ying
Bin Wu
AAML
44
6
0
18 Jun 2024
Separable Physics-Informed Neural Networks for the solution of elasticity problems
V. A. Es'kin
Danil V. Davydov
Julia V. Guréva
Alexey O. Malkhanov
Mikhail E. Smorkalov
PINN
AI4CE
20
2
0
24 Jan 2024
Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines
Stephen Lawrence Bothwell
Justin DeBenedetto
Theresa Crnkovich
Hildegund Müller
David Chiang
ObjD
19
2
0
30 Nov 2023
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
Nadezhda Chirkova
Sergey Troshin
21
8
0
01 Aug 2023
3D Medical Image Segmentation based on multi-scale MPU-Net
Zeqiu Yu
Shuo Han
Ziheng Song
3DV
11
3
0
11 Jul 2023
Urania: Visualizing Data Analysis Pipelines for Natural Language-Based Data Exploration
Yi Guo
Nana Cao
Xiaoyu Qi
Haoyang Li
Danqing Shi
Jing Zhang
Qing Chen
Daniel Weiskopf
19
4
0
13 Jun 2023
HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language
Shantipriya Parida
Idris Abdulmumin
Shamsuddeen Hassan Muhammad
Aneesh Bose
Guneet Singh Kohli
I. Ahmad
Ketan Kotwal
S. Sarkar
Ondrej Bojar
Habeebah Adamu Kakudi
22
4
0
28 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurélien Lucchi
Thomas Hofmann
34
53
0
25 May 2023
Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation
Zhuoyuan Mao
Raj Dabre
Qianying Liu
Haiyue Song
Chenhui Chu
Sadao Kurohashi
11
7
0
16 May 2023
AttentionViz: A Global View of Transformer Attention
Catherine Yeh
Yida Chen
Aoyu Wu
Cynthia Chen
Fernanda Viégas
Martin Wattenberg
ViT
33
52
0
04 May 2023
string2string: A Modern Python Library for String-to-String Algorithms
Mirac Suzgun
Stuart M. Shieber
Dan Jurafsky
39
7
0
27 Apr 2023
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
Z. Fu
W. Lam
Qian Yu
Anthony Man-Cho So
Shengding Hu
Zhiyuan Liu
Nigel Collier
AuLLM
28
41
0
08 Apr 2023
Datamator: An Intelligent Authoring Tool for Creating Datamations via Data Query Decomposition
Yi Guo
Nana Cao
Ligan Cai
Yanqiu Wu
Daniel Weiskopf
Danqing Shi
Qing Chen
23
1
0
06 Apr 2023
About optimal loss function for training physics-informed neural networks under respecting causality
V. A. Es'kin
Danil V. Davydov
Ekaterina D. Egorova
Alexey O. Malkhanov
Mikhail A. Akhukov
Mikhail E. Smorkalov
PINN
16
7
0
05 Apr 2023
Synthetically generated text for supervised text analysis
Andrew Halterman
DeLMO
32
6
0
28 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
Mutation-Based Adversarial Attacks on Neural Text Detectors
G. Liang
Jesus Guerrero
I. Alsmadi
DeLMO
22
7
0
11 Feb 2023
ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine Learning Model for Detecting Short ChatGPT-generated Text
Sandra Mitrović
Davide Andreoletti
Omran Ayoub
DeLMO
19
144
0
30 Jan 2023
CUNI Systems for the WMT22 Czech-Ukrainian Translation Task
Martin Popel
Jindrich Libovický
Jindřich Helcl
19
4
0
01 Dec 2022
QNet: A Quantum-native Sequence Encoder Architecture
Wei-Yen Day
Hao-Sheng Chen
Min Sun
21
0
0
31 Oct 2022
Tools for Extracting Spatio-Temporal Patterns in Meteorological Image Sequences: From Feature Engineering to Attention-Based Neural Networks
A. S. Bansal
Yoonjin Lee
Kyle Hilburn
I. Ebert‐Uphoff
AI4TS
31
2
0
22 Oct 2022
On the Explainability of Natural Language Processing Deep Models
Julia El Zini
M. Awad
27
82
0
13 Oct 2022
PARAGEN : A Parallel Generation Toolkit
Jiangtao Feng
Yi Zhou
Jun Zhang
Xian Qian
Liwei Wu
Zhexi Zhang
Yanming Liu
Mingxuan Wang
Lei Li
Hao Zhou
VLM
30
3
0
07 Oct 2022
A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion
Muhan Na
Rui Liu
Feilong
Guanglai Gao
25
0
0
24 Sep 2022
Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
Lily H. Zhang
Veronica Tozzo
J. Higgins
Rajesh Ranganath
BDL
MoE
19
16
0
23 Jun 2022
B2T Connection: Serving Stability and Performance in Deep Transformers
Sho Takase
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
11
10
0
01 Jun 2022
How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing
Samuel Sousa
Roman Kern
PILM
AILaw
20
39
0
20 May 2022
Optimizing Mixture of Experts using Dynamic Recompilations
Ferdinand Kossmann
Zhihao Jia
A. Aiken
21
5
0
04 May 2022
Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
Idris Abdulmumin
S. Dash
Musa Abdullahi Dawud
Shantipriya Parida
Shamsuddeen Hassan Muhammad
I. Ahmad
Subhadarshi Panda
Ondrej Bojar
B. Galadanci
Bello Shehu Bello
16
16
0
02 May 2022
NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural Language Understanding in Task-Oriented Dialogue
I. Casanueva
Ivan Vulić
Georgios P. Spithourakis
Paweł Budzianowski
25
10
0
27 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
26
6
0
11 Apr 2022
Scaling Up Models and Data with t5x and seqio
Adam Roberts
Hyung Won Chung
Anselm Levskaya
Gaurav Mishra
James Bradbury
...
Brennan Saeta
Ryan Sepassi
A. Spiridonov
Joshua Newlan
Andrea Gesmundo
ALM
43
193
0
31 Mar 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
35
65
0
15 Feb 2022
Capitalization and Punctuation Restoration: a Survey
V. Pais
D. Tufis
17
19
0
21 Nov 2021
Benchmarking and scaling of deep learning models for land cover image classification
Ioannis Papoutsis
N. Bountos
Angelos Zavras
Dimitrios Michail
Christos Tryfonopoulos
13
55
0
18 Nov 2021
Say What? Collaborative Pop Lyric Generation Using Multitask Transfer Learning
Naveen Ram
Tanay Gummadi
Rahul Bhethanabotla
Richard J. Savery
Gil Weinberg
15
9
0
15 Nov 2021
Leveraging redundancy in attention with Reuse Transformers
Srinadh Bhojanapalli
Ayan Chakrabarti
Andreas Veit
Michal Lukasik
Himanshu Jain
Frederick Liu
Yin-Wen Chang
Sanjiv Kumar
18
23
0
13 Oct 2021
The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation
Orevaoghene Ahia
Julia Kreutzer
Sara Hooker
107
51
0
06 Oct 2021
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mańke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
85
152
0
17 Sep 2021
Miðeind's WMT 2021 submission
Haukur Barri Símonarson
Vésteinn Snæbjarnarson
Pétur Orri Ragnarsson
Haukur Páll Jónsson
Vilhjálmur Þorsteinsson
VLM
18
11
0
15 Sep 2021
CrossedWires: A Dataset of Syntactically Equivalent but Semantically Disparate Deep Learning Models
Max Zvyagin
Thomas Brettin
Arvind Ramanathan
Sumit Kumar Jha
14
1
0
29 Aug 2021
YANMTT: Yet Another Neural Machine Translation Toolkit
Raj Dabre
Eiichiro Sumita
31
13
0
25 Aug 2021
Compositional Generalization in Multilingual Semantic Parsing over Wikidata
Ruixiang Cui
Rahul Aralikatte
Heather Lent
Daniel Hershcovich
34
11
0
07 Aug 2021
Residual Tree Aggregation of Layers for Neural Machine Translation
Guoliang Li
Yiyang Li
35
0
0
19 Jul 2021