Neural Machine Translation by Jointly Learning to Align and Translate


1 September 2014
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
    AIMat

Papers citing "Neural Machine Translation by Jointly Learning to Align and Translate"

50 / 6,152 papers shown
Text Serialization and Their Relationship with the Conventional Paradigms of Tabular Machine Learning
Kyoka Ono
Simon A. Lee
LMTD
24
7
0
19 Jun 2024
A Primal-Dual Framework for Transformers and Neural Networks
Tan M. Nguyen
Tam Nguyen
Nhat Ho
Andrea L. Bertozzi
Richard G. Baraniuk
Stanley J. Osher
ViT
29
13
0
19 Jun 2024
Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis
R. Teo
Tan M. Nguyen
45
4
0
19 Jun 2024
Self-Supervised Time-Series Anomaly Detection Using Learnable Data Augmentation
K. Choi
Jihun Yi
J. Mok
Sungroh Yoon
35
1
0
18 Jun 2024
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
49
8
0
17 Jun 2024
Multiple Sources are Better Than One: Incorporating External Knowledge in Low-Resource Glossing
Changbing Yang
Garrett Nicolai
Miikka Silfverberg
37
1
0
16 Jun 2024
SynthTree: Co-supervised Local Model Synthesis for Explainable Prediction
Evgenii Kuriabov
Jia Li
35
0
0
16 Jun 2024
The Rise and Fall(?) of Software Engineering
Antonio Mastropaolo
Camilo Escobar-Velásquez
Mario Linares-Vásquez
35
2
0
14 Jun 2024
Investigating the translation capabilities of Large Language Models trained on parallel data only
Javier García Gilabert
Carlos Escolano
Aleix Sant Savall
Francesca de Luca Fornaciari
Audrey Mash
Xixian Liao
Maite Melero
LRM
42
2
0
13 Jun 2024
Meta-Learning an Evolvable Developmental Encoding
Milton L. Montero
Erwan Plantec
Eleni Nisioti
J. Pedersen
Sebastian Risi
40
0
0
13 Jun 2024
MMIL: A novel algorithm for disease associated cell type discovery
Erin Craig
Timothy Keyes
J. Sarno
Maxim E. Zaslavsky
Garry Nolan
Kara Davis
Trevor Hastie
Robert Tibshirani
20
0
0
12 Jun 2024
Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Chengming Li
Victor C. M. Leung
Yanyi Guo
Xiping Hu
45
3
0
12 Jun 2024
DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning
Yuxi Feng
Raymond Li
Zhenan Fan
Giuseppe Carenini
Mohammadreza Pourreza
Weiwei Zhang
Yong Zhang
34
1
0
12 Jun 2024
An Empirical Study of Mamba-based Language Models
R. Waleffe
Wonmin Byeon
Duncan Riach
Brandon Norick
V. Korthikanti
...
Vartika Singh
Jared Casper
Jan Kautz
M. Shoeybi
Bryan Catanzaro
63
65
0
12 Jun 2024
Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model
Elaheh Baharlouei
Mahsa Shafaei
Yigeng Zhang
Hugo Jair Escalante
Thamar Solorio
51
0
0
12 Jun 2024
Transformer Models in Education: Summarizing Science Textbooks with AraBART, MT5, AraT5, and mBART
Sari Masri
Yaqeen Raddad
Fidaa Khandaqji
Huthaifa I. Ashqar
Mohammed Elhenawy
36
5
0
11 Jun 2024
TIM: Temporal Interaction Model in Notification System
Huxiao Ji
Haitao Yang
Linchuan Li
Shunyu Zhang
Cunyi Zhang
Xuanping Li
Wenwu Ou
34
0
0
11 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
77
57
0
11 Jun 2024
Continuum Attention for Neural Operators
Edoardo Calvello
Nikola B. Kovachki
Matthew E. Levine
Andrew M. Stuart
36
10
0
10 Jun 2024
Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
Martin Courtois
Malte Ostendorff
Leonhard Hennig
Georg Rehm
39
2
0
10 Jun 2024
Explainable AI for Mental Disorder Detection via Social Media: A survey and outlook
Yusif Ibrahimov
Tarique Anwar
Tommy Yuan
39
3
0
10 Jun 2024
Recent advancements in computational morphology : A comprehensive survey
Jatayu Baxi
Brijesh S. Bhatt
AI4CE
43
1
0
08 Jun 2024
Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications
Zhou Zhou
Guohang He
Zheng Zhang
Luziwei Leng
Qinghai Guo
Jianxing Liao
Xuan Song
Ran Cheng
47
2
0
08 Jun 2024
L-SFAN: Lightweight Spatially-focused Attention Network for Pain Behavior Detection
Jorge Ortigoso-Narro
F. Díaz-de-María
Mohammad Mahdi Dehshibi
Ana Tajadura-Jiménez
43
1
0
07 Jun 2024
Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors
Tam Thuc Do
Parham Eftekhar
Seyed Alireza Hosseini
Gene Cheung
Philip A. Chou
31
1
0
06 Jun 2024
XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags
Faisal Tareque Shohan
Mir Tafseer Nayeem
Samsul Islam
Abu Ubaida Akash
Chenyu You
42
2
0
06 Jun 2024
Enhancing CTC-based speech recognition with diverse modeling units
Shiyi Han
Zhihong Lei
Mingbin Xu
Xingyu Na
Zhen Huang
41
0
0
05 Jun 2024
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers
Brian K Chen
Tianyang Hu
Hui Jin
Hwee Kuan Lee
Kenji Kawaguchi
55
0
0
05 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Namgyu Ho
Sangmin Bae
Taehyeon Kim
Hyunjik Jo
Yireun Kim
Tal Schuster
Adam Fisch
James Thorne
Se-Young Yun
47
8
0
04 Jun 2024
Universal In-Context Approximation By Prompting Fully Recurrent Models
Aleksandar Petrov
Tom A. Lamb
Alasdair Paren
Philip Torr
Adel Bibi
LRM
32
0
0
03 Jun 2024
3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information
Sihan Wen
Xiantan Zhu
Zhiming Tan
3DH
42
0
0
03 Jun 2024
MultiMax: Sparse and Multi-Modal Attention Learning
Yuxuan Zhou
Mario Fritz
M. Keuper
42
1
0
03 Jun 2024
A Synergistic Approach In Network Intrusion Detection By Neurosymbolic AI
Alice Bizzarri
Chung-En Yu
B. Jalaeian
Fabrizio Riguzzi
Nathaniel D. Bastian
AAML
29
2
0
03 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
43
7
0
02 Jun 2024
Pseudo-label Based Domain Adaptation for Zero-Shot Text Steganalysis
Yufei Luo
Zhen Yang
Ru Zhang
Jianyi Liu
20
0
0
01 Jun 2024
RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis
Md. Mostafizer Rahman
Ariful Islam Shiplu
Yutaka Watanobe
Md. Ashad Alam
30
10
0
01 Jun 2024
Recurrent neural networks: vanishing and exploding gradients are not the end of the story
Nicolas Zucchet
Antonio Orvieto
ODL
AAML
45
9
0
31 May 2024
P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation
Qi Zhang
Guohua Geng
Long-He Yan
Pengbo Zhou
Zhaodi Li
Kang Li
Qinglin Liu
DiffM
40
1
0
30 May 2024
Training-efficient density quantum machine learning
Brian Coyle
El Amine Cherrat
Nishant Jain
Natansh Mathur
Snehal Raj
Skander Kazdaghli
Iordanis Kerenidis
47
5
0
30 May 2024
Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective
Chenze Shao
Fandong Meng
Jiali Zeng
Jie Zhou
18
0
0
29 May 2024
Contextual Position Encoding: Learning to Count What's Important
O. Yu. Golovneva
Tianlu Wang
Jason Weston
Sainbayar Sukhbaatar
53
25
0
29 May 2024
Prototype Analysis in Hopfield Networks with Hebbian Learning
Hayden McAlister
Anthony Robins
Lech Szymanski
24
2
0
29 May 2024
Understanding Transformer Reasoning Capabilities via Graph Algorithms
Clayton Sanford
Bahare Fatemi
Ethan Hall
Anton Tsitsulin
Seyed Mehran Kazemi
Jonathan J. Halcrow
Bryan Perozzi
Vahab Mirrokni
46
30
0
28 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
46
9
0
27 May 2024
Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration
Juan C. Pérez
Alejandro Pardo
Mattia Soldan
Hani Itani
Juan Carlos León Alcázar
Guohao Li
32
2
0
27 May 2024
The Multi-Range Theory of Translation Quality Measurement: MQM scoring models and Statistical Quality Control
A. Lommel
Serge Gladkoff
Alan Melby
Sue Ellen Wright
Ingemar Strandvik
...
Romina Marazzato Sparano
Monica Foresi
Johani Innis
Lifeng Han
Goran Nenadic
38
2
0
27 May 2024
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
90
0
0
27 May 2024
Active Learning for Finely-Categorized Image-Text Retrieval by Selecting Hard Negative Unpaired Samples
D. Jo
Kyuewang Lee
Jaeho Chung
Jin Young Choi
24
0
0
25 May 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
Lorenzo Tiberi
Francesca Mignacco
Kazuki Irie
H. Sompolinsky
44
6
0
24 May 2024
Optimizing Large Language Models for OpenAPI Code Completion
Bohdan Petryshyn
M. Lukoševičius
LLMAG
ALM
40
0
0
24 May 2024