Neural Machine Translation by Jointly Learning to Align and Translate

1 September 2014

Papers citing "Neural Machine Translation by Jointly Learning to Align and Translate"

Showing 50 of 6,152 papers

Text Serialization and Their Relationship with the Conventional Paradigms of Tabular Machine Learning Kyoka Ono Simon A. Lee LMTD 24 7 0 19 Jun 2024
A Primal-Dual Framework for Transformers and Neural Networks Tan M. Nguyen Tam Nguyen Nhat Ho Andrea L. Bertozzi Richard G. Baraniuk Stanley J. Osher ViT 29 13 0 19 Jun 2024
Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis R. Teo Tan M. Nguyen 45 4 0 19 Jun 2024
Self-Supervised Time-Series Anomaly Detection Using Learnable Data Augmentation K. Choi Jihun Yi J. Mok Sungroh Yoon 35 1 0 18 Jun 2024
A Survey on Human Preference Learning for Large Language Models Ruili Jiang Kehai Chen Xuefeng Bai Zhixuan He Juntao Li Muyun Yang Tiejun Zhao Liqiang Nie Min Zhang 49 8 0 17 Jun 2024
Multiple Sources are Better Than One: Incorporating External Knowledge in Low-Resource Glossing Changbing Yang Garrett Nicolai Miikka Silfverberg 37 1 0 16 Jun 2024
SynthTree: Co-supervised Local Model Synthesis for Explainable Prediction Evgenii Kuriabov Jia Li 35 0 0 16 Jun 2024
The Rise and Fall(?) of Software Engineering Antonio Mastropaolo Camilo Escobar-Velásquez Mario Linares-Vásquez 35 2 0 14 Jun 2024
Investigating the translation capabilities of Large Language Models trained on parallel data only Javier García Gilabert Carlos Escolano Aleix Sant Savall Francesca de Luca Fornaciari Audrey Mash Xixian Liao Maite Melero LRM 42 2 0 13 Jun 2024
Meta-Learning an Evolvable Developmental Encoding Milton L. Montero Erwan Plantec Eleni Nisioti J. Pedersen Sebastian Risi 40 0 0 13 Jun 2024
MMIL: A novel algorithm for disease associated cell type discovery Erin Craig Timothy Keyes J. Sarno Maxim E. Zaslavsky Garry Nolan Kara Davis Trevor Hastie Robert Tibshirani 20 0 0 12 Jun 2024
Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey Feng Liang Zhen Zhang Haifeng Lu Chengming Li Victor C. M. Leung Yanyi Guo Xiping Hu 45 3 0 12 Jun 2024
DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning Yuxi Feng Raymond Li Zhenan Fan Giuseppe Carenini Mohammadreza Pourreza Weiwei Zhang Yong Zhang 34 1 0 12 Jun 2024
An Empirical Study of Mamba-based Language Models R. Waleffe Wonmin Byeon Duncan Riach Brandon Norick V. Korthikanti ... Vartika Singh Jared Casper Jan Kautz M. Shoeybi Bryan Catanzaro 63 65 0 12 Jun 2024
Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model Elaheh Baharlouei Mahsa Shafaei Yigeng Zhang Hugo Jair Escalante Thamar Solorio 51 0 0 12 Jun 2024
Transformer Models in Education: Summarizing Science Textbooks with AraBART, MT5, AraT5, and mBART Sari Masri Yaqeen Raddad Fidaa Khandaqji Huthaifa I. Ashqar Mohammed Elhenawy 36 5 0 11 Jun 2024
TIM: Temporal Interaction Model in Notification System Huxiao Ji Haitao Yang Linchuan Li Shunyu Zhang Cunyi Zhang Xuanping Li Wenwu Ou 34 0 0 11 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling Liliang Ren Yang Liu Yadong Lu Yelong Shen Chen Liang Weizhu Chen Mamba 77 57 0 11 Jun 2024
Continuum Attention for Neural Operators Edoardo Calvello Nikola B. Kovachki Matthew E. Levine Andrew M. Stuart 36 10 0 10 Jun 2024
Symmetric Dot-Product Attention for Efficient Training of BERT Language Models Martin Courtois Malte Ostendorff Leonhard Hennig Georg Rehm 39 2 0 10 Jun 2024
Explainable AI for Mental Disorder Detection via Social Media: A survey and outlook Yusif Ibrahimov Tarique Anwar Tommy Yuan 39 3 0 10 Jun 2024
Recent advancements in computational morphology : A comprehensive survey Jatayu Baxi Brijesh S. Bhatt AI4CE 43 1 0 08 Jun 2024
Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications Zhou Zhou Guohang He Zheng Zhang Luziwei Leng Qinghai Guo Jianxing Liao Xuan Song Ran Cheng 47 2 0 08 Jun 2024
L-SFAN: Lightweight Spatially-focused Attention Network for Pain Behavior Detection Jorge Ortigoso-Narro F. Díaz-de-María Mohammad Mahdi Dehshibi Ana Tajadura-Jiménez 43 1 0 07 Jun 2024
Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors Tam Thuc Do Parham Eftekhar Seyed Alireza Hosseini Gene Cheung Philip A. Chou 31 1 0 06 Jun 2024
XL-HeadTags: Leveraging Multimodal Retrieval Augmentation for the Multilingual Generation of News Headlines and Tags Faisal Tareque Shohan Mir Tafseer Nayeem Samsul Islam Abu Ubaida Akash Chenyu You 42 2 0 06 Jun 2024
Enhancing CTC-based speech recognition with diverse modeling units Shiyi Han Zhihong Lei Mingbin Xu Xingyu Na Zhen Huang 41 0 0 05 Jun 2024
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Brian K Chen Tianyang Hu Hui Jin Hwee Kuan Lee Kenji Kawaguchi 55 0 0 05 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference Namgyu Ho Sangmin Bae Taehyeon Kim Hyunjik Jo Yireun Kim Tal Schuster Adam Fisch James Thorne Se-Young Yun 47 8 0 04 Jun 2024
Universal In-Context Approximation By Prompting Fully Recurrent Models Aleksandar Petrov Tom A. Lamb Alasdair Paren Philip Torr Adel Bibi LRM 32 0 0 03 Jun 2024
3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information Sihan Wen Xiantan Zhu Zhiming Tan 3DH 42 0 0 03 Jun 2024
MultiMax: Sparse and Multi-Modal Attention Learning Yuxuan Zhou Mario Fritz M. Keuper 42 1 0 03 Jun 2024
A Synergistic Approach In Network Intrusion Detection By Neurosymbolic AI Alice Bizzarri Chung-En Yu B. Jalaeian Fabrizio Riguzzi Nathaniel D. Bastian AAML 29 2 0 03 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model Khaled Alomar Halil Ibrahim Aysel Xiaohao Cai MedIm ViT 43 7 0 02 Jun 2024
Pseudo-label Based Domain Adaptation for Zero-Shot Text Steganalysis Yufei Luo Zhen Yang Ru Zhang Jianyi Liu 20 0 0 01 Jun 2024
RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis Md. Mostafizer Rahman Ariful Islam Shiplu Yutaka Watanobe Md. Ashad Alam 30 10 0 01 Jun 2024
Recurrent neural networks: vanishing and exploding gradients are not the end of the story Nicolas Zucchet Antonio Orvieto ODL AAML 45 9 0 31 May 2024
P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation Qi Zhang Guohua Geng Long-He Yan Pengbo Zhou Zhaodi Li Kang Li Qinglin Liu DiffM 40 1 0 30 May 2024
Training-efficient density quantum machine learning Brian Coyle El Amine Cherrat Nishant Jain Natansh Mathur Snehal Raj Skander Kazdaghli Iordanis Kerenidis 47 5 0 30 May 2024
Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective Chenze Shao Fandong Meng Jiali Zeng Jie Zhou 18 0 0 29 May 2024
Contextual Position Encoding: Learning to Count What's Important O. Yu. Golovneva Tianlu Wang Jason Weston Sainbayar Sukhbaatar 53 25 0 29 May 2024
Prototype Analysis in Hopfield Networks with Hebbian Learning Hayden McAlister Anthony Robins Lech Szymanski 24 2 0 29 May 2024
Understanding Transformer Reasoning Capabilities via Graph Algorithms Clayton Sanford Bahare Fatemi Ethan Hall Anton Tsitsulin Seyed Mehran Kazemi Jonathan J. Halcrow Bryan Perozzi Vahab Mirrokni 46 30 0 28 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention Zhen Qin Weigao Sun Dong Li Xuyang Shen Weixuan Sun Yiran Zhong 46 9 0 27 May 2024
Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration Juan C. Pérez Alejandro Pardo Mattia Soldan Hani Itani Juan Carlos León Alcázar Guohao Li 32 2 0 27 May 2024
The Multi-Range Theory of Translation Quality Measurement: MQM scoring models and Statistical Quality Control A. Lommel Serge Gladkoff Alan Melby Sue Ellen Wright Ingemar Strandvik ... Romina Marazzato Sparano Monica Foresi Johani Innis Lifeng Han Goran Nenadic 38 2 0 27 May 2024
SoK: Leveraging Transformers for Malware Analysis Pradip Kunwar Kshitiz Aryal Maanak Gupta Mahmoud Abdelsalam Elisa Bertino 90 0 0 27 May 2024
Active Learning for Finely-Categorized Image-Text Retrieval by Selecting Hard Negative Unpaired Samples D. Jo Kyuewang Lee Jaeho Chung Jin Young Choi 24 0 0 25 May 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers Lorenzo Tiberi Francesca Mignacco Kazuki Irie H. Sompolinsky 44 6 0 24 May 2024
Optimizing Large Language Models for OpenAPI Code Completion Bohdan Petryshyn M. Lukoševičius LLMAG ALM 40 0 0 24 May 2024