Tensor2Tensor for Neural Machine Translation

16 March 2018

Papers citing "Tensor2Tensor for Neural Machine Translation"

Facilitating Cognitive Accessibility with LLMs: A Multi-Task Approach to Easy-to-Read Text Generation

01 Oct 2025

Large Language Models for Summarizing Czech Historical Documents and Beyond

International Conference on Agents and Artificial Intelligence (ICAART), 2025

130

14 Aug 2025

Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic Forecast

263

19 May 2025

A Local Polyak-Lojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models

276

16 May 2025

Efficient Time Series Forecasting via Hyper-Complex Models and Frequency Aggregation

353

27 Feb 2025

EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning

Asian Conference on Machine Learning (ACML), 2025

302

17 Jan 2025

Domain adapted machine translation: What does catastrophic forgetting forget and why?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Danielle Saunders

Steve DeNeefe

AI4CE

124

23 Dec 2024

Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch

303

24 Oct 2024

A survey of neural-network-based methods utilising comparable data for finding translation equivalents

Michaela Denisová

Pavel Rychlý

238

19 Oct 2024

Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations

Zhicheng Ren

Zhiping Xiao

Luke Huan

272

04 Sep 2024

DLP: towards active defense against backdoor attacks with decoupled learning process

Zonghao Ying

Bin Wu

AAML

303

18 Jun 2024

Separable Physics-Informed Neural Networks for the solution of elasticity problems

302

24 Jan 2024

Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Stephen Lawrence Bothwell

323

30 Nov 2023

CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code

International Conference on Learning Representations (ICLR), 2023

Nadezhda Chirkova

Sergey Troshin

242

01 Aug 2023

3D Medical Image Segmentation based on multi-scale MPU-Net

211

11 Jul 2023

Urania: Visualizing Data Analysis Pipelines for Natural Language-Based Data Exploration

Jing Zhang

Daniel Weiskopf

164

13 Jun 2023

HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Shantipriya Parida

Idris Abdulmumin

Shamsuddeen Hassan Muhammad

Habeebah Adamu Kakudi

294

28 May 2023

Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers

Neural Information Processing Systems (NeurIPS), 2023

367

25 May 2023

Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Qianying Liu

Sadao Kurohashi

145

16 May 2023

AttentionViz: A Global View of Transformer Attention

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023

324

04 May 2023

string2string: A Modern Python Library for String-to-String Algorithms

Mirac Suzgun

Stuart M. Shieber

Dan Jurafsky

172

27 Apr 2023

Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder

Zhiyuan Liu

167

08 Apr 2023

Datamator: An Intelligent Authoring Tool for Creating Datamations via Data Query Decomposition

Daniel Weiskopf

244

06 Apr 2023

About optimal loss function for training physics-informed neural networks under respecting causality

246

05 Apr 2023

Synthetically generated text for supervised text analysis

Political Analysis (PA), 2023

Andrew Halterman

DeLMO

156

28 Mar 2023

Transformers in Speech Processing: A Survey

448

21 Mar 2023

Mutation-Based Adversarial Attacks on Neural Text Detectors

195

11 Feb 2023

ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine Learning Model for Detecting Short ChatGPT-generated Text

165

183

30 Jan 2023

CUNI Systems for the WMT22 Czech-Ukrainian Translation Task

Conference on Machine Translation (WMT), 2022

Martin Popel

Jindřich Libovický

Jindřich Helcl

134

01 Dec 2022

QNet: A Quantum-native Sequence Encoder Architecture

International Conference on Quantum Computing and Engineering (ICQCE), 2022

Wei-Yen Day

Hao-Sheng Chen

Min Sun

248

31 Oct 2022

Tools for Extracting Spatio-Temporal Patterns in Meteorological Image Sequences: From Feature Engineering to Attention-Based Neural Networks

288

22 Oct 2022

On the Explainability of Natural Language Processing Deep Models

ACM Computing Surveys (ACM CSUR), 2022

Julia El Zini

M. Awad

244

110

13 Oct 2022

PARAGEN : A Parallel Generation Toolkit

Lei Li

191

07 Oct 2022

A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion

International Conference on Neural Information Processing (ICONIP), 2022

133

24 Sep 2022

Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets

International Conference on Machine Learning (ICML), 2022

261

23 Jun 2022

B2T Connection: Serving Stability and Performance in Deep Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

315

01 Jun 2022

How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing

Artificial Intelligence Review (Artif Intell Rev), 2022

Samuel Sousa

Roman Kern

PILM AILaw

212

20 May 2022

Optimizing Mixture of Experts using Dynamic Recompilations

Ferdinand Kossmann

Zhihao Jia

A. Aiken

242

04 May 2022

Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation

International Conference on Language Resources and Evaluation (LREC), 2022

Shamsuddeen Hassan Muhammad

279

02 May 2022

NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural Language Understanding in Task-Oriented Dialogue

I. Casanueva

Ivan Vulić

Georgios P. Spithourakis

Paweł Budzianowski

248

27 Apr 2022

A Call for Clarity in Beam Search: How It Works and When It Stops

International Conference on Language Resources and Evaluation (LREC), 2022

Keisuke Sakaguchi

Yejin Choi

286

11 Apr 2022

Scaling Up Models and Data with $\texttt{t5x}$ and $\texttt{seqio}$

Journal of Machine Learning Research (JMLR), 2022

...

289

213

31 Mar 2022

General-purpose, long-context autoregressive modeling with Perceiver AR

International Conference on Machine Learning (ICML), 2022

...

Jean-Baptiste Alayrac

João Carreira

Jesse Engel

237

15 Feb 2022

Capitalization and Punctuation Restoration: a Survey

Artificial Intelligence Review (AIR), 2021

V. Pais

D. Tufis

209

21 Nov 2021

Benchmarking and scaling of deep learning models for land cover image classification

Ioannis Papoutsis

Nikolaos Ioannis Bountos

Angelos Zavras

Dimitrios Michail

Christos Tryfonopoulos

450

18 Nov 2021

Say What? Collaborative Pop Lyric Generation Using Multitask Transfer Learning

International Conference on Human-Agent Interaction (HAI), 2021

176

15 Nov 2021

Leveraging redundancy in attention with Reuse Transformers

Srinadh Bhojanapalli

Sanjiv Kumar

152

13 Oct 2021

The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation

Orevaoghene Ahia

Julia Kreutzer

Sara Hooker

307

06 Oct 2021

Primer: Searching for Efficient Transformers for Language Modeling

401

184

17 Sep 2021

Miðeind's WMT 2021 submission

Haukur Barri Símonarson

Vésteinn Snæbjarnarson

Pétur Orri Ragnarsson

Haukur Páll Jónsson

Vilhjálmur Þorsteinsson

VLM

129

15 Sep 2021