
Boosting Neural Machine Translation with Dependency-Scaled Self-Attention Network

23 November 2021
Ru Peng
Hongyan Wu
Yi Fang
Shengyi Jiang
Tianyong Hao
Boyu Chen
Jiaqi Zhao
Abstract

Syntactic knowledge lends considerable strength to neural machine translation (NMT). Early NMT models assumed that syntactic details could be learned automatically from large amounts of text via attention networks. However, subsequent research pointed out that, limited by the uncontrolled nature of attention computation, the model requires external syntax to capture deep syntactic awareness. Although recent syntax-aware NMT methods have borne great fruit in incorporating syntax, the additional workloads they introduce render the models heavy and slow. Moreover, these efforts rarely target Transformer-based NMT or modify its core self-attention network (SAN). To this end, we propose a parameter-free, dependency-scaled self-attention network (Deps-SAN) for syntax-aware Transformer-based NMT. It integrates a quantified matrix of syntactic dependencies that imposes explicit syntactic constraints on the SAN, enabling the model to learn syntactic details and dispelling the dispersion of attention distributions. Two knowledge-sparsing techniques are further proposed to prevent the model from overfitting dependency noise. Extensive experiments and analyses on two benchmark NMT tasks verify the effectiveness of our approach.
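
To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' code) of dependency-scaled attention: a parameter-free additive bias on the attention logits derived from pairwise distances in the source dependency tree. The Gaussian quantification, the `dep_dist` input, and the `sigma` hyperparameter are assumptions for illustration; the paper's exact quantified dependency matrix may differ.

```python
import math
import torch
import torch.nn.functional as F

def dependency_scaled_attention(q, k, v, dep_dist, sigma=1.0):
    """Scaled dot-product attention with a parameter-free dependency bias.

    q, k, v:   (batch, heads, seq_len, d_head) query/key/value tensors
    dep_dist:  (batch, seq_len, seq_len) pairwise hop distances in the
               source dependency tree (0 on the diagonal)
    sigma:     width of the assumed Gaussian prior over dependency distance
    """
    d_head = q.size(-1)
    # Standard scaled dot-product attention logits.
    logits = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_head)
    # Quantified dependency matrix: tokens that are close in the dependency
    # tree receive a larger (less negative) additive bias, concentrating
    # attention on syntactically related positions without extra parameters.
    dep_bias = -(dep_dist ** 2) / (2.0 * sigma ** 2)   # (batch, L, L)
    logits = logits + dep_bias.unsqueeze(1)            # broadcast over heads
    weights = F.softmax(logits, dim=-1)
    return torch.matmul(weights, v)
```

Because the bias is computed once from the parse and simply added to the logits, this kind of constraint keeps the SAN parameter count unchanged, which matches the paper's "parameter-free" framing.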
