

Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
3 May 2023
Da Xu, Maha Elbayad, Kenton W. Murray, Jean Maillard, Vedanuj Goswami
Topics: MoE

Papers citing "Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity"

2 papers shown
MultiMUC: Multilingual Template Filling on MUC-4
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
William Gantt, Shabnam Behzad, Hannah YoungEun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, M. Yarmohammadi
29 Jan 2024
Condensing Multilingual Knowledge with Lightweight Language-Specific Modules
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Haoran Xu, Weiting Tan, Shuyue Stella Li, Yunmo Chen, Benjamin Van Durme, Philipp Koehn, Kenton W. Murray
23 May 2023