Mixture of Experts with Mixture of Precisions for Tuning Quality of Service

19 July 2024
HamidReza Imani, Abdolah Amirany, Tarek A. El-Ghazawi
MoE

Papers citing "Mixture of Experts with Mixture of Precisions for Tuning Quality of Service"

6 / 6 papers shown
Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques
Shwai He, Daize Dong, Liang Ding, Ang Li
MoE · 04 Jun 2024

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang, Yangdong Liu, Haotong Qin, Ying Li, Shiming Zhang, Xianglong Liu, Michele Magno, Xiaojuan Qi
MQ · 06 Feb 2024

Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness
Young Jin Kim, Raffy Fahim, Hany Awadalla
MQ, MoE · 03 Oct 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, ..., Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
13 Mar 2023

M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin, An Yang, Jinze Bai, Chang Zhou, Le Jiang, ..., Jie M. Zhang, Yong Li, Wei Lin, Jingren Zhou, Hongxia Yang
MoE · 08 Oct 2021

Scalable and Efficient MoE Training for Multitask Multilingual Models
Young Jin Kim, A. A. Awan, Alexandre Muzio, Andres Felipe Cruz Salinas, Liyang Lu, Amr Hendy, Samyam Rajbhandari, Yuxiong He, Hany Awadalla
MoE · 22 Sep 2021