ResearchTrend.AI
arXiv: 2402.03226
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion

5 February 2024
Xing Han
Huy Nguyen
Carl Harris
Nhat Ho
S. Saria
    MoE

Papers citing "FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion"

18 / 18 papers shown
Layer-Aware Embedding Fusion for LLMs in Text Classifications
Jiho Gwak
Yuchul Jung
20
0
0
08 Apr 2025
Foundation-Model-Boosted Multimodal Learning for fMRI-based Neuropathic Pain Drug Response Prediction
Wenrui Fan
L. M. Riza Rizky
Jiayang Zhang
Chen Chen
Haiping Lu
Kevin Teh
Dinesh Selvarajah
Shuo Zhou
33
0
0
28 Feb 2025
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models
Nam V. Nguyen
Thong T. Doan
Luong Tran
Van Nguyen
Quang Pham
MoE
48
1
0
01 Nov 2024
Quadratic Gating Functions in Mixture of Experts: A Statistical Insight
Pedram Akbarian
Huy Le Nguyen
Xing Han
Nhat Ho
MoE
25
0
0
15 Oct 2024
Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts
Sukwon Yun
Inyoung Choi
Jie Peng
Yangfan Wu
J. Bao
Qiyiwen Zhang
Jiayi Xin
Qi Long
Tianlong Chen
MoE
33
4
0
10 Oct 2024
Statistical Advantages of Perturbing Cosine Router in Mixture of Experts
Huy Le Nguyen
Pedram Akbarian
Trang Pham
Trang Nguyen
Shujian Zhang
Nhat Ho
MoE
31
2
0
23 May 2024
Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts
Huy Nguyen
Nhat Ho
Alessandro Rinaldo
29
3
0
22 May 2024
On Parameter Estimation in Deviated Gaussian Mixture of Experts
Huy Nguyen
Khai Nguyen
Nhat Ho
19
0
0
07 Feb 2024
On Least Square Estimation in Softmax Gating Mixture of Experts
Huy Nguyen
Nhat Ho
Alessandro Rinaldo
26
6
0
05 Feb 2024
Multiple Noises in Diffusion Model for Semi-Supervised Multi-Domain Translation
Tsiry Mayet
Simon Bernard
Clément Chatelain
Romain Hérault
DiffM
22
0
0
25 Sep 2023
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
Huy Nguyen
Pedram Akbarian
Fanqi Yan
Nhat Ho
MoE
23
16
0
25 Sep 2023
From Sparse to Soft Mixtures of Experts
J. Puigcerver
C. Riquelme
Basil Mustafa
N. Houlsby
MoE
112
59
0
02 Aug 2023
Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts
Huy Nguyen
TrungTin Nguyen
Khai Nguyen
Nhat Ho
MoE
26
12
0
12 May 2023
Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling
Xinlu Zhang
Shiyang Li
Zhiyu Zoey Chen
Xifeng Yan
Linda R. Petzold
AI4TS
39
24
0
18 Oct 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
137
323
0
18 Feb 2022
Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences
Yikuan Li
R. M. Wehbe
F. Ahmad
Hanyin Wang
Yuan Luo
VLM
MedIm
129
82
0
27 Jan 2022
Multi-Time Attention Networks for Irregularly Sampled Time Series
Satya Narayan Shukla
Benjamin M. Marlin
AI4TS
100
179
0
25 Jan 2021
Recurrent Neural Networks for Multivariate Time Series with Missing Values
Zhengping Che
S. Purushotham
Kyunghyun Cho
David Sontag
Yan Liu
AI4TS
191
1,674
0
06 Jun 2016