ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.12550
  4. Cited By
Multilinear Mixture of Experts: Scalable Expert Specialization through
  Factorization

Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization

19 February 2024
James Oldfield
Markos Georgopoulos
Grigorios G. Chrysos
Christos Tzelepis
Yannis Panagakis
M. Nicolaou
Jiankang Deng
Ioannis Patras
    MoE
ArXivPDFHTML

Papers citing "Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization"

7 / 7 papers shown
Title
Covariate Dependent Mixture of Bayesian Networks
Covariate Dependent Mixture of Bayesian Networks
Román Marchant
Dario Draca
Gilad Francis
Sahand Assadzadeh
Mathew Varidel
Frank Iorfino
Sally Cripps
CML
46
0
0
10 Jan 2025
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
Artem Basharin
Andrei Chertkov
Ivan V. Oseledets
34
1
0
23 Oct 2024
From Sparse to Soft Mixtures of Experts
From Sparse to Soft Mixtures of Experts
J. Puigcerver
C. Riquelme
Basil Mustafa
N. Houlsby
MoE
118
114
0
02 Aug 2023
Toy Models of Superposition
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
120
314
0
21 Sep 2022
Quantifying Local Specialization in Deep Neural Networks
Quantifying Local Specialization in Deep Neural Networks
Shlomi Hod
Daniel Filan
Stephen Casper
Andrew Critch
Stuart J. Russell
58
10
0
13 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
1