Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization

19 February 2024

Markos Georgopoulos

Grigorios G. Chrysos

Christos Tzelepis

Yannis Panagakis

Papers citing "Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization"

Covariate Dependent Mixture of Bayesian Networks; Román Marchant, Dario Draca, Gilad Francis, Sahand Assadzadeh, Mathew Varidel, Frank Iorfino, Sally Cripps; CML; 46 0 0; 10 Jan 2025
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition; Artem Basharin, Andrei Chertkov, Ivan V. Oseledets; 34 1 0; 23 Oct 2024
From Sparse to Soft Mixtures of Experts; J. Puigcerver, C. Riquelme, Basil Mustafa, N. Houlsby; MoE; 114 114 0; 02 Aug 2023
Toy Models of Superposition; Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, T. Henighan, ..., Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, C. Olah; AAML, MILM; 117 314 0; 21 Sep 2022
Quantifying Local Specialization in Deep Neural Networks; Shlomi Hod, Daniel Filan, Stephen Casper, Andrew Critch, Stuart J. Russell; 55 10 0; 13 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision; Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy; 239 2,554 0; 04 May 2021
Emerging Properties in Self-Supervised Vision Transformers; Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin; 283 5,723 0; 29 Apr 2021