PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning
arXiv:2407.21571 · 31 July 2024
Min Jae Jung, Romain Rouvoy
Tags: KELM, MoE, CLL
Papers citing "PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning" (7 of 7 papers shown)
1. Memorization and Knowledge Injection in Gated LLMs
   Xu Pan, Ely Hahami, Zechen Zhang, H. Sompolinsky
   Tags: KELM, CLL, RALM · 104 · 0 · 0 · 30 Apr 2025
2. A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
   Siyuan Mu, Sen Lin
   Tags: MoE · 84 · 1 · 0 · 10 Mar 2025
3. Orthogonal Subspace Learning for Language Model Continual Learning
   Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang
   Tags: CLL · 112 · 85 · 0 · 22 Oct 2023
4. Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
   Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, A. Kalyan
   Tags: ELM, ReLM, LRM · 207 · 1,089 · 0 · 20 Sep 2022
5. The Power of Scale for Parameter-Efficient Prompt Tuning
   Brian Lester, Rami Al-Rfou, Noah Constant
   Tags: VPVLM · 278 · 3,784 · 0 · 18 Apr 2021
6. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
   Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, ..., Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu
   Tags: ELM · 190 · 853 · 0 · 09 Feb 2021
7. Scaling Laws for Neural Language Models
   Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
   223 · 4,424 · 0 · 23 Jan 2020