ResearchTrend.AI
Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers

19 February 2024
Zihan Qiu
Zeyu Huang
Youcheng Huang
Jie Fu
    KELM

Papers citing "Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers"

6 / 6 papers shown
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
77
2
0
01 Nov 2024
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu
Zeyu Huang
Shuang Cheng
Yizhi Zhou
Zili Wang
Ivan Titov
Jie Fu
MoE
68
2
0
13 Aug 2024
A Closer Look into Mixture-of-Experts in Large Language Models
Ka Man Lo
Zeyu Huang
Zihan Qiu
Zili Wang
Jie Fu
MoE
25
10
0
26 Jun 2024
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory
Haoze Wu
Zihan Qiu
Zili Wang
Hang Zhao
Jie Fu
MoE
27
3
0
18 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
Xinhao Yao
Xiaolin Hu
Shenzhi Yang
Yong Liu
39
2
0
06 Jun 2024
Knowledge Editing for Large Language Models: A Survey
Song Wang
Yaochen Zhu
Haochen Liu
Zaiyi Zheng
Chen Chen
Jundong Li
KELM
66
132
0
24 Oct 2023