ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2411.13676
  4. Cited By
Hymba: A Hybrid-head Architecture for Small Language Models

Hymba: A Hybrid-head Architecture for Small Language Models

20 November 2024
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
Ameya Sunil Mahabaleshwarkar
Shih-yang Liu
Matthijs Van Keirsbilck
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
    Mamba
ArXivPDFHTML

Papers citing "Hymba: A Hybrid-head Architecture for Small Language Models"

7 / 7 papers shown
Title
Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation
Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation
Junyu Ma
Tianqing Fang
Z. Zhang
Hongming Zhang
Haitao Mi
Dong Yu
ReLM
RALM
LRM
33
0
0
06 May 2025
WuNeng: Hybrid State with Attention
WuNeng: Hybrid State with Attention
Liu Xiao
Li Zhiyuan
Lin Yueyu
22
0
0
27 Apr 2025
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
Cheng Deng
Luoyang Sun
Jiwen Jiang
Yongcheng Zeng
Xinjian Wu
...
Haoyang Li
Lei Chen
Lionel M. Ni
H. Zhang
Jun Wang
58
0
0
15 Mar 2025
Liger: Linearizing Large Language Models to Gated Recurrent Structures
Liger: Linearizing Large Language Models to Gated Recurrent Structures
Disen Lan
Weigao Sun
Jiaxi Hu
Jusen Du
Yu-Xi Cheng
64
0
0
03 Mar 2025
Towards Reasoning Ability of Small Language Models
Towards Reasoning Ability of Small Language Models
Gaurav Srivastava
Shuxiang Cao
Xuan Wang
ReLM
LRM
41
4
0
17 Feb 2025
Multilingual State Space Models for Structured Question Answering in Indic Languages
Multilingual State Space Models for Structured Question Answering in Indic Languages
A. Vats
Rahul Raja
Mrinal Mathur
Vinija Jain
Aman Chadha
60
1
0
01 Feb 2025
Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models
Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models
J. P. Muñoz
Jinjie Yuan
Nilesh Jain
Mamba
61
1
0
28 Jan 2025
1