ResearchTrend.AI
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
arXiv: 2212.14052

28 December 2022
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré

Papers citing "Hungry Hungry Hippos: Towards Language Modeling with State Space Models"

50 / 284 papers shown
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
26
3
0
24 Oct 2024
SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition
Jiaqi Chen
Yan Yang
Shizhuo Deng
Da Teng
Liyuan Pan
Mamba
24
1
0
22 Oct 2024
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset
Ruikun Zhang
Hao-Liang Yang
Yan Yang
Ying Fu
Liyuan Pan
17
3
0
21 Oct 2024
Spatial-Mamba: Effective Visual State Space Models via Structure-aware State Fusion
Chaodong Xiao
Minghan Li
Zhengqiang Zhang
Deyu Meng
Lei Zhang
Mamba
55
4
0
19 Oct 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
32
4
0
18 Oct 2024
Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
Costin-Andrei Oncescu
Sanket Purandare
Stratos Idreos
Sham Kakade
VLM
AI4TS
3DV
16
0
0
16 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
M. Zeilinger
Carmen Amo Alonso
33
0
0
14 Oct 2024
Parameter-Efficient Fine-Tuning of State Space Models
Kevin Galim
Wonjun Kang
Yuchen Zeng
H. Koo
Kangwook Lee
29
4
0
11 Oct 2024
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
Junxuan Wang
Xuyang Ge
Wentao Shu
Qiong Tang
Yunhua Zhou
Zhengfu He
Xipeng Qiu
27
7
0
09 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
Jianguo Li
Weiyao Lin
VLM
33
1
0
09 Oct 2024
Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates
A. Narayan
Mayee F. Chen
Kush S. Bhatia
Christopher Ré
SyDa
36
3
0
07 Oct 2024
SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning
Yan Zhong
Ruoyu Zhao
Chao Wang
Qinghai Guo
Jianguo Zhang
Zhichao Lu
Luziwei Leng
39
2
0
07 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun
J. Liu
H. Shen
Xiaofeng Zhu
Ping Hu
VLM
43
4
0
07 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
26
1
0
07 Oct 2024
Efficient and Robust Long-Form Speech Recognition with Hybrid H3-Conformer
Tomoki Honda
S. Sakai
Tatsuya Kawahara
13
0
0
05 Oct 2024
HRVMamba: High-Resolution Visual State Space Model for Dense Prediction
Hao Zhang
Yongqiang Ma
Wenqi Shao
Ping Luo
Nanning Zheng
Kaipeng Zhang
Mamba
20
1
0
04 Oct 2024
Demystifying the Token Dynamics of Deep Selective State Space Models
Thieu N. Vo
Tung D. Pham
Xin T. Tong
Tan Minh Nguyen
Mamba
44
0
0
04 Oct 2024
FutureFill: Fast Generation from Convolutional Sequence Models
Naman Agarwal
Xinyi Chen
Evan Dogariu
Vlad Feinberg
Daniel Suo
Peter L. Bartlett
Elad Hazan
AI4TS
MQ
25
2
0
02 Oct 2024
Were RNNs All We Needed?
Leo Feng
Frederick Tung
Mohamed Osama Ahmed
Yoshua Bengio
Hossein Hajimirsadegh
AI4TS
23
14
1
02 Oct 2024
Hybrid Mamba for Few-Shot Segmentation
Qianxiong Xu
Xuanyi Liu
Lanyun Zhu
Guosheng Lin
Cheng Long
Ziyue Li
Rui Zhao
Mamba
20
3
0
29 Sep 2024
MECG-E: Mamba-based ECG Enhancer for Baseline Wander Removal
Kuo-Hsuan Hung
Kuan-Chen Wang
Kai-Chun Liu
Wei-Lun Chen
Xugang Lu
Yu Tsao
Chii-Wann Lin
Mamba
23
0
0
27 Sep 2024
MC-SEMamba: A Simple Multi-channel Extension of SEMamba
Wen-Yuan Ting
Wenze Ren
Rong-Yu Chao
Hsin-Yi Lin
Yu Tsao
Fan-Gang Zeng
Mamba
30
0
0
26 Sep 2024
Path-adaptive Spatio-Temporal State Space Model for Event-based Recognition with Arbitrary Duration
Jiazhou Zhou
Kanghao Chen
Lei Zhang
Lin Wang
24
0
0
25 Sep 2024
Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey
Sourav Verma
RALM
3DV
19
2
0
20 Sep 2024
SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba
Xiangning Zhang
Jinnan Chen
Qingwei Zhang
Chengfeng Zhou
Zhengjie Zhang
XiaoBo Li
Dahong Qian
Mamba
22
0
0
18 Sep 2024
TTT-Unet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation
Rong-Er Zhou
Zhengqing Yuan
Zhiling Yan
Weixiang Sun
Kai Zhang
Yiwei Li
Yanfang Ye
Xiang Li
Lifang He
Lichao Sun
ViT
MedIm
20
0
0
17 Sep 2024
SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance
Shun Zou
Mingya Zhang
Bingjian Fan
Zhengyi Zhou
Xiuguo Zou
Mamba
24
3
0
17 Sep 2024
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Wenze Ren
Haibin Wu
Yi-Cheng Lin
Xuanjun Chen
Rong-Yu Chao
Kuo-Hsuan Hung
You-Jin Li
Wen-Yuan Ting
Hsin-Min Wang
Yu Tsao
Mamba
34
0
0
16 Sep 2024
Flash STU: Fast Spectral Transform Units
Y. Isabel Liu
Windsor Nguyen
Yagiz Devre
Evan Dogariu
Anirudha Majumdar
Elad Hazan
AI4TS
59
1
0
16 Sep 2024
Spatial-Temporal Mamba Network for EEG-based Motor Imagery Classification
Xiaoxiao Yang
Ziyu Jia
Mamba
17
2
0
15 Sep 2024
Learning Brain Tumor Representation in 3D High-Resolution MR Images via Interpretable State Space Models
Qingqiao Hu
Daoan Zhang
Jiebo Luo
Zhenyu Gong
Benedikt Wiestler
Jianguo Zhang
Hongwei Bran Li
24
0
0
12 Sep 2024
Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
Sarthak Yadav
Sergios Theodoridis
Z. Tan
32
2
0
29 Aug 2024
MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders
Baijiong Lin
Weisen Jiang
Pengguang Chen
Shu Liu
Ying-Cong Chen
Mamba
25
1
0
27 Aug 2024
Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting
Zixuan Weng
Jindong Han
Wenzhao Jiang
Hao Liu
Mamba
AI4TS
20
2
0
22 Aug 2024
MambaLoc: Efficient Camera Localisation via State Space Model
Jialu Wang
Kaichen Zhou
Andrew Markham
Niki Trigoni
Mamba
27
0
0
19 Aug 2024
ExpoMamba: Exploiting Frequency SSM Blocks for Efficient and Effective Image Enhancement
Eashan Adhikarla
Kai Zhang
John Nicholson
Brian D. Davison
Mamba
27
3
0
19 Aug 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick
Kevin Y. Li
Eric P. Xing
J. Zico Kolter
Albert Gu
Mamba
43
24
0
19 Aug 2024
DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs
Dongyuan Li
Shiyin Tan
Ying Zhang
Ming Jin
Shirui Pan
Manabu Okumura
Renhe Jiang
Mamba
18
2
0
13 Aug 2024
What comes after transformers? -- A selective survey connecting ideas in deep learning
Johannes Schneider
AI4CE
27
2
0
01 Aug 2024
MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion
Chencan Fu
Yabiao Wang
Jiangning Zhang
Zhengkai Jiang
Xiaofeng Mao
Jiafu Wu
Weijian Cao
Chengjie Wang
Yanhao Ge
Yong Liu
Mamba
35
2
0
29 Jul 2024
Long Range Switching Time Series Prediction via State Space Model
Jiaming Zhang
Yang Ding
Yunfeng Gao
26
0
0
27 Jul 2024
VSSD: Vision Mamba with Non-Causal State Space Duality
Yuheng Shi
Minjing Dong
Mingjia Li
Chang Xu
Mamba
28
3
0
26 Jul 2024
Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models
Georgy Tyukin
G. Dovonon
Jean Kaddour
Pasquale Minervini
LRM
26
0
0
22 Jul 2024
Longhorn: State Space Models are Amortized Online Learners
Bo Liu
Rui Wang
Lemeng Wu
Yihao Feng
Peter Stone
Qian Liu
46
10
0
19 Jul 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
51
0
0
18 Jul 2024
OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting
Penglei Gao
Kai Yao
Tiandi Ye
Steven Wang
Yuan Yao
Xiaofeng Wang
Mamba
24
1
0
15 Jul 2024
Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
Yingcong Li
A. S. Rawat
Samet Oymak
21
6
0
13 Jul 2024
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
Sukjun Hwang
Aakash Lahoti
Tri Dao
Albert Gu
Mamba
52
11
0
13 Jul 2024
HiPPO-Prophecy: State-Space Models can Provably Learn Dynamical Systems in Context
Federico Arangath Joseph
K. Haefeli
Noah Liniger
Çağlar Gülçehre
17
2
0
12 Jul 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
Jay Shah
Ganesh Bikshandi
Ying Zhang
Vijay Thakkar
Pradeep Ramani
Tri Dao
48
112
0
11 Jul 2024