Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2302.10866
Cited By
Hyena Hierarchy: Towards Larger Convolutional Language Models
21 February 2023
Michael Poli
Stefano Massaroli
Eric Q. Nguyen
Daniel Y. Fu
Tri Dao
S. Baccus
Yoshua Bengio
Stefano Ermon
Christopher Ré
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Hyena Hierarchy: Towards Larger Convolutional Language Models"
47 / 47 papers shown
Title
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
40
0
0
22 Mar 2025
HELM: Hierarchical Encoding for mRNA Language Modeling
Mehdi Yazdani-Jahromi
Mangal Prakash
Tommaso Mansi
Artem Moskalev
Rui Liao
78
2
0
13 Mar 2025
Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Yu Bo
Weian Mao
Yanjun Shao
Weiqiang Bai
Peng Ye
Xinzhu Ma
Junbo Zhao
Hao Chen
Chunhua Shen
3DV
55
1
0
25 Feb 2025
State-space models are accurate and efficient neural operators for dynamical systems
Zheyuan Hu
Nazanin Ahmadi Daryakenari
Qianli Shen
Kenji Kawaguchi
George Karniadakis
Mamba
AI4CE
61
10
0
28 Jan 2025
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee
Scott C. Lowe
ZeMing Gong
Pablo Millán Arias
Nicholas Pellegrino
...
Lila Kari
Dirk Steinke
Graham W. Taylor
Paul Fieguth
Angel X. Chang
35
7
0
28 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
95
16
0
17 Jan 2025
Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez
Andrew Warrington
Jimmy T.H. Smith
Scott W. Linderman
83
8
0
17 Jan 2025
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
34
2
0
29 Dec 2024
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs
Michael Wornow
Suhana Bedi
Miguel Angel Fuentes Hernandez
E. Steinberg
Jason Alan Fries
Christopher Ré
Sanmi Koyejo
N. Shah
92
4
0
09 Dec 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Pingyi Chen
Zhongyi Shui
Chenglu Zhu
Lin Yang
MedIm
32
4
0
18 Oct 2024
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
35
3
0
15 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
79
17
0
14 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
52
13
0
06 Oct 2024
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
41
8
0
05 Oct 2024
System 2 Reasoning Capabilities Are Nigh
Scott C. Lowe
VLM
LRM
33
0
0
04 Oct 2024
Improving Pretraining Data Using Perplexity Correlations
Tristan Thrush
Christopher Potts
Tatsunori Hashimoto
32
17
0
09 Sep 2024
Masked Mixers for Language Generation and Retrieval
Benjamin L. Badger
31
0
0
02 Sep 2024
Cell-ontology guided transcriptome foundation model
Xinyu Yuan
Zhihao Zhan
Zuobai Zhang
Manqi Zhou
Jianan Zhao
Boyu Han
Yue Li
Jian Tang
35
1
0
22 Aug 2024
MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion
Chencan Fu
Yabiao Wang
Jiangning Zhang
Zhengkai Jiang
Xiaofeng Mao
Jiafu Wu
Weijian Cao
Chengjie Wang
Yanhao Ge
Yong Liu
Mamba
35
2
0
29 Jul 2024
Genomic Language Models: Opportunities and Challenges
Gonzalo Benegas
Chengzhong Ye
C. Albors
Jianan Canal Li
Yun S. Song
AI4CE
LM&MA
ELM
34
18
0
16 Jul 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun
Xinhao Li
Karan Dalal
Jiarui Xu
Arjun Vikram
...
Xinlei Chen
Xiaolong Wang
Sanmi Koyejo
Tatsunori Hashimoto
Carlos Guestrin
50
89
0
05 Jul 2024
SE(3)-Hyena Operator for Scalable Equivariant Learning
Artem Moskalev
Mangal Prakash
Rui Liao
Tommaso Mansi
24
2
0
01 Jul 2024
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish
Itamar Zimerman
Shady Abu Hussein
Nadav Cohen
Amir Globerson
Lior Wolf
Raja Giryes
Mamba
64
12
0
20 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
64
54
0
11 Jun 2024
LOLAMEME: Logic, Language, Memory, Mechanistic Framework
Jay Desai
Xiaobo Guo
Srinivasan H. Sengamedu
14
0
0
31 May 2024
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber
Carmen Amo Alonso
A. Didier
M. Zeilinger
Antonio Orvieto
AAML
42
7
0
24 May 2024
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
32
0
0
31 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV
LRM
55
46
0
23 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
26
6
0
28 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
115
347
0
09 Feb 2024
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
19
1
0
18 Dec 2023
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Shida Wang
Qianxiao Li
17
12
0
24 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
20
4
0
21 Nov 2023
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
19
15
0
28 Sep 2023
Learned Gridification for Efficient Point Cloud Processing
P. A. V. D. Linden
David W. Romero
Erik J. Bekkers
3DPC
11
2
0
22 Jul 2023
Retentive Network: A Successor to Transformer for Large Language Models
Yutao Sun
Li Dong
Shaohan Huang
Shuming Ma
Yuqing Xia
Jilong Xue
Jianyong Wang
Furu Wei
LRM
34
300
0
17 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
18
149
0
05 Jul 2023
Long Sequence Hopfield Memory
Hamza Tahir Chaudhry
Jacob A. Zavatone-Veth
Dmitry Krotov
C. Pehlevan
28
17
0
07 Jun 2023
Vision Transformers for Mobile Applications: A Short Survey
Nahid Alam
Steven Kolawole
S. Sethi
Nishant Bansali
Karina Nguyen
ViT
16
3
0
30 May 2023
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
21
9
0
24 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
72
550
0
22 May 2023
Ask Me Anything: A simple strategy for prompting language models
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM
LRM
197
160
0
05 Oct 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
240
453
0
24 Sep 2022
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
David W. Romero
Robert-Jan Bruintjes
Jakub M. Tomczak
Erik J. Bekkers
Mark Hoogendoorn
J. C. V. Gemert
74
81
0
15 Oct 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
Fourier Neural Operator for Parametric Partial Differential Equations
Zong-Yi Li
Nikola B. Kovachki
Kamyar Azizzadenesheli
Burigede Liu
K. Bhattacharya
Andrew M. Stuart
Anima Anandkumar
AI4CE
203
2,254
0
18 Oct 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
228
578
0
12 Mar 2020
1