Jamba-1.5: Hybrid Transformer-Mamba Models at Scale
arXiv:2408.12570 · 22 August 2024
Jamba Team: Barak Lenz, Alan Arazi, Amir Bergman, Avshalom Manevich, Barak Peleg, Ben Aviram, Chen Almagor, Clara Fridman, Dan Padnos, Daniel Gissin, Daniel Jannai, Dor Muhlgay, Dor Zimberg, E. Gerber, Elad Dolev, Eran Krakovsky, Erez Safahi, Erez Schwartz, Gal Cohen, Gal Shachaf, Haim Rozenblum, Hofit Bata, I. Blass, Inbal Magar, Itay Dalmedigos, Jhonathan Osin, Julie Fadlon, Maria Rozman, Matan Danos, Michael Gokhman, Mor Zusman, N. Gidron, Nir Ratner, Noam Gat, N. Rozen, Oded Fried, Ohad Leshno, Omer Antverg, Omri Abend, Opher Lieber, Or Dagan, Orit Cohavi, Raz Alon, Roí Belson, Roi Cohen, Rom Gilad, Roman Glozman, S. Lev, S. Meirom, Tal Delbari, Tal Ness, Tomer Asida, Tom Ben Gal, Tom Braude, Uriya Pumerantz, Yehoshua Cohen, Yonatan Belinkov, Y. Globerson, Yuval Peleg Levy, Y. Shoham
Papers citing "Jamba-1.5: Hybrid Transformer-Mamba Models at Scale" (9 papers shown)
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi, Md Monzurul Islam, Mahmuda Sultana Mimi, Sazzad Bin Bashar Polock, Gaurab Chhetri, Subasish Das
Mamba, AI4TS · 22 Mar 2025

A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu, Sen Lin
MoE · 10 Mar 2025

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu-Xi Cheng
MoE · 07 Mar 2025

From Markov to Laplace: How Mamba In-Context Learns Markov Chains
Marco Bondaschi, Nived Rajaraman, Xiuying Wei, Kannan Ramchandran, Razvan Pascanu, Çağlar Gülçehre, Michael C. Gastpar, Ashok Vardhan Makkuva
17 Feb 2025

We're Different, We're the Same: Creative Homogeneity Across LLMs
Emily Wenger, Yoed Kenett
31 Jan 2025

GG-SSMs: Graph-Generating State Space Models
Nikola Zubić, Davide Scaramuzza
Mamba · 17 Dec 2024

Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Ravi Netravali, Yida Wang
28 Nov 2024

Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts, Kai Han, Samuel Albanie
LLMAG · 07 Nov 2024

How to Train Long-Context Language Models (Effectively)
Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen
RALM · 03 Oct 2024