Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.17844
Cited By
Mechanistic Design and Scaling of Hybrid Architectures
26 March 2024
Michael Poli
Armin W. Thomas
Eric N. D. Nguyen
Pragaash Ponnusamy
Bjorn Deiseroth
Kristian Kersting
Taiji Suzuki
Brian Hie
Stefano Ermon
Christopher Ré
Ce Zhang
Stefano Massaroli
MoE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Mechanistic Design and Scaling of Hybrid Architectures"
9 / 9 papers shown
Title
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
108
115
0
05 Jan 2024
Zoology: Measuring and Improving Recall in Efficient Language Models
Simran Arora
Sabri Eyuboglu
Aman Timalsina
Isys Johnson
Michael Poli
James Zou
Atri Rudra
Christopher Ré
36
30
0
08 Dec 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
16
7
0
19 Jun 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
178
152
0
28 Apr 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
229
326
0
24 Sep 2022
Transformer Quality in Linear Time
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
58
164
0
21 Feb 2022
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mañke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
61
125
0
17 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
217
1,508
0
31 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
217
3,054
0
23 Jan 2020
1