arXiv:2401.14489
The Case for Co-Designing Model Architectures with Hardware
25 January 2024
Quentin G. Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, A. Shafi, Hari Subramoni, Dhabaleswar Panda
3DV
Papers citing "The Case for Co-Designing Model Architectures with Hardware"
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Yujia Zhai, Chengquan Jiang, Leyuan Wang, Xiaoying Jia, Shang Zhang, Zizhong Chen, Xin Liu, Yibo Zhu
62
48
0
06 Oct 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
245
695
0
27 Aug 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019