Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.11819
Cited By
Head-wise Shareable Attention for Large Language Models
19 February 2024
Zouying Cao
Yifei Yang
Hai Zhao
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Head-wise Shareable Attention for Large Language Models"
4 / 4 papers shown
Title
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
203
2,232
0
22 Mar 2023
CSL: A Large-scale Chinese Scientific Literature Dataset
Yudong Li
Yuqing Zhang
Zhe Zhao
Lin-cheng Shen
Weijie Liu
Weiquan Mao
Hui Zhang
AILaw
118
50
0
12 Sep 2022
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
138
183
0
31 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
1