Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.02486
Cited By
LongNet: Scaling Transformers to 1,000,000,000 Tokens
5 July 2023
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LongNet: Scaling Transformers to 1,000,000,000 Tokens"
14 / 114 papers shown
Title
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Yushi Bai
Xin Lv
Jiajie Zhang
Hong Lyu
Jiankai Tang
...
Aohan Zeng
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
LLMAG
RALM
26
486
0
28 Aug 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
58
1,881
0
24 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Erik Cambria
Björn W. Schuller
LM&MA
AuLLM
29
36
0
24 Aug 2023
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation
Junru Lu
Siyu An
Mingbao Lin
Gabriele Pergola
Yulan He
Di Yin
Xing Sun
Yunsheng Wu
42
31
0
16 Aug 2023
Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting
Aseem Arora
Shabbirhussain Bhaisaheb
Harshit Nigam
Manasi S. Patwardhan
L. Vig
Gautam M. Shroff
19
8
0
01 Aug 2023
Language models as master equation solvers
Chuanbo Liu
Jin Wang
18
0
0
29 Jul 2023
In-context Autoencoder for Context Compression in a Large Language Model
Tao Ge
Jing Hu
Lei Wang
Xun Wang
Si-Qing Chen
Furu Wei
RALM
29
66
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Saeed Mian
OffRL
46
499
0
12 Jul 2023
Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation
Ran Zhang
Jihed Ouni
Steffen Eger
19
6
0
22 Jun 2023
Recurrent Action Transformer with Memory
A. Staroverov
A. Bessonov
Dmitry A. Yudin
A. Kovalev
Aleksandr I. Panov
OffRL
17
4
0
15 Jun 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Mikhail Burtsev
LRM
14
86
0
19 Apr 2023
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
249
1,982
0
28 Jul 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019
Previous
1
2
3