LongNet: Scaling Transformers to 1,000,000,000 Tokens


5 July 2023
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
    CLL

Papers citing "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

50 / 114 papers shown
Toward Conversational Agents with Context and Time Sensitive Long-term Memory
Nick Alonso
Tomás Figliolia
A. Ndirango
Beren Millidge
RALM
3DV
48
3
0
29 May 2024
SelfCP: Compressing Over-Limit Prompt via the Frozen Large Language Model Itself
Jun Gao
Ziqiang Cao
Wenjie Li
25
4
0
27 May 2024
TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations
Zhengwu Zhang
Yuntong Hu
Bo Pan
Chen Ling
Liang Zhao
31
2
0
27 May 2024
PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology
George Shaikovski
Adam Casson
Kristen Severson
Eric Zimmermann
Yi Kan Wang
...
Peter Hamilton
William A. Moye
Eugene Vorontsov
Siqi Liu
Thomas J. Fuchs
MedIm
30
22
0
16 May 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
Yutao Sun
Li Dong
Yi Zhu
Shaohan Huang
Wenhui Wang
Shuming Ma
Quanlu Zhang
Jianyong Wang
Furu Wei
VLM
25
52
0
08 May 2024
From Persona to Personalization: A Survey on Role-Playing Language Agents
Jiangjie Chen
Xintao Wang
Rui Xu
Siyu Yuan
Yikai Zhang
...
Caiyu Hu
Siye Wu
Scott Ren
Ziquan Fu
Yanghua Xiao
50
72
0
28 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM
LLMAG
CLL
81
101
0
10 Apr 2024
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection
Ali Behrouz
Michele Santacatterina
Ramin Zabih
37
32
0
29 Mar 2024
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
Youpeng Zhao
Di Wu
Jun Wang
21
25
0
26 Mar 2024
FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images
Yiqing Shen
Jingxing Li
Xinyuan Shao
Blanca Inigo Romillo
Ankush Jindal
David Dreizin
Mathias Unberath
MedIm
29
10
0
14 Mar 2024
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Sun Ao
Weilin Zhao
Xu Han
Cheng Yang
Zhiyuan Liu
Chuan Shi
Maosong Sun
GNN
24
8
0
14 Mar 2024
Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference
Muhammad Adnan
Akhil Arunkumar
Gaurav Jain
Prashant J. Nair
Ilya Soloveychik
Purushotham Kamath
19
52
0
14 Mar 2024
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei
Xi Chen
Linzi Luo
ELM
ALM
LRM
27
7
0
12 Mar 2024
Beyond Multiple Instance Learning: Full Resolution All-In-Memory End-To-End Pathology Slide Modeling
Gabriele Campanella
Eugene Fluder
Jennifer Zeng
Chad M. Vanderbilt
Thomas J. Fuchs
MedIm
21
0
0
07 Mar 2024
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan
Weiyun Wang
Zhe Chen
Xizhou Zhu
Lewei Lu
Tong Lu
Yu Qiao
Hongsheng Li
Jifeng Dai
Wenhai Wang
ViT
38
44
0
04 Mar 2024
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models
Wei He
Kai Han
Yehui Tang
Chengcheng Wang
Yujie Yang
Tianyu Guo
Yunhe Wang
Mamba
53
25
0
26 Feb 2024
User-LLM: Efficient LLM Contextualization with User Embeddings
Lin Ning
Luyang Liu
Jiaxing Wu
Neo Wu
D. Berlowitz
Sushant Prakash
Bradley Green
S. O’Banion
Jun Xie
37
32
0
21 Feb 2024
CAMELoT: Towards Large Language Models with Training-Free Consolidated Associative Memory
Zexue He
Leonid Karlinsky
Donghyun Kim
Julian McAuley
Dmitry Krotov
Rogerio Feris
KELM
RALM
33
10
0
21 Feb 2024
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Dmitry Sorokin
Artyom Sorokin
Mikhail Burtsev
RALM
117
32
0
16 Feb 2024
FAST: Factorizable Attention for Speeding up Transformers
Armin Gerami
Monte Hoover
P. S. Dulepet
R. Duraiswami
22
0
0
12 Feb 2024
MEMORYLLM: Towards Self-Updatable Large Language Models
Yu-Xiang Wang
Yifan Gao
Xiusi Chen
Haoming Jiang
Shiyang Li
...
Zheng Li
Xian Li
Bing Yin
Jingbo Shang
Julian McAuley
KELM
27
16
0
07 Feb 2024
UniMem: Towards a Unified View of Long-Context Large Language Models
Junjie Fang
Likai Tang
Hongzhe Bi
Yujia Qin
Si Sun
...
Xiaodong Shi
Sen Song
Yankai Lin
Zhiyuan Liu
Maosong Sun
11
3
0
05 Feb 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
Xindi Wang
Mahsa Salmani
Parsa Omidi
Xiangyu Ren
Mehdi Rezagholizadeh
A. Eshaghi
LRM
29
35
0
03 Feb 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
32
699
0
17 Jan 2024
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization
Ninglu Shao
Shitao Xiao
Zheng Liu
Peitian Zhang
18
4
0
15 Jan 2024
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
Jiaheng Liu
Zhiqi Bai
Yuanxing Zhang
Chenchen Zhang
Yu Zhang
...
Wenbo Su
Tiezheng Ge
Jie Fu
Wenhu Chen
Bo Zheng
38
8
0
13 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Xia Hu
15
99
0
02 Jan 2024
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
53
75
0
23 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
E. Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
22
74
0
17 Dec 2023
Extending Context Window of Large Language Models via Semantic Compression
WeiZhi Fei
Xueyan Niu
Pingyi Zhou
Lu Hou
Bo Bai
Lei Deng
Wei Han
23
26
0
15 Dec 2023
SCCA: Shifted Cross Chunk Attention for long contextual semantic expansion
Yuxiang Guo
6
0
0
12 Dec 2023
TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents
James Enouen
Hootan Nakhost
Sayna Ebrahimi
Sercan Ö. Arik
Yan Liu
Tomas Pfister
30
4
0
03 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
27
21
0
01 Dec 2023
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
LLMAG
KELM
28
53
0
21 Nov 2023
Thread of Thought Unraveling Chaotic Contexts
Yucheng Zhou
Xiubo Geng
Tao Shen
Chongyang Tao
Guodong Long
Jian-Guang Lou
Jianbing Shen
LRM
18
39
0
15 Nov 2023
LooGLE: Can Long-Context Language Models Understand Long Contexts?
Jiaqi Li
Mengmeng Wang
Zilong Zheng
Muhan Zhang
ELM
RALM
22
106
0
08 Nov 2023
Circuit as Set of Points
Jialv Zou
Xinggang Wang
Jiahao Guo
Wenyu Liu
Qian Zhang
Chang Huang
GNN
3DV
3DPC
15
0
0
26 Oct 2023
CLEX: Continuous Length Extrapolation for Large Language Models
Guanzheng Chen
Xin Li
Zaiqiao Meng
Shangsong Liang
Li Bing
15
29
0
25 Oct 2023
In-Context Unlearning: Language Models as Few Shot Unlearners
Martin Pawelczyk
Seth Neel
Himabindu Lakkaraju
MU
21
98
0
11 Oct 2023
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
Huiqiang Jiang
Qianhui Wu
Xufang Luo
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
Lili Qiu
RALM
99
179
0
10 Oct 2023
HyperAttention: Long-context Attention in Near-Linear Time
Insu Han
Rajesh Jayaram
Amin Karbasi
Vahab Mirrokni
David P. Woodruff
A. Zandieh
34
59
0
09 Oct 2023
A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators
M. Emani
Sam Foreman
Varuni K. Sastry
Zhen Xie
Siddhisanket Raskar
William Arnold
R. Thakur
V. Vishwanath
M. Papka
ELM
16
9
0
06 Oct 2023
Dodo: Dynamic Contextual Compression for Decoder-only LMs
Guanghui Qin
Corby Rosset
Ethan C. Chau
Nikhil Rao
Benjamin Van Durme
11
7
0
03 Oct 2023
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
Praneeth Kacham
Vahab Mirrokni
Peilin Zhong
23
7
0
02 Oct 2023
Channel Vision Transformers: An Image Is Worth 1 x 16 x 16 Words
Yu Bao
Srinivasan Sivanandan
Theofanis Karaletsos
ViT
17
22
0
28 Sep 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
26
150
0
21 Sep 2023
End-to-End Speech Recognition Contextualization with Large Language Models
Egor Lakomkin
Chunyang Wu
Yassir Fathullah
Ozlem Kalinli
M. Seltzer
Christian Fuegen
55
17
0
19 Sep 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
27
4
0
15 Sep 2023
Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization
Dave Van Veen
Cara Van Uden
Louis Blankemeier
Jean-Benoit Delbrouck
Asad Aali
...
C. Langlotz
Jason Hom
S. Gatidis
John M. Pauly
Akshay S. Chaudhari
ELM
AI4MH
LM&MA
45
270
0
14 Sep 2023
Large Language Models for Compiler Optimization
Chris Cummins
Volker Seeker
Dejan Grubisic
Mostafa Elhoushi
Youwei Liang
...
Jonas Gehring
Fabian Gloeckle
Kim M. Hazelwood
Gabriel Synnaeve
Hugh Leather
18
47
0
11 Sep 2023