ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.17463
  4. Cited By
Training-Free Long-Context Scaling of Large Language Models

Training-Free Long-Context Scaling of Large Language Models

27 February 2024
Chen An
Fei Huang
Jun Zhang
Shansan Gong
Xipeng Qiu
Chang Zhou
Lingpeng Kong
    ALM
    LRM
ArXivPDFHTML

Papers citing "Training-Free Long-Context Scaling of Large Language Models"

29 / 29 papers shown
Title
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Yi Lu
Wanxu Zhao
Xin Zhou
Chenxin An
C. Wang
...
Jun Zhao
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
39
0
0
26 Apr 2025
Random Long-Context Access for Mamba via Hardware-aligned Hierarchical Sparse Attention
Random Long-Context Access for Mamba via Hardware-aligned Hierarchical Sparse Attention
Xiang Hu
Jiaqi Leng
Jun Zhao
Kewei Tu
Wei Wu
Mamba
45
0
0
23 Apr 2025
Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models
Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models
Yu Fu
Haz Sameen Shahgir
Hui Liu
Xianfeng Tang
Qi He
Yue Dong
KELM
44
0
0
11 Apr 2025
SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling
SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling
Krishna C. Puvvada
Faisal Ladhak
Santiago Akle Serrano
Cheng-Ping Hsieh
Shantanu Acharya
...
Fei Jia
Samuel Kriman
Simeng Sun
Dima Rekesh
Boris Ginsburg
RALM
54
0
0
11 Apr 2025
Cognitive Memory in Large Language Models
Cognitive Memory in Large Language Models
Lianlei Shan
Shixian Luo
Zezhou Zhu
Yu Yuan
Yong Wu
LLMAG
KELM
63
1
0
03 Apr 2025
Implicit Search via Discrete Diffusion: A Study on Chess
Implicit Search via Discrete Diffusion: A Study on Chess
Jiacheng Ye
Zhenyu Wu
Jiahui Gao
Zhiyong Wu
Xin Jiang
Z. Li
Lingpeng Kong
DiffM
43
2
0
27 Feb 2025
LongAttn: Selecting Long-context Training Data via Token-level Attention
LongAttn: Selecting Long-context Training Data via Token-level Attention
Longyun Wu
Dawei Zhu
Guangxiang Zhao
Zhuocheng Yu
Junfeng Ran
Xiangyu Wong
Lin Sun
Sujian Li
33
0
0
24 Feb 2025
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
Jiaxi Li
Xingxing Zhang
Xun Wang
Xiaolong Huang
Li Dong
Liang Wang
Si-Qing Chen
Wei Lu
Furu Wei
SyDa
60
0
0
23 Feb 2025
Qwen2.5-1M Technical Report
A. Yang
Bowen Yu
C. Li
Dayiheng Liu
Fei Huang
...
Xingzhang Ren
Xinlong Yang
Y. Li
Zhiying Xu
Z. Zhang
63
10
0
28 Jan 2025
LongSafety: Enhance Safety for Long-Context LLMs
LongSafety: Enhance Safety for Long-Context LLMs
Mianqiu Huang
Xiaoran Liu
Shaojun Zhou
Mozhi Zhang
Chenkun Tan
...
Zhikai Lei
Linlin Li
Q. Liu
Yaqian Zhou
Xipeng Qiu
ELM
ALM
30
0
0
11 Nov 2024
Two are better than one: Context window extension with multi-grained
  self-injection
Two are better than one: Context window extension with multi-grained self-injection
Wei Han
Pan Zhou
Soujanya Poria
Shuicheng Yan
19
0
0
25 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
21
1
0
07 Oct 2024
LoGra-Med: Long Context Multi-Graph Alignment for Medical
  Vision-Language Model
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model
Duy M. H. Nguyen
N. T. Diep
Trung Q. Nguyen
Hoang-Bao Le
Tai Nguyen
...
Pengtao Xie
Roger Wattenhofer
James Zhou
Daniel Sonntag
Mathias Niepert
VLM
49
1
0
03 Oct 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training
  for Enhanced Speech Recognition and Translation
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Nithin Rao Koluguri
Travis M. Bartley
Hainan Xu
Oleksii Hrinchuk
Jagadeesh Balam
Boris Ginsburg
Georg Kucsko
19
2
0
09 Sep 2024
LongRecipe: Recipe for Efficient Long Context Generalization in Large
  Language Models
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Zhiyuan Hu
Yuliang Liu
Jinman Zhao
Suyuchen Wang
Yan Wang
...
Qing Gu
Anh Tuan Luu
See-Kiong Ng
Zhiwei Jiang
Bryan Hooi
44
5
0
31 Aug 2024
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Yushi Bai
Jiajie Zhang
Xin Lv
Linzhi Zheng
Siqi Zhu
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
VGen
LLMAG
ALM
31
37
0
13 Aug 2024
ReAttention: Training-Free Infinite Context with Finite Attention Scope
ReAttention: Training-Free Infinite Context with Finite Attention Scope
Xiaoran Liu
Ruixiao Li
Yuerong Song
Zhigeng Liu
Kai Lv
Hang Yan
Hang Yan
Linlin Li
Qun Liu
Xipeng Qiu
LLMAG
23
1
0
21 Jul 2024
Qwen2 Technical Report
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Bo Zheng
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM
VLM
MU
41
458
0
15 Jul 2024
MammothModa: Multi-Modal Large Language Model
MammothModa: Multi-Modal Large Language Model
Qi She
Junwen Pan
Xin Wan
Rui Zhang
Dawei Lu
Kai Huang
MLLM
VLM
28
1
0
26 Jun 2024
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Ziyan Jiang
Xueguang Ma
Wenhu Chen
RALM
29
47
0
21 Jun 2024
3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position
  Encoding
3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding
Xindian Ma
Wenyuan Liu
Peng Zhang
Nan Xu
29
2
0
14 Jun 2024
MambaLRP: Explaining Selective State Space Sequence Models
MambaLRP: Explaining Selective State Space Sequence Models
F. Jafari
G. Montavon
Klaus-Robert Müller
Oliver Eberle
Mamba
43
9
0
11 Jun 2024
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of
  Multi-modal LLMs in Video Analysis
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Chaoyou Fu
Yuhan Dai
Yondong Luo
Lei Li
Shuhuai Ren
...
Tong Bill Xu
Xiawu Zheng
Enhong Chen
Rongrong Ji
Xing Sun
VLM
MLLM
39
216
0
31 May 2024
Automatically Generating Numerous Context-Driven SFT Data for LLMs
  across Diverse Granularity
Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity
Shanghaoran Quan
27
3
0
26 May 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
25
20
0
18 Apr 2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an
  Efficient Context Memory
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao
Pengle Zhang
Xu Han
Guangxuan Xiao
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
LLMAG
31
33
0
07 Feb 2024
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence
  Lengths in Large Language Models
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
62
21
0
09 Jan 2024
PoSE: Efficient Context Window Extension of LLMs via Positional
  Skip-wise Training
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
Dawei Zhu
Nan Yang
Liang Wang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
52
77
0
19 Sep 2023
Train Short, Test Long: Attention with Linear Biases Enables Input
  Length Extrapolation
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
234
690
0
27 Aug 2021
1