SELF: Self-Extend the Context Length With Logistic Growth Function

arXiv:2505.17296 · 22 May 2025
Phat Thanh Dang, Saahil Thoppay, Wang Yang, Qifan Wang, Vipin Chaudhary, Xiaotian Han
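As context for the title: a logistic growth function is the standard S-shaped curve f(x) = L / (1 + e^(-k(x - x0))), which grows slowly at first, fastest around its midpoint x0, and then saturates at the ceiling L. The short Python sketch below only illustrates that generic curve; the constants are arbitrary placeholders, and the code is not a reproduction of the paper's context-extension method.

import math

def logistic(x: float, L: float = 1.0, k: float = 1.0, x0: float = 0.0) -> float:
    # Generic logistic growth function (illustration only; the constants
    # are placeholders, not values taken from the SELF paper).
    # L: upper asymptote, k: growth rate, x0: midpoint where f(x0) = L / 2.
    return L / (1.0 + math.exp(-k * (x - x0)))

# Slow growth well below the midpoint, steepest growth at the midpoint,
# then saturation toward L.
for x in (-6, -3, 0, 3, 6):
    print(f"f({x:+d}) = {logistic(x):.4f}")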

Papers citing "SELF: Self-Extend the Context Length With Logistic Growth Function"

17 citing papers:

InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun
LLMAG · 07 Feb 2024

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
International Conference on Machine Learning (ICML), 2024
Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu
02 Jan 2024

Efficient Streaming Language Models with Attention Sinks
International Conference on Learning Representations (ICLR), 2024
Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
AI4TS, RALM · 29 Sep 2023

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
International Conference on Learning Representations (ICLR), 2024
Dawei Zhu, Nan Yang, Liang Wang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li
19 Sep 2023

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, ..., Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li
LLMAG, RALM · 28 Aug 2023

L-Eval: Instituting Standardized Evaluation for Long Context Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu
ELM, ALM · 20 Jul 2023

Extending Context Window of Large Language Models via Positional Interpolation
Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
27 Jun 2023

Landmark Attention: Random-Access Infinite Context Length for Transformers
Neural Information Processing Systems (NeurIPS), 2023
Amirkeivan Mohtashami, Martin Jaggi
LLMAG · 25 May 2023

When Neural Networks Fail to Generalize? A Model Sensitivity Perspective
AAAI Conference on Artificial Intelligence (AAAI), 2023
Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan
OOD, AAML · 01 Dec 2022

OPT: Open Pre-trained Transformer Language Models
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, ..., Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer
VLM, OSLM, AI4CE · 02 May 2022

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
International Conference on Learning Representations (ICLR), 2022
Ofir Press, Noah A. Smith, Mike Lewis
27 Aug 2021

RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu
20 Apr 2021

Recent Advances in Adversarial Training for Adversarial Robustness
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen, Qian Wang
AAML · 02 Feb 2021

mT5: A massively multilingual pre-trained text-to-text transformer
Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
22 Oct 2020

Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
BDL · 28 May 2020

Compressive Transformers for Long-Range Sequence Modelling
International Conference on Learning Representations (ICLR), 2020
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy Lillicrap
RALM, VLM, KELM · 13 Nov 2019

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov
VLM · 09 Jan 2019