PSC: Extending Context Window of Large Language Models via Phase Shift Calibration

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

18 May 2025

Papers citing "PSC: Extending Context Window of Large Language Models via Phase Shift Calibration"

CARFT: Boosting LLM Reasoning via Contrastive Learning with Annotated Chain-of-Thought-based Reinforced Fine-Tuning

211

21 Aug 2025

SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

372

18 May 2025

LoRA Learns Less and Forgets Less

D. Biderman

Jose Javier Gonzalez Ortiz

...

344

230

15 May 2024

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Li Lyna Zhang

Fan Yang

228

261

21 Feb 2024

Code Llama: Open Foundation Models for Code

Baptiste Rozière

...

Louis Martin

464

2,786

24 Aug 2023

L-Eval: Instituting Standardized Evaluation for Long Context Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Lingpeng Kong

Xipeng Qiu

468

202

20 Jul 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models

Louis Martin

...

Sharan Narang

Sergey Edunov

8.3K

15,302

18 Jul 2023

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

International Conference on Learning Representations (ICLR), 2023

Tri Dao

430

2,070

17 Jul 2023

Extending Context Window of Large Language Models via Positional Interpolation

436

684

27 Jun 2023

Landmark Attention: Random-Access Infinite Context Length for Transformers

Neural Information Processing Systems (NeurIPS), 2023

Amirkeivan Mohtashami

Martin Jaggi

323

195

25 May 2023

LLaMA: Open and Efficient Foundation Language Models

...

6.8K

17,868

27 Feb 2023

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Neural Information Processing Systems (NeurIPS), 2022

845

3,353

27 May 2022

TruthfulQA: Measuring How Models Mimic Human Falsehoods

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

1.6K

2,692

08 Sep 2021

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation

International Conference on Learning Representations (ICLR), 2021

Ofir Press

Noah A. Smith

M. Lewis

835

1,010

27 Aug 2021

RoFormer: Enhanced Transformer with Rotary Position Embedding

842

4,005

20 Apr 2021

Measuring Massive Multitask Language Understanding

International Conference on Learning Representations (ICLR), 2020

2.3K

6,617

07 Sep 2020

Compressive Transformers for Long-Range Sequence Modelling

International Conference on Learning Representations (ICLR), 2019

Jack W. Rae

Anna Potapenko

Siddhant M. Jayakumar

Timothy Lillicrap

297

774

13 Nov 2019

HellaSwag: Can a Machine Really Finish Your Sentence?

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

Yejin Choi

638

3,460

19 May 2019

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

Oyvind Tafjord

987

3,777

14 Mar 2018

Attention Is All You Need

Neural Information Processing Systems (NeurIPS), 2017

4.2K

162,388

12 Jun 2017

Deep Residual Learning for Image Recognition

3.7K

217,813

10 Dec 2015