2401.07004
Extending LLMs' Context Window with 100 Samples
13 January 2024
Yikai Zhang
Junlong Li
Pengfei Liu
Papers citing
"Extending LLMs' Context Window with 100 Samples"
13 / 13 papers shown
Title
Evaluating LLaMA 3.2 for Software Vulnerability Detection
José Gonçalves
Miguel Silva
Bernardo Cabral
Tiago Dias
Eva Maia
Isabel Praça
Ricardo Severino
Luis Lino Ferreira
41
1
0
10 Mar 2025
AIDBench: A benchmark for evaluating the authorship identification capability of large language models
Zichen Wen
Dadi Guo
Huishuai Zhang
67
0
0
20 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
58
4
0
31 Oct 2024
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
69
37
0
03 Oct 2024
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices
Zhi Chen
Qiguang Chen
Libo Qin
Qipeng Guo
Haijun Lv
Yicheng Zou
Wanxiang Che
Hang Yan
Kai Chen
Dahua Lin
SyDa
38
4
0
03 Sep 2024
Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective
M. Zhong
Chen Zhang
Yikun Lei
Xikai Liu
Yan Gao
Yao Hu
Kehai Chen
Min Zhang
35
5
0
19 Jun 2024
LoCoCo: Dropping In Convolutions for Long Context Compression
Ruisi Cai
Yuandong Tian
Zhangyang Wang
Beidi Chen
33
9
0
08 Jun 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
35
20
0
18 Apr 2024
T-RAG: Lessons from the LLM Trenches
M. Fatehkia
J. Lucas
Sanjay Chawla
LLMAG
27
18
0
12 Feb 2024
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
242
690
0
27 Aug 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
160
399
0
18 Jan 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
249
1,982
0
28 Jul 2020