LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

6 February 2024
Tao Yuan
Xuefei Ning
Dong Zhou
Zhijie Yang
Shiyao Li
Minghui Zhuang
Zheyue Tan
Zhuyu Yao
Dahua Lin
Boxun Li
Guohao Dai
Shengen Yan
Yu Wang
    ALM

Papers citing "LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K"

31 / 31 papers shown
FamilyTool: A Multi-hop Personalized Tool Use Benchmark
Yuxin Wang
Yiran Guo
Y. Zheng
Zhangyue Yin
S. Chen
Jie Yang
Jiajun Chen
Xuanjing Huang
Xipeng Qiu
24
0
0
09 Apr 2025
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models
C. Xu
Wei Ping
P. Xu
Z. Liu
Boxin Wang
M. Shoeybi
Bo Li
Bryan Catanzaro
17
1
0
08 Apr 2025
The Use of Gaze-Derived Confidence of Inferred Operator Intent in Adjusting Safety-Conscious Haptic Assistance
Jeremy D. Webb
Michael Bowman
Songpo Li
Xiaoli Zhang
31
0
0
04 Apr 2025
A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu
Jinzheng Yu
Yang Xu
Zhongyang Li
Qingfu Zhu
LLMAG
64
0
0
17 Mar 2025
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Yuchen Yan
Yongliang Shen
Y. Liu
Jin Jiang
M. Zhang
Jian Shao
Yueting Zhuang
LRM
ReLM
53
3
0
09 Mar 2025
Qwen2.5-1M Technical Report
A. Yang
Bowen Yu
C. Li
Dayiheng Liu
Fei Huang
...
Xingzhang Ren
Xinlong Yang
Y. Li
Zhiying Xu
Z. Zhang
63
10
0
28 Jan 2025
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts
Kai Han
Samuel Albanie
LLMAG
72
0
0
07 Nov 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
X. Sun
Yanfeng Chen
Y. Huang
Ruobing Xie
Jiaqi Zhu
...
Zhanhui Kang
Yong Yang
Yuhong Liu
Di Wang
Jie Jiang
MoE
ALM
ELM
65
24
0
04 Nov 2024
LoGU: Long-form Generation with Uncertainty Expressions
Ruihan Yang
Caiqi Zhang
Zhisong Zhang
Xinting Huang
Sen Yang
Nigel Collier
Dong Yu
Deqing Yang
HILM
19
0
0
18 Oct 2024
Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models
Xinyu Liu
Runsong Zhao
Pengcheng Huang
Chunyang Xiao
Bei Li
Jingang Wang
Tong Xiao
Jingbo Zhu
21
0
0
07 Oct 2024
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
52
24
0
03 Oct 2024
CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs
Junlin Lv
Yuan Feng
Xike Xie
Xin Jia
Qirong Peng
Guiming Xie
18
3
0
19 Sep 2024
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
Luning Wang
Shiyao Li
Xuefei Ning
Zhihang Yuan
Shengen Yan
Guohao Dai
Yu Wang
38
0
0
16 Sep 2024
Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks
Zi Yang
28
0
0
10 Sep 2024
Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models
Junfeng Tian
Da Zheng
Yang Cheng
Rui-cang Wang
C. Zhang
Debing Zhang
17
4
0
07 Sep 2024
Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach
Zhuowan Li
Cheng-rong Li
Mingyang Zhang
Qiaozhu Mei
Michael Bendersky
3DV
RALM
44
33
0
23 Jul 2024
Evaluating Long Range Dependency Handling in Code Generation Models using Multi-Step Key Retrieval
Yannick Assogba
Donghao Ren
41
0
0
23 Jul 2024
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Bo Zheng
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM
VLM
MU
53
779
0
15 Jul 2024
Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks
Hokyung Lee
Sumanyu Sharma
Bing Hu
27
2
0
21 Jun 2024
MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens
Yongqi Fan
Hongli Sun
Kui Xue
Xiaofan Zhang
Shaoting Zhang
Tong Ruan
34
0
0
21 Jun 2024
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Tianyu Fu
Haofeng Huang
Xuefei Ning
Genghan Zhang
Boju Chen
...
Shiyao Li
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
38
2
0
21 Jun 2024
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models
Shilong Li
Yancheng He
Hangyu Guo
Xingyuan Bu
Ge Bai
...
Xingwei Qu
Yangguang Li
Wanli Ouyang
Wenbo Su
Bo Zheng
RALM
LLMAG
27
6
0
20 Jun 2024
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Mikhail Burtsev
RALM
ALM
LRM
ReLM
ELM
42
57
0
14 Jun 2024
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism
Bingyang Wu
Shengyu Liu
Yinmin Zhong
Peng Sun
Xuanzhe Liu
Xin Jin
RALM
27
49
0
15 Apr 2024
Evaluating Quantized Large Language Models
Shiyao Li
Xuefei Ning
Luning Wang
Tengxuan Liu
Xiangsheng Shi
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
38
42
0
28 Feb 2024
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
Bin Lin
Chen Zhang
Tao Peng
Hanyu Zhao
Wencong Xiao
...
Shen Li
Zhigang Ji
Tao Xie
Yong Li
Wei Lin
31
46
0
05 Jan 2024
Don't Make Your LLM an Evaluation Benchmark Cheater
Kun Zhou
Yutao Zhu
Zhipeng Chen
Wentong Chen
Wayne Xin Zhao
Xu Chen
Yankai Lin
Ji-Rong Wen
Jiawei Han
ELM
105
136
0
03 Nov 2023
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
Chi Han
Qifan Wang
Hao Peng
Wenhan Xiong
Yu Chen
Heng Ji
Sinong Wang
37
47
0
30 Aug 2023
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
Wangchunshu Zhou
Yuchen Eleanor Jiang
Peng Cui
Tiannan Wang
Zhenxin Xiao
Yifan Hou
Ryan Cotterell
Mrinmaya Sachan
RALM
LLMAG
82
58
0
22 May 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
240
1,070
0
05 Oct 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
234
690
0
27 Aug 2021