ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2307.11088
  4. Cited By
L-Eval: Instituting Standardized Evaluation for Long Context Language
  Models
v1v2v3 (latest)

L-Eval: Instituting Standardized Evaluation for Long Context Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023
20 July 2023
Chen An
Shansan Gong
Ming Zhong
Xingjian Zhao
Mukai Li
Jun Zhang
Lingpeng Kong
Xipeng Qiu
    ELMALM
ArXiv (abs)PDFHTMLHuggingFace (5 upvotes)

Papers citing "L-Eval: Instituting Standardized Evaluation for Long Context Language Models"

37 / 137 papers shown
Title
ERAGent: Enhancing Retrieval-Augmented Language Models with Improved
  Accuracy, Efficiency, and Personalization
ERAGent: Enhancing Retrieval-Augmented Language Models with Improved Accuracy, Efficiency, and PersonalizationEuropean Conference on Artificial Intelligence (ECAI), 2024
Yunxiao Shi
Xing Zi
Zijing Shi
Haimin Zhang
Qiang Wu
Min Xu
RALM
176
21
0
06 May 2024
Make Your LLM Fully Utilize the Context
Make Your LLM Fully Utilize the Context
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
SyDa
342
121
0
25 Apr 2024
SnapKV: LLM Knows What You are Looking for Before Generation
SnapKV: LLM Knows What You are Looking for Before Generation
Yuhong Li
Yingbing Huang
Bowen Yang
Bharat Venkitesh
Acyr Locatelli
Hanchen Ye
Tianle Cai
Patrick Lewis
Deming Chen
VLM
332
360
0
22 Apr 2024
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual
  Alignment
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment
Zhaofeng Wu
Ananth Balashankar
Yoon Kim
Jacob Eisenstein
Ahmad Beirami
226
25
0
18 Apr 2024
LoongServe: Efficiently Serving Long-context Large Language Models with
  Elastic Sequence Parallelism
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism
Bingya Wu
Shengyu Liu
Yinmin Zhong
Peng Sun
Xuanzhe Liu
Xin Jin
RALM
265
109
0
15 Apr 2024
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarksNorth American Chapter of the Association for Computational Linguistics (NAACL), 2024
Chonghua Wang
Haodong Duan
Songyang Zhang
Dahua Lin
Kai-xiang Chen
ELM
169
29
0
09 Apr 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
345
315
0
28 Mar 2024
MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task
  Planning with Open-Source Large Language Model
MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model
Yike Wu
Jiatao Zhang
Nan Hu
LanLing Tang
Guilin Qi
Jun Shao
Jie Ren
Wei Song
201
21
0
27 Mar 2024
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large
  Language Models
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
Zexuan Qiu
Jingjing Li
Shijue Huang
Wanjun Zhong
Irwin King
ELMALM
265
11
0
06 Mar 2024
Resonance RoPE: Improving Context Length Generalization of Large
  Language Models
Resonance RoPE: Improving Context Length Generalization of Large Language Models
Suyuchen Wang
I. Kobyzev
Peng Lu
Mehdi Rezagholizadeh
Bang Liu
180
21
0
29 Feb 2024
Training-Free Long-Context Scaling of Large Language Models
Training-Free Long-Context Scaling of Large Language Models
Chen An
Fei Huang
Jun Zhang
Shansan Gong
Xipeng Qiu
Chang Zhou
Lingpeng Kong
ALMLRM
254
54
0
27 Feb 2024
$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens
∞\infty∞Bench: Extending Long Context Evaluation Beyond 100K Tokens
Xinrong Zhang
Yingfa Chen
Shengding Hu
Zihang Xu
Junhao Chen
...
Xu Han
Zhen Leng Thai
Shuo Wang
Zhiyuan Liu
Maosong Sun
RALMLRM
427
264
0
21 Feb 2024
LongWanjuan: Towards Systematic Measurement for Long Text Quality
LongWanjuan: Towards Systematic Measurement for Long Text Quality
Kai Lv
Xiaoran Liu
Qipeng Guo
Hang Yan
Conghui He
Xipeng Qiu
Dahua Lin
139
9
0
21 Feb 2024
Same Task, More Tokens: the Impact of Input Length on the Reasoning
  Performance of Large Language Models
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Mosh Levy
Alon Jacoby
Yoav Goldberg
357
141
0
19 Feb 2024
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs
  Miss
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
305
42
0
16 Feb 2024
Data Engineering for Scaling Language Models to 128K Context
Data Engineering for Scaling Language Models to 128K Context
Yao Fu
Yikang Shen
Xinyao Niu
Xiang Yue
Hanna Hajishirzi
Yoon Kim
Hao-Chun Peng
MoE
232
179
0
15 Feb 2024
On the Efficacy of Eviction Policy for Key-Value Constrained Generative
  Language Model Inference
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference
Siyu Ren
Kenny Q. Zhu
213
35
0
09 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in
  Closed-Source LLMs
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMsConference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILMELMPILM
365
246
0
06 Feb 2024
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
Tao Yuan
Xuefei Ning
Dong Zhou
Zhijie Yang
Shiyao Li
...
Dahua Lin
Boxun Li
Guohao Dai
Shengen Yan
Yu Wang
ALM
269
55
0
06 Feb 2024
LongAlign: A Recipe for Long Context Alignment of Large Language Models
LongAlign: A Recipe for Long Context Alignment of Large Language Models
Yushi Bai
Xin Lv
Jiajie Zhang
Yuze He
Ji Qi
Lei Hou
Jie Tang
Yuxiao Dong
Juanzi Li
ALM
152
76
0
31 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text
  Generation with Large Language Models
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language ModelsAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
195
20
0
26 Jan 2024
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention
  and Distributed KVCache
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
Bin Lin
Chen Zhang
Tao Peng
Hanyu Zhao
Wencong Xiao
...
Shen Li
Zhigang Ji
Tao Xie
Yong Li
Jialin Li
248
74
0
05 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
LLM Maybe LongLM: Self-Extend LLM Context Window Without TuningInternational Conference on Machine Learning (ICML), 2024
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Helen Zhou
399
148
0
02 Jan 2024
Linear Attention via Orthogonal Memory
Linear Attention via Orthogonal Memory
Jun Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Lingpeng Kong
215
3
0
18 Dec 2023
Zebra: Extending Context Window with Layerwise Grouped Local-Global
  Attention
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention
Kaiqiang Song
Xiaoyang Wang
Sangwoo Cho
Xiaoman Pan
Dong Yu
198
7
0
14 Dec 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models
  Catching up?
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Shafiq Joty
ELMCLLAI4MHLRMALM
252
31
0
28 Nov 2023
Advancing Transformer Architecture in Long-Context Large Language
  Models: A Comprehensive Survey
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
LLMAGKELM
286
95
0
21 Nov 2023
LooGLE: Can Long-Context Language Models Understand Long Contexts?
LooGLE: Can Long-Context Language Models Understand Long Contexts?
Jiaqi Li
Minghua Yi
Zilong Zheng
Muhan Zhang
ELMRALM
228
180
0
08 Nov 2023
M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context
  Evaluation Benchmark for Large Language Models
M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language ModelsAnnual Meeting of the Association for Computational Linguistics (ACL), 2023
Wai-Chung Kwan
Xingshan Zeng
Yufei Wang
Yusen Sun
Liangyou Li
Lifeng Shang
Qun Liu
Kam-Fai Wong
ELM
262
13
0
30 Oct 2023
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large
  Language Models
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language ModelsNorth American Chapter of the Association for Computational Linguistics (NAACL), 2023
Fangyu Lei
Qian Liu
Yiming Huang
Shizhu He
Jun Zhao
Kang Liu
ELMLRM
213
16
0
23 Oct 2023
Scaling Laws of RoPE-based Extrapolation
Scaling Laws of RoPE-based ExtrapolationInternational Conference on Learning Representations (ICLR), 2023
Xiaoran Liu
Hang Yan
Shuo Zhang
Chen An
Xipeng Qiu
Dahua Lin
216
114
0
08 Oct 2023
Effective Long-Context Scaling of Foundation Models
Effective Long-Context Scaling of Foundation ModelsNorth American Chapter of the Association for Computational Linguistics (NAACL), 2023
Wenhan Xiong
Jingyu Liu
Igor Molybog
Hejia Zhang
Prajjwal Bhargava
...
Dániel Baráth
Sergey Edunov
Mike Lewis
Sinong Wang
Hao Ma
261
289
0
27 Sep 2023
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling
  Capacities of Large Language Models
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language ModelsInternational Conference on Language Resources and Evaluation (LREC), 2023
Zican Dong
Tianyi Tang
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
RALMALM
347
49
0
23 Sep 2023
Can Large Language Models Understand Real-World Complex Instructions?
Can Large Language Models Understand Real-World Complex Instructions?AAAI Conference on Artificial Intelligence (AAAI), 2023
Qi He
Jie Zeng
Wenhao Huang
Lina Chen
Jin Xiao
...
Shisong Chen
Yikai Zhang
Zhouhong Gu
Jiaqing Liang
Yanghua Xiao
ALMLRMELM
261
85
0
17 Sep 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Qingyue Wang
Y. Fu
Yanan Cao
Zhiliang Tian
Zhiliang Tian
Dacheng Tao
LLMAGKELMRALM
492
44
0
29 Aug 2023
LongBench: A Bilingual, Multitask Benchmark for Long Context
  Understanding
LongBench: A Bilingual, Multitask Benchmark for Long Context UnderstandingAnnual Meeting of the Association for Computational Linguistics (ACL), 2023
Yushi Bai
Xin Lv
Jiajie Zhang
Hong Lyu
Jiankai Tang
...
Aohan Zeng
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
LLMAGRALM
259
868
0
28 Aug 2023
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
CAB: Comprehensive Attention Benchmarking on Long Sequence ModelingInternational Conference on Machine Learning (ICML), 2022
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Dianbo Sui
3DV
552
9
0
14 Oct 2022
Previous
123