Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail Burtsev
19 April 2023 · LRM

Papers citing "Scaling Transformer to 1M tokens and beyond with RMT"

20 / 70 papers shown
Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving
Matthias Zeller, Susmit Jha, Patrick Lincoln, Jens Behley, Alvaro Velasquez, Rickard Ewetz, C. Stachniss
28 Sep 2023 · LRM

In-context Interference in Chat-based Large Language Models
Eric Nuertey Coleman, J. Hurtado, Vincenzo Lomonaco
22 Sep 2023 · KELM

Language Modeling Is Compression
Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, ..., Wenliang Kevin Li, Matthew Aitchison, Laurent Orseau, Marcus Hutter, J. Veness
19 Sep 2023 · AI4CE

Native Language Identification with Big Bird Embeddings
Sergey Kramp, Giovanni Cassani, Chris Emmery
13 Sep 2023

FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
Ali Abdari, Alex Falcon, Giuseppe Serra
06 Sep 2023

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Yushi Bai, Xin Lv, Jiajie Zhang, Hong Lyu, Jiankai Tang, ..., Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li
28 Aug 2023 · LLMAG, RALM

Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh, Sibei Chen, Chun-Liang Li, Yasuhisa Fujii, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister
01 Aug 2023 · LLMAG, SyDa

Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting
Aseem Arora, Shabbirhussain Bhaisaheb, Harshit Nigam, Manasi S. Patwardhan, L. Vig, Gautam M. Shroff
01 Aug 2023

LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
Tiezhu Sun, Weiguo Pian, N. Daoudi, Kevin Allix, Tegawende F. Bissyande, Jacques Klein
30 Jul 2023

L-Eval: Instituting Standardized Evaluation for Long Context Language Models
Chen An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu
20 Jul 2023 · ELM, ALM

In-context Autoencoder for Context Compression in a Large Language Model
Tao Ge, Jing Hu, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei
13 Jul 2023 · RALM

RecallM: An Adaptable Memory Mechanism with Temporal Understanding for Large Language Models
Brandon Kynoch, Hugo Latapie, Dwane van der Sluis
06 Jul 2023 · CLL, LLMAG, KELM

LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei
05 Jul 2023 · CLL

ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory
Chenxu Hu, Jie Fu, Chenzhuang Du, Simian Luo, J. Zhao, Hang Zhao
06 Jun 2023 · KELM, LLMAG

Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati, Itamar Zimerman, Lior Wolf
24 May 2023

Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov
23 May 2023

RWKV: Reinventing RNNs for the Transformer Era
Bo Peng, Eric Alcaide, Quentin G. Anthony, Alon Albalak, Samuel Arcadinho, ..., Qihang Zhao, P. Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu
22 May 2023

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
31 Dec 2020 · AIMat

ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding, Junyuan Shang, Shuohuan Wang, Yu Sun, Hao Tian, Hua-Hong Wu, Haifeng Wang
31 Dec 2020

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
28 Jul 2020 · VLM