CoLT5: Faster Long-Range Transformers with Conditional Computation
arXiv:2303.09752 · 17 March 2023
Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David C. Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-hsuan Sung, Sumit Sanghai
Communities: LLMAG

Papers citing "CoLT5: Faster Long-Range Transformers with Conditional Computation"

11 / 11 papers shown
Adaptive Layer-skipping in Pre-trained LLMs
  Xuan Luo, Weizhi Wang, Xifeng Yan
  31 Mar 2025 · 66 · 0 · 0

CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement
  Gaifan Zhang, Yi Zhou, Danushka Bollegala
  21 Mar 2025 · 61 · 0 · 0

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
  Gabrielle Kaili-May Liu, Bowen Shi, Avi Caciularu, Idan Szpektor, Arman Cohan
  30 Oct 2024 · 58 · 3 · 0

Investigating Recurrent Transformers with Dynamic Halt
  Jishnu Ray Chowdhury, Cornelia Caragea
  01 Feb 2024 · 34 · 1 · 0

No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
  Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, Matt J. Kusner
  12 Jul 2023 · 13 · 41 · 0

LongNet: Scaling Transformers to 1,000,000,000 Tokens
  Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei
  Communities: CLL · 05 Jul 2023 · 32 · 149 · 0

Scaling Transformer to 1M tokens and beyond with RMT
  Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail Burtsev
  Communities: LRM · 19 Apr 2023 · 11 · 86 · 0

ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts
  Yuta Koreeda, Christopher D. Manning
  Communities: AILaw · 05 Oct 2021 · 87 · 96 · 0

Rider: Reader-Guided Passage Reranking for Open-Domain Question Answering
  Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen
  Communities: OOD, LRM · 01 Jan 2021 · 134 · 37 · 0

Big Bird: Transformers for Longer Sequences
  Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
  Communities: VLM · 28 Jul 2020 · 249 · 1,982 · 0

Scaling Laws for Neural Language Models
  Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
  23 Jan 2020 · 226 · 4,424 · 0