
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
arXiv:2205.14135
27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
    VLM

Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"

Showing 50 of 1,418 citing papers.
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
Zihao Yu
Haoyang Li
Fangcheng Fu
Xupeng Miao
Bin Cui
DiffM
12
3
0
27 May 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
6
176
0
27 May 2023
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time
Zichang Liu
Aditya Desai
Fangshuo Liao
Weitao Wang
Victor Xie
Zhaozhuo Xu
Anastasios Kyrillidis
Anshumali Shrivastava
6
200
0
26 May 2023
Backpack Language Models
John Hewitt
John Thickstun
Christopher D. Manning
Percy Liang
KELM
6
16
0
26 May 2023
Imitating Task and Motion Planning with Visuomotor Transformers
Murtaza Dalal
Ajay Mandlekar
Caelan Reed Garrett
Ankur Handa
Ruslan Salakhutdinov
D. Fox
24
52
0
25 May 2023
Landmark Attention: Random-Access Infinite Context Length for Transformers
Amirkeivan Mohtashami
Martin Jaggi
LLMAG
11
94
0
25 May 2023
Online learning of long-range dependencies
Nicolas Zucchet
Robert Meier
Simon Schug
Asier Mujika
João Sacramento
CLL
33
18
0
25 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurélien Lucchi
Thomas Hofmann
29
51
0
25 May 2023
Manifold Diffusion Fields
Ahmed A. A. Elhag
Yuyang Wang
J. Susskind
Miguel Angel Bautista
DiffM
AI4CE
25
4
0
24 May 2023
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
19
9
0
24 May 2023
Just CHOP: Embarrassingly Simple LLM Compression
A. Jha
Tom Sherborne
Evan Pete Walsh
Dirk Groeneveld
Emma Strubell
Iz Beltagy
12
3
0
24 May 2023
Adapting Language Models to Compress Contexts
Alexis Chevalier
Alexander Wettig
Anirudh Ajith
Danqi Chen
LLMAG
9
98
0
24 May 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
25
2
0
24 May 2023
BinaryViT: Towards Efficient and Accurate Binary Vision Transformers
Junrui Xiao
Zhikai Li
Lianwei Yang
Qingyi Gu
MQ
ViT
17
2
0
24 May 2023
Segmented Recurrent Transformer: An Efficient Sequence-to-Sequence Model
Yinghan Long
Sayeed Shafayet Chowdhury
Kaushik Roy
25
1
0
24 May 2023
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Sina J. Semnani
Violet Z. Yao
He Zhang
M. Lam
KELM
AI4MH
18
71
0
23 May 2023
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
11
2
0
23 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
19
4
0
22 May 2023
A Framework for Fine-Grained Synchronization of Dependent GPU Kernels
Abhinav Jangda
Saeed Maleki
M. Dehnavi
Madan Musuvathi
Olli Saarikivi
14
4
0
22 May 2023
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Joshua Ainslie
James Lee-Thorp
Michiel de Jong
Yury Zemlyanskiy
Federico Lebrón
Sumit Sanghai
10
569
0
22 May 2023
Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline
Zangwei Zheng
Xiaozhe Ren
Fuzhao Xue
Yang Luo
Xin Jiang
Yang You
16
53
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
43
550
0
22 May 2023
Non-Autoregressive Document-Level Machine Translation
Guangsheng Bao
Zhiyang Teng
Hao Zhou
Jianhao Yan
Yue Zhang
26
0
0
22 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
19
12
0
22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Rada Mihalcea
LRM
22
6
0
21 May 2023
CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting
Xue Wang
Tian Zhou
Qingsong Wen
Jinyang Gao
Bolin Ding
Rong Jin
AI4TS
10
35
0
20 May 2023
Efficient ConvBN Blocks for Transfer Learning and Beyond
Kaichao You
Guo Qin
Anchang Bao
Mengsi Cao
Ping-Chia Huang
Jiulong Shan
Mingsheng Long
11
1
0
19 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
10
113
0
18 May 2023
Less is More! A slim architecture for optimal language translation
Luca Herranz-Celotti
E. Rrapaj
23
0
0
18 May 2023
Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He
Weixi Feng
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
William Yang Wang
X. Wang
DiffM
32
7
0
18 May 2023
Ray-Patch: An Efficient Querying for Light Field Transformers
T. B. Martins
Javier Civera
ViT
26
0
0
16 May 2023
Diffusion Models for Imperceptible and Transferable Adversarial Attack
Jianqi Chen
H. Chen
Keyan Chen
Yilan Zhang
Zhengxia Zou
Z. Shi
DiffM
10
56
0
14 May 2023
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Xinyu Liu
Houwen Peng
Ningxin Zheng
Yuqing Yang
Han Hu
Yixuan Yuan
ViT
15
266
0
11 May 2023
StarCoder: may the source be with you!
Raymond Li
Loubna Ben Allal
Yangtian Zi
Niklas Muennighoff
Denis Kocetkov
...
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries
31
688
0
09 May 2023
CaloClouds: Fast Geometry-Independent Highly-Granular Calorimeter Simulation
E. Buhmann
S. Diefenbacher
E. Eren
F. Gaede
Gregor Kasieczka
A. Korol
W. Korcari
K. Krüger
Peter McKeown
DiffM
6
45
0
08 May 2023
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs
Deepak Narayanan
Keshav Santhanam
Peter Henderson
Rishi Bommasani
Tony Lee
Percy Liang
121
3
0
03 May 2023
Approximating CKY with Transformers
Ghazal Khalighinejad
Ollie Liu
Sam Wiseman
35
2
0
03 May 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
Yoad Tewel
Rinon Gal
Gal Chechik
Y. Atzmon
DiffM
132
163
0
02 May 2023
Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs
Shixun Wu
Yujia Zhai
Jinyang Liu
Jiajun Huang
Zizhe Jian
Bryan M. Wong
Zizhong Chen
14
12
0
01 May 2023
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
Yichen Xie
Chenfeng Xu
Marie-Julie Rakotosaona
Patrick Rim
F. Tombari
Kurt Keutzer
M. Tomizuka
Wei Zhan
3DPC
39
49
0
27 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
31
270
0
24 Apr 2023
Transformer-Based Language Model Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens
Byung-Doh Oh
William Schuler
26
13
0
22 Apr 2023
Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations
Yu-Hui Chen
Raman Sarokin
Juhyun Lee
Jiuqiang Tang
Chuo-Ling Chang
Andrei Kulik
Matthias Grundmann
VLM
32
37
0
21 Apr 2023
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner
Benedikt Alkin
Andreas Fürst
Elisabeth Rumetshofer
Lukas Miklautz
Sepp Hochreiter
11
18
0
20 Apr 2023
Long-term Forecasting with TiDE: Time-series Dense Encoder
Abhimanyu Das
Weihao Kong
Andrew B. Leach
Shaan Mathur
Rajat Sen
Rose Yu
AI4TS
31
232
0
17 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
18
2,983
0
14 Apr 2023
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume
Anurag J. Vaidya
Richard J. Chen
Drew F. K. Williamson
Paul Pu Liang
Faisal Mahmood
25
22
0
13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
16
39
0
07 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
25
1,160
0
03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
14
75
0
03 Apr 2023