ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.14135
  4. Cited By
FlashAttention: Fast and Memory-Efficient Exact Attention with
  IO-Awareness

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

27 May 2022
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
    VLM
ArXivPDFHTML

Papers citing "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"

50 / 1,418 papers shown
Title
Your ViT is Secretly an Image Segmentation Model
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies
Niccolò Cavagnero
Alexander Hermans
Narges Norouzi
Giuseppe Averta
Bastian Leibe
Gijs Dubbelman
Daan de Geus
ViT
VLM
56
1
0
24 Mar 2025
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Minsu Kim
Seongmin Hong
RyeoWook Ko
S. Choi
Hunjong Lee
Junsoo Kim
J. Kim
Jongse Park
52
0
0
24 Mar 2025
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
...
Jinghua Yan
Y. Bai
P. Sadayappan
Xia Hu
Bo Yuan
VLM
51
0
0
24 Mar 2025
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
Zhanda Zhu
Christina Giannoula
Muralidhar Andoorveedu
Qidong Su
Karttikeya Mangalam
Bojian Zheng
Gennady Pekhimenko
VLM
MoE
42
0
0
24 Mar 2025
WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training
WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training
Z. Wang
Anna Cai
Xinfeng Xie
Zaifeng Pan
Yue Guan
...
Shikai Li
Jianyu Huang
Chris Cai
Yuchen Hao
Yufei Ding
36
2
0
23 Mar 2025
OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
Vignesh Prabhakar
Md Amirul Islam
Adam Atanas
Y. Wang
J. N. Han
...
Rucha Apte
Robert Clark
Kang Xu
Zihan Wang
Kai Liu
LRM
77
1
0
22 Mar 2025
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
Yu-Hsi Chen
34
0
0
21 Mar 2025
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism
Venmugil Elango
45
0
0
20 Mar 2025
PSA-MIL: A Probabilistic Spatial Attention-Based Multiple Instance Learning for Whole Slide Image Classification
PSA-MIL: A Probabilistic Spatial Attention-Based Multiple Instance Learning for Whole Slide Image Classification
Sharon Peled
Y. Maruvka
Moti Freiman
38
0
0
20 Mar 2025
FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
Ibrahim Al Azher
Miftahul Jannat Mokarrama
Zhishuai Guo
Sagnik Ray Choudhury
Hamed Alhoori
LLMAG
43
0
0
20 Mar 2025
iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation
iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation
Hanxiao Wang
Biao Zhang
Weize Quan
Dong-ming Yan
Peter Wonka
46
0
0
20 Mar 2025
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens
Shuqi Lu
Haowei Lin
Lin Yao
Zhifeng Gao
Xiaohong Ji
W. Elwasif
Linfeng Zhang
Guolin Ke
41
0
0
20 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM
MDE
57
0
0
19 Mar 2025
EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?
EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?
Xinyan Chen
Jiaxin Ge
Hongming Dai
Qiang Zhou
Qiuxuan Feng
Jingtong Hu
Y. Wang
Jiaming Liu
Shanghang Zhang
LM&Ro
65
0
0
19 Mar 2025
Prada: Black-Box LLM Adaptation with Private Data on Resource-Constrained Devices
Prada: Black-Box LLM Adaptation with Private Data on Resource-Constrained Devices
Z. Wang
Yexiao He
Zheyu Shen
Yu Li
Guoheng Sun
Myungjin Lee
Ang Li
48
0
0
19 Mar 2025
Benchmarking Large Language Models for Handwritten Text Recognition
Benchmarking Large Language Models for Handwritten Text Recognition
Giorgia Crosilla
Lukas Klic
Giovanni Colavizza
38
0
0
19 Mar 2025
Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency
Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency
Jiangxuan Long
Zhao-quan Song
Chiwun Yang
AI4TS
65
0
0
18 Mar 2025
Fake Runs, Real Fixes -- Analyzing xPU Performance Through Simulation
Fake Runs, Real Fixes -- Analyzing xPU Performance Through Simulation
Ioannis Zarkadas
Amanda Tomlinson
Asaf Cidon
Baris Kasikci
Ofir Weisse
43
0
0
18 Mar 2025
State Space Model Meets Transformer: A New Paradigm for 3D Object Detection
State Space Model Meets Transformer: A New Paradigm for 3D Object Detection
Chuxin Wang
Wenfei Yang
Xiang Liu
Tianzhu Zhang
51
0
0
18 Mar 2025
SplatVoxel: History-Aware Novel View Streaming without Temporal Training
SplatVoxel: History-Aware Novel View Streaming without Temporal Training
Yiming Wang
Lucy Chai
Xuan Luo
Michael Niemeyer
Manuel Lagunas
Stephen Lombardi
Siyu Tang
Tiancheng Sun
3DGS
54
0
0
18 Mar 2025
Identifying and Mitigating Position Bias of Multi-image Vision-Language Models
Identifying and Mitigating Position Bias of Multi-image Vision-Language Models
Xinyu Tian
Shu Zou
Zhaoyuan Yang
Jing Zhang
58
0
0
18 Mar 2025
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
M. Beck
Korbinian Poppel
Phillip Lippe
Sepp Hochreiter
59
1
0
18 Mar 2025
Bolt3D: Generating 3D Scenes in Seconds
Bolt3D: Generating 3D Scenes in Seconds
Stanislaw Szymanowicz
Jason Y. Zhang
P. Srinivasan
Ruiqi Gao
Arthur Brussee
Aleksander Holynski
Ricardo Martín Brualla
Jonathan T. Barron
Philipp Henzler
90
4
0
18 Mar 2025
Growing a Twig to Accelerate Large Vision-Language Models
Growing a Twig to Accelerate Large Vision-Language Models
Zhenwei Shao
Mingyang Wang
Zhou Yu
Wenwen Pan
Yan Yang
Tao Wei
H. Zhang
Ning Mao
Wei Chen
Jun Yu
VLM
59
1
0
18 Mar 2025
Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference
Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference
Hao Yin
Guangzong Si
Zilei Wang
46
0
0
17 Mar 2025
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling
Yingyue Li
Bencheng Liao
Wenyu Liu
Xinggang Wang
Mamba
58
0
0
17 Mar 2025
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Teng Wang
Zhangyi Jiang
Zhenqi He
Wenhan Yang
Yanan Zheng
Zeyu Li
Zifan He
Shenyang Tong
Hailei Gong
LRM
90
1
0
16 Mar 2025
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Nir Ailon
Akhiad Bercovich
Omri Weinstein
52
0
0
15 Mar 2025
Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training
Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training
Zhenxin Li
Shihao Wang
Shiyi Lan
Zhiding Yu
Zuxuan Wu
Jose M. Alvarez
46
1
0
15 Mar 2025
TransiT: Transient Transformer for Non-line-of-sight Videography
Ruiqian Li
Siyuan Shen
Suan Xia
Z. Wang
Xingyue Peng
Chengxuan Song
Yingsheng Zhu
Tao Wu
Shiying Li
Jingyi Yu
50
0
0
14 Mar 2025
TigerLLM -- A Family of Bangla Large Language Models
Nishat Raihan
Marcos Zampieri
38
0
0
14 Mar 2025
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen
Guoqiang Gong
Tao He
Yifeng Zhang
Pengzhang Liu
Sicheng Zhao
Guiguang Ding
VLM
63
0
0
14 Mar 2025
Similarity-Aware Token Pruning: Your VLM but Faster
Ahmadreza Jeddi
Negin Baghbanzadeh
Elham Dolatabadi
Babak Taati
3DV
VLM
50
1
0
14 Mar 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
43
0
0
13 Mar 2025
Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling
Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling
Shuqi Lu
Xiaohong Ji
Bohang Zhang
Lin Yao
Siyuan Liu
Zhifeng Gao
Linfeng Zhang
Guolin Ke
AI4CE
38
1
0
13 Mar 2025
Take Off the Training Wheels Progressive In-Context Learning for Effective Alignment
Zhenyu Liu
Dongfang Li
Xinshuo Hu
X. Zhao
Yibin Chen
Baotian Hu
Min-Ling Zhang
40
1
0
13 Mar 2025
Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li
Jinyue Yang
Guoqi Li
Huan Wang
53
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
76
7
0
13 Mar 2025
Speedy MASt3R
Jingxing Li
Yongjae Lee
Abhay Kumar Yadav
Cheng-Fang Peng
Rama Chellappa
Deliang Fan
3DGS
55
0
0
13 Mar 2025
ZSMerge: Zero-Shot KV Cache Compression for Memory-Efficient Long-Context LLMs
ZSMerge: Zero-Shot KV Cache Compression for Memory-Efficient Long-Context LLMs
Xin Liu
Pei Liu
Guoming Tang
MoMe
44
0
0
13 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
44
0
0
13 Mar 2025
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
Zexuan Yan
Yue Ma
Chang Zou
Wenteng Chen
Qifeng Chen
Linfeng Zhang
49
0
0
13 Mar 2025
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding
Haoyu Zhang
Qiaohui Chu
Meng Liu
Yunxiao Wang
Bin Wen
Fan Yang
Tingting Gao
Di Zhang
Yaowei Wang
Liqiang Nie
EgoV
68
0
0
12 Mar 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola
Aaron Gokaslan
Justin T Chiu
Zhihan Yang
Zhixuan Qi
Jiaqi Han
S. Sahoo
Volodymyr Kuleshov
DiffM
60
3
0
12 Mar 2025
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić
Christian Fruhwirth-Reisinger
Samuel Schulter
Horst Possegger
3DV
50
0
0
11 Mar 2025
AI-native Memory 2.0: Second Me
Jiale Wei
Xiang Ying
Tao Gao
Fangyi Bao
Felix Tao
Jingbo Shang
48
1
0
11 Mar 2025
RigoChat 2: an adapted language model to Spanish using a bounded dataset and reduced hardware
Gonzalo Santamaría Gómez
Guillem García Subies
Pablo Gutiérrez Ruiz
Mario González Valero
Natàlia Fuertes
...
Nuria Aldama García
David Betancur Sánchez
Kateryna Sushkova
Marta Guerrero Nieto
Á. Jiménez
51
0
0
11 Mar 2025
Representing 3D Shapes With 64 Latent Vectors for 3D Diffusion Models
I. Cho
Youngbeom Yoo
Subin Jeon
Seon Joo Kim
DiffM
58
0
0
11 Mar 2025
Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference
Pol G. Recasens
Ferran Agullo
Yue Zhu
Chen Wang
Eun Kyung Lee
Olivier Tardieu
Jordi Torres
Josep Ll. Berral
38
0
0
11 Mar 2025
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Jiacheng Liu
Chang Zou
Yuanhuiyi Lyu
Junjie Chen
Linfeng Zhang
DiffM
54
0
0
10 Mar 2025
Previous
123456...272829
Next