ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.12924
  4. Cited By
OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the
  Memory Usage of Neural Networks

OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks

24 October 2022
Benoit Steiner
Mostafa Elhoushi
Jacob Kahn
James Hegarty
ArXivPDFHTML

Papers citing "OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks"

8 / 8 papers shown
Title
Efficient Memory Management for Large Language Model Serving with
  PagedAttention
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
29
1,771
0
12 Sep 2023
Rockmate: an Efficient, Fast, Automatic and Generic Tool for
  Re-materialization in PyTorch
Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch
Xunyi Zhao
Théotime Le Hellard
Lionel Eyraud
Julia Gusak
Olivier Beaumont
19
6
0
03 Jul 2023
Breaking On-device Training Memory Wall: A Systematic Survey
Breaking On-device Training Memory Wall: A Systematic Survey
Shitian Li
Chunlin Tian
Kahou Tam
Ruirui Ma
Li Li
21
2
0
17 Jun 2023
FlexGen: High-Throughput Generative Inference of Large Language Models
  with a Single GPU
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
144
366
0
13 Mar 2023
Overcoming Oscillations in Quantization-Aware Training
Overcoming Oscillations in Quantization-Aware Training
Markus Nagel
Marios Fournarakis
Yelysei Bondarenko
Tijmen Blankevoort
MQ
106
98
0
21 Mar 2022
What is the State of Neural Network Pruning?
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
178
1,027
0
06 Mar 2020
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural
  Networks for Edge Devices
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices
Byung Hoon Ahn
Jinwon Lee
J. Lin
Hsin-Pai Cheng
Jilei Hou
H. Esmaeilzadeh
62
54
0
04 Mar 2020
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision
  Applications
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,549
0
17 Apr 2017
1