ResearchTrend.AI



Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines

7 November 2021
Michael Kuchnik
Ana Klimovic
Jiří Šimša
Virginia Smith
George Amvrosiadis
arXiv: 2111.04131

Papers citing "Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines"

12 / 12 papers shown
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Dan Graur
Viktor Gsteiger
Ana Klimovic
MoE
99
0
0
27 Feb 2025
Efficient Tabular Data Preprocessing of ML Pipelines
Yu Zhu
Wenqi Jiang
Gustavo Alonso
LMTD
25
1
0
23 Sep 2024
PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Yunjae Lee
Hyeseong Kim
Minsoo Rhu
34
3
0
11 Jun 2024
Croissant: A Metadata Format for ML-Ready Datasets
Mubashara Akhtar
Omar Benjelloun
Costanza Conforti
Pieter Gijsbers
Joan Giner-Miguelez
...
Slava Tykhonov
Joaquin Vanschoren
Jos van der Velde
Steffen Vogler
Carole-Jean Wu
32
28
0
28 Mar 2024
InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models
Kabir Nagrecha
Lingyi Liu
P. Delgado
Prasanna Padmanabhan
OffRL
AI4CE
25
5
0
13 Aug 2023
tf.data service: A Case for Disaggregating ML Input Data Processing
Andrew Audibert
Yangrui Chen
D. Graur
Ana Klimovic
Jiří Šimša
C. A. Thekkath
42
16
0
26 Oct 2022
An Overview of the Data-Loader Landscape: Comparative Performance Analysis
Iason Ofeidis
Diego Kiedanski
Leandros Tassiulas
13
7
0
27 Sep 2022
DataPerf: Benchmarks for Data-Centric AI Development
Mark Mazumder
Colby R. Banbury
Xiaozhe Yao
Bojan Karlaš
W. G. Rojas
...
Carole-Jean Wu
Cody Coleman
Andrew Y. Ng
Peter Mattson
Vijay Janapa Reddi
VLM
33
101
0
20 Jul 2022
Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
Alexander Isenko
R. Mayer
Jeffrey Jedele
Hans-Arno Jacobsen
19
23
0
17 Feb 2022
tf.data: A Machine Learning Data Processing Framework
D. Murray
Jiří Šimša
Ana Klimovic
Ihor Indyk
PINN
AI4CE
LMTD
39
87
0
28 Jan 2021
Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden
Roman Böhringer
Tal Ben-Nun
Torsten Hoefler
31
55
0
21 Jan 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016