arXiv:2112.06439
What can Data-Centric AI Learn from Data and ML Engineering?
13 December 2021
N. Polyzotis
Matei A. Zaharia
AI4CE
Papers citing
"What can Data-Centric AI Learn from Data and ML Engineering?"
24 / 24 papers shown
Minimizing Risk Through Minimizing Model-Data Interaction: A Protocol For Relying on Proxy Tasks When Designing Child Sexual Abuse Imagery Detection Models
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Thamiris Coelho
Leo S. F. Ribeiro
João Macedo
J. A. dos Santos
Sandra Avila
218
3
0
10 May 2025
Data Acquisition for Improving Model Fairness using Reinforcement Learning
Jahid Hasan
Romila Pradhan
213
0
0
04 Dec 2024
Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting
Jingjing Xu
Caesar Wu
Yuan-Fang Li
Grégoire Danoy
Pascal Bouvry
AI4TS
283
2
0
29 Jul 2024
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
Nabeel Seedat
Nicolas Huynh
B. V. Breugel
M. Schaar
346
49
0
19 Dec 2023
Better, Not Just More: Data-Centric Machine Learning for Earth Observation
IEEE Geoscience and Remote Sensing Magazine (GRSM), 2023
R. Roscher
M. Rußwurm
Caroline Gevaert
Michael C. Kampffmeyer
J. A. dos Santos
...
Ronny Hansch
Stine Hansen
Keiller Nogueira
Jonathan Prexl
D. Tuia
466
23
0
08 Dec 2023
Trust, Accountability, and Autonomy in Knowledge Graph-based AI for Self-determination
Luis-Daniel Ibáñez
J. Domingue
Sabrina Kirrane
Oshani Seneviratne
Aisling Third
Maria-Esther Vidal
182
3
0
30 Oct 2023
TRIAGE: Characterizing and auditing training data for improved regression
Neural Information Processing Systems (NeurIPS), 2023
Nabeel Seedat
Jonathan Crabbé
Zhaozhi Qian
M. Schaar
192
7
0
29 Oct 2023
Can You Rely on Your Model Evaluation? Improving Model Evaluation with Synthetic Test Data
Neural Information Processing Systems (NeurIPS), 2023
B. V. Breugel
Nabeel Seedat
F. Imrie
M. Schaar
SyDa
229
36
0
25 Oct 2023
Dataset Factory: A Toolchain For Generative Computer Vision Datasets
Daniel Kharitonov
Ryan Turner
195
1
0
20 Sep 2023
Towards Data-centric Graph Machine Learning: Review and Outlook
Xin Zheng
Yixin Liu
Zhifeng Bao
Meng Fang
Xia Hu
Alan Wee-Chung Liew
Shirui Pan
GNN
AI4CE
260
22
0
20 Sep 2023
Synthetic Alone: Exploring the Dark Side of Synthetic Data for Grammatical Error Correction
Chanjun Park
Seonmin Koo
Seolhwa Lee
Jaehyung Seo
Sugyeong Eo
Hyeonseok Moon
Heu-Jeoung Lim
171
0
0
26 Jun 2023
GPT Self-Supervision for a Better Data Annotator
Xiaohuan Pei
Yanxi Li
Chang Xu
151
11
0
07 Jun 2023
Transition Role of Entangled Data in Quantum Machine Learning
Nature Communications (Nat. Commun.), 2023
Xinbiao Wang
Yuxuan Du
Zhuozhuo Tu
Yong Luo
Xiao Yuan
Dacheng Tao
293
20
0
06 Jun 2023
Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Machine Learning (ML), 2023
Xiao-Yang Liu
Ziyi Xia
Hongyang Yang
Jiechao Gao
Daochen Zha
Ming Zhu
Chris Wang
Zhaoran Wang
Jian Guo
OffRL
228
38
0
25 Apr 2023
Data-centric Artificial Intelligence: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Daochen Zha
Zaid Pervaiz Bhat
Kwei-Herng Lai
Fan Yang
Zhimeng Jiang
Shaochen Zhong
Helen Zhou
492
341
0
17 Mar 2023
Learning to Select Pivotal Samples for Meta Re-weighting
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yinjun Wu
Adam Stein
Jacob R. Gardner
Mayur Naik
194
2
0
09 Feb 2023
Data-centric AI: Perspectives and Challenges
SIAM International Conference on Data Mining (SDM), 2023
Daochen Zha
Zaid Pervaiz Bhat
Kwei-Herng Lai
Fan Yang
Helen Zhou
240
89
0
12 Jan 2023
DMOps: Data Management Operation and Recipes
E. Choi
Chanjun Park
216
7
0
02 Jan 2023
The Principles of Data-Centric AI (DCAI)
Communications of the ACM (CACM), 2022
M. H. Jarrahi
Ali Memariani
Shion Guha
165
82
0
26 Nov 2022
DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2022
Nabeel Seedat
F. Imrie
M. Schaar
239
18
0
09 Nov 2022
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data
Neural Information Processing Systems (NeurIPS), 2022
Nabeel Seedat
Jonathan Crabbé
Ioana Bica
M. Schaar
175
33
0
24 Oct 2022
DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2022
Donald Bertucci
M. Hamid
Yashwanthi Anand
Anita Ruangrotsakun
Delyar Tabatabai
Melissa Perez
Minsuk Kahng
219
35
0
14 May 2022
Modern Views of Machine Learning for Precision Psychiatry
Patterns, 2022
Z. Chen
Prathamesh Kulkarni
I. Galatzer-Levy
Benedetta Bigio
C. Nasca
Yu Zhang
197
139
0
04 Apr 2022
Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems
Harald Foidl
Michael Felderer
Rudolf Ramler
206
44
0
19 Mar 2022