Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets

12 August 2021
Nitin Gupta
Hima Patel
S. Afzal
Naveen Panwar
Ruhi Sharma Mittal
Shanmukha C. Guttula
Abhinav C. P. Jain
Lokesh Nagalapatti
S. Mehta
Sandeep Hans
P. Lohia
Aniya Aggarwal
Diptikalyan Saha
arXiv:2108.05935

Papers citing "Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets"

15 / 15 papers shown
CADRE: Customizable Assurance of Data Readiness in Privacy-Preserving Federated Learning
eScience (eScience), 2025
Kaveen Hiniduma
Zilinghan Li
Aditya Sinha
Ravi Madduri
Suren Byna
335
1
0
28 May 2025
Assessing the Impact of the Quality of Textual Data on Feature Representation and Machine Learning Models
Tabinda Sarwar
Antonio Jose Jimeno Yepes
Lawrence Cavedon
350
1
0
12 Feb 2025
Data Quality Awareness: A Journey from Traditional Data Management to Data Science Systems
Sijie Dong
Soror Sahri
Themis Palpanas
318
2
0
05 Nov 2024
Matchmaker: Self-Improving Large Language Model Programs for Schema Matching
Nabeel Seedat
Mihaela van der Schaar
236
14
0
31 Oct 2024
AI Data Readiness Inspector (AIDRIN) for Quantitative Assessment of Data Readiness for AI
Kaveen Hiniduma
Suren Byna
J. L. Bez
Ravi Madduri
468
14
0
27 Jun 2024
You can't handle the (dirty) truth: Data-centric insights improve pseudo-labeling
Nabeel Seedat
Nicolas Huynh
F. Imrie
Mihaela van der Schaar
282
6
0
19 Jun 2024
DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang
Yingzhuo Yu
Jin Huang
Xingjian Zhang
Jiaqi Ma
423
4
0
11 Jun 2024
Data Readiness for AI: A 360-Degree Survey
Kaveen Hiniduma
Suren Byna
J. L. Bez
264
23
0
08 Apr 2024
Model-Based Data-Centric AI: Bridging the Divide Between Academic Ideals and Industrial Pragmatism
Chanjun Park
Minsoo Khang
Dahyun Kim
199
2
0
04 Mar 2024
TRIAGE: Characterizing and auditing training data for improved regression
Neural Information Processing Systems (NeurIPS), 2023
Nabeel Seedat
Jonathan Crabbé
Zhaozhi Qian
M. Schaar
273
7
0
29 Oct 2023
Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark
Neural Information Processing Systems (NeurIPS), 2023
Lasse Hansen
Nabeel Seedat
M. Schaar
Andrija Petrović
374
34
0
25 Oct 2023
MLOps Spanning Whole Machine Learning Life Cycle: A Survey
Fang Zhengxin
Yuan Yi
Zhang Jingyu
Liu Yue
Mu Yuechen
...
Xu Xiwei
Wang Jeff
Wang Chen
Zhang Shuai
Chen Shiping
199
10
0
13 Apr 2023
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data
Neural Information Processing Systems (NeurIPS), 2022
Nabeel Seedat
Jonathan Crabbé
Ioana Bica
M. Schaar
224
35
0
24 Oct 2022
Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems
Harald Foidl
Michael Felderer
Rudolf Ramler
247
44
0
19 Mar 2022
Hypothesis Testing for Class-Conditional Label Noise
Rafael Poyiadzi
Weisong Yang
Niall Twomey
Raúl Santos-Rodríguez
NoLa
265
0
0
03 Mar 2021