2210.04018
Cited By
v1
v2
v3
v4 (latest)
STaSy: Score-based Tabular data Synthesis
International Conference on Learning Representations (ICLR), 2022
8 October 2022
Jayoung Kim
C. Lee
Noseong Park
DiffM
Github (28★)
Papers citing
"STaSy: Score-based Tabular data Synthesis"
50 / 53 papers shown
Title
Boosting Predictive Performance on Tabular Data through Data Augmentation with Latent-Space Flow-Based Diffusion
Md. Tawfique Ihsan
Md. Rakibul Hasan Rafi
Ahmed Shoyeb Raihan
Imtiaz Ahmed
Abdullahil Azeem
DiffM
102
0
0
20 Nov 2025
Privacy-Preserving Tabular Synthetic Data Generation Using TabularARGN
Andrey Sidorenko
P. Tiwald
136
1
0
08 Aug 2025
Democratizing Tabular Data Access with an Open–Source Synthetic–Data SDK
Ivona Krchova
Mariana Vargas-Vieyra
Mario Scriminaci
Andrey Sidorenko
66
0
0
01 Aug 2025
Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces
Kevin Rojas
Yuchen Zhu
Sichen Zhu
Felix X.-F. Ye
Molei Tao
DiffM
186
10
0
09 Jun 2025
A Closer Look on Memorization in Tabular Diffusion Model: A Data-Centric Perspective
Zhengyu Fang
Zhimeng Jiang
Huiyuan Chen
Xiaoge Zhang
Kaiyu Tang
Xiao Li
Jing Li
TDI
194
1
0
28 May 2025
LLMSynthor: Macro-Aligned Micro-Records Synthesis with Large Language Models
Yihong Tang
Menglin Kong
Lijun Sun
Tong Nie
SyDa
212
1
0
20 May 2025
A Comprehensive Survey of Synthetic Tabular Data Generation
Ruxue Shi
Yili Wang
Mengnan Du
Xu Shen
Yi Chang
Xin Wang
625
10
0
23 Apr 2025
Diffusion Transformers for Tabular Data Time Series Generation
International Conference on Learning Representations (ICLR), 2025
Fabrizio Garuti
E. Sangineto
Simone Luetto
L. Forni
Rita Cucchiara
396
3
0
10 Apr 2025
Towards Synthesizing High-Dimensional Tabular Data with Limited Samples
Zuqing Li
Junhao Gan
Jianzhong Qi
DiffM
185
0
0
09 Mar 2025
A Generalized Theory of Mixup for Structure-Preserving Synthetic Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Chungpa Lee
Jongho Im
Joseph H.T. Kim
244
0
0
03 Mar 2025
Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints
International Conference on Learning Representations (ICLR), 2025
Mihaela C. Stoian
Eleonora Giunchiglia
260
8
0
25 Feb 2025
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions
Zhong Li
Qi Huang
Lincen Yang
Jiayang Shi
Zhao Yang
Niki van Stein
Thomas Bäck
M. Leeuwen
DiffM
259
5
0
24 Feb 2025
TabGen-ICL: Residual-Aware In-Context Example Selection for Tabular Data Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Liancheng Fang
Aiwei Liu
Hengrui Zhang
Henry Peng Zou
Weizhi Zhang
Philip S. Yu
LMTD
194
3
0
23 Feb 2025
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data
P. Tiwald
Ivona Krchova
Andrey Sidorenko
Mariana Vargas-Vieyra
Mario Scriminaci
Michael Platzer
377
10
0
21 Jan 2025
Understanding and Mitigating Memorization in Diffusion Models for Tabular Data
Zhengyu Fang
Zhimeng Jiang
Huiyuan Chen
Xiao Li
Jing Li
348
6
0
15 Dec 2024
Diffusion-nested Auto-Regressive Synthesis of Heterogeneous Tabular Data
Hengrui Zhang
Liancheng Fang
Qitian Wu
Philip S. Yu
DiffM
LMTD
118
5
0
28 Oct 2024
TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation
International Conference on Learning Representations (ICLR), 2024
Juntong Shi
Minkai Xu
Harper Hua
Hengrui Zhang
Stefano Ermon
J. Leskovec
DiffM
210
1
0
27 Oct 2024
Targeted synthetic data generation for tabular data via hardness characterization
Tommaso Ferracci
Leonie Goldmann
Anton Hinel
Francesco Sanna Passino
512
1
0
01 Oct 2024
TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models
Neural Information Processing Systems (NeurIPS), 2024
Andrei Margeloiu
Xiangjian Jiang
Nikola Simidjievski
M. Jamnik
235
8
0
24 Sep 2024
Deep generative models as an adversarial attack strategy for tabular machine learning
International Conference on Machine Learning and Computing (ICMLC), 2024
Salijona Dyrmishi
Mihaela C. Stoian
Eleonora Giunchiglia
Maxime Cordy
AAML
LMTD
122
2
0
19 Sep 2024
Tabular Data Augmentation for Machine Learning: Progress and Prospects of Embracing Generative AI
Lingxi Cui
Huan Li
Ke Chen
Alexander Lerch
Gang Chen
LMTD
275
23
0
31 Jul 2024
Self-Supervision Improves Diffusion Models for Tabular Data Imputation
Yixin Liu
Thalaiyasingam Ajanthan
Hisham Husain
Vu-Linh Nguyen
174
20
0
25 Jul 2024
Unmasking Trees for Tabular Data
Calvin McCarter
271
4
0
08 Jul 2024
Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios
Patricia A. Apellániz
Ana Jiménez
Borja Arroyo Galende
J. Parras
Santiago Zazo
339
1
0
03 Jul 2024
Diffusion Models for Tabular Data Imputation and Synthetic Data Generation
Mario Villaizán-Vallelado
Matteo Salvatori
Carlos Segura
Ioannis Arapakis
MedIm
DiffM
237
15
0
02 Jul 2024
TimeAutoDiff: A Unified Framework for Generation, Imputation, Forecasting, and Time-Varying Metadata Conditioning of Heterogeneous Time Series Tabular Data
Namjoon Suh
Yuning Yang
Din-Yin Hsieh
Qitong Luan
Shirong Xu
Shixiang Zhu
Guang Cheng
216
15
0
23 Jun 2024
Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models
Joshua Ward
Chi-Hua Wang
Guang Cheng
188
9
0
18 Jun 2024
Causality for Tabular Data Synthesis: A High-Order Structure Causal Benchmark Framework
Ruibo Tu
Zineb Senane
Lele Cao
Cheng Zhang
Hedvig Kjellström
G. Henter
CML
321
5
0
12 Jun 2024
Navigating Tabular Data Synthesis Research: Understanding User Needs and Tool Capabilities
Maria F. Davila
Sven Groen
Fabian Panse
Wolfram Wingerath
LMTD
169
6
0
31 May 2024
Masked Language Modeling Becomes Conditional Density Estimation for Tabular Data Synthesis
SeungHwan An
Gyeongdong Woo
Jaesung Lim
ChangHyun Kim
Sungchul Hong
Jong-June Jeon
274
2
0
31 May 2024
ClavaDDPM: Multi-relational Data Synthesis with Cluster-guided Diffusion Models
Wei Pang
Masoumeh Shafieinejad
Lucy Liu
Xi He
209
22
0
28 May 2024
Guided Discrete Diffusion for Electronic Health Record Generation
Jun Han
Zixiang Chen
Yongqian Li
Yiwen Kou
Eran Halperin
Robert E. Tillman
Quanquan Gu
MedIm
DiffM
231
8
0
18 Apr 2024
Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models
Zeyu Yang
Peikun Guo
Khadija Zanna
Akane Sano
Xiaoxue Yang
DiffM
364
11
0
12 Apr 2024
Structured Evaluation of Synthetic Tabular Data
Scott Cheng-Hsin Yang
Baxter S. Eaves
Michael Schmidt
Ken Swanson
Patrick Shafto
237
7
0
15 Mar 2024
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey
Xi Fang
Weijie Xu
Fiona Anting Tan
Jiani Zhang
Ziqing Hu
Yanjun Qi
Scott Nickleach
Diego Socolinsky
Srinivasan H. Sengamedu
Christos Faloutsos
LMTD
ALM
432
158
0
27 Feb 2024
Systematic Assessment of Tabular Data Synthesis
Yuntao Du
Ninghui Li
196
10
0
09 Feb 2024
How Realistic Is Your Synthetic Data? Constraining Deep Generative Models for Tabular Data
Mihaela C. Stoian
Salijona Dyrmishi
Maxime Cordy
Thomas Lukasiewicz
Eleonora Giunchiglia
156
24
0
07 Feb 2024
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Annual Review of Statistics and Its Application (ARSIA), 2024
Namjoon Suh
Guang Cheng
MedIm
265
17
0
14 Jan 2024
Improve Fidelity and Utility of Synthetic Credit Card Transaction Time Series from Data-centric Perspective
Din-Yin Hsieh
ChiHua Wang
Guang Cheng
AI4TS
146
5
0
01 Jan 2024
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
Nabeel Seedat
Nicolas Huynh
B. V. Breugel
M. Schaar
248
45
0
19 Dec 2023
Continuous Diffusion for Mixed-Type Tabular Data
International Conference on Learning Representations (ICLR), 2023
Markus Mueller
Kathrin Gruber
Dennis Fok
DiffM
448
8
0
16 Dec 2023
TabMT: Generating tabular data with masked transformers
Manbir Gulati
Paul F. Roysdon
LMTD
163
56
0
11 Dec 2023
Boosting Data Analytics With Synthetic Volume Expansion
Xiaotong Shen
Yifei Liu
Rex Shen
248
4
0
27 Oct 2023
AutoDiff: combining Auto-encoder and Diffusion model for tabular data synthesizing
Namjoon Suh
Xiaofeng Lin
Din-Yin Hsieh
Merhdad Honarkhah
Guang Cheng
221
25
0
24 Oct 2023
Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space
Hengrui Zhang
Jiani Zhang
Ninad Kulkarni
Zhengyuan Shen
Xiao Qin
Christos Faloutsos
Huzefa Rangwala
George Karypis
DiffM
327
172
0
14 Oct 2023
ResBit: Residual Bit Vector for Categorical Values
Masane Fuchi
Amar Zanashir
Si-Qing Chen
Tomohiro Takagi
209
1
0
29 Sep 2023
Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Alexia Jolicoeur-Martineau
Kilian Fatras
Tal Kachman
234
55
0
18 Sep 2023
CuTS: Customizable Tabular Synthetic Data Generation
International Conference on Machine Learning (ICML), 2023
Mark Vero
Mislav Balunović
Martin Vechev
221
8
0
07 Jul 2023
MissDiff: Training Diffusion Models on Tabular Data with Missing Values
Yidong Ouyang
Liyan Xie
Chongxuan Li
Guang Cheng
DiffM
247
35
0
02 Jul 2023
On the Usefulness of Synthetic Tabular Data Generation
Dionysis Manousakas
Sergul Aydore
164
15
0
27 Jun 2023