2010.01794
GenAug: Data Augmentation for Finetuning Text Generators
Workshop on Knowledge Extraction and Integration for Deep Learning Architectures; Deep Learning Inside Out (DEELIO), 2020
5 October 2020
Steven Y. Feng
Varun Gangal
Luan Tuyen Chau
Teruko Mitamura
Eduard H. Hovy
Papers citing
"GenAug: Data Augmentation for Finetuning Text Generators"
37 / 37 papers shown
Filtering with Confidence: When Data Augmentation Meets Conformal Prediction
Zixuan Wu
So Won Jeong
Yating Liu
Yeo Jin Jung
Claire Donnat
149
0
0
25 Sep 2025
Backtranslation and paraphrasing in the LLM era? Comparing data augmentation methods for emotion classification
International Conference on Conceptual Structures (ICCS), 2025
Łukasz Radliński
Mateusz Guściora
Jan Kocoń
120
2
0
19 Jul 2025
The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR
Injy Hamed
Ngoc Thang Vu
Nizar Habash
247
2
0
30 Mar 2025
Few-shot LLM Synthetic Data with Distribution Matching
The Web Conference (WWW), 2025
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
438
5
0
09 Feb 2025
Exploring Empty Spaces: Human-in-the-Loop Data Augmentation
International Conference on Human Factors in Computing Systems (CHI), 2024
Catherine Yeh
Donghao Ren
Yannick Assogba
Dominik Moritz
Fred Hohman
283
3
0
01 Oct 2024
A Survey of Data Synthesis Approaches
Hsin-Yu Chang
Pei-Yu Chen
Tun-Hsiang Chou
Chang-Sheng Kao
Hsuan-Yun Yu
Yen-Ting Lin
Yun-Nung Chen
265
10
0
04 Jul 2024
Targeted Augmentation for Low-Resource Event Extraction
Sijia Wang
Lifu Huang
302
2
0
14 May 2024
Evaluation Metrics for Text Data Augmentation in NLP
Marcellus Amadeus
William Alberto Cruz Castañeda
149
1
0
09 Feb 2024
Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications
IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023
Saed Rezayi
Zheng Liu
Zihao Wu
Chandra Dhakal
Bao Ge
...
Gengchen Mai
Ninghao Liu
Chen Zhen
Tianming Liu
Sheng Li
193
45
0
20 Jun 2023
Data Augmentation for Low-Resource Keyphrase Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Krishna Garg
Jishnu Ray Chowdhury
Cornelia Caragea
219
10
0
29 May 2023
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Anders Giovanni Møller
Jacob Aarup Dalsgaard
Arianna Pera
L. Aiello
265
58
0
26 Apr 2023
STA: Self-controlled Text Augmentation for Improving Text Classifications
Congcong Wang
Gonzalo Fiz Pontiveros
Steven Derby
Tri Kurniawan Wijaya
214
4
0
24 Feb 2023
GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation
Biyang Guo
Yeyun Gong
Yelong Shen
Songqiao Han
Hailiang Huang
Nan Duan
Weizhu Chen
VLM
195
23
0
18 Nov 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
226
38
0
25 Oct 2022
CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Steven Y. Feng
Vivek Khetan
Bogdan Sacaleanu
A. Gershman
Eduard H. Hovy
LRM
165
14
0
09 Oct 2022
PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification data for Learning Enhanced generation
International Conference on Computational Linguistics (COLING), 2022
Sedrick Scott Keh
Kevin Lu
Varun Gangal
Steven Y. Feng
Harsh Jhamtani
Malihe Alikhani
Eduard H. Hovy
244
2
0
16 Sep 2022
PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue Twisters Automatically
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Sedrick Scott Keh
Steven Y. Feng
Varun Gangal
Malihe Alikhani
Eduard H. Hovy
132
6
0
13 Sep 2022
Selective Text Augmentation with Word Roles for Low-Resource Text Classification
Biyang Guo
Songqiao Han
Hailiang Huang
180
10
0
04 Sep 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
256
19
0
25 May 2022
Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Wanyu Du
Hanjie Chen
Yangfeng Ji
199
1
0
19 May 2022
TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Le Zhang
Zichao Yang
Diyi Yang
264
26
0
12 May 2022
UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation
International Workshop on Semantic Evaluation (SemEval), 2022
Amirhossein Abaskohi
A. Rasouli
Tanin Zeraati
B. Bahrak
245
11
0
18 Apr 2022
DAGAM: Data Augmentation with Generation And Modification
Byeong-Cheol Jo
Tak-Sung Heo
Yeongjoon Park
Yongmin Yoo
Won-Ik Cho
Kyungsun Kim
VLM
133
2
0
06 Apr 2022
Impact of Environmental Noise on Alzheimer's Disease Detection from Speech: Should You Let a Baby Cry?
Jekaterina Novikova
169
0
0
31 Mar 2022
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhuang Li
Xingliang Yuan
Tongtong Wu
Tianyang Zhan
Gholamreza Haffari
CoGe
UD
DRL
297
4
0
27 Feb 2022
MDPFuzz: Testing Models Solving Markov Decision Processes
International Symposium on Software Testing and Analysis (ISSTA), 2021
Qi Pang
Yuanyuan Yuan
Shuai Wang
341
41
0
06 Dec 2021
To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP
Gözde Gül Şahin
201
39
0
18 Nov 2021
Data Augmentation Methods for Anaphoric Zero Pronouns
Abdulrahman Aloraini
Massimo Poesio
174
5
0
20 Sep 2021
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
AAAI Conference on Artificial Intelligence (AAAI), 2021
Steven Y. Feng
Kevin Lu
Zhuofu Tao
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
Varun Gangal
LRM
224
14
0
08 Sep 2021
ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Historical News Collections
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
Jiexin Wang
Adam Jatowt
Masatoshi Yoshikawa
412
39
0
08 Sep 2021
SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation
Steven Y. Feng
Jessica Huynh
Chaitanya Narisetty
Eduard H. Hovy
Varun Gangal
VLM
205
11
0
15 Aug 2021
A Survey on Data Augmentation for Text Classification
Markus Bayer
M. Kaufhold
Christian A. Reuter
459
426
0
07 Jul 2021
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
Transactions of the Association for Computational Linguistics (TACL), 2021
Jiaao Chen
Derek Tam
Colin Raffel
Joey Tianyi Zhou
Diyi Yang
245
210
0
14 Jun 2021
Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation
Findings of the Association for Computational Linguistics (Findings), 2021
Varun Gangal
Harsh Jhamtani
Eduard H. Hovy
Taylor Berg-Kirkpatrick
159
9
0
05 Jun 2021
A Survey of Data Augmentation Approaches for NLP
Findings of the Association for Computational Linguistics (Findings), 2021
Steven Y. Feng
Varun Gangal
Jason W. Wei
Sarath Chandar
Soroush Vosoughi
Teruko Mitamura
Eduard H. Hovy
AIMat
689
918
0
07 May 2021
NAREOR: The Narrative Reordering Problem
AAAI Conference on Artificial Intelligence (AAAI), 2021
Varun Gangal
Steven Y. Feng
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
373
27
0
14 Apr 2021
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
IEEE International Joint Conference on Neural Networks (IJCNN), 2021
P. Ranade
Aritran Piplai
Sudip Mittal
A. Joshi
Tim Finin
354
92
0
08 Feb 2021