GenAug: Data Augmentation for Finetuning Text Generators

Workshop on Knowledge Extraction and Integration for Deep Learning Architectures; Deep Learning Inside Out (DEELIO), 2020

5 October 2020

Papers citing "GenAug: Data Augmentation for Finetuning Text Generators"

37 / 37 papers shown

Filtering with Confidence: When Data Augmentation Meets Conformal Prediction

149

25 Sep 2025

Backtranslation and paraphrasing in the LLM era? Comparing data augmentation methods for emotion classification

International Conference on Conceptual Structures (ICCS), 2025

Łukasz Radliński

Mateusz Guściora

Jan Kocoń

120

19 Jul 2025

The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR

Injy Hamed

Ngoc Thang Vu

Nizar Habash

247

30 Mar 2025

Few-shot LLM Synthetic Data with Distribution Matching

The Web Conference (WWW), 2025

438

09 Feb 2025

Exploring Empty Spaces: Human-in-the-Loop Data Augmentation

International Conference on Human Factors in Computing Systems (CHI), 2024

Dominik Moritz

283

01 Oct 2024

A Survey of Data Synthesis Approaches

265

04 Jul 2024

Targeted Augmentation for Low-Resource Event Extraction

Sijia Wang

Lifu Huang

302

14 May 2024

Evaluation Metrics for Text Data Augmentation in NLP

Marcellus Amadeus

William Alberto Cruz Castañeda

149

09 Feb 2024

Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications

IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023

...

Ninghao Liu

Tianming Liu

193

20 Jun 2023

Data Augmentation for Low-Resource Keyphrase Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Krishna Garg

Jishnu Ray Chowdhury

Cornelia Caragea

219

29 May 2023

The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Anders Giovanni Møller

Jacob Aarup Dalsgaard

Arianna Pera

L. Aiello

265

26 Apr 2023

STA: Self-controlled Text Augmentation for Improving Text Classifications

Congcong Wang

Gonzalo Fiz Pontiveros

Steven Derby

Tri Kurniawan Wijaya

214

24 Feb 2023

GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation

195

18 Nov 2022

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Maximillian Chen

Alexandros Papangelis

Yang Liu

226

25 Oct 2022

CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

165

09 Oct 2022

PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification data for Learning Enhanced generation

International Conference on Computational Linguistics (COLING), 2022

244

16 Sep 2022

PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue Twisters Automatically

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

132

13 Sep 2022

Selective Text Augmentation with Word Roles for Low-Resource Text Classification

Biyang Guo

Songqiao Han

Hailiang Huang

180

04 Sep 2022

Leveraging QA Datasets to Improve Generative Data Augmentation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

256

25 May 2022

Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Wanyu Du

Hanjie Chen

Yangfeng Ji

199

19 May 2022

TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

Le Zhang

Zichao Yang

Diyi Yang

264

12 May 2022

UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation

International Workshop on Semantic Evaluation (SemEval), 2022

245

18 Apr 2022

DAGAM: Data Augmentation with Generation And Modification

133

06 Apr 2022

Impact of Environmental Noise on Alzheimer's Disease Detection from Speech: Should You Let a Baby Cry?

Jekaterina Novikova

169

31 Mar 2022

Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

297

27 Feb 2022

MDPFuzz: Testing Models Solving Markov Decision Processes

International Symposium on Software Testing and Analysis (ISSTA), 2021

Qi Pang

Yuanyuan Yuan

Shuai Wang

341

06 Dec 2021

To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP

Gözde Gül Sahin

201

18 Nov 2021

Data Augmentation Methods for Anaphoric Zero Pronouns

Abdulrahman Aloraini

Massimo Poesio

174

20 Sep 2021

Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models

AAAI Conference on Artificial Intelligence (AAAI), 2021

224

08 Sep 2021

ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Historical News Collections

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021

Jiexin Wang

Adam Jatowt

Masatoshi Yoshikawa

412

08 Sep 2021

SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation

205

15 Aug 2021

A Survey on Data Augmentation for Text Classification

Markus Bayer

M. Kaufhold

Christian A. Reuter

459

426

07 Jul 2021

An Empirical Survey of Data Augmentation for Limited Data Learning in NLP

Transactions of the Association for Computational Linguistics (TACL), 2021

Jiaao Chen

Derek Tam

Colin Raffel

Joey Tianyi Zhou

Diyi Yang

245

210

14 Jun 2021

Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation

Findings (Findings), 2021

Varun Gangal

Harsh Jhamtani

Eduard H. Hovy

Taylor Berg-Kirkpatrick

159

05 Jun 2021

A Survey of Data Augmentation Approaches for NLP

Findings (Findings), 2021

689

918

07 May 2021

NAREOR: The Narrative Reordering Problem

AAAI Conference on Artificial Intelligence (AAAI), 2021

373

14 Apr 2021

Generating Fake Cyber Threat Intelligence Using Transformer-Based Models

IEEE International Joint Conference on Neural Networks (IJCNN), 2021

354

08 Feb 2021