ResearchTrend.AI
GenAug: Data Augmentation for Finetuning Text Generators

5 October 2020
Steven Y. Feng
Varun Gangal
Dongyeop Kang
Teruko Mitamura
Eduard H. Hovy

Papers citing "GenAug: Data Augmentation for Finetuning Text Generators"

36 / 36 papers shown
The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR
Injy Hamed
Ngoc Thang Vu
Nizar Habash
40
0
0
30 Mar 2025
Few-shot LLM Synthetic Data with Distribution Matching
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
87
0
0
09 Feb 2025
Exploring Empty Spaces: Human-in-the-Loop Data Augmentation
Catherine Yeh
Donghao Ren
Yannick Assogba
Dominik Moritz
Fred Hohman
38
0
0
01 Oct 2024
A Survey of Data Synthesis Approaches
Hsin-Yu Chang
Pei-Yu Chen
Tun-Hsiang Chou
Chang-Sheng Kao
Hsuan-Yun Yu
Yen-Ting Lin
Yun-Nung Chen
40
6
0
04 Jul 2024
Targeted Augmentation for Low-Resource Event Extraction
Sijia Wang
Lifu Huang
32
1
0
14 May 2024
Evaluation Metrics for Text Data Augmentation in NLP
Marcellus Amadeus
William Alberto Cruz Castañeda
38
1
0
09 Feb 2024
Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications
Saed Rezayi
Zheng Liu
Zihao Wu
Chandra Dhakal
Bao Ge
...
Gengchen Mai
Ninghao Liu
Chen Zhen
Tianming Liu
Sheng Li
28
32
0
20 Jun 2023
Data Augmentation for Low-Resource Keyphrase Generation
Krishna Garg
Jishnu Ray Chowdhury
Cornelia Caragea
33
8
0
29 May 2023
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
Anders Giovanni Møller
Jacob Aarup Dalsgaard
Arianna Pera
L. Aiello
81
35
0
26 Apr 2023
STA: Self-controlled Text Augmentation for Improving Text Classifications
Congcong Wang
Gonzalo Fiz Pontiveros
Steven Derby
Tri Kurniawan Wijaya
46
3
0
24 Feb 2023
GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation
Biyang Guo
Yeyun Gong
Yelong Shen
Songqiao Han
Hailiang Huang
Nan Duan
Weizhu Chen
VLM
44
18
0
18 Nov 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
39
32
0
25 Oct 2022
CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models
Steven Y. Feng
Vivek Khetan
Bogdan Sacaleanu
A. Gershman
Eduard H. Hovy
LRM
35
10
0
09 Oct 2022
PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification data for Learning Enhanced generation
Sedrick Scott Keh
Kevin Lu
Varun Gangal
Steven Y. Feng
Harsh Jhamtani
Malihe Alikhani
Eduard H. Hovy
40
2
0
16 Sep 2022
PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue Twisters Automatically
Sedrick Scott Keh
Steven Y. Feng
Varun Gangal
Malihe Alikhani
Eduard H. Hovy
23
4
0
13 Sep 2022
Selective Text Augmentation with Word Roles for Low-Resource Text Classification
Biyang Guo
Songqiao Han
Hailiang Huang
11
9
0
04 Sep 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
21
18
0
25 May 2022
Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation
Wanyu Du
Hanjie Chen
Yangfeng Ji
21
1
0
19 May 2022
TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
Le Zhang
Zichao Yang
Diyi Yang
36
24
0
12 May 2022
UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation
Amirhossein Abaskohi
A. Rasouli
Tanin Zeraati
B. Bahrak
24
10
0
18 Apr 2022
DAGAM: Data Augmentation with Generation And Modification
Byeong-Cheol Jo
Tak-Sung Heo
Yeongjoon Park
Yongmin Yoo
Won-Ik Cho
Kyungsun Kim
VLM
15
2
0
06 Apr 2022
Impact of Environmental Noise on Alzheimer's Disease Detection from Speech: Should You Let a Baby Cry?
Jekaterina Novikova
24
0
0
31 Mar 2022
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
Zhuang Li
Lizhen Qu
Qiongkai Xu
Tongtong Wu
Tianyang Zhan
Gholamreza Haffari
CoGe
UD
DRL
44
4
0
27 Feb 2022
MDPFuzz: Testing Models Solving Markov Decision Processes
Qi Pang
Yuanyuan Yuan
Shuai Wang
20
27
0
06 Dec 2021
To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP
Gözde Gül Sahin
40
33
0
18 Nov 2021
Data Augmentation Methods for Anaphoric Zero Pronouns
Abdulrahman Aloraini
Massimo Poesio
29
5
0
20 Sep 2021
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
Steven Y. Feng
Kevin Lu
Zhuofu Tao
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
Varun Gangal
LRM
35
13
0
08 Sep 2021
ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Historical News Collections
Jiexin Wang
Adam Jatowt
Masatoshi Yoshikawa
35
33
0
08 Sep 2021
SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation
Steven Y. Feng
Jessica Huynh
Chaitanya Narisetty
Eduard H. Hovy
Varun Gangal
VLM
32
9
0
15 Aug 2021
A Survey on Data Augmentation for Text Classification
Markus Bayer
M. Kaufhold
Christian A. Reuter
36
334
0
07 Jul 2021
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
Jiaao Chen
Derek Tam
Colin Raffel
Joey Tianyi Zhou
Diyi Yang
28
172
0
14 Jun 2021
Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation
Varun Gangal
Harsh Jhamtani
Eduard H. Hovy
Taylor Berg-Kirkpatrick
14
8
0
05 Jun 2021
A Survey of Data Augmentation Approaches for NLP
Steven Y. Feng
Varun Gangal
Jason W. Wei
Sarath Chandar
Soroush Vosoughi
Teruko Mitamura
Eduard H. Hovy
AIMat
39
799
0
07 May 2021
NAREOR: The Narrative Reordering Problem
Varun Gangal
Steven Y. Feng
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
30
26
0
14 Apr 2021
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
P. Ranade
Aritran Piplai
Sudip Mittal
A. Joshi
Tim Finin
34
69
0
08 Feb 2021
Data Augmentation using Pre-trained Transformer Models
Varun Kumar
Ashutosh Choudhary
Eunah Cho
VLM
216
348
0
04 Mar 2020