arXiv:2104.08826
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
18 April 2021
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
Papers citing
"GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation"
50 / 155 papers shown
Title
CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP
Chandra Kiran Reddy Evuru
Sreyan Ghosh
Sonal Kumar
S. Ramaneswaran
Utkarsh Tyagi
Dinesh Manocha
42
8
0
30 Mar 2024
EDDA: A Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection
Daijun Ding
Li Dong
Zhichao Huang
Guangning Xu
Xu Huang
Bo Liu
Liwen Jing
Bowen Zhang
39
3
0
23 Mar 2024
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Nicholas Lee
Thanakul Wattanawong
Sehoon Kim
K. Mangalam
Sheng Shen
Gopala Anumanchipalli
Michael W. Mahoney
Kurt Keutzer
A. Gholami
61
46
0
22 Mar 2024
Enhancing Effectiveness and Robustness in a Low-Resource Regime via Decision-Boundary-aware Data Augmentation
Kyohoon Jin
Junho Lee
Juhwan Choi
Sangmin Song
Youngbin Kim
40
0
0
22 Mar 2024
Automated data processing and feature engineering for deep learning and big data applications: a survey
A. Mumuni
F. Mumuni
TPM
43
48
0
18 Mar 2024
MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation
Jiahuan Li
Shanbo Cheng
Shujian Huang
Jiajun Chen
35
7
0
14 Mar 2024
Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges
Bosheng Ding
Chengwei Qin
Ruochen Zhao
Tianze Luo
Xinze Li
Guizhen Chen
Wenhan Xia
Junjie Hu
A. Luu
Shafiq R. Joty
31
18
0
05 Mar 2024
SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization
Prakamya Mishra
Zonghai Yao
Parth Vashisht
Feiyun Ouyang
Beining Wang
Vidhi Mody
Hong-ye Yu
SyDa
MedIm
44
4
0
21 Feb 2024
Large Language Models for Data Annotation: A Survey
Zhen Tan
Dawei Li
Song Wang
Alimohammad Beigi
Bohan Jiang
Amrita Bhattacharjee
Mansooreh Karami
Wenlin Yao
Lu Cheng
Huan Liu
SyDa
56
50
0
21 Feb 2024
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Ajay Patel
Colin Raffel
Chris Callison-Burch
SyDa
AI4CE
33
25
0
16 Feb 2024
AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes
Juhwan Choi
Kyohoon Jin
Junho Lee
Sangmin Song
Youngbin Kim
22
1
0
08 Feb 2024
GPTs Are Multilingual Annotators for Sequence Generation Tasks
Juhwan Choi
Eunju Lee
Kyohoon Jin
Youngbin Kim
25
10
0
08 Feb 2024
A Survey on Data Augmentation in Large Model Era
Yue Zhou
Chenlu Guo
Xu Wang
Yi-Ju Chang
Yuan Wu
LM&MA
VLM
49
23
0
27 Jan 2024
Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications
Yuhang Zhou
Paiheng Xu
Xiyao Wang
Xuan Lu
Ge Gao
Wei Ai
65
5
0
22 Jan 2024
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
Dennis Ulmer
Elman Mansimov
Kaixiang Lin
Justin Sun
Xibin Gao
Yi Zhang
LLMAG
32
27
0
10 Jan 2024
EHR Interaction Between Patients and AI: NoteAid EHR Interaction
Xiaocheng Zhang
Zonghai Yao
Hong-ye Yu
LM&MA
26
2
0
29 Dec 2023
InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models
Bingbing Wen
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Bill Howe
Lijuan Wang
MLLM
44
1
0
21 Dec 2023
Student as an Inherent Denoiser of Noisy Teacher
Jiachen Zhao
24
0
0
15 Dec 2023
Automated Annotation of Scientific Texts for ML-based Keyphrase Extraction and Validation
O. Amusat
Harshad B. Hegde
Christopher J. Mungall
Anna Giannakou
Neil Byers
Dan Gunter
Kjiersten Fagnan
Lavanya Ramakrishnan
18
2
0
08 Nov 2023
People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection
Indira Sen
Dennis Assenmacher
Mattia Samory
Isabelle Augenstein
Wil M.P. van der Aalst
Claudia Wagner
17
19
0
02 Nov 2023
Making Large Language Models Better Data Creators
Dong-Ho Lee
Jay Pujara
Mohit Sewak
Ryen W. White
S. Jauhar
ALM
SyDa
13
23
0
31 Oct 2023
Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization
Prakamya Mishra
Zonghai Yao
Shuwei Chen
Beining Wang
Rohan Mittal
Hong-ye Yu
KELM
ALM
HILM
31
7
0
30 Oct 2023
LLMaAA: Making Large Language Models as Active Annotators
Ruoyu Zhang
Yanzeng Li
Yongliang Ma
Ming Zhou
Lei Zou
35
68
0
30 Oct 2023
Using GPT-4 to Augment Unbalanced Data for Automatic Scoring
Luyang Fang
Gyeong-Geon Lee
Xiaoming Zhai
26
17
0
25 Oct 2023
Large Language Models can Share Images, Too!
Young-Jun Lee
Dokyong Lee
Joo Won Sung
Jonghwan Hyeon
Ho-Jin Choi
MLLM
24
2
0
23 Oct 2023
Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review
Banghao Chen
Zhaofeng Zhang
Nicolas Langrené
Shengxin Zhu
LLMAG
28
89
0
23 Oct 2023
Text generation for dataset augmentation in security classification tasks
Alexander P. Welsh
Matthew Edwards
30
1
0
22 Oct 2023
PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation
Gaurav Sahu
Olga Vechtomova
Dzmitry Bahdanau
I. Laradji
VLM
55
24
0
22 Oct 2023
Ask Language Model to Clean Your Noisy Translation Data
Quinten Bolding
Baohao Liao
Brandon James Denis
Jun Luo
Christof Monz
29
5
0
20 Oct 2023
Synthetic Data Generation with Large Language Models for Text Classification: Potential and Limitations
Zhuoyan Li
Hangxiao Zhu
Zhuoran Lu
Ming Yin
SyDa
69
67
0
11 Oct 2023
BYOC: Personalized Few-Shot Classification with Co-Authored Class Descriptions
Arth Bohra
Govert Verkes
Artem Harutyunyan
Pascal Weinberger
Giovanni Campagna
27
5
0
09 Oct 2023
GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence
Zhihua Wen
Zhiliang Tian
Wei Wu
Yuxin Yang
Yanqi Shi
Zhen Huang
Dongsheng Li
RALM
42
13
0
09 Oct 2023
Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models
Jean Kaddour
Qi Liu
SyDa
32
2
0
02 Oct 2023
Knowledge Engineering using Large Language Models
Bradley Paul Allen
Lise Stork
Paul T. Groth
15
24
0
01 Oct 2023
Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges
Vinay Samuel
Houda Aynaou
Arijit Ghosh Chowdhury
Karthik Venkat Ramanan
Aman Chadha
SyDa
33
7
0
21 Sep 2023
Corpus Synthesis for Zero-shot ASR domain Adaptation using Large Language Models
Hsuan Su
Ting-Yao Hu
H. Koppula
Raviteja Vemulapalli
Jen-Hao Rick Chang
Karren D. Yang
G. Mantena
Oncel Tuzel
SyDa
44
1
0
18 Sep 2023
Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation
Charles O'Neill
Y. Ting 丁
I. Ciucă
Jack Miller
Thang Bui
SyDa
37
1
0
15 Aug 2023
Deep Learning-Based Knowledge Injection for Metaphor Detection: A Comprehensive Review
Cheng-Fu Yang
Wenye Zhao
Zhiyue Liu
Qingbao Huang
35
0
0
08 Aug 2023
A Critical Review of Large Language Models: Sensitivity, Bias, and the Path Toward Specialized AI
Arash Hajikhani
Carolyn Cole
ELM
22
14
0
28 Jul 2023
Data Augmentation for Neural Machine Translation using Generative Language Model
Seokjin Oh
Su ah Lee
Woohwan Jung
SyDa
21
13
0
26 Jul 2023
Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation
Letian Peng
Yuwei Zhang
Jingbo Shang
LRM
24
7
0
14 Jul 2023
EaSyGuide : ESG Issue Identification Framework leveraging Abilities of Generative Large Language Models
Hanwool Albert Lee
Jonghyun Choi
Sohyeon Kwon
Sungbum Jung
17
3
0
11 Jun 2023
Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event Classification
Shouvon Sarker
Lijun Qian
Xishuang Dong
LM&MA
AI4MH
18
10
0
10 Jun 2023
Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training
Haode Zhang
Haowen Liang
Li-Ming Zhan
Xiao-Ming Wu
Albert Y. S. Lam
VLM
16
8
0
08 Jun 2023
Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions
John Joon Young Chung
Ece Kamar
Saleema Amershi
ALM
34
109
0
07 Jun 2023
Targeted Data Generation: Finding and Fixing Model Weaknesses
Zexue He
Marco Tulio Ribeiro
Fereshte Khani
29
13
0
28 May 2023
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration
Hwaran Lee
Seokhee Hong
Joonsuk Park
Takyoung Kim
M. Cha
...
Eun-Ju Lee
Yong Lim
Alice H. Oh
San-hee Park
Jung-Woo Ha
36
16
0
28 May 2023
EASE: An Easily-Customized Annotation System Powered by Efficiency Enhancement Mechanisms
Naihao Deng
Yikai Liu
Mingye Chen
Winston Wu
Siyang Liu
Yulong Chen
Yue Zhang
Rada Mihalcea
31
0
0
23 May 2023
Understanding the Effect of Data Augmentation on Knowledge Distillation
Ziqi Wang
Chi Han
Wenxuan Bao
Heng Ji
21
2
0
21 May 2023
Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He
Weixi Feng
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
William Yang Wang
Qing Guo
DiffM
49
7
0
18 May 2023