2104.08826
Cited By
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
18 April 2021
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
Papers citing
"GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation"
50 / 155 papers shown
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges
Ranjan Sapkota
Konstantinos I Roumeliotis
Manoj Karkee
AI4TS
24
0
0
15 May 2025
Span-level Emotion-Cause-Category Triplet Extraction with Instruction Tuning LLMs and Data Augmentation
Xiaomeng Li
Dong Yang
Xiaogang Zhu
Faliang Huang
Peng Zhang
Zhongying Zhao
32
0
0
13 Apr 2025
Speech-to-Trajectory: Learning Human-Like Verbal Guidance for Robot Motion
Eran Bamani
Eden Nissinman
Rotem Atari
Nevo Heimann Saadon
A. Sintov
121
0
0
07 Apr 2025
Synthetic Function Demonstrations Improve Generation in Low-Resource Programming Languages
Nick McKenna
X. Xu
Jack Williams
Nick Wilson
Benjamin Van Durme
Christian Poelitz
39
0
0
24 Mar 2025
Synthetic Data Generation Using Large Language Models: Advances in Text and Code
Mihai Nadas
Laura Diosan
Andreea Tomescu
SyDa
72
0
0
18 Mar 2025
Dual-Class Prompt Generation: Enhancing Indonesian Gender-Based Hate Speech Detection through Data Augmentation
Muhammad Amien Ibrahim
Faisal
Tora Sangputra Yopie Winarto
Zefanya Delvin Sulistiya
41
0
0
06 Mar 2025
BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training
Nikitas Theodoropoulos
Giorgos Filandrianos
Vassilis Lyberatos
Maria Lymperaiou
Giorgos Stamou
SyDa
52
1
0
24 Feb 2025
Synthetic vs. Gold: The Role of LLM-Generated Labels and Data in Cyberbullying Detection
Arefeh Kazemi
Sri Balaaji Natarajan Kalaivendan
Joachim Wagner
Hamza Qadeer
Brian Davis
60
1
0
21 Feb 2025
Diversity-Oriented Data Augmentation with Large Language Models
Zaitian Wang
Jinghan Zhang
Xinhao Zhang
Kunpeng Liu
Pengfei Wang
Yuanchun Zhou
80
1
0
17 Feb 2025
Measuring Diversity in Synthetic Datasets
Yuchang Zhu
Huizhe Zhang
Bingzhe Wu
Jintang Li
Zibin Zheng
Peilin Zhao
Liang Chen
Yatao Bian
100
0
0
12 Feb 2025
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu
Hejie Cui
Yue Yu
Xuan Kan
Wenqi Shi
Yuchen Zhuang
Wei Jin
Joyce C. Ho
Carl Yang
69
14
0
28 Jan 2025
TARDiS : Text Augmentation for Refining Diversity and Separability
Kyungmin Kim
Sanghun Im
Gibaeg Kim
Heung-Seon Oh
VLM
34
0
0
06 Jan 2025
"My life is miserable, have to sign 500 autographs everyday": Exposing Humblebragging, the Brags in Disguise
Sharath Naganna
Saprativa Bhattacharjee
Pushpak Bhattacharyya
Biplab Banerjee
31
0
0
31 Dec 2024
Evaluating LLM Prompts for Data Augmentation in Multi-label Classification of Ecological Texts
Anna Glazkova
Olga Zakharova
74
2
0
22 Nov 2024
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling
Daehoon Gwak
Junwoo Park
Minho Park
C. Park
Hyunchan Lee
E. Choi
Jaegul Choo
76
0
0
21 Nov 2024
Leveraging Large Language Models for Code-Mixed Data Augmentation in Sentiment Analysis
Linda Zeng
43
2
0
01 Nov 2024
Order Matters: Exploring Order Sensitivity in Multimodal Large Language Models
Zhijie Tan
Xu Chu
Weiping Li
Tong Mo
31
1
0
22 Oct 2024
A Persuasion-Based Prompt Learning Approach to Improve Smishing Detection through Data Augmentation
Ho Sung Shim
Hyoungjun Park
Kyuhan Lee
Jang-Sun Park
Seonhye Kang
AAML
28
0
0
18 Oct 2024
A Survey on Data Synthesis and Augmentation for Large Language Models
Ke Wang
Jiahui Zhu
Minjie Ren
Ziqiang Liu
Shiwei Li
...
Chenkai Zhang
Xiaoyu Wu
Qiqi Zhan
Qingjie Liu
Yunhong Wang
SyDa
40
18
0
16 Oct 2024
JurEE not Judges: safeguarding llm interactions with small, specialised Encoder Ensembles
Dom Nasrabadi
31
1
0
11 Oct 2024
Data Processing for the OpenGPT-X Model Family
Nicolo' Brandizzi
Hammam Abdelwahab
Anirban Bhowmick
Lennard Helmer
Benny Jörg Stein
...
Georg Rehm
Dennis Wegener
Nicolas Flores-Herr
Joachim Kohler
Johannes Leveling
VLM
79
2
0
11 Oct 2024
The Effects of Hallucinations in Synthetic Training Data for Relation Extraction
Steven Rogulsky
Nicholas Popovic
Michael Färber
HILM
32
1
0
10 Oct 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Congrui Huang
36
6
0
23 Sep 2024
A Large Language Model and Denoising Diffusion Framework for Targeted Design of Microstructures with Commands in Natural Language
Nikita Kartashov
Nikolaos N. Vlassis
DiffM
AI4CE
30
1
0
22 Sep 2024
EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding
Muye Huang
Han Lai
Xinyu Zhang
Wenjun Wu
Jie Ma
Lingling Zhang
Jun Liu
39
4
0
03 Sep 2024
TinyAgent: Function Calling at the Edge
Lutfi Eren Erdogan
Nicholas Lee
Siddharth Jha
Sehoon Kim
Ryan Tabrizi
Suhong Moon
Coleman Hooper
Gopala Anumanchipalli
Kurt Keutzer
Amir Gholami
LLMAG
41
12
0
01 Sep 2024
ARMADA: Attribute-Based Multimodal Data Augmentation
Xiaomeng Jin
Jeonghwan Kim
Yu Zhou
Kuan-Hao Huang
Te-Lin Wu
Nanyun Peng
Heng Ji
26
2
0
19 Aug 2024
PatUntrack: Automated Generating Patch Examples for Issue Reports without Tracked Insecure Code
Ziyou Jiang
Lin Shi
Guowei Yang
Qing Wang
25
0
0
16 Aug 2024
LLM-Based Robust Product Classification in Commerce and Compliance
Sina Gholamian
Gianfranco Romani
Bartosz Rudnikowicz
Laura Skylaki
26
1
0
11 Aug 2024
Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers
Jong Myoung Kim
Young-Jun Lee
Yong-jin Han
Sangkeun Jung
Ho-Jin Choi
42
2
0
12 Jul 2024
Into the Unknown: Generating Geospatial Descriptions for New Environments
Tzuf Paz-Argaman
John Palowitch
Sayali Kulkarni
Reut Tsarfaty
Jason Baldridge
34
1
0
28 Jun 2024
Fairness and Bias in Multimodal AI: A Survey
Tosin P. Adewumi
Lama Alkhaled
Namrata Gurung
G. V. Boven
Irene Pagliai
58
9
0
27 Jun 2024
On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
Lin Long
Rui Wang
Ruixuan Xiao
Junbo Zhao
Xiao Ding
Gang Chen
Haobo Wang
SyDa
59
93
0
14 Jun 2024
A Synthetic Dataset for Personal Attribute Inference
Hanna Yukhymenko
Robin Staab
Mark Vero
Martin Vechev
SyDa
48
6
0
11 Jun 2024
Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
Avijit Mitra
Emily Druhl
Raelene Goodwin
Hong Yu
37
2
0
10 Jun 2024
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios
Yuhang Zhou
Wei Ai
37
5
0
08 Jun 2024
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions
Sreyan Ghosh
Utkarsh Tyagi
Sonal Kumar
C. K. Evuru
S Ramaneswaran
S. Sakshi
Dinesh Manocha
48
5
0
06 Jun 2024
PGA-SciRE: Harnessing LLM on Data Augmentation for Enhancing Scientific Relation Extraction
Yang Zhou
Shimin Shan
Hongkui Wei
Zhehuan Zhao
Wenshuo Feng
41
1
0
30 May 2024
SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation
Yuwei Wan
Yixuan Liu
Aswathy Ajith
Clara Grazian
B. Hoex
Wenjie Zhang
Chunyu Kit
Tong Xie
Ian Foster
21
7
0
16 May 2024
A Comprehensive Survey on Data Augmentation
Zaitian Wang
Pengfei Wang
Kunpeng Liu
Pengyang Wang
Yanjie Fu
Chang-Tien Lu
Charu Aggarwal
Jian Pei
Yuanchun Zhou
ViT
109
22
0
15 May 2024
Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video
Tomoya Sugihara
Shuntaro Masuda
Ling Xiao
Toshihiko Yamasaki
46
3
0
14 May 2024
News Recommendation with Category Description by a Large Language Model
Yukiharu Yada
Hayato Yamana
36
4
0
13 May 2024
Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection
Bhawesh Kumar
Jonathan Amar
Eric Yang
Nan Li
Yugang Jia
18
3
0
09 May 2024
UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation
Juhwan Choi
Yeonghwa Kim
Seunguk Yu
Jungmin Yun
Youngbin Kim
41
1
0
02 May 2024
A Framework for Real-time Safeguarding the Text Generation of Large Language Model
Ximing Dong
Dayi Lin
Shaowei Wang
Ahmed E. Hassan
41
1
0
29 Apr 2024
Empowering Large Language Models for Textual Data Augmentation
Yichuan Li
Kaize Ding
Jianling Wang
Kyumin Lee
26
10
0
26 Apr 2024
Asking and Answering Questions to Extract Event-Argument Structures
Md Nayem Uddin
Enfa Rose George
Eduardo Blanco
Steven Corman
25
3
0
25 Apr 2024
GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification
Ali Pesaranghader
Nikhil Verma
Manasa Bharadwaj
47
3
0
03 Apr 2024
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations
Deqing Fu
Ghazal Khalighinejad
Ollie Liu
Bhuwan Dhingra
Dani Yogatama
Robin Jia
W. Neiswanger
33
14
0
01 Apr 2024
PairEval: Open-domain Dialogue Evaluation with Pairwise Comparison
chaeHun Park
Minseok Choi
Dohyun Lee
Jaegul Choo
35
5
0
01 Apr 2024