arXiv:1911.03118
Not Enough Data? Deep Learning to the Rescue!
AAAI Conference on Artificial Intelligence (AAAI), 2019
8 November 2019
Ateret Anaby-Tavor
Boaz Carmeli
Esther Goldbraich
Amir Kantor
George Kour
Segev Shlomov
N. Tepper
Naama Zwerdling
Papers citing "Not Enough Data? Deep Learning to the Rescue!"
50 / 169 papers shown
Filtering with Confidence: When Data Augmentation Meets Conformal Prediction
Zixuan Wu
So Won Jeong
Yating Liu
Yeo Jin Jung
Claire Donnat
104
0
0
25 Sep 2025
Bridging Generative and Discriminative Learning: Few-Shot Relation Extraction via Two-Stage Knowledge-Guided Pre-training
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Quanjiang Guo
Jinchuan Zhang
Sijie Wang
Ling Tian
Zhao Kang
Bin Yan
Weidong Xiao
197
5
0
18 May 2025
WHERE and WHICH: Iterative Debate for Biomedical Synthetic Data Augmentation
Zhengyi Zhao
Shubo Zhang
Bin Liang
Binyang Li
Kam-Fai Wong
SyDa
186
1
0
31 Mar 2025
HILGEN: Hierarchically-Informed Data Generation for Biomedical NER Using Knowledgebases and Large Language Models
Yao Ge
Yuting Guo
Sudeshna Das
Swati Rajwal
Selen Bozkurt
A. Sarker
MedIm
LM&MA
218
0
0
06 Mar 2025
Synthetic vs. Gold: The Role of LLM Generated Labels and Data in Cyberbullying Detection
Arefeh Kazemi
Sri Balaaji Natarajan Kalaivendan
Joachim Wagner
Hamza Qadeer
Kanishk Verma
Brian Davis
561
4
0
21 Feb 2025
Diversity-oriented Data Augmentation with Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zaitian Wang
Jinghan Zhang
Xinhao Zhang
Kunpeng Liu
Pengfei Wang
Yuanchun Zhou
395
5
0
17 Feb 2025
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Suhas S Kowshik
Abhishek Divekar
Vijit Malik
SyDa
306
0
0
13 Nov 2024
Relation-based Counterfactual Data Augmentation and Contrastive Learning for Robustifying Natural Language Inference Models
Interspeech (Interspeech), 2023
H. Yang
Seung-won Hwang
Jungmin So
185
0
0
28 Oct 2024
The Effects of Hallucinations in Synthetic Training Data for Relation Extraction
Steven Rogulsky
Nicholas Popovic
Michael Färber
HILM
163
7
0
10 Oct 2024
A Target-Aware Analysis of Data Augmentation for Hate Speech Detection
Camilla Casula
Sara Tonelli
163
1
0
10 Oct 2024
Generating Synthetic Datasets for Few-shot Prompt Tuning
Xu Guo
Zilin Du
Boyang Li
Chunyan Miao
166
2
0
08 Oct 2024
Reducing and Exploiting Data Augmentation Noise through Meta Reweighting Contrastive Learning for Text Classification
Guanyi Mou
Yichuan Li
Kyumin Lee
240
3
0
26 Sep 2024
An Effective, Robust and Fairness-aware Hate Speech Detection Framework
Guanyi Mou
Kyumin Lee
220
2
0
25 Sep 2024
An Effective Deployment of Diffusion LM for Data Augmentation in Low-Resource Sentiment Classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zhuowei Chen
Lianxi Wang
Yuben Wu
Xinfeng Liao
Yujia Tian
Junyang Zhong
DiffM
312
6
0
05 Sep 2024
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Ján Cegin
Jakub Simko
Peter Brusilovsky
183
7
0
29 Aug 2024
See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Yulong Chen
Yang Liu
Jianhao Yan
X. Bai
Ming Zhong
Yinghao Yang
Ziyi Yang
Chenguang Zhu
Yue Zhang
ALM
ELM
158
17
0
16 Aug 2024
Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference
Claudio Angione
Yue Zhao
Harry Yang
Ahmad Farhan
Fielding Johnston
James Buban
Patrick Colangelo
193
1
0
29 Jul 2024
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
Ruida Wang
Jipeng Zhang
Yizhen Jia
Boyao Wang
Shizhe Diao
Renjie Pi
Tong Zhang
LRM
234
45
0
03 Jul 2024
Prompting-based Synthetic Data Generation for Few-Shot Question Answering
International Conference on Language Resources and Evaluation (LREC), 2024
Maximilian Schmidt
Andrea Bartezzaghi
Ngoc Thang Vu
SyDa
149
10
0
15 May 2024
A Comprehensive Survey on Data Augmentation
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Zaitian Wang
Pengfei Wang
Kunpeng Liu
Pengyang Wang
Yanjie Fu
Chang-Tien Lu
Charu Aggarwal
Jian Pei
Yuanchun Zhou
ViT
501
66
0
15 May 2024
UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation
Juhwan Choi
Yeonghwa Kim
Seunguk Yu
Jungmin Yun
Youngbin Kim
132
7
0
02 May 2024
LLM-Augmented Retrieval: Enhancing Retrieval Models Through Language Models and Doc-Level Embedding
Mingrui Wu
Sheng Cao
KELM
RALM
149
6
0
08 Apr 2024
Edisum: Summarizing and Explaining Wikipedia Edits at Scale
Marija Sakota
Isaac Johnson
Guosheng Feng
Robert West
SyDa
KELM
165
3
0
04 Apr 2024
Controllable and Diverse Data Augmentation with Large Language Model for Low-Resource Open-Domain Dialogue Generation
Zhenhua Liu
Tong Zhu
Jianxiang Xiang
Wenliang Chen
260
3
0
30 Mar 2024
Adverb Is the Key: Simple Text Data Augmentation with Adverb Deletion
Juhwan Choi
Youngbin Kim
186
0
0
29 Mar 2024
Enhancing Effectiveness and Robustness in a Low-Resource Regime via Decision-Boundary-aware Data Augmentation
International Conference on Language Resources and Evaluation (LREC), 2024
Kyohoon Jin
Junho Lee
Juhwan Choi
Sangmin Song
Youngbin Kim
148
0
0
22 Mar 2024
Beyond Surface Similarity: Detecting Subtle Semantic Shifts in Financial Narratives
Jiaxin Liu
Yi Yang
Kar Yan Tam
AIFin
AI4TS
145
12
0
21 Mar 2024
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition
Junjie Ye
Nuo Xu
Yikun Wang
Jie Zhou
Tao Gui
Xuanjing Huang
109
22
0
22 Feb 2024
Advancing NLP Models with Strategic Text Augmentation: A Comprehensive Study of Augmentation Methods and Curriculum Strategies
Himmet Toprak Kesgin
M. Amasyalı
126
12
0
14 Feb 2024
Improving Black-box Robustness with In-Context Rewriting
Kyle O'Brien
Nathan Ng
Isha Puri
Jorge Mendez
Hamid Palangi
Yoon Kim
Elisa Kreiss
Tom Hartvigsen
293
7
0
13 Feb 2024
AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes
Juhwan Choi
Kyohoon Jin
Junho Lee
Sangmin Song
Youngbin Kim
170
3
0
08 Feb 2024
A Survey on Data Augmentation in Large Model Era
Yue Zhou
Chenlu Guo
Xu Wang
Yi-Ju Chang
Yuan Wu
LM&MA
VLM
433
46
0
27 Jan 2024
Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data
Leonardo Castro-Gonzalez
Yi-Ling Chung
Hannah Rose Kirk
John Francis
Angus R. Williams
Pica Johansson
Jonathan Bright
214
2
0
22 Jan 2024
InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models
Bingbing Wen
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Bill Howe
Lijuan Wang
MLLM
152
3
0
21 Dec 2023
Localized Symbolic Knowledge Distillation for Visual Commonsense Models
Neural Information Processing Systems (NeurIPS), 2023
Jinho Park
Jack Hessel
Khyathi Chandu
Paul Pu Liang
Ximing Lu
...
Youngjae Yu
Qiuyuan Huang
Jianfeng Gao
Ali Farhadi
Yejin Choi
VLM
200
13
0
08 Dec 2023
BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
D. Roussinov
Serge Sharoff
109
3
0
27 Nov 2023
Generative AI for Hate Speech Detection: Evaluation and Findings
Sagi Pendzel
Tomer Wullach
Amir Adler
Einat Minkov
121
15
0
16 Nov 2023
Exploring ChatGPT's Capabilities on Vulnerability Management
USENIX Security Symposium (USENIX Security), 2023
Peiyu Liu
Junming Liu
Lirong Fu
Kangjie Lu
Yifan Xia
Xuhong Zhang
Wenzhi Chen
Haiqin Weng
R. Beyah
Wenhai Wang
194
36
0
11 Nov 2023
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiduan Liu
Jiahao Liu
Qifan Wang
Jingang Wang
Xunliang Cai
Dongyan Zhao
Ran Wang
Rui Yan
150
5
0
24 Oct 2023
Text generation for dataset augmentation in security classification tasks
Alexander P. Welsh
Matthew Edwards
89
2
0
22 Oct 2023
PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Gaurav Sahu
Olga Vechtomova
Dzmitry Bahdanau
I. Laradji
VLM
320
35
0
22 Oct 2023
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Ruida Wang
Wangchunshu Zhou
Mrinmaya Sachan
188
35
0
20 Oct 2023
Does Synthetic Data Make Large Language Models More Efficient?
Sia Gholami
Marwan Omar
179
19
0
11 Oct 2023
"A Tale of Two Movements": Identifying and Comparing Perspectives in #BlackLivesMatter and #BlueLivesMatter Movements-related Tweets using Weakly Supervised Graph-based Structured Prediction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shamik Roy
Dan Goldwasser
193
5
0
11 Oct 2023
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zeyuan Wang
Qiang Zhang
Keyan Ding
Ming Qin
Zhuang Xiang
Xiaotong Li
Huajun Chen
198
36
0
05 Oct 2023
Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Vinay Samuel
Houda Aynaou
Arijit Ghosh Chowdhury
Karthik Venkat Ramanan
Vasu Sharma
SyDa
209
13
0
21 Sep 2023
Distributional Data Augmentation Methods for Low Resource Language
Mosleh Mahamud
Zed Lee
Isak Samsten
175
6
0
09 Sep 2023
Community-Based Hierarchical Positive-Unlabeled (PU) Model Fusion for Chronic Disease Prediction
International Conference on Information and Knowledge Management (CIKM), 2023
Yang Wu
Xurui Li
Xuhong Zhang
Yangyang Kang
Changlong Sun
Xiaozhong Liu
141
7
0
06 Sep 2023
I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2023
Yongzhu Chang
Rongsheng Zhang
Jiashu Pu
105
1
0
08 Aug 2023
From Fake to Hyperpartisan News Detection Using Domain Adaptation
Recent Advances in Natural Language Processing (RANLP), 2023
Razvan-Alexandru Smadu
Sebastian-Vasile Echim
Dumitru-Clementin Cercel
Iuliana Marin
Florin-Catalin Pop
138
4
0
04 Aug 2023