2409.02098
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation
3 September 2024
Ingo Ziegler
Abdullatif Köksal
Desmond Elliott
Hinrich Schütze
Papers citing "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation"
6 / 6 papers shown
AQuilt: Weaving Logic and Self-Inspection into Low-Cost, High-Relevance Data Synthesis for Specialist LLMs
Xiaopeng Ke
Hexuan Deng
Xuebo Liu
Jun Rao
Zhenxi Song
Jun-chen Yu
Min Zhang
SyDa
211
1
0
24 Jul 2025
Learning from Reasoning Failures via Synthetic Data Generation
Gabriela Ben-Melech Stan
Estelle Aflalo
Avinash Madasu
Vasudev Lal
Phillip Howard
SyDa
LRM
321
0
0
20 Apr 2025
ELTEX: A Framework for Domain-Driven Synthetic Data Generation
Arina Razmyslovich
Kseniia Murasheva
Sofia Sedlova
Julien Capitaine
Eugene Dmitriev
SyDa
257
0
0
19 Mar 2025
ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models
Danae Sánchez Villegas
Ingo Ziegler
Desmond Elliott
LRM
306
4
0
26 Feb 2025
Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources
A. Lupidi
Carlos Gemmell
Nicola Cancedda
Jane Dwivedi-Yu
Jason Weston
Jakob Foerster
Roberta Raileanu
Maria Lomeli
SyDa
393
21
0
12 Sep 2024
LongForm: Effective Instruction Tuning with Reverse Instructions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Abdullatif Köksal
Timo Schick
Anna Korhonen
Hinrich Schütze
SyDa
ALM
231
47
0
17 Apr 2023