2402.00530
Cited By
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
1 February 2024
Ming Li
Yong Zhang
Shwai He
Zhitao Li
Hongyu Zhao
Jianzong Wang
Ning Cheng
Tianyi Zhou
Papers citing
"Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning"
15 / 15 papers shown
Title
ICon: In-Context Contribution for Automatic Data Selection
Yixin Yang
Qingxiu Dong
Linli Yao
Fangwei Zhu
Zhifang Sui
41
0
0
08 May 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
113
3
0
06 Feb 2025
ConTrans: Weak-to-Strong Alignment Engineering via Concept Transplantation
Weilong Dong
Xinwei Wu
Renren Jin
Shaoyang Xu
Deyi Xiong
49
6
0
31 Dec 2024
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
83
11
0
31 Dec 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Bill Yuchen Lin
Radha Poovendran
ALM
48
5
0
11 Nov 2024
Mastering the Craft of Data Synthesis for CodeLLMs
Meng Chen
Philip Arthur
Qianyu Feng
Cong Duy Vu Hoang
Yu-Heng Hong
...
Mark Johnson
K. K.
Don Dharmasiri
Long Duong
Yuan-Fang Li
SyDa
46
1
0
16 Oct 2024
Weak-to-Strong Generalization beyond Accuracy: a Pilot Study in Safety, Toxicity, and Legal Reasoning
Ruimeng Ye
Yang Xiao
Bo Hui
ALM
ELM
OffRL
27
2
0
16 Oct 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
84
5
0
13 Sep 2024
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models
Ziche Liu
Rui Ke
Feng Jiang
Haizhou Li
58
1
0
20 Jun 2024
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang
Shiqi Shen
Guangyao Shen
Zhi Gong
Yankai Lin
Ji-Rong Wen
47
13
0
17 Jun 2024
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models
Hyunbyung Park
Sukyung Lee
Gyoungjin Gim
Yungi Kim
Dahyun Kim
Chanjun Park
VLM
29
0
0
28 Mar 2024
Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
Ming Li
Jiuhai Chen
Lichang Chen
Tianyi Zhou
66
17
0
16 Feb 2024
Data Diversity Matters for Robust Instruction Tuning
Alexander Bukharin
Tuo Zhao
70
35
0
21 Nov 2023
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning
Ming Li
Lichang Chen
Jiuhai Chen
Shwai He
Heng-Chiao Huang
Jiuxiang Gu
Tianyi Zhou
84
20
0
18 Oct 2023
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
209
179
0
18 Apr 2021