Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

1 February 2024

Papers citing "Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning"

50 / 57 papers shown

Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning

362

24 Dec 2025

SmolKalam: Ensemble Quality-Filtered Translation at Scale for High Quality Arabic Post-Training Data

136

23 Nov 2025

Revisiting the Data Sampling in Multimodal Post-training from a Difficulty-Distinguish View

313

10 Nov 2025

Importance-Aware Data Selection for Efficient LLM Instruction Tuning

432

10 Nov 2025

Selecting Auxiliary Data via Neural Tangent Kernels for Low-Resource Domains

170

10 Nov 2025

BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning

235

30 Oct 2025

ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning

178

21 Oct 2025

Alibaba International E-commerce Product Search Competition DILAB Team Technical Report

132

21 Oct 2025

On the Role of Preference Variance in Preference Optimization

211

14 Oct 2025

Does Weak-to-strong Generalization Happen under Spurious Correlations?

Chenruo Liu

Yijun Dong

Qi Lei

201

28 Sep 2025

Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld's Episode Theory

199

18 Sep 2025

GRAM-R^2: Self-Training Generative Foundation Reward Models for Reward Reasoning

...

403

02 Sep 2025

Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning

394

29 Aug 2025

Disabling Self-Correction in Retrieval-Augmented Generation via Stealthy Retriever Poisoning

221

27 Aug 2025

Mirroring Users: Towards Building Preference-aligned User Simulator with User Feedback in Recommendation

390

25 Aug 2025

Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations

292

05 Aug 2025

Discrete Diffusion in Large Language and Multimodal Models: A Survey

628

16 Jun 2025

Infinity Instruct: Scaling Instruction Selection and Synthesis to Enhance Language Models

229

09 Jun 2025

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

461

02 Jun 2025

GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

460

24 May 2025

InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models

391

20 May 2025

ProDS: Preference-oriented Data Selection for Instruction Tuning

330

19 May 2025

HBO: Hierarchical Balancing Optimization for Fine-Tuning Large Language Models

616

18 May 2025

How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients

377

14 Apr 2025

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

690

112

10 Apr 2025

Towards Automatic Continual Learning: A Self-Adaptive Framework for Continual Instruction Tuning

374

20 Mar 2025

D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning

International Joint Conference on Artificial Intelligence (IJCAI), 2025

442

14 Mar 2025

Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm

454

04 Mar 2025

ATLaS: Agent Tuning via Learning Critical Steps

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

686

04 Mar 2025

CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom

337

03 Mar 2025

Large-Scale Data Selection for Instruction Tuning

433

03 Mar 2025

Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning

Hongyi Cai

Jie Li

Mohammad Mahdinur Rahman

Wenzhen Dong

463

26 Feb 2025

MergeIT: From Selection to Merging for Efficient Instruction Tuning

430

25 Feb 2025

From Perceptions to Decisions: Wildfire Evacuation Decision Prediction with Behavioral Theory-informed LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

537

24 Feb 2025

SAE-V: Interpreting Multimodal Models for Enhanced Alignment

453

22 Feb 2025

Unhackable Temporal Rewarding for Scalable Video MLLMs

...

323

17 Feb 2025

Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks

Transactions of the Association for Computational Linguistics (TACL), 2024

Jing Yang

Max Glockner

Anderson de Rezende Rocha

Iryna Gurevych

LRM

487

07 Feb 2025

The Best Instruction-Tuning Data are Those That Fit

652

06 Feb 2025

Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models

951

31 Dec 2024

ConTrans: Weak-to-Strong Alignment Engineering via Concept Transplantation

International Conference on Computational Linguistics (COLING), 2024

366

31 Dec 2024

Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models

AAAI Conference on Artificial Intelligence (AAAI), 2024

340

17 Dec 2024

Stronger Models are NOT Stronger Teachers for Instruction Tuning

477

11 Nov 2024

Weak-to-Strong Generalization beyond Accuracy: a Pilot Study in Safety, Toxicity, and Legal Reasoning

384

16 Oct 2024

Mastering the Craft of Data Synthesis for CodeLLMs

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

...

742

16 Oct 2024

Federated Data-Efficient Instruction Tuning for Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

430

14 Oct 2024

PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning

597

11 Oct 2024

MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization

International Conference on Learning Representations (ICLR), 2024

684

10 Oct 2024

Your Weak LLM is Secretly a Strong Teacher for Alignment

International Conference on Learning Representations (ICLR), 2024

Leitian Tao

Yixuan Li

637

13 Sep 2024

I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

Yiming Liang

Ge Zhang

Xingwei Qu

Tianyu Zheng

Jiawei Guo

...

Jiaheng Liu

Chenghua Lin

Lei Ma

Wenhao Huang

Jiajun Zhang

ALM

336

15 Aug 2024

RuleR: Improving LLM Controllability by Rule-based Data Recycling

680

22 Jun 2024