Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2002.10957
Cited By
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
25 February 2020
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers"
50 / 144 papers shown
Title
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning
Zhehao Zhang
Weijie Xu
Fanyou Wu
Chandan K. Reddy
24
0
0
12 May 2025
Assessing and Mitigating Medical Knowledge Drift and Conflicts in Large Language Models
Weiyi Wu
Xinwen Xu
Chongyang Gao
Xingjian Diao
Siting Li
Lucas A. Salas
Jiang Gui
14
0
0
12 May 2025
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via
α
α
α
-
β
β
β
-Divergence
Guanghui Wang
Zhiyong Yang
Z. Wang
Shi Wang
Qianqian Xu
Q. Huang
37
0
0
07 May 2025
A Reasoning-Focused Legal Retrieval Benchmark
Lucia Zheng
Neel Guha
Javokhir Arifov
Sarah Zhang
Michal Skreta
Christopher D. Manning
Peter Henderson
Daniel E. Ho
AILaw
RALM
ELM
87
2
0
06 May 2025
Towards High-Fidelity Synthetic Multi-platform Social Media Datasets via Large Language Models
Henry Tari
Nojus Sereiva
Rishabh Kaushal
T. Bertaglia
Adriana Iamnitchi
26
0
0
02 May 2025
Effective Inference-Free Retrieval for Learned Sparse Representations
F. M. Nardini
Thong Nguyen
Cosimo Rulli
Rossano Venturini
Andrew Yates
RALM
40
0
0
30 Apr 2025
A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage
Rui Xin
Niloofar Mireshghallah
Shuyue Stella Li
Michael Duan
Hyunwoo Kim
Yejin Choi
Yulia Tsvetkov
Sewoong Oh
Pang Wei Koh
69
1
0
28 Apr 2025
Leveraging Decoder Architectures for Learned Sparse Retrieval
Jingfen Qiao
Thong Nguyen
Evangelos Kanoulas
Andrew Yates
49
0
0
25 Apr 2025
AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine
Carlo Siebenschuh
Kyle Hippe
Ozan Gokdemir
Alexander Brace
A. Khan
...
V. Vishwanath
R. Stevens
Arvind Ramanathan
Ian Foster
Robert Underwood
MoE
41
0
0
23 Apr 2025
Learning Critically: Selective Self Distillation in Federated Learning on Non-IID Data
Yuting He
Yiqiang Chen
Xiaodong Yang
H. Yu
Yi-Hua Huang
Yang Gu
FedML
40
20
0
20 Apr 2025
OnRL-RAG: Real-Time Personalized Mental Health Dialogue System
Ahsan Bilal
Beiyu Lin
OffRL
RALM
AI4MH
42
1
0
02 Apr 2025
ConSCompF: Consistency-focused Similarity Comparison Framework for Generative Large Language Models
Alexey Karev
Dong Xu
48
0
0
18 Mar 2025
FedMentalCare: Towards Privacy-Preserving Fine-Tuned LLMs to Analyze Mental Health Status Using Federated Learning Framework
S M Sarwar
AI4MH
39
0
0
27 Feb 2025
Moving Past Single Metrics: Exploring Short-Text Clustering Across Multiple Resolutions
Justin K. Miller
Tristram J. Alexander
29
1
0
24 Feb 2025
Enhancing Domain-Specific Retrieval-Augmented Generation: Synthetic Data Generation and Evaluation using Reasoning Models
Aryan Jadon
Avinash Patil
Shashank Kumar
SyDa
39
1
0
21 Feb 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
L. Melo
Younesse Kaddar
Phil Blunsom
S. Kamath S
Yarin Gal
LRM
44
0
0
16 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
98
4
0
12 Feb 2025
EvoFlow: Evolving Diverse Agentic Workflows On The Fly
Guibin Zhang
Kaijie Chen
Guancheng Wan
Heng Chang
Hong Cheng
K. Wang
Shuyue Hu
Lei Bai
73
2
0
11 Feb 2025
Benchmarking Prompt Sensitivity in Large Language Models
Amirhossein Razavi
Mina Soltangheis
Negar Arabzadeh
Sara Salamat
Morteza Zihayat
Ebrahim Bagheri
64
1
0
09 Feb 2025
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
Nicolas Boizard
Kevin El Haddad
C´eline Hudelot
Pierre Colombo
68
14
0
28 Jan 2025
Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI
Yuya Asano
Sabit Hassan
P. Sharma
Anthony Sicilia
Katherine Atwell
Diane Litman
Malihe Alikhani
36
0
0
10 Jan 2025
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Manan Suri
Puneet Mathur
Franck Dernoncourt
Kanika Goswami
Ryan Rossi
Dinesh Manocha
95
3
0
14 Dec 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo
Yerin Hwang
Yongil Kim
Taegwan Kang
Hyunkyung Bae
Kyomin Jung
40
0
0
25 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
65
5
0
22 Oct 2024
Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents
Sabit Hassan
Hye-Young Chung
Xiang Zhi Tan
Malihe Alikhani
47
0
0
18 Oct 2024
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks
Guibin Zhang
Yanwei Yue
Xiangguo Sun
Guancheng Wan
Miao Yu
Junfeng Fang
Kun Wang
Dawei Cheng
Dawei Cheng
AAML
AI4CE
46
7
0
15 Oct 2024
GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs
Yun Zhu
Haizhou Shi
Xiaotang Wang
Yongchao Liu
Yaoke Wang
Boci Peng
Chuntao Hong
Siliang Tang
VLM
43
6
0
14 Oct 2024
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
Hojae Lee
Junho Kim
SangKeun Lee
LRM
26
1
0
11 Oct 2024
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Boyu Gou
Ruohan Wang
Boyuan Zheng
Yanan Xie
Cheng Chang
Yiheng Shu
Huan Sun
Yu Su
LM&Ro
LLMAG
76
48
0
07 Oct 2024
Agent-Oriented Planning in Multi-Agent Systems
Ao Li
Yuexiang Xie
Songze Li
Fugee Tsung
Bolin Ding
Yaliang Li
AIFin
69
6
0
03 Oct 2024
Enhancing Screen Time Identification in Children with a Multi-View Vision Language Model and Screen Time Tracker
Xinlong Hou
Sen Shen
Xueshen Li
Xinran Gao
Ziyi Huang
Steven J. Holiday
Matthew R. Cribbet
Susan W. White
Edward Sazonov
Yu Gan
34
0
0
02 Oct 2024
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
Seanie Lee
Haebin Seong
Dong Bok Lee
Minki Kang
Xiaoyin Chen
Dominik Wagner
Yoshua Bengio
Juho Lee
Sung Ju Hwang
65
2
0
02 Oct 2024
Conversational Exploratory Search of Scholarly Publications Using Knowledge Graphs
Phillip Schneider
Florian Matthes
21
0
0
01 Oct 2024
Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling
Xinyue Fang
Zhen Huang
Zhiliang Tian
Minghui Fang
Ziyi Pan
Quntian Fang
Zhihua Wen
Hengyue Pan
Dongsheng Li
HILM
88
2
0
17 Sep 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick
Kevin Y. Li
Eric P. Xing
J. Zico Kolter
Albert Gu
Mamba
43
24
0
19 Aug 2024
Surpassing Cosine Similarity for Multidimensional Comparisons: Dimension Insensitive Euclidean Metric
F. Tessari
Neville Hogan
Neville Hogan
24
3
0
11 Jul 2024
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li
Yuxian Gu
Li Dong
Dequan Wang
Yu Cheng
Furu Wei
34
6
0
28 Jun 2024
ColPali: Efficient Document Retrieval with Vision Language Models
Manuel Faysse
Hugues Sibille
Tony Wu
Bilel Omrani
Gautier Viaud
C´eline Hudelot
Pierre Colombo
VLM
60
21
0
27 Jun 2024
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
Huanxuan Liao
Yao Xu
Shizhu He
Yuanzhe Zhang
Yanchao Hao
Shengping Liu
Kang Liu
Jun Zhao
39
1
0
18 Jun 2024
EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems
Mohammad Dehghan
Mohammad Ali Alomrani
Sunyam Bagga
David Alfonso-Hermelo
Khalil Bibi
...
Jimmy Lin
Boxing Chen
Prasanna Parthasarathi
Mahdi Biparva
Mehdi Rezagholizadeh
RALM
42
5
0
14 Jun 2024
BAKU: An Efficient Transformer for Multi-Task Policy Learning
Siddhant Haldar
Zhuoran Peng
Lerrel Pinto
OffRL
32
25
0
11 Jun 2024
World Models with Hints of Large Language Models for Goal Achieving
Zeyuan Liu
Ziyu Huan
Xiyao Wang
Jiafei Lyu
Jian Tao
Xiu Li
Furong Huang
Huazhe Xu
LM&Ro
LRM
AI4CE
29
1
0
11 Jun 2024
Hello Again! LLM-powered Personalized Agent for Long-term Dialogue
Hao Li
Chenghao Yang
An Zhang
Yang Deng
Xiang Wang
Tat-Seng Chua
LLMAG
74
24
0
09 Jun 2024
How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect
Siddhartha K. Vemuri
Raj Sanjay Shah
Sashank Varma
VLM
22
4
0
25 May 2024
Identifying Narrative Patterns and Outliers in Holocaust Testimonies Using Topic Modeling
Maxim Ifergan
Renana Keydar
Omri Abend
Amit Pinchevski
21
0
0
04 May 2024
Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
Xiao Chen
Sihang Zhou
K. Liang
Xinwang Liu
ReLM
LRM
27
2
0
14 Apr 2024
Measuring Bias in a Ranked List using Term-based Representations
Amin Abolghasemi
Leif Azzopardi
Arian Askari
Maarten de Rijke
Suzan Verberne
32
6
0
09 Mar 2024
Fast Vocabulary Transfer for Language Model Compression
Leonidas Gee
Andrea Zugarini
Leonardo Rigutini
Paolo Torroni
17
26
0
15 Feb 2024
PACE: A Pragmatic Agent for Enhancing Communication Efficiency Using Large Language Models
Jiaxuan Li
Minxi Yang
Dahua Gao
Wenlong Xu
Guangming Shi
28
0
0
30 Jan 2024
Find the Cliffhanger: Multi-Modal Trailerness in Soap Operas
Carlo Bretti
Pascal Mettes
Hendrik Vincent Koops
Daan Odijk
N. V. Noord
27
4
0
29 Jan 2024
1
2
3
Next