2002.10957
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
25 February 2020
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
Papers citing
"MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers"
50 / 144 papers shown
SMUTF: Schema Matching Using Generative Tags and Hybrid Features
Yu Zhang
Mei Di
Haozheng Luo
Chenwei Xu
Richard Tzong-Han Tsai
54
7
0
22 Jan 2024
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
27
61
0
19 Jan 2024
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
27
6
0
29 Dec 2023
Vulnerability Analysis of Transformer-based Optical Character Recognition to Adversarial Attacks
Lucas Beerens
D. Higham
21
1
0
28 Nov 2023
How Well Do Large Language Models Truly Ground?
Hyunji Lee
Se June Joo
Chaeeun Kim
Joel Jang
Doyoung Kim
Kyoung-Woon On
Minjoon Seo
HILM
21
6
0
15 Nov 2023
SAGE: Smart home Agent with Grounded Execution
D. Rivkin
F. Hogan
Amal Feriani
Abhisek Konar
Adam Sigal
Steve Liu
Gregory Dudek
LM&Ro
LLMAG
ELM
LRM
18
3
0
01 Nov 2023
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Morris Alper
Hadar Averbuch-Elor
28
10
0
25 Oct 2023
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Ruida Wang
Wangchunshu Zhou
Mrinmaya Sachan
19
32
0
20 Oct 2023
IDTraffickers: An Authorship Attribution Dataset to link and connect Potential Human-Trafficking Operations on Text Escort Advertisements
V. Saxena
Benjamin Bashpole
Gijs Van Dijck
Gerasimos Spanakis
37
2
0
09 Oct 2023
Papeos: Augmenting Research Papers with Talk Videos
Tae Soo Kim
Matt Latzke
Jonathan Bragg
Amy X. Zhang
Joseph Chee Chang
14
10
0
29 Aug 2023
Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models
Seungcheol Park
Ho-Jin Choi
U. Kang
VLM
25
5
0
07 Aug 2023
Towards General Text Embeddings with Multi-stage Contrastive Learning
Zehan Li
Xin Zhang
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
47
336
0
07 Aug 2023
DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications
Adam Ivankay
Mattia Rigotti
P. Frossard
OOD
MedIm
16
1
0
05 Jul 2023
GIO: Gradient Information Optimization for Training Dataset Selection
Dante Everaert
Christopher Potts
19
3
0
20 Jun 2023
Scalable Performance Analysis for Vision-Language Models
Santiago Castro
Oana Ignat
Rada Mihalcea
VLM
19
1
0
30 May 2023
Referral Augmentation for Zero-Shot Information Retrieval
Michael Tang
Shunyu Yao
John Yang
Karthik Narasimhan
19
3
0
24 May 2023
Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph
Jennifer D'Souza
Moussab Hrou
Sören Auer
19
2
0
22 May 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang
Wei Zhou
Qi Zhang
Jie Zhou
Songyang Gao
Junzhe Wang
Menghan Zhang
Xiang Gao
Yunwen Chen
Tao Gui
34
7
0
22 May 2023
MemoryBank: Enhancing Large Language Models with Long-Term Memory
Wanjun Zhong
Lianghong Guo
Qi-Fei Gao
He Ye
Yanlin Wang
LLMAG
RALM
KELM
28
123
0
17 May 2023
Curating corpora with classifiers: A case study of clean energy sentiment online
M. V. Arnold
P. Dodds
C. Danforth
16
0
0
04 May 2023
RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer
Jiahao Wang
Songyang Zhang
Yong Liu
Taiqiang Wu
Yujiu Yang
Xihui Liu
Kai-xiang Chen
Ping Luo
Dahua Lin
11
20
0
12 Apr 2023
Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond
Ensheng Shi
Yanlin Wang
Hongyu Zhang
Lun Du
Shi Han
Dongmei Zhang
Hongbin Sun
28
41
0
11 Apr 2023
On Codex Prompt Engineering for OCL Generation: An Empirical Study
Seif Abukhalaf
Mohammad Hamdaqa
Foutse Khomh
26
20
0
28 Mar 2023
The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces
Kyle Lo
Joseph Chee Chang
Andrew Head
Jonathan Bragg
Amy X. Zhang
...
Caroline M Wu
Jiangjiang Yang
Angele Zamarron
Marti A. Hearst
Daniel S. Weld
19
19
0
25 Mar 2023
VideoXum: Cross-modal Visual and Textural Summarization of Videos
Jingyang Lin
Hang Hua
Ming Chen
Yikang Li
Jenhao Hsiao
C. Ho
Jiebo Luo
23
30
0
21 Mar 2023
Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi
Takuma Udagawa
Michele Merler
Rameswar Panda
Yousef El-Kurdi
Bishwaranjan Bhattacharjee
16
6
0
16 Mar 2023
Smooth and Stepwise Self-Distillation for Object Detection
Jieren Deng
Xiaoxia Zhou
Hao Tian
Zhihong Pan
Derek Aguiar
ObjD
13
0
0
09 Mar 2023
KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer
Kaikai Zhao
Norimichi Ukita
MU
23
1
0
22 Feb 2023
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Chen Liang
Haoming Jiang
Zheng Li
Xianfeng Tang
Bin Yin
Tuo Zhao
VLM
16
24
0
19 Feb 2023
Few-shot Multimodal Multitask Multilingual Learning
Aman Chadha
Vinija Jain
34
0
0
19 Feb 2023
Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective
Jongwoo Ko
Seungjoon Park
Minchan Jeong
S. Hong
Euijai Ahn
Duhyeuk Chang
Se-Young Yun
21
6
0
03 Feb 2023
idT5: Indonesian Version of Multilingual T5 Transformer
Mukhlish Fuadi
A. Wibawa
S. Sumpeno
6
6
0
02 Feb 2023
Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection
Chenglong Wang
Yi Lu
Yongyu Mu
Yimin Hu
Tong Xiao
Jingbo Zhu
24
8
0
01 Feb 2023
Music Playlist Title Generation Using Artist Information
Haven Kim
Seungheon Doh
Junwon Lee
Juhan Nam
24
3
0
14 Jan 2023
InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers
Leonid Boytsov
Preksha Patel
Vivek Sourabh
Riddhi Nisar
Sayan Kundu
R. Ramanathan
Eric Nyberg
19
19
0
08 Jan 2023
Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
Zhichao Yang
Sunjae Kwon
Zonghai Yao
Hongfeng Yu
13
17
0
24 Nov 2022
MGTCOM: Community Detection in Multimodal Graphs
E. Dmitriev
M. Chekol
S. Wang
25
0
0
10 Nov 2022
Gradient Knowledge Distillation for Pre-trained Language Models
Lean Wang
Lei Li
Xu Sun
VLM
16
5
0
02 Nov 2022
Multimodal Transformer Distillation for Audio-Visual Synchronization
Xuan-Bo Chen
Haibin Wu
Chung-Che Wang
Hung-yi Lee
J. Jang
19
3
0
27 Oct 2022
MTEB: Massive Text Embedding Benchmark
Niklas Muennighoff
Nouamane Tazi
L. Magne
Nils Reimers
21
369
0
13 Oct 2022
Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding
Haoming Jiang
Tianyu Cao
Zheng Li
Cheng-hsin Luo
Xianfeng Tang
Qingyu Yin
Danqing Zhang
R. Goutam
Bing Yin
RALM
16
11
0
08 Oct 2022
An Embedding-Based Grocery Search Model at Instacart
Yuqing Xie
Taesik Na
X. Xiao
Saurav Manchanda
Young Rao
Zhihong Xu
Guanghua Shu
Esther Vasiete
Tejaswi Tenneti
Haixun Wang
DML
RALM
16
6
0
12 Sep 2022
Towards explainable evaluation of language models on the semantic similarity of visual concepts
Maria Lymperaiou
George Manoliadis
Orfeas Menis-Mastromichalakis
Edmund Dervakos
Giorgos Stamou
AAML
16
5
0
08 Sep 2022
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
Waiman Si
Michael Backes
Jeremy Blackburn
Emiliano De Cristofaro
Gianluca Stringhini
Savvas Zannettou
Yang Zhang
19
58
0
07 Sep 2022
Evaluating Dense Passage Retrieval using Transformers
Nima Sadri
11
0
0
15 Aug 2022
Threddy: An Interactive System for Personalized Thread-based Exploration and Organization of Scientific Literature
Hyeonsu B Kang
Joseph Chee Chang
Yongsung Kim
A. Kittur
22
39
0
06 Aug 2022
Towards trustworthy Energy Disaggregation: A review of challenges, methods and perspectives for Non-Intrusive Load Monitoring
Maria Kaselimi
Eftychios E. Protopapadakis
A. Voulodimos
N. Doulamis
Anastasios Doulamis
20
64
0
05 Jul 2022
An End-to-End Set Transformer for User-Level Classification of Depression and Gambling Disorder
Ana-Maria Bucur
Adrian Cosma
Liviu P. Dinu
Paolo Rosso
23
8
0
02 Jul 2022
Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
Aidan Good
Jia-Huei Lin
Hannah Sieg
Mikey Ferguson
Xin Yu
Shandian Zhe
J. Wieczorek
Thiago Serra
14
11
0
07 Jun 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM
MQ
34
438
0
04 Jun 2022