2210.07316
Cited By
MTEB: Massive Text Embedding Benchmark
13 October 2022
Niklas Muennighoff
Nouamane Tazi
L. Magne
Nils Reimers
Papers citing
"MTEB: Massive Text Embedding Benchmark"
44 / 44 papers shown
Title
Revealing economic facts: LLMs know more than they say
Marcus Buckmann
Quynh Anh Nguyen
Edward Hill
16
0
0
13 May 2025
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
Ruikun Hou
B. Bühler
Tim Fütterer
Efe Bozkir
Peter Gerjets
Ulrich Trautwein
Enkelejda Kasneci
19
0
0
12 May 2025
GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets
Mingqian He
Fei Zhao
Chonggang Lu
Z. Liu
Y. Wang
Haofu Qian
OffRL
AI4TS
VLM
64
0
0
28 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025
EvidenceBench: A Benchmark for Extracting Evidence from Biomedical Papers
J. Wang
Weili Cao
Kaicheng Wang
Xiaoyue Wang
Ashish Dalvi
...
David E. Neal
Maxim Khan
Christopher D. Rosin
R. Paturi
Leon Bergen
21
0
0
25 Apr 2025
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Yifei Duan
Raphael Shang
Deng Liang
Yongqiang Cai
80
0
0
28 Feb 2025
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs
Christoph Schuhmann
Gollam Rabby
Ameya Prabhu
Tawsif Ahmed
Andreas Hochlehnert
...
Ludwig Schmidt
R. Kaczmarczyk
Sören Auer
J. Jitsev
Matthias Bethge
82
0
0
26 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
60
0
0
17 Feb 2025
Comply: Learning Sentences with Complex Weights inspired by Fruit Fly Olfaction
Alexei Figueroa
Justus Westerhoff
Golzar Atefi
Dennis Fast
B. Winter
Felix Alexander Gers
Alexander Löser
Wolfgang Nejdl
50
0
0
03 Feb 2025
QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance
Binita Saha
Utsha Saha
Muhammad Zubair Malik
RALM
3DV
56
2
0
06 Jan 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
51
18
0
03 Jan 2025
ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain
Ali Shiraee Kasmaee
Mohammad Khodadad
Mohammad Arshi Saloot
Nick Sherck
Stephen Dokas
H. Mahyar
Soheila Samiee
ELM
89
0
0
30 Nov 2024
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers
Chancharik Mitra
Brandon Huang
Tianning Chai
Zhiqiu Lin
Assaf Arbelle
Rogerio Feris
Leonid Karlinsky
Trevor Darrell
Deva Ramanan
Roei Herzig
VLM
119
4
0
28 Nov 2024
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Sheng-Chieh Lin
Chankyu Lee
M. Shoeybi
Jimmy J. Lin
Bryan Catanzaro
Wei Ping
57
10
0
04 Nov 2024
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Haotian Tang
Yecheng Wu
Shang Yang
Enze Xie
Junsong Chen
Junyu Chen
Zhuoyang Zhang
Han Cai
Y. Lu
Song Han
61
32
0
14 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
90
3
2
10 Oct 2024
LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs
Vincent Emonet
Jerven Bolleman
Severine Duvaud
T. M. Farias
A. Sima
RALM
21
3
0
08 Oct 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
34
3
0
30 Sep 2024
Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization
Gentiana Rashiti
G. Karunaratne
Mrinmaya Sachan
Abu Sebastian
Abbas Rahimi
RALM
32
0
0
12 Sep 2024
Extracting Paragraphs from LLM Token Activations
Nicholas Pochinkov
Angelo Benoit
Lovkush Agarwal
Zainab Ali Majid
Lucile Ter-Minassian
30
1
0
10 Sep 2024
RAGent: Retrieval-based Access Control Policy Generation
Sakuna Jayasundara
N. Arachchilage
Giovanni Russello
44
1
0
08 Sep 2024
Cost-Effective Hallucination Detection for LLMs
Simon Valentin
Jinmiao Fu
Gianluca Detommaso
Shaoyuan Xu
Giovanni Zappella
Bryan Wang
HILM
33
4
0
31 Jul 2024
Towards the Terminator Economy: Assessing Job Exposure to AI through LLMs
Emilio Colombo
Fabio Mercorio
Mario Mezzanzanica
Antonio Serino
36
1
0
27 Jul 2024
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models
Xiangyang Li
Kuicai Dong
Yi Quan Lee
Wei Xia
Yichun Yin
Xinyi Dai
Yasheng Wang
Ruiming Tang
47
13
0
03 Jul 2024
MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications
Irene Siragusa
Salvatore Contino
Massimo La Ciura
Rosario Alicata
Roberto Pirrone
52
3
0
03 Jul 2024
When Search Engine Services meet Large Language Models: Visions and Challenges
Haoyi Xiong
Jiang Bian
Yuchen Li
Xuhong Li
Mengnan Du
Shuaiqiang Wang
Dawei Yin
Sumi Helal
43
28
0
28 Jun 2024
CodeRAG-Bench: Can Retrieval Augment Code Generation?
Zora Zhiruo Wang
Akari Asai
Xinyan Velocity Yu
Frank F. Xu
Yiqing Xie
Graham Neubig
Daniel Fried
RALM
67
29
0
20 Jun 2024
Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment
Joseph Liu
Mahesh Kumar Nandwana
Janne Pylkkönen
Hannes Heikinheimo
Morgan McGuire
27
0
0
14 Jun 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
M. Shoeybi
Bryan Catanzaro
Wei Ping
RALM
36
132
0
27 May 2024
Theoretical Analysis of Weak-to-Strong Generalization
Hunter Lang
David Sontag
Aravindan Vijayaraghavan
19
19
0
25 May 2024
Linguistic Changes in Spontaneous Speech for Detecting Parkinsons Disease Using Large Language Models
Jonathan Crawford
36
0
0
08 Apr 2024
Gecko: Versatile Text Embeddings Distilled from Large Language Models
Jinhyuk Lee
Zhuyun Dai
Xiaoqi Ren
Blair Chen
Daniel Matthew Cer
...
Aditya Kusupati
Prateek Jain
Siddhartha Reddy Jonnalagadda
Ming-Wei Chang
Iftekhar Naim
RALM
VLM
SyDa
27
40
0
29 Mar 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
22
59
0
08 Feb 2024
RETSim: Resilient and Efficient Text Similarity
Marina Zhang
Owen Vallis
Aysegul Bumin
Tanay Vakharia
Elie Bursztein
8
1
0
28 Nov 2023
Text Embeddings Reveal (Almost) As Much As Text
John X. Morris
Volodymyr Kuleshov
Vitaly Shmatikov
Alexander M. Rush
RALM
18
89
0
10 Oct 2023
Embed-Search-Align: DNA Sequence Alignment using Transformer Models
Pavan Holur
K. Enevoldsen
Shreyas Rajesh
L. Mboning
Thalia Georgiou
Louis-S. Bouchard
Matteo Pellegrini
V. Roychowdhury
8
0
0
20 Sep 2023
RaLLe: A Framework for Developing and Evaluating Retrieval-Augmented Large Language Models
Yasuto Hoshi
Daisuke Miyashita
Youyang Ng
Kento Tatsuno
Yasuhiro Morioka
Osamu Torii
J. Deguchi
LRM
19
11
0
21 Aug 2023
Description-Based Text Similarity
Shauli Ravfogel
Valentina Pyatkin
Amir D. N. Cohen
Avshalom Manevich
Yoav Goldberg
12
5
0
21 May 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
27
2,297
0
09 Nov 2022
Text and Code Embeddings by Contrastive Pre-Training
Arvind Neelakantan
Tao Xu
Raul Puri
Alec Radford
Jesse Michael Han
...
Tabarak Khan
Toki Sherbakov
Joanne Jang
Peter Welinder
Lilian Weng
SSL
AI4TS
204
412
0
24 Jan 2022
Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval
Luyu Gao
Jamie Callan
RALM
152
326
0
12 Aug 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rücklé
Abhishek Srivastava
Iryna Gurevych
VLM
229
961
0
17 Apr 2021
Efficient Intent Detection with Dual Sentence Encoders
I. Casanueva
Tadas Temčinas
D. Gerz
Matthew Henderson
Ivan Vulić
VLM
167
444
0
10 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018