Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling

14 April 2021

Papers citing "Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling"

50 / 195 papers shown

Title
Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition Zheng Yao Shuai Wang Guido Zuccon 21 0 0 12 May 2025
QBR: A Question-Bank-Based Approach to Fine-Grained Legal Knowledge Retrieval for the General Public Mingruo Yuan Ben Kao Tien-Hsuan Wu AILaw 79 0 0 08 May 2025
Rational Retrieval Acts: Leveraging Pragmatic Reasoning to Improve Sparse Retrieval Arthur Satouf Gabriel Ben Zenou Benjamin Piwowarski Habiboulaye Amadou Boubacar Pablo Piantanida 32 0 0 06 May 2025
Interpreting Multilingual and Document-Length Sensitive Relevance Computations in Neural Retrieval Models through Axiomatic Causal Interventions Oliver Savolainen Dur e Najaf Amjad Roxana Petcu AAML 40 0 0 04 May 2025
Effective Inference-Free Retrieval for Learned Sparse Representations F. M. Nardini Thong Nguyen Cosimo Rulli Rossano Venturini Andrew Yates RALM 50 0 0 30 Apr 2025
Unsupervised Corpus Poisoning Attacks in Continuous Space for Dense Retrieval Yongkang Li Panagiotis Eustratiadis Simon Lupart Evangelos Kanoulas AAML 53 0 0 24 Apr 2025
Exploring $\ell_0$ Sparsification for Inference-free Sparse Retrievers Xinjie Shen Zhichao Geng Yang Yang 29 0 0 21 Apr 2025
Towards Lossless Token Pruning in Late-Interaction Retrieval Models Yuxuan Zong Benjamin Piwowarski 51 0 0 17 Apr 2025
Enhancing Document Retrieval for Curating N-ary Relations in Knowledge Bases Xing David Wang Ulf Leser 31 0 0 14 Apr 2025
Breaking the Lens of the Telescope: Online Relevance Estimation over Large Retrieval Sets Mandeep Rathee Venktesh V Sean MacAvaney Avishek Anand KELM 37 1 0 12 Apr 2025
Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling Hengran Zhang Keping Bi Jiafeng Guo Xiaojie Sun Shihao Liu Daiting Shi Dawei Yin Xueqi Cheng RALM 248 0 0 07 Apr 2025
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking Chris Samarinas Hamed Zamani ALM LRM 74 1 0 04 Apr 2025
LLM-VPRF: Large Language Model Based Vector Pseudo Relevance Feedback Hang Li Shengyao Zhuang Bevan Koopman Guido Zuccon VLM 46 0 0 02 Apr 2025
Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance Reza Esfandiarpoor George Zerveas Ruochen Zhang Macton Mgonzo Carsten Eickhoff Stephen H. Bach SyDa 52 0 0 29 Mar 2025
Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers Francesca Pezzuti Sean MacAvaney Nicola Tonellotto 36 0 0 28 Mar 2025
EqualizeIR: Mitigating Linguistic Biases in Retrieval Models Jiali Cheng Hadi Amiri 43 1 0 22 Mar 2025
Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents Haoyu Wang Sunhao Dai Haiyuan Zhao Liang Pang Xiao Zhang Gang Wang Zhenhua Dong Jun Xu Zhicheng Dou 72 2 0 11 Mar 2025
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation Manveer Singh Tamber Suleman Kazi Vivek Sourabh Jimmy J. Lin 71 1 0 27 Feb 2025
Hierarchical corpus encoder: Fusing generative retrieval and dense indices Tongfei Chen Ankita Sharma Adam Pauls Benjamin Van Durme RALM 57 1 0 26 Feb 2025
Scaling Sparse and Dense Retrieval in Decoder-Only LLMs Hansi Zeng Julian Killingback Hamed Zamani RALM 78 2 0 24 Feb 2025
A Survey of Model Architectures in Information Retrieval Zhichao Xu Fengran Mo Zhiqi Huang Crystina Zhang Puxuan Yu Bei Wang Jimmy J. Lin Vivek Srikumar KELM 3DV 75 2 0 21 Feb 2025
FactIR: A Real-World Zero-shot Open-Domain Retrieval Benchmark for Fact-Checking Venktesh V Vinay Setty HILM 56 0 0 09 Feb 2025
Hypencoder: Hypernetworks for Information Retrieval Julian Killingback Hansi Zeng Hamed Zamani 112 1 0 07 Feb 2025
Hierarchical Multi-field Representations for Two-Stage E-commerce Retrieval Niklas Freymuth Dong Liu Thomas Ricatte Saab Mansour 78 0 0 30 Jan 2025
GeAR: Generation Augmented Retrieval Haoyu Liu Shaohan Huang Jianfeng Liu Yuefeng Zhan H. Sun Weiwei Deng Feng Sun Furu Wei Qi Zhang 49 1 0 06 Jan 2025
Boosting LLM-based Relevance Modeling with Distribution-Aware Robust Learning Hong Liu Saisai Gong Yixin Ji Kaixin Wu Jia Xu Jinjie Gu 17 1 0 17 Dec 2024
PTR: Precision-Driven Tool Recommendation for Large Language Models Hang Gao Yongfeng Zhang KELM 51 0 0 14 Nov 2024
Neural Corrective Machine Unranking Jingrui Hou Axel Finke Georgina Cosma MU 42 1 0 13 Nov 2024
Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers Zhichao Geng Dongyu Ru Yang Yang 25 1 0 07 Nov 2024
Link, Synthesize, Retrieve: Universal Document Linking for Zero-Shot Information Retrieval Dae Yon Hwang Bilal Taha Harshit Pande Yaroslav Nechaev SyDa 33 0 0 24 Oct 2024
Contextual Document Embeddings John X. Morris Alexander M. Rush 34 8 0 03 Oct 2024
Elaborative Subtopic Query Reformulation for Broad and Indirect Queries in Travel Destination Recommendation Qianfeng Wen Yifan Liu Joshua Zhang George Saad Anton Korikov Yury Sambale Scott Sanner 36 3 0 02 Oct 2024
QAEncoder: Towards Aligned Representation Learning in Question Answering System Zhengren Wang Qinhan Yu Shida Wei Zhiyu Li Zhiyu Li Xiaoxing Wang Pengnian Qi Hao Liang Wentao Zhang RALM 35 1 0 30 Sep 2024
ASTRA: Accurate and Scalable ANNS-based Training of Extreme Classifiers Sonu Mehta Jayashree Mohan Nagarajan Natarajan Ramachandran Ramjee Manik Varma 31 0 0 30 Sep 2024
Few-shot Prompting for Pairwise Ranking: An Effective Non-Parametric Retrieval Model Nilanjan Sinhababu Andrew Parry Debasis Ganguly D. Samanta Pabitra Mitra 41 3 0 26 Sep 2024
Disentangling Questions from Query Generation for Task-Adaptive Retrieval Yoonsang Lee Minsoo Kim Seung-won Hwang RALM 28 0 0 25 Sep 2024
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely Siyun Zhao Yuqing Yang Zilong Wang Zhiyuan He Luna Qiu Lili Qiu SyDa RALM 3DV 51 36 0 23 Sep 2024
Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain Antoine Louis Gijs van Dijck Gerasimos Spanakis 39 0 0 02 Sep 2024
Mistral-SPLADE: LLMs for better Learned Sparse Retrieval Meet Doshi Vishwajeet Kumar Rudra Murthy Vignesh P Jaydeep Sen RALM 43 2 0 20 Aug 2024
Synergistic Approach for Simultaneous Optimization of Monolingual, Cross-lingual, and Multilingual Information Retrieval Adel Elmahdy Sheng-Chieh Lin Amin Ahmad 47 2 0 20 Aug 2024
W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering Jinming Nian Zhiyuan Peng Qifan Wang Yi Fang RALM 78 2 0 15 Aug 2024
Neural Machine Unranking Jingrui Hou Axel Finke Georgina Cosma MU 38 0 0 09 Aug 2024
Embedding And Clustering Your Data Can Improve Contrastive Pretraining Luke Merrick 18 3 0 26 Jul 2024
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions Bojana Bašaragin Adela Ljajić Darija Medvecki Lorenzo Cassano Milos Kosprdic Nikola Milosevic LM&MA 45 3 0 06 Jul 2024
BERGEN: A Benchmarking Library for Retrieval-Augmented Generation David Rau Hervé Déjean Nadezhda Chirkova Thibault Formal Shuai Wang Vassilina Nikoulina S. Clinchant 47 12 0 01 Jul 2024
Preserving Multilingual Quality While Tuning Query Encoder on English Only Oleg V. Vasilyev Randy Sawaya John Bohannon 37 1 0 01 Jul 2024
DEXTER: A Benchmark for open-domain Complex Question Answering using LLMs Venktesh V. Deepali Prabhu Avishek Anand RALM CoGe 41 3 0 24 Jun 2024
Tool Learning with Large Language Models: A Survey Changle Qu Sunhao Dai Xiaochi Wei Hengyi Cai Shuaiqiang Wang Dawei Yin Jun Xu Jirong Wen LLMAG 41 87 0 28 May 2024
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration Sunhao Dai Weihao Liu Yuqi Zhou Liang Pang Rongju Ruan Gang Wang Zhenhua Dong Jun Xu Jirong Wen 59 8 0 26 May 2024
Retrieval-Augmented Conversational Recommendation with Prompt-based Semi-Structured Natural Language State Tracking Sara Kemper Justin Cui Kai Dicarlantonio Kathy Lin Danjie Tang Anton Korikov Scott Sanner 35 11 0 25 May 2024