CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking | International Conference on Learning Representations (ICLR), 2024 |
Model Editing for LLMs4Code: How Far are We? | International Conference on Software Engineering (ICSE), 2024 |
REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models | Applied Informatics (AI), 2024 |
Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration | Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models | International Conference on Learning Representations (ICLR), 2024 |
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design | North American Chapter of the Association for Computational Linguistics (NAACL), 2024 |