Are Labels Always Necessary for Classifier Accuracy Evaluation?

6 July 2020

Liang Zheng

Papers citing "Are Labels Always Necessary for Classifier Accuracy Evaluation?"

50 / 91 papers shown

Volatility in Certainty (VC): A Metric for Detecting Adversarial Perturbations During Inference in Neural Network Classifiers Vahid Hemmati Ahmad Mohammadi Abdul-Rauf Nuhu Reza Ahmari Parham Kebria A. Homaifar AAML 61 0 0 14 Nov 2025
ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction Han Yu Kehan Li Dongbai Li Yue He Xingxuan Zhang Peng Cui OODD 278 0 0 31 Oct 2025
DISCO: Diversifying Sample Condensation for Efficient Model Evaluation Alexander Rubinstein Benjamin Raible Martin Gubri Seong Joon Oh ELM 315 0 1 09 Oct 2025
Confidence and Dispersity as Signals: Unsupervised Model Evaluation and Ranking Weijian Deng Weijie Tu Ibrahim Radwan Mohammad Abu Alsheikh Stephen Gould Liang Zheng 96 0 0 03 Oct 2025
ALSA: Anchors in Logit Space for Out-of-Distribution Accuracy Estimation Chenzhi Liu Mahsa Baktashmotlagh Yanran Tang Zi Huang Ruihong Qiu 64 0 0 27 Aug 2025
Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability Seungju Yoo Hyuk Kwon Joong-Won Hwang Kibok Lee 120 0 0 16 Aug 2025
ODD: Overlap-aware Estimation of Model Performance under Distribution Shift. Conference on Uncertainty in Artificial Intelligence (UAI), 2025 Aayush Mishra Anqi Liu 120 1 0 17 Jun 2025
Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings Angéline Pouget Mohammad Yaghini Stephan Rabanser Nicolas Papernot 146 0 0 28 May 2025
Is Supervised Learning Really That Different from Unsupervised? Oskar Allerbo Thomas B. Schön OOD SSL 442 0 0 16 May 2025
BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025 Yu Wang Junxian Mu Hongzhi Huang Qilong Wang Pengfei Zhu Q. Hu 436 5 0 22 Mar 2025
Early Stopping Against Label Noise Without Validation Data. International Conference on Learning Representations (ICLR), 2025 Suqin Yuan Lei Feng Tongliang Liu NoLa 520 30 0 11 Feb 2025
Towards Unsupervised Model Selection for Domain Adaptive Object Detection. Neural Information Processing Systems (NeurIPS), 2024 Hengfu Yu Jinhong Deng Wen Li Lixin Duan 223 4 0 23 Dec 2024
Sequential Harmful Shift Detection Without Labels. Neural Information Processing Systems (NeurIPS), 2024 Salim I. Amoukou Tom Bewley Saumitra Mishra Freddy Lecue Daniele Magazzeni Manuela Veloso 228 5 0 17 Dec 2024
Can We Predict Performance of Large Models across Vision-Language Tasks? Qinyu Zhao Ming Xu Kartik Gupta Akshay Asthana Liang Zheng Stephen Gould 352 1 0 14 Oct 2024
Bias Assessment and Data Drift Detection in Medical Image Analysis: A Survey Andrea Prenner Bernhard Kainz 171 1 0 26 Sep 2024
Calibration of Network Confidence for Unsupervised Domain Adaptation Using Estimated Accuracy Coby Penso Jacob Goldberger 184 0 0 06 Sep 2024
AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling. ACM Transactions on Software Engineering and Methodology (TOSEM), 2024 Yuheng Huang Yuheng Huang Qiang Hu Felix Juefei Xu Lei Ma 279 7 0 07 Aug 2024
Source-Free Domain-Invariant Performance Prediction. European Conference on Computer Vision (ECCV), 2024 Ekaterina Khramtsova Mahsa Baktashmotlagh Guido Zuccon Xi Wang Mathieu Salzmann UQCV 202 1 0 05 Aug 2024
What Does Softmax Probability Tell Us about Classifiers Ranking Across Diverse Test Conditions? Weijie Tu Weijian Deng Liang Zheng Tom Gedeon 273 4 0 14 Jun 2024
Assessing Model Generalization in Vicinity Yuchi Liu Yifan Sun Jingdong Wang Liang Zheng AAML 165 0 0 13 Jun 2024
A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation Riccardo Fogliato Pratik Patil Mathew Monfort Pietro Perona 175 6 0 11 Jun 2024
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts Renchunzi Xie Ambroise Odonnat Vasilii Feofanov Weijian Deng Jianfeng Zhang Bo An 297 3 0 29 May 2024
Temporal Generalization Estimation in Evolving Graphs Bin Lu Tingyan Ma Xiaoying Gan Xinbing Wang Yunqiang Zhu Cheng Zhou Shiyu Liang 153 3 0 07 Apr 2024
Predicting the Performance of Foundation Models via Agreement-on-the-Line. Neural Information Processing Systems (NeurIPS), 2024 Aman Mehra Rahul Saxena Taeyoun Kim Christina Baek Zico Kolter Aditi Raghunathan UQCV 162 4 0 02 Apr 2024
FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models Lin Zhao Tianchen Zhao Zinan Lin Xuefei Ning Guohao Dai Huazhong Yang Yu Wang EGVM 185 13 0 25 Mar 2024
Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments Yang Yang Wenhai Wang Zhe Chen Jifeng Dai Liang Zheng 144 6 0 20 Mar 2024
Online GNN Evaluation Under Test-time Graph Distribution Shifts. International Conference on Learning Representations (ICLR), 2024 Xin-Yang Zheng Dongjin Song Qingsong Wen Bo Du Shirui Pan 134 13 0 15 Mar 2024
A Survey on Evaluation of Out-of-Distribution Generalization Han Yu Tianyu Wang Xingxuan Zhang Jiayun Wu Peng Cui OOD 249 17 0 04 Mar 2024
Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration Jiexin Wang Jiahao Chen Fuchun Sun UQCV 135 1 0 14 Feb 2024
Mission Critical -- Satellite Data is a Distinct Modality in Machine Learning Esther Rolf Konstantin Klemmer Caleb Robinson Hannah Kerner 164 65 0 02 Feb 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift Renchunzi Xie Ambroise Odonnat Vasilii Feofanov I. Redko Jianfeng Zhang Bo An UQCV 365 2 0 17 Jan 2024
Estimating Model Performance Under Covariate Shift Without Labels Jakub Bialek W. Kuberski W. Kuberski Nikolaos Perrakis 212 5 0 16 Jan 2024
Simple Transferability Estimation for Regression Tasks. Conference on Uncertainty in Artificial Intelligence (UAI), 2023 Cuong N. Nguyen Phong Tran L. Ho Vu C. Dinh Anh Tran Tal Hassner Cuong V Nguyen 231 6 0 01 Dec 2023
MetaDefa: Meta-learning based on Domain Enhancement and Feature Alignment for Single Domain Generalization Can Sun Hao Zheng Zhigang Hu Liu Yang Meiguang Zheng Bo Xu 163 0 0 27 Nov 2023
GNNEvaluator: Evaluating GNN Performance On Unseen Graphs Without Labels. Neural Information Processing Systems (NeurIPS), 2023 Xin-Yang Zheng Miao Zhang C. Chen Soheila Molaei Chuan Zhou Shirui Pan GNN 236 19 0 23 Oct 2023
OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift Lin Li Yifei Wang Chawin Sitawarin Michael W. Spratling 220 11 0 19 Oct 2023
On the Transferability of Learning Models for Semantic Segmentation for Remote Sensing Data Rongjun Qin Guixiang Zhang Yang Tang 170 3 0 16 Oct 2023
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis. International Conference on Learning Representations (ICLR), 2023 Chong You Xingjian Leng Zijian Wang Yang Yang Zi Huang Liang Zheng OOD 261 14 0 06 Oct 2023
AdaEvo: Edge-Assisted Continuous and Timely DNN Model Evolution for Mobile Devices. IEEE Transactions on Mobile Computing (IEEE TMC), 2023 Lehao Wang Zhiwen Yu Haoyi Yu Sicong Liu Yaxiong Xie Bin Guo Yunxin Liu 160 5 0 27 Sep 2023
PAGER: A Framework for Failure Analysis of Deep Regression Models Jayaraman J. Thiagarajan V. Narayanaswamy Puja Trivedi Rushil Anirudh 298 0 0 20 Sep 2023
Selecting which Dense Retriever to use for Zero-Shot Search Ekaterina Khramtsova Shengyao Zhuang Mahsa Baktashmotlagh Xi Wang Guido Zuccon 170 12 0 18 Sep 2023
Anchor Points: Benchmarking Models with Much Fewer Examples. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 Rajan Vivek Kawin Ethayarajh Diyi Yang Douwe Kiela ALM 276 41 0 14 Sep 2023
CAME: Contrastive Automated Model Evaluation. IEEE International Conference on Computer Vision (ICCV), 2023 Ru Peng Qiuyang Duan Haobo Wang Jiachen Ma Yanbo Jiang Yongjun Tu Xiu Jiang Jiaqi Zhao ELM 199 8 0 22 Aug 2023
Distance Matters For Improving Performance Estimation Under Covariate Shift Mélanie Roschewitz Ben Glocker 146 1 0 14 Aug 2023
Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples. IEEE International Conference on Computer Vision (ICCV), 2023 JoonHo Lee J. Woo H. Moon Kwonho Lee 187 4 0 19 Jul 2023
Validation of the Practicability of Logical Assessment Formula for Evaluations with Inaccurate Ground-Truth Labels Yongquan Yang Hong Bu 199 0 0 06 Jul 2023
ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models Uddeshya Upadhyay Shyamgopal Karthik Goran Frehse Zeynep Akata MLLM VLM 404 5 0 01 Jul 2023
On Orderings of Probability Vectors and Unsupervised Performance Estimation Muhammad Maaz Rui Qiao Yiheng Zhou Renxian Zhang 180 0 0 16 Jun 2023
LOVM: Language-Only Vision Model Selection. Neural Information Processing Systems (NeurIPS), 2023 O. Zohar Shih-Cheng Huang Kuan-Chieh Wang Serena Yeung MLLM 178 16 0 15 Jun 2023
(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy. Neural Information Processing Systems (NeurIPS), 2023 Elan Rosenfeld Saurabh Garg UQCV 130 12 0 01 Jun 2023