Are Labels Always Necessary for Classifier Accuracy Evaluation?

6 July 2020
Weijian Deng
Liang Zheng

Papers citing "Are Labels Always Necessary for Classifier Accuracy Evaluation?"

50 / 91 papers shown
Volatility in Certainty (VC): A Metric for Detecting Adversarial Perturbations During Inference in Neural Network Classifiers
Vahid Hemmati
Ahmad Mohammadi
Abdul-Rauf Nuhu
Reza Ahmari
Parham Kebria
A. Homaifar
AAML
61
0
0
14 Nov 2025
ODP-Bench: Benchmarking Out-of-Distribution Performance Prediction
Han Yu
Kehan Li
Dongbai Li
Yue He
Xingxuan Zhang
Peng Cui
OODD
278
0
0
31 Oct 2025
DISCO: Diversifying Sample Condensation for Efficient Model Evaluation
Alexander Rubinstein
Benjamin Raible
Martin Gubri
Seong Joon Oh
ELM
315
0
1
09 Oct 2025
Confidence and Dispersity as Signals: Unsupervised Model Evaluation and Ranking
Weijian Deng
Weijie Tu
Ibrahim Radwan
Mohammad Abu Alsheikh
Stephen Gould
Liang Zheng
96
0
0
03 Oct 2025
ALSA: Anchors in Logit Space for Out-of-Distribution Accuracy Estimation
Chenzhi Liu
Mahsa Baktashmotlagh
Yanran Tang
Zi Huang
Ruihong Qiu
64
0
0
27 Aug 2025
Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability
Seungju Yoo
Hyuk Kwon
Joong-Won Hwang
Kibok Lee
120
0
0
16 Aug 2025
ODD: Overlap-aware Estimation of Model Performance under Distribution Shift
Conference on Uncertainty in Artificial Intelligence (UAI), 2025
Aayush Mishra
Anqi Liu
120
1
0
17 Jun 2025
Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings
Angéline Pouget
Mohammad Yaghini
Stephan Rabanser
Nicolas Papernot
146
0
0
28 May 2025
Is Supervised Learning Really That Different from Unsupervised?
Oskar Allerbo
Thomas B. Schön
OOD, SSL
442
0
0
16 May 2025
BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yu Wang
Junxian Mu
Hongzhi Huang
Qilong Wang
Pengfei Zhu
Q. Hu
436
5
0
22 Mar 2025
Early Stopping Against Label Noise Without Validation Data
International Conference on Learning Representations (ICLR), 2025
Suqin Yuan
Lei Feng
Tongliang Liu
NoLa
520
30
0
11 Feb 2025
Towards Unsupervised Model Selection for Domain Adaptive Object Detection
Neural Information Processing Systems (NeurIPS), 2024
Hengfu Yu
Jinhong Deng
Wen Li
Lixin Duan
223
4
0
23 Dec 2024
Sequential Harmful Shift Detection Without Labels
Neural Information Processing Systems (NeurIPS), 2024
Salim I. Amoukou
Tom Bewley
Saumitra Mishra
Freddy Lecue
Daniele Magazzeni
Manuela Veloso
228
5
0
17 Dec 2024
Can We Predict Performance of Large Models across Vision-Language Tasks?
Qinyu Zhao
Ming Xu
Kartik Gupta
Akshay Asthana
Liang Zheng
Stephen Gould
352
1
0
14 Oct 2024
Bias Assessment and Data Drift Detection in Medical Image Analysis: A Survey
Andrea Prenner
Bernhard Kainz
171
1
0
26 Sep 2024
Calibration of Network Confidence for Unsupervised Domain Adaptation Using Estimated Accuracy
Coby Penso
Jacob Goldberger
184
0
0
06 Sep 2024
AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling
ACM Transactions on Software Engineering and Methodology (TOSEM), 2024
Yuheng Huang
Qiang Hu
Felix Juefei Xu
Lei Ma
279
7
0
07 Aug 2024
Source-Free Domain-Invariant Performance Prediction
European Conference on Computer Vision (ECCV), 2024
Ekaterina Khramtsova
Mahsa Baktashmotlagh
Guido Zuccon
Xi Wang
Mathieu Salzmann
UQCV
202
1
0
05 Aug 2024
What Does Softmax Probability Tell Us about Classifiers Ranking Across Diverse Test Conditions?
Weijie Tu
Weijian Deng
Liang Zheng
Tom Gedeon
273
4
0
14 Jun 2024
Assessing Model Generalization in Vicinity
Yuchi Liu
Yifan Sun
Jingdong Wang
Liang Zheng
AAML
165
0
0
13 Jun 2024
A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
Riccardo Fogliato
Pratik Patil
Mathew Monfort
Pietro Perona
175
6
0
11 Jun 2024
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
Weijian Deng
Jianfeng Zhang
Bo An
297
3
0
29 May 2024
Temporal Generalization Estimation in Evolving Graphs
Bin Lu
Tingyan Ma
Xiaoying Gan
Xinbing Wang
Yunqiang Zhu
Cheng Zhou
Shiyu Liang
153
3
0
07 Apr 2024
Predicting the Performance of Foundation Models via Agreement-on-the-Line
Neural Information Processing Systems (NeurIPS), 2024
Aman Mehra
Rahul Saxena
Taeyoun Kim
Christina Baek
Zico Kolter
Aditi Raghunathan
UQCV
162
4
0
02 Apr 2024
FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models
Lin Zhao
Tianchen Zhao
Zinan Lin
Xuefei Ning
Guohao Dai
Huazhong Yang
Yu Wang
EGVM
185
13
0
25 Mar 2024
Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments
Yang Yang
Wenhai Wang
Zhe Chen
Jifeng Dai
Liang Zheng
144
6
0
20 Mar 2024
Online GNN Evaluation Under Test-time Graph Distribution Shifts
International Conference on Learning Representations (ICLR), 2024
Xin-Yang Zheng
Dongjin Song
Qingsong Wen
Bo Du
Shirui Pan
134
13
0
15 Mar 2024
A Survey on Evaluation of Out-of-Distribution Generalization
Han Yu
Tianyu Wang
Xingxuan Zhang
Jiayun Wu
Peng Cui
OOD
249
17
0
04 Mar 2024
Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration
Jiexin Wang
Jiahao Chen
Fuchun Sun
UQCV
135
1
0
14 Feb 2024
Mission Critical -- Satellite Data is a Distinct Modality in Machine Learning
Esther Rolf
Konstantin Klemmer
Caleb Robinson
Hannah Kerner
164
65
0
02 Feb 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
I. Redko
Jianfeng Zhang
Bo An
UQCV
365
2
0
17 Jan 2024
Estimating Model Performance Under Covariate Shift Without Labels
Jakub Bialek
W. Kuberski
Nikolaos Perrakis
212
5
0
16 Jan 2024
Simple Transferability Estimation for Regression Tasks
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Cuong N. Nguyen
Phong Tran
L. Ho
Vu C. Dinh
Anh Tran
Tal Hassner
Cuong V Nguyen
231
6
0
01 Dec 2023
MetaDefa: Meta-learning based on Domain Enhancement and Feature Alignment for Single Domain Generalization
Can Sun
Hao Zheng
Zhigang Hu
Liu Yang
Meiguang Zheng
Bo Xu
163
0
0
27 Nov 2023
GNNEvaluator: Evaluating GNN Performance On Unseen Graphs Without Labels
Neural Information Processing Systems (NeurIPS), 2023
Xin-Yang Zheng
Miao Zhang
C. Chen
Soheila Molaei
Chuan Zhou
Shirui Pan
GNN
236
19
0
23 Oct 2023
OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift
Lin Li
Yifei Wang
Chawin Sitawarin
Michael W. Spratling
220
11
0
19 Oct 2023
On the Transferability of Learning Models for Semantic Segmentation for Remote Sensing Data
Rongjun Qin
Guixiang Zhang
Yang Tang
170
3
0
16 Oct 2023
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis
International Conference on Learning Representations (ICLR), 2023
Chong You
Xingjian Leng
Zijian Wang
Yang Yang
Zi Huang
Liang Zheng
OOD
261
14
0
06 Oct 2023
AdaEvo: Edge-Assisted Continuous and Timely DNN Model Evolution for Mobile Devices
IEEE Transactions on Mobile Computing (IEEE TMC), 2023
Lehao Wang
Zhiwen Yu
Haoyi Yu
Sicong Liu
Yaxiong Xie
Bin Guo
Yunxin Liu
160
5
0
27 Sep 2023
PAGER: A Framework for Failure Analysis of Deep Regression Models
Jayaraman J. Thiagarajan
V. Narayanaswamy
Puja Trivedi
Rushil Anirudh
298
0
0
20 Sep 2023
Selecting which Dense Retriever to use for Zero-Shot Search
Ekaterina Khramtsova
Shengyao Zhuang
Mahsa Baktashmotlagh
Xi Wang
Guido Zuccon
170
12
0
18 Sep 2023
Anchor Points: Benchmarking Models with Much Fewer Examples
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Rajan Vivek
Kawin Ethayarajh
Diyi Yang
Douwe Kiela
ALM
276
41
0
14 Sep 2023
CAME: Contrastive Automated Model Evaluation
IEEE International Conference on Computer Vision (ICCV), 2023
Ru Peng
Qiuyang Duan
Haobo Wang
Jiachen Ma
Yanbo Jiang
Yongjun Tu
Xiu Jiang
Jiaqi Zhao
ELM
199
8
0
22 Aug 2023
Distance Matters For Improving Performance Estimation Under Covariate Shift
Mélanie Roschewitz
Ben Glocker
146
1
0
14 Aug 2023
Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples
IEEE International Conference on Computer Vision (ICCV), 2023
JoonHo Lee
J. Woo
H. Moon
Kwonho Lee
187
4
0
19 Jul 2023
Validation of the Practicability of Logical Assessment Formula for Evaluations with Inaccurate Ground-Truth Labels
Yongquan Yang
Hong Bu
199
0
0
06 Jul 2023
ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models
Uddeshya Upadhyay
Shyamgopal Karthik
Goran Frehse
Zeynep Akata
MLLM, VLM
404
5
0
01 Jul 2023
On Orderings of Probability Vectors and Unsupervised Performance Estimation
Muhammad Maaz
Rui Qiao
Yiheng Zhou
Renxian Zhang
180
0
0
16 Jun 2023
LOVM: Language-Only Vision Model Selection
Neural Information Processing Systems (NeurIPS), 2023
O. Zohar
Shih-Cheng Huang
Kuan-Chieh Wang
Serena Yeung
MLLM
178
16
0
15 Jun 2023
(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy
Neural Information Processing Systems (NeurIPS), 2023
Elan Rosenfeld
Saurabh Garg
UQCV
130
12
0
01 Jun 2023