
Adaptive Parametric Activation: Unifying and Generalising Activation Functions Across Tasks

European Conference on Computer Vision (ECCV), 2024

11 July 2024

Konstantinos Panagiotis Alexandridis

Jiankang Deng

A. Nguyen

Shan Luo


Main: 9 Pages

10 Figures

Bibliography: 5 Pages

1 Table

Appendix: 5 Pages

Abstract

The activation function plays a crucial role in model optimisation, yet the optimal choice remains unclear. For example, the Sigmoid activation is the de facto activation in balanced classification tasks; in imbalanced classification, however, it proves inappropriate because it is biased towards frequent classes. In this work, we delve deeper into this phenomenon by performing a comprehensive statistical analysis of the classification and intermediate layers of both balanced and imbalanced networks, and we empirically show that aligning the activation function with the data distribution enhances performance in both balanced and imbalanced tasks. To this end, we propose the Adaptive Parametric Activation (APA) function, a novel and versatile activation function that unifies most common activation functions under a single formula. APA can be applied in both intermediate layers and attention layers, significantly outperforming the state of the art on several imbalanced benchmarks such as ImageNet-LT, iNaturalist2018, Places-LT, CIFAR100-LT and LVIS. We also extend APA to a range of other tasks, such as classification, detection, visual instruction following, image generation and next-text-token prediction benchmarks. APA increases performance on multiple benchmarks across various model architectures. The code is available at this https URL.
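
The abstract does not state the APA formula, so the following is only an illustrative sketch of how a single parametric formula can cover several familiar activations. It assumes a generalised logistic form, (lam * exp(-kappa * z) + 1) ** (-1 / lam), with two hypothetical parameters named `kappa` and `lam`; both the form and the names are assumptions made here and may differ from the paper's definition and from the released code.

```python
import math


def parametric_activation(z, kappa=1.0, lam=1.0):
    """Generalised logistic sketch (assumed form, not taken from the abstract).

    Computes (lam * exp(-kappa * z) + 1) ** (-1 / lam) for lam > 0.
    Note: math.exp can overflow for very negative kappa * z; this sketch
    does not guard against that.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    return (lam * math.exp(-kappa * z) + 1.0) ** (-1.0 / lam)


def sigmoid(z):
    """Standard logistic sigmoid, used here only as a reference."""
    return 1.0 / (1.0 + math.exp(-z))


def gumbel_cdf(z):
    """Standard Gumbel CDF exp(-exp(-z)), used here only as a reference."""
    return math.exp(-math.exp(-z))


# With kappa = 1 and lam = 1 the assumed form is exactly the sigmoid.
sigmoid_gap = abs(parametric_activation(0.5) - sigmoid(0.5))

# As lam approaches 0 the assumed form approaches the Gumbel CDF.
gumbel_gap = abs(parametric_activation(0.5, lam=1e-6) - gumbel_cdf(0.5))
```

Under this assumed form, `lam` moves the function between a symmetric sigmoid and an asymmetric Gumbel-like shape, while `kappa` rescales the input. In a network both would typically be learnable per layer, which is one way an activation could be aligned with the data distribution in the sense the abstract describes.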
