ResearchTrend.AI
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

5 February 2016
André F. T. Martins
Ramón Fernández Astudillo

Papers citing "From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification"

50 / 87 papers shown
Smooth Quadratic Prediction Markets
Enrique Nueve
Bo Waggoner
25
0
0
05 May 2025
Aligning Instance-Semantic Sparse Representation towards Unsupervised Object Segmentation and Shape Abstraction with Repeatable Primitives
Jiaxin Li
Hongxing Wang
Jiawei Tan
Zhilong Ou
Junsong Yuan
3DPC
40
0
0
10 Mar 2025
Transfer Learning with Pre-trained Conditional Generative Models
Shin'ya Yamaguchi
Sekitoshi Kanai
Atsutoshi Kumagai
Daiki Chijiwa
H. Kashima
VLM
CLL
BDL
DiffM
132
5
0
21 Feb 2025
Learning to Decouple Complex Systems
Zihan Zhou
Tianshu Yu
BDL
68
4
0
17 Feb 2025
Aggregate to Adapt: Node-Centric Aggregation for Multi-Source-Free Graph Domain Adaptation
Zhen Zhang
Bingsheng He
104
2
0
05 Feb 2025
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
Oussama Zekri
Nicolas Boullé
DiffM
62
3
0
03 Feb 2025
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
Duc Hau Nguyen
Cyrielle Mallart
Guillaume Gravier
Pascale Sébillot
60
0
0
22 Jan 2025
Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs
Amirmohammad Farzaneh
Osvaldo Simeone
86
0
0
22 Jan 2025
Privacy Vulnerabilities in Marginals-based Synthetic Data
Steven Golob
Sikha Pentyala
Anuar Maratkhan
Martine De Cock
26
3
0
07 Oct 2024
Can Transformers Learn $n$-gram Language Models?
Anej Svete
Nadav Borenstein
M. Zhou
Isabelle Augenstein
Ryan Cotterell
33
6
0
03 Oct 2024
Attention layers provably solve single-location regression
P. Marion
Raphael Berthier
Gérard Biau
Claire Boyer
125
2
0
02 Oct 2024
q-exponential family for policy optimization
Lingwei Zhu
Haseeb Shah
Han Wang
Yukie Nagai
Martha White
OffRL
73
0
0
14 Aug 2024
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
Franz Nowak
Anej Svete
Alexandra Butoi
Ryan Cotterell
ReLM
LRM
46
12
0
20 Jun 2024
Causal Discovery Inspired Unsupervised Domain Adaptation for Emotion-Cause Pair Extraction
Yuncheng Hua
Yujin Huang
Shuo Huang
Tao Feng
Lizhen Qu
Chris Bain
R. Bassed
Gholamreza Haffari
CML
OOD
50
2
0
18 Jun 2024
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages
Trinh Pham
Khoi M. Le
Luu Anh Tuan
36
1
0
14 Jun 2024
Building a stable classifier with the inflated argmax
Jake A. Soloff
Rina Foygel Barber
Rebecca Willett
91
2
0
22 May 2024
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Dennis Wu
Jerry Yao-Chieh Hu
Teng-Yun Hsiao
Han Liu
40
28
0
04 Apr 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
27
2
0
26 Jan 2024
An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification
Hyenkyun Woo
15
0
0
26 Dec 2023
Recurrent Neural Language Models as Probabilistic Finite-state Automata
Anej Svete
Ryan Cotterell
32
2
0
08 Oct 2023
Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities
Jayanta Mandi
James Kotary
Senne Berden
Maxime Mulamba
Víctor Bucarey
Tias Guns
Ferdinando Fioretto
AI4CE
26
54
0
25 Jul 2023
Generative Meta-Learning Robust Quality-Diversity Portfolio
K. Yuksel
11
2
0
15 Jul 2023
High-Similarity-Pass Attention for Single Image Super-Resolution
Jianmei Su
Min Gan
Guang-Yong Chen
Wenzhong Guo
C. L. Philip Chen
27
16
0
25 May 2023
Interpretable Multimodal Misinformation Detection with Logic Reasoning
Hui Liu
Wenya Wang
Haoliang Li
37
22
0
10 May 2023
Filling out the missing gaps: Time Series Imputation with Semi-Supervised Learning
Karan Aggarwal
Jaideep Srivastava
AI4TS
19
0
0
09 Apr 2023
Learning Sparsity of Representations with Discrete Latent Variables
Zhao Xu
Daniel Oñoro-Rubio
G. Serra
Mathias Niepert
13
0
0
03 Apr 2023
GTRL: An Entity Group-Aware Temporal Knowledge Graph Representation Learning Method
Xing Tang
Ling-Hao Chen
AI4TS
14
4
0
22 Feb 2023
A Study on ReLU and Softmax in Transformer
Kai Shen
Junliang Guo
Xuejiao Tan
Siliang Tang
Rui Wang
Jiang Bian
19
53
0
13 Feb 2023
HanoiT: Enhancing Context-aware Translation via Selective Context
Jian Yang
Yuwei Yin
Shuming Ma
Liqun Yang
Hongcheng Guo
Haoyang Huang
Dongdong Zhang
Yutao Zeng
Zhoujun Li
Furu Wei
24
5
0
17 Jan 2023
T2G-Former: Organizing Tabular Features into Relation Graphs Promotes Heterogeneous Feature Interaction
Jiahuan Yan
Jintai Chen
YiXuan Wu
D. Z. Chen
Jian Wu
17
35
0
30 Nov 2022
Weakly Supervised Learning Significantly Reduces the Number of Labels Required for Intracranial Hemorrhage Detection on Head CT
Jacopo Teneggi
P. Yi
Jeremias Sulam
25
3
0
29 Nov 2022
SEAT: Stable and Explainable Attention
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Di Wang
OOD
18
18
0
23 Nov 2022
Truncation Sampling as Language Model Desmoothing
John Hewitt
Christopher D. Manning
Percy Liang
BDL
38
75
0
27 Oct 2022
SIMPLE: A Gradient Estimator for $k$-Subset Sampling
Kareem Ahmed
Zhe Zeng
Mathias Niepert
Guy Van den Broeck
BDL
40
24
0
04 Oct 2022
Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax
Hao-Ren Yao
Nairen Cao
Katina Russell
D. Chang
O. Frieder
Jeremy T. Fineman
SSL
20
1
0
01 Sep 2022
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel
Ryuichi Kanoh
M. Sugiyama
20
2
0
25 May 2022
CLCNet: Rethinking of Ensemble Modeling with Classification Confidence Network
Yaodong Yu
S. Horng
10
0
0
19 May 2022
Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation
Chao Chen
Haoyu Geng
Nianzu Yang
Junchi Yan
Daiyue Xue
Jianping Yu
Xiaokang Yang
HAI
AI4TS
27
11
0
30 Mar 2022
On Neural Network Equivalence Checking using SMT Solvers
Charis Eleftheriadis
Nikolaos Kekatos
Panagiotis Katsaros
S. Tripakis
AAML
21
12
0
22 Mar 2022
TraceNet: Tracing and Locating the Key Elements in Sentiment Analysis
Qinghua Zhao
Shuai Ma
4
0
0
28 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
35
2
0
15 Feb 2022
Are Transformers More Robust? Towards Exact Robustness Verification for Transformers
B. Liao
Chih-Hong Cheng
Hasan Esen
Alois C. Knoll
AAML
26
1
0
08 Feb 2022
Towards Controllable Agent in MOBA Games with Generative Modeling
Shubao Zhang
32
0
0
15 Dec 2021
Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling
Chen Tang
Wei Zhan
M. Tomizuka
DRL
29
19
0
01 Dec 2021
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Phil Chen
Masha Itkina
Ransalu Senanayake
Mykel J. Kochenderfer
28
6
0
27 Oct 2021
Understanding Interlocking Dynamics of Cooperative Rationalization
Mo Yu
Yang Zhang
Shiyu Chang
Tommi Jaakkola
18
41
0
26 Oct 2021
Deep Neural Networks and Tabular Data: A Survey
V. Borisov
Tobias Leemann
Kathrin Seßler
Johannes Haug
Martin Pawelczyk
Gjergji Kasneci
LMTD
27
645
0
05 Oct 2021
Trustworthy AI: From Principles to Practices
Bo-wen Li
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
117
355
0
04 Oct 2021
A Deep Learning Perspective on Connected Automated Vehicle (CAV) Cybersecurity and Threat Intelligence
M. Basnet
Mohd. Hasan Ali
24
7
0
22 Sep 2021
Identifying Autism Spectrum Disorder Based on Individual-Aware Down-Sampling and Multi-Modal Learning
Li Pan
Jundong Liu
M. Shi
C. Wong
K. Chan
22
11
0
19 Sep 2021