1602.02068
Cited By
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification
5 February 2016
André F. T. Martins
Ramón Fernández Astudillo
Papers citing
"From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification"
50 / 87 papers shown
Title
Smooth Quadratic Prediction Markets
Enrique Nueve
Bo Waggoner
25
0
0
05 May 2025
Aligning Instance-Semantic Sparse Representation towards Unsupervised Object Segmentation and Shape Abstraction with Repeatable Primitives
Jiaxin Li
Hongxing Wang
Jiawei Tan
Zhilong Ou
Junsong Yuan
3DPC
40
0
0
10 Mar 2025
Transfer Learning with Pre-trained Conditional Generative Models
Shin'ya Yamaguchi
Sekitoshi Kanai
Atsutoshi Kumagai
Daiki Chijiwa
H. Kashima
VLM
CLL
BDL
DiffM
132
5
0
21 Feb 2025
Learning to Decouple Complex Systems
Zihan Zhou
Tianshu Yu
BDL
68
4
0
17 Feb 2025
Aggregate to Adapt: Node-Centric Aggregation for Multi-Source-Free Graph Domain Adaptation
Zhen Zhang
Bingsheng He
104
2
0
05 Feb 2025
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
Oussama Zekri
Nicolas Boullé
DiffM
62
3
0
03 Feb 2025
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
Duc Hau Nguyen
Cyrielle Mallart
Guillaume Gravier
Pascale Sébillot
60
0
0
22 Jan 2025
Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs
Amirmohammad Farzaneh
Osvaldo Simeone
86
0
0
22 Jan 2025
Privacy Vulnerabilities in Marginals-based Synthetic Data
Steven Golob
Sikha Pentyala
Anuar Maratkhan
Martine De Cock
26
3
0
07 Oct 2024
Can Transformers Learn n-gram Language Models?
Anej Svete
Nadav Borenstein
M. Zhou
Isabelle Augenstein
Ryan Cotterell
33
6
0
03 Oct 2024
Attention layers provably solve single-location regression
P. Marion
Raphael Berthier
Gérard Biau
Claire Boyer
125
2
0
02 Oct 2024
q-exponential family for policy optimization
Lingwei Zhu
Haseeb Shah
Han Wang
Yukie Nagai
Martha White
OffRL
73
0
0
14 Aug 2024
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
Franz Nowak
Anej Svete
Alexandra Butoi
Ryan Cotterell
ReLM
LRM
46
12
0
20 Jun 2024
Causal Discovery Inspired Unsupervised Domain Adaptation for Emotion-Cause Pair Extraction
Yuncheng Hua
Yujin Huang
Shuo Huang
Tao Feng
Lizhen Qu
Chris Bain
R. Bassed
Gholamreza Haffari
CML
OOD
50
2
0
18 Jun 2024
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages
Trinh Pham
Khoi M. Le
Luu Anh Tuan
36
1
0
14 Jun 2024
Building a stable classifier with the inflated argmax
Jake A. Soloff
Rina Foygel Barber
Rebecca Willett
91
2
0
22 May 2024
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Dennis Wu
Jerry Yao-Chieh Hu
Teng-Yun Hsiao
Han Liu
40
28
0
04 Apr 2024
Regularized Q-Learning with Linear Function Approximation
Jiachen Xi
Alfredo Garcia
P. Momcilovic
27
2
0
26 Jan 2024
An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification
Hyenkyun Woo
15
0
0
26 Dec 2023
Recurrent Neural Language Models as Probabilistic Finite-state Automata
Anej Svete
Ryan Cotterell
32
2
0
08 Oct 2023
Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities
Jayanta Mandi
James Kotary
Senne Berden
Maxime Mulamba
Víctor Bucarey
Tias Guns
Ferdinando Fioretto
AI4CE
26
54
0
25 Jul 2023
Generative Meta-Learning Robust Quality-Diversity Portfolio
K. Yuksel
11
2
0
15 Jul 2023
High-Similarity-Pass Attention for Single Image Super-Resolution
Jianmei Su
Min Gan
Guang-Yong Chen
Wenzhong Guo
C. L. Philip Chen
27
16
0
25 May 2023
Interpretable Multimodal Misinformation Detection with Logic Reasoning
Hui Liu
Wenya Wang
Haoliang Li
37
22
0
10 May 2023
Filling out the missing gaps: Time Series Imputation with Semi-Supervised Learning
Karan Aggarwal
Jaideep Srivastava
AI4TS
19
0
0
09 Apr 2023
Learning Sparsity of Representations with Discrete Latent Variables
Zhao Xu
Daniel Oñoro-Rubio
G. Serra
Mathias Niepert
13
0
0
03 Apr 2023
GTRL: An Entity Group-Aware Temporal Knowledge Graph Representation Learning Method
Xing Tang
Ling-Hao Chen
AI4TS
14
4
0
22 Feb 2023
A Study on ReLU and Softmax in Transformer
Kai Shen
Junliang Guo
Xuejiao Tan
Siliang Tang
Rui Wang
Jiang Bian
19
53
0
13 Feb 2023
HanoiT: Enhancing Context-aware Translation via Selective Context
Jian Yang
Yuwei Yin
Shuming Ma
Liqun Yang
Hongcheng Guo
Haoyang Huang
Dongdong Zhang
Yutao Zeng
Zhoujun Li
Furu Wei
24
5
0
17 Jan 2023
T2G-Former: Organizing Tabular Features into Relation Graphs Promotes Heterogeneous Feature Interaction
Jiahuan Yan
Jintai Chen
YiXuan Wu
D. Z. Chen
Jian Wu
17
35
0
30 Nov 2022
Weakly Supervised Learning Significantly Reduces the Number of Labels Required for Intracranial Hemorrhage Detection on Head CT
Jacopo Teneggi
P. Yi
Jeremias Sulam
25
3
0
29 Nov 2022
SEAT: Stable and Explainable Attention
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Di Wang
OOD
18
18
0
23 Nov 2022
Truncation Sampling as Language Model Desmoothing
John Hewitt
Christopher D. Manning
Percy Liang
BDL
38
75
0
27 Oct 2022
SIMPLE: A Gradient Estimator for k-Subset Sampling
Kareem Ahmed
Zhe Zeng
Mathias Niepert
Guy Van den Broeck
BDL
40
24
0
04 Oct 2022
Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax
Hao-Ren Yao
Nairen Cao
Katina Russell
D. Chang
O. Frieder
Jeremy T. Fineman
SSL
20
1
0
01 Sep 2022
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel
Ryuichi Kanoh
M. Sugiyama
20
2
0
25 May 2022
CLCNet: Rethinking of Ensemble Modeling with Classification Confidence Network
Yaodong Yu
S. Horng
10
0
0
19 May 2022
Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation
Chao Chen
Haoyu Geng
Nianzu Yang
Junchi Yan
Daiyue Xue
Jianping Yu
Xiaokang Yang
HAI
AI4TS
27
11
0
30 Mar 2022
On Neural Network Equivalence Checking using SMT Solvers
Charis Eleftheriadis
Nikolaos Kekatos
Panagiotis Katsaros
S. Tripakis
AAML
21
12
0
22 Mar 2022
TraceNet: Tracing and Locating the Key Elements in Sentiment Analysis
Qinghua Zhao
Shuai Ma
4
0
0
28 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
35
2
0
15 Feb 2022
Are Transformers More Robust? Towards Exact Robustness Verification for Transformers
B. Liao
Chih-Hong Cheng
Hasan Esen
Alois C. Knoll
AAML
26
1
0
08 Feb 2022
Towards Controllable Agent in MOBA Games with Generative Modeling
Shubao Zhang
32
0
0
15 Dec 2021
Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling
Chen Tang
Wei Zhan
M. Tomizuka
DRL
29
19
0
01 Dec 2021
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Phil Chen
Masha Itkina
Ransalu Senanayake
Mykel J. Kochenderfer
28
6
0
27 Oct 2021
Understanding Interlocking Dynamics of Cooperative Rationalization
Mo Yu
Yang Zhang
Shiyu Chang
Tommi Jaakkola
18
41
0
26 Oct 2021
Deep Neural Networks and Tabular Data: A Survey
V. Borisov
Tobias Leemann
Kathrin Seßler
Johannes Haug
Martin Pawelczyk
Gjergji Kasneci
LMTD
27
645
0
05 Oct 2021
Trustworthy AI: From Principles to Practices
Bo-wen Li
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
117
355
0
04 Oct 2021
A Deep Learning Perspective on Connected Automated Vehicle (CAV) Cybersecurity and Threat Intelligence
M. Basnet
Mohd. Hasan Ali
24
7
0
22 Sep 2021
Identifying Autism Spectrum Disorder Based on Individual-Aware Down-Sampling and Multi-Modal Learning
Li Pan
Jundong Liu
M. Shi
C. Wong
K. Chan
22
11
0
19 Sep 2021