2006.08063
Cited By
Gradient Estimation with Stochastic Softmax Tricks
15 June 2020
Max B. Paulus
Dami Choi
Daniel Tarlow
Andreas Krause
Chris J. Maddison
BDL
Papers citing
"Gradient Estimation with Stochastic Softmax Tricks"
50 / 62 papers shown
Title
Large (Vision) Language Models are Unsupervised In-Context Learners
Artyom Gadetsky
Andrei Atanov
Yulun Jiang
Zhitong Gao
Ghazal Hosseini Mighan
Amir Zamir
Maria Brbić
VLM
MLLM
LRM
64
0
0
03 Apr 2025
Soft Condorcet Optimization for Ranking of General Agents
Marc Lanctot
Kate Larson
Michael Kaisers
Quentin Berthet
I. Gemp
Manfred Diaz
Roberto-Rafael Maura-Rivero
Yoram Bachrach
Anna Koop
Doina Precup
37
0
0
31 Oct 2024
End-to-end Planner Training for Language Modeling
Nathan Cornille
Florian Mai
Jingyuan Sun
Marie-Francine Moens
23
0
0
16 Oct 2024
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Anselm Paulus
Georg Martius
Vít Musil
AI4CE
47
1
0
08 Jul 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
41
2
0
28 Jun 2024
Conditional Gumbel-Softmax for constrained feature selection with application to node selection in wireless sensor networks
Thomas Strypsteen
Alexander Bertrand
21
0
0
03 Jun 2024
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty
M. H. Dinh
James Kotary
Ferdinando Fioretto
20
0
0
12 Feb 2024
Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization
James Kotary
Jacob K Christopher
M. H. Dinh
Ferdinando Fioretto
6
0
0
28 Dec 2023
Compact Neural Graphics Primitives with Learned Hash Probing
Towaki Takikawa
Thomas Müller
Merlin Nimier-David
Alex Evans
Sanja Fidler
Alec Jacobson
Alexander Keller
19
18
0
28 Dec 2023
Orchard: building large cancer phylogenies using stochastic combinatorial search
E. Kulman
R. Kuang
Q. Morris
17
4
0
21 Nov 2023
Graph Deep Learning for Time Series Forecasting
Andrea Cini
Ivan Marisca
Daniele Zambon
C. Alippi
AI4TS
AI4CE
24
14
0
24 Oct 2023
A Model-Agnostic Graph Neural Network for Integrating Local and Global Information
Wenzhuo Zhou
Annie Qu
Keiland W Cooper
Norbert Fortin
B. Shahbaba
23
1
0
23 Sep 2023
Contrastive Learning for Non-Local Graphs with Multi-Resolution Structural Views
Asif Khan
Amos Storkey
17
1
0
19 Aug 2023
SynJax: Structured Probability Distributions for JAX
Miloš Stanojević
Laurent Sartran
SyDa
13
4
0
07 Aug 2023
Efficient Learning of Discrete-Continuous Computation Graphs
David Friede
Mathias Niepert
13
3
0
26 Jul 2023
Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities
Jayanta Mandi
James Kotary
Senne Berden
Maxime Mulamba
Víctor Bucarey
Tias Guns
Ferdinando Fioretto
AI4CE
24
54
0
25 Jul 2023
Structured Dialogue Discourse Parsing
Ta-Chung Chi
Alexander I. Rudnicky
20
11
0
26 Jun 2023
Differentiable Clustering with Perturbed Spanning Forests
Lawrence Stewart
Francis R. Bach
Felipe Llinares-López
Quentin Berthet
21
8
0
25 May 2023
Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining
Nicholas Monath
Manzil Zaheer
Kelsey R. Allen
Andrew McCallum
22
6
0
27 Mar 2023
Backpropagation of Unrolled Solvers with Folded Optimization
James Kotary
M. H. Dinh
Ferdinando Fioretto
16
14
0
28 Jan 2023
DAG Learning on the Permutahedron
Valentina Zantedeschi
Luca Franceschi
Jean Kaddour
Matt J. Kusner
Vlad Niculae
22
11
0
27 Jan 2023
CORE: Learning Consistent Ordinal REpresentations for Image Ordinal Estimation
Yiming Lei
Zilong Li
Yangyang Li
Junping Zhang
Hongming Shan
6
3
0
15 Jan 2023
Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes
Marin Ballu
Quentin Berthet
17
7
0
18 Nov 2022
Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization
Thomas Effland
Michael Collins
23
6
0
17 Oct 2022
SIMPLE: A Gradient Estimator for k-Subset Sampling
Kareem Ahmed
Zhe Zeng
Mathias Niepert
Guy Van den Broeck
BDL
21
24
0
04 Oct 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Đorđe Miladinovic
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
21
5
0
26 Sep 2022
Structured Recognition for Generative Models with Explaining Away
Changmin Yu
Hugo Soulat
Neil Burgess
M. Sahani
CML
BDL
19
3
0
12 Sep 2022
Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models
Pasquale Minervini
Luca Franceschi
Mathias Niepert
38
11
0
11 Sep 2022
Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping
Russell Z. Kunes
Mingzhang Yin
Max Land
D. Haviv
D. Pe’er
Simon Tavaré
BDL
6
2
0
12 Aug 2022
Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions
Nikolaos Karalias
Joshua Robinson
Andreas Loukas
Stefanie Jegelka
25
8
0
08 Aug 2022
Unsupervised Learning for Combinatorial Optimization with Principled Objective Relaxation
Haoyu Wang
Nan Wu
Hang Yang
Cong Hao
Pan Li
6
29
0
13 Jul 2022
Ordered Subgraph Aggregation Networks
Chao Qian
Gaurav Rattan
Floris Geerts
Christopher Morris
Mathias Niepert
22
56
0
22 Jun 2022
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Ting-Han Fan
Ta-Chung Chi
Alexander I. Rudnicky
Peter J. Ramadge
BDL
14
7
0
15 Jun 2022
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
Tom Ron
M. Weiler-Sagie
Tamir Hazan
FAtt
MedIm
16
6
0
06 Jun 2022
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
Subham S. Sahoo
Anselm Paulus
Marin Vlastelica
Vít Musil
Volodymyr Kuleshov
Georg Martius
17
20
0
30 May 2022
Sparse Graph Learning from Spatiotemporal Time Series
Andrea Cini
Daniele Zambon
C. Alippi
CML
AI4TS
35
18
0
26 May 2022
Learning Discrete Structured Variational Auto-Encoder using Natural Evolution Strategies
Alon Berliner
Guy Rotman
Yossi Adi
Roi Reichart
Tamir Hazan
BDL
DRL
24
4
0
03 May 2022
Differentiable DAG Sampling
Bertrand Charpentier
Simon Kibler
Stephan Günnemann
12
41
0
16 Mar 2022
Learning Group Importance using the Differentiable Hypergeometric Distribution
Thomas M. Sutter
Laura Manduchi
Alain Ryser
Julia E. Vogt
34
7
0
03 Mar 2022
Scaling Structured Inference with Randomization
Yao Fu
John P. Cunningham
Mirella Lapata
BDL
19
2
0
07 Dec 2021
Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces
Kirill Struminsky
Artyom Gadetsky
D. Rakitin
Danil Karpushkin
Dmitry Vetrov
BDL
11
9
0
28 Oct 2021
Learning with Algorithmic Supervision via Continuous Relaxations
Felix Petersen
Christian Borgelt
Hilde Kuehne
Oliver Deussen
CLL
27
25
0
11 Oct 2021
A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning
Iris A. M. Huijben
W. Kool
Max B. Paulus
Ruud J. G. van Sloun
16
92
0
04 Oct 2021
Sparse Communication via Mixed Distributions
António Farinhas
Wilker Aziz
Vlad Niculae
André F. T. Martins
15
3
0
05 Aug 2021
DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
Hussein Hazimeh
Zhe Zhao
Aakanksha Chowdhery
M. Sathiamoorthy
Yihua Chen
Rahul Mazumder
Lichan Hong
Ed H. Chi
MoE
14
138
0
07 Jun 2021
Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach
Ahmed Abbas
Paul Swoboda
11
14
0
06 Jun 2021
Stochastic Iterative Graph Matching
Linfeng Liu
M. C. Hughes
S. Hassoun
Liping Liu
11
15
0
04 Jun 2021
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions
Mathias Niepert
Pasquale Minervini
Luca Franceschi
21
81
0
03 Jun 2021
Learning to Extend Program Graphs to Work-in-Progress Code
Xuechen Li
Chris J. Maddison
Daniel Tarlow
11
2
0
28 May 2021
Reconciling the Discrete-Continuous Divide: Towards a Mathematical Theory of Sparse Communication
André F. T. Martins
11
1
0
01 Apr 2021