Gradient Estimation with Stochastic Softmax Tricks

Neural Information Processing Systems (NeurIPS), 2020
15 June 2020
Max B. Paulus
Dami Choi
Daniel Tarlow
Andreas Krause
Chris J. Maddison
    BDL

Papers citing "Gradient Estimation with Stochastic Softmax Tricks"

50 / 65 papers shown
Geometric Algorithms for Neural Combinatorial Optimization with Constraints
Nikolaos Karalias
Akbar Rafiey
Yifei Xu
Zhishang Luo
B. Tahmasebi
Connie Jiang
Stefanie Jegelka
220
1
0
28 Oct 2025
Beyond Softmax: A Natural Parameterization for Categorical Random Variables
A. Manenti
Cesare Alippi
BDL
153
0
0
29 Sep 2025
Information Geometry of Variational Bayes
Mohammad Emtiyaz Khan
94
1
0
19 Sep 2025
Going from a Representative Agent to Counterfactuals in Combinatorial Choice
Yanqiu Ruan
Karthyek Murthy
K. Natarajan
342
0
0
29 May 2025
Large (Vision) Language Models are Unsupervised In-Context Learners
International Conference on Learning Representations (ICLR), 2025
Artyom Gadetsky
Andrei Atanov
Yulun Jiang
Zhitong Gao
Ghazal Hosseini Mighan
Amir Zamir
Maria Brbić
VLM MLLM LRM
541
4
0
03 Apr 2025
Soft Condorcet Optimization for Ranking of General Agents
Adaptive Agents and Multi-Agent Systems (AAMAS), 2024
Marc Lanctot
Kate Larson
Michael Kaisers
Quentin Berthet
I. Gemp
Manfred Diaz
Roberto-Rafael Maura-Rivero
Yoram Bachrach
Anna Koop
Doina Precup
978
3
0
31 Oct 2024
End-to-end Planner Training for Language Modeling
Nathan Cornille
Florian Mai
Jingyuan Sun
Marie-Francine Moens
283
0
0
16 Oct 2024
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Anselm Paulus
Georg Martius
Vít Musil
AI4CE
446
5
0
08 Jul 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Tingting Gao
Xi Li
MoE
490
8
0
28 Jun 2024
Conditional Gumbel-Softmax for constrained feature selection with application to node selection in wireless sensor networks
Thomas Strypsteen
Alexander Bertrand
175
2
0
03 Jun 2024
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty
M. H. Dinh
James Kotary
Ferdinando Fioretto
228
2
0
12 Feb 2024
Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization
James Kotary
Jacob K Christopher
M. H. Dinh
Ferdinando Fioretto
188
0
0
28 Dec 2023
Compact Neural Graphics Primitives with Learned Hash Probing
Towaki Takikawa
Thomas Müller
Merlin Nimier-David
Alex Evans
Sanja Fidler
Alec Jacobson
Alexander Keller
240
32
0
28 Dec 2023
Orchard: building large cancer phylogenies using stochastic combinatorial search
E. Kulman
R. Kuang
Q. Morris
240
8
0
21 Nov 2023
Graph Deep Learning for Time Series Forecasting
ACM Computing Surveys (ACM Comput. Surv.), 2023
Andrea Cini
Ivan Marisca
Daniele Zambon
Cesare Alippi
AI4TS AI4CE
500
37
0
24 Oct 2023
A Model-Agnostic Graph Neural Network for Integrating Local and Global Information
Journal of the American Statistical Association (JASA), 2023
Wenzhuo Zhou
Annie Qu
Keiland W Cooper
Norbert Fortin
Babak Shahbaba
393
4
0
23 Sep 2023
Contrastive Learning for Non-Local Graphs with Multi-Resolution Structural Views
Asif Khan
Amos Storkey
147
2
0
19 Aug 2023
SynJax: Structured Probability Distributions for JAX
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Miloš Stanojević
Laurent Sartran
SyDa
343
4
0
07 Aug 2023
Efficient Learning of Discrete-Continuous Computation Graphs
Neural Information Processing Systems (NeurIPS), 2023
David Friede
Mathias Niepert
179
3
0
26 Jul 2023
Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities
Journal of Artificial Intelligence Research (JAIR), 2023
Jayanta Mandi
James Kotary
Senne Berden
Maxime Mulamba
Víctor Bucarey
Tias Guns
Ferdinando Fioretto
AI4CE
516
150
0
25 Jul 2023
Structured Dialogue Discourse Parsing
SIGDIAL Conferences (SIGDIAL), 2023
Ta-Chung Chi
Alexander I. Rudnicky
298
14
0
26 Jun 2023
Differentiable Clustering with Perturbed Spanning Forests
Neural Information Processing Systems (NeurIPS), 2023
Lawrence Stewart
Francis R. Bach
Felipe Llinares-López
Quentin Berthet
372
14
0
25 May 2023
Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Nicholas Monath
Manzil Zaheer
Kelsey R. Allen
Andrew McCallum
295
7
0
27 Mar 2023
Backpropagation of Unrolled Solvers with Folded Optimization
International Joint Conference on Artificial Intelligence (IJCAI), 2023
James Kotary
M. H. Dinh
Ferdinando Fioretto
374
20
0
28 Jan 2023
DAG Learning on the Permutahedron
International Conference on Learning Representations (ICLR), 2023
Valentina Zantedeschi
Luca Franceschi
Jean Kaddour
Matt J. Kusner
Vlad Niculae
345
12
0
27 Jan 2023
CORE: Learning Consistent Ordinal REpresentations for Image Ordinal Estimation
Yiming Lei
Zilong Li
Yangyang Li
Junping Zhang
Hongming Shan
284
3
0
15 Jan 2023
Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes
International Conference on Machine Learning (ICML), 2022
Marin Ballu
Quentin Berthet
407
9
0
18 Nov 2022
Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization
Transactions of the Association for Computational Linguistics (TACL), 2022
Thomas Effland
Michael Collins
291
9
0
17 Oct 2022
SIMPLE: A Gradient Estimator for $k$-Subset Sampling
International Conference on Learning Representations (ICLR), 2022
Kareem Ahmed
Zhe Zeng
Mathias Niepert
Karen Ullrich
BDL
369
35
0
04 Oct 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Neural Information Processing Systems (NeurIPS), 2022
Đorđe Miladinovic
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
367
5
0
26 Sep 2022
Structured Recognition for Generative Models with Explaining Away
Neural Information Processing Systems (NeurIPS), 2022
Changmin Yu
Hugo Soulat
Neil Burgess
M. Sahani
CML BDL
449
3
0
12 Sep 2022
Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models
AAAI Conference on Artificial Intelligence (AAAI), 2022
Pasquale Minervini
Luca Franceschi
Mathias Niepert
231
16
0
11 Sep 2022
Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping
AAAI Conference on Artificial Intelligence (AAAI), 2022
Russell Z. Kunes
Mingzhang Yin
Max Land
Doron Haviv
Dana Pe'er
Simon Tavaré
BDL
256
5
0
12 Aug 2022
Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions
Neural Information Processing Systems (NeurIPS), 2022
Nikolaos Karalias
Joshua Robinson
Andreas Loukas
Stefanie Jegelka
388
14
0
08 Aug 2022
Unsupervised Learning for Combinatorial Optimization with Principled Objective Relaxation
Neural Information Processing Systems (NeurIPS), 2022
Haoyu Wang
Nan Wu
Hang Yang
Cong Hao
Pan Li
473
43
0
13 Jul 2022
Ordered Subgraph Aggregation Networks
Neural Information Processing Systems (NeurIPS), 2022
Chao Qian
Gaurav Rattan
Floris Geerts
Christopher Morris
Mathias Niepert
441
75
0
22 Jun 2022
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
International Conference on Machine Learning (ICML), 2022
Ting-Han Fan
Ta-Chung Chi
Alexander I. Rudnicky
Peter J. Ramadge
BDL
211
9
0
15 Jun 2022
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
International Conference on Machine Learning (ICML), 2022
Tom Ron
M. Weiler-Sagie
Tamir Hazan
FAtt MedIm
322
7
0
06 Jun 2022
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
International Conference on Learning Representations (ICLR), 2022
Subham S. Sahoo
Anselm Paulus
Marin Vlastelica
Vít Musil
Volodymyr Kuleshov
Georg Martius
428
33
0
30 May 2022
Sparse Graph Learning from Spatiotemporal Time Series
Journal of Machine Learning Research (JMLR), 2022
Andrea Cini
Daniele Zambon
Cesare Alippi
CML AI4TS
512
30
0
26 May 2022
Learning Discrete Structured Variational Auto-Encoder using Natural Evolution Strategies
International Conference on Learning Representations (ICLR), 2022
Alon Berliner
Guy Rotman
Yossi Adi
Roi Reichart
Tamir Hazan
BDL DRL
222
5
0
03 May 2022
Differentiable DAG Sampling
International Conference on Learning Representations (ICLR), 2022
Bertrand Charpentier
Simon Kibler
Stephan Günnemann
383
50
0
16 Mar 2022
Learning Group Importance using the Differentiable Hypergeometric Distribution
International Conference on Learning Representations (ICLR), 2022
Thomas M. Sutter
Laura Manduchi
Alain Ryser
Julia E. Vogt
473
8
0
03 Mar 2022
Scaling Structured Inference with Randomization
Yao Fu
John P. Cunningham
Mirella Lapata
BDL
299
2
0
07 Dec 2021
Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces
Neural Information Processing Systems (NeurIPS), 2021
Kirill Struminsky
Artyom Gadetsky
D. Rakitin
Danil Karpushkin
Dmitry Vetrov
BDL
315
10
0
28 Oct 2021
Learning with Algorithmic Supervision via Continuous Relaxations
Neural Information Processing Systems (NeurIPS), 2021
Felix Petersen
Christian Borgelt
Hilde Kuehne
Oliver Deussen
CLL
253
33
0
11 Oct 2021
A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning
Iris A. M. Huijben
W. Kool
Max B. Paulus
Ruud J. G. van Sloun
431
134
0
04 Oct 2021
Sparse Communication via Mixed Distributions
International Conference on Learning Representations (ICLR), 2021
António Farinhas
Wilker Aziz
Vlad Niculae
André F. T. Martins
226
3
0
05 Aug 2021
DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
Neural Information Processing Systems (NeurIPS), 2021
Hussein Hazimeh
Zhe Zhao
Aakanksha Chowdhery
M. Sathiamoorthy
Yihua Chen
Rahul Mazumder
Lichan Hong
Ed H. Chi
MoE
513
190
0
07 Jun 2021
Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach
Neural Information Processing Systems (NeurIPS), 2021
Ahmed Abbas
Paul Swoboda
456
16
0
06 Jun 2021