
Gradient Estimation with Stochastic Softmax Tricks

Neural Information Processing Systems (NeurIPS), 2020

15 June 2020

Papers citing "Gradient Estimation with Stochastic Softmax Tricks"


Geometric Algorithms for Neural Combinatorial Optimization with Constraints

220

28 Oct 2025

Beyond Softmax: A Natural Parameterization for Categorical Random Variables

A. Manenti

Cesare Alippi

BDL

153

29 Sep 2025

Information Geometry of Variational Bayes

Mohammad Emtiyaz Khan

19 Sep 2025

Going from a Representative Agent to Counterfactuals in Combinatorial Choice

Yanqiu Ruan

Karthyek Murthy

K. Natarajan

342

29 May 2025

Large (Vision) Language Models are Unsupervised In-Context Learners

International Conference on Learning Representations (ICLR), 2025

Ghazal Hosseini Mighan

Amir Zamir

Maria Brbić

VLM MLLM LRM

541

03 Apr 2025

Soft Condorcet Optimization for Ranking of General Agents

Adaptive Agents and Multi-Agent Systems (AAMAS), 2024

Roberto-Rafael Maura-Rivero

Yoram Bachrach

Anna Koop

Doina Precup

978

31 Oct 2024

End-to-end Planner Training for Language Modeling

283

16 Oct 2024

LPGD: A General Framework for Backpropagation through Embedded Optimization Layers

446

08 Jul 2024

Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model

Longrong Yang

Dong Shen

Chaoxiang Cai

Fan Yang

490

28 Jun 2024

Conditional Gumbel-Softmax for constrained feature selection with application to node selection in wireless sensor networks

Thomas Strypsteen

Alexander Bertrand

175

03 Jun 2024

End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty

M. H. Dinh

James Kotary

Ferdinando Fioretto

228

12 Feb 2024

Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization

188

28 Dec 2023

Compact Neural Graphics Primitives with Learned Hash Probing

Sanja Fidler

240

28 Dec 2023

Orchard: building large cancer phylogenies using stochastic combinatorial search

E. Kulman

R. Kuang

Q. Morris

240

21 Nov 2023

Graph Deep Learning for Time Series Forecasting

ACM Computing Surveys (ACM Comput. Surv.), 2023

500

24 Oct 2023

A Model-Agnostic Graph Neural Network for Integrating Local and Global Information

Journal of the American Statistical Association (JASA), 2023

Annie Qu

393

23 Sep 2023

Contrastive Learning for Non-Local Graphs with Multi-Resolution Structural Views

Asif Khan

Amos Storkey

147

19 Aug 2023

SynJax: Structured Probability Distributions for JAX

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Miloš Stanojević

Laurent Sartran

SyDa

343

07 Aug 2023

Efficient Learning of Discrete-Continuous Computation Graphs

Neural Information Processing Systems (NeurIPS), 2023

David Friede

Mathias Niepert

179

26 Jul 2023

Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities

Journal of Artificial Intelligence Research (JAIR), 2023

516

150

25 Jul 2023

Structured Dialogue Discourse Parsing

SIGDIAL Conferences (SIGDIAL), 2023

Ta-Chung Chi

Alexander I. Rudnicky

298

26 Jun 2023

Differentiable Clustering with Perturbed Spanning Forests

Neural Information Processing Systems (NeurIPS), 2023

Lawrence Stewart

Francis R. Bach

Felipe Llinares-López

Quentin Berthet

372

25 May 2023

Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

295

27 Mar 2023

Backpropagation of Unrolled Solvers with Folded Optimization

International Joint Conference on Artificial Intelligence (IJCAI), 2023

James Kotary

M. H. Dinh

Ferdinando Fioretto

374

28 Jan 2023

DAG Learning on the Permutahedron

International Conference on Learning Representations (ICLR), 2023

Valentina Zantedeschi

Luca Franceschi

Jean Kaddour

Matt J. Kusner

Vlad Niculae

345

27 Jan 2023

CORE: Learning Consistent Ordinal REpresentations for Image Ordinal Estimation

284

15 Jan 2023

Mirror Sinkhorn: Fast Online Optimization on Transport Polytopes

International Conference on Machine Learning (ICML), 2022

Marin Ballu

Quentin Berthet

407

18 Nov 2022

Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization

Transactions of the Association for Computational Linguistics (TACL), 2022

Thomas Effland

Michael Collins

291

17 Oct 2022

SIMPLE: A Gradient Estimator for k-Subset Sampling

International Conference on Learning Representations (ICLR), 2022

369

04 Oct 2022

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Neural Information Processing Systems (NeurIPS), 2022

367

26 Sep 2022

Structured Recognition for Generative Models with Explaining Away

Neural Information Processing Systems (NeurIPS), 2022

449

12 Sep 2022

Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models

AAAI Conference on Artificial Intelligence (AAAI), 2022

Pasquale Minervini

Luca Franceschi

Mathias Niepert

231

11 Sep 2022

Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping

AAAI Conference on Artificial Intelligence (AAAI), 2022

256

12 Aug 2022

Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions

Neural Information Processing Systems (NeurIPS), 2022

388

08 Aug 2022

Unsupervised Learning for Combinatorial Optimization with Principled Objective Relaxation

Neural Information Processing Systems (NeurIPS), 2022

473

13 Jul 2022

Ordered Subgraph Aggregation Networks

Neural Information Processing Systems (NeurIPS), 2022

441

22 Jun 2022

Training Discrete Deep Generative Models via Gapped Straight-Through Estimator

International Conference on Machine Learning (ICML), 2022

Ting-Han Fan

Ta-Chung Chi

Alexander I. Rudnicky

Peter J. Ramadge

BDL

211

15 Jun 2022

Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images

International Conference on Machine Learning (ICML), 2022

322

06 Jun 2022

Backpropagation through Combinatorial Algorithms: Identity with Projection Works

International Conference on Learning Representations (ICLR), 2022

428

30 May 2022

Sparse Graph Learning from Spatiotemporal Time Series

Journal of Machine Learning Research (JMLR), 2022

512

26 May 2022

Learning Discrete Structured Variational Auto-Encoder using Natural Evolution Strategies

International Conference on Learning Representations (ICLR), 2022

Yossi Adi

222

03 May 2022

Differentiable DAG Sampling

International Conference on Learning Representations (ICLR), 2022

Bertrand Charpentier

Simon Kibler

Stephan Günnemann

383

16 Mar 2022

Learning Group Importance using the Differentiable Hypergeometric Distribution

International Conference on Learning Representations (ICLR), 2022

473

03 Mar 2022

Scaling Structured Inference with Randomization

299

07 Dec 2021

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Neural Information Processing Systems (NeurIPS), 2021

Dmitry Vetrov

315

28 Oct 2021

Learning with Algorithmic Supervision via Continuous Relaxations

Neural Information Processing Systems (NeurIPS), 2021

253

11 Oct 2021

A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning

431

134

04 Oct 2021

Sparse Communication via Mixed Distributions

International Conference on Learning Representations (ICLR), 2021

António Farinhas

Wilker Aziz

Vlad Niculae

André F. T. Martins

226

05 Aug 2021

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

Neural Information Processing Systems (NeurIPS), 2021

513

190

07 Jun 2021

Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach

Neural Information Processing Systems (NeurIPS), 2021

Ahmed Abbas

Paul Swoboda

456

06 Jun 2021