Sigsoftmax: Reanalysis of the Softmax Bottleneck

28 May 2018
Sekitoshi Kanai
Yasuhiro Fujiwara
Yuki Yamanaka
S. Adachi

Papers citing "Sigsoftmax: Reanalysis of the Softmax Bottleneck"

24 / 24 papers shown
Adaptive Sparse Softmax: An Effective and Efficient Softmax Variant
Qi Lv
Lei Geng
Ziqiang Cao
Min Cao
Sujian Li
Wenjie Li
Guohong Fu
36
0
0
05 Aug 2025
AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism
Zhepei Wei
Wei-Lin Chen
Xinyu Zhu
Yu Meng
OffRL
189
2
0
04 Jun 2025
Design of Restricted Normalizing Flow towards Arbitrary Stochastic Policy with Computational Efficiency
Taisuke Kobayashi
Takumi Aotani
235
5
0
17 Dec 2024
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
117
15
0
11 Apr 2024
Using Sequences of Life-events to Predict Human Lives
Germans Savcisens
Tina Eliassi-Rad
L. K. Hansen
L. Mortensen
Lau Lilleholt
Anna Rogers
Ingo Zettler
Sune Lehmann
AI4TS
122
59
0
05 Jun 2023
HistAlign: Improving Context Dependency in Language Generation by Aligning with History
David Wan
Shiyue Zhang
Joey Tianyi Zhou
AI4TS
128
7
0
08 May 2023
Enhancing Classifier Conservativeness and Robustness by Polynomiality
Ziqi Wang
Marco Loog
AAML
70
3
0
23 Mar 2022
Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice
Andreas Grivas
Nikolay Bogoychev
Adam Lopez
101
13
0
12 Mar 2022
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Phil Chen
Masha Itkina
Ransalu Senanayake
Mykel J. Kochenderfer
85
8
0
27 Oct 2021
Breaking the Softmax Bottleneck for Sequential Recommender Systems with Dropout and Decoupling
Yi Lin
BDL
54
3
0
11 Oct 2021
Recognition Awareness: An Application of Latent Cognizance to Open-Set Recognition
Tatpong Katanyukul
Pisit Nakjai
BDL
155
1
0
27 Aug 2021
Decision Machines: An Extension of Decision Trees
Jinxiong Zhang
OffRL
123
0
0
27 Jan 2021
Yet Another Representation of Binary Decision Trees: A Mathematical Demonstration
Jinxiong Zhang
222
2
0
18 Jan 2021
Effectiveness of MPC-friendly Softmax Replacement
Marcel Keller
Ke Sun
59
10
0
23 Nov 2020
The Two-Pass Softmax Algorithm
Marat Dukhan
Artsiom Ablavatski
TPM
63
8
0
13 Jan 2020
Softmax-based Classification is k-means Clustering: Formal Proof, Consequences for Adversarial Attacks, and Improvement through Centroid Based Tailoring
Sibylle Hess
W. Duivesteijn
Decebal Constantin Mocanu
84
13
0
07 Jan 2020
Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification
Yingbo Gao
Christian Herold
Weiyue Wang
Hermann Ney
107
4
0
28 Oct 2019
Deep Complex Networks for Protocol-Agnostic Radio Frequency Device Fingerprinting in the Wild
Ioannis Agadakos
Nikolaos Agadakos
Jason Polakis
Mohamed R. Amer
82
18
0
18 Sep 2019
Extracting and Learning a Dependency-Enhanced Type Lexicon for Dutch
Konstantinos Kogkalidis
50
0
0
06 Sep 2019
Improving Neural Language Modeling via Adversarial Training
Dilin Wang
Chengyue Gong
Qiang Liu
AAML
186
119
0
10 Jun 2019
Constructive Type-Logical Supertagging with Self-Attention Networks
Konstantinos Kogkalidis
M. Moortgat
Tejaswini Deoskar
NAI
46
14
0
31 May 2019
Deep Residual Output Layers for Neural Language Generation
Nikolaos Pappas
James Henderson
134
7
0
14 May 2019
Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities
O. Ganea
Sylvain Gelly
Gary Bécigneul
Aliaksei Severyn
102
20
0
21 Feb 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
438
3,898
0
09 Jan 2019