1611.01576
Cited By
Quasi-Recurrent Neural Networks
5 November 2016
James Bradbury
Stephen Merity
Caiming Xiong
R. Socher
Papers citing "Quasi-Recurrent Neural Networks"
50 / 206 papers shown
Title
Physics-inspired Energy Transition Neural Network for Sequence Learning
Zhou Wu
Junyi An
Baile Xu
F. Shen
Jian Zhao
PINN
22
0
0
06 May 2025
Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo
Ocheme Anthony Ekle
Biswarup Das
29
0
0
24 Apr 2025
Efficient Language Modeling for Low-Resource Settings with Hybrid RNN-Transformer Architectures
Gabriel Lindenmaier
Sean Papay
Sebastian Padó
53
0
0
02 Feb 2025
Streaming Detection of Queried Event Start
Cristobal Eyzaguirre
Eric Tang
S. Buch
Adrien Gaidon
Jiajun Wu
Juan Carlos Niebles
74
0
0
04 Dec 2024
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
Yuhong Chou
Man Yao
Kexin Wang
Yuqi Pan
Ruijie Zhu
Yiran Zhong
Yu Qiao
J. Wu
Bo Xu
Guoqi Li
46
4
0
16 Nov 2024
An Effective, Robust and Fairness-aware Hate Speech Detection Framework
Guanyi Mou
Kyumin Lee
29
2
0
25 Sep 2024
Orthogonal Constrained Minimization with Tensor $\ell_{2,p}$ Regularization for HSI Denoising and Destriping
Xiaoxia Liu
Shijie Yu
Jian Lu
Xiaojun Chen
23
0
0
04 Jul 2024
The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof
Yana Veitsman
Michael Hahn
Mamba
30
7
0
27 May 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De
Samuel L. Smith
Anushan Fernando
Aleksandar Botev
George-Christian Muraru
...
David Budden
Yee Whye Teh
Razvan Pascanu
Nando de Freitas
Çağlar Gülçehre
Mamba
58
117
0
29 Feb 2024
Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints
B. Subramanian
Rathinaraja Jeyaraj
Akhrorjon Akhmadjon Ugli Rakhmonov
Jeonghong Kim
15
0
0
14 Feb 2024
Regional inflation analysis using social network data
Vasilii Chsherbakov
Ilia Karpov
14
0
0
14 Feb 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
97
78
0
01 Feb 2024
Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation
J. Hu
Roberto Cavicchioli
Giulia Berardinelli
Alessandro Capotondi
36
2
0
26 Dec 2023
Advancing State of the Art in Language Modeling
David Herel
Tomáš Mikolov
29
1
0
28 Nov 2023
Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions
Kazuki Irie
Róbert Csordás
Jürgen Schmidhuber
28
11
0
24 Oct 2023
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
Aleksandar Stanić
Dylan R. Ashley
Oleg Serikov
Louis Kirsch
Francesco Faccio
Jürgen Schmidhuber
Thomas Hofmann
Imanol Schlag
MoE
38
9
0
20 Sep 2023
Koopman Invertible Autoencoder: Leveraging Forward and Backward Dynamics for Temporal Modeling
Kshitij Tayal
Arvind Renganathan
Rahul Ghosh
X. Jia
Vipin Kumar
SyDa
AI4CE
52
5
0
19 Sep 2023
Gradient Sparsification For Masked Fine-Tuning of Transformers
J. Ó. Neill
Sourav Dutta
16
0
0
19 Jul 2023
Exploring the Promise and Limits of Real-Time Recurrent Learning
Kazuki Irie
Anand Gopalakrishnan
Jürgen Schmidhuber
19
15
0
30 May 2023
A Quantitative Review on Language Model Efficiency Research
Meng-Long Jiang
Hy Dang
Lingbo Tong
25
0
0
28 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
76
556
0
22 May 2023
Fusing Structure from Motion and Simulation-Augmented Pose Regression from Optical Flow for Challenging Indoor Environments
Felix Ott
Lucas Heublein
David Rügamer
Bernd Bischl
Christopher Mutschler
16
4
0
14 Apr 2023
Illuminati: Towards Explaining Graph Neural Networks for Cybersecurity Analysis
Haoyu He
Yuede Ji
H. H. Huang
23
20
0
26 Mar 2023
Hybrid Spectral Denoising Transformer with Guided Attention
Zeqiang Lai
C. Yan
Ying Fu
23
19
0
16 Mar 2023
Unsupervised Deep Learning for IoT Time Series
Ya Liu
Ying Zhou
Kai Yang
X. Wang
AI4TS
33
33
0
07 Feb 2023
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
24
5
0
26 Sep 2022
Streaming Intended Query Detection using E2E Modeling for Continued Conversation
Shuo-yiin Chang
Guru Prakash
Zelin Wu
Qiao Liang
Tara N. Sainath
Bo-wen Li
Adam Stambler
Shyam Upadhyay
Manaal Faruqui
Trevor Strohman
32
5
0
29 Aug 2022
VacciNet: Towards a Smart Framework for Learning the Distribution Chain Optimization of Vaccines for a Pandemic
Jayeeta Mondal
Jeet Dutta
H. Barua
OffRL
19
0
0
01 Aug 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
33
3
0
07 Jul 2022
FedorAS: Federated Architecture Search under system heterogeneity
L. Dudziak
Stefanos Laskaridis
Javier Fernandez-Marques
FedML
25
7
0
22 Jun 2022
Efficient recurrent architectures through activity sparsity and sparse back-propagation through time
Anand Subramoney
Khaleelulla Khan Nazeer
Mark Schöne
Christian Mayr
David Kappel
30
16
0
13 Jun 2022
Simple Recurrence Improves Masked Language Models
Tao Lei
Ran Tian
Jasmijn Bastings
Ankur P. Parikh
77
4
0
23 May 2022
A practical framework for multi-domain speech recognition and an instance sampling method to neural language modeling
Yike Zhang
Xiaobing Feng
Yi Y. Liu
Songjun Cao
Long Ma
16
0
0
09 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language
Francesco Fusco
Damian Pascual
Peter W. J. Staar
Diego Antognini
32
29
0
09 Feb 2022
FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding
B. Pung
Alvin Chan
9
0
0
28 Nov 2021
Capitalization and Punctuation Restoration: a Survey
V. Pais
D. Tufis
17
19
0
21 Nov 2021
Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers
Albert Gu
Isys Johnson
Karan Goel
Khaled Kamal Saab
Tri Dao
Atri Rudra
Christopher Ré
43
546
0
26 Oct 2021
Deep Learning for Bias Detection: From Inception to Deployment
M. A. Bashar
R. Nayak
Anjor Kothare
Vishal Sharma
Kesavan Kandadai
8
2
0
12 Oct 2021
Multi-axis Attentive Prediction for Sparse EventData: An Application to Crime Prediction
Yi Sui
Ga Wu
Scott Sanner
10
2
0
05 Oct 2021
Rumour Detection via Zero-shot Cross-lingual Transfer Learning
Lin Tian
Xiuzhen Zhang
Jey Han Lau
44
13
0
27 Sep 2021
An Interpretable Framework for Drug-Target Interaction with Gated Cross Attention
Yeachan Kim
Bonggun Shin
19
8
0
17 Sep 2021
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
Yang Wu
Dingheng Wang
Xiaotong Lu
Fan Yang
Guoqi Li
W. Dong
Jianbo Shi
29
18
0
30 Aug 2021
Transfer Learning for Multi-lingual Tasks -- a Survey
A. Jafari
Behnam Heidary
R. Farahbakhsh
Mostafa Salehi
Mahdi Jalili
LRM
21
5
0
28 Aug 2021
SHAQ: Single Headed Attention with Quasi-Recurrence
Nashwin Bharwani
Warren Kushner
Sangeet Dandona
Ben Schreiber
17
0
0
18 Aug 2021
Tiny Neural Models for Seq2Seq
A. Kandoor
26
0
0
07 Aug 2021
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging
Csaba Zainkó
L. Tóth
Amin Honarmandi Shandiz
G. Gosztolya
Alexandra Markó
Géza Németh
Tamás Gábor Csapó
25
4
0
26 Jul 2021
Improving Speech Recognition Accuracy of Local POI Using Geographical Models
Songjun Cao
Yike Zhang
Xiaobing Feng
Long Ma
10
3
0
07 Jul 2021
Representation based meta-learning for few-shot spoken intent recognition
Ashish R. Mittal
Samarth Bharadwaj
Shreya Khare
Saneem A. Chemmengath
Karthik Sankaranarayanan
Brian Kingsbury
18
12
0
29 Jun 2021
Stabilizing Equilibrium Models by Jacobian Regularization
Shaojie Bai
V. Koltun
J. Zico Kolter
22
57
0
28 Jun 2021