1612.08083
Language Modeling with Gated Convolutional Networks
23 December 2016
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
Papers citing
"Language Modeling with Gated Convolutional Networks"
50 / 915 papers shown
Arabic Automatic Story Generation with Large Language Models
Ahmed Oumar El-Shangiti
Fakhraddin Alwajih
Muhammad Abdul-Mageed
21
0
0
10 Jul 2024
On the Power of Convolution Augmented Transformer
Mingchen Li
Xuechen Zhang
Yixiao Huang
Samet Oymak
40
1
0
08 Jul 2024
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
Ruochen Wang
Sohyun An
Minhao Cheng
Tianyi Zhou
Sung Ju Hwang
Cho-Jui Hsieh
44
7
0
28 Jun 2024
From Efficient Multimodal Models to World Models: A Survey
Xinji Mai
Zeng Tao
Junxiong Lin
Haoran Wang
Yang Chang
Yanlan Kang
Yan Wang
Wenqiang Zhang
34
5
0
27 Jun 2024
SigKAN: Signature-Weighted Kolmogorov-Arnold Networks for Time Series
Hugo Inzirillo
Remi Genet
29
12
0
25 Jun 2024
On the consistency of hyper-parameter selection in value-based deep reinforcement learning
J. Obando-Ceron
J. G. Araújo
Rameswar Panda
Pablo Samuel Castro
48
7
0
25 Jun 2024
Detecting Frames in News Headlines and Lead Images in U.S. Gun Violence Coverage
Isidora Chara Tourni
Lei Guo
Hengchang Hu
Edward Edberg Halim
Prakash Ishwar
...
Boqi Chen
Margrit Betke
Fabian Zhafransyah
Sha Lai
Derry Wijaya
39
19
0
25 Jun 2024
RouteFinder: Towards Foundation Models for Vehicle Routing Problems
Federico Berto
Chuanbo Hua
Nayeli Gast Zepeda
André Hottung
N. Wouda
Leon Lan
Kevin Tierney
J. Park
Jinkyoo Park
61
10
0
21 Jun 2024
Informed along the road: roadway capacity driven graph convolution network for network-wide traffic prediction
Zilin Bian
Jingqin Gao
K. Ozbay
Fan Zuo
Dachuan Zuo
Zhenning Li
GNN
31
0
0
18 Jun 2024
Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach
Zilin Bian
Jingqin Gao
K. Ozbay
Zhenning Li
22
0
0
18 Jun 2024
MCSD: An Efficient Language Model with Diverse Fusion
Hua Yang
Duohai Li
Shiman Li
35
2
0
18 Jun 2024
GEB-1.3B: Open Lightweight Large Language Model
Jie Wu
Yufeng Zhu
Lei Shen
Xuqing Lu
ALM
37
0
0
14 Jun 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Qihao Liu
Zhanpeng Zeng
Ju He
Qihang Yu
Xiaohui Shen
Liang-Chieh Chen
53
21
0
13 Jun 2024
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You
Yichao Fu
Zheng Wang
Amir Yazdanbakhsh
Yingyan Celine Lin
48
2
0
11 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
77
57
0
11 Jun 2024
Mamba YOLO: SSMs-Based YOLO For Object Detection
Zeyu Wang
Chen Li
Huiying Xu
Xinzhong Zhu
Mamba
55
2
0
09 Jun 2024
Hidden Holes: topological aspects of language models
Stephen Fitz
P. Romero
Jiyan Jonas Schneider
43
0
0
09 Jun 2024
LoCoCo: Dropping In Convolutions for Long Context Compression
Ruisi Cai
Yuandong Tian
Zhangyang Wang
Beidi Chen
49
10
0
08 Jun 2024
Weight-based Decomposition: A Case for Bilinear MLPs
Michael T. Pearce
Thomas Dooms
Alice Rigg
42
1
0
06 Jun 2024
Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism
Chang Zong
Jian Shao
Weiming Lu
Yueting Zhuang
46
2
0
06 Jun 2024
ConPCO: Preserving Phoneme Characteristics for Automatic Pronunciation Assessment Leveraging Contrastive Ordinal Regularization
Bi-Cheng Yan
Wei-Cheng Chao
Jiun-Ting Li
Yi-Cheng Wang
Hsin-Wei Wang
Meng-Shin Lin
Berlin Chen
23
0
0
05 Jun 2024
Temporal Graph Learning Recurrent Neural Network for Traffic Forecasting
Sanghyun Lee
Chanyoung Park
AI4TS
44
0
0
04 Jun 2024
Scalable MatMul-free Language Modeling
Rui-Jie Zhu
Yu Zhang
Ethan Sifferman
Tyler Sheaves
Yiqiao Wang
Dustin Richmond
P. Zhou
Jason Eshraghian
31
17
0
04 Jun 2024
A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting
Remi Genet
Hugo Inzirillo
AI4TS
34
43
0
04 Jun 2024
Recurrent neural networks: vanishing and exploding gradients are not the end of the story
Nicolas Zucchet
Antonio Orvieto
ODL
AAML
45
9
0
31 May 2024
Length independent generalization bounds for deep SSM architectures
Dániel Rácz
Mihaly Petreczky
Bálint Daróczy
44
1
0
30 May 2024
The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof
Derek Lim
Moe Putterman
Robin Walters
Haggai Maron
Stefanie Jegelka
53
5
0
30 May 2024
The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof
Yana Veitsman
Michael Hahn
Mamba
38
8
0
27 May 2024
Disentangling and Integrating Relational and Sensory Information in Transformer Architectures
Awni Altabaa
John Lafferty
37
3
0
26 May 2024
Expanded Gating Ranges Improve Activation Functions
Allen Hao Huang
AI4CE
29
1
0
25 May 2024
Activator: GLU Activation Function as the Core Component of a Vision Transformer
Abdullah Nazhat Abdullah
Tarkan Aydin
ViT
43
0
0
24 May 2024
Dynamic Context Adaptation and Information Flow Control in Transformers: Introducing the Evaluator Adjuster Unit and Gated Residual Connections
Sahil Rajesh Dhayalkar
24
1
0
22 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
50
50
0
13 May 2024
State-Free Inference of State-Space Models: The Transfer Function Approach
Rom N. Parnichkun
Stefano Massaroli
Alessandro Moro
Jimmy T.H. Smith
Ramin Hasani
...
Hajime Asama
Stefano Ermon
Taiji Suzuki
Atsushi Yamashita
Michael Poli
44
5
0
10 May 2024
Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models
Zhengxing Lan
Hongbo Li
Lingshan Liu
Bo Fan
Yisheng Lv
Yilong Ren
Zhiyong Cui
47
16
0
08 May 2024
HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech
Zhongren Dong
Zixing Zhang
Weixiang Xu
Jing Han
Jianjun Ou
Björn W. Schuller
40
2
0
07 May 2024
Dependency-Aware Semi-Structured Sparsity: Declining Roles of Outliers in Pruning GLU-based LLMs
Zhiyu Guo
Hidetaka Kamigaito
Taro Watanabe
32
0
0
03 May 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
Yucheng Hu
Yuxing Lu
RALM
60
18
0
30 Apr 2024
Scalable Event-by-event Processing of Neuromorphic Sensory Signals With Deep State-Space Models
Mark Schöne
Neeraj Mohan Sushma
Jingyue Zhuge
Christian Mayr
Anand Subramoney
David Kappel
AI4TS
BDL
48
9
0
29 Apr 2024
EvaNet: Elevation-Guided Flood Extent Mapping on Earth Imagery
Mirza Tanzim Sami
Da Yan
Saugat Adhikari
Lyuheng Yuan
Jiao Han
Zhe Jiang
Jalal Khalil
Yang Zhou
23
1
0
27 Apr 2024
A Cognitive-Driven Trajectory Prediction Model for Autonomous Driving in Mixed Autonomy Environment
Haicheng Liao
Zhenning Li
Chengyue Wang
Bonan Wang
Hanlin Kong
Yanchen Guan
Guofa Li
Zhiyong Cui
Chengzhong Xu
21
8
0
26 Apr 2024
Improving Dictionary Learning with Gated Sparse Autoencoders
Senthooran Rajamanoharan
Arthur Conmy
Lewis Smith
Tom Lieberum
Vikrant Varma
János Kramár
Rohin Shah
Neel Nanda
RALM
37
79
0
24 Apr 2024
A Unified Replay-based Continuous Learning Framework for Spatio-Temporal Prediction on Streaming Data
Hao Miao
Yan Zhao
Chenjuan Guo
Bin Yang
Kai Zheng
Feiteng Huang
Jiandong Xie
Christian S. Jensen
AI4TS
CLL
44
31
0
23 Apr 2024
HumMUSS: Human Motion Understanding using State Space Models
Arnab Kumar Mondal
Stefano Alletto
Denis Tome
39
4
0
16 Apr 2024
Gull: A Generative Multifunctional Audio Codec
Yi Luo
Jianwei Yu
Hangting Chen
Rongzhi Gu
Chao Weng
AuLLM
41
3
0
07 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
3DV
VLM
47
25
0
02 Apr 2024
Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
Yang Ai
Zhenhua Ling
34
3
0
26 Mar 2024
FedMIL: Federated-Multiple Instance Learning for Video Analysis with Optimized DPP Scheduling
Ashish Bastola
Hao Wang
Xiwen Chen
Abolfazl Razi
31
0
0
26 Mar 2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
Mohammad Mahmudul Alam
Edward Raff
Stella Biderman
Tim Oates
James Holt
AAML
38
3
0
23 Mar 2024
Model order reduction of deep structured state-space models: A system-theoretic approach
Marco Forgione
Manas Mejari
Dario Piga
31
1
0
21 Mar 2024