Language Modeling with Gated Convolutional Networks

23 December 2016
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier

Papers citing "Language Modeling with Gated Convolutional Networks"

50 / 915 papers shown
Notochord: a Flexible Probabilistic Model for Real-Time MIDI Performance
Victor Shepardson
Jack Armitage
Thor Magnusson
23
2
0
18 Mar 2024
Bridging Expert Knowledge with Deep Learning Techniques for Just-In-Time Defect Prediction
Xin Zhou
Donggyun Han
David Lo
VLM
35
2
0
17 Mar 2024
Search-based Ordered Password Generation of Autoregressive Neural Networks
Min Jin
Junbin Ye
Rongxuan Shen
Huaxing Lu
AI4TS
16
0
0
15 Mar 2024
SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces
Yuta Oshima
Shohei Taniguchi
Masahiro Suzuki
Yutaka Matsuo
45
2
0
12 Mar 2024
Mastering Memory Tasks with World Models
Mohammad Reza Samsami
Artem Zholus
Janarthanan Rajendran
Sarath Chandar
CLL
OffRL
34
23
0
07 Mar 2024
Multi-Level Attention Aggregation for Language-Agnostic Speaker Replication
Yejin Jeon
Gary Geunbae Lee
31
2
0
06 Mar 2024
NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function
Abdullah Nazhat Abdullah
Tarkan Aydin
39
0
0
04 Mar 2024
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement
Ravi Shankar
Ke Tan
Buye Xu
Anurag Kumar
36
0
0
03 Mar 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De
Samuel L. Smith
Anushan Fernando
Aleksandar Botev
George-Christian Muraru
...
David Budden
Yee Whye Teh
Razvan Pascanu
Nando de Freitas
Çağlar Gülçehre
Mamba
61
117
0
29 Feb 2024
Effective Two-Stage Knowledge Transfer for Multi-Entity Cross-Domain Recommendation
Jianyu Guan
Zongming Yin
Tianyi Zhang
Leihui Chen
Yin Zhang
Fei Huang
Jufeng Chen
Shuguang Han
24
1
0
29 Feb 2024
Parallelized Spatiotemporal Binding
Gautam Singh
Yue Wang
Jiawei Yang
Boris Ivanovic
Sungjin Ahn
Marco Pavone
Tong Che
48
1
0
26 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
43
77
0
22 Feb 2024
Improving Language Understanding from Screenshots
Tianyu Gao
Zirui Wang
Adithya Bhaskar
Danqi Chen
VLM
43
10
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
53
25
0
21 Feb 2024
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
46
1
0
20 Feb 2024
Can Transformers Predict Vibrations?
Fusataka Kuniyoshi
Yoshihide Sawada
27
0
0
16 Feb 2024
Multimodal Clinical Trial Outcome Prediction with Large Language Models
Wenhao Zheng
Dongsheng Peng
Hongxia Xu
Yun Li
Hongtu Zhu
Tianfan Fu
Huaxiu Yao
52
5
0
09 Feb 2024
Learning Structure-Aware Representations of Dependent Types
Konstantinos Kogkalidis
Orestis Melkonian
Jean-Philippe Bernardy
NAI
34
1
0
03 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
41
1
0
01 Feb 2024
HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction
Harvie Zhang
39
0
0
31 Jan 2024
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition
Lei Liu
Li Liu
Haizhou Li
29
6
0
31 Jan 2024
AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition
Ju Lin
Niko Moritz
Yiteng Huang
Ruiming Xie
Ming Sun
Christian Fuegen
Frank Seide
32
4
0
18 Jan 2024
Efficient Image Deblurring Networks based on Diffusion Models
Kang Chen
Yuanjie Liu
DiffM
16
2
0
11 Jan 2024
A Primer on Temporal Graph Learning
Aniq Ur Rahman
J. Coon
AI4CE
42
1
0
08 Jan 2024
TeleChat Technical Report
Zhongjiang He
Zihan Wang
Xinzhan Liu
Shixuan Liu
Yitong Yao
...
Zilu Huang
Sishi Xiong
Yuxiang Zhang
Chao Wang
Shuangyong Song
AI4MH
LRM
ALM
66
3
0
08 Jan 2024
TinyLlama: An Open-Source Small Language Model
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
ALM
LRM
54
361
0
04 Jan 2024
Deep-ELA: Deep Exploratory Landscape Analysis with Self-Supervised Pretrained Transformers for Single- and Multi-Objective Continuous Optimization Problems
M. Seiler
P. Kerschke
Heike Trautmann
13
6
0
02 Jan 2024
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
Jacob P. Portes
Alex Trott
Sam Havens
Daniel King
Abhinav Venigalla
Moin Nadeem
Nikhil Sardana
D. Khudia
Jonathan Frankle
26
17
0
29 Dec 2023
A bi-objective ε-constrained framework for quality-cost optimization in language model ensembles
Aditya Singh
Aditi Singla
Kanishk Kukreja
13
0
0
26 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
73
77
0
23 Dec 2023
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation
Shengkui Zhao
Yukun Ma
Chongjia Ni
Chong Zhang
Hao Wang
Trung Hieu Nguyen
Kun Zhou
J. Yip
Dianwen Ng
Bin Ma
36
23
0
19 Dec 2023
Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction
Yuanbo Hou
Qiaoqiao Ren
Siyang Song
Yuxin Song
Wenwu Wang
Dick Botteldooren
42
1
0
15 Dec 2023
Entropy Causal Graphs for Multivariate Time Series Anomaly Detection
F. Febrinanto
Kristen Moore
Chandra Thapa
Mujie Liu
Vidya Saikrishna
Jiangang Ma
Feng Xia
CML
25
2
0
15 Dec 2023
Learning Long Sequences in Spiking Neural Networks
Matei Ioan Stan
Oliver Rhodes
37
11
0
14 Dec 2023
Detecting Voice Cloning Attacks via Timbre Watermarking
Chang-rui Liu
Jie Zhang
Tianwei Zhang
Xi Yang
Weiming Zhang
Neng H. Yu
33
29
0
06 Dec 2023
Leveraging Laryngograph Data for Robust Voicing Detection in Speech
Yixuan Zhang
Heming Wang
DeLiang Wang
32
0
0
05 Dec 2023
Recurrent Distance Filtering for Graph Representation Learning
Yuhui Ding
Antonio Orvieto
Bobby He
Thomas Hofmann
GNN
36
6
0
03 Dec 2023
MABViT -- Modified Attention Block Enhances Vision Transformers
Mahesh Ramesh
Aswinkumar Ramkumar
19
3
0
03 Dec 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
23
76
0
28 Nov 2023
Advancing State of the Art in Language Modeling
David Herel
Tomáš Mikolov
34
1
0
28 Nov 2023
Ultra-Range Gesture Recognition using a Web-Camera in Human-Robot Interaction
Eran Bamani
Eden Nissinman
Inbar Meir
L. Koenigsberg
A. Sintov
27
11
0
26 Nov 2023
Task adaption by biologically inspired stochastic comodulation
Gauthier Boeshertz
Caroline Haimerl
Cristina Savin
40
0
0
25 Nov 2023
Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification
Yunlong Zhang
Honglin Li
Yuxuan Sun
Sunyi Zheng
Chenglu Zhu
Lin Yang
30
29
0
13 Nov 2023
Towards Climate Variable Prediction with Conditioned Spatio-Temporal Normalizing Flows
Christina Winkler
David Rolnick
27
0
0
12 Nov 2023
Vital Sign Forecasting for Sepsis Patients in ICUs
Anubhav Bhatti
Yuwei Liu
Chen Dan
Bingjie Shen
San Lee
Yonghwan Kim
Jang Yong Kim
20
4
0
08 Nov 2023
Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability
Jishnu Ray Chowdhury
Cornelia Caragea
37
5
0
08 Nov 2023
Simplifying Transformer Blocks
Bobby He
Thomas Hofmann
27
31
0
03 Nov 2023
Global Transformer Architecture for Indoor Room Temperature Forecasting
Alfredo V. Clemente
A. Nocente
Massimiliano Ruocco
AI4CE
18
1
0
31 Oct 2023
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
Michael Gunther
Jackmin Ong
Isabelle Mohr
Alaeddine Abdessalem
Tanguy Abel
...
Saba Sturua
Bo Wang
Maximilian Werk
Nan Wang
Han Xiao
RALM
27
58
0
30 Oct 2023
Tong Zheng
Bei Li
Huiwen Bao
Jiale Wang
Weiqiao Shan
Tong Xiao
Jingbo Zhu
MoE
AI4CE
16
0
0
23 Oct 2023