Modelling Temporal Document Sequences for Clinical ICD Coding

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Clarence Boon Liang Ng

Diogo Santos

Marek Rei

185

24 Feb 2023

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

International Conference on Learning Representations (ICLR), 2023

235

20 Feb 2023

Neural Attention Memory

Hyoungwook Nam

S. Seo

HAI

152

18 Feb 2023

MorphGANFormer: Transformer-based Face Morphing and De-Morphing

Guo-Jun Qi

165

18 Feb 2023

Enhancing Multivariate Time Series Classifiers through Self-Attention and Relative Positioning Infusion

IEEE Access (IEEE Access), 2023

Mehryar Abbasi

Parvaneh Saeedi

AI4TS

239

13 Feb 2023

Simple Hardware-Efficient Long Convolutions for Sequence Modeling

International Conference on Machine Learning (ICML), 2023

207

13 Feb 2023

A Study on ReLU and Softmax in Transformer

Junliang Guo

Jiang Bian

235

13 Feb 2023

Transformer models: an introduction and catalog

X. Amatriain

Ananth Sankar

Jie Bing

Praveen Kumar Bodigutla

Timothy J. Hazen

Michaeel Kazi

503

12 Feb 2023

GTR-CTRL: Instrument and Genre Conditioning for Guitar-Focused Music Generation with Transformers

212

10 Feb 2023

Cut your Losses with Squentropy

International Conference on Machine Learning (ICML), 2023

135

08 Feb 2023

EvoText: Enhancing Natural Language Generation Models via Self-Escalation Learning for Up-to-Date Knowledge and Improved Performance

Applied Sciences (Appl. Sci.), 2023

257

08 Feb 2023

Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis

Findings (Findings), 2023

158

07 Feb 2023

Memory-Based Meta-Learning on Non-Stationary Distributions

International Conference on Machine Learning (ICML), 2023

Marcus Hutter

257

06 Feb 2023

Computation vs. Communication Scaling for Future Transformers on Future Hardware

Suchita Pati

Shaizeen Aga

Mahzabeen Islam

Nuwan Jayasena

Matthew D. Sinclair

267

06 Feb 2023

Towards energy-efficient Deep Learning: An overview of energy-efficient approaches along the Deep Learning Lifecycle

245

05 Feb 2023

Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

K. Choromanski

Shanda Li

Valerii Likhosherstov

327

03 Feb 2023

Grounding Language Models to Images for Multimodal Inputs and Outputs

International Conference on Machine Learning (ICML), 2023

Jing Yu Koh

Ruslan Salakhutdinov

Daniel Fried

MLLM

448

151

31 Jan 2023

An Comparative Analysis of Different Pitch and Metrical Grid Encoding Methods in the Task of Sequential Music Generation

Yuqiang Li

Shengchen Li

Georgy Fazekas

192

31 Jan 2023

A Comparative Study of Pretrained Language Models for Long Clinical Text

261

113

27 Jan 2023

Robust Transformer with Locality Inductive Bias and Feature Normalization

Engineering Science and Technology, an International Journal (JEST), 2023

Omid Nejati Manzari

Hossein Kashiani

Hojat Asgarian Dehkordi

S. B. Shokouhi

ViT

184

27 Jan 2023

Open Problems in Applied Deep Learning

M. Raissi

AI4CE

234

26 Jan 2023

Out of Distribution Performance of State of Art Vision Model

Salman Rahman

W. Lee

402

25 Jan 2023

Human-Timescale Adaptation in an Open-Ended Task Space

International Conference on Machine Learning (ICML), 2023

Feryal M. P. Behbahani

...

Lei Zhang

LM&Ro OffRL AI4CE LRM

329

148

18 Jan 2023

Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling

bioRxiv (bioRxiv), 2023

424

144

16 Jan 2023

Language Cognition and Language Computation -- Human and Machine Language Understanding

250

12 Jan 2023

WuYun: Exploring hierarchical skeleton-guided melody generation using knowledge-enhanced deep learning

Kejun Zhang

Xu Tan

254

11 Jan 2023

A Survey on Transformers in Reinforcement Learning

547

08 Jan 2023

Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition

209

06 Jan 2023

An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models

320

30 Dec 2022

Transformer in Transformer as Backbone for Deep Reinforcement Learning

Hangyu Mao

Rui Zhao

Hao Chen

Jianye Hao

Junge Zhang

189

30 Dec 2022

Efficient Movie Scene Detection using State-Space Transformers

Computer Vision and Pattern Recognition (CVPR), 2022

Gedas Bertasius

246

29 Dec 2022

Transformers in Action Recognition: A Review on Temporal Modeling

Elham Shabaninia

Hossein Nezamabadi-pour

Fatemeh Shafizadegan

ViT

211

29 Dec 2022

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Shengchao Hu

Li Shen

340

29 Dec 2022

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

International Conference on Learning Representations (ICLR), 2022

442

556

28 Dec 2022

Part-guided Relational Transformers for Fine-grained Visual Recognition

IEEE Transactions on Image Processing (TIP), 2021

Yonghong Tian

210

28 Dec 2022

On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective

Jingxiao Chen

221

24 Dec 2022

Scalable Adaptive Computation for Iterative Generation

International Conference on Machine Learning (ICML), 2022

Allan Jabri

David Fleet

Ting-Li Chen

DiffM

254

153

22 Dec 2022

Generating music with sentiment using Transformer-GANs

International Society for Music Information Retrieval Conference (ISMIR), 2022

145

21 Dec 2022

ORCA: A Challenging Benchmark for Arabic Language Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

AbdelRahim Elmadany

El Moatez Billah Nagoudi

Muhammad Abdul-Mageed

ELM

301

21 Dec 2022

A Length-Extrapolatable Transformer

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Xia Song

330

156

20 Dec 2022