Cited By: The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles
Md Shamim Hussain, Mohammed J. Zaki, D. Subramanian
arXiv:2306.01705, 2 June 2023
Papers citing "The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles" (10 of 10 shown):
Disrupting Diffusion-based Inpainters with Semantic Digression [DiffM]
Geonho Son, Juhun Lee, Simon S. Woo
14 Jul 2024

Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers [ViT]
Md Shamim Hussain, Mohammed J. Zaki, D. Subramanian
07 Feb 2024

GRPE: Relative Positional Encoding for Graph Transformer
Wonpyo Park, Woonggi Chang, Donggeon Lee, Juntae Kim, Seung-won Hwang
30 Jan 2022

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021

Combiner: Full Attention Transformer with Sparse Computation Cost
Hongyu Ren, H. Dai, Zihang Dai, Mengjiao Yang, J. Leskovec, Dale Schuurmans, Bo Dai
12 Jul 2021

Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press, Noah A. Smith, M. Lewis
31 Dec 2020

Big Bird: Transformers for Longer Sequences [VLM]
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
28 Jul 2020

The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
23 Jul 2020

Efficient Content-Based Sparse Attention with Routing Transformers [MoE]
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
12 Mar 2020

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning [UQCV, BDL]
Y. Gal, Zoubin Ghahramani
06 Jun 2015