Approximation Rate of the Transformer Architecture for Sequence Modeling
Hao Jiang, Qianxiao Li
arXiv 2305.18475 · 3 January 2025
Papers citing "Approximation Rate of the Transformer Architecture for Sequence Modeling" (9 of 9 shown)
Transformers Can Overcome the Curse of Dimensionality: A Theoretical Study from an Approximation Perspective
Yuling Jiao, Yanming Lai, Yang Wang, Bokai Yan (18 Apr 2025)
Approximation Bounds for Transformer Networks with Application to Regression
Yuling Jiao, Yanming Lai, Defeng Sun, Yang Wang, Bokai Yan (16 Apr 2025)
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu, Issei Sato (02 Oct 2024)
Anchor function: a type of benchmark functions for studying language models
Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Z. Xu (16 Jan 2024)
A mathematical perspective on Transformers
Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet (17 Dec 2023)
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
T. Kajitsuka, Issei Sato (26 Jul 2023)
Your Transformer May Not be as Powerful as You Expect
Shengjie Luo, Shanda Li, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He (26 May 2022)
Universal Approximation Under Constraints is Possible with Transformers
Anastasis Kratsios, Behnoosh Zamanlooy, Tianlin Liu, Ivan Dokmanić (07 Oct 2021)
Optimal Approximation Rate of ReLU Networks in terms of Width and Depth
Zuowei Shen, Haizhao Yang, Shijun Zhang (28 Feb 2021)