Transformers in Uniform TC⁰
David Chiang

Abstract
Previous work has shown that the languages recognized by average-hard attention transformers (AHATs) and softmax-attention transformers (SMATs) are within the circuit complexity class TC⁰. However, these results assume limited-precision arithmetic: using floating-point numbers with O(log n) bits (where n is the length of the input string), Strobl showed that AHATs can be approximated in L-uniform TC⁰, and Merrill and Sabharwal showed that SMATs can be approximated in DLOGTIME-uniform TC⁰. Here, we improve these results, showing that AHATs with no approximation, SMATs with O(poly(n)) bits of floating-point precision, and SMATs with at most 2^(-poly(n)) absolute error are all in DLOGTIME-uniform TC⁰.
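To make the precision claims concrete, the following is a minimal Python sketch (not from the paper; the rounding model and all function names are illustrative) contrasting exact softmax attention for a single query with a version whose intermediate values are rounded to a fixed number of significand bits. It shows the absolute error of the limited-precision computation shrinking as the bit budget grows, which is the trade-off the abstract's precision bounds quantify.

```python
import math

def softmax_attention(scores, values):
    """Exact softmax attention for one query: sum_i softmax(scores)_i * values_i."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return sum(e / z * v for e, v in zip(exps, values))

def round_to_precision(x, bits):
    """Toy rounding model: keep only `bits` significant bits of x."""
    if x == 0.0:
        return 0.0
    e = math.floor(math.log2(abs(x)))
    scale = 2.0 ** (bits - 1 - e)
    return round(x * scale) / scale

def limited_precision_attention(scores, values, bits):
    """Softmax attention where every intermediate result is rounded to `bits` bits."""
    m = max(scores)
    exps = [round_to_precision(math.exp(s - m), bits) for s in scores]
    z = round_to_precision(sum(exps), bits)
    out = 0.0
    for e, v in zip(exps, values):
        w = round_to_precision(e / z, bits)
        out = round_to_precision(out + round_to_precision(w * v, bits), bits)
    return out

scores = [0.5, 1.0, -0.3, 2.0]
values = [1.0, -1.0, 0.5, 0.25]
exact = softmax_attention(scores, values)
for bits in (8, 16, 32):
    approx = limited_precision_attention(scores, values, bits)
    print(f"{bits:2d} bits: abs error = {abs(approx - exact):.2e}")
```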