
Double descent in the condition number

Abstract

In solving a system of $n$ linear equations in $d$ variables, $Ax=b$, the condition number of the $n \times d$ matrix $A$ measures how much errors in the data $b$ affect the solution $x$. Estimates of this type are important in many inverse problems. An example is machine learning, where the key task is to estimate an underlying function from a set of measurements at random points in a high-dimensional space, and where low sensitivity to error in the data is a requirement for good predictive performance. Here we discuss a simple observation, which is known but surprisingly little quoted (see Theorem 4.2 in \cite{Brgisser:2013:CGN:2526261}): when the columns of $A$ are random vectors, the condition number of $A$ is highest if $d=n$, that is, when the inverse of $A$ exists. An overdetermined system ($n>d$) as well as an underdetermined system ($n<d$), for which the pseudoinverse must be used instead of the inverse, typically have significantly better, that is lower, condition numbers. Thus the condition number of $A$, plotted as a function of $d$, shows a double descent behavior with a peak at $d=n$.
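The observation can be checked numerically. The following is a minimal sketch, not the authors' experiment: it assumes i.i.d. standard Gaussian entries for $A$, fixes $n=50$ as an arbitrary choice, and takes the condition number to be the ratio of the largest to the smallest nonzero singular value, which is the quantity relevant to the pseudoinverse. The function name `median_condition_number` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def median_condition_number(n, d, trials=20, seed=0):
    """Median condition number of random n-by-d Gaussian matrices.

    Assumption: entries are i.i.d. standard normal. The condition number
    is sigma_max / sigma_min over the min(n, d) singular values, i.e. the
    condition number associated with the pseudoinverse.
    """
    rng = np.random.default_rng(seed)
    conds = []
    for _ in range(trials):
        A = rng.standard_normal((n, d))
        # Singular values are returned in descending order.
        s = np.linalg.svd(A, compute_uv=False)
        conds.append(s[0] / s[-1])
    return float(np.median(conds))


n = 50                          # number of equations (illustrative choice)
dims = list(range(5, 151, 5))   # number of variables d, swept through d = n
curve = {d: median_condition_number(n, d) for d in dims}

# Location of the largest median condition number along the sweep.
peak_d = max(curve, key=curve.get)
print("peak at d =", peak_d)
for d in (25, 50, 100):
    print(d, round(curve[d], 1))
```

Under these assumptions, the median condition number should rise as $d$ approaches $n$ from either side and fall away from it, so a plot of `curve` against `dims` shows the peak at $d=n$. The median is used rather than the mean because the condition number of a square Gaussian matrix is heavy-tailed.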
