Double descent in the condition number

In solving a system of $n$ linear equations in $d$ variables, $Ax=b$, the condition number of the $n \times d$ matrix $A$ measures how much errors in the data $b$ affect the solution $x$. Estimates of this type are important in many inverse problems. An example is machine learning, where the key task is to estimate an underlying function from a set of measurements at random points in a high-dimensional space, and where low sensitivity to error in the data is a requirement for good predictive performance. Here we discuss a simple observation, which is known but surprisingly little quoted (see Theorem 4.2 in \cite{Brgisser:2013:CGN:2526261}): when the columns of $A$ are random vectors, the condition number of $A$ is highest if $d=n$, that is, when the inverse of $A$ exists. An overdetermined system ($n>d$) as well as an underdetermined system ($n<d$), for which the pseudoinverse must be used instead of the inverse, typically have significantly better, that is lower, condition numbers. Thus the condition number of $A$ plotted as a function of $d$ shows a double descent behavior with a peak at $d=n$.
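The observation can be checked numerically with a minimal sketch. The assumptions here are not taken from the text: the entries of the matrix are drawn i.i.d. from a standard Gaussian, the condition number is taken as the ratio of the largest to the smallest singular value (which also covers the pseudoinverse case for non-square matrices), and the function names, the number of equations, and the number of trials are illustrative choices.

```python
import numpy as np


def condition_number(A):
    """Ratio of largest to smallest singular value of A (2-norm condition number)."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, sorted descending
    return s[0] / s[-1]


def median_condition_numbers(n, ds, trials=20, seed=0):
    """Median condition number of random n x d matrices for each d in ds.

    Assumption: entries are i.i.d. standard Gaussian.
    """
    rng = np.random.default_rng(seed)
    medians = []
    for d in ds:
        conds = [condition_number(rng.standard_normal((n, d))) for _ in range(trials)]
        medians.append(float(np.median(conds)))
    return medians


n = 40                      # number of equations (rows), illustrative value
ds = [10, 20, 40, 80, 160]  # number of variables (columns) to sweep over
medians = median_condition_numbers(n, ds)

for d, m in zip(ds, medians):
    print(f"d = {d:4d}   median condition number = {m:.1f}")
```

Under these assumptions the printed medians should rise as the number of columns approaches the number of rows and fall again beyond it. The median is used rather than the mean because the condition number of a square random matrix is heavy-tailed.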